Solved

How to split excel file by columns from irs.gov website

Posted on 2014-02-03
8
686 Views
Last Modified: 2014-02-12
Want to split large text file into multiple excel spreadsheets by column.

The end goal is to parse the file so it can be uploaded to Access 2013 which limits the number of columns in a table to 250.

I would like to know how to:
1.) use the field layout to turn the text file into a tab separated text file
2.) how to split the file by columns and insert each section of columns into a separate sheet inside the existing workbook.

Thanks!
dwillpower
10eo.zip
10eoderl.xlsx
0
Comment
Question by:dwillpower
[X]
Welcome to Experts Exchange

Add your voice to the tech community where 5M+ people just like you are talking about what matters.

  • Help others & share knowledge
  • Earn cash & points
  • Learn & ask questions
  • 3
  • 3
8 Comments
 
LVL 37

Expert Comment

by:PatHartman
ID: 39830591
I presume the point of this exercise is to have data you can use for analysis.  Importing all those columns into 4 tables with ~ 250 columns each will simply leave you with a worse nightmare.  I would use OLE automation to open the spreadsheet and column by column, insert rows into an Access table.  I didn't spend a lot of time with it but there are several repeating groups and I would use a separate many-side table for each.  So you would end up with a "main" table and several related tables.  The related tables will each have only 4 columns - UniqueID (autonumber) ParentID (FK to main table), DataPointName, DataPointValue.
0
 
LVL 45

Expert Comment

by:aikimark
ID: 39831824
I'm running this code in Access, since it seems like Excel isn't necessary for this to import correctly from the text file source.  Before running this code, you should create a table in your Access database with the following fields:
Key: text
Name: text(255)
Value: text(255)
I did not specify any primary key for this table.

You will need to change the path string literals associated with the two constants (cSpec, cData).

Option Explicit

Public Sub Q_28355228()
    Dim oFS As Object
    Dim oTS As Object
    Dim strLine As String
    Dim vSpecs As Variant
    Dim oXL As Object
    Dim owkb As Object
    Dim owks As Object
    Dim rs As Recordset
    Dim strKeys As String
    Dim lngLoop As Long
    Dim vKey As Variant
    Dim oDic As Object
    Dim vItem As Variant
    Const cSpec As String = "C:\Users\Mark\Downloads\10eoderl.xlsx"
    Const cData As String = "C:\Users\Mark\Downloads\10eo.txt"
    Set oXL = CreateObject("excel.application")
    Set owkb = oXL.Workbooks.Open(cSpec)
    Set owks = owkb.Worksheets("EO990_10")
    vSpecs = owks.Range(owks.Range("A5"), owks.Range("A5").End(xlDown).Offset(0, 3))
    Set owks = Nothing
    owkb.Close
    Set owkb = Nothing
    Set oXL = Nothing
    Set rs = DBEngine(0)(0).OpenRecordset("Q_28355228", dbOpenTable)
    Set oDic = CreateObject("scripting.dictionary")
    Set oFS = CreateObject("scripting.filesystemobject")
    Set oTS = oFS.OpenTextFile(cData)
    Do Until oTS.AtEndOfStream
        strLine = oTS.ReadLine
        For lngLoop = LBound(vSpecs, 1) To UBound(vSpecs, 1)
            oDic(vSpecs(lngLoop, 1)) = Mid$(strLine, vSpecs(lngLoop, 3), vSpecs(lngLoop, 4))
        Next
        strKeys = vbNullString
        For Each vKey In Array("ein", "taxpd")
            strKeys = strKeys & oDic(vKey)
        Next
        For Each vItem In oDic
            rs.AddNew
                rs!Key = strKeys
                rs!Name = vItem
                rs!Value = Trim(oDic(vItem))
            rs.Update
        Next
        oDic.RemoveAll
    Loop
    oTS.Close

End Sub

Open in new window


Once the data has been imported, you can use a crosstab query to look at the data in columns.  You may use the crosstabs to make virtual tables or populate real tables.
Crosstab Example:
TRANSFORM First(Q_28355228.Value) AS FirstOfValue
SELECT Q_28355228.Key
FROM Q_28355228
WHERE Name in ("name","dba_name","state","zip","pt1_num_vtng_gvrn_bdy_mems","pt1_num_ind_vtng_mems","tot_num_empls",
"tot_num_vlntrs","tot_gro_ubi","net_unrltd_bus_txbl_incm","contri_grnts_py","prog_srvc_rev_py","invst_incm_py","oth_rev_py",
"contri_grnts_cy","prog_srvc_rev_cy","invst_incm_cy","oth_rev_cy","tot_rev_cy","grnts_and_smlr_amts_cy","bnfts_pd_to_mems_cy",
"slrs_etc_cy","tot_prof_fndrsng_exp_cy","tot_fndrsng_exp_cy","oth_expns_cy","tot_expns_cy","rev_less_expns_cy","pt1_tot_asts_eoy",
"pt1_tot_liab_eoy","net_asts_or_fund_bals_eoy","desc_ln_5021c3","sch_b_rqrd","pltcl_actvs","lbbyng_actvs","subj_to_prxy_tx",
"donr_advsd_fnd","cnsrv_easemnt","colls_of_art","crdt_cnslng","term_of_perm_endwmts","bal_sht_amts_reptd","aud_fincl_stmts",
"cnsld_audt_fin_stmt","school","frgn_offc","frgn_actvs","more_than_5000k_to_orgs","more_than_5000k_to_inds","prof_fndrsng",
"fndrsng_actvs","gaming","hospital","grnts_to_orgs","grnts_to_inds","sch_j_rqrd","tx_exmpt_bnds","invst_tx_exmpt_bnds","escrw_acct",
"on_behalf_of_issr","excss_bnft_trans","pr_excss_bnft_trans","ln_to_ofcr_or_dqp","grnt_to_rltd_prsn","bus_rltnshp_with_org",
"bus_rltnshp_thru_fam","ofcr_ent_with_bus_rltnshp","ded_non_csh_contri","ded_contris_of_art","terminated","partl_liqdtn","disrgd_entity",
"rltd_entity","rltd_org_cntrld_entity","trnsfr_to_non_chrtbl_org ","actvs_cndctd_prtshp","sch_o_req","num_with_1096","num_w2g_incld",
"cmplnc_with_bkup_wthldng","num_of_empls","emplmn_tx_rets_fld","unrltd_bus_incm","frm990t_fld","frgn_fincl_acct","prohib_tx_shltr_trans",
"txbl_prty_notif")
GROUP BY Q_28355228.Key
PIVOT Q_28355228.Name;

Open in new window


Note: you can't get all the columns in a single crosstab query result, because 826 is just too many for Access to handle.
0
 

Author Comment

by:dwillpower
ID: 39843397
Thanks! Great!

One problem so far,  

vSpecs = owks.Range(owks.Range("A5"), owks.Range("A5").End(xlDown).Offset(0, 3))

it errors on (xldown) saying it not defined?

Thanks!
D
0
Microsoft Certification Exam 74-409

VeeamĀ® is happy to provide the Microsoft community with a study guide prepared by MVP and MCT, Orin Thomas. This guide will take you through each of the exam objectives, helping you to prepare for and pass the examination.

 
LVL 45

Expert Comment

by:aikimark
ID: 39843644
Oops.  Try this.
vSpecs = owks.Range(owks.Range("A5"), owks.Range("A5").End(xlDown).Offset(0, 3)).Value

Open in new window

0
 

Author Comment

by:dwillpower
ID: 39848760
Still getting the variable not defined xldown.....
0
 
LVL 45

Accepted Solution

by:
aikimark earned 500 total points
ID: 39849464
replace xldown with -4121
0
 

Author Closing Comment

by:dwillpower
ID: 39854136
Rockstar!
0

Featured Post

Independent Software Vendors: We Want Your Opinion

We value your feedback.

Take our survey and automatically be enter to win anyone of the following:
Yeti Cooler, Amazon eGift Card, and Movie eGift Card!

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

How to get Spreadsheet Compare 2016 working with the 64 bit version of Office 2016
Access custom database properties are useful for storing miscellaneous bits of information in a format that persists through database closing and reopening.  This article shows how to create and use them.
This Micro Tutorial demonstrates in Microsoft Excel how to consolidate your marketing data by creating an interactive charts using form controls. This creates cool drop-downs for viewers of your chart to choose from.
This Micro Tutorial will demonstrate how to create pivot charts out of a data set. I also added a drop-down menu which allows to choose from different categories in the data set and the chart will automatically update.

734 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question