Solved

How to split excel file by columns from irs.gov website

Posted on 2014-02-03
8
676 Views
Last Modified: 2014-02-12
Want to split large text file into multiple excel spreadsheets by column.

The end goal is to parse the file so it can be uploaded to Access 2013 which limits the number of columns in a table to 250.

I would like to know how to:
1.) use the field layout to turn the text file into a tab separated text file
2.) how to split the file by columns and insert each section of columns into a separate sheet inside the existing workbook.

Thanks!
dwillpower
10eo.zip
10eoderl.xlsx
0
Comment
Question by:dwillpower
  • 3
  • 3
8 Comments
 
LVL 36

Expert Comment

by:PatHartman
ID: 39830591
I presume the point of this exercise is to have data you can use for analysis.  Importing all those columns into 4 tables with ~ 250 columns each will simply leave you with a worse nightmare.  I would use OLE automation to open the spreadsheet and column by column, insert rows into an Access table.  I didn't spend a lot of time with it but there are several repeating groups and I would use a separate many-side table for each.  So you would end up with a "main" table and several related tables.  The related tables will each have only 4 columns - UniqueID (autonumber) ParentID (FK to main table), DataPointName, DataPointValue.
0
 
LVL 45

Expert Comment

by:aikimark
ID: 39831824
I'm running this code in Access, since it seems like Excel isn't necessary for this to import correctly from the text file source.  Before running this code, you should create a table in your Access database with the following fields:
Key: text
Name: text(255)
Value: text(255)
I did not specify any primary key for this table.

You will need to change the path string literals associated with the two constants (cSpec, cData).

Option Explicit

Public Sub Q_28355228()
    Dim oFS As Object
    Dim oTS As Object
    Dim strLine As String
    Dim vSpecs As Variant
    Dim oXL As Object
    Dim owkb As Object
    Dim owks As Object
    Dim rs As Recordset
    Dim strKeys As String
    Dim lngLoop As Long
    Dim vKey As Variant
    Dim oDic As Object
    Dim vItem As Variant
    Const cSpec As String = "C:\Users\Mark\Downloads\10eoderl.xlsx"
    Const cData As String = "C:\Users\Mark\Downloads\10eo.txt"
    Set oXL = CreateObject("excel.application")
    Set owkb = oXL.Workbooks.Open(cSpec)
    Set owks = owkb.Worksheets("EO990_10")
    vSpecs = owks.Range(owks.Range("A5"), owks.Range("A5").End(xlDown).Offset(0, 3))
    Set owks = Nothing
    owkb.Close
    Set owkb = Nothing
    Set oXL = Nothing
    Set rs = DBEngine(0)(0).OpenRecordset("Q_28355228", dbOpenTable)
    Set oDic = CreateObject("scripting.dictionary")
    Set oFS = CreateObject("scripting.filesystemobject")
    Set oTS = oFS.OpenTextFile(cData)
    Do Until oTS.AtEndOfStream
        strLine = oTS.ReadLine
        For lngLoop = LBound(vSpecs, 1) To UBound(vSpecs, 1)
            oDic(vSpecs(lngLoop, 1)) = Mid$(strLine, vSpecs(lngLoop, 3), vSpecs(lngLoop, 4))
        Next
        strKeys = vbNullString
        For Each vKey In Array("ein", "taxpd")
            strKeys = strKeys & oDic(vKey)
        Next
        For Each vItem In oDic
            rs.AddNew
                rs!Key = strKeys
                rs!Name = vItem
                rs!Value = Trim(oDic(vItem))
            rs.Update
        Next
        oDic.RemoveAll
    Loop
    oTS.Close

End Sub

Open in new window


Once the data has been imported, you can use a crosstab query to look at the data in columns.  You may use the crosstabs to make virtual tables or populate real tables.
Crosstab Example:
TRANSFORM First(Q_28355228.Value) AS FirstOfValue
SELECT Q_28355228.Key
FROM Q_28355228
WHERE Name in ("name","dba_name","state","zip","pt1_num_vtng_gvrn_bdy_mems","pt1_num_ind_vtng_mems","tot_num_empls",
"tot_num_vlntrs","tot_gro_ubi","net_unrltd_bus_txbl_incm","contri_grnts_py","prog_srvc_rev_py","invst_incm_py","oth_rev_py",
"contri_grnts_cy","prog_srvc_rev_cy","invst_incm_cy","oth_rev_cy","tot_rev_cy","grnts_and_smlr_amts_cy","bnfts_pd_to_mems_cy",
"slrs_etc_cy","tot_prof_fndrsng_exp_cy","tot_fndrsng_exp_cy","oth_expns_cy","tot_expns_cy","rev_less_expns_cy","pt1_tot_asts_eoy",
"pt1_tot_liab_eoy","net_asts_or_fund_bals_eoy","desc_ln_5021c3","sch_b_rqrd","pltcl_actvs","lbbyng_actvs","subj_to_prxy_tx",
"donr_advsd_fnd","cnsrv_easemnt","colls_of_art","crdt_cnslng","term_of_perm_endwmts","bal_sht_amts_reptd","aud_fincl_stmts",
"cnsld_audt_fin_stmt","school","frgn_offc","frgn_actvs","more_than_5000k_to_orgs","more_than_5000k_to_inds","prof_fndrsng",
"fndrsng_actvs","gaming","hospital","grnts_to_orgs","grnts_to_inds","sch_j_rqrd","tx_exmpt_bnds","invst_tx_exmpt_bnds","escrw_acct",
"on_behalf_of_issr","excss_bnft_trans","pr_excss_bnft_trans","ln_to_ofcr_or_dqp","grnt_to_rltd_prsn","bus_rltnshp_with_org",
"bus_rltnshp_thru_fam","ofcr_ent_with_bus_rltnshp","ded_non_csh_contri","ded_contris_of_art","terminated","partl_liqdtn","disrgd_entity",
"rltd_entity","rltd_org_cntrld_entity","trnsfr_to_non_chrtbl_org ","actvs_cndctd_prtshp","sch_o_req","num_with_1096","num_w2g_incld",
"cmplnc_with_bkup_wthldng","num_of_empls","emplmn_tx_rets_fld","unrltd_bus_incm","frm990t_fld","frgn_fincl_acct","prohib_tx_shltr_trans",
"txbl_prty_notif")
GROUP BY Q_28355228.Key
PIVOT Q_28355228.Name;

Open in new window


Note: you can't get all the columns in a single crosstab query result, because 826 is just too many for Access to handle.
0
 

Author Comment

by:dwillpower
ID: 39843397
Thanks! Great!

One problem so far,  

vSpecs = owks.Range(owks.Range("A5"), owks.Range("A5").End(xlDown).Offset(0, 3))

it errors on (xldown) saying it not defined?

Thanks!
D
0
Industry Leaders: We Want Your Opinion!

We value your feedback.

Take our survey and automatically be enter to win anyone of the following:
Yeti Cooler, Amazon eGift Card, and Movie eGift Card!

 
LVL 45

Expert Comment

by:aikimark
ID: 39843644
Oops.  Try this.
vSpecs = owks.Range(owks.Range("A5"), owks.Range("A5").End(xlDown).Offset(0, 3)).Value

Open in new window

0
 

Author Comment

by:dwillpower
ID: 39848760
Still getting the variable not defined xldown.....
0
 
LVL 45

Accepted Solution

by:
aikimark earned 500 total points
ID: 39849464
replace xldown with -4121
0
 

Author Closing Comment

by:dwillpower
ID: 39854136
Rockstar!
0

Featured Post

Free Tool: Site Down Detector

Helpful to verify reports of your own downtime, or to double check a downed website you are trying to access.

One of a set of tools we are providing to everyone as a way of saying thank you for being a part of the community.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Excel can be a tricky bit of software to get your head around. Whilst you’ll be able to eventually get to grips with the basic understanding of how to get by, there are a few Excel tips that not everybody will even know about let alone know how to d…
In Part II of this series, I will discuss how to identify all open instances of Excel and enumerate the workbooks, spreadsheets, and named ranges within each of those instances.
Excel styles will make formatting consistent and let you apply and change formatting faster. In this tutorial, you'll learn how to use Excel's built-in styles, how to modify styles, and how to create your own. You'll also learn how to use your custo…
With Secure Portal Encryption, the recipient is sent a link to their email address directing them to the email laundry delivery page. From there, the recipient will be required to enter a user name and password to enter the page. Once the recipient …

680 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question