We help IT Professionals succeed at work.

Hoist the colors! We’ve added location flags to usernames sitewide, so it's easier to connect with the global community on EE. View My Profile

x

How to split excel file by columns from irs.gov website

976 Views
Last Modified: 2014-02-12
Want to split large text file into multiple excel spreadsheets by column.

The end goal is to parse the file so it can be uploaded to Access 2013 which limits the number of columns in a table to 250.

I would like to know how to:
1.) use the field layout to turn the text file into a tab separated text file
2.) how to split the file by columns and insert each section of columns into a separate sheet inside the existing workbook.

Thanks!
dwillpower
10eo.zip
10eoderl.xlsx
Comment
Watch Question

CERTIFIED EXPERT
Distinguished Expert 2017

Commented:
I presume the point of this exercise is to have data you can use for analysis.  Importing all those columns into 4 tables with ~ 250 columns each will simply leave you with a worse nightmare.  I would use OLE automation to open the spreadsheet and column by column, insert rows into an Access table.  I didn't spend a lot of time with it but there are several repeating groups and I would use a separate many-side table for each.  So you would end up with a "main" table and several related tables.  The related tables will each have only 4 columns - UniqueID (autonumber) ParentID (FK to main table), DataPointName, DataPointValue.
aikimarkSocial distance; Wear a mask; Don't touch your face; Wash your hands for 20 seconds
CERTIFIED EXPERT
Top Expert 2014

Commented:
I'm running this code in Access, since it seems like Excel isn't necessary for this to import correctly from the text file source.  Before running this code, you should create a table in your Access database with the following fields:
Key: text
Name: text(255)
Value: text(255)
I did not specify any primary key for this table.

You will need to change the path string literals associated with the two constants (cSpec, cData).

Option Explicit

Public Sub Q_28355228()
    Dim oFS As Object
    Dim oTS As Object
    Dim strLine As String
    Dim vSpecs As Variant
    Dim oXL As Object
    Dim owkb As Object
    Dim owks As Object
    Dim rs As Recordset
    Dim strKeys As String
    Dim lngLoop As Long
    Dim vKey As Variant
    Dim oDic As Object
    Dim vItem As Variant
    Const cSpec As String = "C:\Users\Mark\Downloads\10eoderl.xlsx"
    Const cData As String = "C:\Users\Mark\Downloads\10eo.txt"
    Set oXL = CreateObject("excel.application")
    Set owkb = oXL.Workbooks.Open(cSpec)
    Set owks = owkb.Worksheets("EO990_10")
    vSpecs = owks.Range(owks.Range("A5"), owks.Range("A5").End(xlDown).Offset(0, 3))
    Set owks = Nothing
    owkb.Close
    Set owkb = Nothing
    Set oXL = Nothing
    Set rs = DBEngine(0)(0).OpenRecordset("Q_28355228", dbOpenTable)
    Set oDic = CreateObject("scripting.dictionary")
    Set oFS = CreateObject("scripting.filesystemobject")
    Set oTS = oFS.OpenTextFile(cData)
    Do Until oTS.AtEndOfStream
        strLine = oTS.ReadLine
        For lngLoop = LBound(vSpecs, 1) To UBound(vSpecs, 1)
            oDic(vSpecs(lngLoop, 1)) = Mid$(strLine, vSpecs(lngLoop, 3), vSpecs(lngLoop, 4))
        Next
        strKeys = vbNullString
        For Each vKey In Array("ein", "taxpd")
            strKeys = strKeys & oDic(vKey)
        Next
        For Each vItem In oDic
            rs.AddNew
                rs!Key = strKeys
                rs!Name = vItem
                rs!Value = Trim(oDic(vItem))
            rs.Update
        Next
        oDic.RemoveAll
    Loop
    oTS.Close

End Sub

Open in new window


Once the data has been imported, you can use a crosstab query to look at the data in columns.  You may use the crosstabs to make virtual tables or populate real tables.
Crosstab Example:
TRANSFORM First(Q_28355228.Value) AS FirstOfValue
SELECT Q_28355228.Key
FROM Q_28355228
WHERE Name in ("name","dba_name","state","zip","pt1_num_vtng_gvrn_bdy_mems","pt1_num_ind_vtng_mems","tot_num_empls",
"tot_num_vlntrs","tot_gro_ubi","net_unrltd_bus_txbl_incm","contri_grnts_py","prog_srvc_rev_py","invst_incm_py","oth_rev_py",
"contri_grnts_cy","prog_srvc_rev_cy","invst_incm_cy","oth_rev_cy","tot_rev_cy","grnts_and_smlr_amts_cy","bnfts_pd_to_mems_cy",
"slrs_etc_cy","tot_prof_fndrsng_exp_cy","tot_fndrsng_exp_cy","oth_expns_cy","tot_expns_cy","rev_less_expns_cy","pt1_tot_asts_eoy",
"pt1_tot_liab_eoy","net_asts_or_fund_bals_eoy","desc_ln_5021c3","sch_b_rqrd","pltcl_actvs","lbbyng_actvs","subj_to_prxy_tx",
"donr_advsd_fnd","cnsrv_easemnt","colls_of_art","crdt_cnslng","term_of_perm_endwmts","bal_sht_amts_reptd","aud_fincl_stmts",
"cnsld_audt_fin_stmt","school","frgn_offc","frgn_actvs","more_than_5000k_to_orgs","more_than_5000k_to_inds","prof_fndrsng",
"fndrsng_actvs","gaming","hospital","grnts_to_orgs","grnts_to_inds","sch_j_rqrd","tx_exmpt_bnds","invst_tx_exmpt_bnds","escrw_acct",
"on_behalf_of_issr","excss_bnft_trans","pr_excss_bnft_trans","ln_to_ofcr_or_dqp","grnt_to_rltd_prsn","bus_rltnshp_with_org",
"bus_rltnshp_thru_fam","ofcr_ent_with_bus_rltnshp","ded_non_csh_contri","ded_contris_of_art","terminated","partl_liqdtn","disrgd_entity",
"rltd_entity","rltd_org_cntrld_entity","trnsfr_to_non_chrtbl_org ","actvs_cndctd_prtshp","sch_o_req","num_with_1096","num_w2g_incld",
"cmplnc_with_bkup_wthldng","num_of_empls","emplmn_tx_rets_fld","unrltd_bus_incm","frm990t_fld","frgn_fincl_acct","prohib_tx_shltr_trans",
"txbl_prty_notif")
GROUP BY Q_28355228.Key
PIVOT Q_28355228.Name;

Open in new window


Note: you can't get all the columns in a single crosstab query result, because 826 is just too many for Access to handle.

Author

Commented:
Thanks! Great!

One problem so far,  

vSpecs = owks.Range(owks.Range("A5"), owks.Range("A5").End(xlDown).Offset(0, 3))

it errors on (xldown) saying it not defined?

Thanks!
D
aikimarkSocial distance; Wear a mask; Don't touch your face; Wash your hands for 20 seconds
CERTIFIED EXPERT
Top Expert 2014

Commented:
Oops.  Try this.
vSpecs = owks.Range(owks.Range("A5"), owks.Range("A5").End(xlDown).Offset(0, 3)).Value

Open in new window

Author

Commented:
Still getting the variable not defined xldown.....
Social distance; Wear a mask; Don't touch your face; Wash your hands for 20 seconds
CERTIFIED EXPERT
Top Expert 2014
Commented:
This one is on us!
(Get your first solution completely free - no credit card required)
UNLOCK SOLUTION

Author

Commented:
Rockstar!

Gain unlimited access to on-demand training courses with an Experts Exchange subscription.

Get Access
Why Experts Exchange?

Experts Exchange always has the answer, or at the least points me in the correct direction! It is like having another employee that is extremely experienced.

Jim Murphy
Programmer at Smart IT Solutions

When asked, what has been your best career decision?

Deciding to stick with EE.

Mohamed Asif
Technical Department Head

Being involved with EE helped me to grow personally and professionally.

Carl Webster
CTP, Sr Infrastructure Consultant
Empower Your Career
Did You Know?

We've partnered with two important charities to provide clean water and computer science education to those who need it most. READ MORE

Ask ANY Question

Connect with Certified Experts to gain insight and support on specific technology challenges including:

  • Troubleshooting
  • Research
  • Professional Opinions
Unlock the solution to this question.
Join our community and discover your potential

Experts Exchange is the only place where you can interact directly with leading experts in the technology field. Become a member today and access the collective knowledge of thousands of technology experts.

*This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.

OR

Please enter a first name

Please enter a last name

8+ characters (letters, numbers, and a symbol)

By clicking, you agree to the Terms of Use and Privacy Policy.