Solved

Parse A Text File Into An Excel Spreadsheet

Posted on 2013-12-22
25
549 Views
Last Modified: 2014-01-04
Please help anyone.  I have a text file that is over 6MB and has proven to be quite difficult to parse.  It does not have the same format throughout but is close.  The beginning column header is "NSN:" and everything following this should remain on the same row.  The next row should start at the next instance of the word "NSN:" I have been trying over and over to figure this out but to no avail.  I have to get this out to my company sometimes tomorrow if anyone can help me out.  I am a total newbie at a scripting and this is a first for me.  Please help, anyone! I have provided a portion of the text file that needs to be parsed as well as the VBA script I have so far.  I am almost certain that there is a better way to do this.  Thanks in advance for all of the help!
0
Comment
Question by:Christopher Wright
  • 11
  • 7
  • 3
  • +2
25 Comments
 
LVL 80

Expert Comment

by:byundt
ID: 39735391
Could you please post a sample text file. It doesn't need to be 6 MB in size--five or so records should be sufficient for testing.
0
 
LVL 30

Expert Comment

by:hnasr
ID: 39735543
Yes, a sample text file is needed. Is text file like:
NSN:aaaaaaaNSN:bbbbbbbbbbbbbb
NSN:ccccNSN:ddddNSN:eeeeeeeeeeeeeeee
0
 
LVL 80

Expert Comment

by:byundt
ID: 39735919
When you upload a file, you need to:
1.  Click the Attach File link
2.  Click the Browse... button to find the file in your computer or network
3.  After you have selected the file, click the "Open" button in the file browser
4.  Click the Upload File button in the Experts Exchange Comment (or question)
5.  Add some descriptive text in the field with light gray font that says "Enter a brief description of your file (required)"

Omitting step 5 is a common mistake. I usually say "Sample file" because the field doesn't display very many words.
0
 

Author Comment

by:Christopher Wright
ID: 39736649
My apologies everyone. I was trying to do post this by phone but obviously failed.  Here are the missing pieces. Please accept my apologies for leaving those out everyone.  :(
Test-File.txt
TransposeFasteners.xlsm
0
 
LVL 45

Expert Comment

by:aikimark
ID: 39737219
Do you only need the data from the sections that have "IDENTIFY TO:" in them?

Do you need all of the amendments data?
0
 
LVL 45

Expert Comment

by:aikimark
ID: 39737287
What should happen to the extra references? (when there are more than three)
0
 

Author Comment

by:Christopher Wright
ID: 39737306
They can be combined in the same cell. I will attach an example of the finished excel file.
0
 
LVL 45

Expert Comment

by:aikimark
ID: 39737308
what was the workbook you posted, if not the result example?
0
 
LVL 45

Expert Comment

by:aikimark
ID: 39737310
Is the text file you posted a representative sample of the actual data?
0
 

Author Comment

by:Christopher Wright
ID: 39737312
0
 

Author Comment

by:Christopher Wright
ID: 39737316
Yes. The text file is a representation of the actual data.
0
 

Author Comment

by:Christopher Wright
ID: 39737318
The workbook I originally posted contains the VBA script. The format is acceptable but I only sent the last workbook as an additional reference. I am okay with leaving the format the same as the original file I posted. Sorry for any confusion.
0
IT, Stop Being Called Into Every Meeting

Highfive is so simple that setting up every meeting room takes just minutes and every employee will be able to start or join a call from any room with ease. Never be called into a meeting just to get it started again. This is how video conferencing should work!

 

Author Comment

by:Christopher Wright
ID: 39737319
Also, I do need all of the amendments data.
0
 

Author Comment

by:Christopher Wright
ID: 39738186
Have I provided everything you need? Sorry, don't mean to sound pushy, I was only checking to make sure that I provided all info necessary here. Thanks again for all of your help!
0
 
LVL 45

Assisted Solution

by:aikimark
aikimark earned 50 total points
ID: 39739201
@xfitguru

Yes. You have.  I was already in the process of writing an article about parsing and this problem is a good example to tackle.
0
 

Author Comment

by:Christopher Wright
ID: 39739413
Awesome aikimark.  I am glad that this can pose a good example here.  I am eager to see your article.  Please let me know once you've posted it. Thanks
0
 
LVL 30

Assisted Solution

by:hnasr
hnasr earned 100 total points
ID: 39739592
Planning for data entry is very important fro future analysis.
That being said, this question is a nice one to tickle the brain.

It is a work for an article as aikimark suggested.

Here is a try to display records starting by NSN: and rest of characters.
Hope that will be of help to you.
In immediate window, type readText then enter.
Code is in Module1.

Your uploaded text file renamed D:\t.txt
Public Sub readText()
    'suthor hnasr
    'date Dec 26 2013
    Dim aChar As String ' character to be read from file
    Open "d:\t.txt" For Input As #1
    Dim f1 As String 'concatenated characters to search for a phrase
    Dim cCount As Long 'character count in file
    Dim arr() 'array to hold start of occurence, last element holds end of file count
    Dim nCount As Long 'holds occurence of NSN:
    Do Until EOF(1)
        aChar = Input(1, #1)
        cCount = cCount + 1
         
        If Asc(aChar) <> 10 Then
            f1 = f1 + aChar
        End If
        If InStr("NSN:", f1) = 0 Then
            f1 = ""
        Else
            If f1 = "NSN:" Then
                nCount = nCount + 1
                ReDim Preserve arr(nCount)
                arr(nCount) = cCount - 4
            End If
        End If
    Loop
    Close (1)
    nCount = nCount + 1
    ReDim Preserve arr(nCount)
    arr(nCount) = cCount
    Open "d:\t.txt" For Input As #1
    For nCount = 1 To UBound(arr) - 1
        Debug.Print
        Debug.Print nCount, arr(nCount), Input(arr(nCount + 1) - arr(nCount), #1)
    Next
    Close (1)
End Sub

Open in new window

TransposeFasteners-2.xlsm
0
 

Author Comment

by:Christopher Wright
ID: 39740376
Thank you for the script hnasr.  I am able to read the information in the Immediate Window, however, is there a way to have this parsed on the spreadsheet?  Thanks again!!
0
 
LVL 3

Expert Comment

by:CvD
ID: 39740812
Sub importMe()
  Dim arr As Variant
  Dim lFree As Long
  Dim i As Long
  Dim j As Long
  Dim nextRow As Long
  Dim nextCol As Long
  Dim inputLine As Variant
 
  lFree = FreeFile
  nextRow = 0   ' The first row number where to start inserting on the target sheet
  nextCol = 0
  Open "datafile.txt" For Input As #lFree
  Do While Not EOF(lFree)
    Line Input #lFree, arr
    ' If next row does not start with NSN: lower the row counter to make
    ' sure we stay on the same row. A shift in column is assumed in this case.
    If nextRow > 0 And Mid(a, 1, 4) <> "NSN:" Then
      nextRow = nextRow - 1
      nextCol = nextCol + 1
    End If
    arr = Split(inputLine, "NSN:")
    j = IIf(Trim(arr(0)) = "", 1, 0) ' Prevents a first blank line
    For i = j To UBound(arr)
      Worksheets("sheet1").Cells(nextRow, nextCol) = arr(i)
      nextRow = nextRow + 1
      nextCol = 0
    Next
    arr = ""
  Loop
  Close #lFree
End Sub
0
 
LVL 30

Expert Comment

by:hnasr
ID: 39740890
When the text file was inspected, it appeared to be a stream of text.
After reading the file character wise, it showed being stored as lines.
Hence CvD's comment is fine for one column NSN
You need to repeat the process many times for each column name. You need to watch for repeated column names in same line.

Here is a file modified to store the NSN:.... in column 1.

type readtext in immediate window and enter.
TransposeFasteners-3.xlsm
0
 
LVL 45

Expert Comment

by:aikimark
ID: 39741917
@hnasr

This file was probably produced in a Unix/Linux system, where the lines are terminated with a Line Feed.  In Windows, there would be a Carriage Return - Line Feed pair of characters.
0
 
LVL 3

Accepted Solution

by:
CvD earned 350 total points
ID: 39742064
Ok. I've tried to hackup something. As far as I could see it worked well. I've used the datafile and spreadsheet you supplied. Please test and let me know.

There is a file conversion routine inside, but it's rather slow because you have to read character by character in order to find the eo-of-line character and convert it.

So my advise would be to use other simple methods to do the conversion before importing it. There are several tools available on the internet to do so. In case you do so, you should deactivate the call to the function "convertMe" in the sub "startImport" in module1.
POT---Reference-Numbers.xlsm
0
 

Assisted Solution

by:Christopher Wright
Christopher Wright earned 0 total points
ID: 39747139
Thanks CvD!! This worked perfectly.  Might I add this it processed much quicker than I anticipated.  I sincerely appreciate your help here!!

Aikimark,
I am still looking forward to your article pertaining to this. I am curious to see your findings on this as well.  Thanks for all of your help and support here too!

Have a Happy New Year!!!
0
 
LVL 45

Expert Comment

by:aikimark
ID: 39747167
I'll post a comment in this thread when it is published.
0
 

Author Closing Comment

by:Christopher Wright
ID: 39755835
Thanks CvD!! This worked perfectly.  Might I add this it processed much quicker than I anticipated.  I sincerely appreciate your help here!!

Aikimark,
I am still looking forward to your article pertaining to this. I am curious to see your findings on this as well.  Thanks for all of your help and support here too!

Have a Happy New Year!!!
0

Featured Post

What Should I Do With This Threat Intelligence?

Are you wondering if you actually need threat intelligence? The answer is yes. We explain the basics for creating useful threat intelligence.

Join & Write a Comment

Enums (shorthand for ‘enumerations’) are not often used by programmers but they can be quite valuable when they are.  What are they? An Enum is just a type of variable like a string or an Integer, but in this case one that you create that contains…
This article describes some techniques which will make your VBA or Visual Basic Classic code easier to understand and maintain, whether by you, your replacement, or another Experts-Exchange expert.
The view will learn how to download and install SIMTOOLS and FORMLIST into Excel, how to use SIMTOOLS to generate a Monte Carlo simulation of 30 sales calls, and how to calculate the conditional probability based on the results of the Monte Carlo …
This Micro Tutorial demonstrates in Microsoft Excel how to consolidate your marketing data by creating an interactive charts using form controls. This creates cool drop-downs for viewers of your chart to choose from.

760 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

17 Experts available now in Live!

Get 1:1 Help Now