• Status: Solved
  • Priority: Medium
  • Security: Public
  • Views: 357
  • Last Modified:

Parsing data in excel / vba

I'm collecting data offinstruments that's just basically doing straight dumps that look like below.  The square, #6 all precede the column headers, then there is either a _________ or a - that indicates the data for that hedaer.  So for example the first line abelow would read as :  D/C CTRL as the column header and the row data, under that header woudl be the "ON"  can anyne gove me some pointers on how I can accomplish this?

#6D/C CTRL_________ON  STARTER CTRL____OFF
                                       
#6   RPM___________0#6BATT(V)________11.4  ALTERNATOR(%)_____0
                                       
#6A/C SW__________OFF  A/C CLUTCH______OFF
                                       
#6AF FB AVG______0.00  AF FB__________0.00
                                       
#6AF FB CMD(A/F)_99.9  INJ (mS)_______0.00
                                       
#6HO2S S1(mA)___-36.0  HO2S S2(V)_____0.00
                                       
#6HO2S S1 HTR_____OFF  HO2S S2 HTR_____OFF
                                       
#6CAT MON TEMP B1__NG  CAT MON TEMP B2__NG
                                       
#6ELD(A)_________10.1  STARTER SW______OFF
                                       
#6BRAKE SW________OFF  SHIFT LOCK______LOW
                                       
#6SUB BRAKE SW____OFF  SPARK ADV(ß)____0.0
                                       
#6EGR POS(V)_____1.11  IAC_______________0
                                       
#6IMMOBILIZER_____RUN  VTEC SOL________OFF
                                       
#6VTEC PS SW_______ON  EVAP BYPASS SOL_OFF
                                       
#6EVAP CVS VALVE__OFF  FTP SENSOR(kPa)-0.7
                                       
#6EVAP PC DUTY(%)___0  KNOCK RETARD(ß)_0.0
0
Mosquitoe
Asked:
Mosquitoe
  • 6
  • 3
1 Solution
 
mvidasCommented:
Hi mosquitoe,

It looks as though this is coming from a text file, so I've changed my code a little bit to allow you to choose the text file during runtime.  Give this macro a try, if it isn't what you're looking for please let me know how you want it different.  Paste the following into a standard code module in excel vba:

Sub MosquitoeParseTextFile()
 Dim RegEx As Object, RegC As Object, RegM As Object, TempArr
 Dim vStr As String, Results() As String, i As Long, vTextFile As String
 
 vTextFile = Application.GetOpenFilename("Text Files,*.txt,All Files,*.*")
 If LCase(vTextFile) = "false" Then Exit Sub 'user hit cancel
 vStr = LoadTextFileIntoString(vTextFile)
 
 Set RegEx = CreateObject("vbscript.regexp")
 With RegEx
  .IgnoreCase = True
  .Global = True
 End With
 
 'Replace ANSI Codes (ex. "[1;1H") with line feed
 RegEx.Pattern = "(\x1b\[\d+;\d+H)"
 vStr = RegEx.Replace(vStr, vbLf)
 
 'Change double spaces, carriage returns, and chr(27)#6 with line feeds for formatting
 vStr = Replace(Replace(Replace(vStr, "  ", vbLf & vbLf), vbCr, vbLf), Chr(27) & "#6", vbLf)
 
 'Get all data sets from string and place into Results array
 RegEx.Pattern = "\n(.*?)(_|-)+(.*?)\n"
 Set RegC = RegEx.Execute(vStr)
 ReDim Preserve Results(RegC.Count - 1)
 i = 0
 For Each RegM In RegC
  Results(i) = Replace(RegM.SubMatches(0) & "#" & RegM.SubMatches(2), vbLf, "")
  i = i + 1
 Next
 
 'Place results into excel
 If ActiveWorkbook Is Nothing Then Workbooks.Add 1 Else Sheets.Add 'either new sheet or book
 For i = 0 To UBound(Results)
  TempArr = Split(Results(i), "#")
  Range(Cells(1, i + 1), Cells(2, i + 1)) = Application.Transpose(TempArr)
 Next i
 Columns.AutoFit
 
 Set RegEx = Nothing
 Set RegC = Nothing
 Set RegM = Nothing
End Sub

Public Function LoadTextFileIntoString(ByVal vTextFilePath As String) As String
 Dim vFF As Long, vContents() As String, i As Long, vTempStr As String
 vFF = FreeFile
 i = 0
 Open vTextFilePath For Input As #vFF
 Do While Not EOF(vFF)
  Line Input #vFF, vTempStr
  ReDim Preserve vContents(i)
  vContents(i) = vTempStr
  i = i + 1
 Loop
 Close #vFF
 LoadTextFileIntoString = Join(vContents, vbCr)
End Function

Matt
0
 
MosquitoeAuthor Commented:
Hey There - Sorry for the delay in responding - this is actually great and almost exactly the type of parsing I need done.  I guess I'd like to tweak it in that, if the column already exists, it doesn't add a new column but just a row for that data in the pre-existing column.  Also, as I was running through the code from yes, you were right, a text file -  It creates all these columns, but one of the problems it runs into is not enough columns and then generates an error .. I assume that by altreing the code to just add a data row for any duplicate column will help this - but maybe not - these are rather large text files.. is there any way to add a column if the default number is used?  And finally, in some cases, the data is actually a negative number - but it doens't save as negative, I assume it because there is a check to for _ - to deliminate data..but what if the underscore is followed by -36 ..?
I know, I know - but much appreciated and honestly fantastic!!  Thanks for your expertise!
0
 
mvidasCommented:
I see what you mean with the entry     HO2S S1(mA)___-36.0, but what about the entry FTP SENSOR(kPa)-0.7 ? Is that supposed to be -0.7, or is the - a separator there?  The reason the - is being treated as a delimiter there is in the line

 RegEx.Pattern = "\n(.*?)(_|-)+(.*?)\n"

If the - is a delimiter on the FTP SENSOR line, but will be a negative if preceeded by the _ delimiter like in the HO2S S1 line, then change your pattern to

 RegEx.Pattern = "\n(.*?)[_-]+(.*?)\n"

And that will take either as a delimiter but not both.

As for putting them into a pre-existing column, that will take some minor adjustments, which I don't have time for at this very moment.  I'll try and get an updated sub for you within the next couple of hours though.

If you might have more than 256 fields in the text file (if it is quite large), would you consider putting the header into column A and the values into the subsequent columns? For example, with the field

  D/C CTRL_________ON

D/C CTRL would go into A1, and ON would go into B1.  That way you could have 65536 fields in a text file, and you wouldn't have to put multiple values into the same field (unless you do want that still--makes sense that you would).
If you did want them in rows instead of columns, and not split up, you would change the line

  Range(Cells(1, i + 1), Cells(2, i + 1)) = Application.Transpose(TempArr)

to

  Range(Cells(i + 1, 1), Cells(i + 1, 2)) = TempArr

That still wouldn't solve the multiple values for the same field issue, but would allow the code to run without erroring out due to column limitations.
I'll post an updated code later for the multiple-value fields (still using the column idea you first asked, though it should be easy to change if need be)
matt
0
VIDEO: THE CONCERTO CLOUD FOR HEALTHCARE

Modern healthcare requires a modern cloud. View this brief video to understand how the Concerto Cloud for Healthcare can help your organization.

 
mvidasCommented:
Please disregard my statement about the new RegEx.Pattern for the different delimiters, it was incorrect
0
 
mvidasCommented:
I'm really at a loss here, with the "-" being used as both a negative symbol and a delimiter.  Even if the - isn't ever used as a delimiter and only as a negative symbol, then there is no delimiter in FTP SENSOR(kPa)-0.7 and it fails to show up.  I don't know of a way to basically say "use _ as a delimiter, unless there is no _ in which case use - but only if there is no _"

By changing the .Pattern to
 RegEx.Pattern = "\n(.*?)_+(.*?)\n"
and changing the results(i) line to
  Results(i) = Replace(RegM.SubMatches(0) & "#" & RegM.SubMatches(1), vbLf, "")

It will only treat the _ as a delimiter, but as I said it would skip the FTP SENSOR(kPa) line
I will wait until this issue is figured out before I proceed further with the macro.  I'm going to ask a friend of mine to take a look at this question as well, as he is more of a regular expressions expert than I.
0
 
MosquitoeAuthor Commented:
What if, in that case if a  ) preceeds it ?  The data is really hard to read so in cases like this, I'm going to have to be very scenario specific.
0
 
mvidasCommented:
That will work, though the field header for your FTP SENSOR line will be:

FTP SENSOR(kPa

(without the closing parenthesis).  That can be fixed modifying the Results(i) line again.

Change the pattern line to:
 RegEx.Pattern = "\n(.*?)(_|\)-)+(.*?)\n"

and change the results(i) to:
  Results(i) = Replace(IIf(RegM.SubMatches(0) Like "*(*" And Not RegM.SubMatches(0) _
   Like "*)*", RegM.SubMatches(0) & ")", RegM.SubMatches(0)) & "#" & _
   RegM.SubMatches(2), vbLf, "")

That should take care of the delimiter issue (good idea with the ")" symbol).  As for the multiple-values-per-field thing, I am just going to post an entirely new sub (still keep the LoadTextFileIntoString function, no changes there) which should encompass every change so far.  Give this a try, works in my tests!

Sub MosquitoeParseTextFile()
 Dim RegEx As Object, RegC As Object, RegM As Object, TempArr
 Dim vStr As String, Results() As String, i As Long, vTextFile As String
 Dim fieldHead As String, j As Long
 
 vTextFile = Application.GetOpenFilename("Text Files,*.txt,All Files,*.*")
 If LCase(vTextFile) = "false" Then Exit Sub 'user hit cancel
 vStr = LoadTextFileIntoString(vTextFile)
 
 Set RegEx = CreateObject("vbscript.regexp")
 With RegEx
  .IgnoreCase = True
  .Global = True
 End With
 
 'Replace ANSI Codes (ex. "[1;1H") with line feed
 RegEx.Pattern = "(\x1b\[\d+;\d+H)"
 vStr = RegEx.Replace(vStr, vbLf)
 
 'Change double spaces, carriage returns, and chr(27)#6 with line feeds for formatting
 vStr = Replace(Replace(Replace(vStr, "  ", vbLf & vbLf), vbCr, vbLf), Chr(27) & "#6", vbLf)
 
 'Get all data sets from string and place into Results array
 RegEx.Pattern = "\n(.*?)(_|\)-)+(.*?)\n"
 Set RegC = RegEx.Execute(vStr)
 ReDim Results(0)
 i = 0
 For Each RegM In RegC
  fieldHead = IIf(RegM.SubMatches(0) Like "*(*" And Not RegM.SubMatches(0) Like "*)*", _
   RegM.SubMatches(0) & ")", RegM.SubMatches(0))
  For j = 0 To i - 1
   If Left(Results(j), InStr(1, Results(j), "#") - 1) = fieldHead Then
    Results(j) = Results(j) & "#" & Replace(RegM.SubMatches(2), vbLf, "")
    Exit For
   End If
  Next j
  If i = j Then
   ReDim Preserve Results(i)
   Results(i) = Replace(fieldHead & "#" & RegM.SubMatches(2), vbLf, "")
   i = i + 1
  End If
 Next
 
 'Place results into excel
 If ActiveWorkbook Is Nothing Then Workbooks.Add 1 Else Sheets.Add 'either new sheet or book
 For i = 0 To UBound(Results)
  TempArr = Split(Results(i), "#")
  Range(Cells(1, i + 1), Cells(UBound(TempArr) + 1, i + 1)) = Application.Transpose(TempArr)
 Next i
 Columns.AutoFit
 
 Set RegEx = Nothing
 Set RegC = Nothing
 Set RegM = Nothing
End Sub

Matt

0
 
MosquitoeAuthor Commented:
This seems to be exactly what I was trying to achieve.  Thank you
0
 
mvidasCommented:
Glad I could help! If you still run out of columns and want to display the data in rows, just change the line

  Range(Cells(1, i + 1), Cells(UBound(TempArr) + 1, i + 1)) = Application.Transpose(TempArr)

to:

  Range(Cells(i + 1, 1), Cells(i + 1, UBound(TempArr) + 1)) = TempArr

Thanks!
0

Featured Post

Free Tool: Site Down Detector

Helpful to verify reports of your own downtime, or to double check a downed website you are trying to access.

One of a set of tools we are providing to everyone as a way of saying thank you for being a part of the community.

  • 6
  • 3
Tackle projects and never again get stuck behind a technical roadblock.
Join Now