Excel VBA HL7 Parser

I am trying to write a crude parser for HL7 for our internal testing.

What happens is the user pastes the HL7 message in the Data Sheet.
Each segment is preceded by a set of identifiers like MSH, PID and many more (which are standard and do not change). There are sheets created for each of these segments.
In each segment, all fields are SEPARATED by a "|" (pipe).

Upon click of a button or as soon as the message is pasted, I would like to split each of the segment and paste the data (pipe separated) into the corresponding sheets. I am not sure if we can use VBA / Macros to accomplish the task.

Sample Hl7 Message.

PID||123456|123456^^^^MR||Test^Sam^||20100101||||M||1002-5^American Indian or Alaska Native^HL70005|2 Dr^^Houston^TX^01321|US|332-508-5080|||Single|||999999999|||2135-2^Hispanic or Latino|||||||||||||||||

Attached is the excel file containing the prototype and how the fields should be parsed.

Any help is appreciated.
Who is Participating?
Frank WhiteConnect With a Mentor Commented:
Alright, I think I've got something that works.

The code below below should be put in a VBA module in the workbook.

Option Explicit


Sub DoHL7Parsing(sMessage As String)
    Dim vSegments As Variant, vCurSeg As Variant
    Dim vFields As Variant, rCurField As Range, iIter As Integer
    Dim wsSeg As Worksheet
    vSegments = VBA.Split(sMessage, HL7_DELIMITER_SEGMENT)
    For Each vCurSeg In vSegments
        vFields = VBA.Split(vCurSeg, HL7_DELIMITER_FIELD)
        If WorksheetExists(vFields(0), ThisWorkbook) Then
            For iIter = 1 To UBound(vFields)
                Set rCurField = ThisWorkbook.Worksheets(vFields(0)).Range("A65536").End(xlUp).Offset(1, 0)
                rCurField.Value = vFields(0)
                rCurField.Offset(0, 1).Value = (rCurField.Row - 1)
                rCurField.Offset(0, 2).NumberFormat = "@"
                rCurField.Offset(0, 2).Value = vFields(iIter)
            Next iIter
            MsgBox "Invalid or unkown segment: " & vFields(0)
        End If
    Next vCurSeg
End Sub

Public Function WorksheetExists(ByVal WorksheetName As String, Optional InWorkbook As Workbook) As Boolean
    Dim Sht As Worksheet
    WorksheetExists = False
    If Not InWorkbook Is Nothing Then
        For Each Sht In InWorkbook.Worksheets
            If Sht.Name = WorksheetName Then WorksheetExists = True
        Next Sht
        For Each Sht In ActiveWorkbook.Worksheets
            If Sht.Name = WorksheetName Then WorksheetExists = True
        Next Sht
    End If
End Function

Open in new window

To use it, add this code to the Data sheet's VBA code (right click tab -> View Code):
Private Sub Worksheet_Change(ByVal Target As Range)
    If Not (Application.Intersect(Target, [PasteField]) Is Nothing) Then
        DoHL7Parsing [PasteField].Value
    End If
End Sub

Open in new window

Each time the cell is changed, its contents will be parsed and distributed to the relevant worksheets. The sheets have to already exist and have the same exact name as the segment name, and the segment name must be the first field in a newline segment. I'm assuming that this is already how HL7 standards work (based on what little I've read of them), which is why I felt unnecessary to make more code to handle different situations.

If some of my assumptions were incorrect or you want to have another method to decide when to parse the message, don't hesitate to say so.

Also, here's the code put into the prototype book, though since I'm using Excel 2002 at the moment it's not guaranteed to open properly. I've had to change it to a macro-enabled format for it to save right in the first place:

Frank WhiteCommented:
Perhaps it would help if you read: this page on the Split() function and an enhanced version for parsing.

The example function lets you split a string with a specific delimiter.

After that, the most obvious solution is to loop through the resulting array, comparing for your list of keywords and creating a new sheet if one is found, and then sending each element of the array to the right cell, also setting the current keyword in the first column and the value of an iteration counter for the second column (going by what I see ni the example workbook you posted).

If you want "empty fields" (e.g. |20110802122112||VXU| being three fields, with an empty one in the middle encased by two pipe characters) to also be counted, that makes the code even lighter.

Bit short on free time, so I won't be providing sample code right away, but if this question's still open in a few hours I'd be glad to give it a shot.

EDIT: If your pasted messages are always going to be HL7-compliant, then they're always going to have a newline between each segment, right? If so, you could run a first split pass using the newline character(s) as the delimiter, and then each element of that split array would be one segment, which can be assigned its own worksheet and *then* split and blasted. This would further simplify resulting code.
nainilAuthor Commented:
Apologies for getting back so late. This works perfectly. I am only concerned about one thing: Carriage Returns. There will be times when there will be no carriage returns available in the message. How or what can we do about that?
Frank WhiteCommented:
Well, if you're talking about the Carriage Return character specifically ("CR", or vbCR in VBA constants), no issues there, since this code detects the Line Feed character (which is after the Carriage Return, either way).

If you're talking about newlines in general, if there's going to be a replacement character for newlines, you can change the line "Const HL7_DELIMITER_SEGMENT = vbLf", changing vbLF for whichever character will act as separation between segments.

If there's going to be no actual replacement and the only thing between segments will be another pipe character (|), then the line "vSegments = VBA.Split(sMessage, HL7_DELIMITER_SEGMENT)" will have to be replaced with something more complex that draws from a list of segment IDs and separates the string each time it encounters one.

An alternative to that would be to have another sub written and run before this one that would do the same thing and replace detected identifiers with 'vbCRLF & identifier' before running this sub.
This VBA script is great! It's going to save me so much time. I realize this is a fairly old thread, but I'm hoping you all can help me.

Often I am working with HL7 messages with multiple segments of the same type. Is it possible to create a new tab for each new line, instead of looking for a tab that already has the same name?
For example, here is one with two OBX segments:
PID|1|14321076|14321076||Test^MEGAN||19880921|F|||7546 Old Spec Rd^^Peyton^CO^80831||6515834017||||||452111111|
OBR|1|142153|MOL14-010125|3020^CT/GC Combo|||20141117144800||||||||||||||||||F||||
OBX|1|ST|3023^Chlamydia trachomatis||Negative||Negative|
OBX|2|ST|3026^Neisseria gonorrhoeae||Negative||Negative|
Question has a verified solution.

Are you are experiencing a similar issue? Get a personalized answer when you ask a related question.

Have a better answer? Share it in a comment.

All Courses

From novice to tech pro — start learning today.