How to split an HL7 message and re-arrange its sections in order

I receive an HL7 message which looks like:
MSH|fdsas43423|fdsfs|432423|fdfads|423423|
PID|43243|fdfds|654|HGDF|76554|HGDS
PV1|5434|gfsd|543534|hgdfz|
ORC||||F|RE
OBR|20140709|gffsd|425643|hgd|543
OBX|65465|GFFSD|7664|aHGFF|4654
NTE|GHFDSGF|543|GFSFD|654|HGDF
OBX|423|FDSAD|432423|FDAS|423423
ORC||||p|RE
OBR|20140706|gffsd|425643|hgd|543
OBX|65465|GFFSD|7664|aHGFF|4654
NTE|GHFDSGF|543|GFSFD|654|HGDF
OBX|423|FDSAD|432423|FDAS|423423
ORC||||F|RE
OBR|20140710|gffsd|425643|hgd|543
OBX|65465|GFFSD|7664|aHGFF|4654
NTE|GHFDSGF|543|GFSFD|654|HGDF
OBX|423|FDSAD|432423|FDAS|423423
ORC||||F|RE
OBR|20140711|gffsd|425643|hgd|543
OBX|65465|GFFSD|7664|aHGFF|4654
NTE|GHFDSGF|543|GFSFD|654|HGDF
OBX|423|FDSAD|432423|FDAS|423423

I need to split the message into sections, each of which starts at an ORC line and ends just before the next one. The field next to OBR is a timestamp which is always unique. Once the message is split into sections, it needs to be rebuilt in order of the timestamp, from latest to earliest.

So it should look like:
MSH|fdsas43423|fdsfs|432423|fdfads|423423|
PID|43243|fdfds|654|HGDF|76554|HGDS
PV1|5434|gfsd|543534|hgdfz|
ORC||||p|RE
OBR|20140706|gffsd|425643|hgd|543
OBX|65465|GFFSD|7664|aHGFF|4654
NTE|GHFDSGF|543|GFSFD|654|HGDF
OBX|423|FDSAD|432423|FDAS|423423
ORC||||F|RE
OBR|20140709|gffsd|425643|hgd|543
OBX|65465|GFFSD|7664|aHGFF|4654
NTE|GHFDSGF|543|GFSFD|654|HGDF
OBX|423|FDSAD|432423|FDAS|423423
ORC||||F|RE
OBR|20140710|gffsd|425643|hgd|543
OBX|65465|GFFSD|7664|aHGFF|4654
NTE|GHFDSGF|543|GFSFD|654|HGDF
OBX|423|FDSAD|432423|FDAS|423423
ORC||||F|RE
OBR|20140711|gffsd|425643|hgd|543
OBX|65465|GFFSD|7664|aHGFF|4654
NTE|GHFDSGF|543|GFSFD|654|HGDF
OBX|423|FDSAD|432423|FDAS|423423

Thanks.
Nick MaloneAsked:

aikimarkCommented:
It might be more helpful if you posted an example with different data in each 'section'.
CEHJCommented:
Don't use homebrewed techniques - there are parsers available,  e.g. http://hl7api.sourceforge.net/
aikimarkCommented:
Since you are using Python, you might want to consider a Python-centric parser, such as hl7apy.
http://sourceforge.net/projects/hl7apy/
Nick MaloneAuthor Commented:
I've requested that this question be deleted for the following reason:

no answers.
aikimarkCommented:
You have seen two different potential solutions and received a request to clarify your data.  Your "no answers" assertion is bogus.  I object to your deletion request.  Please work with the participating experts.
CEHJCommented:
Your "no answers" assertion is bogus.  I object to your deletion request.

.. as do I
aikimarkCommented:
The following seems to work in my Windows environment using your posted data.  Be aware that if your posted data does not strongly resemble your actual/production data, this code will not work.
#!/usr/local/bin/python2.7
import os

def main():
    os.chdir( "\\users\\aikimark\\downloads")
    f = open('HL7input.txt', 'r')

    groups = []
    agroup=[]
    groupkey=''
    recnum=0
    for line in f:
      a=line.split('|')
      if a[0]=='ORC':
          if len(agroup) == 1:
            #add to groups and reset
            agroup.append(recnum-1)
            agroup.append(groupkey)
            groups.append(agroup)
            agroup=[]
            groupkey=''
            agroup.append(recnum)
          else:
            agroup.append(recnum)

      elif a[0]=='OBR':
          #retain group key
          groupkey=a[1]
      recnum=recnum+1

    #add last group to the groups list
    agroup.append(recnum-1)
    agroup.append(groupkey)
    groups.append(agroup)
    firstgroupstart = groups[0][0]

##    print 'before sort'
##    for g in groups:
##      print g

    #sort on third item in tuple -- the date of the group in yyyymmdd format string
    sgroups=sorted(groups, key=lambda g: g[2])

##    print 'after sort'
##    for g in sgroups:
##      print g
    f.close()
    f = open('HL7input.txt', 'r')
    linelist = f.readlines()
    f.close()
    f = open('HL7output.txt', 'w')

    #write the header lines (everything before the first ORC) unchanged
    for l in range(firstgroupstart):
      f.write(linelist[l])
    #write each ORC group in sorted order
    for g in sgroups:
      for outline in range(g[0],g[1]+1):
        f.write(linelist[outline])
    f.close()

if __name__ == '__main__':
    main()
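
A note on sort direction: the question asks for latest to earliest, while the expected output listed in the question runs from earliest to latest, which is also what the code above produces. If latest-first is what is actually needed, sorted() accepts reverse=True. A minimal sketch, using made-up entries in the same [start, end, key] shape as the groups list above (the sample values are hypothetical, not taken from a real message):

```python
# Hypothetical sample entries: [first line index, last line index, OBR date]
groups = [
    [3, 7, '20140709'],
    [8, 12, '20140706'],
    [13, 17, '20140710'],
]

# yyyymmdd strings sort chronologically when compared as plain text
earliest_first = sorted(groups, key=lambda g: g[2])
latest_first = sorted(groups, key=lambda g: g[2], reverse=True)

print([g[2] for g in earliest_first])
print([g[2] for g in latest_first])
```

This relies on every key having the same fixed-width yyyymmdd layout; keys of mixed length or format would need to be parsed into dates first.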

aikimarkCommented:
It bothered me that I was reading the file twice, so I streamlined the code.
#!/usr/local/bin/python2.7
import os

def main():
    os.chdir( "\\users\\aikimark\\downloads")
    f = open('HL7input.txt', 'r')
    linelist = f.readlines()
    f.close()

    groups = []
    agroup=[]
    groupkey=''
    recnum=0

    for line in linelist:
      a=line.split('|')
      if a[0]=='ORC':
          if len(agroup) == 1:
            #add to groups and reset
            agroup.append(recnum-1)
            agroup.append(groupkey)
            groups.append(agroup)
            agroup=[]
            groupkey=''
            agroup.append(recnum)
          else:
            agroup.append(recnum)

      elif a[0]=='OBR':
          #retain group key
          groupkey=a[1]
      recnum=recnum+1

    #add last group to the groups list
    agroup.append(recnum-1)
    agroup.append(groupkey)
    groups.append(agroup)
    firstgroupstart = groups[0][0]

##    print 'before sort'
##    for g in groups:
##      print g

    #sort on third item in tuple -- the date of the group in yyyymmdd format string
    sgroups=sorted(groups, key=lambda g: g[2])

##    print 'after sort'
##    for g in sgroups:
##      print g
    f = open('HL7output.txt', 'w')

    #write the header lines (everything before the first ORC) unchanged
    for l in range(firstgroupstart):
      f.write(linelist[l])
    #write each ORC group in sorted order
    for g in sgroups:
      for outline in range(g[0],g[1]+1):
        f.write(linelist[outline])
    f.close()

if __name__ == '__main__':
    main()
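
For comparison, the same split-and-sort idea can be written without tracking line numbers, by collecting each ORC group as its own list of lines. This is only a sketch under the same assumptions as the posted sample: one segment per line, '|' as the field separator, and the sort key in the first field after OBR. The function name reorder_hl7 and the shortened sample data are made up for illustration. Unlike the code above, it takes the key from the first OBR in a group rather than the last, which makes no difference when each group has exactly one OBR.

```python
def reorder_hl7(lines, reverse=False):
    """Return the lines with ORC groups sorted by the field after OBR."""
    header = []    # segments before the first ORC (MSH, PID, PV1, ...)
    groups = []    # one list of lines per ORC group
    for line in lines:
        line = line.rstrip('\r\n')
        if not line:
            continue                    # blank lines are dropped
        if line.split('|')[0] == 'ORC':
            groups.append([line])       # an ORC starts a new group
        elif groups:
            groups[-1].append(line)     # belongs to the current group
        else:
            header.append(line)         # still before the first ORC

    def group_key(group):
        # the first OBR in the group supplies the key; '' if there is none
        for seg in group:
            fields = seg.split('|')
            if fields[0] == 'OBR' and len(fields) > 1:
                return fields[1]
        return ''

    groups.sort(key=group_key, reverse=reverse)
    return header + [seg for group in groups for seg in group]


# Shortened, made-up sample in the same shape as the posted message
sample = [
    'MSH|a|b',
    'ORC||||F|RE',
    'OBR|20140709|x',
    'OBX|1|y',
    'ORC||||p|RE',
    'OBR|20140706|x',
    'OBX|2|y',
]
result = reorder_hl7(sample)
print('\n'.join(result))
```

Because line endings are stripped on the way in and the caller rejoins the lines, a final line without a trailing newline cannot run into the segment written after it. Passing reverse=True gives latest-first order.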

aikimarkCommented:
accept http:#a40237568 as the solution
CEHJCommented:
Points splits
aikimarkCommented:
@CEHJ

Your proposed solution only went so far as the parsing.  It didn't address the sorting.
CEHJCommented:
a. that's why a points split
b. use of a proper API means that regular Java Collection sorting is trivial
aikimarkCommented:
a. although we both proposed a parsing library via a link to different sourceforge projects, I actually posted Python code that did the parsing without any extra parsing libraries.  My comment is a complete solution.
b. Java collection sorting is not a consideration with this Python problem.
Question has a verified solution.
