  • Status: Solved
  • Priority: Medium
  • Security: Public
  • Views: 302

How do I join multiple lines in a text file with simple criteria of the lines to be joined and make certain the data keeps the correct spacing?

I am working with an interface that passes data from one system to another in XML. One of the systems has a description that can have multiple lines in freeform text. The text is delimited by <ATTRIBUTE NAME="Description"> at the beginning and </ATTRIBUTE> at the end. An example of what a file might look like is this:

        <ATTRIBUTE NAME="Description"> This is the description
of the item in
question. </ATTRIBUTE>

What I want it to end up looking like is the following:

        <ATTRIBUTE NAME="Description"> This is the description of the item in question. </ATTRIBUTE>

Spaces may or may not need to be added to make the data correct. If the lines were simply joined, the above data would look like this:

        <ATTRIBUTE NAME="Description"> This is the descriptionof the item inquestion. </ATTRIBUTE>

So, I also need to make certain that spaces are in the appropriate locations after the lines are joined.
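
The spacing rule described above can be sketched as a small shell helper. This is only a minimal sketch, not a full solution: `join_fragments` is a hypothetical name, and it assumes a single space is the only separator ever needed between two lines.

```shell
# join_fragments is a hypothetical helper, not part of any existing tool.
# It joins two text fragments, inserting one space only when the first
# does not already end with a space and the second does not start with one.
# Note: awk -v interprets backslash escapes in the values, so this sketch
# assumes the fragments contain no backslashes.
join_fragments() {
    awk -v a="$1" -v b="$2" 'BEGIN {
        if (a ~ / $/ || b ~ /^ /)
            print a b
        else
            print a " " b
    }'
}

join_fragments "This is the description" "of the item in"
```

Applied to the first two lines of the example, this prints "This is the description of the item in" rather than "This is the descriptionof the item in".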
 
e033343 (Author) commented:
Should be multiple lines, instead of two lines.
 
Murugesan Nagarajan, Subject-matter expert at delivery, implementation, and automation at UNIX oriented operating systems (Windows: CYGWIN_NT MINGW32_NT MINGW64_NT), commented:
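# Execute the following command:
awk 'BEGIN { spaceInPreviousLine = -1; ATTRIBUTEsentence = "" }
{
    if (spaceInPreviousLine == -1)
    {
        # Not inside an element: pass ordinary lines through unchanged.
        # The opening tag is expected at the very start of the line.
        if ($0 !~ /^<ATTRIBUTE[ >]/)
        {
            OtherLines = $0
            print OtherLines
            next
        }
        ATTRIBUTEsentence = $0
    }
    else
    {
        # Continuation line: add a space only if neither side has one.
        if (spaceInPreviousLine == 0 && substr($0, 1, 1) != " ")
        {
            currSentence = " " $0
        }
        else
        {
            currSentence = $0
        }
        ATTRIBUTEsentence = ATTRIBUTEsentence currSentence
    }
    if (ATTRIBUTEsentence ~ /<\/ATTRIBUTE>[ \t]*$/)
    {
        # Closing tag reached: print the joined line.
        print ATTRIBUTEsentence
        spaceInPreviousLine = -1
    }
    else
    {
        # Remember whether this line ended with a space (1) or not (0).
        spaceInPreviousLine = (substr($0, length($0), 1) == " ") ? 1 : 0
    }
}' XMLfile > RequiredFileName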

http://www.geocities.com/mukeshgct/technical/shellscripting/awkATTRIBUTE.html
 
e033343 (Author) commented:
I am working with this code. I think I will be able to modify it to get what I want. Thanks for the help.
 
Murugesan Nagarajan, Subject-matter expert at delivery, implementation, and automation at UNIX oriented operating systems (Windows: CYGWIN_NT MINGW32_NT MINGW64_NT), commented:
Let us know if any further changes are required for this code.

For the following file:
##################
testing12
testing11
testing11
<ATTRIBUTE NAME="Description"> First
 of the item in
 of the item in
 of the item in
of the item in
of the item in
 of the item in
of the item in
 of the item in
 of the item in
of the item in
question.</ATTRIBUTE>
testing9
testing8
testing7
testing7
<ATTRIBUTE NAME="Description">Second
of the item in
question.</ATTRIBUTE>
testing5
testing4
testing3
testing3
<ATTRIBUTE NAME="Description"> Third
of the item in
question.</ATTRIBUTE>
testing1
##################


this code will have the following output:
##################
testing12
testing11
testing11
<ATTRIBUTE NAME="Description"> First of the item in of the item in of the item in of the item in of the item in of the item in of the item in of the item in of the item in of the item in question.</ATTRIBUTE>
testing9
testing8
testing7
testing7
<ATTRIBUTE NAME="Description">Second of the item in question.</ATTRIBUTE>
testing5
testing4
testing3
testing3
<ATTRIBUTE NAME="Description"> Third of the item in question.</ATTRIBUTE>
testing1
##################
 
e033343 (Author) commented:
The code works great for the data in your test file. My actual data, however, uses leading spaces to identify sections. Here is an example of how the data actually looks:
<?xml version="1.0" encoding="US-ASCII"?><OBJECT>
 <OBJECTDATA>
  <CONTROLAREA>
   <BSR>
    <VERB>Save</VERB>
    <NOUN>DIB</NOUN>
    <INTERFACEID>PDMCSAP</INTERFACEID>
   </BSR>
   <SOURCE LOCATION="ABC" DIRECTION="OUT" REQUESTERID="GHI">
    <AUTHID>PDMCSAP</AUTHID>
    <DATE>
     <MONTH>11</MONTH>
     <DAY>05</DAY>
     <YEAR>2008</YEAR>
    </DATE>
    <TIME>13:20</TIME>
   </SOURCE>
  </CONTROLAREA>
  <DATAAREA>
   <PART NAME="ABC" REVISION="Z" VAULT="PPP" POLICY="X" STATE="A" TYPE="Part" NEWDESIGN="Yes">
    <ATTRIBUTELIST>
     <ATTRIBUTE NAME="End Item">XX</ATTRIBUTE>
     <ATTRIBUTE NAME="RoHS Compliant">Unassigned</ATTRIBUTE>
     <ATTRIBUTE NAME="Method of Classification">ODI</ATTRIBUTE>
     <ATTRIBUTE NAME="Controlling Document Revision"></ATTRIBUTE>
     <ATTRIBUTE NAME="Unit of Measure">TTT</ATTRIBUTE>
     <ATTRIBUTE NAME="Op Code">Unknown</ATTRIBUTE>
     <ATTRIBUTE NAME="Serial Indicator">No traceability</ATTRIBUTE>
     <ATTRIBUTE NAME="Material Type">HALB Semi Finished Products</ATTRIBUTE>
     <ATTRIBUTE NAME="ODA Cage Code"></ATTRIBUTE>
     <ATTRIBUTE NAME="Tooling Cross-Reference"></ATTRIBUTE>
     <ATTRIBUTE NAME="Vendor Part and CAGE Code"></ATTRIBUTE>
     <ATTRIBUTE NAME="Related Technology Export Classification"></ATTRIBUTE>
     <ATTRIBUTE NAME="Release Status">A</ATTRIBUTE>
     <ATTRIBUTE NAME="Special Conditions"></ATTRIBUTE>
     <ATTRIBUTE NAME="Spare Part">A</ATTRIBUTE>
     <ATTRIBUTE NAME="Description">TEST
TO CONCATENATE
LINES</ATTRIBUTE>
     <ATTRIBUTE NAME="Weight Unit">X</ATTRIBUTE>
     <ATTRIBUTE NAME="Print Code">Unknown</ATTRIBUTE>
     <ATTRIBUTE NAME="Originator"></ATTRIBUTE>
     <ATTRIBUTE NAME="Pb-Free">J</ATTRIBUTE>
    </ATTRIBUTELIST>
   </PART>
  </DATAAREA>
 </OBJECTDATA>
</OBJECT>
 
As you can see, the only lines without leading spaces are the first line, the last line, and the continuation lines of the data I am trying to fix.
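
One way to confirm which elements are affected is to list the ATTRIBUTE lines that open an element without closing it on the same line. The following is a minimal sketch; `list_unclosed_attributes` is a hypothetical helper name, and it assumes every opening tag has the form `<ATTRIBUTE ...>` with at most leading spaces or tabs in front of it.

```shell
# list_unclosed_attributes is a hypothetical helper name. It prints the
# line number and text of every ATTRIBUTE line that opens an element
# without closing it on the same line. It reads the files given as
# arguments, or standard input when no file is given.
list_unclosed_attributes() {
    awk '/^[ \t]*<ATTRIBUTE / && !/<\/ATTRIBUTE>/ { print NR ": " $0 }' "$@"
}
```

Running `list_unclosed_attributes XMLFileName` on a file like the one above should report only the Description line, since `<ATTRIBUTELIST>` does not match the pattern.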
 
Thank you for your help.
 
Murugesan Nagarajan, Subject-matter expert at delivery, implementation, and automation at UNIX oriented operating systems (Windows: CYGWIN_NT MINGW32_NT MINGW64_NT), commented:
The following will work as expected:

awk 'BEGIN { previousLineAttribute = 0 }
{
      currLine = $0
      if (previousLineAttribute == 0)
      {
            # Skip leading spaces and tabs to find where the tag starts.
            attributeLine = $0
            sub(/^[ \t]+/, "", attributeLine)
            if (substr(attributeLine, 1, 11) == "<ATTRIBUTE ")
            {
                  attributeLineLen = length(attributeLine)
                  if (substr(attributeLine, attributeLineLen - 11) == "</ATTRIBUTE>")
                  {
                        # Opening and closing tags are on the same line.
                        print $0
                  }
                  else
                  {
                        # The value continues on the following lines.
                        previousLineAttribute = 1
                        attributeLine = $0
                  }
            }
            else
            {
                  print $0
            }
      }
      else
      {
            tmpCurrLine = currLine
            # Add a separating space unless either side already has one.
            if (substr(currLine, 1, 1) != " " && substr(attributeLine, length(attributeLine)) != " ")
            {
                  tmpCurrLine = " " currLine
            }
            attributeLine = attributeLine tmpCurrLine
            attributeLineLen = length(attributeLine)
            # The joined line is complete once it ends with the closing tag.
            if (substr(attributeLine, attributeLineLen - 11) == "</ATTRIBUTE>")
            {
                  print attributeLine
                  previousLineAttribute = 0
            }
      }
}' XMLFileName > ChangedFileName




http://www.geocities.com/mukeshgct/technical/shellscripting/awkATTRIBUTE.html
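
For comparison, the same idea can be written more compactly with pattern matching instead of character-by-character checks. This is a sketch under stated assumptions, not the code from the link above: `join_attribute_lines` is a hypothetical name, opening tags are assumed to look like `<ATTRIBUTE ...>` after optional leading spaces or tabs, and the closing `</ATTRIBUTE>` is assumed never to be split across lines.

```shell
# join_attribute_lines is a hypothetical name. It reads XML-like text on
# standard input and joins every ATTRIBUTE element that spans several
# lines into a single line.
join_attribute_lines() {
    awk '
    # Not inside an element: start buffering only if an ATTRIBUTE tag
    # opens on this line without closing on the same line.
    !inside {
        if ($0 ~ /^[ \t]*<ATTRIBUTE / && $0 !~ /<\/ATTRIBUTE>/) {
            buf = $0
            inside = 1
        } else {
            print
        }
        next
    }
    # Inside an element: append the line, adding a space only when
    # neither side of the join already has one.
    {
        if (buf ~ / $/ || $0 ~ /^ /)
            buf = buf $0
        else
            buf = buf " " $0
        if ($0 ~ /<\/ATTRIBUTE>/) {
            print buf
            inside = 0
        }
    }
    # Flush an element that was opened but never closed.
    END { if (inside) print buf }
    '
}

printf '%s\n' '  <ATTRIBUTE NAME="Description">TEST' 'TO CONCATENATE' 'LINES</ATTRIBUTE>' | join_attribute_lines
```

It could be run as `join_attribute_lines < XMLFileName > ChangedFileName`. Lines outside a multi-line ATTRIBUTE element pass through unchanged, so the indentation of the rest of the file is preserved.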
 
e033343 (Author) commented:
Thanks for your help.