How do I join multiple lines in a text file, using simple criteria to pick the lines to be joined, while making certain the data keeps the correct spacing?

I am working with an interface that passes data from one system to another in XML. One of the systems has a description that can have multiple lines in freeform text. The text is delimited by <ATTRIBUTE NAME="Description"> at the beginning and </ATTRIBUTE> at the end. An example of what a file might look like is this:

        <ATTRIBUTE NAME="Description"> This is the description
of the item in
question. </ATTRIBUTE>

What I want it to end up looking like is the following:

        <ATTRIBUTE NAME="Description"> This is the description of the item in question. </ATTRIBUTE>

Spaces may or may not need to be added to make the data correct. If the lines were simply joined, the above data would look like this:

        <ATTRIBUTE NAME="Description"> This is the descriptionof the item inquestion. </ATTRIBUTE>

So, I also need to make certain that spaces are in the appropriate locations after the lines are joined.
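The spacing problem is easy to reproduce. A minimal sketch, assuming only a POSIX shell and awk, and using just the three sample lines from above without their tags (the variable names are made up for illustration): joining with nothing in between glues the words together, while joining with a single space gives the wanted text.

```shell
# The three description lines from the example, without the surrounding tags.
lines='This is the description
of the item in
question.'

# Naive join: print every line back to back with no separator.
naive=$(printf '%s\n' "$lines" | awk '{ printf "%s", $0 } END { print "" }')

# Join with one space before every line except the first.
spaced=$(printf '%s\n' "$lines" | awk '{ printf "%s%s", (NR > 1 ? " " : ""), $0 } END { print "" }')

printf '%s\n%s\n' "$naive" "$spaced"
```

The first printed line shows the run-together words ("descriptionof", "inquestion"); the second keeps them apart.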
e033343 Asked:
e033343 (Author) Commented:
That should be multiple lines, not just two lines.
murugesandins Commented:
# Execute the following command:
awk 'BEGIN { inAttribute = 0; ATTRIBUTEsentence = "" }
{
    if (!inAttribute && substr($0, 1, 10) == "<ATTRIBUTE")
    {
        # "<ATTRIBUTE" is exactly 10 characters, so this only matches a tag
        # that starts in the first column of the line.
        ATTRIBUTEsentence = $0
        inAttribute = 1
    }
    else if (inAttribute)
    {
        # Continuation line: add a space unless the line already starts with one.
        if (substr($0, 1, 1) != " ")
        {
            currSentence = " " $0
        }
        else
        {
            currSentence = $0
        }
        ATTRIBUTEsentence = ATTRIBUTEsentence currSentence
    }
    else
    {
        # Any other line is passed through unchanged.
        print $0
        next
    }
    # The closing tag at the end of a line completes the element.
    if (ATTRIBUTEsentence ~ /<\/ATTRIBUTE>[ \t]*$/)
    {
        print ATTRIBUTEsentence
        inAttribute = 0
    }
}' XMLfile > RequiredFileName

http://www.geocities.com/mukeshgct/technical/shellscripting/awkATTRIBUTE.html
e033343 (Author) Commented:
I am working with this code. I think I will be able to modify it to get what I want. Thanks for the help.
murugesandins Commented:
Let us know if any further changes are required for this code.

For the following file:
##################
testing12
testing11
testing11
 First
 of the item in
 of the item in
 of the item in
of the item in
of the item in
 of the item in
of the item in
 of the item in
 of the item in
of the item in
question.
testing9
testing8
testing7
testing7
Second
of the item in
question.
testing5
testing4
testing3
testing3
 Third
of the item in
question.
testing1
##################


This code will produce the following output:
##################
testing12
testing11
testing11
 First of the item in of the item in of the item in of the item in of the item in of the item in of the item in of the item in of the item in of the item in question.
testing9
testing8
testing7
testing7
Second of the item in question.
testing5
testing4
testing3
testing3
 Third of the item in question.
testing1
##################
e033343 (Author) Commented:
The code works great for the data in your test file. However, my data is indented with leading spaces to identify sections. Here is an example of how the data actually looks:
<?xml version="1.0" encoding="US-ASCII"?><OBJECT>
 <OBJECTDATA>
  <CONTROLAREA>
   <BSR>
    <VERB>Save</VERB>
    <NOUN>DIB</NOUN>
    <INTERFACEID>PDMCSAP</INTERFACEID>
   </BSR>
   <SOURCE LOCATION="ABC" DIRECTION="OUT" REQUESTERID="GHI">
    <AUTHID>PDMCSAP</AUTHID>
    <DATE>
     <MONTH>11</MONTH>
     <DAY>05</DAY>
     <YEAR>2008</YEAR>
    </DATE>
    <TIME>13:20</TIME>
   </SOURCE>
  </CONTROLAREA>
  <DATAAREA>
   <PART NAME="ABC" REVISION="Z" VAULT="PPP" POLICY="X" STATE="A" TYPE="Part" NEWDESIGN="Yes">
    <ATTRIBUTELIST>
     <ATTRIBUTE NAME="End Item">XX</ATTRIBUTE>
     <ATTRIBUTE NAME="RoHS Compliant">Unassigned</ATTRIBUTE>
     <ATTRIBUTE NAME="Method of Classification">ODI</ATTRIBUTE>
     <ATTRIBUTE NAME="Controlling Document Revision"></ATTRIBUTE>
     <ATTRIBUTE NAME="Unit of Measure">TTT</ATTRIBUTE>
     <ATTRIBUTE NAME="Op Code">Unknown</ATTRIBUTE>
     <ATTRIBUTE NAME="Serial Indicator">No traceability</ATTRIBUTE>
     <ATTRIBUTE NAME="Material Type">HALB Semi Finished Products</ATTRIBUTE>
     <ATTRIBUTE NAME="ODA Cage Code"></ATTRIBUTE>
     <ATTRIBUTE NAME="Tooling Cross-Reference"></ATTRIBUTE>
     <ATTRIBUTE NAME="Vendor Part and CAGE Code"></ATTRIBUTE>
     <ATTRIBUTE NAME="Related Technology Export Classification"></ATTRIBUTE>
     <ATTRIBUTE NAME="Release Status">A</ATTRIBUTE>
     <ATTRIBUTE NAME="Special Conditions"></ATTRIBUTE>
     <ATTRIBUTE NAME="Spare Part">A</ATTRIBUTE>
     <ATTRIBUTE NAME="Description">TEST
TO CONCATENATE
LINES</ATTRIBUTE>
     <ATTRIBUTE NAME="Weight Unit">X</ATTRIBUTE>
     <ATTRIBUTE NAME="Print Code">Unknown</ATTRIBUTE>
     <ATTRIBUTE NAME="Originator"></ATTRIBUTE>
     <ATTRIBUTE NAME="Pb-Free">J</ATTRIBUTE>
    </ATTRIBUTELIST>
   </PART>
  </DATAAREA>
 </OBJECTDATA>
</OBJECT>
 
As you can see, only the first line, the last line, and the lines I am trying to fix have no spaces at the beginning.
 
Thank you for your help.
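Before changing anything, it can help to list which elements are actually split across lines. A minimal sketch, assuming every element opens with `<ATTRIBUTE ` (tag name followed by a space) and closes with `</ATTRIBUTE>` as in the data above; the function name `find_split_attributes` is hypothetical, and the input is just a short excerpt of that data.

```shell
# Hypothetical helper: print "line number: text" for every line that opens an
# ATTRIBUTE element without closing it on the same line.
find_split_attributes() {
    awk '/<ATTRIBUTE / && !/<\/ATTRIBUTE>/ { print NR ": " $0 }' "$@"
}

# Short excerpt of the data above, used only as a demonstration.
split_report=$(find_split_attributes <<'EOF'
     <ATTRIBUTE NAME="Spare Part">A</ATTRIBUTE>
     <ATTRIBUTE NAME="Description">TEST
TO CONCATENATE
LINES</ATTRIBUTE>
     <ATTRIBUTE NAME="Weight Unit">X</ATTRIBUTE>
EOF
)
printf '%s\n' "$split_report"
```

Only the Description line is reported, because it is the only opening tag whose closing tag is on a later line. The trailing space in the pattern keeps `<ATTRIBUTELIST>` from matching.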
murugesandins Commented:
The following will work as expected:

awk 'BEGIN { previousLineAttribute = 0 }
{
      currLine = $0
      if (previousLineAttribute == 0)
      {
            # Ignore leading spaces and tabs so that indented tags are recognized.
            attributeLine = currLine
            sub(/^[ \t]+/, "", attributeLine)
            if (substr(attributeLine, 1, 11) == "<ATTRIBUTE " && attributeLine !~ /<\/ATTRIBUTE>[ \t]*$/)
            {
                  # Opening tag without its closing tag: start joining,
                  # keeping the original indentation of the first line.
                  previousLineAttribute = 1
                  attributeLine = currLine
            }
            else
            {
                  # Complete element or any other line: print unchanged.
                  print currLine
            }
      }
      else
      {
            # Continuation line: add a space unless the line already starts with one.
            tmpCurrLine = currLine
            if (substr(currLine, 1, 1) != " ")
            {
                  tmpCurrLine = " " currLine
            }
            attributeLine = attributeLine tmpCurrLine
            # The closing tag at the end of the line completes the element.
            if (attributeLine ~ /<\/ATTRIBUTE>[ \t]*$/)
            {
                  print attributeLine
                  previousLineAttribute = 0
            }
      }
}' XMLFileName > ChangedFileName




http://www.geocities.com/mukeshgct/technical/shellscripting/awkATTRIBUTE.html
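To try the approach without touching real data, here is a compact, self-contained sketch of the same idea, run on a short excerpt of the sample posted above. It assumes that an element opens with `<ATTRIBUTE ` and that `</ATTRIBUTE>` is the last thing on the closing line; the function name `join_attribute_lines` is hypothetical and not part of the script above.

```shell
# Hypothetical helper: join every multi-line ATTRIBUTE element onto one line.
join_attribute_lines() {
    awk '
    joining {
        # Continuation line: add a space unless the line already starts with one.
        buf = buf (substr($0, 1, 1) == " " ? "" : " ") $0
        if (buf ~ /<\/ATTRIBUTE>[ \t]*$/) { print buf; joining = 0 }
        next
    }
    /^[ \t]*<ATTRIBUTE / && !/<\/ATTRIBUTE>[ \t]*$/ {
        # Opening tag with no closing tag on the same line: start collecting.
        buf = $0; joining = 1
        next
    }
    { print }                       # everything else passes through unchanged
    END { if (joining) print buf }  # do not lose an element that never closes
    ' "$@"
}

joined=$(join_attribute_lines <<'EOF'
     <ATTRIBUTE NAME="Spare Part">A</ATTRIBUTE>
     <ATTRIBUTE NAME="Description">TEST
TO CONCATENATE
LINES</ATTRIBUTE>
     <ATTRIBUTE NAME="Weight Unit">X</ATTRIBUTE>
EOF
)
printf '%s\n' "$joined"
```

The three Description lines come out as one line with single spaces between the pieces, and the other two lines are unchanged. The same function can also be given a file name, for example `join_attribute_lines XMLFileName > ChangedFileName`.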
e033343 (Author) Commented:
Thanks for your help.