Solved

XML - reformat with end node name </XXXX> for each

Posted on 2014-12-23
Last Modified: 2015-01-17
Hi

I need to add an explicit end tag (I think that is what it's called) to each node in the XML file.

I was able to do this using XML::TreeBuilder, but I found out I can't install any Perl modules on the PC, so no extra modules.
I do already have XML::Smart and XML::Simple on the PC, though. I don't know if that helps.

Example

Before

<JOB APPLICATION="TEST_00002" APPL_TYPE="OS" >
      <RULE_BASED_CALENDARS NAME="*"/>
      <QUANTITATIVE NAME="TEST99990" ONFAIL="R" ONOK="R" QUANT="1"/>
      <QUANTITATIVE NAME="TEST99991" ONFAIL="R" ONOK="R" QUANT="1"/>
    </JOB>


needs to be

<JOB APPLICATION="TEST_00002" APPL_TYPE="OS" >
      <RULE_BASED_CALENDARS NAME="*"/></RULE_BASED_CALENDARS>
      <QUANTITATIVE NAME="TEST99990" ONFAIL="R" ONOK="R" QUANT="1"/></QUANTITATIVE>
      <QUANTITATIVE NAME="TEST99991" ONFAIL="R" ONOK="R" QUANT="1"/></QUANTITATIVE>
    </JOB>



Thanks
Question by:mikeysmailbox1

Accepted Solution

by:
wilcoxon earned 334 total points
ID: 40515187
Why? The first is perfectly valid XML. Also, the second is invalid XML: you need to remove the / (changing QUANT="1"/> to QUANT="1">).

This should do what you want provided each XML element is on one line (and not split across lines).  If it gives weird results, try reversing $1 and $2 (I always forget which order they go in when nested).

perl -i.bak -pe 's{(<(\w+)\b[^>]+)/>}{$1></$2>}g' input.xml


Expert Comment

by:wilcoxon
ID: 40515193
If you are unfamiliar with XML, the following two lines are equivalent:

<QUANTITATIVE NAME="something"/>
<QUANTITATIVE NAME="something"></QUANTITATIVE>

The /> at the end of the first line acts as a shortcut to avoid having to do the second line.  Further, the first line is the preferred form (you only need an explicit end tag if the element itself has a value such as <QUANTITATIVE>something</QUANTITATIVE>).

Assisted Solution

by:ozo
ozo earned 166 total points
ID: 40515199
perl -i.bak -pe 's#(<(\w+)[^>]*/>)(</$2>)?#$1</$2>#g' file.xml

Assisted Solution

by:wilcoxon
wilcoxon earned 334 total points
ID: 40537951
Both ozo's and my answer have minor problems.

Mine:
Will not work if the start tag does not have any attributes (I used + instead of *).
Will produce invalid XML if there is already an end tag (it will cause there to be two end tags).  However, the XML was already invalid if it had both <tag/> and </tag> anyway.

Ozo's:
Will produce invalid XML if there is not an end tag (the /> closer is left in as well as adding an end tag) which is the primary case you are asking about.

Here's a combined regex that fixes all issues I see:
perl -i.bak -pe 's{(<(\w+)[^>]*)/>(?:</\2>)?}{$1></$2>}g' input.xml
