XML - reformat with end node name </XXXX> for each

Hi

I need to add the end name to each node (I think that is what it's called) in the XML file.

I was able to do this using XML::TreeBuilder, but I found out I cannot install any Perl modules on the PC, so no new modules.
I do already have XML::Smart and XML::Simple on the PC, though. I don't know if that helps.

Example

Before

<JOB APPLICATION="TEST_00002" APPL_TYPE="OS" >
      <RULE_BASED_CALENDARS NAME="*"/>
      <QUANTITATIVE NAME="TEST99990" ONFAIL="R" ONOK="R" QUANT="1"/>
      <QUANTITATIVE NAME="TEST99991" ONFAIL="R" ONOK="R" QUANT="1"/>
    </JOB>


needs to be

<JOB APPLICATION="TEST_00002" APPL_TYPE="OS" >
      <RULE_BASED_CALENDARS NAME="*"/></RULE_BASED_CALENDARS>
      <QUANTITATIVE NAME="TEST99990" ONFAIL="R" ONOK="R" QUANT="1"/></QUANTITATIVE>
      <QUANTITATIVE NAME="TEST99991" ONFAIL="R" ONOK="R" QUANT="1"/></QUANTITATIVE>
    </JOB>



Thanks
mikeysmailbox1 Asked:
 
wilcoxon Commented:
Why? The first is perfectly valid XML. Also, the second is not well-formed XML: you would need to remove the / (changing QUANT="1"/> to QUANT="1">).

This should do what you want, provided each XML element is on one line (and not split across lines). Capture groups are numbered by the position of their opening parenthesis, so $1 is the outer group (the tag up to, but not including, the closing />) and $2 is the nested group (the tag name).

perl -i.bak -pe 's{(<(\w+)\b[^>]+)/>}{$1></$2>}g' input.xml

wilcoxon Commented:
If you are unfamiliar with XML, the following two lines are equivalent:

<QUANTITATIVE NAME="something"/>
<QUANTITATIVE NAME="something"></QUANTITATIVE>

The /> at the end of the first line acts as a shortcut to avoid having to do the second line.  Further, the first line is the preferred form (you only need an explicit end tag if the element itself has a value such as <QUANTITATIVE>something</QUANTITATIVE>).
 
ozo Commented:
perl -i.bak -pe 's#(<(\w+)[^>]*/>)(</$2>)?#$1</$2>#g' file.xml
 
wilcoxon Commented:
Both ozo's and my answer have minor problems.

Mine:
Will not work if the start tag does not have any attributes (I used + instead of *).
Will produce invalid XML if there is already an end tag (it will cause there to be two end tags).  However, the XML was already invalid if it both had <tag/> and </tag> anyway.

Ozo's:
Will produce invalid XML if there is not an end tag (the /> closer is left in as well as adding an end tag) which is the primary case you are asking about.

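Here's a combined regex that fixes all the issues I see (inside the pattern the backreference is written \2, because $2 there would be interpolated from an earlier match instead of referring to the current one):
perl -i.bak -pe 's{(<(\w+)[^>]*)/>(?:</\2>)?}{$1></$2>}g' input.xml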
