Solved

XML - reformat with end node name </XXXX> for each

Posted on 2014-12-23
Last Modified: 2015-01-17
Hi

I need to add the end name to each node (I think that is what it's called) in the XML file.

I was able to do this using XML::TreeBuilder, but I found out I cannot install any Perl modules on the PC, so no extra modules. I do already have XML::Smart and XML::Simple on the PC, though. I don't know if that helps.

Example

Before

<JOB APPLICATION="TEST_00002" APPL_TYPE="OS" >
      <RULE_BASED_CALENDARS NAME="*"/>
      <QUANTITATIVE NAME="TEST99990" ONFAIL="R" ONOK="R" QUANT="1"/>
      <QUANTITATIVE NAME="TEST99991" ONFAIL="R" ONOK="R" QUANT="1"/>
    </JOB>


needs to be

<JOB APPLICATION="TEST_00002" APPL_TYPE="OS" >
      <RULE_BASED_CALENDARS NAME="*"/></RULE_BASED_CALENDARS>
      <QUANTITATIVE NAME="TEST99990" ONFAIL="R" ONOK="R" QUANT="1"/></QUANTITATIVE>
      <QUANTITATIVE NAME="TEST99991" ONFAIL="R" ONOK="R" QUANT="1"/></QUANTITATIVE>
    </JOB>



Thanks
Question by:mikeysmailbox1
5 Comments
 
LVL 26

Accepted Solution

by:
wilcoxon earned 334 total points
ID: 40515187
Why? The first is perfectly valid XML. Also, the second is invalid XML: you would need to remove the /, changing QUANT="1"/> to QUANT="1">.

This should do what you want, provided each XML element is on one line (and not split across lines). Capture groups are numbered by the order of their opening parentheses, so $1 is the outer group (the tag up to the />) and $2 is the element name.

perl -i.bak -pe 's{(<(\w+)\b[^>]+)/>}{$1></$2>}g' input.xml

 
LVL 26

Expert Comment

by:wilcoxon
ID: 40515193
If you are unfamiliar with XML, the following two lines are equivalent:

<QUANTITATIVE NAME="something"/>
<QUANTITATIVE NAME="something"></QUANTITATIVE>

The /> at the end of the first line acts as a shortcut to avoid having to do the second line.  Further, the first line is the preferred form (you only need an explicit end tag if the element itself has a value such as <QUANTITATIVE>something</QUANTITATIVE>).
 
LVL 84

Assisted Solution

by:ozo
ozo earned 166 total points
ID: 40515199
perl -i.bak -pe 's#(<(\w+)[^>]*/>)(</$2>)?#$1</$2>#g' file.xml
 
LVL 26

Assisted Solution

by:wilcoxon
wilcoxon earned 334 total points
ID: 40537951
Both ozo's and my answer have minor problems.

Mine:
Will not work if the start tag does not have any attributes (I used + instead of *).
Will produce invalid XML if there is already an end tag (there will then be two end tags). However, the XML was already invalid if it had both <tag/> and </tag>.

Ozo's:
Will produce invalid XML if there is not an end tag (the /> closer is left in as well as adding an end tag) which is the primary case you are asking about.

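Here's a combined regex that fixes all the issues I see (inside the pattern the backreference is written \2, because $2 there would interpolate the capture from a previous match):
perl -i.bak -pe 's{(<(\w+)[^>]*)/>(?:</\2>)?}{$1></$2>}g' input.xml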

