Solved

Replace colon characters in XML tag attribute

Posted on 2015-01-27
Last Modified: 2015-01-27
Hi All,

I am converting XML to a JavaScript object in JavaScript. I am 95% there, using an external library for the bulk of the work. However, I have noticed that some of the tag names include characters, such as the colon (:), that are illegal if I want to access their respective properties in JavaScript using dot notation.

Here is an example of my XML and the output in JSON:

XML

<c:strRef>
    <c:f>SheetTwo!$A$1</c:f>
    <c:strCache>
        <c:ptCount val="1"/>
        <c:pt idx="0">
            <c:v>Hits</c:v>
        </c:pt>
    </c:strCache>
</c:strRef>


JSON

{
  "c:strRef": {
    "c:f": "SheetTwo!$A$1",
    "c:strCache": {
      "c:ptCount": {
        "val": 1
      },
      "c:pt": {
        "idx": 0,
        "c:v": "Hits"
      }
    }
  }
}


What I would like to do in JavaScript is search for and remove the colons from tag names only in my XML string variable. Colons in tag values must not be replaced (i.e. content should be preserved).

Could anyone please suggest a solution?

Thank you,

Rit
Question by:rito1
15 Comments
 
LVL 44

Expert Comment

by:Rainer Jeschor
ID: 40572651
Hi,
this should be possible using "simple" replace.
Question: which library are you using to convert XML to JSON?
We will have to adjust this during the processing of the XML.

The colon in XML is not illegal as it is used to address namespaces inside the XML.
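
As a side note, a colon in a key only rules out dot notation; bracket notation still works on the converted object. A minimal sketch, using a hand-written object shaped like the JSON in the question (the variable names are just for illustration):

```javascript
// Object shaped like the JSON output shown in the question.
const parsed = {
  "c:strRef": {
    "c:f": "SheetTwo!$A$1",
    "c:strCache": {
      "c:ptCount": { val: 1 },
      "c:pt": { idx: 0, "c:v": "Hits" }
    }
  }
};

// A key containing a colon cannot be written after a dot,
// but it can be passed as a string inside brackets.
const formula = parsed["c:strRef"]["c:f"];
const firstValue = parsed["c:strRef"]["c:strCache"]["c:pt"]["c:v"];

console.log(formula);    // SheetTwo!$A$1
console.log(firstValue); // Hits
```

If bracket notation is acceptable in your code, the XML may not need to be changed at all.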

HTH
Rainer
 
LVL 1

Author Comment

by:rito1
ID: 40572742
Hi Rainer,

Illegal was probably a strong word in my context :-)

I'm using a node package called xml2json.

I read my XML file into a variable and call toString() on it.

I was using the following, but I think it is a little too aggressive as it will replace every colon within my XML rather than just the colons within my XML tags.

myXml.toString().replace(/r:/g, '')

Thanks,
 
LVL 44

Expert Comment

by:Rainer Jeschor
ID: 40572752
Hi,
OK, that regex might be a little bit too aggressive :-)

An easy fix would be to replace:
"<c:" with "<c_" and
"</c:" with "</c_"
Then you still have a valid XML document which should then be parsed successfully.
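
A minimal sketch of that replacement, assuming every prefixed tag uses the "c" prefix. The sample string is adapted from the snippet in the question, with a colon added to the text content to show that content is left alone:

```javascript
// Sample input: tag names carry the "c:" prefix, and the text content
// also contains a colon that must survive.
const xml = '<c:pt idx="0"><c:v>Hits: 1</c:v></c:pt>';

// Rewrite only the "<c:" and "</c:" sequences, so colons elsewhere are untouched.
const renamed = xml
  .replace(/<c:/g, "<c_")
  .replace(/<\/c:/g, "</c_");

console.log(renamed);
// <c_pt idx="0"><c_v>Hits: 1</c_v></c_pt>
```

The resulting keys (c_pt, c_v) are valid identifiers, so dot notation works on the converted object.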

HTH
Rainer

 
LVL 1

Author Comment

by:rito1
ID: 40572788
Rainer, I appreciate your help.

I had only given you a snippet of my XML... my tags have different forms, e.g. they may not start with <c:. Ideally I would want to detect and remove just the colons rather than matching longer strings.

thanks,
Rit
 
LVL 44

Expert Comment

by:Rainer Jeschor
ID: 40572898
Hi,
OK, this should do the trick:
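// Groups: (1) "<" or "</" plus the prefix, (2) the colon, (3) the local name plus ">"
var regexTagOpen = /(<[a-z]+)(:)([a-z]+>)/gi;
var regexTagClose = /(<\/[a-z]+)(:)([a-z]+>)/gi;
var inputstring = "<a:test>just</a:test><even:more>Testing::has:to:come</even:more>";
// Keep groups 1 and 3, dropping the colon captured in group 2
var newOut = inputstring.replace(regexTagOpen, "$1$3");
newOut = newOut.replace(regexTagClose, "$1$3");
// newOut: <atest>just</atest><evenmore>Testing::has:to:come</evenmore>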


It uses capture groups to drop the colon from both the opening and the closing tags, keeping the text on either side of it (i.e. the colon is replaced with "").
Online test ground:
http://jsfiddle.net/EE_RainerJ/td88ay27/

HTH
Rainer
 
LVL 1

Author Comment

by:rito1
ID: 40573101
Hi Rainer

This XML is failing

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<Properties xmlns="http://schemas.openxmlformats.org/officeDocument/2006/extended-properties"
            xmlns:vt="http://schemas.openxmlformats.org/officeDocument/2006/docPropsVTypes">
    <Application>Microsoft Macintosh Excel</Application>
    <DocSecurity>0</DocSecurity>
    <ScaleCrop>false</ScaleCrop>
    <HeadingPairs>
        <vt:vector size="2" baseType="variant">
            <vt:variant>
                <vt:lpstr>Worksheets</vt:lpstr>
            </vt:variant>
            <vt:variant>
                <vt:i4>4</vt:i4>
            </vt:variant>
        </vt:vector>
    </HeadingPairs>
    <TitlesOfParts>
        <vt:vector size="4" baseType="lpstr">
            <vt:lpstr>SheetUno</vt:lpstr>
            <vt:lpstr>SheetTwo</vt:lpstr>
            <vt:lpstr>AnotherSheet</vt:lpstr>
            <vt:lpstr>fourth sheet</vt:lpstr>
        </vt:vector>
    </TitlesOfParts>
    <LinksUpToDate>false</LinksUpToDate>
    <SharedDoc>false</SharedDoc>
    <HyperlinksChanged>false</HyperlinksChanged>
    <AppVersion>14.0300</AppVersion>
</Properties>


Can you see why?... is it because within 1 XML tag there are multiple colons?

Thanks,

Rit
 
LVL 1

Author Comment

by:rito1
ID: 40573106
Squashed for convenience here...

<?xml version="1.0" encoding="UTF-8" standalone="yes"?><Properties xmlns="http://schemas.openxmlformats.org/officeDocument/2006/extended-properties" xmlns:vt="http://schemas.openxmlformats.org/officeDocument/2006/docPropsVTypes"><Application>Microsoft Macintosh Excel</Application><DocSecurity>0</DocSecurity><ScaleCrop>false</ScaleCrop><HeadingPairs><vt:vector size="2" baseType="variant"><vt:variant><vt:lpstr>Worksheets</vt:lpstr></vt:variant><vt:variant><vt:i4>4</vt:i4></vt:variant></vt:vector></HeadingPairs><TitlesOfParts><vt:vector size="4" baseType="lpstr"><vt:lpstr>SheetUno</vt:lpstr><vt:lpstr>SheetTwo</vt:lpstr><vt:lpstr>AnotherSheet</vt:lpstr><vt:lpstr>fourth sheet</vt:lpstr></vt:vector></TitlesOfParts><LinksUpToDate>false</LinksUpToDate><SharedDoc>false</SharedDoc><HyperlinksChanged>false</HyperlinksChanged><AppVersion>14.0300</AppVersion></Properties>

 
LVL 1

Author Comment

by:rito1
ID: 40573128
<vt:vector size="2" baseType="variant">...


The colon in the above isn't being replaced.
 
LVL 44

Expert Comment

by:Rainer Jeschor
ID: 40573231
Hi,
I think I got it now. One of the elements had some extension to it.
I updated my jsFiddle:
http://jsfiddle.net/EE_RainerJ/rb3q9cc4/

The major change is the new regexes:
var regexTagOpen = /(<\w+)(:)([\w\s="]+>)/gi;
var regexTagClose = /(<\/\w+)(:)(\w+>)/gi;
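
To illustrate the difference, here is a small comparison on the tag that failed. The earlier pattern expects ">" directly after the tag name, so a tag with attributes never matches; the updated pattern also allows word characters, whitespace, "=" and double quotes before the ">". The variable names below are only for this sketch:

```javascript
var tag = '<vt:vector size="2" baseType="variant">';

// Earlier pattern: the local name must be followed immediately by ">".
var oldOpen = /(<[a-z]+)(:)([a-z]+>)/gi;
// Updated pattern: attributes made of word chars, spaces, "=" and quotes are allowed.
var newOpen = /(<\w+)(:)([\w\s="]+>)/gi;

var oldResult = tag.replace(oldOpen, "$1$3");
var newResult = tag.replace(newOpen, "$1$3");

console.log(oldResult); // <vt:vector size="2" baseType="variant">  (unchanged)
console.log(newResult); // <vtvector size="2" baseType="variant">
```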


HTH
Rainer
 
LVL 1

Author Comment

by:rito1
ID: 40573305
Hi Rainer,

I feel bad as I have a bunch of XML files which I am testing your pattern against and I have come up with the following:

<dcterms:modified xsi:type="dcterms:W3CDTF">2015-01-25T19:57:47Z</dctermsmodified>


... This is failing at the moment.
 
LVL 44

Accepted Solution

by:
Rainer Jeschor earned 500 total points
ID: 40573336
Hi,
no problem. I updated the regex for the tagopen (added the colon):
var regexTagOpen = /(<\w+)(:)([\w\s=":]+>)/gi;

and I updated the jsFiddle as well:
http://jsfiddle.net/EE_RainerJ/rb3q9cc4/
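
Note that this pattern removes only the first colon after the "<", so an attribute name such as xsi:type keeps its colon, and attribute values containing characters outside the class (for example "/" or "." in a URL) would stop the match. A more general alternative, offered only as a sketch and not as what the jsFiddle does, is to match each tag as a whole and drop colons everywhere except inside quoted attribute values. The helper name stripColonsFromNames is hypothetical:

```javascript
// Hypothetical helper: removes colons from element and attribute names,
// leaving quoted attribute values and text content untouched.
function stripColonsFromNames(xml) {
  // Outer pattern: one start or end tag, where quoted strings are consumed whole
  // so that a ">" inside an attribute value does not end the tag early.
  // Declarations (<?xml ...?>) and comments are skipped because the character
  // after "<" or "</" must be a letter or underscore.
  var tagPattern = /<\/?[A-Za-z_](?:"[^"]*"|'[^']*'|[^>"'])*>/g;
  return xml.replace(tagPattern, function (tag) {
    // Inner pattern: a quoted value (kept as is) or a bare colon (dropped).
    return tag.replace(/"[^"]*"|'[^']*'|:/g, function (m) {
      return m === ":" ? "" : m;
    });
  });
}

var sample =
  '<dcterms:modified xsi:type="dcterms:W3CDTF">2015-01-25T19:57:47Z</dcterms:modified>';
var stripped = stripColonsFromNames(sample);

console.log(stripped);
// <dctermsmodified xsitype="dcterms:W3CDTF">2015-01-25T19:57:47Z</dctermsmodified>
```

Caveats of this sketch: tag-like text inside CDATA sections or comments would also be rewritten, xmlns:vt declarations become plain xmlnsvt attributes, and two names that differ only by a colon would collide after stripping.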

HTH
Rainer
 
LVL 1

Author Comment

by:rito1
ID: 40573578
Hi Rainer,

I am still struggling to parse the XML. If you don't mind I would like to close this question and award you the points and re-open a new/similar question but making the changes to the JSON instead as this should be simple to parse.
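
For reference, a rough sketch of what the JSON-side change could look like: walk the converted object and rebuild it with the colons removed from every key. The helper name stripColonsFromKeys and the sample object are assumptions for illustration, not part of xml2json:

```javascript
// Hypothetical helper: returns a copy of the value with ":" removed from all keys.
function stripColonsFromKeys(value) {
  if (Array.isArray(value)) {
    return value.map(stripColonsFromKeys);
  }
  if (value !== null && typeof value === "object") {
    const result = {};
    for (const [key, child] of Object.entries(value)) {
      result[key.replace(/:/g, "")] = stripColonsFromKeys(child);
    }
    return result;
  }
  // Strings, numbers, booleans and null are returned unchanged,
  // so colons inside values are preserved.
  return value;
}

// Object shaped like the JSON output in the question.
const converted = {
  "c:strRef": {
    "c:f": "SheetTwo!$A$1",
    "c:strCache": { "c:ptCount": { val: 1 }, "c:pt": { idx: 0, "c:v": "Hits" } }
  }
};

const cleaned = stripColonsFromKeys(converted);
console.log(cleaned.cstrRef.cf);           // SheetTwo!$A$1
console.log(cleaned.cstrRef.cstrCache.cpt.cv); // Hits
```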

I hope that makes sense.

Thank you for your support, it's much appreciated indeed.

Rit
 
LVL 1

Author Closing Comment

by:rito1
ID: 40573581
Really supportive and skilled particularly with Regexp.
 
LVL 44

Expert Comment

by:Rainer Jeschor
ID: 40573979
Hi Rit,
no problem - thanks a lot. It is always good to dig into Regex on a regular basis - so powerful but so complex :-)
 
LVL 1

Author Comment

by:rito1
ID: 40574015
Hi Rainer,

If you have any time, you may be able to assist me with this question :-)...

http://mobile.experts-exchange.com/Programming/Languages/Scripting/JavaScript/Q_28605075.html