Manipulate large XML file into a small one.

Hi
I have a large XML file from a supplier [over 500 MB], but not all of the data is relevant to me. I would like to download the file locally [Windows 7], remove all the child nodes that I don't need, and save the result to a new XML file using readily available software.

It would be too difficult to go through each node one at a time, so I would like to search for any subchild nodes that contain "<type_id>17</type_id>", keep the child nodes that contain them, and remove the rest.

I hope this makes sense. If you have any questions, please contact me.

Thank you.
Asked by ACEAFTY

ste5an (Senior Developer) commented:
XSLT is the right tool. You can use the free Visual Studio Community Edition to play with it.

Paul Simmons commented:
Here is another utility which you may find useful:

http://www.microsoft.com/en-us/download/details.aspx?id=21714

ACEAFTY (author) commented:
Hi
Thanks for the possible solutions. I'm new to XML, and I would like to keep the file as XML rather than change it to an XSL file or anything else. I need a script that will remove any child nodes that do not contain a particular subchild node value.

e.g.

<books>
    <book>
        <title>Title 1</title>
        <published>2012</published>
        <author>David</author>
        <isbn>2345234523454</isbn>
    </book>
    <book>
        <title>Title 2</title>
        <published>2014</published>
        <author>Sam</author>
        <isbn>23452345344</isbn>
    </book>
    <book>
        <title>Title 3</title>
        <published>2011</published>
        <author>Keith</author>
        <isbn>566745674</isbn>
    </book>
    <book>
        <title>Title 4</title>
        <published>2015</published>
        <author>David</author>
        <isbn>456745674567</isbn>
    </book>
</books>

Using the above example, I want to remove any <book> elements that don't contain the author David.

Hope this explains what I'm trying to achieve.

ste5an (Senior Developer) commented:
XSLT uses a separate style sheet (an XSL file). The style sheet defines how to parse, split and reassemble your XML file. The original XML file is left unchanged, and the result is normally written to a new file.

E.g. something like

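<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:fn="http://www.w3.org/2005/xpath-functions">
  <xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
  <xsl:strip-space elements="*"/>

  <!-- Copy everything else (e.g. the <books> root) unchanged, so the output keeps a single root element -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Keep books that have David as an author -->
  <xsl:template match="book[author='David']">
    <xsl:element name="book">
      <xsl:copy-of select="*" />
    </xsl:element>
  </xsl:template>
  <!-- Drop books that have no author named David -->
  <xsl:template match="book[not(author='David')]" />

</xsl:stylesheet>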
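
For comparison, the same filtering rule can be sketched in Python using only the standard library. This is a minimal sketch and not part of the answer above: the function name keep_books_by_author and the shortened sample data are made up for illustration, and it loads the whole document into memory, so it only suits files that fit comfortably in RAM.

```python
import xml.etree.ElementTree as ET

# Shortened version of the books example (hypothetical sample data).
SAMPLE = """<books>
  <book><title>Title 1</title><author>David</author></book>
  <book><title>Title 2</title><author>Sam</author></book>
  <book><title>Title 4</title><author>David</author></book>
</books>"""


def keep_books_by_author(xml_text, author):
    """Return XML text in which only <book> elements with a matching <author> remain."""
    root = ET.fromstring(xml_text)
    # Loop over a copy of the child list, because elements are removed while looping.
    for book in list(root.findall("book")):
        if not any(a.text == author for a in book.findall("author")):
            root.remove(book)
    return ET.tostring(root, encoding="unicode")


print(keep_books_by_author(SAMPLE, "David"))
```

The root <books> element stays in place and only the non-matching <book> children are dropped, which mirrors what the style sheet is meant to do.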

ACEAFTY (author) commented:
Hi

I apologise for not responding. My father passed away and I have been dealing with the loss. I will check through the comments and respond with my feedback.

ACEAFTY (author) commented:
Hi

I'm new to working with XML and XSLT files. I have no idea where to start. I've read a few tutorials and haven't been able to make much sense of them.

I created the XML file:

<?xml version="1.0" encoding="utf-8"?>
<?xml-stylesheet type="text/xsl" href="XSLTFile1.xslt"?>

<books>
  <book>
    <title>Title 1</title>
    <published>2012</published>
    <author>David</author>
    <isbn>2345234523454</isbn>
  </book>
  <book>
    <title>Title 2</title>
    <published>2014</published>
    <author>Sam</author>
    <isbn>23452345344</isbn>
  </book>
  <book>
    <title>Title 3</title>
    <published>2011</published>
    <author>Keith</author>
    <isbn>566745674</isbn>
  </book>
  <book>
    <title>Title 4</title>
    <published>2015</published>
    <author>David</author>
    <isbn>456745674567</isbn>
  </book>
</books>

and the XSLT file:

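<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:fn="http://www.w3.org/2005/xpath-functions">
  <xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
  <xsl:strip-space elements="*"/>

  <!-- Copy everything else (e.g. the <books> root) unchanged, so the output keeps a single root element -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Keep books that have David as an author -->
  <xsl:template match="book[author='David']">
    <xsl:element name="book">
      <xsl:copy-of select="*" />
    </xsl:element>
  </xsl:template>
  <!-- Drop books that have no author named David -->
  <xsl:template match="book[not(author='David')]" />

</xsl:stylesheet>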

I ran the XML file in the browser and nothing happened.

ste5an (Senior Developer) commented:
Use either Visual Studio or msxsl.exe, as Paul wrote. Using the browser as the tool only makes sense when you use XSLT to create HTML as output.

ACEAFTY (author) commented:
I installed Visual Studio and created the two files mentioned above.

What I don't know is how to execute the files so that a new XML file is created.

ste5an (Senior Developer) commented:
Just open the XSLT file without opening or creating a solution. Then you'll see the XML menu:

Untitled.png
Press Start XSLT Debugging.

ACEAFTY (author) commented:
I don't have that option.

screenshot1.jpg

ste5an (Senior Developer) commented:
screenshot1.jpg

ACEAFTY (author) commented:
Yes, when I click on XML, the only option that I can click on is Schemas.... 'Start XSLT Debugging' doesn't appear.

ste5an (Senior Developer) commented:
Ah, I'm sorry, that's my fault. I really thought that it was also supported in the Community Edition, but that's not the case.

Use the command line tool.
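
With msxsl.exe, the call generally takes the form `msxsl books.xml XSLTFile1.xslt -o filtered.xml`, where books.xml and filtered.xml are placeholder names for the input and output files. If loading the whole 500 MB file into memory turns out to be a problem, a streaming script is one possible alternative. Below is a rough Python sketch using only the standard library. It assumes the records are direct children of the root element and that the file uses no XML namespaces. The function name filter_stream and the tag names items/item are hypothetical, since the supplier's real structure isn't shown in this thread.

```python
import io
import xml.etree.ElementTree as ET


def filter_stream(source, dest, record_tag, field_tag, wanted):
    """Write only the records that contain <field_tag>wanted</field_tag>; return how many were kept.

    source: file path or binary file object; dest: text file object (open it as UTF-8).
    """
    context = ET.iterparse(source, events=("start", "end"))
    _, root = next(context)  # the first event is the start of the root element
    dest.write(f"<{root.tag}>\n")  # root attributes are not carried over in this sketch
    kept = 0
    for event, elem in context:
        if event == "end" and elem.tag == record_tag:
            # Look at every descendant named field_tag, at any depth inside the record.
            if any((f.text or "").strip() == wanted for f in elem.iter(field_tag)):
                elem.tail = "\n"
                dest.write(ET.tostring(elem, encoding="unicode"))
                kept += 1
            root.clear()  # discard finished records so memory use stays flat
    dest.write(f"</{root.tag}>\n")
    return kept


# Small in-memory demo with made-up data.
demo_in = io.BytesIO(
    b"<items><item><type_id>17</type_id></item><item><type_id>3</type_id></item></items>"
)
demo_out = io.StringIO()
filter_stream(demo_in, demo_out, "item", "type_id", "17")
print(demo_out.getvalue())
```

For a real file, source would be the path to the downloaded XML and dest a file opened with `open("filtered.xml", "w", encoding="utf-8")` (again a placeholder name). The design trades XSLT's flexibility for a single pass that never holds more than one record in memory.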