scotdance asked:

Merging two XML files into one

Hi,
I need to merge two XML files into one from my app, which is written in Delphi 6.
The second file (secondfile.xml) needs to be merged into the main file (mainfile.xml).

If an entry exists in secondfile.xml but not in mainfile.xml, it should be added to mainfile.xml. If the entry exists in both files, it must not be added again.
The merge must not overwrite any existing values in mainfile.xml, because the end user can modify them. The second file is our way of updating mainfile.xml with new values that we have added to our system.

Samples of the xml files are below.

Thanks for any help


Example of input files and output file:
mainfile.xml has:

<?xml version="1.0" encoding="utf-8"?>
<languages ControlCount="5" DynamicCount="1" LangCount="7" DefLang="design">
 <controls>
  <control id="1" TextType="Single" LineCount="1">
   <language code="design" value="Yes"/>
  </control>
 </controls>
</languages>

secondfile.xml has:

<?xml version="1.0" encoding="utf-8"?>
<languages ControlCount="5" DynamicCount="2" LangCount="7" DefLang="design">
 <controls>
  <control id="1" TextType="Single" LineCount="1">
   <language code="design" value="Yessssss"/>
   <language code="de" value="Ja"/>
  </control>
 </controls>
</languages>

output file should look like this:

<?xml version="1.0" encoding="utf-8"?>
<languages ControlCount="5" DynamicCount="2" LangCount="7" DefLang="design">
 <controls>    
  <control id="1" TextType="Single" LineCount="1">
   <language code="design" value="Yes"/>
   <language code="de" value="Ja"/>
  </control>
 </controls>
</languages>
Gertone (Geert Bormans)

Hi scotdance,

You can pull in information from the second file
using the document() function.

A template for control could be:
<xsl:template match='control'>
    <xsl:param name="cid" select="@id"/>
    <xsl:copy-of select="document('secondfile.xml')//control[@id = $cid]/language"/>
</xsl:template>

This solution copies all the language nodes from the second file,
but you can change it to copy only the nodes you want
(e.g. by looping over the language elements in the second document, or by creating a key).

Cheers!
scotdance (ASKER)

Hi,

Sounds simple enough, but this is the first time I am using XML and XSL, so any code or hints you can give would be good.

Also, the final version of the main XML file has another level after the <controls> level called <dynamictexts>. Its structure is similar to the controls one apart from the different name. If you need an example of the XML with this in it, let me know.

Thanks
jkmyoung

A couple of questions.
1. How do you know to update to DynamicCount="2" as opposed to leaving it?

2. Say original file has
<language code="design" value="Yes"/>
second file has
<language code="des2" value="Yes2"/>
<language code="des3" value="Yes3"/>

Would you get
<language code="design" value="Yes"/>
<language code="des2" value="Yes2"/>
<language code="des3" value="Yes3"/>
or
<language code="design" value="Yes"/>
<language code="des3" value="Yes3"/>
?
Does this apply to other nodes as well, e.g. based on certain attributes? You may need to create separate templates if this is the case.
scotdance,

here is the full merging code

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
    <xsl:output indent="yes"/>
    <xsl:template match="node() | text()">
        <xsl:copy>
            <xsl:copy-of select="@*"/>
            <xsl:apply-templates select="node() | text()"/>
        </xsl:copy>
    </xsl:template>
    <xsl:template match='control'>
        <xsl:param name="cid" select="@id"/>
        <xsl:param name="mainlang" select="language"/>
        <xsl:param name="seclang" select="document('secondfile.xml')//control[@id = $cid]/language"></xsl:param>
        <xsl:copy>
            <xsl:copy-of select="@*"/>
            <xsl:copy-of select="$mainlang"/>
            <xsl:copy-of select="$seclang[not(@code = $mainlang/@code)]"/>
        </xsl:copy>
    </xsl:template>
</xsl:stylesheet>

This is an identity copier that copies everything and brings in the extra merged languages:
- the first template does the identity copying
- the second template overrides the identity copier for the control element

I saved the node-sets of languages from both documents (for the same control id) in parameters.
That makes them easy to compare.
Then a simple test checks that the code doesn't already exist:
$seclang[not(@code = $mainlang/@code)]
This selects all the languages in the seclang parameter whose code attribute does not occur in the main document's set of languages,
and I simply copy those.

If you have similar elements with similar demands, just create an extra template, e.g. for dynamictexts, and change what has to be changed.
If you don't want to merge dynamic texts, leave the stylesheet as is; they will be picked up by the identity copier.

If a control can have other elements besides language, add the third line below after the two existing copy-of lines:
            <xsl:copy-of select="$mainlang"/>
            <xsl:copy-of select="$seclang[not(@code = $mainlang/@code)]"/>
            <xsl:copy-of select="*[not(name() = 'language')]"/>

cheers

Geert
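
As a minimal sketch of the "extra template" idea above, the control template can be widened to cover a second element type. This assumes the other elements are called dynamictext and carry an id attribute in the same way control does. The variable names ename and eid are only illustrative, and xsl:variable is used in place of xsl:param because the values are never passed in from outside.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
    <xsl:output indent="yes"/>

    <!-- Identity copier: copies every node that has no more specific template. -->
    <xsl:template match="node()">
        <xsl:copy>
            <xsl:copy-of select="@*"/>
            <xsl:apply-templates select="node()"/>
        </xsl:copy>
    </xsl:template>

    <!-- One template for both element types (dynamictext is an assumed name). -->
    <xsl:template match="control | dynamictext">
        <!-- Element name and id of the node currently being processed. -->
        <xsl:variable name="ename" select="name()"/>
        <xsl:variable name="eid" select="@id"/>
        <xsl:variable name="mainlang" select="language"/>
        <!-- Languages of the element with the same name and id in the second file. -->
        <xsl:variable name="seclang"
            select="document('secondfile.xml')//*[name() = $ename][@id = $eid]/language"/>
        <xsl:copy>
            <xsl:copy-of select="@*"/>
            <!-- Main-file languages always win. -->
            <xsl:copy-of select="$mainlang"/>
            <!-- Only add second-file languages whose code is not in the main file. -->
            <xsl:copy-of select="$seclang[not(@code = $mainlang/@code)]"/>
        </xsl:copy>
    </xsl:template>
</xsl:stylesheet>
```

Matching on name() as well as id keeps a control with id="1" from picking up the languages of a dynamictext with id="1".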
This is a dynamic version, based on position of elements.

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:fo="http://www.w3.org/1999/XSL/Format">
      <xsl:param name="doc2" select="'secondfile.xml'"/>
      <xsl:template match="*">
            <xsl:param name="element1" select="."/>
            <xsl:param name="element2" select="document($doc2)/*"/>
            <xsl:choose>
                  <xsl:when test="$element2">
                        <xsl:copy>
                              <xsl:copy-of select="@*"/><!-- copy all attributes from element 1-->
                              <xsl:for-each select="$element2/@*"><!-- copy remaining attributes from element2 -->
                                          <xsl:variable name="name" select="name()"/>
                                          <xsl:if test="not ($element1/@*[name() = $name])">
                                                      <xsl:copy-of select="."/>
                                          </xsl:if>
                              </xsl:for-each>
                              <xsl:for-each select="*"><!-- copy matching elements -->
                                    <xsl:variable name="name" select="name()"/>
                                    <xsl:variable name="count" select="count(preceding-sibling::*[name() = $name]) + 1"/>
                                    <xsl:apply-templates select=".">
                                          <xsl:with-param name="element2" select="($element2/*[name()=$name])[position() = $count]"/>
                                    </xsl:apply-templates>
                                    <xsl:for-each select="$element2/*"><!-- copy remaining elements from element2 -->
                                          <xsl:variable name="name2" select="name()"/>
                                          <xsl:if test="count($element1/*[name() = $name2]) &lt;= count(preceding-sibling::*[name() = $name2])">
                                                <xsl:copy-of select="."/>
                                          </xsl:if>
                                    </xsl:for-each>
                              </xsl:for-each>
                        </xsl:copy>
                  </xsl:when>
                  <xsl:otherwise>
                        <xsl:copy-of select="."/>
                  </xsl:otherwise>
            </xsl:choose>
      </xsl:template>
</xsl:stylesheet>
Thanks folks,
I'll have a look at them both and get back to you.
Gertone

Yours seems to be working OK for some of my XML. I've managed to modify it for some of my other parts, but I can't get it to work for one part. That is when running it through the XML editor I have; I have still to run it from Delphi.
Also, for some reason the encoding is being changed to UTF-16.
How can I stop the encoding being changed, and how can I add the extra lines for langcodes?
New examples are below.
I should probably have given you them at the start, but I didn't think it would matter.

Thanks again.

new example of mainfile.xml
<?xml version="1.0" encoding="utf-8"?>
<?xml-stylesheet type="text/xsl" href="langmerge.xsl" ?>
<languages ControlCount="5" DynamicCount="3" LangCount="7" DefLang="design">
   <langcodes>
      <langcode code="design" name="Design" />
   </langcodes>
 <controls>
  <control id="1" TextType="Single" LineCount="1">
   <language code="design" value="Yes"/>
  </control>
 </controls>
   <dynamictexts>
      <dynamictext id="1" TextType="Single" LineCount="1">
         <language code="design" value="&quot;Hello&quot;" />
      </dynamictext>
   </dynamictexts>
</languages>

new example of secondfile.xml
<?xml version="1.0" encoding="utf-8"?>
<languages ControlCount="5" DynamicCount="3" LangCount="7" DefLang="design">
   <langcodes>
      <langcode code="design" name="Design" />
      <langcode code="de" name="Deutsch" />
   </langcodes>
 <controls>
  <control id="1" TextType="Single" LineCount="1">
   <language code="design" value="Yessssss"/>
   <language code="de" value="Ja"/>
  </control>
 </controls>
   <dynamictexts>
      <dynamictext id="1" TextType="Single" LineCount="1">
         <language code="design" value="&quot;Hello&quot;" />
         <language code="de" value="hallo" />
      </dynamictext>
   </dynamictexts>
</languages>


new output file
<?xml version="1.0" encoding="utf-8"?>
<?xml-stylesheet type="text/xsl" href="langmerge.xsl" ?>
<languages ControlCount="5" DynamicCount="3" LangCount="7" DefLang="design">
   <langcodes>
      <langcode code="design" name="Design" />
      <langcode code="de" name="Deutsch" />
   </langcodes>
 <controls>
  <control id="1" TextType="Single" LineCount="1">
   <language code="design" value="Yes"/>
   <language code="de" value="Ja"/>
  </control>
 </controls>
   <dynamictexts>
      <dynamictext id="1" TextType="Single" LineCount="1">
         <language code="design" value="&quot;Hello&quot;" />
         <language code="de" value="hallo" />
      </dynamictext>
   </dynamictexts>
</languages>
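
One way the langcodes requirement in these samples might be handled is sketched below, using the same not(@code = ...) test as the control template earlier in the thread. It assumes langcode elements are identified by their code attribute and that secondfile.xml can be resolved relative to the stylesheet. The variable names mainlc and seclc are hypothetical. Only the identity copier and the langcodes template are shown, so control and dynamictext merging would still need their own templates.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
    <xsl:output indent="yes"/>

    <!-- Identity copier for everything that is not handled below. -->
    <xsl:template match="node()">
        <xsl:copy>
            <xsl:copy-of select="@*"/>
            <xsl:apply-templates select="node()"/>
        </xsl:copy>
    </xsl:template>

    <!-- Merge the list of language codes. -->
    <xsl:template match="langcodes">
        <!-- langcode entries already present in the main file. -->
        <xsl:variable name="mainlc" select="langcode"/>
        <!-- langcode entries offered by the second file. -->
        <xsl:variable name="seclc"
            select="document('secondfile.xml')/languages/langcodes/langcode"/>
        <xsl:copy>
            <xsl:copy-of select="@*"/>
            <!-- Keep the main-file entries exactly as they are. -->
            <xsl:copy-of select="$mainlc"/>
            <!-- Append second-file entries whose code is new. -->
            <xsl:copy-of select="$seclc[not(@code = $mainlc/@code)]"/>
        </xsl:copy>
    </xsl:template>
</xsl:stylesheet>
```

With the sample files above, this should leave the design entry untouched and append the de entry after it.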
jkmyoung,
in response to your question about the output, it should look like this:

first file has
<language code="design" value="Yes"/>
<language code="des2" value="Yes2"/>
second file has
<language code="des2" value="Yes2a"/>
<language code="des3" value="Yes3"/>

output
<language code="design" value="Yes"/>
<language code="des2" value="Yes2"/>
<language code="des3" value="Yes3"/>
Hi again,
Sorry, I forgot to ask: is it possible to write the output back to mainfile.xml,
or will I need to rename the output file to that name in Delphi?

Thanks again.
This modification takes into account the first attribute of every element, and matches the elements from both files that way by using variables aName and aVal. If there is no attribute in the element, it defaults to using the position.

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:fo="http://www.w3.org/1999/XSL/Format">
      <xsl:param name="doc2" select="'secondfile.xml'"/>
      <xsl:template match="*">
            <xsl:param name="element1" select="."/>
            <xsl:param name="element2" select="document($doc2)/*"/>
            <xsl:choose>
                  <xsl:when test="$element2">
                        <xsl:copy>
                              <xsl:copy-of select="@*"/>
                              <!-- copy all attributes from element 1-->
                              <xsl:for-each select="$element2/@*">
                                    <!-- copy remaining attributes from element2 -->
                                    <xsl:variable name="name" select="name()"/>
                                    <xsl:if test="not ($element1/@*[name() = $name])">
                                          <xsl:copy-of select="."/>
                                    </xsl:if>
                              </xsl:for-each>
                              <xsl:for-each select="*">
                                    <!-- copy matching elements -->
                                    <xsl:variable name="name" select="name()"/>
                                    <xsl:choose>
                                          <xsl:when test="@*">
                                                <xsl:variable name="aName" select="name((@*)[1])"/>
                                                <xsl:variable name="aVal" select="(@*)[1]"/>
                                                <xsl:apply-templates select=".">
                                                      <xsl:with-param name="element2" select="($element2/*[name()=$name][@*[name()=$aName and . = $aVal]])[1]"/>
                                                </xsl:apply-templates>
                                          </xsl:when>
                                          <xsl:otherwise>
                                                <xsl:variable name="count" select="count(preceding-sibling::*[name() = $name]) + 1"/>
                                                <xsl:apply-templates select=".">
                                                      <xsl:with-param name="element2" select="($element2/*[name()=$name])[position() = $count]"/>
                                                </xsl:apply-templates>
                                          </xsl:otherwise>
                                    </xsl:choose>
                                    <xsl:for-each select="$element2/*">
                                          <!-- copy remaining elements from element2 -->
                                          <xsl:variable name="name2" select="name()"/>
                                          <xsl:choose>
                                                <xsl:when test="@*">
                                                      <xsl:variable name="aName" select="name((@*)[1])"/>
                                                      <xsl:variable name="aVal" select="(@*)[1]"/>
                                                      <xsl:if test="not ($element1/*[name()=$name2][@*[name()=$aName and . = $aVal]])">
                                                            <xsl:copy-of select="."/>
                                                      </xsl:if>
                                                </xsl:when>
                                                <xsl:otherwise>
                                                      <xsl:if test="count($element1/*[name() = $name2]) &lt;= count(preceding-sibling::*[name() = $name2])">
                                                            <xsl:copy-of select="."/>
                                                      </xsl:if>                                                
                                                </xsl:otherwise>
                                          </xsl:choose>
                                    </xsl:for-each>
                              </xsl:for-each>
                        </xsl:copy>
                  </xsl:when>
                  <xsl:otherwise>
                        <xsl:copy-of select="."/>
                  </xsl:otherwise>
            </xsl:choose>
      </xsl:template>
</xsl:stylesheet>

It's much safer, and probably faster, to write your output to a temporary file and rename it afterwards. Otherwise, the whole file (either before or after transformation) has to be loaded into memory before the original is deleted.
Hi jkmyoung

That script seems to work OK for small files. But when I ran it on my master file, which has 5 languages, and added 2 new languages from my update file, it added the new languages to the output file at least 5 times. It did this not just for the language part but for all the other nodes that were being added too.
I've listed the two language sections from my file and the output your script is producing below.
Thanks again

example mainfile.xml:
   <langcodes>
      <langcode code="design" name="Design" />
      <langcode code="de" name="Deutsch" />
      <langcode code="fr" name="Francais" />
      <langcode code="it" name="Italiano" />
      <langcode code="ru" name="Russian" />
   </langcodes>

example secondfile.xml:
   <langcodes>
      <langcode code="design" name="Design" />
      <langcode code="de" name="Deutsch" />
      <langcode code="ar" name="Arabic" />
      <langcode code="jp" name="Japanese" />
   </langcodes>

example output:
   <langcodes>
      <langcode code="design" name="Design">
      </langcode>
      <langcode code="ar" name="Arabic" />
      <langcode code="jp" name="Japanese" />
      <langcode code="de" name="Deutsch">
      </langcode>
      <langcode code="ar" name="Arabic" />
      <langcode code="jp" name="Japanese" />
      <langcode code="fr" name="Francais" />
      <langcode code="ar" name="Arabic" />
      <langcode code="jp" name="Japanese" />
      <langcode code="it" name="Italiano" />
      <langcode code="ar" name="Arabic" />
      <langcode code="jp" name="Japanese" />
      <langcode code="ru" name="Russian" />
      <langcode code="ar" name="Arabic" />
      <langcode code="jp" name="Japanese" />
   </langcodes>
Also for some unknown reason the encoding is still being changed to utf-16 from utf-8
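
The repeated ar and jp entries in this output look like the result of the "copy remaining elements from element2" loop sitting inside the loop over the main file's children, so it runs once for every existing child. A possible restructuring, offered as a sketch rather than a tested fix, moves that loop out so it runs once per parent. It also compares against $name2 in the second pass; everything else follows the stylesheet above, including the reliance on the "first" attribute, whose order is up to the processor.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output indent="yes"/>
  <xsl:param name="doc2" select="'secondfile.xml'"/>
  <xsl:template match="*">
    <xsl:param name="element1" select="."/>
    <xsl:param name="element2" select="document($doc2)/*"/>
    <xsl:choose>
      <xsl:when test="$element2">
        <xsl:copy>
          <!-- Attributes of element1 win; add only those element2 has extra. -->
          <xsl:copy-of select="@*"/>
          <xsl:for-each select="$element2/@*">
            <xsl:variable name="name" select="name()"/>
            <xsl:if test="not($element1/@*[name() = $name])">
              <xsl:copy-of select="."/>
            </xsl:if>
          </xsl:for-each>
          <!-- Pass 1: each child of element1, merged with its partner if any. -->
          <xsl:for-each select="*">
            <xsl:variable name="name" select="name()"/>
            <xsl:choose>
              <xsl:when test="@*">
                <xsl:variable name="aName" select="name((@*)[1])"/>
                <xsl:variable name="aVal" select="(@*)[1]"/>
                <xsl:apply-templates select=".">
                  <xsl:with-param name="element2"
                    select="($element2/*[name() = $name][@*[name() = $aName and . = $aVal]])[1]"/>
                </xsl:apply-templates>
              </xsl:when>
              <xsl:otherwise>
                <xsl:variable name="count"
                  select="count(preceding-sibling::*[name() = $name]) + 1"/>
                <xsl:apply-templates select=".">
                  <xsl:with-param name="element2"
                    select="($element2/*[name() = $name])[position() = $count]"/>
                </xsl:apply-templates>
              </xsl:otherwise>
            </xsl:choose>
          </xsl:for-each>
          <!-- Pass 2: runs once per parent, after pass 1 has finished. -->
          <xsl:for-each select="$element2/*">
            <xsl:variable name="name2" select="name()"/>
            <xsl:choose>
              <xsl:when test="@*">
                <xsl:variable name="aName" select="name((@*)[1])"/>
                <xsl:variable name="aVal" select="(@*)[1]"/>
                <!-- Copy only if element1 has no child with this name and key attribute. -->
                <xsl:if test="not($element1/*[name() = $name2][@*[name() = $aName and . = $aVal]])">
                  <xsl:copy-of select="."/>
                </xsl:if>
              </xsl:when>
              <xsl:otherwise>
                <!-- No attributes: fall back to position among same-named siblings. -->
                <xsl:if test="count($element1/*[name() = $name2]) &lt;= count(preceding-sibling::*[name() = $name2])">
                  <xsl:copy-of select="."/>
                </xsl:if>
              </xsl:otherwise>
            </xsl:choose>
          </xsl:for-each>
        </xsl:copy>
      </xsl:when>
      <xsl:otherwise>
        <xsl:copy-of select="."/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>
</xsl:stylesheet>
```

With the langcodes example above, the merged list should then come out as design, de, fr, it, ru, ar, jp, with each new entry appearing once.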
ASKER CERTIFIED SOLUTION
Gertone (Geert Bormans)

This solution is only available to members.
Gertone

That seems to be working fine so far. The XML editor I was using was changing the encoding to UTF-16, but your fix stopped that. Delphi is still doing it, though, so I just need to get it working there now, and then test it on the largest files we use.

Cheers
Hi,
I don't suppose you know of a good XSLT processor that can be used in Delphi and stops the encoding being changed?

Cheers
Well, I assume you create a processor object in Delphi.
There could be a simple property on the processor that lets you select the encoding from within Delphi.

I am not a Delphi programmer, so this is based on assumption.

In ASP it is simply a matter of using output streams instead of strings, which allows the encoding set in the XSLT to take precedence.
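
To make "the XSLT setting" concrete: the output encoding is declared on xsl:output. The sketch below is only an identity transform with that declaration added. It rests on an assumption not confirmed in this thread, namely that the Delphi code goes through MSXML, where transformNode returns a string (and so reports UTF-16) while transformNodeToObject writing to a stream lets the declared encoding apply.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
    <!-- Ask the processor to serialise the result as UTF-8. -->
    <!-- This only takes effect when the processor does the serialising, -->
    <!-- i.e. when the result goes to a stream or file rather than a string. -->
    <xsl:output method="xml" encoding="utf-8" indent="yes"/>

    <!-- Identity copier, standing in for the real merge templates. -->
    <xsl:template match="node()">
        <xsl:copy>
            <xsl:copy-of select="@*"/>
            <xsl:apply-templates select="node()"/>
        </xsl:copy>
    </xsl:template>
</xsl:stylesheet>
```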
Sounds like you are thinking the same as me about the attributes; they just don't want to work for me, lol.

I'll just put a question on the Delphi board about it.

I'm going to close this question now.

Thanks for your help
You're welcome.
If you find a solution on the Delphi forum, can you post a link here?

cheers
Sure no problem
Hi,

I may have some more XML questions for you later. It looks like they now want a filter and search facility in the system, so I'll see if these can be done in Delphi first. If I do post, I'll put a link here for you.

For now thanks