Go Premium for a chance to win a PS4. Enter to Win

x
  • Status: Solved
  • Priority: Medium
  • Security: Public
  • Views: 165
  • Last Modified:

Merge Excel file with contents of XML file

Hi guys

Got a bit of an issue one here and hopefully someone will be able to direct me correctly with this.

We have 2 files:

File01-InitialFile
File02-ScanningExtract

The File01-InitialFile is an Excel spreadsheet that contains some info like Persons name, ID, DoB, etc.

The second document - File02-ScanningExtract - is an extract file from a scanner once we have scanned a paper based copy of the persons paper files. It's output as either a .DAT or .XML file so I've included the XML file as it's probably easier to work with.

I need to find a way to merge both these files...that's easy isn't it? Well herein lies my problem...

File02_ScanningExtract does not contain any links to the File01-InitialFile. There's no ID field within it like there is in File01-InitialFile but I do know that the start of each paper file is Section 01. So if we look at File02-ScanningExtract we see code like this:

<document> 
<field   level = "system" name = "Document Filename" value = "0001.pdf"/>
<field   level = "system" name = "Image count in document" value = "5"/>
<field   level = "system" name = "Created date &amp; time" value = "2015.06.19 10:45:29"/>
<field   level = "document" name = "ID" value = "01"/>
<field   level = "document" name = "Description" value = "SECTION 01"/>
</document>

Open in new window


So I can see that the Field ID has a value of "01". There can be anything up to 22 sections here but not every section would have something in them. If they are blank then there is no info on them.

So at the end of the first persons file and the beginning of the seconds I would see this:

<document> 
<field   level = "system" name = "Document Filename" value = "0018.pdf"/>
<field   level = "system" name = "Image count in document" value = "12"/>
<field   level = "system" name = "Created date &amp; time" value = "2015.06.19 10:45:29"/>
<field   level = "document" name = "ID" value = "22"/>
<field   level = "document" name = "Description" value = "SECTION 22"/>
</document> 
<document> 
<field   level = "system" name = "Document Filename" value = "0019.pdf"/>
<field   level = "system" name = "Image count in document" value = "5"/>
<field   level = "system" name = "Created date &amp; time" value = "2015.06.19 10:45:29"/>
<field   level = "document" name = "ID" value = "01"/>
<field   level = "document" name = "Description" value = "SECTION 01"/>
</document> 

Open in new window


So here I know the first paper file has ended and the second one has begun (Section 22 in this example is the end and Section 01 is the start of the new persons file).

Now I know that File01-InitialFile is ALWAYS in the correct order and I know that File02-ScanningExtract is ALWAYS in the correct order so I know this:
The sections 01 up the next section starting 01 is attached to the first person;
The next section from 01 until it gets to the next 01 is for the 2nd person;
From that Section 01 until the next section 01 would be for the third person and so on (hope that bit makes sense).
Each file attached here should hopefully give you an idea what I'm meaning with the above and I've also taken this data and merged it manually into an XLSX file before extracting to XML to give you an idea what I'm trying to achieve and how it will probably be laid out. -this is the File03-MergedFiles file.

Hopefully that makes sense. We could do this manually of course but the data is about to get fairly large very quickly so we're looking to be able to automate this as much as possible. Any way that this can be done shouldn't be an issue; we just need a solution if possible.

Thanx for looking and I appreciate any help offered.
File01-InitialFile.xlsx
File02-ScanningExtract.xml
File03-MergedFiles.xml
0
Steven O'Neill
Asked:
Steven O'Neill
  • 8
  • 6
3 Solutions
 
Geert BormansCommented:
It seems the end result you need is XML.
The "grouping" mechanism in the scanned file is pretty straightforward and clear
Based on that information, I would do the following

- export initial excel into XML
- use XSLT to merge two excel files
- done

If this seems workable for you, I can do an XSLT example that covers this
execution would simply be a command line process, can do a batch with it too
0
 
Steven O'NeillSolutions ArchitectAuthor Commented:
Sounds absolutely ideal. If you can do that then that'd be fantastic :)
0
 
Geert BormansCommented:
Hi,

This is what I just generated automatically

<?xml version="1.0" encoding="UTF-8"?>
<data-set>
   <record>
      <ID>ID0098754379</ID>
      <FORENAME>Person1FirstName</FORENAME>
      <SURNAME>Person1SecondName</SURNAME>
      <DOB>15071971</DOB>
      <IDREF>ID0098754379</IDREF>
      <COUNT>5</COUNT>
      <FILE_DATE>2015.06.19 10:45:29</FILE_DATE>
      <EVENT_DATE>2015.06.19 10:45:29</EVENT_DATE>
      <FOLDER_ID>01</FOLDER_ID>
      <DESCRIPTION>SECTION 01</DESCRIPTION>
      <FILE>0001.pdf</FILE>
   </record>
   <record>
      <ID>ID0098754379</ID>
      <FORENAME>Person1FirstName</FORENAME>
      <SURNAME>Person1SecondName</SURNAME>
      <DOB>15071971</DOB>
      <IDREF>ID0098754379</IDREF>
      <COUNT>2</COUNT>
      <FILE_DATE>2015.06.19 10:45:29</FILE_DATE>
      <EVENT_DATE>2015.06.19 10:45:29</EVENT_DATE>
      <FOLDER_ID>06</FOLDER_ID>
      <DESCRIPTION>SECTION 06</DESCRIPTION>
      <FILE>0002.pdf</FILE>
   </record>
   <record>
      <ID>ID0098754379</ID>
      <FORENAME>Person1FirstName</FORENAME>
      <SURNAME>Person1SecondName</SURNAME>
      <DOB>15071971</DOB>
      <IDREF>ID0098754379</IDREF>
      <COUNT>2</COUNT>
      <FILE_DATE>2015.06.19 10:45:29</FILE_DATE>
      <EVENT_DATE>2015.06.19 10:45:29</EVENT_DATE>
      <FOLDER_ID>07</FOLDER_ID>
      <DESCRIPTION>SECTION 07</DESCRIPTION>
      <FILE>0003.pdf</FILE>
   </record>
   <record>
      <ID>ID0098754379</ID>
      <FORENAME>Person1FirstName</FORENAME>
      <SURNAME>Person1SecondName</SURNAME>
      <DOB>15071971</DOB>
      <IDREF>ID0098754379</IDREF>
      <COUNT>2</COUNT>
      <FILE_DATE>2015.06.19 10:45:29</FILE_DATE>
      <EVENT_DATE>2015.06.19 10:45:29</EVENT_DATE>
      <FOLDER_ID>08</FOLDER_ID>
      <DESCRIPTION>SECTION 08</DESCRIPTION>
      <FILE>0004.pdf</FILE>
   </record>
   <record>
      <ID>ID0098754379</ID>
      <FORENAME>Person1FirstName</FORENAME>
      <SURNAME>Person1SecondName</SURNAME>
      <DOB>15071971</DOB>
      <IDREF>ID0098754379</IDREF>
      <COUNT>2</COUNT>
      <FILE_DATE>2015.06.19 10:45:29</FILE_DATE>
      <EVENT_DATE>2015.06.19 10:45:29</EVENT_DATE>
      <FOLDER_ID>09</FOLDER_ID>
      <DESCRIPTION>SECTION 09</DESCRIPTION>
      <FILE>0005.pdf</FILE>
   </record>
   <record>
      <ID>ID0098754379</ID>
      <FORENAME>Person1FirstName</FORENAME>
      <SURNAME>Person1SecondName</SURNAME>
      <DOB>15071971</DOB>
      <IDREF>ID0098754379</IDREF>
      <COUNT>2</COUNT>
      <FILE_DATE>2015.06.19 10:45:29</FILE_DATE>
      <EVENT_DATE>2015.06.19 10:45:29</EVENT_DATE>
      <FOLDER_ID>10</FOLDER_ID>
      <DESCRIPTION>SECTION 10</DESCRIPTION>
      <FILE>0006.pdf</FILE>
   </record>
   <record>
      <ID>ID0098754379</ID>
      <FORENAME>Person1FirstName</FORENAME>
      <SURNAME>Person1SecondName</SURNAME>
      <DOB>15071971</DOB>
      <IDREF>ID0098754379</IDREF>
      <COUNT>2</COUNT>
      <FILE_DATE>2015.06.19 10:45:29</FILE_DATE>
      <EVENT_DATE>2015.06.19 10:45:29</EVENT_DATE>
      <FOLDER_ID>11</FOLDER_ID>
      <DESCRIPTION>SECTION 11</DESCRIPTION>
      <FILE>0007.pdf</FILE>
   </record>
   <record>
      <ID>ID0098754379</ID>
      <FORENAME>Person1FirstName</FORENAME>
      <SURNAME>Person1SecondName</SURNAME>
      <DOB>15071971</DOB>
      <IDREF>ID0098754379</IDREF>
      <COUNT>2</COUNT>
      <FILE_DATE>2015.06.19 10:45:29</FILE_DATE>
      <EVENT_DATE>2015.06.19 10:45:29</EVENT_DATE>
      <FOLDER_ID>12</FOLDER_ID>
      <DESCRIPTION>SECTION 12</DESCRIPTION>
      <FILE>0008.pdf</FILE>
   </record>
   <record>
      <ID>ID0098754379</ID>
      <FORENAME>Person1FirstName</FORENAME>
      <SURNAME>Person1SecondName</SURNAME>
      <DOB>15071971</DOB>
      <IDREF>ID0098754379</IDREF>
      <COUNT>2</COUNT>
      <FILE_DATE>2015.06.19 10:45:29</FILE_DATE>
      <EVENT_DATE>2015.06.19 10:45:29</EVENT_DATE>
      <FOLDER_ID>13</FOLDER_ID>
      <DESCRIPTION>SECTION 13</DESCRIPTION>
      <FILE>0009.pdf</FILE>
   </record>
   <record>
      <ID>ID0098754379</ID>
      <FORENAME>Person1FirstName</FORENAME>
      <SURNAME>Person1SecondName</SURNAME>
      <DOB>15071971</DOB>
      <IDREF>ID0098754379</IDREF>
      <COUNT>2</COUNT>
      <FILE_DATE>2015.06.19 10:45:29</FILE_DATE>
      <EVENT_DATE>2015.06.19 10:45:29</EVENT_DATE>
      <FOLDER_ID>14</FOLDER_ID>
      <DESCRIPTION>SECTION 14</DESCRIPTION>
      <FILE>0010.pdf</FILE>
   </record>
   <record>
      <ID>ID0098754379</ID>
      <FORENAME>Person1FirstName</FORENAME>
      <SURNAME>Person1SecondName</SURNAME>
      <DOB>15071971</DOB>
      <IDREF>ID0098754379</IDREF>
      <COUNT>2</COUNT>
      <FILE_DATE>2015.06.19 10:45:29</FILE_DATE>
      <EVENT_DATE>2015.06.19 10:45:29</EVENT_DATE>
      <FOLDER_ID>15</FOLDER_ID>
      <DESCRIPTION>SECTION 15</DESCRIPTION>
      <FILE>0011.pdf</FILE>
   </record>
   <record>
      <ID>ID0098754379</ID>
      <FORENAME>Person1FirstName</FORENAME>
      <SURNAME>Person1SecondName</SURNAME>
      <DOB>15071971</DOB>
      <IDREF>ID0098754379</IDREF>
      <COUNT>2</COUNT>
      <FILE_DATE>2015.06.19 10:45:29</FILE_DATE>
      <EVENT_DATE>2015.06.19 10:45:29</EVENT_DATE>
      <FOLDER_ID>16</FOLDER_ID>
      <DESCRIPTION>SECTION 16</DESCRIPTION>
      <FILE>0012.pdf</FILE>
   </record>
   <record>
      <ID>ID0098754379</ID>
      <FORENAME>Person1FirstName</FORENAME>
      <SURNAME>Person1SecondName</SURNAME>
      <DOB>15071971</DOB>
      <IDREF>ID0098754379</IDREF>
      <COUNT>37</COUNT>
      <FILE_DATE>2015.06.19 10:45:29</FILE_DATE>
      <EVENT_DATE>2015.06.19 10:45:29</EVENT_DATE>
      <FOLDER_ID>17</FOLDER_ID>
      <DESCRIPTION>SECTION 17</DESCRIPTION>
      <FILE>0013.pdf</FILE>
   </record>
   <record>
      <ID>ID0098754379</ID>
      <FORENAME>Person1FirstName</FORENAME>
      <SURNAME>Person1SecondName</SURNAME>
      <DOB>15071971</DOB>
      <IDREF>ID0098754379</IDREF>
      <COUNT>2</COUNT>
      <FILE_DATE>2015.06.19 10:45:29</FILE_DATE>
      <EVENT_DATE>2015.06.19 10:45:29</EVENT_DATE>
      <FOLDER_ID>18</FOLDER_ID>
      <DESCRIPTION>SECTION 18</DESCRIPTION>
      <FILE>0014.pdf</FILE>
   </record>
   <record>
      <ID>ID0098754379</ID>
      <FORENAME>Person1FirstName</FORENAME>
      <SURNAME>Person1SecondName</SURNAME>
      <DOB>15071971</DOB>
      <IDREF>ID0098754379</IDREF>
      <COUNT>2</COUNT>
      <FILE_DATE>2015.06.19 10:45:29</FILE_DATE>
      <EVENT_DATE>2015.06.19 10:45:29</EVENT_DATE>
      <FOLDER_ID>19</FOLDER_ID>
      <DESCRIPTION>SECTION 19</DESCRIPTION>
      <FILE>0015.pdf</FILE>
   </record>
   <record>
      <ID>ID0098754379</ID>
      <FORENAME>Person1FirstName</FORENAME>
      <SURNAME>Person1SecondName</SURNAME>
      <DOB>15071971</DOB>
      <IDREF>ID0098754379</IDREF>
      <COUNT>2</COUNT>
      <FILE_DATE>2015.06.19 10:45:29</FILE_DATE>
      <EVENT_DATE>2015.06.19 10:45:29</EVENT_DATE>
      <FOLDER_ID>20</FOLDER_ID>
      <DESCRIPTION>SECTION 20</DESCRIPTION>
      <FILE>0016.pdf</FILE>
   </record>
   <record>
      <ID>ID0098754379</ID>
      <FORENAME>Person1FirstName</FORENAME>
      <SURNAME>Person1SecondName</SURNAME>
      <DOB>15071971</DOB>
      <IDREF>ID0098754379</IDREF>
      <COUNT>2</COUNT>
      <FILE_DATE>2015.06.19 10:45:29</FILE_DATE>
      <EVENT_DATE>2015.06.19 10:45:29</EVENT_DATE>
      <FOLDER_ID>21</FOLDER_ID>
      <DESCRIPTION>SECTION 21</DESCRIPTION>
      <FILE>0017.pdf</FILE>
   </record>
   <record>
      <ID>ID0098754379</ID>
      <FORENAME>Person1FirstName</FORENAME>
      <SURNAME>Person1SecondName</SURNAME>
      <DOB>15071971</DOB>
      <IDREF>ID0098754379</IDREF>
      <COUNT>12</COUNT>
      <FILE_DATE>2015.06.19 10:45:29</FILE_DATE>
      <EVENT_DATE>2015.06.19 10:45:29</EVENT_DATE>
      <FOLDER_ID>22</FOLDER_ID>
      <DESCRIPTION>SECTION 22</DESCRIPTION>
      <FILE>0018.pdf</FILE>
   </record>
   <record>
      <ID>ID0034517859</ID>
      <FORENAME>Person2FirstName</FORENAME>
      <SURNAME>Person2SecondName</SURNAME>
      <DOB>15071936</DOB>
      <IDREF>ID0034517859</IDREF>
      <COUNT>5</COUNT>
      <FILE_DATE>2015.06.19 10:45:29</FILE_DATE>
      <EVENT_DATE>2015.06.19 10:45:29</EVENT_DATE>
      <FOLDER_ID>01</FOLDER_ID>
      <DESCRIPTION>SECTION 01</DESCRIPTION>
      <FILE>0019.pdf</FILE>
   </record>
   <record>
      <ID>ID0034517859</ID>
      <FORENAME>Person2FirstName</FORENAME>
      <SURNAME>Person2SecondName</SURNAME>
      <DOB>15071936</DOB>
      <IDREF>ID0034517859</IDREF>
      <COUNT>2</COUNT>
      <FILE_DATE>2015.06.19 10:45:29</FILE_DATE>
      <EVENT_DATE>2015.06.19 10:45:29</EVENT_DATE>
      <FOLDER_ID>06</FOLDER_ID>
      <DESCRIPTION>SECTION 06</DESCRIPTION>
      <FILE>0020.pdf</FILE>
   </record>
   <record>
      <ID>ID0034517859</ID>
      <FORENAME>Person2FirstName</FORENAME>
      <SURNAME>Person2SecondName</SURNAME>
      <DOB>15071936</DOB>
      <IDREF>ID0034517859</IDREF>
      <COUNT>2</COUNT>
      <FILE_DATE>2015.06.19 10:45:29</FILE_DATE>
      <EVENT_DATE>2015.06.19 10:45:29</EVENT_DATE>
      <FOLDER_ID>07</FOLDER_ID>
      <DESCRIPTION>SECTION 07</DESCRIPTION>
      <FILE>0021.pdf</FILE>
   </record>
   <record>
      <ID>ID0034517859</ID>
      <FORENAME>Person2FirstName</FORENAME>
      <SURNAME>Person2SecondName</SURNAME>
      <DOB>15071936</DOB>
      <IDREF>ID0034517859</IDREF>
      <COUNT>2</COUNT>
      <FILE_DATE>2015.06.19 10:45:29</FILE_DATE>
      <EVENT_DATE>2015.06.19 10:45:29</EVENT_DATE>
      <FOLDER_ID>08</FOLDER_ID>
      <DESCRIPTION>SECTION 08</DESCRIPTION>
      <FILE>0022.pdf</FILE>
   </record>
   <record>
      <ID>ID0034517859</ID>
      <FORENAME>Person2FirstName</FORENAME>
      <SURNAME>Person2SecondName</SURNAME>
      <DOB>15071936</DOB>
      <IDREF>ID0034517859</IDREF>
      <COUNT>2</COUNT>
      <FILE_DATE>2015.06.19 10:45:29</FILE_DATE>
      <EVENT_DATE>2015.06.19 10:45:29</EVENT_DATE>
      <FOLDER_ID>09</FOLDER_ID>
      <DESCRIPTION>SECTION 09</DESCRIPTION>
      <FILE>0023.pdf</FILE>
   </record>
   <record>
      <ID>ID0034517859</ID>
      <FORENAME>Person2FirstName</FORENAME>
      <SURNAME>Person2SecondName</SURNAME>
      <DOB>15071936</DOB>
      <IDREF>ID0034517859</IDREF>
      <COUNT>2</COUNT>
      <FILE_DATE>2015.06.19 10:45:29</FILE_DATE>
      <EVENT_DATE>2015.06.19 10:45:29</EVENT_DATE>
      <FOLDER_ID>10</FOLDER_ID>
      <DESCRIPTION>SECTION 10</DESCRIPTION>
      <FILE>0024.pdf</FILE>
   </record>
   <record>
      <ID>ID0034517859</ID>
      <FORENAME>Person2FirstName</FORENAME>
      <SURNAME>Person2SecondName</SURNAME>
      <DOB>15071936</DOB>
      <IDREF>ID0034517859</IDREF>
      <COUNT>2</COUNT>
      <FILE_DATE>2015.06.19 10:45:29</FILE_DATE>
      <EVENT_DATE>2015.06.19 10:45:29</EVENT_DATE>
      <FOLDER_ID>11</FOLDER_ID>
      <DESCRIPTION>SECTION 11</DESCRIPTION>
      <FILE>0025.pdf</FILE>
   </record>
   <record>
      <ID>ID0034517859</ID>
      <FORENAME>Person2FirstName</FORENAME>
      <SURNAME>Person2SecondName</SURNAME>
      <DOB>15071936</DOB>
      <IDREF>ID0034517859</IDREF>
      <COUNT>2</COUNT>
      <FILE_DATE>2015.06.19 10:45:29</FILE_DATE>
      <EVENT_DATE>2015.06.19 10:45:29</EVENT_DATE>
      <FOLDER_ID>12</FOLDER_ID>
      <DESCRIPTION>SECTION 12</DESCRIPTION>
      <FILE>0026.pdf</FILE>
   </record>
   <record>
      <ID>ID0034517859</ID>
      <FORENAME>Person2FirstName</FORENAME>
      <SURNAME>Person2SecondName</SURNAME>
      <DOB>15071936</DOB>
      <IDREF>ID0034517859</IDREF>
      <COUNT>2</COUNT>
      <FILE_DATE>2015.06.19 10:45:29</FILE_DATE>
      <EVENT_DATE>2015.06.19 10:45:29</EVENT_DATE>
      <FOLDER_ID>13</FOLDER_ID>
      <DESCRIPTION>SECTION 13</DESCRIPTION>
      <FILE>0027.pdf</FILE>
   </record>
   <record>
      <ID>ID0034517859</ID>
      <FORENAME>Person2FirstName</FORENAME>
      <SURNAME>Person2SecondName</SURNAME>
      <DOB>15071936</DOB>
      <IDREF>ID0034517859</IDREF>
      <COUNT>2</COUNT>
      <FILE_DATE>2015.06.19 10:45:29</FILE_DATE>
      <EVENT_DATE>2015.06.19 10:45:29</EVENT_DATE>
      <FOLDER_ID>14</FOLDER_ID>
      <DESCRIPTION>SECTION 14</DESCRIPTION>
      <FILE>0028.pdf</FILE>
   </record>
   <record>
      <ID>ID0034517859</ID>
      <FORENAME>Person2FirstName</FORENAME>
      <SURNAME>Person2SecondName</SURNAME>
      <DOB>15071936</DOB>
      <IDREF>ID0034517859</IDREF>
      <COUNT>2</COUNT>
      <FILE_DATE>2015.06.19 10:45:29</FILE_DATE>
      <EVENT_DATE>2015.06.19 10:45:29</EVENT_DATE>
      <FOLDER_ID>15</FOLDER_ID>
      <DESCRIPTION>SECTION 15</DESCRIPTION>
      <FILE>0029.pdf</FILE>
   </record>
   <record>
      <ID>ID0034517859</ID>
      <FORENAME>Person2FirstName</FORENAME>
      <SURNAME>Person2SecondName</SURNAME>
      <DOB>15071936</DOB>
      <IDREF>ID0034517859</IDREF>
      <COUNT>2</COUNT>
      <FILE_DATE>2015.06.19 10:45:29</FILE_DATE>
      <EVENT_DATE>2015.06.19 10:45:29</EVENT_DATE>
      <FOLDER_ID>16</FOLDER_ID>
      <DESCRIPTION>SECTION 16</DESCRIPTION>
      <FILE>0030.pdf</FILE>
   </record>
   <record>
      <ID>ID0034517859</ID>
      <FORENAME>Person2FirstName</FORENAME>
      <SURNAME>Person2SecondName</SURNAME>
      <DOB>15071936</DOB>
      <IDREF>ID0034517859</IDREF>
      <COUNT>35</COUNT>
      <FILE_DATE>2015.06.19 10:45:29</FILE_DATE>
      <EVENT_DATE>2015.06.19 10:45:29</EVENT_DATE>
      <FOLDER_ID>17</FOLDER_ID>
      <DESCRIPTION>SECTION 17</DESCRIPTION>
      <FILE>0031.pdf</FILE>
   </record>
   <record>
      <ID>ID0045628716</ID>
      <FORENAME>Person3FirstName</FORENAME>
      <SURNAME>Person3SecondName</SURNAME>
      <DOB>15071956</DOB>
      <IDREF>ID0045628716</IDREF>
      <COUNT>5</COUNT>
      <FILE_DATE>2015.06.19 10:45:29</FILE_DATE>
      <EVENT_DATE>2015.06.19 10:45:29</EVENT_DATE>
      <FOLDER_ID>01</FOLDER_ID>
      <DESCRIPTION>SECTION 01</DESCRIPTION>
      <FILE>0032.pdf</FILE>
   </record>
   <record>
      <ID>ID0045628716</ID>
      <FORENAME>Person3FirstName</FORENAME>
      <SURNAME>Person3SecondName</SURNAME>
      <DOB>15071956</DOB>
      <IDREF>ID0045628716</IDREF>
      <COUNT>2</COUNT>
      <FILE_DATE>2015.06.19 10:45:29</FILE_DATE>
      <EVENT_DATE>2015.06.19 10:45:29</EVENT_DATE>
      <FOLDER_ID>18</FOLDER_ID>
      <DESCRIPTION>SECTION 18</DESCRIPTION>
      <FILE>0033.pdf</FILE>
   </record>
   <record>
      <ID>ID0045628716</ID>
      <FORENAME>Person3FirstName</FORENAME>
      <SURNAME>Person3SecondName</SURNAME>
      <DOB>15071956</DOB>
      <IDREF>ID0045628716</IDREF>
      <COUNT>2</COUNT>
      <FILE_DATE>2015.06.19 10:45:29</FILE_DATE>
      <EVENT_DATE>2015.06.19 10:45:29</EVENT_DATE>
      <FOLDER_ID>19</FOLDER_ID>
      <DESCRIPTION>SECTION 19</DESCRIPTION>
      <FILE>0034.pdf</FILE>
   </record>
   <record>
      <ID>ID0045628716</ID>
      <FORENAME>Person3FirstName</FORENAME>
      <SURNAME>Person3SecondName</SURNAME>
      <DOB>15071956</DOB>
      <IDREF>ID0045628716</IDREF>
      <COUNT>2</COUNT>
      <FILE_DATE>2015.06.19 10:45:29</FILE_DATE>
      <EVENT_DATE>2015.06.19 10:45:29</EVENT_DATE>
      <FOLDER_ID>20</FOLDER_ID>
      <DESCRIPTION>SECTION 20</DESCRIPTION>
      <FILE>0035.pdf</FILE>
   </record>
   <record>
      <ID>ID0045628716</ID>
      <FORENAME>Person3FirstName</FORENAME>
      <SURNAME>Person3SecondName</SURNAME>
      <DOB>15071956</DOB>
      <IDREF>ID0045628716</IDREF>
      <COUNT>2</COUNT>
      <FILE_DATE>2015.06.19 10:45:29</FILE_DATE>
      <EVENT_DATE>2015.06.19 10:45:29</EVENT_DATE>
      <FOLDER_ID>21</FOLDER_ID>
      <DESCRIPTION>SECTION 21</DESCRIPTION>
      <FILE>0036.pdf</FILE>
   </record>
   <record>
      <ID>ID0045628716</ID>
      <FORENAME>Person3FirstName</FORENAME>
      <SURNAME>Person3SecondName</SURNAME>
      <DOB>15071956</DOB>
      <IDREF>ID0045628716</IDREF>
      <COUNT>8</COUNT>
      <FILE_DATE>2015.06.19 10:45:29</FILE_DATE>
      <EVENT_DATE>2015.06.19 10:45:29</EVENT_DATE>
      <FOLDER_ID>22</FOLDER_ID>
      <DESCRIPTION>SECTION 22</DESCRIPTION>
      <FILE>0037.pdf</FILE>
   </record>
</data-set>

Open in new window


All I needed for that is the excel data saved as XML like this
(you can use a little XML schema for that)

<data-set>
    <record>
        <ID>ID0098754379</ID>
        <FORENAME>Person1FirstName</FORENAME>
        <SURNAME>Person1SecondName</SURNAME>
        <DOB>15071971</DOB>
     </record>
    <record>
        <ID>ID0034517859</ID>
        <FORENAME>Person2FirstName</FORENAME>
        <SURNAME>Person2SecondName</SURNAME>
        <DOB>15071936</DOB>
    </record>
    <record>
        <ID>ID0045628716</ID>
        <FORENAME>Person3FirstName</FORENAME>
        <SURNAME>Person3SecondName</SURNAME>
        <DOB>15071956</DOB>
    </record>
</data-set>

Open in new window


Do you think that it is feasible for you to generate the XML from excel like that
The column names don't matter (that is only a small fix to the XSLT if you want different column names)
If you export the other fields as well, that does not make a difference

Can you also verify the result I generated automatically
0
VIDEO: THE CONCERTO CLOUD FOR HEALTHCARE

Modern healthcare requires a modern cloud. View this brief video to understand how the Concerto Cloud for Healthcare can help your organization.

 
Geert BormansCommented:
Later today, in order to make this work from a command line, you will need the latest Saxon Home Edition (that is an XSLT processor)
http://sourceforge.net/projects/saxon/files/Saxon-HE/9.6/SaxonHE9-6-0-6J.zip/download
you will need the saxon9he.jar only from the zip, but it does not hurt to put all jar in a directory for it
0
 
Geert BormansCommented:
Then make a project directory
and put this XSLT in the project directory
merge-files.xsl
0
 
Geert BormansCommented:
Of course, this XSLT is subject to change depending on your requirements
0
 
Steven O'NeillSolutions ArchitectAuthor Commented:
Working on this now to see what we can do...thanx Geert, be back to you shortly
0
 
Steven O'NeillSolutions ArchitectAuthor Commented:
Hi Geert

Apologies for the delay in getting back to you. That's me now taken the File01-InitialFile.xlsx file and created the XML file as you have asked for above.

Looks like this:
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<data-set>
	<record>
		<DoB>15071971</DoB>
		<ID>ID0098754379</ID>
		<Forename>Person1FirstName</Forename>
		<Surname>Person1SecondName</Surname>
	</record>
	<record>
		<DoB>15071936</DoB>
		<ID>ID0034517859</ID>
		<Forename>Person2FirstName</Forename>
		<Surname>Person2Secondname</Surname>
	</record>
	<record>
		<DoB>15071956</DoB>
		<ID>ID0045628716</ID>
		<Forename>Person3FirstName</Forename>
		<Surname>Person3FirstName</Surname>
	</record>
</data-set>

Open in new window


I've also downloaded the Saxon Home Edition; extracted the files and placed the saxon9he.jar file into the same folder.

So in my Project Folder I now have:
File01-InitialFile.xml
import_mapping.xsd (the mapping schema file for Excel)
merge-files.xslt (your stylesheet from above)
saxon9he.jar

Should I have my File02-ScanningExtract.xml in here as well?

Now I have this together what's the next step to get my File03-MergedFiles.xml type document out?

Thanx again

Steven
0
 
Geert BormansCommented:
Hi,

First you need to understand that element names in XML are case sensitive.
So I changed the XSLT to accept the File1 xml as you created it
Just make sure you don't change the cases

If you put all teh files in one directory
you can create a bat file with the following content

set root=C:\LocalData\GBSP\excel-merge\

set xslt=%root%merge-files.xsl

set saxon=%root%saxon9he.jar

set file1=%root%File01-InitialFile.xml

set file2=%root%File02-ScanningExtract.xml

set file3=%root%File03-merged.xml

java -Xmx1200M -Xss3000k -cp %saxon% net.sf.saxon.Transform -xsl:%xslt% -s:%file2% -o:%file3% initial-file-name=%file1%

Open in new window


basically you only need to change the root variable (give it a trailing '\' please)
maybe you need to change the filenames too, I kept the numbering of the variables as you created them
%File3% is the result filename

run the batch a you should see a transformed file

Good luck,
Geert
0
 
Geert BormansCommented:
apparently attachment went void
here it is again
merge-files.xsl
0
 
Steven O'NeillSolutions ArchitectAuthor Commented:
Thanx Geert

Works very nicely from this initial test. I have one last question on this one though and it's to do with the batch file above. The original files are potentially going to have spaces in the file names so I need a way to handle that in the script (think I need to sort that in the batch file as I know there can be issues with spaces and how they are handled).

Can you advise how the batch file should handle this please?

Thanx

I will have a few more questions on this topic coming shortly but I'll raise these as a separate question entirely as this was a first phase check of a single file and I'll be looking to potentially perform this work on a folder (with a whole host of individual files in there) but I'll come back to this one as I say.

Thanx again
0
 
Steven O'NeillSolutions ArchitectAuthor Commented:
Hi Geert

I've sorted the spaces issue now within the batch files so I'm going to accept your solution for this one but I will be back with a few more questions tomorrow so hopefully you're around lol.

Thanx again

Steven
0
 
Steven O'NeillSolutions ArchitectAuthor Commented:
Superb amount of information from Geert and very much appreciated
0
 
Geert BormansCommented:
Hi Steven,
You are welcome.
I was travelling between posting the batch information and just now
the space issue is something you can handle by putting quotes around the filenames (variables) in the batch
we might need some little trick for the initial-file

In order to make this operate on a folder, there is a lot you can do in the batch file
Very happy to get that going tomorrow
0

Featured Post

VIDEO: THE CONCERTO CLOUD FOR HEALTHCARE

Modern healthcare requires a modern cloud. View this brief video to understand how the Concerto Cloud for Healthcare can help your organization.

  • 8
  • 6
Tackle projects and never again get stuck behind a technical roadblock.
Join Now