I'm a newbie to XML and I could use some help, big time. A vendor that I am dealing with deviated from the interface specs that I had coded to and neglected to notify me of the changes, and now I am under the gun to modify my
processing to work ASAP.
Basically, I need to merge two XML input files on an approximate date/timestamp. The date/timestamps are in ascending order in both input files.
Here is a sample of the gps.xml input file:
<?xml version="1.0"?>
<root>
  <header>
    <driver_id></driver_id>
    <vehicle_id>101</vehicle_id>
    <duty_shift_id></duty_shift_id>
    <route_id></route_id>
    <cid_terminal_id>5146</cid_terminal_id>
  </header>
  <header>
    <driver_id></driver_id>
    <vehicle_id>101</vehicle_id>
    <duty_shift_id></duty_shift_id>
    <route_id></route_id>
    <cid_terminal_id>5146</cid_terminal_id>
  </header>
  <record>
    <longitude>-105.111111</longitude>
    <latitude>39.111111</latitude>
    <date_time>2003/12/10.10:55</date_time>
  </record>
  <record>
    <longitude>-106.222222</longitude>
    <latitude>38.555555</latitude>
    <date_time>2003/12/10.11:05</date_time>
  </record>
  <header>
    <driver_id></driver_id>
    <vehicle_id>101</vehicle_id>
    <duty_shift_id></duty_shift_id>
    <route_id></route_id>
    <cid_terminal_id>5146</cid_terminal_id>
  </header>
  <record>
    <longitude>-107.333333</longitude>
    <latitude>37.444444</latitude>
    <date_time>2003/12/10.11:15</date_time>
  </record>
</root>
The gps.xml file can have <header> elements with no <record> elements, but the headers are extraneous anyway. I only care about merging the <record> elements in the gps.xml file into the tran.xml data.
The <longitude> and <latitude> values in each gps.xml <record> identify a light rail transit stop location.
Here is a sample of the tran.xml input file:
<?xml version="1.0"?>
<root>
  <header>
    <driver_id></driver_id>
    <vehicle_id>101</vehicle_id>
    <duty_shift_id></duty_shift_id>
    <route_id>C</route_id>
    <cid_terminal_id>5141</cid_terminal_id>
  </header>
  <header>
    <driver_id></driver_id>
    <vehicle_id>101</vehicle_id>
    <duty_shift_id></duty_shift_id>
    <route_id>C</route_id>
    <cid_terminal_id>5141</cid_terminal_id>
  </header>
  <record>
    <longitude></longitude>
    <latitude></latitude>
    <date_time>2003/12/10.11:00</date_time>
    <tag_id>1111111111111111</tag_id>
    <stop_location_id></stop_location_id>
    <fare_type_cd>E</fare_type_cd>
    <blacklist_cd></blacklist_cd>
  </record>
  <header>
    <driver_id></driver_id>
    <vehicle_id>101</vehicle_id>
    <duty_shift_id></duty_shift_id>
    <route_id>C</route_id>
    <cid_terminal_id>5141</cid_terminal_id>
  </header>
  <record>
    <longitude></longitude>
    <latitude></latitude>
    <date_time>2003/12/10.11:10</date_time>
    <tag_id>2222222222222222</tag_id>
    <stop_location_id></stop_location_id>
    <fare_type_cd>E</fare_type_cd>
    <blacklist_cd></blacklist_cd>
  </record>
  <record>
    <longitude></longitude>
    <latitude></latitude>
    <date_time>2003/12/10.11:20</date_time>
    <tag_id>3333333333333333</tag_id>
    <stop_location_id></stop_location_id>
    <fare_type_cd>E</fare_type_cd>
    <blacklist_cd></blacklist_cd>
  </record>
  <record>
    <longitude></longitude>
    <latitude></latitude>
    <date_time>2003/12/10.11:20</date_time>
    <tag_id>4444444444444444</tag_id>
    <stop_location_id></stop_location_id>
    <fare_type_cd>E</fare_type_cd>
    <blacklist_cd></blacklist_cd>
  </record>
</root>
The tran.xml file can have multiple <header> elements, each followed by multiple <record> elements. It can also have <header> elements with no <record> elements following them.
Each <date_time> value in the tran.xml file needs to be matched to the gps.xml <record> with the latest <date_time> that is earlier than or equal to it. The longitude and latitude from that gps.xml record are merged with the tran.xml data, the tran.xml
<date_time> value is retained, and the result is written to a tab-delimited output file as follows (tabs are represented by ?, four records):
?101??C?5141?-105.111111?39.111111???2003/12/10.11:00?1111111111111111??E??
?101??C?5141?-106.222222?38.555555???2003/12/10.11:10?2222222222222222??E??
?101??C?5141?-107.333333?37.444444???2003/12/10.11:20?3333333333333333??E??
?101??C?5141?-107.333333?37.444444???2003/12/10.11:20?4444444444444444??E??
Here is the XSL script (merge_lrv_gps_and_trans_to_tab_delim.xsl) that worked according to the original specs (matching on exact date/timestamps in the two input files):
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>
  <xsl:param name="lrv_gps_file"/>
  <xsl:variable name="delim" select="'&#9;'"/> <!-- tab -->
  <xsl:variable name="nl" select="'&#10;'"/> <!-- newline -->
  <xsl:variable name="head">
    <xsl:for-each select="/root/header/*">
      <xsl:value-of select="concat(., $delim)"/>
    </xsl:for-each>
  </xsl:variable>
  <xsl:template match="/">
    <xsl:apply-templates select="root/record"/>
  </xsl:template>
  <xsl:template match="record">
    <!--note: if gps.xml is in a different directory, you will need to use the relative path or URL-->
    <!-- select="document('gps.xml')/root/record[date_time = current()/date_time]"/> -->
    <xsl:variable name="gps"
      select="document($lrv_gps_file)/root/record[date_time = current()/date_time]"/>
    <xsl:variable name="vHeader">
      <xsl:for-each select="preceding-sibling::header[1]/*">
        <xsl:value-of select="concat(., $delim)"/>
      </xsl:for-each>
    </xsl:variable>
    <xsl:value-of select="$vHeader"/>
    <xsl:value-of select="concat($gps/longitude, $delim, $gps/latitude, $delim)"/>
    <xsl:for-each select="*">
      <xsl:value-of select="concat(., $delim)"/>
    </xsl:for-each>
    <xsl:value-of select="$nl"/>
  </xsl:template>
</xsl:stylesheet>
<!-- java com.icl.saxon.StyleSheet -o tab_delim.dat tran.xml merge_lrv_gps_and_trans_to_tab_delim.xsl lrv_gps_file=gps.xml -->
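For the new "previous or equal" matching, one direction that might work is sketched below. It is only a sketch and has not been run against the vendor's real files. It assumes that every <date_time> is zero-padded in the fixed form YYYY/MM/DD.HH:MM, so that stripping the punctuation with translate() gives a number that sorts the same way as the timestamp, and that gps.xml really is in ascending order, so that the last qualifying <record> in document order is the closest earlier one. The variable name t is my own; everything else is taken from the script above.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>
  <xsl:param name="lrv_gps_file"/>
  <xsl:variable name="delim" select="'&#9;'"/> <!-- tab -->
  <xsl:variable name="nl" select="'&#10;'"/> <!-- newline -->

  <xsl:template match="/">
    <xsl:apply-templates select="root/record"/>
  </xsl:template>

  <xsl:template match="record">
    <!-- Assumption: 2003/12/10.11:00 becomes the number 200312101100 once / . and : are removed -->
    <xsl:variable name="t" select="number(translate(date_time, '/.:', ''))"/>
    <!-- Keep the gps records whose timestamp is at or before this transaction,
         then take the last of them in document order (the latest, given ascending input) -->
    <xsl:variable name="gps"
      select="document($lrv_gps_file)/root/record[number(translate(date_time, '/.:', '')) &lt;= $t][last()]"/>
    <!-- Fields of the nearest preceding header in tran.xml, as in the original script -->
    <xsl:for-each select="preceding-sibling::header[1]/*">
      <xsl:value-of select="concat(., $delim)"/>
    </xsl:for-each>
    <!-- Longitude and latitude from the matched gps record (empty if nothing matched) -->
    <xsl:value-of select="concat($gps/longitude, $delim, $gps/latitude, $delim)"/>
    <!-- All fields of the transaction record itself, keeping its own date_time -->
    <xsl:for-each select="*">
      <xsl:value-of select="concat(., $delim)"/>
    </xsl:for-each>
    <xsl:value-of select="$nl"/>
  </xsl:template>
</xsl:stylesheet>
```

If I have this right, it should be run the same way as the original script, with lrv_gps_file=gps.xml on the command line. Two things I am unsure about: a tran.xml record that is earlier than every gps.xml record would get empty longitude and latitude columns, and the predicate rescans gps.xml for every transaction, which could be slow on large files.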
Can anyone help with a solution to this?
TIA