stuayre
asked on
extract text from html file
Hi,
I need a function to extract some data from inside an HTML string.
The information repeats itself on the page. Here's an example, in a <tr>:
<tr><td align=left class="stdtext">AA-T128</a ></td><td align=left class="stdtext"><a href="http://www.domain.com/shops/directlink.asp?name=AA-T128">Coriander Oil 10ml</td><td align=right class="stdtext">2</td><td align=right class="stdtext">£2.67</td> <td align=right class="stdtext"><input type=text size=4 name="qty_AA-T128" value="0"></td><td align=right class="stdtext"></td></tr>
I need to extract the product code (AA-T128) and the stock (2).
Any ideas?
cheers
Stu
ASKER
Hi,
You're right, I think that's a mistake on the website's part. Here's a bigger chunk of code.
It's not the problem though :)
Stu
<tr><td align=left class="stdtext">AA-T118</a></td><td align=left class="stdtext"><a href="http://www.domain.com/shops/directlink.asp?name=AA-T118">Mandarin Oil 10ml</td><td align=right class="stdtext">1</td><td align=right class="stdtext">£2.67</td><td align=right class="stdtext"><input type=text size=4 name="qty_AA-T118" value="0"></td><td align=right class="stdtext"></td></tr><input type=hidden name="code48" value="AA-T119"><tr><td align=left class="stdtext">AA-T119</a></td><td align=left class="stdtext"><a href="http://www.domain.com/shops/directlink.asp?name=AA-T119">Myrrh Oil 5ml</td><td align=right class="stdtext">2</td><td align=right class="stdtext">£3.37</td><td align=right class="stdtext"><input type=text size=4 name="qty_AA-T119" value="0"></td><td align=right class="stdtext"></td></tr><input type=hidden name="code49" value="AA-T1191"><tr><td align=left class="stdtext">AA-T1191</a></td><td align=left class="stdtext"><a href="http://www.domain.com/shops/directlink.asp?name=AA-T1191">Myrrh Oil 10ml</td><td align=right class="stdtext">1</td><td align=right class="stdtext">£5.39</td><td align=right class="stdtext"><input type=text size=4 name="qty_AA-T1191" value="0"></td><td align=right class="stdtext"></td></tr><input type=hidden name="code50" value="AA-T120"><tr><td align=left class="stdtext">AA-T120</a></td><td align=left class="stdtext"><a href="http://www.domain.com/shops/directlink.asp?name=AA-T120">Patchouli Oil 10ml</td><td align=right class="stdtext">2</td><td align=right class="stdtext">£2.29</td><td align=right class="stdtext"><input type=text size=4 name="qty_AA-T120" value="0"></td><td align=right class="stdtext"></td></tr><input type=hidden name="code51" value="AA-T121"><tr><td align=left class="stdtext">AA-T121</a></td><td align=left class="stdtext"><a href="http://www.domain.com/shops/directlink.asp?name=AA-T121">Peppermint English Oil 10ml</td><td align=right class="stdtext">6</td><td align=right class="stdtext">£2.42</td><td align=right 
class="stdtext"><input type=text size=4 name="qty_AA-T121" value="0"></td><td align=right class="stdtext"></td></tr><input type=hidden name="code52" value="AA-T122"><tr><td align=left class="stdtext">AA-T122</a></td><td align=left class="stdtext"><a href="http://www.domain.com/shops/directlink.asp?name=AA-T122">Pine Scotch Oil 10ml</td><td align=right class="stdtext">3</td><td align=right class="stdtext">£2.42</td><td align=right class="stdtext"><input type=text size=4 name="qty_AA-T122" value="0"></td><td align=right class="stdtext"></td></tr><input type=hidden name="code53" value="AA-T123"><tr><td align=left class="stdtext">AA-T123</a></td><td align=left class="stdtext"><a href="http://www.domain.com/shops/directlink.asp?name=AA-T123">Rosemary Oil 10ml</td><td align=right class="stdtext">3</td><td align=right class="stdtext">£2.29</td><td align=right class="stdtext"><input type=text size=4 name="qty_AA-T123" value="0"></td><td align=right class="stdtext"></td></tr><input type=hidden name="code54" value="AA-T124"><tr><td align=left class="stdtext">AA-T124</a></td><td align=left class="stdtext"><a href="http://www.domain.com/shops/directlink.asp?name=AA-T124">Sandalwood Oil 5ml</td><td align=right class="stdtext">2</td><td align=right class="stdtext">£5.12</td><td align=right class="stdtext"><input type=text size=4 name="qty_AA-T124" value="0"></td><td align=right class="stdtext"></td></tr><input type=hidden name="code55" value="AA-T1241"><tr><td align=left class="stdtext">AA-T1241</a></td><td align=left class="stdtext"><a href="http://www.domain.com/shops/directlink.asp?name=AA-T1241">Sandalwood Oil 10ml</td><td align=right class="stdtext">2</td><td align=right class="stdtext">£9.02</td><td align=right class="stdtext"><input type=text size=4 name="qty_AA-T1241" value="0"></td><td align=right class="stdtext"></td></tr><input type=hidden name="code56" value="AA-T125"><tr><td align=left class="stdtext">AA-T125</a></td><td align=left class="stdtext"><a 
href="http://www.domain.com/shops/directlink.asp?name=AA-T125">Tea Tree Oil 10ml</td><td align=right class="stdtext">8</td><td align=right class="stdtext">£2.21</td><td align=right class="stdtext"><input type=text size=4 name="qty_AA-T125" value="0"></td><td align=right class="stdtext"></td></tr><input type=hidden name="code57" value="AA-T1251"><tr><td align=left class="stdtext">AA-T1251</a></td><td align=left class="stdtext"><a href="http://www.domain.com/shops/directlink.asp?name=AA-T1251">Tea Tree Oil 30ml</td><td align=right class="stdtext">2</td><td align=right class="stdtext">£4.50</td><td align=right class="stdtext"><input type=text size=4 name="qty_AA-T1251" value="0"></td><td align=right class="stdtext"></td></tr><input type=hidden name="code58" value="AA-T126"><tr><td align=left class="stdtext">AA-T126</a></td><td align=left class="stdtext"><a href="http://www.domain.com/shops/directlink.asp?name=AA-T126">Ylang Ylang I Oil 10ml</td><td align=right class="stdtext">2</td><td align=right class="stdtext">£3.10</td><td align=right class="stdtext"><input type=text size=4 name="qty_AA-T126" value="0"></td><td align=right class="stdtext"></td></tr><input type=hidden name="code59" value="AA-T127"><tr><td align=left class="stdtext">AA-T127</a></td><td align=left class="stdtext"><a href="http://www.domain.com/shops/directlink.asp?name=AA-T127">Cedarwood Atlas Oil 10ml</td><td align=right class="stdtext">2</td><td align=right class="stdtext">£2.13</td><td align=right class="stdtext"><input type=text size=4 name="qty_AA-T127" value="0"></td><td align=right class="stdtext"></td></tr><input type=hidden name="code60" value="AA-T128"><tr><td align=left class="stdtext">AA-T128</a></td><td align=left class="stdtext"><a href="http://www.domain.com/shops/directlink.asp?name=AA-T128">Coriander Oil 10ml</td><td align=right class="stdtext">2</td><td align=right class="stdtext">£2.67</td><td align=right class="stdtext"><input type=text size=4 name="qty_AA-T128" value="0"></td><td 
align=right class="stdtext"></td></tr>
Actually, it will be ...
You would be using a parser to extract the value from a structure.
The structure is invalid, so the parser will fail at extracting the value.
it's full of mistakes, not much you can do with that
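Whether a parser fails here depends on how strict it is. A lenient, event-based tokenizer can often still walk markup like this, because it reports tags one at a time and does not require them to balance. Below is a minimal sketch, assuming Python (the thread does not name a language) and its standard `html.parser` module. The names `RowCollector`, `code_and_stock` and `SAMPLE_ROW` are made up for illustration, and the assumption that the code is the first cell and the stock the third comes only from the sample rows posted here.

```python
from html.parser import HTMLParser

# One row copied from the larger chunk posted in this thread.
SAMPLE_ROW = (
    '<input type=hidden name="code60" value="AA-T128">'
    '<tr><td align=left class="stdtext">AA-T128</a></td>'
    '<td align=left class="stdtext">'
    '<a href="http://www.domain.com/shops/directlink.asp?name=AA-T128">'
    'Coriander Oil 10ml</td>'
    '<td align=right class="stdtext">2</td>'
    '<td align=right class="stdtext">£2.67</td>'
    '<td align=right class="stdtext">'
    '<input type=text size=4 name="qty_AA-T128" value="0"></td>'
    '<td align=right class="stdtext"></td></tr>'
)


class RowCollector(HTMLParser):
    """Collect the text of every <td> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []     # finished rows, each a list of cell texts
        self._row = None   # cells of the current row, None outside <tr>
        self._cell = None  # text pieces of the current cell, None outside <td>

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row, self._cell = [], None
        elif tag == "td" and self._row is not None:
            self._cell = []
        # Any other start tag (<a>, <input>) is ignored.

    def handle_endtag(self, tag):
        if tag == "td" and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row, self._cell = None, None
        # A stray </a> with no start tag ends up here and is ignored.

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)


def code_and_stock(html):
    """Return (product code, stock) pairs, assuming cells 0 and 2 hold them."""
    parser = RowCollector()
    parser.feed(html)
    parser.close()
    return [(row[0], int(row[2]))
            for row in parser.rows
            if len(row) >= 3 and row[2].isdigit()]


print(code_and_stock(SAMPLE_ROW))
```

The sketch never builds a tree, so the unbalanced `<a>` tags do not matter to it. It would still break if the site changed the column order, so treat it as a starting point rather than a finished function.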
ASKER
Is it possible to write a regular expression to get everything between
<tr><td align=left class="stdtext"> and </a></td><td align=left class="stdtext"><a href
to get the product code,
and then another one to get everything between
</td><td align=right class="stdtext"> and </td><td align=right class="stdtext">£
to get the qty?
I found this question, if it helps:
https://www.experts-exchange.com/questions/22104497/Extract-data-from-HTML-Tables-form-post.html
cheers
Stu
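The two "everything between" patterns described above can be combined into one expression with two capture groups. This is only a sketch, assuming Python and its standard `re` module (the thread does not say which language is in use), and it is not the accepted solution from this thread. `ROW_RE`, `extract_with_regex` and `SAMPLE_ROWS` are hypothetical names. Whitespace between attributes is matched with `\s+` because the pasted markup wraps in the middle of some tags.

```python
import re

# Two rows copied from the larger chunk posted in this thread.
SAMPLE_ROWS = (
    '<tr><td align=left class="stdtext">AA-T118</a></td>'
    '<td align=left class="stdtext">'
    '<a href="http://www.domain.com/shops/directlink.asp?name=AA-T118">'
    'Mandarin Oil 10ml</td>'
    '<td align=right class="stdtext">1</td>'
    '<td align=right class="stdtext">£2.67</td>'
    '<td align=right class="stdtext">'
    '<input type=text size=4 name="qty_AA-T118" value="0"></td>'
    '<td align=right class="stdtext"></td></tr>'
    '<input type=hidden name="code48" value="AA-T119">'
    '<tr><td align=left class="stdtext">AA-T119</a></td>'
    '<td align=left class="stdtext">'
    '<a href="http://www.domain.com/shops/directlink.asp?name=AA-T119">'
    'Myrrh Oil 5ml</td>'
    '<td align=right class="stdtext">2</td>'
    '<td align=right class="stdtext">£3.37</td>'
    '<td align=right class="stdtext">'
    '<input type=text size=4 name="qty_AA-T119" value="0"></td>'
    '<td align=right class="stdtext"></td></tr>'
)

ROW_RE = re.compile(
    r'<tr>\s*<td\s+align=left\s+class="stdtext">\s*'
    r'([^<]+?)'                # group 1: product code
    r'\s*</a\s*>\s*</td>'      # the stray </a> that closes the code cell
    r'.*?'                     # lazily skip the product-name cell
    r'<td\s+align=right\s+class="stdtext">\s*'
    r'(\d+)'                   # group 2: stock, digits only
    r'\s*</td>\s*<td\s+align=right\s+class="stdtext">£',
    re.DOTALL,                 # let .*? cross line breaks in the source
)


def extract_with_regex(html):
    """Return a list of (product code, stock) pairs found in html."""
    return [(code, int(stock)) for code, stock in ROW_RE.findall(html)]


for code, stock in extract_with_regex(SAMPLE_ROWS):
    print(code, stock)
```

The lazy `.*?` keeps each match inside one row as long as every row has a stock cell followed by a price cell starting with £. If a row lacked that pair, the match could run on into the next row, so the pattern depends on the page keeping exactly this layout.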
This small class is rather helpful for this kind of tag parsing:
http://fit.c2.com/Release/Source/fit/Parse.java
ASKER CERTIFIED SOLUTION
ASKER
Hi Geert_Gruwez, it almost works.
It came back with this in memo2:
AA-T118
AA-T1191</a></td><td
align=left class="stdtext"><a href="http://www.domain.com/shops/directlink.asp?name=AA-T1191">Myrrh Oil 10ml</td><td align=right class="stdtext">1</td><td align=right class="stdtext">£5.39</td><td align=right class="stdtext"><input type=text
size=4 name="qty_AA-T1191" value="0"></td><td align=right class="stdtext"></td></tr><input type=hidden name="code50" value="AA-T120"><tr><td align=left class="stdtext">AA-T120</a></td><td align=left class="stdtext"><a
href="http://www.domain.com/shops/directlink.asp?name=AA-T120">Patchouli Oil 10ml</td><td align=right class="stdtext">2</td><td align=right class="stdtext">£2.29</td><td align=right class="stdtext"><input type=text size=4 name="qty_AA-
T120" value="0"></td><td align=right class="stdtext"></td></tr><input type=hidden name="code51" value="AA-T121"><tr><td align=left class="stdtext">AA-T121</a></td><td align=left class="stdtext"><a
href="http://www.domain.com/shops/directlink.asp?name=AA-T121">Peppermint English Oil 10ml</td><td align=right class="stdtext">6</td><td align=right class="stdtext">£2.42</td><td align=right class="stdtext"><input type=text size=4
name="qty_AA-T121" value="0"></td><td align=right class="stdtext"></td></tr><input type=hidden name="code52" value="AA-T122"><tr><td align=left class="stdtext">AA-T122</a></td><td align=left class="stdtext"><a
href="http://www.domain.com/shops/directlink.asp?name=AA-T122">Pine Scotch Oil 10ml</td><td align=right class="stdtext">3</td><td align=right class="stdtext">£2.42</td><td align=right class="stdtext"><input type=text size=4 name="qty_AA-
T122" value="0"></td><td align=right class="stdtext"></td></tr><input type=hidden name="code53" value="AA-T123"><tr><td align=left class="stdtext">AA-T123</a></td><td align=left class="stdtext"><a
href="http://www.domain.com/shops/directlink.asp?name=AA-T123">Rosemary Oil 10ml</td><td align=right class="stdtext">3</td><td align=right class="stdtext">£2.29</td><td align=right class="stdtext"><input type=text size=4 name="qty_AA-
T123" value="0"></td><td align=right class="stdtext"></td></tr><input type=hidden name="code54" value="AA-T124"><tr><td align=left class="stdtext">AA-T124</a></td><td align=left class="stdtext"><a
href="http://www.domain.com/shops/directlink.asp?name=AA-T124">Sandalwood Oil 5ml</td><td align=right class="stdtext">2</td><td align=right class="stdtext">£5.12</td><td align=right class="stdtext"><input type=text size=4 name="qty_AA-
T124" value="0"></td><td align=right class="stdtext"></td></tr><input type=hidden name="code55" value="AA-T1241"><tr><td align=left class="stdtext">AA-T1241</a></td><td align=left class="stdtext"><a
href="http://www.domain.com/shops/directlink.asp?name=AA-T1241">Sandalwood Oil 10ml</td><td align=right class="stdtext">2</td><td align=right class="stdtext">£9.02</td><td align=right class="stdtext"><input type=text size=4 name="qty_AA
-T1241" value="0"></td><td align=right class="stdtext"></td></tr><input type=hidden name="code56" value="AA-T125"><tr><td align=left class="stdtext">AA-T125</a></td><td align=left class="stdtext"><a
href="http://www.domain.com/shops/directlink.asp?name=AA-T125">Tea Tree Oil 10ml</td><td align=right class="stdtex
</td><td align=right class="stdtext">£3.10</td><td align=right class="stdtext"><input type=text size=4 name="qty_AA-T126" value="0"></td><td align=right class="stdtext"></td></tr><input type=hidden name="code59" value="AA-T127"><tr><td
align=left class="stdtext">AA-T127
ASKER
Oh sorry, I had word wrap on... doh!
ASKER
thanks :)
Some parts are invalid:
<a href="">Coriander Oil 10ml</td>
should be
<a href="">Coriander Oil 10ml</a></td>
and the first cell has a closing </a> with no matching start tag <a> ...
Or isn't this the problem?