VHSB
regex element cannot be found
I have the following regex: Salary</h3>([[:print:]]+)
The HTML I'm trying to parse is: <h3>Salary</h3>£12,500 - £14,500 pa</div>
Here is my code to get the salary: <cfset MatchSalary=REFindNoCase(#Trim(xmlObj.xmlRoot.site[1].detailpageparsers.parse[3].xmlAttributes.re)#, cfhttp.FileContent,1,True)>
<cfset thisSalary = mid(cfhttp.FileContent,MatchSalary.pos[2],MatchSalary.len[2])>
Problem:
The element at position 2 cannot be found.
The error occurred in C:\CFusionMX\wwwroot\Project\1.cfm: line 33
Called from C:\CFusionMX\wwwroot\Project\1.cfm: line 21
Called from C:\CFusionMX\wwwroot\Project\1.cfm: line 1
31 :
32 : <cfset MatchSalary=REFindNoCase(#Trim(xmlObj.xmlRoot.site[1].detailpageparsers.parse[3].xmlAttributes.re)#, cfhttp.FileContent,1,True)>
33 : <cfset thisSalary = mid(cfhttp.FileContent,MatchSalary.pos[2],MatchSalary.len[2])>
I would suggest doing a <cfdump var="#MatchSalary#"> and checking whether the data you're getting back is what you want. It sounds like your regex isn't grabbing the right info. What does it show when you cfdump MatchSalary?
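To make that check concrete, here is a minimal sketch of dumping the result and guarding the Mid() call. As far as I know, REFindNoCase with returnsubexpressions set to True returns pos and len arrays holding a single 0 when nothing matches, which would explain why position 2 cannot be found. The salaryPattern variable is an assumption for illustration only; in the original code the pattern comes from the XML attribute.

```cfm
<!--- Assumption: the pattern is inlined here for illustration; the original reads it from xmlObj. --->
<cfset salaryPattern = "Salary</h3>([[:print:]]+)">
<cfset MatchSalary = REFindNoCase(salaryPattern, cfhttp.FileContent, 1, True)>

<!--- Inspect the pos and len arrays that came back. --->
<cfdump var="#MatchSalary#">

<!--- pos[1] is 0 when nothing matched; only read pos[2] if the subexpression entry exists. --->
<cfif MatchSalary.pos[1] GT 0 AND ArrayLen(MatchSalary.pos) GTE 2>
    <cfset thisSalary = Mid(cfhttp.FileContent, MatchSalary.pos[2], MatchSalary.len[2])>
<cfelse>
    <!--- No match: fall back to an empty string instead of throwing. --->
    <cfset thisSalary = "">
</cfif>
```

With the guard in place, a page where the pattern doesn't match leaves thisSalary empty rather than raising the "element at position 2" error, so you can see from the dump which pages are failing.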
ASKER
I changed the regex to: Salary[</h3>]([[:print:]]+) and that got rid of the error message, but it has also given me a new problem.
That regex only returns something if there is text between the tags, for example:
for the HTML: <h3>Salary</h3>Depending on experience and qualifications</div>
The regex returns: Depending on experience and qualifications</div>
But for the HTML: <h3>Salary</h3>£12,500 - £14,500 pa</div>
The regex returns nothing.
Do you think it might be something to do with the £ character in the html?
ASKER CERTIFIED SOLUTION
Addendum: If the content between the </h3> tag and the </div> tag contains HTML, it'll only grab up to the next HTML tag. However, given your questions, I'm under the impression you don't expect it to contain any HTML.
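For anyone reading without access to the solution text: one pattern that behaves the way the addendum describes is a negated character class that captures everything up to the next "<". This is only a sketch under that assumption, not necessarily the pattern that was actually posted. The html and m variables are hypothetical names used for illustration.

```cfm
<!--- Hypothetical pattern: capture one or more characters that are not "<". --->
<cfset salaryPattern = "Salary</h3>([^<]+)">

<!--- Sample input taken from the question. --->
<cfset html = "<h3>Salary</h3>£12,500 - £14,500 pa</div>">

<cfset m = REFindNoCase(salaryPattern, html, 1, True)>

<!--- Only read the subexpression if it was actually captured. --->
<cfif m.pos[1] GT 0 AND ArrayLen(m.pos) GTE 2>
    <cfset thisSalary = Mid(html, m.pos[2], m.len[2])>
    <cfoutput>#HTMLEditFormat(thisSalary)#</cfoutput>
</cfif>
```

Because [^<] does not depend on which characters count as printable, it should not care about the £ sign. The trade-off is the one noted in the addendum: any markup inside the salary text ends the capture early.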
ASKER
Umbrae
"If the content in between the </h3> tag and the </div> tag contains html, it'll only grab up to the next html tag - however given your questions I'm under the impression you expect this to not contain any html." Yes you were right.
It worked a treat, thanks for your time.
Regards