Cole3388
asked on
Scrape Data From Web Page and Add to Variables in ColdFusion
This is a fairly general question. What I'm looking for is a method to scrape data from a web page, such as a few numbers or a few words, and store it in a cfset variable. I was curious whether anyone has a reference or a starting point I could use to figure out how to do this. For example, ColdFusion heads over to the Fox News website and grabs the front page, and I can later pull data out of that page if it exists, e.g. grab every 10 words after the word NASA on the Fox News homepage. Make sense?
You may need to use a regex function such as REReplace to pull out the text you want.
> Grab every 10 words after the word NASA on the Fox News homepage
Again, it would depend on how the content is formatted. "NASA" could be in the middle of plain text,
or inside HTML tags such as <span style="....">NASA</span>. One approach would be to remove all
of the HTML tags first (see this UDF from cflib.org):
http://www.cflib.org/udf.cfm?id=1598
Then use a regex (or possibly adapt this function) to grab the next 10 words after "NASA":
http://www.cflib.org/udf/FullLeft
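
To make the approach above concrete, here is a minimal sketch, assuming ColdFusion 8 or later (or Lucee). It is not the cflib.org UDFs themselves. It fetches a page with cfhttp, strips tags with a few crude regexes, and collects up to 10 words after each occurrence of "NASA". The URL and the variable names (targetUrl, pageText, matches) are placeholders made up for illustration.

```cfml
<!--- Hypothetical target URL; swap in the page you actually want to read. --->
<cfset targetUrl = "https://www.example.com/">

<!--- Fetch the page. The raw HTML comes back in httpResult.fileContent. --->
<cfhttp url="#targetUrl#" method="get" timeout="30" result="httpResult">

<!--- Crude tag stripping (an assumption, not a real HTML parser):
      drop script and style blocks, then any remaining tags,
      then collapse runs of whitespace into single spaces. --->
<cfset pageText = httpResult.fileContent>
<cfset pageText = reReplaceNoCase(pageText, "<script[^>]*>[\s\S]*?</script>", " ", "all")>
<cfset pageText = reReplaceNoCase(pageText, "<style[^>]*>[\s\S]*?</style>", " ", "all")>
<cfset pageText = reReplace(pageText, "<[^>]+>", " ", "all")>
<cfset pageText = reReplace(pageText, "\s+", " ", "all")>

<!--- Capture up to 10 whitespace-separated words after each "NASA". --->
<cfset pattern = "\bNASA\b((?:\s+\S+){1,10})">
<cfset matches = []>
<cfset startPos = 1>
<cfloop condition="true">
    <!--- The fourth argument (true) asks for the pos/len arrays of subexpressions. --->
    <cfset hit = reFindNoCase(pattern, pageText, startPos, true)>
    <cfif hit.pos[1] EQ 0><cfbreak></cfif>
    <!--- Index 1 is the whole match; index 2 is the captured group of words. --->
    <cfset arrayAppend(matches, trim(mid(pageText, hit.pos[2], hit.len[2])))>
    <!--- Resume searching just past the current match. --->
    <cfset startPos = hit.pos[1] + hit.len[1]>
</cfloop>

<cfdump var="#matches#">
```

Each element of matches can then be assigned to its own variable with cfset. Regex-based stripping will miss edge cases such as HTML entities or malformed markup, so a dedicated tag-stripping UDF is likely the more robust choice for real pages.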