Sorry, no. It needs to be done in JavaScript at this point.
I've been searching through the Regular Expression posts, but have not been able to successfully implement what I want. I want to read through a string consisting of an entire HTML page, pull out some key text from specific elements, and ideally populate a multidimensional array with the text.
Each entry for the array should have 8 sub items, and I only need the first 10 entries located within the original HTML string.
The items needed are located within a table and all, except for one, are wrapped in a div with a class of "list". The exception is the href attribute of the anchor element that immediately precedes the first div.
The code example lists an example entry (the text strings needed are wrapped in curly brackets {}). The at the beginning of the div.list appears to be inconsistent. It is not wanted regardless of if it's there or not.
Any help would be appreciated.
Sorry for the delay in replying. That may be possible; however, I wasn't aware that you can get elements from a string variable in the same way as from the page's DOM. I'll look into it further tomorrow.
The HTML is all within a string because I use a simple ColdFusion script (that acts as a proxy) to scrape an external page, then return the HTML as a string. I then plan to use JavaScript to extract the relevant contact details from that HTML string and append them onto the page.
The other issue is that the page I'm scraping isn't well-formed HTML, though hopefully that won't be a problem, as I'm simply planning on searching for items (div.list and the a@href immediately prior to the first div.list in the tr) within the HTML page/string.
I'm using jQuery's $.get function as shown below, via a server-side script acting as a proxy, to allow me to grab the external HTML and include it on the current page.
The "proxy.cfm" uses the "urltarget" to scrape the relevant page (because JavaScript can't do it itself due to browser cross-domain security), and it then returns the page content which gets assigned to the variable "data", which I then assign as a new variable "rawHTML".
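The snippet originally posted here was not preserved, so the following is only a minimal sketch of the call described above. It assumes "urltarget" is sent to proxy.cfm as a query parameter; the function name fetchRawHTML and the onLoaded callback are hypothetical, and jQuery is passed in as an argument rather than taken from the global scope.

```javascript
// Sketch only: fetch the external page through the server-side proxy.
// Assumption: proxy.cfm reads the page address from a "urltarget" parameter.
function fetchRawHTML($, urltarget, onLoaded) {
  $.get("proxy.cfm", { urltarget: urltarget }, function (data) {
    var rawHTML = data; // the whole scraped page as a single string
    onLoaded(rawHTML);  // hand the string on for extraction
  });
}
```

In the page itself this would be called as fetchRawHTML(jQuery, someUrl, callback), with the callback doing the extraction.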
Yes. That looks much closer to what I want.
The end result I want is a 2-dimensional array containing the contact details, for example:
Array[10][8] (10 rows of 8 items each), with an example of one row of the array containing:
[url, name, position, extension, unit, agency, phone, address]
Is it possible to have the same regular expression match the URL in the a@href preceding the first div.list containing the {name}, or do I need to do a separate RegEx?
Yes. I believe it always has the 7 lines within the group (within the <tr></tr>). The information would be output from a database somewhere, so I don't imagine the number of lines or elements would change.
The href attribute is always present. The others are also always present, though they may be empty.
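One way this could be approached is sketched below. Since the example markup was not preserved, the structure is an assumption: each contact sits in its own <tr>, with an <a href="..."> somewhere before the first <div class="list">, followed by seven such divs. Rather than one large expression, the sketch uses a row pattern, a separate pattern for the href, and a third for the divs. The names extractContacts and cleanText are hypothetical, and stripping a leading &nbsp; is only a guess at the kind of cleanup needed.

```javascript
// Sketch only: build [url, name, position, extension, unit, agency, phone, address]
// rows from the raw HTML string. The markup layout is assumed, not confirmed.
function extractContacts(rawHTML, maxRows) {
  var rowPattern = /<tr\b[^>]*>([\s\S]*?)<\/tr>/gi;
  // The greedy prefix makes this match the LAST anchor in the text it is given.
  var hrefPattern = /[\s\S]*<a\b[^>]*\bhref\s*=\s*["']([^"']*)["']/i;
  var contacts = [];
  var rowMatch;

  while (contacts.length < maxRows && (rowMatch = rowPattern.exec(rawHTML)) !== null) {
    var rowHTML = rowMatch[1];
    var firstDiv = rowHTML.search(/<div\b[^>]*\bclass\s*=\s*["']list["']/i);
    if (firstDiv === -1) continue; // row has no div.list, so it is not a contact

    // Only search the markup that precedes the first div.list for the link.
    var hrefMatch = hrefPattern.exec(rowHTML.slice(0, firstDiv));
    var entry = [hrefMatch ? hrefMatch[1] : ""];

    var divPattern = /<div\b[^>]*\bclass\s*=\s*["']list["'][^>]*>([\s\S]*?)<\/div>/gi;
    var divMatch;
    while ((divMatch = divPattern.exec(rowHTML)) !== null) {
      entry.push(cleanText(divMatch[1])); // empty divs become ""
    }
    contacts.push(entry);
  }
  return contacts;
}

// Remove nested tags, turn &nbsp; into a space, collapse and trim whitespace.
function cleanText(fragment) {
  return fragment
    .replace(/<[^>]*>/g, "")
    .replace(/&nbsp;/gi, " ")
    .replace(/\s+/g, " ")
    .replace(/^ | $/g, "");
}
```

Calling extractContacts(rawHTML, 10) would return at most the first 10 entries. Because the source page is not well-formed, nested tables or unclosed rows could defeat the row pattern, so the result is worth checking against the real page.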
by: HonorGod. Posted on 2009-08-06 at 17:22:18. ID: 25039176
If you are willing to use Python, there is a wonderful library called BeautifulSoup that parses the HTML and lets you easily search for the kind of content you want.
http://www.python.org
http://www.crummy.com/software/BeautifulSoup/