Solved

Extract table from html file string - Regular Expression?

Posted on 2009-07-16
Last Modified: 2012-05-07
I'm opening and reading an HTML file into a string using PHP.  I don't excel at regular expressions so I'm looking for some help.  I need to be able to extract one table from this HTML file that I know will always come immediately after the string "<p>Click on Course ID link in the first column to drop/change a class.</p>", two newline characters and maybe a little whitespace (a single tab or whatever that translates into).  I'm pretty sure this table will always start at the same line, but it will not always end at the same line.  Is this easily done using regular expressions?
Here's what I'm trying to do.  The table has some information I'm trying to extract, but I'm going to do it using JavaScript.  So, I'm trying to extract the info from an uploaded file into a PHP script that will print it so that when the page is loaded I can use JavaScript's DOM to gather the info.  I'm open to suggestions for an alternative method (e.g. a PHP table parser), but I'm on a shared server and I don't want to deal with any added extensions.
Question by:khsater
8 Comments
 

Accepted Solution

by:
mrjoltcola earned 400 total points
ID: 24872034
Yes, probably something like this.

/<p>Click on Course ID link in the first column to drop/change a class.<\/p>.*?(<table>.*<\/table>)/


Table will be captured in $1 of a regex engine

Author Comment

by:khsater
ID: 24872965
You forgot to escape one of the slashes, which took me a while to notice, but besides that I couldn't get the regular expression to match when I used fread to read the file.  It works perfectly when I copy and paste the code, though.  Any idea why?

Expert Comment

by:mrjoltcola
ID: 24873007
Not sure. The surrounding / / are just the normal notation for regular expressions. Those are probably left off in PHP, if I recall.

Author Comment

by:khsater
ID: 24873057
That wasn't it.  The surrounding slashes are included in PHP pregs.
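
To illustrate the point about delimiters: PHP's preg functions do require them, but the delimiter does not have to be a slash. The sketch below is only an illustration (the sample input and course ID are made up) and uses # as the delimiter so that the slashes in the marker text and the closing tags need no escaping.

```php
<?php
// Hypothetical sample input, for illustration only.
$html = "<p>Click on Course ID link in the first column to drop/change a class.</p>\n\n\t<table><tr><td>MATH101</td></tr></table>";

// With # as the delimiter, / needs no escaping inside the pattern.
// \. matches the literal full stop, \s* skips the newlines and tab,
// and the s modifier lets . match newlines inside the table.
$pattern = '#<p>Click on Course ID link in the first column to drop/change a class\.</p>\s*(<table>.*?</table>)#s';

$found = preg_match($pattern, $html, $matches);
$table = ($found === 1) ? $matches[1] : null;
```

Any # that appeared inside the pattern itself would then need a backslash instead.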

Assisted Solution

by:Terry Woods
Terry Woods earned 100 total points
ID: 24874016
Expanding on mrjoltcola's code, you'll want to allow the . regex wildcard to match newline characters by the sound of it. With preg_match, you'll need the "s" pattern modifier to do that:
$input = "blah
blah
asdfsdf  <p>Click on Course ID link in the first column to drop/change a class.</p>
  <table> table contents
more contents
more contents </table>
blah blah blah";
 
# Single quotes keep the backslashes literal; the s modifier lets . match newlines,
# and \. matches the literal full stop in the marker sentence.
$pattern = '/<p>Click on Course ID link in the first column to drop\/change a class\.<\/p>.*?(<table>.*?<\/table>)/s';
if (preg_match($pattern, $input, $matches)) {
    print "Result:{$matches[1]}";
}
 
Output:
Result:<table> table contents
more contents
more contents </table>
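
If the real file's table tag carries attributes or uses upper-case tag names, the literal <table> in the pattern will not match. The following is a minimal sketch of one way to loosen it, assuming PHP 7.1 or later for the nullable return type; extractTableAfter is a hypothetical helper name, and the sample input is invented for illustration.

```php
<?php
/**
 * Return the first <table>...</table> block that follows $marker, or null.
 * Hypothetical helper: the name and signature are illustrative only.
 */
function extractTableAfter(string $html, string $marker): ?string
{
    // preg_quote escapes regex metacharacters in the marker, plus the / delimiter.
    $quoted = preg_quote($marker, '/');

    // <table\b[^>]*> tolerates attributes on the opening tag.
    // s lets . span newlines; i ignores case (for the marker text as well).
    $pattern = '/' . $quoted . '.*?(<table\b[^>]*>.*?<\/table>)/si';

    return preg_match($pattern, $html, $matches) === 1 ? $matches[1] : null;
}

// Made-up input for illustration.
$marker = '<p>Click on Course ID link in the first column to drop/change a class.</p>';
$input  = "intro\n" . $marker . "\n\n\t<table border=\"1\">\n<tr><td>MATH101</td></tr>\n</table>\nfooter";

$table = extractTableAfter($input, $marker);
print $table === null ? "No table found\n" : $table . "\n";
```

Because the inner match is lazy, it stops at the first closing table tag, so a table nested inside the target table would be cut short; a regular expression is only a reasonable fit here if the target table has no nested tables.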


Expert Comment

by:Terry Woods
ID: 24874030
Pattern modifiers etc are documented in the PCRE cheat sheet, in case you haven't seen it, available from here:
http://www.phpguru.org/article/pcre-cheat-sheet

Expert Comment

by:mrjoltcola
ID: 24874321
Thanks TerryAtOpus, I can't believe I blanked out on that one detail. I was a bit too distracted, I apologize.

Expert Comment

by:mrjoltcola
ID: 24874825
Terry did the heavy lifting on that one, I feel like I took too many points. :(
Glad to help.
