Solved

Extract table from html file string - Regular Expression?

Posted on 2009-07-16
1,407 Views
Last Modified: 2012-05-07
I'm opening an HTML file and reading it into a string using PHP.  I don't excel at regular expressions, so I'm looking for some help.  I need to extract one table from this HTML file.  I know the table will always come immediately after the string "<p>Click on Course ID link in the first column to drop/change a class.</p>", followed by two newline characters and maybe a little whitespace (a single tab, or whatever that translates into).  I'm pretty sure this table will always start at the same line, but it will not always end at the same line.  Is this easily done using regular expressions?
Here's what I'm trying to do.  The table has some information I want to extract, but I'm going to do the extraction using JavaScript.  So I'm trying to pull the table from an uploaded file into a PHP script that prints it, so that when the page is loaded I can use JavaScript's DOM to gather the info.  I'm open to suggestions for an alternate method (e.g. a PHP table parser), but I'm on a shared server and I don't want to deal with any added extensions.
Question by:khsater
8 Comments
 
LVL 40

Accepted Solution

by:
mrjoltcola earned 400 total points
ID: 24872034
Yes, probably something like this.

/<p>Click on Course ID link in the first column to drop/change a class.<\/p>.*?(<table>.*<\/table>)/


The table will be captured in group 1 ($1 in most regex engines).
 
LVL 4

Author Comment

by:khsater
ID: 24872965
You forgot to escape one of the slashes, which took me a while to notice.  Besides that, I couldn't get the regular expression to match when I used fread to read the file.  It works perfectly when I copy and paste the code, though.  Any idea why?
 
LVL 40

Expert Comment

by:mrjoltcola
ID: 24873007
Not sure. The surrounding / / are just the normal notation for regular expressions. Those are probably left off in PHP, if I recall.

 
LVL 4

Author Comment

by:khsater
ID: 24873057
That wasn't it.  The surrounding slashes are included in PHP pregs.
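
For reference, a minimal sketch of how delimiters behave in PHP's preg functions: the pattern string has to include them, and any delimiter character inside the pattern must be escaped. Choosing a delimiter that doesn't occur in the HTML (here "#", an arbitrary choice) and passing the literal text through preg_quote() avoids hand-escaping the slashes. The variable names and the sample input are made up for illustration.

```php
<?php
// Literal marker text from the question; it contains "/" and "." characters.
$marker = '<p>Click on Course ID link in the first column to drop/change a class.</p>';

// preg_quote() escapes regex metacharacters; the second argument also
// escapes the chosen delimiter if it appears in the text.
$quoted = preg_quote($marker, '#');

// "#" is the delimiter, so "/" inside the pattern needs no special care.
$pattern = '#' . $quoted . '#';

// Hypothetical single-line sample input.
$html = "intro text " . $marker . " more text";

// preg_match() returns 1 on a match and 0 when nothing matches.
$found = preg_match($pattern, $html);

print $found ? "marker found\n" : "marker not found\n";
```

This only demonstrates delimiters and escaping on a single-line string; it does not address matching across newlines.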
 
LVL 35

Assisted Solution

by:Terry Woods
Terry Woods earned 100 total points
ID: 24874016
Expanding on mrjoltcola's code: by the sound of it, you'll want to allow the . regex wildcard to match newline characters. With preg_match, you'll need the "s" pattern modifier to do that:
$input = "blah
blah
asdfsdf  <p>Click on Course ID link in the first column to drop/change a class.</p>
  <table> table contents
more contents
more contents </table>
blah blah blah";
 
# The "s" pattern modifier (PCRE_DOTALL) lets . match newlines:
$pattern = "/<p>Click on Course ID link in the first column to drop\/change a class\.<\/p>.*?(<table>.*?<\/table>)/s";
if (preg_match($pattern, $input, $matches)) {
    print "Result:{$matches[1]}";
} else {
    print "No match";
}
 
Output:
Result:<table> table contents
more contents
more contents </table>
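
A slightly more defensive variation on the pattern above, offered as a sketch rather than a tested fix for the asker's file. Real tables often carry attributes (for example <table border="1">), which a literal <table> will not match, and files read from disk may use \r\n line endings. The function name extractCourseTable is hypothetical, and the "i" modifier and the \b[^>]* attribute allowance are assumptions about the input, not something confirmed in this thread. The nullable return type needs PHP 7.1 or later.

```php
<?php
// Hypothetical helper: returns the first table after the marker paragraph,
// or null when the marker or the table is not found.
function extractCourseTable(string $html): ?string
{
    $marker = '<p>Click on Course ID link in the first column to drop/change a class.</p>';

    $pattern = '#' . preg_quote($marker, '#')   // literal marker text, safely escaped
             . '\s*'                             // newlines, tabs, spaces (covers \r\n too)
             . '(<table\b[^>]*>.*?</table>)'     // the table, with optional attributes
             . '#si';                            // s: . matches newlines, i: case-insensitive

    return preg_match($pattern, $html, $matches) ? $matches[1] : null;
}

// Made-up sample input; in practice the string might come from
// file_get_contents() on the uploaded file.
$input = "blah\r\n<p>Click on Course ID link in the first column to drop/change a class.</p>\r\n\r\n\t"
       . "<table border=\"1\">\r\n<tr><td>CS101</td></tr>\r\n</table>\r\nblah";

$table = extractCourseTable($input);
print $table === null ? "no table found\n" : $table . "\n";
```

Because the .*? is lazy, the capture stops at the first </table>, so a table nested inside the target table would be cut short.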

 
LVL 35

Expert Comment

by:Terry Woods
ID: 24874030
Pattern modifiers etc. are documented in the PCRE cheat sheet, in case you haven't seen it. It's available here:
http://www.phpguru.org/article/pcre-cheat-sheet
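
Since the question also asked about alternatives to a regex, here is a rough sketch using PHP's DOMDocument class. It assumes the DOM extension is available on the shared host (it is normally enabled by default, but that is not confirmed in this thread) and PHP 7.1 or later for the type declarations. The function name findTableAfterMarker and the sample HTML are hypothetical.

```php
<?php
// Hypothetical helper: returns the HTML of the <table> that directly follows
// the marker paragraph, or null if none is found.
function findTableAfterMarker(string $html, string $markerText): ?string
{
    $doc = new DOMDocument();
    libxml_use_internal_errors(true);   // collect parser warnings quietly
    $doc->loadHTML($html);
    libxml_clear_errors();

    foreach ($doc->getElementsByTagName('p') as $p) {
        if (trim($p->textContent) !== $markerText) {
            continue;
        }
        // Skip whitespace text nodes until the next element sibling.
        for ($node = $p->nextSibling; $node !== null; $node = $node->nextSibling) {
            if ($node->nodeType === XML_ELEMENT_NODE) {
                return $node->nodeName === 'table' ? $doc->saveHTML($node) : null;
            }
        }
    }
    return null;
}

// Made-up sample document.
$html = "<html><body><p>intro</p>\n"
      . "<p>Click on Course ID link in the first column to drop/change a class.</p>\n\n\t"
      . "<table><tr><td>CS101</td></tr></table>\n<p>after</p></body></html>";

$table = findTableAfterMarker($html, 'Click on Course ID link in the first column to drop/change a class.');
print $table === null ? "no table found\n" : $table . "\n";
```

The parser handles nested tables and attribute variations that a regex would need special cases for, at the cost of re-serializing the markup, so the returned HTML may not be byte-for-byte identical to the source.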
 
LVL 40

Expert Comment

by:mrjoltcola
ID: 24874321
Thanks TerryAtOpus, I can't believe I blanked out on that one detail. I was a bit too distracted, I apologize.
 
LVL 40

Expert Comment

by:mrjoltcola
ID: 24874825
Terry did the heavy lifting on that one, I feel like I took too many points. :(
Glad to help.
