Solved

PHP Pregmatch

Posted on 2009-05-09
Last Modified: 2012-05-06
I fetch a web page whose code looks like the sample below. Please help me use preg_match to extract the text of the second TD of each row ("VSKP" in the first row, "DVD" in the last row of this example) and of the corresponding third TD ("VISHAKAPATNAM" and "DUVVADA" in this example) into two separate arrays: one for the second TD (VSKP, ...) and one for the third TD (VISHAKAPATNAM, ...).

I'm using the curl functions to fetch the web page code.

Please help me with how to proceed. Is there any way other than preg_match?
<TR>
<TD ALIGN = Center>1</TD>
<TD ALIGN = Center>VSKP</TD>
<TD>VISHAKAPATNAM  </TD>
<TD ALIGN = Center>1</TD>
 
<TD ALIGN = Center>17:25</TD>
<TD ALIGN = Center>0</TD>
<TD ALIGN = Center>1</TD>
<TD ALIGN = Center>                </TD>
</TR>
<TR>
<TD ALIGN = Center>2</TD>
<TD ALIGN = Center>DVD </TD>
<TD>DUVVADA        </TD>
<TD ALIGN = Center>1</TD>

Question by:navinbabu
13 Comments
 
LVL 3

Expert Comment

by:webvogel
ID: 24345831
I think this will help you:
<?php 
$str = "<TR>
<TD ALIGN = Center>1</TD>
<TD ALIGN = Center>VSKP</TD>
<TD>VISHAKAPATNAM  </TD>
<TD ALIGN = Center>1</TD>
 
<TD ALIGN = Center>17:25</TD>
<TD ALIGN = Center>0</TD>
<TD ALIGN = Center>1</TD>
<TD ALIGN = Center>                </TD>
</TR>
<TR>
<TD ALIGN = Center>2</TD>
<TD ALIGN = Center>DVD </TD>
<TD>DUVVADA        </TD>
<TD ALIGN = Center>1</TD>";
 
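// Capture the text of the second and third <TD> of every row.
// Without the "s" modifier "." does not match line breaks, so each ".*" stays on one line.
preg_match_all('#<TR>\s*<TD.*>.*</TD>\s*<TD.*>(.*)</TD>\s*<TD.*>(.*)</TD>#i', $str, $res);
echo "<pre>";
print_r($res[1]); // second TD of each row (station codes)
print_r($res[2]); // third TD of each row (station names)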
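
The captured values keep the padding from the page (for example "DVD " with a trailing space). A minimal sketch of one way to clean them up, assuming the same pattern and two rows copied from the question; the variable names $codes, $names and $byCode are illustrative and not part of the original answer:

```php
<?php
// Two rows copied from the question's sample.
$str = "<TR>
<TD ALIGN = Center>1</TD>
<TD ALIGN = Center>VSKP</TD>
<TD>VISHAKAPATNAM  </TD>
</TR>
<TR>
<TD ALIGN = Center>2</TD>
<TD ALIGN = Center>DVD </TD>
<TD>DUVVADA        </TD>";

preg_match_all('#<TR>\s*<TD.*>.*</TD>\s*<TD.*>(.*)</TD>\s*<TD.*>(.*)</TD>#i', $str, $res);

// Strip the padding spaces from every captured value.
$codes = array_map('trim', $res[1]);
$names = array_map('trim', $res[2]);

// Optional: one lookup array keyed by station code.
$byCode = array_combine($codes, $names);

print_r($codes);
print_r($names);
print_r($byCode);
```

The two separate arrays asked for in the question are $codes and $names; $byCode is only a convenience if a code-to-name lookup is wanted.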

 
LVL 6

Expert Comment

by:abhi376
ID: 24346950
Well, if we have to parse the whole page and find only the table, how do we proceed then?
 
LVL 2

Author Comment

by:navinbabu
ID: 24347616
Thanks webvogel,

If I have the entire HTML page, how do I get only the table, so that I can extract the TR and TD contents as shown in your example? I'm posting the sample HTML page.
 
<HTML>
<HEAD>
 
<Title>TRAIN NUMBERS</Title>
<HEAD>
<style type = "text/css">a:link{	color: blue; TEXT-DECORATION: none; }
a:visited{	color: rgb(0,153,153); } 
a:active {	color: rgb(255,102,0);}
p{	background-color: #E8E8D0; font-size: 20px;   FONT-FAMILY: arial,san-serif; FONT-SIZE: 10px; TEXT-DECORATION: none   color: #800000; }
body         { font-family: Verdana, Arial, Helvetica; background-color: FFFFFF;                color: #003366; font-size:10pt }
table        { table-border-color-light: rgb(00,153,153); table-border-color-dark:                rgb(00,153,153); background-color: #FFFFFF; font-size: 12px;                font-family: arial, san-serif; text-decoration:                none table-align Center; color: #800000;  }
h1, h2, h3, h4, h5, h6 { font-family:  Arial, Helvetica }
h1           
{ color: #800000; text-align: center; font-family: Arial; background-color:#FFFFFF;               font-size: 13pt; letter-spacing: 1pt; font-weight:                bold }
h2{	color: rgb(100,10,200); 	font-weight:bold;}
h3{	color: orange); }
h4           
{ font-family: arial, san-serif; text-decoration: none color #0066CC; color:                #FFFFFF; background-color: #003399; font-size: 10px }
h5           
{ color: #000066; font-size: 85%; text-align: Center; font-family: Arial }
h6           
{ color: #800080; font-family: Times New Roman; font-size: 10px;                letter-spacing: 1pt; text-align: Right; background-color:                #E8E8D0; font-style: italic; font-weight: bold }
.LegText     
{ color: #800000; font-family: arial, san-serif; font-size: 12px;                text-decoration: none; background-color: #FFFFFF }              
th           
{ background-color:#009999 ; font-size: 12px; font-family: arial, san-serif;                text-decoration: none; color: #FFFFFF }}
{ background-color:#009999 ; font-size: 12px; font-family: arial, san-serif;                text-decoration: none; color: #FFFFFF }}
 
</style>
</HEAD>
<BR>
 <BODY><table border="0" cellPadding="0" cellSpacing="0" width="95%"><tr>      <td align="center">         </td></tr></table>
 
<TABLE BORDER ALIGN=center >
<TR>
</TR>
</TABLE>
<TABLE border="0">
<td valign="top">
<H1 ALIGN = Center> TRAIN ROUTE </H1>
<TABLE BORDER ALIGN=center >
 <CAPTION> You Queried For </CAPTION>
<TR>
<TH>Train No</TH>
 
<TH>Train Name</TH>
<TH>Date</TH>
<TH>Runs From Source</TH>
<TH COLSPAN=7>Runs On</TH>
</TR>
<TR>
<TD>2727 </TD>
<TD>GODAVARI EXP   </TD>
<TD ALIGN = center>10/05/2009</TD>
<TD ALIGN = center>VISHAKAPATNAM  </TD>
 
<TD>MON </TD>
<TD>TUE </TD>
<TD>WED </TD>
<TD>THU </TD>
<TD>FRI </TD>
<TD>SAT </TD>
<TD>SUN </TD></TR>
</TABLE>
</td>
<td style="background-color: #FFFFFF" width="391" valign="top" rowspan="2">
 
<div id="narrow_ads_bottom"></div>
</td>
<tr>
<td valign="top">
<TABLE BORDER ALIGN=center>
<TR>
<TH>SNo</TH>
<TH>Stn Code</TH>
<TH>Stn Name</TH>
<TH>Route No.</TH>
<TH>Arrival Time</TH>
<TH>Dep. Time</TH>
 
<TH>Distance</TH>
<TH>Day</TH>
<TH>Remark</TH>
</TR>
<TR>
<TD ALIGN = Center>1</TD>
<TD ALIGN = Center>VSKP</TD>
<TD>VISHAKAPATNAM  </TD>
<TD ALIGN = Center>1</TD>
<TD ALIGN = Center><FONT COLOR = red>Source</TD>
 
<TD ALIGN = Center>17:25</TD>
<TD ALIGN = Center>0</TD>
<TD ALIGN = Center>1</TD>
<TD ALIGN = Center>                </TD>
</TR>
<TR>
<TD ALIGN = Center>2</TD>
<TD ALIGN = Center>DVD </TD>
<TD>DUVVADA        </TD>
<TD ALIGN = Center>1</TD>
 
<TD ALIGN = Center>17:53</TD>
<TD ALIGN = Center>17:54</TD>
<TD ALIGN = Center>18</TD>
<TD ALIGN = Center>1</TD>
<TD ALIGN = Center>                </TD>
</TR>
<TR>
<TD ALIGN = Center>3</TD>
<TD ALIGN = Center>AKP </TD>
<TD>ANAKAPALLE     </TD>
 
<TD ALIGN = Center>1</TD>
<TD ALIGN = Center>18:08</TD>
<TD ALIGN = Center>18:09</TD>
<TD ALIGN = Center>34</TD>
<TD ALIGN = Center>1</TD>
<TD ALIGN = Center>                </TD>
</TR>
<TR>
<TD ALIGN = Center>4</TD>
<TD ALIGN = Center>YLM </TD>
 
<TD>ELLAMANCHILI   </TD>
<TD ALIGN = Center>1</TD>
<TD ALIGN = Center>18:28</TD>
<TD ALIGN = Center>18:29</TD>
<TD ALIGN = Center>57</TD>
<TD ALIGN = Center>1</TD>
<TD ALIGN = Center>                </TD>
</TR>
<TR>
<TD ALIGN = Center>5</TD>
 
<TD ALIGN = Center>NRP </TD>
<TD>NARSIPATNAM RD </TD>
<TD ALIGN = Center>1</TD>
<TD ALIGN = Center>18:43</TD>
<TD ALIGN = Center>18:44</TD>
<TD ALIGN = Center>75</TD>
<TD ALIGN = Center>1</TD>
<TD ALIGN = Center>                </TD>
</TR>
 
<TR>
<TD ALIGN = Center>6</TD>
<TD ALIGN = Center>TUNI</TD>
<TD>TUNI           </TD>
<TD ALIGN = Center>1</TD>
<TD ALIGN = Center>18:59</TD>
<TD ALIGN = Center>19:00</TD>
<TD ALIGN = Center>97</TD>
<TD ALIGN = Center>1</TD>
 
<TD ALIGN = Center>                </TD>
</TR>
<TR>
<TD ALIGN = Center>7</TD>
<TD ALIGN = Center>ANV </TD>
<TD>ANNAVARAM      </TD>
<TD ALIGN = Center>1</TD>
<TD ALIGN = Center>19:13</TD>
<TD ALIGN = Center>19:14</TD>
<TD ALIGN = Center>114</TD>
 
<TD ALIGN = Center>1</TD>
<TD ALIGN = Center>                </TD>
</TR>
<TR>
<TD ALIGN = Center>8</TD>
<TD ALIGN = Center>PAP </TD>
<TD>PITHAPURAM     </TD>
<TD ALIGN = Center>1</TD>
<TD ALIGN = Center>19:31</TD>
<TD ALIGN = Center>19:32</TD>
 
<TD ALIGN = Center>139</TD>
<TD ALIGN = Center>1</TD>
<TD ALIGN = Center>                </TD>
</TR>
<TR>
<TD ALIGN = Center>9</TD>
<TD ALIGN = Center>SLO </TD>
<TD>SAMALKOT JN    </TD>
<TD ALIGN = Center>1</TD>
<TD ALIGN = Center>19:44</TD>
 
<TD ALIGN = Center>19:45</TD>
<TD ALIGN = Center>151</TD>
<TD ALIGN = Center>1</TD>
<TD ALIGN = Center>                </TD>
</TR>
<TR>
<TD ALIGN = Center>10</TD>
<TD ALIGN = Center>APT </TD>
<TD>ANAPARTI       </TD>
<TD ALIGN = Center>1</TD>
 
<TD ALIGN = Center>20:04</TD>
<TD ALIGN = Center>20:05</TD>
<TD ALIGN = Center>177</TD>
<TD ALIGN = Center>1</TD>
<TD ALIGN = Center>                </TD>
</TR>
<TR>
<TD ALIGN = Center>11</TD>
<TD ALIGN = Center>RJY </TD>
<TD>RAJAMUNDRY     </TD>
 
<TD ALIGN = Center>1</TD>
<TD ALIGN = Center>20:41</TD>
<TD ALIGN = Center>20:45</TD>
<TD ALIGN = Center>201</TD>
<TD ALIGN = Center>1</TD>
<TD ALIGN = Center>                </TD>
</TR>
<TR>
<TD ALIGN = Center>12</TD>
<TD ALIGN = Center>NDD </TD>
 
<TD>NIDADAVOLU JN  </TD>
<TD ALIGN = Center>1</TD>
<TD ALIGN = Center>21:09</TD>
<TD ALIGN = Center>21:10</TD>
<TD ALIGN = Center>223</TD>
<TD ALIGN = Center>1</TD>
<TD ALIGN = Center>                </TD>
</TR>
<TR>
<TD ALIGN = Center>13</TD>
 
<TD ALIGN = Center>TDD </TD>
<TD>TADEPALLIGUDEM </TD>
<TD ALIGN = Center>1</TD>
<TD ALIGN = Center>21:27</TD>
<TD ALIGN = Center>21:28</TD>
<TD ALIGN = Center>243</TD>
<TD ALIGN = Center>1</TD>
<TD ALIGN = Center>                </TD>
</TR>
 
<TR>
<TD ALIGN = Center>14</TD>
<TD ALIGN = Center>EE  </TD>
<TD>ELURU          </TD>
<TD ALIGN = Center>1</TD>
<TD ALIGN = Center>22:01</TD>
<TD ALIGN = Center>22:02</TD>
<TD ALIGN = Center>291</TD>
<TD ALIGN = Center>1</TD>
 
<TD ALIGN = Center>                </TD>
</TR>
<TR>
<TD ALIGN = Center>15</TD>
<TD ALIGN = Center>BZA </TD>
<TD>VIJAYAWADA JN  </TD>
<TD ALIGN = Center>1</TD>
<TD ALIGN = Center>23:40</TD>
<TD ALIGN = Center>23:55</TD>
<TD ALIGN = Center>350</TD>
 
<TD ALIGN = Center>1</TD>
<TD ALIGN = Center>                </TD>
</TR>
<TR>
<TD ALIGN = Center>16</TD>
<TD ALIGN = Center>KMT </TD>
<TD>KHAMMAM        </TD>
<TD ALIGN = Center>1</TD>
<TD ALIGN = Center>01:13</TD>
<TD ALIGN = Center>01:15</TD>
 
<TD ALIGN = Center>452</TD>
<TD ALIGN = Center>2</TD>
<TD ALIGN = Center>                </TD>
</TR>
<TR>
<TD ALIGN = Center>17</TD>
<TD ALIGN = Center>WL  </TD>
<TD>WARANGAL       </TD>
<TD ALIGN = Center>1</TD>
<TD ALIGN = Center>02:46</TD>
 
<TD ALIGN = Center>02:48</TD>
<TD ALIGN = Center>559</TD>
<TD ALIGN = Center>2</TD>
<TD ALIGN = Center>                </TD>
</TR>
<TR>
<TD ALIGN = Center>18</TD>
<TD ALIGN = Center>KZJ </TD>
<TD>KAZIPET JN     </TD>
<TD ALIGN = Center>1</TD>
 
<TD ALIGN = Center>03:05</TD>
<TD ALIGN = Center>03:07</TD>
<TD ALIGN = Center>569</TD>
<TD ALIGN = Center>2</TD>
<TD ALIGN = Center>                </TD>
</TR>
<TR>
<TD ALIGN = Center>19</TD>
<TD ALIGN = Center>SC  </TD>
<TD>SECUNDERABAD JN</TD>
 
<TD ALIGN = Center>1</TD>
<TD ALIGN = Center>05:45</TD>
<TD ALIGN = Center>05:50</TD>
<TD ALIGN = Center>701</TD>
<TD ALIGN = Center>2</TD>
<TD ALIGN = Center>                </TD>
</TR>
<TR>
<TD ALIGN = Center>20</TD>
<TD ALIGN = Center>HYB </TD>
 
<TD>HYDERABAD DECAN</TD>
<TD ALIGN = Center>1</TD>
<TD ALIGN = Center>06:15</TD>
<TD ALIGN = Center><FONT COLOR = red>Destination</TD>
<TD ALIGN = Center>710</TD>
<TD ALIGN = Center>2</TD>
<TD ALIGN = Center>                </TD>
</TR>
</TABLE>
</td>
 
</tr>
</table>
 
</BODY>
</HTML>


 
LVL 3

Expert Comment

by:webvogel
ID: 24348577
Hello navinbabu,

$str = your HTML code
I changed only the line with the table you want, like this:
<TABLE id="my_table" BORDER ALIGN=center>

(You have to write border="1" or border="0", not just border.)

In the first step I search for this table; the second step is the same as the one I posted before:
preg_match('#(<TABLE id="my_table".*>.*</TABLE>)#isU', $str, $table);
preg_match_all('#<TR>\s*<TD.*>.*</TD>\s*<TD.*>(.*)</TD>\s*<TD.*>(.*)</TD>#i', $table[1], $res);
 
echo "<pre>"; 
print_r($table[1]);
print_r($res[1]);
print_r($res[2]);

 
LVL 2

Author Comment

by:navinbabu
ID: 24348614
Webvogel,

The page I get can't be modified; anything I change has to be done in my code after fetching it. I fetch the entire remote page using the curl functions, so I can't change anything in the source.

Here I have 2 tables that start with <TABLE BORDER ALIGN=center>. Suppose the preg_match result variable is $res: can I access the one I want with something like $res[1][0]? I hope that makes sense.

If yes, please help me with the code.

Thanks for the workaround.
 
LVL 3

Expert Comment

by:webvogel
ID: 24348664
If the page's code cannot be changed, this will work:
<?php
preg_match('#(<TABLE BORDER ALIGN=center>.*</TABLE>)#isU', $str, $table);
preg_match_all('#<TR>\s*<TD.*>.*</TD>\s*<TD.*>(.*)</TD>\s*<TD.*>(.*)</TD>#i', $table[1], $res);
echo "<pre>";
print_r($table[1]);
print_r($res[1]);
print_r($res[2]);
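
An exact-string pattern like this only matches when the fetched page contains the tag character for character. In the posted page, two of the three tables are written `<TABLE BORDER ALIGN=center >` with a space before the `>`, so small differences are easy to run into. A hedged sketch, using a hypothetical fragment and a hypothetical whitespace-tolerant pattern ($tolerant), that also checks the return value of preg_match before using the result:

```php
<?php
// Hypothetical fragment: note the space before ">" in the opening tag.
$str = "<TABLE BORDER ALIGN=center >\n<TR>\n<TD>1</TD>\n</TR>\n</TABLE>";

// The pattern from the answer above: requires the tag exactly as typed.
$exact = '#(<TABLE BORDER ALIGN=center>.*</TABLE>)#isU';
// Assumed variant: allows optional whitespace around "=" and before ">".
$tolerant = '#(<TABLE\s+BORDER\s+ALIGN\s*=\s*center\s*>.*</TABLE>)#isU';

// preg_match returns 1 on a match and 0 when nothing matched.
$foundExact    = preg_match($exact, $str, $table);
$foundTolerant = preg_match($tolerant, $str, $table2);

var_dump($foundExact);    // int(0)
var_dump($foundTolerant); // int(1)

if ($foundTolerant === 1) {
    // Only now is $table2[1] safe to pass on to the second preg_match_all.
    echo $table2[1];
}
```

When the first preg_match finds nothing, $table[1] does not exist, and a second preg_match_all run on it returns empty arrays.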

 
LVL 2

Author Comment

by:navinbabu
ID: 24348706
Hello

Array
(
)
Array
(
)


I get this when I execute the code.
 
LVL 3

Accepted Solution

by:
webvogel earned 500 total points
ID: 24348757
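Maybe this will suit you better:
<?php
// Step 1: match blocks that run from a <TABLE ...> tag, through a <TR><TH> header row, to the next </TABLE>.
// "s" lets "." cross line breaks; "U" makes every ".*" match as little as possible.
preg_match_all('#(<TABLE.*>\s*<TR>\s*<TH>.*</TH>.*</TABLE>)#isU', $str, $table);
// Step 2: $table[1][1] is the second such block (the station list); capture the second and third TD of each row.
preg_match_all('#<TR>\s*<TD.*>.*</TD>\s*<TD.*>(.*)</TD>\s*<TD.*>(.*)</TD>#i', $table[1][1], $res);
echo "<pre>";
print_r($table[1][1]);
print_r($res[1]); // station codes
print_r($res[2]); // station names
?>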
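
On the earlier question about indexes such as $res[1][0]: with its default ordering, preg_match_all fills the result array by group first and by match second. A small illustration on a made-up string (the `<b>` markup and the variable $m are only for demonstration):

```php
<?php
$str = '<b>one</b> <b>two</b> <b>three</b>';

// "U" makes ".*" ungreedy, so each <b>...</b> pair is a separate match.
preg_match_all('#<b>(.*)</b>#U', $str, $m);

// $m[0] holds the full matches, $m[1] holds capture group 1.
// The second index selects the match, counted from 0.
echo $m[0][1], "\n"; // <b>two</b>
echo $m[1][1], "\n"; // two
echo count($m[1]), "\n"; // 3
```

So $table[1][1] in the code above reads as "capture group 1 of the second match".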

 
LVL 3

Expert Comment

by:webvogel
ID: 24348772
Did you assign the string like this?
$str = '<HTML> ....</HTML>';

Maybe the code is not exactly the same as what you posted? It has to match.
I get this:
// your table
 
Array
(
    [0] => VSKP
    [1] => DVD 
    [2] => AKP 
    [3] => YLM 
    [4] => NRP 
    [5] => TUNI
    [6] => ANV 
    [7] => PAP 
    [8] => SLO 
    [9] => APT 
    [10] => RJY 
    [11] => NDD 
    [12] => TDD 
    [13] => EE  
    [14] => BZA 
    [15] => KMT 
    [16] => WL  
    [17] => KZJ 
    [18] => SC  
    [19] => HYB 
)
Array
(
    [0] => VISHAKAPATNAM  
    [1] => DUVVADA        
    [2] => ANAKAPALLE     
    [3] => ELLAMANCHILI   
    [4] => NARSIPATNAM RD 
    [5] => TUNI           
    [6] => ANNAVARAM      
    [7] => PITHAPURAM     
    [8] => SAMALKOT JN    
    [9] => ANAPARTI       
    [10] => RAJAMUNDRY     
    [11] => NIDADAVOLU JN  
    [12] => TADEPALLIGUDEM 
    [13] => ELURU          
    [14] => VIJAYAWADA JN  
    [15] => KHAMMAM        
    [16] => WARANGAL       
    [17] => KAZIPET JN     
    [18] => SECUNDERABAD JN
    [19] => HYDERABAD DECAN
)
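
The question also asked whether there is a way other than preg_match. One option is PHP's DOM extension, which parses the markup instead of pattern-matching it. A minimal sketch under stated assumptions: the fragment below is a shortened, hypothetical version of the station table, the DOM extension is enabled, and the names $codes and $names are illustrative:

```php
<?php
// Shortened sample; in practice $html would hold the page fetched with curl.
$html = '<TABLE BORDER ALIGN=center>
<TR><TH>SNo</TH><TH>Stn Code</TH><TH>Stn Name</TH></TR>
<TR>
<TD ALIGN = Center>1</TD>
<TD ALIGN = Center>VSKP</TD>
<TD>VISHAKAPATNAM  </TD>
</TR>
<TR>
<TD ALIGN = Center>2</TD>
<TD ALIGN = Center>DVD </TD>
<TD>DUVVADA        </TD>
</TR>
</TABLE>';

// Old-style markup triggers parser warnings; collect them quietly instead.
libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadHTML($html);
libxml_clear_errors();

$codes = [];
$names = [];
foreach ($doc->getElementsByTagName('tr') as $row) {
    $cells = $row->getElementsByTagName('td');
    if ($cells->length < 3) {
        continue; // header rows contain <th> cells only
    }
    $codes[] = trim($cells->item(1)->textContent); // second TD
    $names[] = trim($cells->item(2)->textContent); // third TD
}

print_r($codes);
print_r($names);
```

On the full page this loop would visit the rows of every table, so selecting the right table first (for example by its header cells) would still be needed.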

 
LVL 2

Author Comment

by:navinbabu
ID: 24348789
Thanks, it worked like a charm. Can you please explain the two preg_match statements? I know the basics, like (.*) denoting a variable part.
 
LVL 3

Expert Comment

by:webvogel
ID: 24348906
\s = whitespace (space, line break, tab)
. = any character except a line break

* = zero or more times

() = capture group (here, the result)

# = delimiter marking the beginning and end of the pattern

i = case-insensitive (matches upper and lower case)
s = the dot also matches line breaks
U = ungreedy

My English is not very good; you can read more here:
http://de.php.net/manual/en/book.pcre.php
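
The effect of the three modifiers can be seen by running one pattern with different flags. This is a small demonstration on a made-up string, not code from the thread; the variable names are illustrative:

```php
<?php
// Lowercase tag on one line, uppercase tag with a line break inside.
$str = "<td>A</td>\n<TD>B\nC</TD>";

// No modifiers: case-sensitive, "." stops at line breaks.
preg_match_all('#<td>(.*)</td>#', $str, $plain);
print_r($plain[1]);  // only "A"

// "i": <TD> now counts too, but "." still cannot cross the line break in "B\nC".
preg_match_all('#<td>(.*)</td>#i', $str, $ci);
print_r($ci[1]);     // still only "A"

// "is": "." crosses line breaks, and the greedy ".*" runs to the LAST </TD>.
preg_match_all('#<td>(.*)</td>#is', $str, $dotall);
print_r($dotall[1]); // one value spanning both cells

// "isU": ungreedy, so each cell is matched separately.
preg_match_all('#<td>(.*)</td>#isU', $str, $lazy);
print_r($lazy[1]);   // "A" and "B\nC"
```

This is why the table-finding pattern uses all three flags, while the row pattern uses only "i" and relies on each TD sitting on its own line.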
 
LVL 49

Expert Comment

by:Roonaan
ID: 24348988
Using the code I showed you in your other question (http://www.experts-exchange.com/Web_Development/Web_Languages-Standards/PHP/Q_24395841.html), you can easily parse each entry's second column, or analyze any of the other fields.

If you want to access a specific column, however, you would not do a foreach on $table; you would work directly on $matches[2] or $matches[7], I assume.
 
LVL 2

Author Comment

by:navinbabu
ID: 24349084
Thanks webvogel. I was always stumbling over things like .*, i and all those modifiers; I never had any clarity on them.
