PHP Parse, Extract and print content from another website

prevarant asked
Last Modified: 2013-12-13
Hi Experts,

I need to extract content from this website: http://www.astrolook.com/dnevni.shtml
I need to extract and print the text between every pair of <!-pocetak--> and <!-kraj--> tags. I also need the text between
<font class="htext"> and </font>

Example:
1) Ovan
Some description between pocetak/kraj for "Ovan"

2) Bik
Some description between pocetak/kraj for "Bik"

3)....

Thank You in advance
Marko Miljus

Zvonko, Systems architect
CERTIFIED EXPERT
Top Expert 2006

Commented:
1) Ovan: Some... whatever you read next ;-)
Zvonko, Systems architect

Commented:
This is how it would work in JavaScript (I have no PHP to test with):

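<script>
window.onload = function(){
  // Assumes the horoscope text sits in the fifth table on the page.
  var theText = document.getElementsByTagName("table")[4].innerHTML;
  // Turn the line break after each sign name into ":  ".
  theText = theText.replace(/<\/font><BR>/gi,":  ");
  // Strip all remaining tags, leaving plain text.
  theText = theText.replace(/<[^>]+>/g,"");
  alert(theText);
}
</script>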



Author

Commented:
Hello Zvonko!
This won't work because I need to parse and retrieve content from a page on another server and load it into my own website's page.

I tried this PHP code, but then I get almost all of the content:
--------------------------------------------------------------
<?php

$page = "http://www.astrolook.com/dnevni.shtml";

    // tags

    $start = '<!-pocetak-->';
    $end = '<!-kraj-->';

    // open the file
    $fp = fopen( $page, 'r' );

    $cont = "";

    // read the contents
    while( !feof( $fp ) ) {
        $buf = trim( fgets( $fp, 4096 ) );
        $cont .= $buf;
    }
   
    // get tag contents
    preg_match( "/$start(.*)$end/s", $cont, $match );

    // tag contents
    $contents = $match[ 1 ];
      echo $match[ 1 ];

?>
-------------------------------------------------------------
I think that I need to put "break;" somewhere.
Actually, your regex will match from the very first start tag on the page to the very last end tag on the page. You need to use a non-greedy quantifier on your .*, so make it .*? so that it stops at the first end tag rather than the last.
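
A minimal sketch of the difference, using a made-up two-entry string in place of the downloaded page (the sample text and the $greedy/$lazy variable names are assumptions for illustration only):

```php
<?php
// Hypothetical sample standing in for the fetched page contents.
$cont = 'a<!-pocetak-->first<!-kraj-->b<!-pocetak-->second<!-kraj-->c';

// Escape the markers so they are treated as literal text in the pattern.
$start = preg_quote('<!-pocetak-->', '/');
$end   = preg_quote('<!-kraj-->', '/');

// Greedy: (.*) runs on to the LAST end tag in the string.
preg_match("/$start(.*)$end/s", $cont, $greedy);

// Non-greedy: (.*?) stops at the FIRST end tag after the start tag.
preg_match("/$start(.*?)$end/s", $cont, $lazy);

echo $greedy[1], "\n"; // first<!-kraj-->b<!-pocetak-->second
echo $lazy[1], "\n";   // first
```

Note that preg_match() still returns only the first match, so the non-greedy form yields one entry, not all of them.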

Author

Commented:
Ebosscher, the .*? works, but I only get the content for the first pair of <!-pocetak--><!-kraj--> tags.
How do I get it for all 12 pairs?
Thanks
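
One possible way to collect every pair with a regex is preg_match_all() instead of preg_match(). The sketch below is not taken from the thread: extract_all() is a hypothetical helper, the sample markup is made up, and it assumes that each sign name in <font class="htext"> appears in the same order as its description.

```php
<?php
// Hypothetical helper: returns the trimmed text between every $open/$close pair.
function extract_all(string $html, string $open, string $close): array
{
    // Non-greedy capture between the two literal markers; /s lets . span newlines.
    $pattern = '/' . preg_quote($open, '/') . '(.*?)' . preg_quote($close, '/') . '/s';
    preg_match_all($pattern, $html, $matches);
    return array_map('trim', $matches[1]);
}

// Made-up sample; the real page would come from file_get_contents($url).
$html = '<font class="htext">Ovan</font><BR><!-pocetak--> Text for Ovan <!-kraj-->'
      . '<font class="htext">Bik</font><BR><!-pocetak--> Text for Bik <!-kraj-->';

$names = extract_all($html, '<font class="htext">', '</font>');
$texts = extract_all($html, '<!-pocetak-->', '<!-kraj-->');

// Print "1) Name" followed by its description, as in the question's example.
foreach ($texts as $i => $text) {
    $name = $names[$i] ?? '?'; // fall back if the counts do not line up
    echo ($i + 1) . ") $name\n$text\n\n";
}
```

If the real page has other <font class="htext"> elements that are not sign names, the two lists would drift apart, so the pairing should be checked against the actual markup.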
Just to push my answer again: the script will do exactly what you want, just change the URL... :)

Author

Commented:
Thank you Basiclife, I hadn't tried your code until now... and it works. A simple solution does the job!

<?php
$url = "http://www.astrolook.com/dnevni.shtml";
$contents = file_get_contents($url);
if ($contents === false) {
    die("Could not fetch $url");
}
$open = "<!-pocetak-->";
$close = "<!-kraj-->";
$end = 0;
// Walk through the page, printing the text between each open/close pair.
while (true) {
    // Find the next opening marker, searching from the previous closing marker.
    $start = strpos($contents, $open, $end);
    if ($start === false) { break; }
    // Find the closing marker that follows it.
    $end = strpos($contents, $close, $start);
    if ($end === false) { break; }
    // Print only the text between the two markers.
    $from = $start + strlen($open);
    print substr($contents, $from, $end - $from) . "<BR/><BR/>";
}
Excellent, glad I could help :)
