PHP Scraper

Hello Experts,

I am trying to scrape the content of a page and replace some of it with my own content. I also want to keep the look and feel of the page intact.

I have added two comment tags to the page to separate the top from the bottom:
<!--top-end-->
<!--bottom-start-->

How can I do this?
jccyber Asked:
 
Lukasz Chmielewski Commented:
Then I think this should work:

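<?php

$url = "http://www.google.pl";

$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1); // return the page as a string
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1); // follow redirects

$file = curl_exec($ch);
if ($file === false) {
    // curl_error() needs the handle to report what went wrong
    die(curl_error($ch));
}
curl_close($ch);

// Replace the markers and everything between them
$file = preg_replace("/<!--top-end-->(.*?)<!--bottom-start-->/ims", "mytext", $file);

echo $file;

?>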
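
Note that the pattern above also removes the two marker comments, and preg_replace treats sequences such as $1 in the replacement text as backreferences. If the markers need to stay in the page, or the new content may contain such sequences, a variant along these lines might work. This is only a sketch under those assumptions; the function name replaceBetweenMarkers is made up for illustration and is not from this thread.

```php
<?php
// Hypothetical helper: replaces the text between the two marker comments
// while leaving the markers themselves in place.
function replaceBetweenMarkers(string $html, string $replacement): string
{
    $start = preg_quote('<!--top-end-->', '/');
    $end   = preg_quote('<!--bottom-start-->', '/');

    // "s" lets "." match newlines; ".*?" stops at the first end marker.
    $pattern = '/(' . $start . ').*?(' . $end . ')/s';

    // A callback inserts $replacement literally, so "$1" or "\1" in the
    // new content is not interpreted as a backreference.
    $result = preg_replace_callback(
        $pattern,
        function (array $m) use ($replacement): string {
            return $m[1] . $replacement . $m[2];
        },
        $html,
        1 // replace only the first marked region
    );

    // preg_replace_callback returns null on a regex error; fall back to the input.
    return $result ?? $html;
}

$page = "<body>header<!--top-end-->\nold text\n<!--bottom-start-->footer</body>";
echo replaceBetweenMarkers($page, 'my $1 text');
```

With the cURL code above, the call would be replaceBetweenMarkers($file, "mytext") in place of the preg_replace line.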

 
Lukasz Chmielewski Commented:
Which part do you want to replace? The content between those two markers? If so, preg_replace would be the way to go.
 
jccyber (Author) Commented:
Whatever is in between
 <!--top-end-->
<!--bottom-start-->
 
Lukasz Chmielewski Commented:
Try this:

<?php
$somehtml = "<html><head></head><body><!--top-end-->some text<!--bottom-start--></body></html>";

$somehtml = preg_replace("/<!--top-end-->(.*?)<!--bottom-start-->/ims", "mytext", $somehtml);

echo $somehtml;

?>
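
A regular expression is not strictly required for fixed markers like these. As an alternative technique (not the approach suggested in this thread), the same replacement could be sketched with plain string functions. The function name spliceBetweenMarkers is hypothetical, and the sketch assumes each marker appears once and in the expected order.

```php
<?php
// Hypothetical helper: swaps the content between the two markers using
// strpos/substr instead of a regular expression. The markers are kept.
function spliceBetweenMarkers(string $html, string $replacement): string
{
    $startMarker = '<!--top-end-->';
    $endMarker   = '<!--bottom-start-->';

    $start = strpos($html, $startMarker);
    if ($start === false) {
        return $html; // start marker missing: leave the page untouched
    }
    $contentStart = $start + strlen($startMarker);

    // Look for the end marker only after the start marker.
    $end = strpos($html, $endMarker, $contentStart);
    if ($end === false) {
        return $html; // end marker missing: leave the page untouched
    }

    return substr($html, 0, $contentStart) . $replacement . substr($html, $end);
}

$somehtml = "<html><head></head><body><!--top-end-->some text<!--bottom-start--></body></html>";
echo spliceBetweenMarkers($somehtml, "mytext");
```

Because no pattern is involved, the replacement text is inserted exactly as given.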
 
jccyber (Author) Commented:
I am currently using this code to get the page


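$url = "http://mydomain/page.html";

$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1); // return the page as a string
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1); // follow redirects

$file = curl_exec($ch);
if ($file === false) {
    // curl_error() needs the handle to report what went wrong
    die(curl_error($ch));
}
curl_close($ch);

echo $file;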

 
Ray Paseur Commented:
If you want to post the actual URL of the page you want to scrape, we might be able to provide more concrete answers.  But that said, please be sure that you have permission to access the page in an automated manner and that you have copyright for the information you are using.  Many sites do not allow web scraping and explicitly deny this use case in their terms of service.  Also, many sites that want to allow automated access to the underlying data model will offer an API.  Just a thought, ~Ray
 
jccyber (Author) Commented:
This is it.



Thank you