
How can I get dynamic content from another page to be displayed on my frontpage?

Our company has a weekly Intranews which is published on our external site, so currently someone has to manually copy and paste these news articles into a template and publish them to our Intranet. I figure there must be a way to use an SSI script to pull the relevant data into the current template automatically every time a new article is published. The main site contains the last 5 published articles, plus a link that goes to the complete articles. If there are any suggestions or ideas to follow I would greatly appreciate it! If you need any code, just ask and I will post it for analysis. Thanks.

eejones

Can you use an iframe, and the source of the iframe would be the page that contains the 5 articles?

PhillipsPlastics

ASKER

Currently the site is set up in a 3-column design, and on the front page the articles rotate every 15 minutes using a little PHP code. I would like to keep using this method, since it lets the article fill the whole middle column and makes the site look more dynamic, per the CIO's request.
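A rotation like this can be driven by the clock alone. The following is only a minimal sketch of the idea, not the code actually running on the site: the helper name `pickArticleIndex`, the `article0.html` style file names, and the count of 5 articles are all assumptions made for illustration.

```php
<?php
// Hypothetical helper: choose which article to show at a given time.
// $count is the number of saved articles, $slotSeconds the rotation period.
function pickArticleIndex(int $timestamp, int $count, int $slotSeconds = 900): int
{
    // Divide the time into 15-minute slots, then wrap around the article count
    return intdiv($timestamp, $slotSeconds) % $count;
}

$index = pickArticleIndex(time(), 5);
$articleFile = "article$index.html";   // assumed file naming

// Output the chosen article into the middle column if the file exists
if (is_readable($articleFile)) {
    readfile($articleFile);
}
```

Because the index depends only on the current time, every visitor sees the same article during a given 15-minute slot, and no state has to be stored between requests.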

eejones

Please forgive me if I have misunderstood the task. It would help to see the external website (can you post the url?)

Could the middle column on the intranet site contain an iframe? Are you trying to capture the whole frontpage in this intranet page middle column?

PhillipsPlastics

ASKER

Unfortunately the external site is limited to company employees; it is password protected since it deals with sales etc. If you would like, I can post the relevant code after I clean it up a bit.

Regarding the second question, I just want to glean 1 story at a time without the rest of the page being displayed. Whether that means saving the content in another HTML document (as is done now) or merely rotating the call to get the current page doesn't matter, so long as it works consistently.

Every story ends with a "" before closing 2 div classes and beginning a new div class and adding another article on the external site. I was thinking that a call to grab all content within that parameter and putting it into a different html page might be possible but due to my own inexperience I wasn't sure how to do so.

eejones

I wonder if this will help - instructions on how to scrape from another website using PHP.

http://www.oooff.com/php-scripts/basic-php-scrape-tutorial/basic-php-scraping.php

Once you get the whole page content into a variable you can extract what you need and display it where you want.

PhillipsPlastics

ASKER

This may be able to work... the only problem I run into is how to add authentication to the scrape request?

eejones

Ack, that might be a problem. If you have to login to the external site to see the articles, I do not see how you can do this unless the login script uses method="get" in the <form> tag, in which case you can pass the username and password in the url. The webmaster of the external site could easily tweak their code to accept the username:password either as "post" or as "get." Then you could use this for the url:

http://externalwebsite.com?username=chocolate&password=vanilla

assuming they are using the variable names 'username' and 'password'
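A minimal sketch of building such a URL in PHP, under the same assumption that the login script reads GET parameters called 'username' and 'password'. The helper name `buildLoginUrl` is hypothetical; `http_build_query()` is used so that special characters in the credentials are URL-encoded.

```php
<?php
// Hypothetical helper: append the credentials to the URL as GET parameters.
function buildLoginUrl(string $baseUrl, string $username, string $password): string
{
    // http_build_query() URL-encodes the values (e.g. & becomes %26)
    return $baseUrl . '?' . http_build_query(array(
        'username' => $username,   // parameter names are assumptions
        'password' => $password,
    ));
}

$url = buildLoginUrl('http://externalwebsite.com/', 'chocolate', 'vanilla');
echo $url, "\n";

// Fetching the page would then look like this (needs allow_url_fopen enabled):
// $html = file_get_contents($url);
```

Keep in mind that credentials passed in the URL usually end up in web server logs and browser history, so this only suits a login that is not very sensitive.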

PhillipsPlastics

ASKER

I had to go ahead and use the curl extension using php... however now that I have imported the data properly I need to pull each individual article into a separate htm file. Any suggestions?

<?php
$userAgent = 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-GB; rv:1.8.1.6) Gecko/20070725 Firefox/3.5';
$target_url = "http://xxxxxxx.com";
$ch = curl_init();
curl_setopt($ch, CURLOPT_USERPWD, 'login:password');	// credentials for HTTP authentication
curl_setopt($ch, CURLOPT_USERAGENT, $userAgent);
curl_setopt($ch, CURLOPT_URL, $target_url);
curl_setopt($ch, CURLOPT_FAILONERROR, true);		// treat HTTP status codes >= 400 as errors
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
curl_setopt($ch, CURLOPT_AUTOREFERER, true);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);		// return the page as a string instead of printing it
curl_setopt($ch, CURLOPT_TIMEOUT, 10);
$html = curl_exec($ch);
// Check for a failed request before printing anything
if ($html === false) {
	echo "<br />cURL error number:" . curl_errno($ch);
	echo "<br />cURL error:" . curl_error($ch);
	exit;
}
echo $html;
?>


eejones

Are all the articles concatenated and stored in the $html variable?

PhillipsPlastics

ASKER

Yes and I did successfully add php code to output everything to a file which may or may not be needed to search and grab the content I need.

eejones

I think you could use the PHP file manipulation features to parse $html and write several HTML files, and then include links to them in your main page. Or maybe you could put all the articles into one HTML file, separate the articles with <a name="article1"> anchors, and link to those anchors?

PhillipsPlastics

ASKER

What kind of statements would I need to parse the $html and write the several HTML files? This seems to be my best option given how the site is currently set up.

eejones

To open and read the source file I think you need the PHP functions

fopen()
fgets()
fwrite()
fclose()

This might be useful as a reference:
http://us.php.net/manual/en/ref.filesystem.php
http://www.w3schools.com/PHP/php_file.asp

You might need some PHP string functions, but probably not, since you are just reading/writing lines "as is" from the source file.

Some rough code is below. There may be some errors in it but I cannot test it right now. HTH.

Something like this to read the source file and break it up into chunks of code for the new pages, stored in an array $mypages: 
 
<?php
$file = fopen("sourcefile.html", "r") or exit("cannot open file");
//Initialize a variable to hold the contents for a new html file
$contents = "";
$startnewpage=0;
//Read lines until end of file
$mypages = array();
while(!feof($file)) {
	$thisline = fgets($file);
	// Test if this line of code marks the start or end of a page.
	// You may have to use substr_count() for the comparison
	// in case $thisline includes carriage return, trailing spaces etc.
	if ($thisline  == 'text that marks the start of a new page') {
		$startnewpage=1;
		$contents = "";
	}
	if ($startnewpage) {
		$contents = $contents . $thisline;
	}
	if ($thisline == 'text that marks the end of a page') {
		$startnewpage=0;
		array_push($mypages,$content);
		$content="";
	}
}
fclose($file);
?>
 
Then you would loop through the array $mypages, and for each element in the array, open a new file, write into it the HTML header and opening body tags, write into it the content stored in $mypages[$n], write the HTML closing tags, and close the file. (You may also need to extract some header information from the source file, such as a link to a stylesheet.)
 
Here is, in a nutshell, how to open a file for writing, write to it and close it:
 
$myFile = "newfile.html";
$fh = fopen($myFile, 'w') or die("cannot open file");
$content = "Hello Kitty\n";
fwrite($fh, $content);
$content = "Hello Duckies\n";
fwrite($fh, $content);
fclose($fh);
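
Putting the two pieces together, the loop over $mypages might look roughly like this. It is an untested-in-your-setup sketch: the function name `writePages`, the `page0.html` style file names, the stand-in article data and the bare-bones HTML wrapper are all assumptions, and the temp directory is only a demo target that you would swap for a folder in your web root.

```php
<?php
// Hypothetical helper: write each entry of $pages to its own HTML file in $dir.
// Returns the list of file names that were written.
function writePages(array $pages, string $dir): array
{
    $written = array();
    foreach ($pages as $n => $page) {
        $myFile = "$dir/page$n.html";            // assumed naming scheme
        $fh = fopen($myFile, 'w') or die("cannot open file");
        fwrite($fh, "<html>\n<head><title>Article $n</title></head>\n<body>\n");
        fwrite($fh, $page);                      // the article content itself
        fwrite($fh, "</body>\n</html>\n");
        fclose($fh);
        $written[] = $myFile;
    }
    return $written;
}

// Stand-in data; in practice $mypages is filled by the reading loop
$mypages = array("<h3>First article</h3>\n", "<h3>Second article</h3>\n");
$files = writePages($mypages, sys_get_temp_dir());
echo count($files), " files written\n";
```

Opening with 'w' overwrites the files on every run, so re-running the script after a new issue is published replaces the old articles instead of appending to them.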


eejones

I already see an error in that snippet: $contents and $content should be the same variable name.

PhillipsPlastics

ASKER

I can't seem to get any output echoing the variables to debug and make sure I can get the first part of the code working...
The first unique entry for each would be <h3 class="entry-header">
The ending unique would be <div class="entry" id="
OR the new page could start again just at the <h3 class="entry-header">
Currently the only thing that IS working is echoing out $thisline at line 10, which shows the entire webpage....

<?php
$file = fopen("./output.txt", "r") or exit("cannot open file");
//Initialize a variable to hold the contents for a new html file
$content = "";
$startnewpage=0;
//Read lines until end of file
$mypages = array();
while(!feof($file)) {
$thisline = fgets($file);
// Test if this line of code marks the start or end of a page. 
// You may have to use substr_count() for the comparison 
// in case $thisline includes carriage return, trailing spaces etc.
if ($thisline  == 'entry-header') {
$startnewpage=1;
$content = "";
}
if ($startnewpage) {
$content = $content . $thisline;
} 
if ($thisline == 'entry') {
$startnewpage=0;
array_push($mypages,$content);
$content="";
}
}
fclose($file);
?>
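
A likely reason nothing matches: fgets() returns the whole line, including the trailing newline, so a test such as $thisline == 'entry-header' is only true when the line consists of exactly that text. A substring test avoids this. Below is a minimal sketch using strpos(); the function name `splitArticles` and the sample lines are made up for illustration, and it takes the simpler option of starting a new page at every <h3 class="entry-header"> line.

```php
<?php
// Hypothetical helper: group lines into articles. A new article starts at every
// line containing $startMarker; lines before the first marker are ignored.
function splitArticles(array $lines, string $startMarker): array
{
    $pages = array();
    $current = null;                  // null means no article has started yet
    foreach ($lines as $line) {
        if (strpos($line, $startMarker) !== false) {
            if ($current !== null) {
                $pages[] = $current;  // the previous article is complete
            }
            $current = '';
        }
        if ($current !== null) {
            $current .= $line;
        }
    }
    if ($current !== null) {
        $pages[] = $current;          // do not lose the last article
    }
    return $pages;
}

// Made-up sample; with the real data, file("./output.txt") gives the same kind of array
$lines = array(
    "<div id=\"banner\">Intranews</div>\n",
    "<h3 class=\"entry-header\">First</h3>\n",
    "<p>Body one</p>\n",
    "<h3 class=\"entry-header\">Second</h3>\n",
    "<p>Body two</p>\n",
);
$mypages = splitArticles($lines, '<h3 class="entry-header">');
echo count($mypages), " articles found\n";
```

Note the strict !== false comparison: strpos() returns 0 when the marker sits at the very start of the line, and a loose comparison would treat that as "not found".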


PhillipsPlastics

ASKER

ok so I got it working so it echoes out the correct things and debugged it thus far however I can't seem to figure out how to get it to properly output to a file within my loop.... suggestions?

<?php
$file = fopen("./output.txt", "r") or exit("cannot open file"); //Initialize a variable to hold the contents for a new html file
 
$content = "";
$startnewpage = 0;
$data = "";
c = 1;
 
$mypages = array();		//Read lines until end of file
while(!feof($file)) {
$data = fgets($file);
						// Test if this line of code marks the start or end of a page. 
						// You may have to use substr_count() for the comparison 
						// in case $data includes carriage return, trailing spaces etc.
		if ((substr_count($data,"h3"))  > 0) {
			echo $data;
			while(((substr_count($data,"technorati") == 0))) {
			$data = fgets($file);
			echo $data;
			}
			echo "<br><br><br><br><br><br><br><br>";
	}
	if ($data != "newline1") {
		/* echo "fail <br>"; */
		$data = "";
	}
 
}
fclose($file);
?>


PhillipsPlastics

ASKER

ugh forgot the "$" before the counter var... but still working on output...

PhillipsPlastics

ASKER

yay! got it working final code posted below...

<?php
$file = fopen("./output.txt", "r") or exit("cannot open file");

$c = 0;		// Counter used to number the output files

// Read lines until end of file
while (!feof($file)) {
	$data = fgets($file);
	// A line containing "h3" marks the start of an article
	if ($data !== false && substr_count($data, "h3") > 0) {
		echo $data;
		// Append every following line, up to and including the one
		// containing "technorati", to this article's file
		$fh = fopen("newfile$c.html", 'a') or die("cannot open file");
		while (substr_count($data, "technorati") == 0) {
			$data = fgets($file);
			if ($data === false) {
				break;	// End of file reached before the end marker
			}
			echo $data;
			fwrite($fh, $data);
		}
		fclose($fh);
		$c++;
	}
}
fclose($file);
?>
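
For anyone who would rather skip the intermediate output.txt, the page held in $html can also be split in memory. This is only a sketch of that alternative, not what the code above does: the function name `splitHtmlIntoArticles` and the sample markup are assumptions, and the marker is the <h3 class="entry-header"> tag mentioned earlier in the thread.

```php
<?php
// Hypothetical helper: cut an HTML string into articles at every occurrence of $marker.
function splitHtmlIntoArticles(string $html, string $marker): array
{
    $parts = explode($marker, $html);
    array_shift($parts);                 // drop everything before the first article
    $articles = array();
    foreach ($parts as $part) {
        $articles[] = $marker . $part;   // explode() removes the marker, so put it back
    }
    return $articles;
}

// Made-up sample standing in for the page returned by curl_exec()
$html = "<div>header</div><h3 class=\"entry-header\">A</h3><p>one</p>"
      . "<h3 class=\"entry-header\">B</h3><p>two</p>";

foreach (splitHtmlIntoArticles($html, '<h3 class="entry-header">') as $n => $article) {
    echo "Article $n: ", strlen($article), " bytes\n";
}
```

With this approach the last article also carries whatever follows it on the page (footer, closing tags), so it may need trimming at an end marker such as the "technorati" line.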
