Solved

PHP or Perl: Content Generation Script Needed

Posted on 2006-05-22
Last Modified: 2008-02-01
------------------------------
              Intro
------------------------------
I need to find an existing PHP or Perl script, or have one custom written.

------------------------------
      Project Description
------------------------------
I have over 500 articles that I need to post to my website, but I don't want to post them all at once.  I want to post a few and then add a few more every day until they are all posted.

I have a rotate.php script that generates the pages, on the fly, and displays the current article, and then the complete list of all the other articles below it.  You click on a link to read another article, the page of which is generated by rotate.php, and so on.  

The current script, rotate.php, refers to a text file, auto.txt, which has all 500 article titles and short descriptions listed.  

But since I don't want to start with all the articles listed, I need to create a script that determines what articles are available for posting, and creates a new auto.txt on the fly with just those articles.  Then in a day or two, I would have to run the script again, but include a few more of the articles.  

------------------------------
    Generating Auto.txt
------------------------------
You can see the contents of auto.txt that need to be generated (see source file below).  The short description must be extracted from the article itself, which is a PHP file.  For example, the article file is 10_health_fitness_tips.php and its title is "10 Health Fitness Tips".  This title appears three times in the file: once in the <title> tag, once in the <meta description> tag, and finally above the text of the article itself.

A fixed number of words are included in the auto.txt description, followed by an ellipsis (...).  You can see an example of the article and the auto.txt below.

Sometimes the title of the article includes a dash that is not in the file name, and the title never includes the underscores, so matching the text string against the file name may be a little tricky.
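
A minimal PHP sketch of the generation step described above, not a finished script. It assumes every article follows the markup of the sample further down (a <title> tag, then the same title in <b>...</b> followed by <br> tags and the article text). The function names and the default of 40 words are hypothetical; adjust the word limit to taste. Reading the title from the <title> tag sidesteps the problem of matching it against the file name.

```php
<?php
// Hypothetical helpers; the patterns assume the sample article markup.

/** Return the text inside <title>...</title>, or null if there is none. */
function extractTitle(string $html): ?string
{
    if (preg_match('/<title>(.*?)<\/title>/is', $html, $m)) {
        return trim($m[1]);
    }
    return null;
}

/** Return the plain text that follows "<b>Title</b><br><br>", or null. */
function extractBody(string $html, string $title): ?string
{
    $pattern = '/<b>\s*' . preg_quote($title, '/')
             . '\s*<\/b>\s*(?:<br\s*\/?>\s*)*(.*)/is';
    if (!preg_match($pattern, $html, $m)) {
        return null;
    }
    // Put a space before each tag so words on either side do not merge,
    // then strip the tags and collapse runs of whitespace.
    $text = strip_tags(str_replace('<', ' <', $m[1]));
    return trim(preg_replace('/\s+/', ' ', $text));
}

/** Build one auto.txt entry in the layout of the sample, or null on failure. */
function buildEntry(string $fileName, string $html, int $wordLimit = 40): ?string
{
    $title = extractTitle($html);
    if ($title === null) {
        return null;
    }
    $body = extractBody($html, $title);
    if ($body === null) {
        return null;
    }
    $words   = preg_split('/\s+/', $body, -1, PREG_SPLIT_NO_EMPTY);
    $summary = implode(' ', array_slice($words, 0, $wordLimit)) . '...';

    return "<font size=\"2\" face=\"Arial\" color=\"#0000FF\">\n"
         . "<b><a href='\n{$fileName}'>\n{$title}</a></b></font><BR>\n\n"
         . "{$summary}<br>\n"
         . "<hr color=\"#C0C0C0\" size=\"1\" style=\"width: 100%\">\n\n"
         . "#BREAK#\n";
}
```

To produce a whole auto.txt, one could loop over the article files with glob(), pass each file's contents from file_get_contents() to buildEntry(), and concatenate the non-null results.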

------------------------------
      PHP or Perl Script
------------------------------
I'm not sure if a PHP script can run on my hard drive.  I could place the new articles in a folder for it to read.  Or does it have to be on the server?  I don't want the articles on the site where the search spiders would find them, and I don't know if the spiders would ignore them because of robots.txt.  Or could the script read the articles from another site of mine that serves as a repository of articles?

Or, it could be a Perl script to run on my hard drive to generate the auto.txt file.

--------------------------------
       Source: auto.txt
--------------------------------
<font size="2" face="Arial" color="#0000FF">
<b><a href='
10_health_fitness_tips.php'>
10 Health Fitness Tips</a></b></font><BR>

More than any other time in history, people are all trying to have the best, healthiest body possible. The health and fitness industries are making billions of dollars every year on herbal supplements, fitness equipment, gyms, and special diets. If...<br>
<hr color="#C0C0C0" size="1" style="width: 100%">

#BREAK#
<font size="2" face="Arial" color="#0000FF">
<b><a href='
15_minutes_to_firm_arms_bye_bye_jiggle_arms.php'>
15 MINUTES TO FIRM ARMS - Bye bye Jiggle Arms</a></b></font><BR>

Are you hiding your arms because they are flabby and have that jiggle effect when waving to a friend? Good news! No more jiggle arms, or as many call them "grandma arms"! We will be doing some toning exercises to have sexy arms and wave proudly...<br>
<hr color="#C0C0C0" size="1" style="width: 100%">

#BREAK#

--------------------------------------------------
       Source: 10_health_fitness_tips.php
--------------------------------------------------
<title>10 Health Fitness Tips</title>
<meta http-equiv="DESCRIPTION" content="10 Health Fitness Tips">
           . . .
                  <td><b>10 Health Fitness Tips</b><br><br>
                         More than any other time in history, people are all trying to have the best, healthiest body possible. The health and fitness industries are making billions of dollars every year on herbal supplements, fitness equipment, gyms, and special diets. If you watch TV or read magazines, there is always some intriguing commercial asking for money to help you get into shape. <br><br>
           . . .
Question by:WizeOwl
7 Comments
 
LVL 10

Expert Comment

by:ClickCentric
ID: 16740128
The easiest way to do this would be to simply put the auto.txt file on the server and split it into 2 files. In the first file, have the ones you want currently posted and the rest in the second file.  Then just update the first file on the server every couple of days with the article sections that you want to add.  Based on what you've described, a custom script would definitely be needed to do what you're asking.
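
The two-file approach could be scripted along these lines. This is only a sketch: it assumes entries are separated by the #BREAK# marker shown in the auto.txt sample, and the function names are made up for illustration.

```php
<?php
// Separator taken from the auto.txt sample in the question.
const ENTRY_SEPARATOR = '#BREAK#';

/** Split file contents into trimmed, non-empty entries. */
function splitEntries(string $text): array
{
    $parts = array_map('trim', explode(ENTRY_SEPARATOR, $text));
    return array_values(array_filter($parts, function ($p) {
        return $p !== '';
    }));
}

/** Join entries back together, each followed by the separator line. */
function joinEntries(array $entries): string
{
    $out = '';
    foreach ($entries as $entry) {
        $out .= $entry . "\n\n" . ENTRY_SEPARATOR . "\n";
    }
    return $out;
}

/**
 * Move the first $count entries from the pending text to the end of the
 * live text. Returns [newLiveText, newPendingText].
 */
function releaseEntries(string $live, string $pending, int $count): array
{
    $liveEntries    = splitEntries($live);
    $pendingEntries = splitEntries($pending);
    $moved          = array_splice($pendingEntries, 0, $count);

    return [
        joinEntries(array_merge($liveEntries, $moved)),
        joinEntries($pendingEntries),
    ];
}
```

Every couple of days you would read both files with file_get_contents(), call releaseEntries(), and write the two results back with file_put_contents().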
 

Author Comment

by:WizeOwl
ID: 16740427
Does anyone here have something close to what I need?  It'll give me something to start with.

 
LVL 11

Expert Comment

by:siliconbrit
ID: 16740959

It is extremely unlikely that someone has a script that will do anything very close to what you want, without your having to modify the architecture of the auto.txt file, and the code in the rotate.php file.

If I were you, I would edit the code in the rotate.php file as follows:

1) Hardcode the date you want this to start, perhaps as a Unix timestamp. You might call this STARTDATE.

2) Hardcode the number of articles rotate.php shows at the start, say the first 100. You might call this NUMBER_OF_ARTICLES.

3) For each full day that has passed since STARTDATE, increase NUMBER_OF_ARTICLES by the number of articles you want to add each day.

4) Change the code in rotate.php that reads the articles from auto.txt so that it reads no more than the new value of NUMBER_OF_ARTICLES.

If you post the contents of rotate.php, I might write this for you.
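
Without seeing rotate.php, the steps above can only be sketched. The PHP below assumes auto.txt entries are separated by #BREAK#, as in the sample; the function names are hypothetical, and how rotate.php really reads the file is unknown.

```php
<?php
/**
 * Steps 1-3: how many articles should be visible at time $now.
 * $startDate and $now are Unix timestamps.
 */
function visibleArticleCount(int $startDate, int $now, int $initialCount, int $perDay): int
{
    $secondsPerDay = 86400;
    // Full days elapsed since the start date; never negative.
    $elapsedDays = max(0, intdiv($now - $startDate, $secondsPerDay));
    return $initialCount + $elapsedDays * $perDay;
}

/**
 * Step 4: keep only the first $limit entries of the auto.txt contents.
 */
function limitEntries(string $autoTxt, int $limit): array
{
    $entries = array_filter(array_map('trim', explode('#BREAK#', $autoTxt)), 'strlen');
    return array_slice(array_values($entries), 0, $limit);
}
```

In rotate.php this might be used as limitEntries(file_get_contents('auto.txt'), visibleArticleCount($startDate, time(), 100, 5)), where $startDate comes from strtotime() and 5 articles per day is just an example value.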
 
LVL 5

Expert Comment

by:dutchclan
ID: 16741954
I don't quite get the logic of this script.

First you have this kind of "menu / description" in auto.txt, whose content is in "content_name.php", all generated by "rotate.php"?

What should the "outcome" look like? Can you supply an example of some site?

regards, May
 
LVL 10

Accepted Solution

by:
ClickCentric earned 1600 total points
ID: 16743403
Wait, are you trying to generate the auto.txt file itself?  It sounds like what you're after is a site scraper.  If you're trying to scrape someone else's site, this is illegal in most circumstances.  If you're trying to scrape your own site, which it sounds like you are since you mention that the articles are already on another site you own, then it would be much simpler to do something on that end to achieve your goal.  You could set up the site that already has the articles, site1, to create a feed that's available to the site you're trying to create, site2.  This would generally be the far better way to do this.  You can cache the results from site1 and set up the RSS feed script to retrieve a new list only once a day.  For this, you could find many code samples out there that would work for you with only minor adjustments to account for the details of your respective sites.
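
The once-a-day caching part could look roughly like this in PHP. It is a sketch only: cachedFetch is a made-up name, and the fetch step is passed in as a callable because the details of the feed on site1 are not known.

```php
<?php
/**
 * Return the cached copy if it is younger than $maxAgeSeconds;
 * otherwise call $fetch(), store its result, and return it.
 */
function cachedFetch(string $cacheFile, int $maxAgeSeconds, callable $fetch): string
{
    // Make sure we see the file's current modification time.
    clearstatcache(true, $cacheFile);

    $fresh = is_file($cacheFile)
        && (time() - filemtime($cacheFile)) < $maxAgeSeconds;

    if ($fresh) {
        return file_get_contents($cacheFile);
    }

    $data = $fetch();
    file_put_contents($cacheFile, $data, LOCK_EX);
    return $data;
}
```

On site2 the callable would typically wrap file_get_contents() on the feed URL of site1, with 86400 seconds as the maximum age so the list is refreshed at most once a day.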
 

Author Comment

by:WizeOwl
ID: 16744428
That sounds like a great alternate solution, ClickCentric.  Can you recommend any such code samples, or what sites to look at?

I think you are correct about the scraping, and yes, it is my own site content.  

I already have rotate.php, which refers to auto.txt.  Therefore, I'd like to complete this project before going on to the RSS solution.  Any help with creating the auto.txt would be appreciated.
 
LVL 10

Expert Comment

by:ClickCentric
ID: 16744527
Well, I need some details about the site you're scraping from to help with that.  How are the articles stored on the content site?  In a database?  Flat files?  Hard coded into individual PHP files (this would be unusual, but not the first time I would have seen it)?  Or do you want something that would just go to the page, pick the information off the page, and modify it to the format that you need?  If it's the latter, I'd need to see the page you're scraping from before I could offer a suggestion.