Solved

loop an array and replace instances in file

Posted on 2006-06-16
Last Modified: 2008-02-01
Hello all!

This is a somewhat complex issue, so I will do my best to explain it clearly.  

We wish (for reasons of SEO and loading speed) to nightly convert our PHP files into HTML. We will do so using the PEAR::HTTP_Request class.  It reads the source of the PHP file as a string into a variable.

I can do this perfectly fine.  The issue now becomes editing the string so that all the HTML files link to one another.  

The possible replacements (.html URLs) are held in an array ($URLArray - it also holds the .html file name, and the save-to path).   How would I search the string for all instances of links, and if the url of the link matches a link from the array, replace it?  I am assuming I would have to loop through the array and use preg_match or preg_match_all to find them.  

Any input you give would be greatly appreciated.

Here are a few entries from the array (it's huge):

Array
(
    [0] => Array
        (
            [0] => Array
                (
                    [0] => http://www.buffalotraders.com/product_info.php?products_id=569
                    [1] => adit-600-base-and-2-v35-ports-part-number-02-aa1-00000c-00.html
                    [2] => /buffalotraders/test/dan/firsttry/products/
                )

            [1] => Array
                (
                    [0] => http://www.buffalotraders.com/product_info.php?products_id=54
                    [1] => cisco-cp-7940g-ip-phone.html
                    [2] => /buffalotraders/test/dan/firsttry/products/
                )

            [2] => Array
                (
                    [0] => http://www.buffalotraders.com/product_info.php?products_id=570
                    [1] => pw130-8023af-power-over-ethernet-poe-injector.html
                    [2] => /buffalotraders/test/dan/firsttry/products/
                )


Here is the pertinent code:

public function buildFiles($URLArray)
    {
      //for($m = 0; $m < count($URLArray[0]); $m++)
      //{
         for($l = 0; $l < count($URLArray[0]); $l++)
         {
            $URL  = $URLArray[0][$l][0];
            $Name = $URLArray[0][$l][1];
            $Path = $URLArray[0][$l][2];

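            // Plain "new" replaces the deprecated "=& new" reference assignment.
            $req = new HTTP_Request($URL);
            if(PEAR::isError($req->sendRequest()))
            {
              // Skip this URL rather than continue with an undefined $html_out.
              continue;
            }
            $html_out = $req->getResponseBody();
            /// The entire page source is now held as a string in the $html_out variable.
            /// The next step is to find all links on the page, and determine if they
            /// match with a URL/Name pair from URLArray.  If they do, we will then
            /// replace the link with the correct .html Name.

            // editLinks() is a method of this class, so it must be called via $this.
            $content = $this->editLinks($URLArray,$html_out);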

            $write_to_file = $Path.$Name;
            /// Now that all links have been edited properly, we can now write the
            /// source to the Name in the Path using the writeFiles() function.
            echo $URL ."-----". $write_to_file ."\n\n";
            $this->writeFiles($write_to_file,$content);
          }      
       //}
    }

    /// The following function receives the URL/Name/Path array and the source of the
    /// html file (in the $html_out variable).
    public function editLinks($URLArray,$html_out)
    {
      //$regex = '@<a(.*?)>(.*?)</a>@i';
      $regex = '@<a[^>]*href=(")([^"]+)"[^>]*>(.*?)</a>@i';
      preg_match_all($regex,$html_out,$matches);
     
      for($n = 0; $n < count($URLArray[0]); $n++)
      {
           
      }
    }
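
The loop-and-match idea described in the question could be sketched roughly as follows. This is only an illustration, not code from the thread: the function name rewriteAnchorHrefs is hypothetical, it assumes PHP 5.3 or later (for the closure), and it assumes hrefs are double-quoted and match the array URLs exactly once "&amp;" is decoded.

```php
<?php
// Hypothetical helper (not part of the original class): rewrites only the
// href values of <a> tags whose URL appears in the URL/Name/Path array.
function rewriteAnchorHrefs(array $URLArray, $html)
{
    // Build a lookup table: original URL => static .html file name.
    $map = array();
    foreach ($URLArray[0] as $info) {
        list($url, $name, $path) = $info;
        $map[$url] = $name;
    }

    // Group 1: everything up to the opening quote of href,
    // group 2: the href value, group 3: the closing quote.
    $regex = '@(<a\b[^>]*\bhref=")([^"]+)(")@i';

    return preg_replace_callback($regex, function ($m) use ($map) {
        // Page source usually writes "&" as "&amp;" inside attributes.
        $href = html_entity_decode($m[2]);
        if (isset($map[$href])) {
            return $m[1] . $map[$href] . $m[3];
        }
        return $m[0]; // leave links that are not in the array untouched
    }, $html);
}
```

Because the whole href value is looked up as an array key, a link to products_id=540 is not affected by an entry for products_id=54.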
Question by:flow79

Accepted Solution

by:
Roonaan earned 125 total points
You can also turn $URLArray into a pair of input arrays for str_replace:

    public function editLinks($URLArray,$html_out)
    {
        $replace_from = array();
        $replace_to     = array();
        foreach($URLArray[0] as $info) {
           list($from, $to_file, $to_path) = $info;
           $replace_from[] = $from;
           $replace_to[] = $to_path.$to_file;
        }      
        return str_replace($replace_from, $replace_to, $html_out);
    }

-r-
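
One thing to watch with the str_replace approach: it replaces plain substrings, pair by pair, in array order. A URL ending in products_id=54 is therefore also found inside a URL ending in products_id=540. A possible variant, shown here only as a sketch, uses strtr(), which tries the longest keys first and does not re-scan text it has already replaced. The function name editLinksLongestFirst is hypothetical, and the sketch only helps when the longer URL is itself in the array.

```php
<?php
// Hypothetical variant of editLinks(), written as a plain function.
function editLinksLongestFirst(array $URLArray, $html_out)
{
    // Build "original URL => path + file name" pairs, as in the answer above.
    $pairs = array();
    foreach ($URLArray[0] as $info) {
        list($from, $to_file, $to_path) = $info;
        $pairs[$from] = $to_path . $to_file;
    }

    // strtr() with an array tries the longest keys first and never
    // re-scans text that it has already replaced.
    return strtr($html_out, $pairs);
}
```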

Expert Comment

by:ThG

This is probably off topic, but I would suggest a different approach:
- Use URL rewriting so that .html links work even on the "live" PHP pages. You can then edit your HTML templates to point directly at the .html links.
- Dump the full page contents with PEAR::HTTP_Request in the same way.
- You no longer need the link-editing script above, and you have a cleaner architecture.

(This also lets you build a transparent cache.)
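
Whatever performs the rewriting needs a way to get from a .html name back to the dynamic page. A minimal sketch of that lookup, built from the same URL/Name/Path array, might look like this. The function name buildRewriteTable is hypothetical, the sketch assumes every URL carries a products_id query parameter as in the sample data, and the Apache rule that would consume the table is not shown.

```php
<?php
// Hypothetical helper: maps each static file name back to its products_id.
function buildRewriteTable(array $URLArray)
{
    $table = array();
    foreach ($URLArray[0] as $info) {
        list($url, $name, $path) = $info;

        // Pull the query string out of the URL and split it into parameters.
        $query = (string) parse_url($url, PHP_URL_QUERY);
        parse_str($query, $params);

        if (isset($params['products_id'])) {
            $table[$name] = $params['products_id'];
        }
    }
    return $table;
}
```

A request for cisco-cp-7940g-ip-phone.html could then be served by product_info.php with the matching products_id, so templates can link to the .html names directly.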

Author Comment

by:flow79
ThG,

I am always willing to try new techniques, but you will need to give me more information on how to do that.  (I'm somewhat new to PHP, sorry.)

If you could either edit my above code to do so, or give me an in-depth example, I can probably make it work.

Author Comment

by:flow79
Roonaan,

Will the code you provided above find all instances of links, and then pick out the appropriate new .html file name based on the original URL from the array?

I guess I don't completely follow how your code works.

Any more information you could provide would be greatly appreciated.

- Dan
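
For reference, str_replace with two arrays works pairwise: each entry in the first array is replaced by the entry at the same position in the second, wherever that exact text occurs in the string. It does not look for links as such. The small sketch below illustrates this; the file names a.html and b.html are shortened, hypothetical stand-ins for the real names.

```php
<?php
$from = array(
    'http://www.buffalotraders.com/product_info.php?products_id=569',
    'http://www.buffalotraders.com/product_info.php?products_id=570',
);
$to = array('a.html', 'b.html'); // hypothetical, shortened file names

$absolute = '<a href="http://www.buffalotraders.com/product_info.php?products_id=570">PoE</a>';
$relative = '<a href="product_info.php?products_id=570">PoE</a>';

// The second pair matches the exact URL text, so the href becomes "b.html".
$absoluteOut = str_replace($from, $to, $absolute);

// No array entry occurs verbatim in this string, so it comes back unchanged.
$relativeOut = str_replace($from, $to, $relative);
```

If the pages use relative links, or encode the URLs differently from the array entries, nothing is replaced.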

Author Comment

by:flow79
I used your code in my function, Roonaan, and it does not edit the links at all.  Any thoughts?

Author Comment

by:flow79
ThG,

I have been looking into URL rewriting using the mod_rewrite module in Apache.  It seems like very nice functionality, but half the reason we want to make true HTML files is loading speed (both for users and for spiders/crawlers).  It seems that mod_rewrite only changes the way the URL is displayed and does not actually re-create the file as static HTML.

That is our goal here.  Any assistance anyone can provide is greatly appreciated, as this project is fairly time-sensitive.

Thanks!

Author Comment

by:flow79
Actually, now that I look closer, I do believe your code is working properly, Roonaan. Thanks so much!


Expert Comment

by:ThG
flow79: in fact, you understood correctly how mod_rewrite works. Please read my post above again...

I said that URL rewriting lets you use the .html links even on the "live" PHP pages. After that, you still need to generate the HTML files by running your script, but this time it works automatically, without having to parse the saved files again...