Solved

SITEMAP FROM FOLDERS and HTML files

Posted on 2004-04-25
Last Modified: 2009-07-29
How can I make a sitemap based on the folder structure? The contents should run down the screen and be split across multiple pages if possible. HTML output strictly.

TITLE, DESCRIPTION and META TAGS should be included if at all possible. %TAGS% with str_replace would be good, so it's easier on layout.
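A minimal sketch of the %TAGS% idea, assuming a template string with placeholders such as %TITLE% and %DESCRIPTION%. The tag names and the `fill_template` helper are illustrative only and do not come from an existing script:

```php
<?php
// Hypothetical helper: replace %TAG% placeholders in a template string.
// The tag names (TITLE, DESCRIPTION, ...) are assumptions for illustration.
function fill_template($template, array $values)
{
    $search = array();
    $replace = array();
    foreach ($values as $tag => $value) {
        $search[] = '%' . strtoupper($tag) . '%';
        // Escape the value so page text cannot break the HTML layout
        $replace[] = htmlspecialchars($value, ENT_QUOTES);
    }
    return str_replace($search, $replace, $template);
}

$template = '<h1>%TITLE%</h1><p>%DESCRIPTION%</p>';
echo fill_template($template, array(
    'title' => 'Site Map',
    'description' => 'All pages & folders',
));
```

Keeping the placeholders in the HTML file means the layout can change without touching the PHP.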

best regards

PS: if you need the templates prepped for you and some snippets of code to cut your time down, that's not a problem.

$fp = fopen('sitemap.html', 'r');
$sitemap = fread($fp, filesize('sitemap.html'));
fclose($fp);

If you can use explode and split the output at 20 links per page, that would be most useful. I have the numbering code below for you.
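One way to get 20 links per page is `array_chunk()`, used here in place of the explode approach mentioned above. This is only a sketch: the `$links` list is made-up sample data standing in for the real list of pages.

```php
<?php
// Sample data standing in for the real list of links (an assumption)
$links = array();
for ($n = 1; $n <= 45; $n++) {
    $links[] = "folder/name-$n.html";
}

$perPage = 20;
// array_chunk() splits the list into groups of at most $perPage items
$pagesOfLinks = array_chunk($links, $perPage);
$nindexes = count($pagesOfLinks);

foreach ($pagesOfLinks as $index => $chunk) {
    $pageNumber = $index + 1; // pages are numbered from 1
    echo "page $pageNumber of $nindexes: " . count($chunk) . " links\n";
}
```

Each chunk can then be rendered into its own output file.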

$tmp = explode('<!--BEGIN-->', $sitemap);
$template = explode('<!--END-->', $tmp[1]);
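A sketch of the marker split as a function, assuming the template contains exactly one `<!--BEGIN-->` / `<!--END-->` pair. The helper name `split_template` and the array keys are hypothetical:

```php
<?php
// Hypothetical helper: split a template into the part before <!--BEGIN-->,
// the repeating block between the markers, and the part after <!--END-->.
function split_template($html)
{
    $tmp = explode('<!--BEGIN-->', $html, 2);
    if (count($tmp) < 2) {
        return false; // no BEGIN marker found
    }
    $rest = explode('<!--END-->', $tmp[1], 2);
    if (count($rest) < 2) {
        return false; // no END marker found
    }
    return array('header' => $tmp[0], 'block' => $rest[0], 'footer' => $rest[1]);
}

$sitemap = "<ul><!--BEGIN--><li>%LINK%</li><!--END--></ul>";
$parts = split_template($sitemap);
echo $parts['block'], "\n";
```

The `block` part is what gets repeated once per link, with the header and footer written once around it.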

Add a count with that; a str_replace for the above code should be fine:
for ($i = 1; $i < $pages; $i++) {

    // or along those lines
    $template = str_replace('%SITE%', $pages, $sitemap);
    fwrite($fpsitemap, stripslashes($template));
}

I think that's most of it. You might need to ob_start it and ob_end_clean it, with an include for the actual sitemap code itself. Keep them separate if you can: one is for the reading and writing, the other is the generation part. Same for the page numbering; the code for the bottom of the page is below.
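A rough sketch of keeping generation and writing separate with output buffering. It uses `ob_get_clean()`, which returns the buffer and ends buffering in one call. The `generate_sitemap_html` function stands in for the included generator script, and the output path is a demo location; both are assumptions:

```php
<?php
// Hypothetical generator: in a real script this could be an include of the
// sitemap-building code; here it just echoes a few links.
function generate_sitemap_html(array $urls)
{
    foreach ($urls as $url) {
        $safe = htmlspecialchars($url, ENT_QUOTES);
        echo '<li><a href="' . $safe . '">' . $safe . "</a></li>\n";
    }
}

// Capture whatever the generator echoes instead of sending it to the browser
ob_start();
generate_sitemap_html(array('folder/name-1.html', 'folder/name-2.html'));
$captured = ob_get_clean(); // returns the buffer contents and ends buffering

// The reading/writing side stays separate: it only ever sees a string
$outFile = sys_get_temp_dir() . '/sitemap-demo.html'; // demo path, an assumption
file_put_contents($outFile, "<ul>\n" . $captured . "</ul>\n");
```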

If you need more info on layouts or design:

index.html/folder/name-1.html -------- then lots of HTML pages here; that's the site structure throughout. There are around a couple of hundred folders, so splitting it over many pages would be ideal )
                               ) they will be linked together
                               )
sitemap.html/sitemap/ then all files etc., but it must link to the above structure. Actual direct links with the domain would be needed, e.g. www.domain.com/index.html/folder/ etc. (or maybe there's a better way)
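A sketch of collecting the links straight from the folder structure, assuming the HTML files sit on disk under one root folder. The helper name `collect_html_urls` and the base URL are hypothetical:

```php
<?php
// Hypothetical helper: list every .html file under $root as a full URL.
// $baseUrl (e.g. "http://www.domain.com") is an assumption about the site.
function collect_html_urls($root, $baseUrl)
{
    $root = rtrim($root, '/\\');
    $urls = array();
    $iterator = new RecursiveIteratorIterator(
        new RecursiveDirectoryIterator($root, FilesystemIterator::SKIP_DOTS)
    );
    foreach ($iterator as $file) {
        if ($file->isFile() && strtolower($file->getExtension()) === 'html') {
            // Path relative to the site root, with forward slashes for URLs
            $relative = substr($file->getPathname(), strlen($root));
            $relative = ltrim(str_replace('\\', '/', $relative), '/');
            $urls[] = rtrim($baseUrl, '/') . '/' . $relative;
        }
    }
    sort($urls); // directory order is not guaranteed, so sort for stable output
    return $urls;
}

// Example call (the path is a placeholder):
// print_r(collect_html_urls('/path/to/site', 'http://www.domain.com'));
```

The resulting list could also be written one URL per line to a text file for a later scanning step.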

Index numbering below:

<?php

$start=$i;

echo "page $start of $nindexes<br>\n";
$prev = $start-1;
$range = $start-3;
$next = $start+1;

if($start > 1) {
    echo "<a href=\"$name-1.html\">First</a> <a href=\"$name-$prev.html\">Prev</a> ";
     if($start > 3){
          echo "[<a href=\"$name-$range.html\">$range</a>] ";
     }
     $range += 1;
     if($start > 2){
          echo "[<a href=\"$name-$range.html\">$range</a>] ";
     }
     echo "[<a href=\"$name-$prev.html\">$prev</a>] ";
}
echo "<strong>[$start]</strong> ";
if($start < $nindexes) {
    echo "[<a href=\"$name-$next.html\">$next</a>] ";
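    $range = $next+1;
    // $range is $start+2 here, which is a valid page whenever $start < $nindexes-1
    if($start < ($nindexes-1)){
         echo "[<a href=\"$name-$range.html\">$range</a>] ";
    }
    $range += 1;
    // $range is now $start+3, which is a valid page whenever $start < $nindexes-2
    if($start < ($nindexes-2)){
         echo "[<a href=\"$name-$range.html\">$range</a>] ";
    }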
    echo "<a href=\"$name-$next.html\">Next</a> <a href=\"$name-$nindexes.html\">Last</a>";
}
?>
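The same numbering can also be written as a loop over a clamped window of page numbers, which is a different technique from the if-chain above and returns a string instead of echoing. This is only a sketch; the function name `page_links` and the `$window` parameter are hypothetical:

```php
<?php
// Hypothetical helper wrapping the numbering logic in a function.
// $name is the file prefix, so page 3 links to "$name-3.html".
function page_links($name, $start, $nindexes, $window = 3)
{
    $html = "page $start of $nindexes<br>\n";
    if ($start > 1) {
        $prev = $start - 1;
        $html .= "<a href=\"{$name}-1.html\">First</a> <a href=\"{$name}-{$prev}.html\">Prev</a> ";
    }
    // Show up to $window pages either side, clamped to the valid range
    $from = max(1, $start - $window);
    $to = min($nindexes, $start + $window);
    for ($p = $from; $p <= $to; $p++) {
        if ($p == $start) {
            $html .= "<strong>[$p]</strong> ";
        } else {
            $html .= "[<a href=\"{$name}-{$p}.html\">$p</a>] ";
        }
    }
    if ($start < $nindexes) {
        $next = $start + 1;
        $html .= "<a href=\"{$name}-{$next}.html\">Next</a> <a href=\"{$name}-{$nindexes}.html\">Last</a>";
    }
    return $html;
}

echo page_links('sitemap', 2, 5);
```

Returning a string makes it easy to drop the result into a template placeholder before writing the page.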

best regards
Question by:playstat
6 Comments
 
LVL 10

Accepted Solution

by:
eeBlueShadow earned 500 total points
ID: 10911626
OK, I'll start this off by trying code for collecting the information, rather than formatting it in any way.

The following code will take a text file containing one URL per line, and scan each of those files for <title> and <meta> tags.
At the moment the code puts everything in an array for you to do what you want with it. Let me know if you need more help after that:

<!--START CODE-->
<pre>
<?php
// this is the name of the file with the list of URLs
$pagesFile = "files.txt";

// Collect the list of URLs into a list
$pages = file($pagesFile);
// and strip the newline character from each of them
foreach($pages as $i=>$v) $pages[$i] = rtrim($v);

$sitemap = array();
foreach($pages as $page)
{
    $thisPage = array();
    $thisPage['url'] = $page;
    // This is the line which grabs the text of each page
    $content = file_get_contents($page);

    // Grab the page title (non-greedy, and allowed to span line breaks)
    if(preg_match("#<title[^>]*>(.*?)</title>#is", $content, $titleMatch))
      $thisPage['title'] = trim($titleMatch[1]);
    else
      $thisPage['title'] = "[No Title]";

    // Grab the meta tags; [^>]+ keeps each match inside a single tag
    $thisPage['meta'] = array();
    if(preg_match_all("#<meta[^>]+(name|http-equiv)=['\"]([^'\"]*)['\"][^>]+content=['\"]([^'\"]*)['\"][^>]*>#i", $content, $metaMatch, PREG_SET_ORDER))
    {
      // Collate the meta tags into the new page array
      foreach($metaMatch as $m)
      {
        $thisPage['meta'][$m[2]] = $m[3];
      }
    }

    $sitemap[] = $thisPage;
}

print_r($sitemap);

?>
</pre>
<!--END CODE-->

At the moment, this code does nothing in the way of writing anything pretty to screen or making other files, let me know if it's the kind of direction you want though...
 
LVL 10

Expert Comment

by:eeBlueShadow
ID: 10911628
One thing to mention is that the code may take a while to run: it may have to load a large number of files, which is a time-consuming task.
 

Author Comment

by:playstat
ID: 10927793
Yeah, that's perfect, eeBlueShadow.

 

Author Comment

by:playstat
ID: 10927804
If you can output it with %tags%, I think I can put the rest together and integrate it with the script.
 
LVL 10

Expert Comment

by:eeBlueShadow
ID: 10928589
Well now all of the information you need is sat in the array, everything from here on is presentation, and that depends on how your site works.

What you can do is have this script write static files (which are the files you will link to as sitemap.html).
The %tags% you keep referring to seem to hint about a template system or similar, which probably refers to custom written code that isn't standard to PHP. So, without knowing the format of the code there isn't that much I can do.

If you can give me an example of how you want the final sitemap.html to look then I can probably sort this. If I'm right in guessing that the HTML from sitemap.html is read into another page and altered to make your final output, just give me an example of the original page before it is altered.

Otherwise if I'm way off the mark, let me know

_Blue
 

Author Comment

by:playstat
ID: 10935024
Yeah, it's a basic HTML page; I've added the tags on the page wherever they're needed. Read it, str_replace it, then fwrite it to whatever format, looped over many pages / the content text files. It all comes down to this simple system, but all files are written in HTML and/or PHP. I haven't used Smarty, to keep it as lean as possible. And the output is HTML only, so no code resides on the server and/or on the pages if they fail. (They are all encrypted with 24 levels of scrambled code. The fastest PHP scripts going, too. I've managed to take 15 gig of PHP files down to 6 gig, approx. 60 percent, and the lowest server overhead I can get.)
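The read / str_replace / fwrite loop described above might look roughly like this. It is a sketch only: the sample entries mimic the array built in the accepted solution, and the tags %URL%, %TITLE%, %DESCRIPTION%, %ROWS%, %PAGE%, %PAGES%, the templates, and the output folder are all assumptions.

```php
<?php
// Sample entries shaped like the array built in the accepted solution
$sitemap = array(
    array('url' => 'http://www.domain.com/folder/name-1.html', 'title' => 'Page One',
          'meta' => array('description' => 'First page')),
    array('url' => 'http://www.domain.com/folder/name-2.html', 'title' => 'Page Two'),
    array('url' => 'http://www.domain.com/folder/name-3.html', 'title' => 'Page Three'),
);

// Hypothetical templates; the real ones would be read from sitemap.html
$pageTemplate = "<html><body><ul>\n%ROWS%</ul><p>page %PAGE% of %PAGES%</p></body></html>\n";
$rowTemplate = "<li><a href=\"%URL%\">%TITLE%</a> %DESCRIPTION%</li>\n";

$perPage = 2;                 // 20 on the real site; 2 keeps the demo short
$outDir = sys_get_temp_dir(); // demo output folder, an assumption
$chunks = array_chunk($sitemap, $perPage);
$written = array();

foreach ($chunks as $index => $chunk) {
    $rows = '';
    foreach ($chunk as $entry) {
        $description = isset($entry['meta']['description']) ? $entry['meta']['description'] : '';
        $rows .= str_replace(
            array('%URL%', '%TITLE%', '%DESCRIPTION%'),
            array(htmlspecialchars($entry['url'], ENT_QUOTES),
                  htmlspecialchars($entry['title'], ENT_QUOTES),
                  htmlspecialchars($description, ENT_QUOTES)),
            $rowTemplate
        );
    }
    $html = str_replace(
        array('%ROWS%', '%PAGE%', '%PAGES%'),
        array($rows, $index + 1, count($chunks)),
        $pageTemplate
    );
    $file = $outDir . '/sitemap-' . ($index + 1) . '.html';
    file_put_contents($file, $html);
    $written[] = $file;
}
```

Because the output is plain static HTML, nothing has to run on the server when a visitor opens a sitemap page.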

By the way, Blue, if you want your files protected I can help you there, after all the effort you have put in :0)
