C# spider the web

Many programs can extract email addresses from a URL, and many programs can copy the content of a URL to the local machine. How do those programs know the directory structure of the URL? Do they follow the links, or do they have a class to explore the web directory?
teera Asked:
 
aboo_s Commented:
Oh, I see.

Once you have a valid URL, say a.com exists, you can list the whole directory underneath it
using a PHP function such as readdir, which reads a directory on the machine where the script runs (example code attached).

This way, when a URL is found, you can get the list of files and directories under it and work with them accordingly.
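<?php
/* List all files in a directory */

// opendir() returns false on failure, so the body only runs for a readable directory.
if ($handle = opendir('/path/to/files')) {
    echo "Directory handle: $handle\n";
    echo "Files:\n";

    // readdir() returns false when no entries remain; the strict comparison
    // keeps an entry named "0" from ending the loop early.
    while (false !== ($file = readdir($handle))) {
        echo "$file\n";
    }

    closedir($handle);
}
?>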

 
aboo_s Commented:
If I understand correctly, your question is how to explore all URLs on the web. You can create a program that
tries all name combinations in .com, then .net, and so on, and discovers which ones are real by connecting to them on port 80. You start like this: a.com, b.com, c.com, ..., aa.com, ab.com, and so on. That way you can explore most of the web.
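
To make the enumeration idea concrete, here is a minimal sketch in Python (the same logic ports to C#). The names candidate_hosts and is_listening are made up for illustration, the alphabet is limited to a-z (real domain labels may also contain digits and hyphens), and the order of .com before .net within each name length is an assumption. Keep in mind that the number of names grows as 26^n with the length n, so this only stays practical for very short names.

```python
import socket
from itertools import count, islice, product
from string import ascii_lowercase


def candidate_hosts(tlds=(".com", ".net"), alphabet=ascii_lowercase):
    """Yield a.com ... z.com, a.net ... z.net, then aa.com, ab.com, and so on."""
    for length in count(1):  # name lengths 1, 2, 3, ... without an upper bound
        for tld in tlds:
            for letters in product(alphabet, repeat=length):
                yield "".join(letters) + tld


def is_listening(host, port=80, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # DNS failure, refused connection, timeout, ...
        return False


if __name__ == "__main__":
    # Try only the first few candidates; the generator itself never ends.
    for host in islice(candidate_hosts(), 5):
        print(host, is_listening(host))
```

The generator is lazy, so islice decides how many candidates are actually tried.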


Does this answer your question?
 
teera (Author) Commented:
Hi aboo_s,

I want to explore a.com, a.com/s.htm, a.com/sys.php, a.com/pic/in.jpg. That is, I want to explore the pages and files under one URL.
 
aboo_s Commented:
Oh, but that is a different question! PHP only runs on the server, and you cannot run code on someone else's server.
So I'll think of something.
 
aboo_s Commented:
OK, I guess you will most probably have to guess the file names too, just like with the URLs,
because some (most) sites hide their files from being listed.
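
As a rough illustration of guessing file names, the Python sketch below sends a HEAD request for each guessed path and keeps the ones the server answers successfully. The function names exists and probe and the list of guesses are hypothetical, and it should only be pointed at a site you are allowed to probe.

```python
from urllib.parse import urljoin
from urllib.request import Request, urlopen


def exists(url):
    """Return True if a HEAD request for url gets a 2xx answer."""
    try:
        with urlopen(Request(url, method="HEAD"), timeout=10) as response:
            return 200 <= response.status < 300
    except (OSError, ValueError):  # HTTP errors, network errors, malformed URLs
        return False


def probe(base_url, guesses, exists=exists):
    """Return the guessed URLs under base_url for which exists() is true."""
    candidates = [urljoin(base_url, guess) for guess in guesses]
    return [url for url in candidates if exists(url)]


if __name__ == "__main__":
    # The guesses are examples taken from the paths mentioned in this thread.
    print(probe("http://a.com/", ["s.htm", "sys.php", "pic/in.jpg"]))
```

Passing the check as a parameter makes it possible to try probe without touching the network.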

Anyway, you do not have to reinvent the wheel.
Take a look at this page:
http://allseeing-i.com/ASIHTTPRequest/
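
For the original question of how a spider finds a.com/s.htm or a.com/pic/in.jpg without a directory listing, the usual approach is to follow links: download a page, collect its href and src attributes, and repeat for every address on the same host. Here is a minimal sketch of that in Python, using only the standard library. LinkParser, fetch and crawl are names invented for this example, and it will only find files that some page actually links to.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse
from urllib.request import urlopen


class LinkParser(HTMLParser):
    """Collect the href and src attribute values found in one HTML page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ("href", "src") and value:
                self.links.append(value)


def fetch(url):
    """Download url and return its body as text (no retries, no robots.txt check)."""
    with urlopen(url, timeout=10) as response:
        return response.read().decode("utf-8", errors="replace")


def crawl(start_url, fetch=fetch, limit=100):
    """Breadth-first walk over same-host links; returns URLs in visit order."""
    host = urlparse(start_url).netloc
    seen, queue, visited = {start_url}, deque([start_url]), []
    while queue and len(visited) < limit:
        url = queue.popleft()
        visited.append(url)
        try:
            html = fetch(url)
        except (OSError, ValueError):
            continue  # unreachable or malformed URL: nothing to parse
        parser = LinkParser()
        parser.feed(html)
        for link in parser.links:
            # Resolve relative links and drop "#fragment" parts.
            absolute = urldefrag(urljoin(url, link)).url
            if urlparse(absolute).netloc == host and absolute not in seen:
                seen.add(absolute)
                queue.append(absolute)
    return visited


if __name__ == "__main__":
    for page in crawl("http://a.com/", limit=20):
        print(page)
```

This sketch also downloads images and other non-HTML files just to look for links in them; a real spider would check the Content-Type first.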