Solved

C#: spider the web

Posted on 2010-11-26
Last Modified: 2012-05-10
Many programs can extract email addresses from a URL, and many programs can copy the content of a URL to the local machine. How do those programs know the directory structure behind the URL? Do they follow the links, or do they have a class to explore the web directory?
Question by:teera
LVL 10

Expert Comment

by:aboo_s
ID: 34220696
If I understand correctly, your question is how to explore all URLs on the web. You can create a program that
tries all combinations in .com, then .net, and so on, and discovers which are real URLs by connecting to them on port 80. You start like this: a.com, b.com, c.com, ... aa.com, ab.com, and so on. That way you can explore most of the web.


Does this answer your question?
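
The enumeration described above could be sketched roughly as follows. This is a minimal illustration in Python rather than C#, and the names candidate_hosts and is_listening are hypothetical helpers, not part of any library. It only covers letter-only names, exactly as in the a.com, b.com, ... aa.com example.

```python
import itertools
import socket
import string


def candidate_hosts(max_len, tlds=("com", "net")):
    """Yield a.com, b.com, ..., aa.com, ab.com, ... for each TLD in turn."""
    for tld in tlds:
        for length in range(1, max_len + 1):
            for letters in itertools.product(string.ascii_lowercase, repeat=length):
                yield "".join(letters) + "." + tld


def is_listening(host, port=80, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections and timeouts.
        return False
```

A caller would loop over candidate_hosts(n) and keep the names for which is_listening(name) returns True. The number of candidates grows as 26^n per TLD, so this is only practical for very short names.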
 

Author Comment

by:teera
ID: 34221409
Hi aboo_s,

I want to explore a.com, a.com/s.htm, a.com/sys.php, a.com/pic/in.jpg. I want to explore within the URL.
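
One way a program can discover pages like these is the one the question itself raises: following links. The sketch below is a hedged illustration in Python rather than C#. LinkExtractor and same_site_links are hypothetical names; the example only parses an HTML string that has already been downloaded, and keeps the links that point to the same host.

```python
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse


class LinkExtractor(HTMLParser):
    """Collect the values of href and src attributes found in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ("href", "src") and value:
                self.links.append(value)


def same_site_links(page_url, html):
    """Return the sorted absolute URLs in html that share page_url's host."""
    parser = LinkExtractor()
    parser.feed(html)
    host = urlparse(page_url).netloc
    found = set()
    for link in parser.links:
        # Resolve relative links and drop any #fragment part.
        absolute, _fragment = urldefrag(urljoin(page_url, link))
        if urlparse(absolute).netloc == host:
            found.add(absolute)
    return sorted(found)
```

A crawler would repeat this for every newly found URL while keeping a set of pages already visited. Files that no page links to will not be found this way.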
 
LVL 10

Accepted Solution

by:
aboo_s earned 500 total points
ID: 34222518
Oh, I see.

Once you have a valid URL (say a.com exists), you can list the whole directory underneath it,
for example with the PHP function readdir (example code attached).

This way, when a URL is found, you can get the list of files and directories under it and work with them accordingly.
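<?php
/* List all files in a directory */

// opendir() returns false on failure, so guard before reading.
if ($handle = opendir('/path/to/files')) {
    echo "Directory handle: $handle\n";
    echo "Files:\n";

    // Compare strictly against false: an entry named "0" is falsy but valid.
    while (false !== ($file = readdir($handle))) {
        echo "$file\n";
    }

    closedir($handle);
}
?>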

 
LVL 10

Expert Comment

by:aboo_s
ID: 34222523
Oh, but note this: PHP can only be run on the server itself, and you cannot do that on someone else's site.
So I'll think of something else.
 
LVL 10

Expert Comment

by:aboo_s
ID: 34222559
OK, I guess you will most probably have to guess them too, just as with the URLs,
because some (most) sites hide their files from being listed.

Anyway, you do not have to reinvent the wheel.
Take a look at this page:
http://allseeing-i.com/ASIHTTPRequest/


Question has a verified solution.
