Solved

building a bot

Posted on 2008-10-08
Last Modified: 2013-12-08
Suppose I want to build a web bot that will scan only the main pages of all websites in some country.
Where do I start?
Question by:Sasha-N
 
LVL 13

Accepted Solution

by:
Onthrax earned 500 total points
ID: 22673735
It depends on the language you want to write this in.

A good search term to find what you need is:
'creating web spider'

For example, a tutorial in .NET:
http://www.beansoftware.com/NET-Tutorials/Creating-Web-Spider.aspx

Or in Java:
http://www.javaworld.com/javaworld/jw-11-2004/jw-1101-spider.html
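
As a rough idea of what such a spider does at its core, here is a minimal sketch in Python (not taken from either tutorial). It pulls the links out of a page and reduces each one to the main page of its site. The names `LinkCollector`, `extract_links`, `main_page_of` and `fetch`, and the `example-bot/0.1` user agent, are made up for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse
from urllib.request import Request, urlopen


class LinkCollector(HTMLParser):
    """Collects the href of every <a> tag, resolved against a base URL."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Turn relative links such as "/about" into absolute URLs.
                self.links.append(urljoin(self.base_url, value))


def extract_links(html, base_url):
    """Return all absolute link targets found in an HTML string."""
    parser = LinkCollector(base_url)
    parser.feed(html)
    return parser.links


def main_page_of(url):
    """Reduce an http(s) URL to the main page of its site."""
    parts = urlparse(url)
    return f"{parts.scheme}://{parts.netloc}/"


def fetch(url, timeout=10):
    """Download one page; the user agent string is a placeholder."""
    request = Request(url, headers={"User-Agent": "example-bot/0.1"})
    with urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")
```

Calling `extract_links(fetch(url), url)` on one main page gives you the next candidates to visit. A real bot should also respect robots.txt and throttle its requests, which this sketch leaves out.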

Hope this helps
 

Author Comment

by:Sasha-N
ID: 22673946
So I don't need special hardware and can just run it from a home server?
How long do you think it will take to scan all the net domains?
 
LVL 13

Expert Comment

by:Onthrax
ID: 22676379
Correct, although I imagine it will take a very long time to spider ALL the net domains with only your own machine.

For comparison, Google runs many server parks around the world, each consisting of a large number of computers.
 

Author Comment

by:Sasha-N
ID: 22677001
Suppose I want to scan all the sites in Russia, i.e. those with a .ru domain.
Will it take about a month? Or about a year?
 
LVL 13

Expert Comment

by:Onthrax
ID: 22677091
I really couldn't give you an estimate, mate. It depends on a lot of factors.

For example:
- The method you use to spider domains, e.g. looping through all possible domain names (a.ru, b.ru, ab.ru, abc.ru, etc.) or spidering a few .ru sites and following the links you fetch from them.
- The capacity of your machine. A Pentium 1 would take longer than a powerful quad-core.
- The available bandwidth and its speed.
- The size of the web pages you will be spidering. A single page with only a few words is processed faster than a site with a huge main page and a lot of subpages.
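
The two methods from the first point could be sketched like this in Python. This is only an illustration under assumptions: the alphabet (lowercase letters and digits), the `http://` scheme, and the names `candidate_domains`, `crawl_main_pages` and `get_links` are all made up here, and `get_links` stands for whatever function you write to download a page and return its links.

```python
from collections import deque
from itertools import product
from string import ascii_lowercase, digits
from urllib.parse import urlparse


def candidate_domains(max_len, tld="ru"):
    """Method 1: yield a.ru, b.ru, ..., aa.ru, ab.ru, ... up to max_len characters."""
    alphabet = ascii_lowercase + digits  # assumption: no hyphens, ASCII only
    for length in range(1, max_len + 1):
        for chars in product(alphabet, repeat=length):
            yield "".join(chars) + "." + tld


def crawl_main_pages(seeds, get_links, tld="ru", limit=1000):
    """Method 2: start from a few main pages and follow links breadth-first,
    keeping only hosts that end in the given TLD.

    get_links(url) is supplied by the caller and returns the absolute
    URLs linked from that page.
    """
    suffix = "." + tld
    queue = deque(seeds)
    seen = set(seeds)
    visited = []
    while queue and len(visited) < limit:
        page = queue.popleft()
        visited.append(page)
        for link in get_links(page):
            host = urlparse(link).hostname or ""
            if not host.endswith(suffix):
                continue  # link leaves the country's TLD
            main_page = "http://" + host + "/"  # only the main page is scanned
            if main_page not in seen:
                seen.add(main_page)
                queue.append(main_page)
    return visited
```

Method 1 grows very quickly: with this 36-character alphabet there are 36^6 (about 2.2 billion) six-character names alone, and most of them are not registered. Method 2 only reaches sites that something already links to, but it wastes no requests on names that do not exist.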

Keep in mind that even Google, with all its machines, has not indexed the entire internet yet. It's a huge job.
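
If you want a ballpark figure anyway, you can plug your own numbers into a back-of-the-envelope calculation like the sketch below. The function name `estimate_crawl_days` and every input value in the example are placeholders, not measurements of the .ru zone or of any real machine.

```python
def estimate_crawl_days(num_sites, seconds_per_page, parallel_fetches=1):
    """Rough crawl time in days: one main page per site, fetched in parallel."""
    total_seconds = num_sites * seconds_per_page / parallel_fetches
    return total_seconds / 86400  # 86400 seconds in a day


# Placeholder inputs: 1,000,000 sites, 2 seconds per page, 10 fetches at a time.
days = estimate_crawl_days(1_000_000, 2.0, parallel_fetches=10)
print(f"about {days:.1f} days")
```

The seconds per page depends on your bandwidth and on the page sizes, and the number of parallel fetches depends on your machine, so measure both on a small sample before trusting the result.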

Question has a verified solution.