Solved

building a bot

Posted on 2008-10-08
Last Modified: 2013-12-08
Suppose I want to build a web bot that scans only the main pages of all websites in some country.
Where do I start?
Question by:Sasha-N
 
LVL 13

Accepted Solution

by:
Onthrax earned 500 total points
ID: 22673735
It depends on the language you want to write this in.

The search term to google for is:
'creating web spider'

For example, a tutorial in .NET:
http://www.beansoftware.com/NET-Tutorials/Creating-Web-Spider.aspx

Or in Java:
http://www.javaworld.com/javaworld/jw-11-2004/jw-1101-spider.html
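
Whatever the language, the core of a spider is fetching a page and pulling the links out of it. The following is only a minimal Python sketch of that step, not code from either tutorial. The names `LinkCollector` and `homepages_in_tld` and the sample HTML are made up for illustration, and it works on an HTML string you have already downloaded (for example with `urllib.request.urlopen`).

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlsplit


class LinkCollector(HTMLParser):
    """Collects absolute URLs from the href attribute of <a> tags."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Resolve relative links against the page they came from.
                self.links.append(urljoin(self.base_url, value))


def homepages_in_tld(html, base_url, tld=".ru"):
    """Return the main-page URLs (scheme + host + "/") under `tld` linked from `html`."""
    collector = LinkCollector(base_url)
    collector.feed(html)
    found = set()
    for link in collector.links:
        parts = urlsplit(link)
        host = parts.hostname or ""
        # Skip mailto:, javascript: and hosts outside the wanted zone.
        if parts.scheme in ("http", "https") and host.endswith(tld):
            found.add(f"{parts.scheme}://{host}/")
    return found


# Hypothetical sample page, standing in for a downloaded homepage.
sample = (
    '<a href="http://example.ru/news/1.html">news</a> '
    '<a href="/about">about</a> '
    '<a href="http://example.com/">elsewhere</a>'
)
print(homepages_in_tld(sample, "http://start.ru/index.html"))
```

Reducing every link to scheme plus host is what keeps the bot on main pages only; a real bot would also keep a queue of hosts still to visit and a set of hosts already seen.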

Hope this helps
 

Author Comment

by:Sasha-N
ID: 22673946
So I don't need special hardware, and I can just run it from a home server?
How much time do you think it will take to scan all the net domains?
 
LVL 13

Expert Comment

by:Onthrax
ID: 22676379
Correct, although I imagine it will take a long, long time to spider ALL the net domains with only your own machine.

For example, Google has many, many server parks consisting of hundreds of computers in many countries.
 

Author Comment

by:Sasha-N
ID: 22677001
Suppose I want to scan all the sites in Russia, i.e. those with a .ru domain.
Will it take about a month? Or about a year?
 
LVL 13

Expert Comment

by:Onthrax
ID: 22677091
I really couldn't make an estimate m8. It all depends on a lot of factors.

For example:
- The method you use to find domains, e.g. looping through all possible names (a.ru, b.ru, ab.ru, abc.ru, etc.), or spidering a few .ru sites and following the links you fetch from them.
- The capacity of your machine. A Pentium 1 would take longer than a powerful quad core.
- The available bandwidth and its speed.
- The size of the web pages you will be spidering. A single page with only a few words will be done faster than a site with a huge page and a lot of pages behind it.
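
To get a feel for the first method in the list above (looping through all possible names), here is a small Python sketch that counts the candidates. It assumes names made only of the letters a-z and the digits 0-9; hyphens and non-Latin names are ignored, so the real space is larger. `candidate_count` is just an illustrative name.

```python
import string

# Assumption: only a-z and 0-9 are used in names (36 symbols).
ALPHABET = string.ascii_lowercase + string.digits


def candidate_count(max_len):
    """Number of possible names of length 1..max_len over ALPHABET."""
    return sum(len(ALPHABET) ** n for n in range(1, max_len + 1))


for length in (3, 4, 5, 6):
    print(length, candidate_count(length))
```

Names of up to six characters already give more than two billion candidates, nearly all of which would not resolve to a site, which is why starting from a few known sites and following links is usually the more practical of the two methods.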

Consider that even Google, with all its machines, has not indexed the entire internet yet. It's a huge job.
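
If you want a ballpark anyway, one rough way is to plug your own numbers into a back-of-the-envelope formula like the Python sketch below. Everything here is an assumption for illustration: the function name `crawl_days`, its parameters, and especially the example figures, which are placeholders and not measurements of the .ru zone.

```python
def crawl_days(num_pages, avg_page_kb, bandwidth_mbit,
               seconds_per_request, parallel_requests):
    """Very rough crawl duration in days for one machine.

    The crawl is limited either by raw transfer time (bandwidth)
    or by waiting on servers (latency), whichever is larger.
    """
    # kilobytes -> kilobits, divided by link speed in kilobits per second
    transfer_seconds = num_pages * avg_page_kb * 8 / (bandwidth_mbit * 1000)
    # requests wait on the remote server, but many can run side by side
    latency_seconds = num_pages * seconds_per_request / parallel_requests
    return max(transfer_seconds, latency_seconds) / 86400


# Placeholder inputs: 1,000,000 main pages, 50 KB each, a 10 Mbit/s line,
# 2 seconds per request, 20 requests in flight at once.
print(round(crawl_days(1_000_000, 50, 10, 2, 20), 2))
```

The sketch leaves out DNS lookups, timeouts, retries and politeness delays, so treat the result as a lower bound at best.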


Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Does your audience prefer people in photos or no people? How can you best highlight what you’re selling? What are your competitors doing, and what can you do that is different and unique from them?  Continue reading to learn how to make your images …
There’s a good reason for why it’s called a homepage – it closely resembles that of a physical house and the only real difference is that it’s online. Your website’s homepage is where people come to visit you. It’s the family room of your website wh…
The viewer will receive an overview of the basics of CSS showing inline styles. In the head tags set up your style tags: (CODE) Reference the nav tag and set your properties.: (CODE) Set the reference for the UL element and styles for it to ensu…
Learn how to create flexible layouts using relative units in CSS.  New relative units added in CSS3 include vw(viewports width), vh(viewports height), vmin(minimum of viewports height and width), and vmax (maximum of viewports height and width).

856 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question