Solved

ASP crawler

Posted on 2005-04-25
Last Modified: 2012-05-05
Hi
I need to know whether there is any free ASP code or package available that can crawl a given web site (link to link) and create/update a database of all its pages. I'm going to create one myself, but before I start, I'd like to hear about others' experiences.
Your comments will be appreciated,
Huji
Question by:huji
9 Comments
 
LVL 23

Expert Comment

by:apresto
ID: 13859839
Hey Huji, how are you doing?

I haven't seen one around. I looked into it a little a while back but didn't get very far. I suppose it would involve scanning directories with FSO for the pages, but I don't know how you would identify the link-to-link relationships. How are you planning on doing it?
 
LVL 33

Assisted Solution

by:hongjun
hongjun earned 400 total points
ID: 13859891
I have not tried it but I somehow found this in my bookmarks :)
http://www.asp101.com/articles/chris/spider/default.asp

hongjun
 
LVL 14

Author Comment

by:huji
ID: 13864504
Hi my friends,
@hongjun: Good link, though I had found it before reading your message here. The fact is, I want to see one in action. The article explains how to do it but does not provide an example to download.
@apresto: The article hongjun linked above answers your question completely. In brief: the crawler looks at the first page of my site, stores its text (i.e. the content returned by the web server, after stripping its HTML tags and so on) in a database (in the second column; the first column holds the URL of the page), finds every link on that page, and adds them to a list. When it finishes one page, it does the same with the next page in the list. This way it visits every page of my site and stores the resulting content in a database, so I can build a search engine for my site on top of it.
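The loop described above can be sketched roughly as follows. This is a minimal illustration in Python rather than ASP, and it is not taken from the linked article: the `pages` table, the `PageParser` class, and the `crawl`, `fetch_html`, and `max_pages` names are all assumptions made for the example.

```python
import sqlite3
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin, urldefrag, urlparse
from urllib.request import urlopen


class PageParser(HTMLParser):
    """Collects visible text and href values from one HTML page."""

    def __init__(self):
        super().__init__()
        self.text_parts = []
        self.links = []
        self._skip_depth = 0  # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1
        elif tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth and data.strip():
            self.text_parts.append(data.strip())


def fetch_html(url):
    """Default fetcher (hypothetical helper); pass another callable to crawl() for testing."""
    with urlopen(url, timeout=10) as response:
        return response.read().decode("utf-8", errors="replace")


def crawl(start_url, db, fetch=fetch_html, max_pages=100):
    """Breadth-first crawl of one site; stores (url, text) rows and returns the page count."""
    db.execute("CREATE TABLE IF NOT EXISTS pages (url TEXT PRIMARY KEY, content TEXT)")
    site = urlparse(start_url).netloc
    queue = deque([start_url])   # the "list" of pages still to visit
    seen = {start_url}           # every URL ever queued, to avoid loops
    stored = 0
    while queue and stored < max_pages:
        url = queue.popleft()
        try:
            html = fetch(url)
        except Exception:
            continue  # skip pages that fail to load
        parser = PageParser()
        parser.feed(html)
        # First column: URL. Second column: text with the tags stripped.
        db.execute("INSERT OR REPLACE INTO pages VALUES (?, ?)",
                   (url, " ".join(parser.text_parts)))
        stored += 1
        for href in parser.links:
            link = urldefrag(urljoin(url, href)).url  # absolute URL, no #fragment
            if urlparse(link).netloc == site and link not in seen:
                seen.add(link)
                queue.append(link)
    db.commit()
    return stored
```

Calling `crawl("http://www.example.com/", sqlite3.connect("site.db"))` would fill the table for that (placeholder) site. The `seen` set is what keeps pages that link to each other from being crawled forever, and the `max_pages` cap is only a safety limit chosen for the sketch.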
You may say I can do the same with Indexing Service. I say no! Indexing Service search is not that customizable. With this type of search engine I can control the results (for example, which page appears at the top), and I can control which parts of my pages' HTML are stored in the database and which are not (to avoid hits that occur only because the keyword was found in an advertisement at the bottom of the page). There are many more respects in which this type of search engine is preferable to any other. It could be called a mini-Google for my site. (And perhaps the hard part is building a PageRank system! *L* )
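As a rough illustration of those two kinds of control, the sketch below (Python, not ASP) drops marked regions before a page is stored and weights title matches above body matches when ranking. The `<!-- noindex -->` / `<!-- /noindex -->` marker pair, the function names, and the weight of 3 are assumptions invented for the example, not an existing standard or anything from the thread.

```python
import re

# Hypothetical convention: anything between these two HTML comments is not indexed.
NOINDEX = re.compile(r"<!--\s*noindex\s*-->.*?<!--\s*/noindex\s*-->",
                     re.DOTALL | re.IGNORECASE)


def strip_excluded(html):
    """Remove regions the site owner marked as not to be stored (e.g. ad blocks)."""
    return NOINDEX.sub(" ", html)


def score(keyword, title, body, title_weight=3):
    """Naive relevance score: a hit in the title counts title_weight times a body hit."""
    word = keyword.lower()
    return title_weight * title.lower().count(word) + body.lower().count(word)
```

Running `strip_excluded` on each page before its text is extracted keeps advertisement text out of the database, and sorting results by `score` in descending order is one simple way to decide which page appears first.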

@both: Admittedly, coding the whole system in ASP alone can be slow and CPU-intensive, and I could easily hit a page timeout right at the beginning or, worse, crash the server. For that reason, I would prefer the crawler (though not the search engine) to be coded another way, for example as a COM object. I have very little ASP.Net experience, but I can manage to some extent, so a link to an ASP.Net solution would be just as valuable to me.

Finally, I'd be glad if you can help me with this question as well:
http://www.experts-exchange.com/Web/Web_Languages/ASP/Q_21399124.html

Special thanks,
Huji
 
LVL 6

Accepted Solution

by:cjinsocal581
cjinsocal581 earned 1200 total points
ID: 13867748
 
LVL 6

Expert Comment

by:cjinsocal581
ID: 13867760
One more thing: it uses both Access and SQL, and it utilizes the full-text search of SQL.

Read the details and you can see it at work.
 
LVL 14

Author Comment

by:huji
ID: 13870812
Yider is a great example, and I'm focusing on it for now. (Actually, I didn't know about it until you gave me the link on my other open question here: http://www.experts-exchange.com/Web/Web_Languages/ASP/Q_21399124.html)
I'd like more examples, and as I stated before, ASP.Net examples are welcome, if there are any.
Thanks a lot CJ.
Huji
 
LVL 6

Expert Comment

by:cjinsocal581
ID: 13870837
Keep in mind that Yider is fully customizable. In fact, I have modified it so that you can type in the links you want parsed. The nice thing about Yider is the ranking it does on the page searches.
 
LVL 14

Author Comment

by:huji
ID: 13873577
Again, I should repeat that I'm not going to use Yider or any other example as-is. I'm going to develop one of my own in the future, so I need to hear about others' experiences with it.
Huji
 
LVL 14

Author Comment

by:huji
ID: 13959010
Thanks all, for your help
Huji