  • Status: Solved
  • Priority: Medium
  • Security: Public
  • Views: 439

Web crawler vs. mirroring a web site?

I think they are the same thing, but I am not sure.
I am trying to write a crawler that will download everything from a given URL.
A web crawler only follows the hyperlinks in <a> tags, is that correct?
What about folders and subfolders on the web server that are not linked from that URL?
How does it find those?
I guess I don't understand exactly what a crawler does.
Can you explain this?
Asked: dkim18
1 Solution
 
for_yan commented:

You can also read this about mirrors:
http://www8.org/w8-papers/4c-server/mirror/mirror.html

As I understand it, a crawler is a program that tries to cover many web sites, perhaps related to some topic,
and indexes the links so that effective search becomes possible (I guess Google has the ultimate web crawler).
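
To make the link-following part concrete, here is a minimal sketch in Python of how a crawler might discover URLs on one page, assuming it only looks at `<a href>` links. The names `LinkCollector` and `extract_links` are made up for this illustration, and the example.com URL is a placeholder. It also suggests why a folder that nothing links to stays invisible: a crawler like this only learns about URLs that appear in pages it has already fetched.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urldefrag


class LinkCollector(HTMLParser):
    """Hypothetical helper: records the href of every <a> tag it sees."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def extract_links(base_url, html):
    """Return absolute, de-duplicated URLs for all <a href> links, in page order."""
    parser = LinkCollector()
    parser.feed(html)
    links = []
    for href in parser.hrefs:
        # Resolve relative links against the page URL and drop any #fragment.
        absolute, _fragment = urldefrag(urljoin(base_url, href))
        if absolute not in links:
            links.append(absolute)
    return links


# Placeholder page content; a real crawler would fetch this over HTTP.
page = '<a href="/docs/">Docs</a> <a href="about.html#team">About</a>'
found = extract_links("http://example.com/index.html", page)
```

A full crawler would repeat this for every newly found URL, keeping a set of pages it has already visited.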

A mirror is focused on one particular site and makes a full copy of it.
It has a quite different purpose: giving some categories of users access to the site,
keeping an additional copy just in case, etc. See the article about mirrors.
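
As a rough illustration of the "one site, full copy" idea, here is a small Python sketch of two decisions a mirroring tool has to make, assuming it already has a list of discovered URLs: whether a URL belongs to the mirrored site, and where to store its copy on disk. The function names `same_site` and `local_path`, the `mirror` output directory and the example URLs are all assumptions for this example, not any particular tool's behaviour.

```python
from pathlib import PurePosixPath
from urllib.parse import urlsplit


def same_site(root_url, url):
    """True if url is on the same host as root_url (a simple host comparison)."""
    return urlsplit(url).netloc.lower() == urlsplit(root_url).netloc.lower()


def local_path(url, out_dir="mirror"):
    """Map a URL to a relative file path under out_dir.

    Directory-style URLs are stored as index.html. This sketch does not
    sanitize unusual paths or handle query strings.
    """
    parts = urlsplit(url)
    path = parts.path or "/"
    if path.endswith("/"):
        path += "index.html"
    return str(PurePosixPath(out_dir, parts.netloc, path.lstrip("/")))


root = "http://example.com/"
candidates = ["http://example.com/docs/", "http://other.org/page.html"]
# Keep only URLs on the mirrored site, paired with where each copy would go.
to_copy = [(u, local_path(u)) for u in candidates if same_site(root, u)]
```

A general crawler would skip the `same_site` check and follow links to other hosts as well, which seems to be the main practical difference.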

I think mirrors are often set up by arrangement with the mirrored site,
whereas crawlers do not need to be.

At least that is my understanding.
 