Octalys

I am looking for an open-source, multithreaded web spider/web crawler/web scraper package for Linux. Any suggestions?
Web Development, Linux, Programming

Arty K

HTTrack (http://www.httrack.com/) is like Teleport, but open source, multithreaded ...

If you need something different (say if you don't need offline downloading), be more specific.
Arty K

For example, there is the Apache Lucene engine, which is used for text indexing/searching: http://lucene.apache.org/java/docs/ (Lucene itself does not crawl; the related Apache Nutch project is the crawler built on top of it.)
Octalys

ASKER

I need to collect URLs with the software.
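
As a rough illustration of that requirement, here is a minimal sketch of a threaded URL collector in Python, using only the standard library. It is not taken from any tool mentioned in this thread; the names `collect_urls`, `extract_links`, `fetch` and the `max_pages`/`workers`/`fetcher` parameters are made up for the example. It assumes pages are UTF-8 HTML, follows links to any host, and ignores robots.txt and rate limiting, all of which a real crawler would need to handle.

```python
from concurrent.futures import ThreadPoolExecutor
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin
from urllib.request import urlopen


class LinkParser(HTMLParser):
    """Collects the href value of every <a> tag."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def extract_links(html, base_url):
    """Return the absolute http(s) URLs found in html, fragments removed."""
    parser = LinkParser()
    parser.feed(html)
    links = set()
    for href in parser.hrefs:
        url, _fragment = urldefrag(urljoin(base_url, href))
        if url.startswith(("http://", "https://")):
            links.add(url)
    return links


def fetch(url):
    """Download a page as text; return '' on any error (sketch-level handling)."""
    try:
        with urlopen(url, timeout=10) as resp:
            return resp.read().decode("utf-8", errors="replace")
    except Exception:
        return ""


def collect_urls(start_url, max_pages=50, workers=8, fetcher=fetch):
    """Breadth-first crawl that returns every URL seen.

    At most max_pages pages are fetched; URLs found on them are still
    recorded even if they are never fetched themselves.
    """
    seen = {start_url}
    frontier = [start_url]
    fetched = 0
    with ThreadPoolExecutor(max_workers=workers) as pool:
        while frontier and fetched < max_pages:
            batch = frontier[: max_pages - fetched]
            fetched += len(batch)
            frontier = []
            # All pages in one batch are downloaded concurrently.
            for url, html in zip(batch, pool.map(fetcher, batch)):
                for link in extract_links(html, url):
                    if link not in seen:
                        seen.add(link)
                        frontier.append(link)
    return seen
```

Calling `collect_urls("http://www.example.com/")` would return a set of URLs; the `fetcher` argument exists only so the download step can be swapped out, for example for testing without network access.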
ASKER CERTIFIED SOLUTION
Duncan Roe
