Solved

Error with Apache Nutch installation on Windows 7

Posted on 2014-01-21
Last Modified: 2014-05-12
Hello All,

I have installed Apache Nutch 2.1 on Windows 7 and am running it under Cygwin.

I have the following environment variables set:

JAVA_HOME : C:\Java\jdk1.7.0_07
NUTCH_HOME:  C:\apache-nutch-2.1-src\apache-nutch-2.1
NUTCH_JAVA_HOME: C:\Java\jdk1.7.0_07

When I execute the command "./bin/nutch crawl urls -depth 3 -topN 5", I get the error below:

Error: Could not find or load main class org.apache.nutch.crawl.Crawler

Is there a permissions error here? Should I set my environment variables differently? Suggestions please?


Below are some of the commands I have executed:

$ set | grep 'HOME'
ANT_HOME='C:\Program Files\ant'
HOME=/home/prasankr
HOMEDRIVE=C:
HOMEPATH='\Users\prasankr'
JAVA_HOME='C:\Java\jdk1.7.0_07'
NUTCH_HOME='C:\apache-nutch-2.1-src\apache-nutch-2.1'
NUTCH_JAVA_HOME='C:\Java\jdk1.7.0_07'


 
$ find ${NUTCH_HOME} -type f -name '*nutch*.jar'
C:\apache-nutch-2.1-src\apache-nutch-2.1/build/apache-nutch-2.1.jar

Executed:
./bin/nutch crawl urls -depth 3 -topN 5 2>&1 | tee nutch.log


Output:
alling nutch job
cygpath: can't convert empty path
after calling nutch job

before nutch conf
C:\Java\jdk1.6.0_45\bin;C:\Users\prasankr\Downloads\vertica-jdk5-6.0.1-0.jar
nutch conf dir
/cygdrive/c/apache-nutch-2.1-src/apache-nutch-2.1/src/conf
/cygdrive/c/apache-nutch-2.1-src/apache-nutch-2.1/src/conf:C:\Java\jdk1.7.0_07/lib/tools.jar
checking cygwin
nutch opts
-Dhadoop.log.dir=C:\apache-nutch-2.1-src\apache-nutch-2.1\src\logs -Dhadoop.log.file=hadoop.log
nutch opts after
-Dhadoop.log.dir=C:\apache-nutch-2.1-src\apache-nutch-2.1\src\logs -Dhadoop.log.file=hadoop.log
executing call
C:\Java\jdk1.7.0_07/bin/java -Xmx1000m -Djavax.xml.parsers.DocumentBuilderFactory=com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderFactoryImpl -Dhadoop.log.dir=C:\apache-nutch-2.1-src\apache-nutch-2.1\src\logs -Dhadoop.log.file=hadoop.log -classpath C:\apache-nutch-2.1-src\apache-nutch-2.1\src\conf;C;C:\Java\jdk1.7.0_07\lib\tools.jar;C:\apache-nutch-2.1-src\apache-nutch-2.1\src\lib\*.jar
Class
org.apache.nutch.crawl.Crawler
Error: Could not find or load main class org.apache.nutch.crawl.Crawler


Thanks,
Prasanna
Question by:pkrish80

Author Comment

by:pkrish80
ID: 39799006
I have also attached the nutch file.
nutch.txt
 
LVL 27

Accepted Solution

by:
mrcoffee365 earned 500 total points
ID: 39808707
This line:
Error: Could not find or load main class org.apache.nutch.crawl.Crawler

indicates that the classpath for your executable is incorrect: java could not find that class in any entry on it.  Check your classpath again, and check the contents of the jars on it.
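One way to check the contents of a jar, as a minimal sketch. The helper names class_to_entry and listing_has_class are made up for illustration, and the usage line assumes the JDK's jar tool is on the PATH to produce the listing.

```shell
# Convert a fully qualified class name to the entry name used inside a jar,
# e.g. org.apache.nutch.crawl.Crawler -> org/apache/nutch/crawl/Crawler.class
class_to_entry() {
  printf '%s.class\n' "$(printf '%s' "$1" | tr '.' '/')"
}

# Read a jar listing on stdin (one entry per line, as printed by `jar tf`)
# and succeed only if the given class has an entry in it.
listing_has_class() {
  grep -qxF "$(class_to_entry "$1")"
}

# Usage (run from the Nutch directory; path taken from the find output above):
#   jar tf build/apache-nutch-2.1.jar \
#     | listing_has_class org.apache.nutch.crawl.Crawler && echo found
```

Even if the class is found, the jar still has to appear on the -classpath that the script passes to java. In the output posted in the question, the -classpath lists src\conf, tools.jar and src\lib\*.jar, but not build/apache-nutch-2.1.jar.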
 

Author Comment

by:pkrish80
ID: 39883052
I checked the classpath and still have the issue, but I will look into reinstalling Nutch.
 
LVL 27

Expert Comment

by:mrcoffee365
ID: 39883619
There's no other interpretation of that exception.  Many things can cause it, and it can be hard for new users to track down.  Reinstalling nutch might fix it -- maybe there's an environment variable which wasn't set correctly.
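To track this down, it can help to look at the classpath one entry at a time. A rough sketch, with made-up helper names (split_classpath, literal_jar_globs), assuming a Windows-style classpath separated by ";":

```shell
# Print each entry of a ';'-separated classpath on its own line.
split_classpath() {
  printf '%s\n' "$1" | tr ';' '\n'
}

# Print the entries that end in "*.jar". java only expands an entry that
# ends in a bare "*", so these are taken as literal file names.
literal_jar_globs() {
  split_classpath "$1" | grep '\*\.jar$'
}

# Usage: paste the value that follows -classpath in the logged java command,
# inside single quotes so the backslashes are kept as-is:
#   literal_jar_globs 'C:\...\src\conf;C;C:\...\tools.jar;C:\...\src\lib\*.jar'
```

Run against the -classpath value from the output in the question, this would print the entry ending in src\lib\*.jar. That output also contains a stray "C" entry, which looks like a path-conversion problem in the script, though that is a guess from the log and not something confirmed in this thread.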
