Solved

Error with Apache Nutch installation on Windows 7

Posted on 2014-01-21
Last Modified: 2014-05-12
Hello All,

I have installed Apache Nutch 2.1 on Windows 7 and am running it under Cygwin.

I have the following environment variables set:

JAVA_HOME: C:\Java\jdk1.7.0_07
NUTCH_HOME: C:\apache-nutch-2.1-src\apache-nutch-2.1
NUTCH_JAVA_HOME: C:\Java\jdk1.7.0_07

When I run the command "./bin/nutch crawl urls -depth 3 -topN 5", I get the following error:

Error: Could not find or load main class org.apache.nutch.crawl.Crawler

Is this a permissions problem, or should I set my environment variables differently? Any suggestions would be appreciated.


Below are some of the commands I have executed:

$ set | grep 'HOME'
ANT_HOME='C:\Program Files\ant'
HOME=/home/prasankr
HOMEDRIVE=C:
HOMEPATH='\Users\prasankr'
JAVA_HOME='C:\Java\jdk1.7.0_07'
NUTCH_HOME='C:\apache-nutch-2.1-src\apache-nutch-2.1'
NUTCH_JAVA_HOME='C:\Java\jdk1.7.0_07'


 
$ find ${NUTCH_HOME} -type f -name '*nutch*.jar'
C:\apache-nutch-2.1-src\apache-nutch-2.1/build/apache-nutch-2.1.jar

Executed:
./bin/nutch crawl urls -depth 3 -topN 5 2>&1 | tee nutch.log


Output:
calling nutch job
cygpath: can't convert empty path
after calling nutch job

before nutch conf
C:\Java\jdk1.6.0_45\bin;C:\Users\prasankr\Downloads\vertica-jdk5-6.0.1-0.jar
nutch conf dir
/cygdrive/c/apache-nutch-2.1-src/apache-nutch-2.1/src/conf
/cygdrive/c/apache-nutch-2.1-src/apache-nutch-2.1/src/conf:C:\Java\jdk1.7.0_07/lib/tools.jar
checking cygwin
nutch opts
-Dhadoop.log.dir=C:\apache-nutch-2.1-src\apache-nutch-2.1\src\logs -Dhadoop.log.file=hadoop.log
nutch opts after
-Dhadoop.log.dir=C:\apache-nutch-2.1-src\apache-nutch-2.1\src\logs -Dhadoop.log.file=hadoop.log
executing call
C:\Java\jdk1.7.0_07/bin/java -Xmx1000m -Djavax.xml.parsers.DocumentBuilderFactory=com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderFactoryImpl -Dhadoop.log.dir=C:\apache-nutch-2.1-src\apache-nutch-2.1\src\logs -Dhadoop.log.file=hadoop.log -classpath C:\apache-nutch-2.1-src\apache-nutch-2.1\src\conf;C;C:\Java\jdk1.7.0_07\lib\tools.jar;C:\apache-nutch-2.1-src\apache-nutch-2.1\src\lib\*.jar
Class
org.apache.nutch.crawl.Crawler
Error: Could not find or load main class org.apache.nutch.crawl.Crawler


Thanks,
Prasanna
Question by:pkrish80

Author Comment

by:pkrish80
ID: 39799006
I have also attached the nutch file.
nutch.txt

Accepted Solution

by:mrcoffee365
ID: 39808707
This line:
Error: Could not find or load main class org.apache.nutch.crawl.Crawler

means the JVM could not find that class on the classpath it was given, so the classpath for your executable is incorrect. Check the classpath again, and check that one of the jars on it actually contains the class.
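
One way to act on this advice, as a minimal sketch rather than a verified fix: split the classpath the script printed and flag entries Java cannot use, then check whether a given jar really lists the class. The function names `check_classpath`, `class_to_entry`, and `jar_has_class` are hypothetical helpers, not part of Nutch; `jar tf` is the JDK's jar-listing command. The sketch assumes bash (as under Cygwin) and relies on the fact that Java expands a classpath entry ending in `*` to every jar in that directory but does not expand `*.jar`.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for inspecting a Java classpath; not part of Nutch.

# Split a Windows-style classpath on ';' and label each entry.
check_classpath() {
  local cp="$1" entry
  local -a entries
  IFS=';' read -r -a entries <<< "$cp"
  for entry in "${entries[@]}"; do
    case "$entry" in
      '')       printf 'EMPTY\n' ;;                      # two separators in a row
      *'*.jar') printf 'BAD_WILDCARD %s\n' "$entry" ;;   # Java expands dir/* only, not dir/*.jar
      ?)        printf 'FRAGMENT %s\n' "$entry" ;;       # a lone character such as "C" is not a usable path
      *)        printf 'OK %s\n' "$entry" ;;
    esac
  done
}

# Convert a fully qualified class name to its entry name inside a jar.
class_to_entry() {
  printf '%s.class\n' "${1//.//}"
}

# Succeed if the jar lists the class (needs the JDK's `jar` on PATH).
jar_has_class() {
  jar tf "$1" | grep -qFx "$(class_to_entry "$2")"
}
```

Passing the `-classpath` value from the output above (in single quotes, so the backslashes survive) to `check_classpath` flags the stray `C` entry as FRAGMENT and the `src\lib\*.jar` entry as BAD_WILDCARD. None of the entries is `build/apache-nutch-2.1.jar`, the jar that `find` located. Running `jar_has_class "$NUTCH_HOME/build/apache-nutch-2.1.jar" org.apache.nutch.crawl.Crawler && echo found` would show whether that jar contains the class at all.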

Author Comment

by:pkrish80
ID: 39883052
I checked the classpath and still have the issue, but I will look into reinstalling Nutch.

Expert Comment

by:mrcoffee365
ID: 39883619
That error has no other interpretation: the class is not on the classpath. Many things can cause that, and it can be hard for new users to track down. Reinstalling Nutch might fix it; perhaps an environment variable was not set correctly.
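
A quick way to rule out the environment-variable theory, sketched here under assumptions: the `cygpath: can't convert empty path` line in the output suggests that some path variable was empty when the script ran, though the log does not say which one. The helper below, `missing_vars`, is a hypothetical name, not something Nutch provides. It prints each variable name from its arguments that is unset or empty in the current bash session.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the name of every listed variable that is unset or empty.
missing_vars() {
  local name
  for name in "$@"; do
    # ${!name:-} is bash indirect expansion: the value of the variable named by $name.
    if [ -z "${!name:-}" ]; then
      printf '%s\n' "$name"
    fi
  done
}
```

For example, `missing_vars JAVA_HOME NUTCH_HOME NUTCH_JAVA_HOME` prints nothing when all three variables from the question are set. Any name it prints is a candidate for the empty path, and other variables the script reads can be added to the list.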