pkrish80
asked on
Error with Apache Nutch installation on Windows 7
Hello All,
I have installed Apache Nutch 2.1 on Windows 7 and am running it under Cygwin.
I have the following environment variables set:
JAVA_HOME : C:\Java\jdk1.7.0_07
NUTCH_HOME: C:\apache-nutch-2.1-src\apache-nutch-2.1
NUTCH_JAVA_HOME: C:\Java\jdk1.7.0_07
When I execute the command, "./bin/nutch crawl urls -depth 3 -topN 5" I get the error below:
Error: Could not find or load main class org.apache.nutch.crawl.Crawler
Is there a permissions error here? Should I set my environment variables differently? Suggestions please?
Below are some of the commands I have executed:
$ set | grep 'HOME'
ANT_HOME='C:\Program Files\ant'
HOME=/home/prasankr
HOMEDRIVE=C:
HOMEPATH='\Users\prasankr'
JAVA_HOME='C:\Java\jdk1.7.0_07'
NUTCH_HOME='C:\apache-nutch-2.1-src\apache-nutch-2.1'
NUTCH_JAVA_HOME='C:\Java\jdk1.7.0_07'
$ find ${NUTCH_HOME} -type f -name '*nutch*.jar'
C:\apache-nutch-2.1-src\apache-nutch-2.1/build/apache-nutch-2.1.jar
Executed:
./bin/nutch crawl urls -depth 3 -topN 5 2>&1 | tee nutch.log
Output:
alling nutch job
cygpath: can't convert empty path
after calling nutch job
before nutch conf
C:\Java\jdk1.6.0_45\bin;C:\Users\prasankr\Downloads\vertica-jdk5-6.0.1-0.jar
nutch conf dir
/cygdrive/c/apache-nutch-2.1-src/apache-nutch-2.1/src/conf
/cygdrive/c/apache-nutch-2.1-src/apache-nutch-2.1/src/conf:C:\Java\jdk1.7.0_07/lib/tools.jar
checking cygwin
nutch opts
-Dhadoop.log.dir=C:\apache-nutch-2.1-src\apache-nutch-2.1\src\logs -Dhadoop.log.file=hadoop.log
nutch opts after
-Dhadoop.log.dir=C:\apache-nutch-2.1-src\apache-nutch-2.1\src\logs -Dhadoop.log.file=hadoop.log
executing call
C:\Java\jdk1.7.0_07/bin/java -Xmx1000m -Djavax.xml.parsers.DocumentBuilderFactory=com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderFactoryImpl -Dhadoop.log.dir=C:\apache-nutch-2.1-src\apache-nutch-2.1\src\logs -Dhadoop.log.file=hadoop.log -classpath C:\apache-nutch-2.1-src\apache-nutch-2.1\src\conf;C;C:\Java\jdk1.7.0_07\lib\tools.jar;C:\apache-nutch-2.1-src\apache-nutch-2.1\src\lib\*.jar
Class
org.apache.nutch.crawl.Crawler
Error: Could not find or load main class org.apache.nutch.crawl.Crawler
Thanks,
Prasanna
ASKER
I checked the classpath and still have issues but will look into reinstalling nutch.
There's no other interpretation of that exception. Many things can cause it, and it can be hard for new users to track down. Reinstalling Nutch might fix it; perhaps an environment variable wasn't set correctly.
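For what it's worth, two things stand out in the java command printed in the question. The last classpath entry ends in `*.jar`, and the JVM only expands a bare `*` (for example `lib\*`), so `*.jar` is taken as a literal file name. The `apache-nutch-2.1.jar` that `find` located under `build` does not appear on the classpath at all. Below is a rough sketch of a check for both. `check_classpath` is a made-up helper, not part of Nutch, and it assumes the Crawler class is packaged in that jar, which the thread does not confirm. It only inspects the classpath string and does not touch the filesystem.

```shell
#!/bin/sh
# check_classpath CLASSPATH JARNAME
# Hypothetical helper (not part of Nutch): inspects a Windows-style
# classpath string and reports entries that look wrong.
check_classpath() (
  set -f      # keep '*' literal while the string is split
  IFS=';'     # Windows JVMs separate classpath entries with ';'
  found=no
  for entry in $1; do
    case $entry in
      *'*.jar') printf 'literal wildcard, JVM expands only a bare *: %s\n' "$entry" ;;
      ?)        printf 'suspicious one-character entry: %s\n' "$entry" ;;
    esac
    case $entry in
      *"$2") found=yes ;;   # some entry ends with the expected jar name
    esac
  done
  [ "$found" = yes ] || printf 'no entry ends with: %s\n' "$2"
)

# Classpath copied from the java command in the question.
check_classpath 'C:\apache-nutch-2.1-src\apache-nutch-2.1\src\conf;C;C:\Java\jdk1.7.0_07\lib\tools.jar;C:\apache-nutch-2.1-src\apache-nutch-2.1\src\lib\*.jar' 'apache-nutch-2.1.jar'
```

For the classpath in the question this reports the stray `C` entry, the `*.jar` entry, and the missing jar. If the jar really does hold the class, adding it (or the directory containing it, with a bare `*`) to the classpath would be the thing to try before reinstalling.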
ASKER
nutch.txt