<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Kristian Lunde &#187; nutch</title>
	<atom:link href="http://www.klunde.net/tag/nutch/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.klunde.net</link>
	<description>www.klunde.net</description>
	<lastBuildDate>Sun, 18 Jul 2010 21:00:28 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=abc</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>Installing nutch 1.0 on OSX</title>
		<link>http://www.klunde.net/2009/04/07/installing-nutch-10-on-osx/</link>
		<comments>http://www.klunde.net/2009/04/07/installing-nutch-10-on-osx/#comments</comments>
		<pubDate>Tue, 07 Apr 2009 21:15:05 +0000</pubDate>
		<dc:creator>Kristian Lunde</dc:creator>
				<category><![CDATA[Java]]></category>
		<category><![CDATA[Mac]]></category>
		<category><![CDATA[nutch]]></category>
		<category><![CDATA[OS X]]></category>

		<guid isPermaLink="false">http://www.klunde.net/?p=275</guid>
		<description><![CDATA[
			
				
			
		
Today I started to work on a little project that required a crawler, and Nutch seemed to do most of what I needed. The Nutch team conveniently released Nutch 1.0 late in March 2009, so I had a brand new release to test out. Installing Nutch 1.0 on a Mac is not as straightforward [...]]]></description>
			<content:encoded><![CDATA[
<p>Today I started to work on a little project that required a crawler, and <a href="http://lucene.apache.org/nutch/" onclick="urchinTracker('/outgoing/lucene.apache.org/nutch/?referer=');">Nutch</a> seemed to do most of what I needed. The Nutch team conveniently released Nutch 1.0 late in March 2009, so I had a brand new release to test out. Installing Nutch 1.0 on a Mac was not as straightforward as I thought. I ran into a lot of unexpected issues, so here is my cookbook description of how to successfully install Nutch 1.0 on your Mac.</p>
<ol>
<li>Download the latest source code from the Apache SVN repository <i>http://svn.apache.org/repos/asf/lucene/nutch/</i>. I tried running Nutch from the tarball without success. I also tried to compile the source from the tarball, but a post on the Nutch forum clearly states that this will not work.</li>
<li>Set your <b>JAVA_HOME</b> and <b>NUTCH_JAVA_HOME</b> variables. Again, this is not straightforward: both need to point to your real installation of Java 1.6 (earlier versions of Java will fail). I set these variables to <i>/System/Library/Frameworks/JavaVM.framework/Versions/1.6/Home</i>, because I could not get the <i>/Library/Java/Home</i> symbolic link to work properly.</li>
<li>Compile the source code using Ant (I built it in Eclipse).</li>
<li>Set up your Nutch configuration by following the <a href="http://zillionics.com/resources/articles/NutchGuideForDummies.htm" class="broken_link" onclick="urchinTracker('/outgoing/zillionics.com/resources/articles/NutchGuideForDummies.htm?referer=');">tutorial by Peter P. Wang</a>.</li>
<li>Run your first crawl with: <i>./bin/nutch crawl urls -dir crawl -depth 3 -topN 50</i></li>
</ol>
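<p>The environment setup from step 2 could look roughly like the snippet below. It is only a sketch: it assumes a Bourne-style shell such as bash and that Java 1.6 is installed at the path quoted above. The variable name <i>JAVA_16_HOME</i> is my own shorthand, not something Nutch reads. The build and crawl commands from steps 3 and 5 are left as comments so the snippet can be pasted without starting a build.</p>
<pre><code># Assumption: Apple's Java 1.6 lives at the path quoted in step 2.
# JAVA_16_HOME is a hypothetical helper variable, used only in this snippet.
JAVA_16_HOME="/System/Library/Frameworks/JavaVM.framework/Versions/1.6/Home"

# Point both variables at the real installation, not at the symbolic link.
export JAVA_HOME="$JAVA_16_HOME"
export NUTCH_JAVA_HOME="$JAVA_16_HOME"

# Afterwards, from the root of the checked-out Nutch source:
#   ant
#   ./bin/nutch crawl urls -dir crawl -depth 3 -topN 50
</code></pre>
<p>Putting the two <i>export</i> lines in your shell profile keeps the setting across terminal sessions.</p>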
<p>Most of the issues I encountered were related to the Java version and the fact that the <i>/Applications/Utilities/Java/Java Preferences</i> application does not really change the <b>JAVA_HOME</b> directory <i>/Library/Java/Home</i> properly. So make sure you have set both <b>JAVA_HOME</b> and <b>NUTCH_JAVA_HOME</b>, and that OS X does not fool you when it pretends to be symbolically linking to the 1.6 installation.</p>
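<p>One way to avoid being fooled is to check a candidate directory before exporting it. The helper below is hypothetical (it is not part of Nutch or OS X) and assumes a Bourne-style shell. It only tests whether the directory contains an executable <i>bin/java</i>; it does not verify the Java version.</p>
<pre><code># Hypothetical helper: succeeds only if the given directory
# contains an executable bin/java.
check_java_home() {
    if [ -x "$1/bin/java" ]; then
        echo "ok: $1"
    else
        echo "no java executable under: $1"
        return 1
    fi
}
</code></pre>
<p>For example, run <i>check_java_home "$JAVA_HOME"</i>, and then <i>"$JAVA_HOME/bin/java" -version</i> to confirm that the installation really reports 1.6.</p>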
<p>Good luck.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.klunde.net/2009/04/07/installing-nutch-10-on-osx/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
	</channel>
</rss>
