Installing Nutch 1.0 on OS X

Today I started work on a small project that required a crawler, and Nutch seemed to do most of what I needed. The Nutch team conveniently released Nutch 1.0 late in March 2009, so I had a brand new release to test. Installing Nutch 1.0 on a Mac was not as straightforward as I expected. I ran into a lot of unexpected issues, so here is my cookbook description of how to successfully install Nutch 1.0 on your Mac.

  1. Download the latest source code from the Apache SVN repository. I tried running Nutch from the tarball without success. I also tried to compile the source from the tarball, but a post on the Nutch forum clearly states that this will not work.
  2. Set your JAVA_HOME and NUTCH_JAVA_HOME variables. Again, this is not straightforward: both need to point to your real installation of Java 1.6 (earlier versions of Java will fail). I set both variables to /System/Library/Frameworks/JavaVM.framework/Versions/1.6/Home, because I could not get the /Library/Java/Home symbolic link to work properly.
  3. Compile the source code using Ant (I built it in Eclipse).
  4. Set up your Nutch configuration by following the tutorial by Peter P. Wang.
  5. Run your first crawl with: ./bin/nutch crawl urls -dir crawl -depth 3 -topN 50
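
Steps 2, 3 and 5 can be sketched as shell commands. This is only a minimal sketch under a few assumptions: a POSIX-compatible shell, Ant available on your PATH, the function being called from the root of the Nutch checkout, and a `urls` directory with a seed list already set up as in the tutorial. The names `JAVA_16_HOME`, `require_dir` and `build_and_crawl` are hypothetical helpers made up for this example; they are not part of Nutch. Running plain `ant` (the build file's default target) is also an assumption about how you want to build.

```shell
#!/bin/sh
# Real Java 1.6 home, as quoted in step 2 (not the /Library/Java/Home symlink).
JAVA_16_HOME="/System/Library/Frameworks/JavaVM.framework/Versions/1.6/Home"

# Hypothetical helper: fail early if a required directory is missing.
require_dir() {
    if [ ! -d "$1" ]; then
        echo "missing directory: $1" >&2
        return 1
    fi
}

# Hypothetical wrapper: export both variables (step 2), build with Ant
# (step 3) and run the first crawl (step 5). Stops at the first failure.
build_and_crawl() {
    require_dir "$JAVA_16_HOME" || return 1
    export JAVA_HOME="$JAVA_16_HOME"
    export NUTCH_JAVA_HOME="$JAVA_16_HOME"
    ant || return 1
    ./bin/nutch crawl urls -dir crawl -depth 3 -topN 50
}
```

After sourcing the file, calling `build_and_crawl` from the checkout root runs the whole sequence. The directory check comes first so that a missing Java 1.6 installation shows up as a clear message rather than as an obscure build error.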

Most of the issues I encountered were related to the Java version and the fact that the /Applications/Utilities/Java/Java Preferences application does not really change where the JAVA_HOME directory /Library/Java/Home points. So make sure you have set both JAVA_HOME and NUTCH_JAVA_HOME, and do not let OS X fool you when it pretends to be symbolically linking to the 1.6 installation.
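
One way to see what a Java home path really is before trusting it is a small check like the one below. This is a sketch, not an official tool: `describe_java_home` is a hypothetical helper written for this post, and it only reports whether the path is a symbolic link (and its immediate target), a plain directory, or missing, and whether it contains an executable `bin/java`.

```shell
#!/bin/sh
# Hypothetical helper: describe what a Java home path really is.
describe_java_home() {
    dir="$1"
    if [ -L "$dir" ]; then
        # readlink prints the immediate target of a symbolic link.
        echo "symlink -> $(readlink "$dir")"
    elif [ -d "$dir" ]; then
        echo "directory"
    else
        echo "missing"
        return 1
    fi
    # A usable Java home must contain an executable bin/java.
    if [ ! -x "$dir/bin/java" ]; then
        echo "no executable bin/java under $dir" >&2
        return 1
    fi
}
```

For example, `describe_java_home /Library/Java/Home` shows where the link points, and `"$JAVA_HOME/bin/java" -version` then tells you which Java version that path actually runs.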

Good luck.

2.5 applications I really miss in OS X

In a previous post I wrote about my new life running on a MacBook Pro and OS X. It has now been over a month since I switched to this Unix hybrid, and I quite like it. It is very stable: I almost never turn off my Mac but put it to sleep instead, which works fine, and my last reboot was over two weeks ago. I have also gotten used to some of the weird new keys on the keyboard and the shortcuts, but I am not yet as efficient on a Mac as I am (or was) on Ubuntu and Windows. During the last few weeks I have discovered that Apple and others in most cases provide the applications I need, but not always, so here is the list of applications I really miss:

1. TortoiseSVN
A Windows application that integrates with Windows Explorer and provides an SVN client. I would say that this is the best graphical SVN client I have ever used. svnX, which I currently use on the Mac, is not a very good replacement.

2. Kate / Notepad++
Kate is a KDE text editor for Unix-based systems. Notepad++ is Kate's equivalent on Windows. Both editors provide a simple and intuitive user interface and a lot of syntax highlighting files for all the obscure programming languages you can think of.
I know the Mac has TextMate, but that is third-party software and you have to pay €48 or something for a license, which is probably what I will end up doing. TextMate is really good and provides most, if not all, of the functionality that Kate and Notepad++ provide.

In my desperation for a good text editor I almost went off and tried to install KDE on the Mac, but that was said to be experimental and could break my entire system, so that is a no-go for now. The article, however, was really interesting:


I still miss my Ubuntu system and will probably go off and install Parallels or VMware with Ubuntu, just to have it accessible :)

Porting to Mac

At Orange Bus we all work on MacBook Pros, and when I started at Orange Bus it was the very first time I used a Mac and OS X. It is now 12 days since that first experience with an Apple computer, and though I am still a bit unfamiliar with certain features I am getting into it, and I really like it. The things I have the most trouble with are:

The keyboard
I have some trouble finding the keys and key combinations I am used to from both Ubuntu and WinXP. And it probably does not help that the keyboard on the laptop is English, and my external keyboard at work is Norwegian (I was thinking it would be a good idea to bring a Norwegian keyboard to the UK, yeah right).

Changing windows
Moving between different sets of windows, and between the instances within each set, is still a bit unfamiliar, in particular moving between windows of the same application.

Universal access
Universal Access is a piece of software that, for instance, enhances readability on the Mac. The key command that activates speech on the Mac must be one I am used to from Windows or Ubuntu, because I am always turning it on, which is very annoying.

That was the “negative” part about the Mac, but those are just things I need to learn. Now let us have a look at the positive stuff in OS X.

I have heard it a million times from Mac users: it is so easy to use a Mac. I never really believed them, but they are correct. Using a Mac is extremely simple, and Apple must have some of the best UI designers in the world. Everything is so amazingly simple. Another thing you just have to love in OS X is how you install software. You open the install file and drag it into the Applications folder, and whoosh, the application is installed. I would go as far as to say that if you are a novice computer user, a Mac would probably be the best choice. Let's compare it to moving around: using a Mac is not much harder than walking, whereas Microsoft Windows would be something like controlling a space shuttle.

Well, I have been praising the Mac for its simplicity, but even though it is very easy to use you still have the possibility to do advanced stuff. For instance, since the Mac is based on Unix you have a terminal easily accessible and you can use almost all your regular commands :D

Mail is a Mac application for managing….. yes, you are right, mail. Mail is an okay application and works fine. It is not as good as Microsoft Outlook, but it is yet another example of how simple and intuitive an Apple product can be.

Quicksilver is a third-party application for launching applications easily. Press “ctrl” + “space” and start typing the name of the application you want to launch; it automatically finds the application and launches it when you hit enter. Brilliant!

At this point I cannot see one good reason for going back to MS Windows. I am still very fond of Linux distros like Ubuntu, but the Mac and OS X have really impressed me. So I suppose I have to line up with all the other Mac fanatics now (no, I am not a Mac fanatic, at least not yet).

