Archive for the ‘NoSQL’ Category
Great article on Super Columns and Cassandra
This is a great article describing super columns and the general structure of Cassandra. A must-read for NoSQL newbies.
http://arin.me/code/wtf-is-a-supercolumn-cassandra-data-model
Installing Cassandra and Thrift on OSX
Note that this installation description was written for Cassandra 0.5 and may not be correct for the current releases.
Cassandra is a distributed NoSQL database developed by Facebook. It is built to handle huge amounts of data and to perform CRUD operations quickly. The Cassandra site’s strapline says:
“The Apache Cassandra Project develops a highly scalable second-generation distributed database, bringing together Dynamo’s fully distributed design and Bigtable’s ColumnFamily-based data model.”
Thrift was also developed by Facebook. It is a software framework for service development and is used as an interface to Cassandra. The Thrift site says:
“Thrift is a software framework for scalable cross-language services development. It combines a software stack with a code generation engine to build services that work efficiently and seamlessly between C++, Java, Python, PHP, Ruby, Erlang, Perl, Haskell, C#, Cocoa, Smalltalk, and OCaml.”
Both Cassandra and Thrift are Apache Incubator projects.
Installing Cassandra
1. Download Cassandra
Download from: http://incubator.apache.org/cassandra/.
2. Create and set the correct paths in the storage-conf.xml
You can find the storage-conf.xml file in the conf directory under your Cassandra root directory.
My configuration file settings in storage-conf.xml:
<CommitLogDirectory>/Users/kristianlunde/tmp/cassandra-log/commitlog</CommitLogDirectory>
<DataFileDirectories>
    <DataFileDirectory>/Users/kristianlunde/workspaces/mysapient/cassandra/data</DataFileDirectory>
</DataFileDirectories>
<CalloutLocation>/Users/kristianlunde/workspaces/mysapient/cassandra/callouts</CalloutLocation>
<BootstrapFileDirectory>/Users/kristianlunde/workspaces/mysapient/cassandra/bootstrap</BootstrapFileDirectory>
<StagingFileDirectory>/Users/kristianlunde/workspaces/mysapient/cassandra/staging</StagingFileDirectory>
Note: You have to create all of these directories yourself for Cassandra to run properly.
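The directories can be created in one go. This is only a sketch: CASSANDRA_BASE is a hypothetical variable standing in for whatever base path you used in storage-conf.xml, and the subdirectory names assume you keep all five locations under that one base (my own commit log lives elsewhere, so adjust to match your file).

```shell
# Hypothetical base directory; substitute the paths you set in storage-conf.xml.
CASSANDRA_BASE="${CASSANDRA_BASE:-$HOME/cassandra}"

# One directory per location configured in storage-conf.xml.
for dir in commitlog data callouts bootstrap staging; do
    mkdir -p "$CASSANDRA_BASE/$dir"
done
```

mkdir -p also creates any missing parent directories and does not complain if a directory already exists, so the snippet is safe to run more than once.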
3. Set a log directory in the log4j.properties file
This file is in the same directory as storage-conf.xml.
4. Check that Java 6 is your default Java version
java -version
If you are running an earlier version of Java you will have to change it. Java 6 should already be installed on your Mac if you keep OS X up to date with Apple's automatic updates. You can change your Java version with the “Java Settings” application located in your /Applications/Utilities directory.
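If you want to check the version from a script rather than by eye, something like the following might do. It is a sketch that assumes the version string looks like 1.6.0_17 (the 1.x numbering Java used at the time) or starts with a plain major number; java_major is a hypothetical helper name, not part of any Java tooling.

```shell
# java_major: hypothetical helper that prints the major version from a
# version string such as "1.6.0_17" (-> 6) or "11.0.2" (-> 11).
java_major() {
    case "$1" in
        1.*) echo "$1" | cut -d. -f2 ;;  # old scheme: 1.<major>.<minor>
        *)   echo "$1" | cut -d. -f1 ;;  # newer scheme: <major>.<minor>
    esac
}

# Example with a literal string, so the snippet runs without Java installed.
if [ "$(java_major "1.6.0_17")" -ge 6 ]; then
    echo "Java 6 or later"
fi
```

To feed it the real version you would first have to pull the quoted string out of the java -version output, which is written to standard error rather than standard output.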
5. Starting Cassandra
You should be ready to go now. Navigate to the root directory of your Cassandra installation and start Cassandra by typing:
bin/cassandra -f
If you don't see any error messages, Cassandra is probably running as it should, so it is time to test it out.
Cassandra comes with a command-line interface (CLI) that allows you to run simple queries against the database. Note that the CLI is not as powerful as the Thrift interface. For instance, you cannot run get queries on Super Columns; those queries throw a Java exception.
To test the CLI, run the following command from the Cassandra root directory:
./bin/cassandra-cli --host localhost --port 9160
Inserting values into the keyspace:
cassandra> set Keyspace1.Standard1['blog-post']['name'] = 'Installing Cassandra and Thrift OSX'
Value inserted.
cassandra> set Keyspace1.Standard1['blog-post']['author'] = 'Kristian Lunde'
Value inserted.
Retrieving data from the keyspace:
cassandra> get Keyspace1.Standard1['blog-post']
(column=name, value=Installing Cassandra and Thrift OSX; timestamp=1258748376097)
(column=author, value=Kristian Lunde; timestamp=1258748405486)
Returned 2 rows.
cassandra>
Installing Thrift
Update: After I had installed Thrift I found this guide: http://wiki.apache.org/thrift/ThriftInstallationMacOSX. Following it will probably avoid the issues I had compiling Thrift.
1. Download Thrift
Download from http://incubator.apache.org/thrift/download/ and extract it.
2. Check that you have installed the following:
- g++ 3.3.5+
- Runtime libraries for lex and yacc might be needed for the compiler.
- boost 1.33.1+ (1.34.0 for building all tests) http://www.boost.org/.
I had to install boost manually:
sudo port install boost
Note: the boost installation might take a while. It took about 5 to 10 minutes on my MacBook Pro (2.53 GHz).
You can see the full requirements for Thrift at http://wiki.apache.org/thrift/ThriftRequirements.
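Before starting the build it can save time to confirm that the compiler tools are on your PATH. The following is a rough sketch, not an official check: have_tool is a hypothetical helper name, and the loop only tests that each command exists. It does not verify version numbers or whether boost is installed.

```shell
# have_tool: hypothetical helper, succeeds if the command is on the PATH.
have_tool() {
    command -v "$1" >/dev/null 2>&1
}

# Build tools from the requirements list above.
for tool in g++ lex yacc; do
    if have_tool "$tool"; then
        echo "found:   $tool"
    else
        echo "missing: $tool"
    fi
done
```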
3. Start the installation
kristian-lundes-macbook-pro:thrift kristianlunde$ ./bootstrap.sh
configure.ac:26: installing `./missing'
configure.ac:26: installing `./install-sh'
compiler/cpp/Makefile.am: installing `./depcomp'
configure.ac: installing `./ylwrap'
kristian-lundes-macbook-pro:thrift kristianlunde$ ./configure
For me, ./configure failed with this error message:
./configure: line 20722: syntax error near unexpected token `MONO,'
./configure: line 20722: ` PKG_CHECK_MODULES(MONO, mono >= 2.0.0, net_3_5=yes, net_3_5=no)'
To fix this I had to copy pkg.m4 from /opt/local/share/aclocal/ (the MacPorts location) into the aclocal directory of my Thrift checkout.
From your Thrift root directory, run:
cp /opt/local/share/aclocal/pkg.m4 aclocal
Thanks to http://aaronspotlatch.appspot.com/archive/Jul-2008 and http://qslack.com/post/thrift-macosx-104 for pointing me in the right direction.
Then re-run ./bootstrap.sh and ./configure so that the copied macro is picked up. After that you should be ready to run make:
make
and
sudo make install
You should now be able to run Thrift on your Mac:
thrift
If both the Cassandra and Thrift installations were successful, you should now be ready to build your amazing application with Cassandra.
I will try to post another blog post shortly on using Cassandra, Thrift and PHP. Stay tuned.