
WTF is a Super column part two



prosperent brian
05-25-2011, 12:59 PM
There is a fairly well-known blog post from an engineer at Digg from a few months ago describing the hurdles they hit converting their datastore over to Cassandra (http://arin.me/blog/wtf-is-a-supercolumn-cassandra-data-model). One of the most confusing things when starting to work with Cassandra is figuring out how the data needs to be structured and whether you should use a column family or a super column family.

We came up with a simple explanation today after reading that all sub columns have to be deserialized when you access a single sub column within a super column. A super column family is basically a regular column family with each value stored as a serialized array. It looks like you have multiple rows of columns, but in reality you have the same basic data structure as a column family, just with serialized arrays stored inside it.

This is important to understand because the newer releases of Cassandra have extra functionality built in. You can query column families with CQL (Cassandra Query Language), but it doesn't work with super columns (probably due to the serialized nature of the data). There are also secondary indexes, which again don't work with super columns (again because of how the data is stored).

So, given this realization, we remodeled everything to no longer use super columns. Cassandra and the data model became quite simple once we wrapped our heads around all of this. Using composite keys makes things even easier because we rarely need secondary indexes, especially in cases where they are not recommended (a column that has many unique values).

When modeling our user table we ran into an issue where we would have needed a secondary index on the e-mail address field so we could look up a user id based on the e-mail they used to log into the system. With a composite key, we can put the user id and e-mail in a single key, which lets us get both pieces of data and slice out what we need without a secondary index and without making multiple calls.
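To make the e-mail lookup idea a bit more concrete, here is a rough sketch in plain Python. It does not talk to Cassandra at all. It only imitates one wide row whose column names are composites kept in sorted order, which is what makes a prefix slice possible. The `WideRow` class, the `user_id_for_email` helper, the (e-mail, user id) ordering and the sample addresses are all made up for illustration and are not necessarily the layout described in the post.

```python
import bisect


class WideRow:
    """Toy stand-in for one row: columns kept sorted by composite name."""

    def __init__(self):
        self._names = []   # sorted list of composite column names (tuples)
        self._values = {}  # composite name -> column value

    def insert(self, name, value=""):
        # Keep names sorted so a prefix slice is a contiguous range.
        if name not in self._values:
            bisect.insort(self._names, name)
        self._values[name] = value

    def slice_prefix(self, prefix):
        """Return (name, value) pairs whose composite name starts with prefix."""
        start = bisect.bisect_left(self._names, prefix)
        result = []
        for name in self._names[start:]:
            if name[:len(prefix)] != prefix:
                break  # past the end of the matching range
            result.append((name, self._values[name]))
        return result


def user_id_for_email(row, email):
    """Hypothetical helper: one slice on the e-mail prefix yields the user id."""
    matches = row.slice_prefix((email,))
    return matches[0][0][1] if matches else None


# Assumed layout: composite column name = (e-mail, user id).
row = WideRow()
row.insert(("alice@example.com", 1001))
row.insert(("bob@example.com", 1002))
```

Calling `user_id_for_email(row, "bob@example.com")` does a single slice and pulls the user id out of the composite name, so no separate index is consulted.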

monalisa
05-25-2011, 09:45 PM
This rant blew my top off. Brian, it would help if you can start a tutorial section on Cassandra for us.

prosperent brian
05-25-2011, 11:03 PM
It's a very, very different type of datastore. It is really designed for web-scale applications. We have a pretty serious MySQL server with 24 processor cores, 8 drives in a RAID 10 array and 48GB of RAM. With that we handle about 1500 queries per second. Our small Cassandra cluster of 3 machines was benchmarked yesterday at 100,000 write operations per second (not even touching read capacity yet). At any rate, for 99 percent of apps out there, MySQL is going to be the easiest and best fit for the job. Once you get to the point where a single box isn't enough, or you have more writes than reads (which isn't a good fit for replication), you can look at Cassandra. Until then, MySQL is going to be easier to manage. Our issue is that we need to scale to billions of API requests/impressions per day, and MySQL is never going to do that in a linear fashion and handle our write rate, so switching now is going to be "easier". Still, we have to write custom Python systems to bridge between the Java-based Cassandra and our PHP front end. Not a small undertaking.

AcidRaZor
05-26-2011, 12:06 AM
Ah I see... plus after it's done the techs can fuck up as much as they want, data would be intact, right? :D

prosperent brian
05-26-2011, 06:20 AM
Exactly. You set a replication factor on the cluster, so each piece of data is on multiple nodes automatically. You also set a read and write consistency level, so you are reading from or writing to multiple nodes at a time. If data is stale on one of the nodes, it is automatically repaired during a read based on internal timestamps. You also get rack awareness, so the cluster knows which rack the servers are on and the most efficient path to take to communicate. Last, and very importantly, you can do multi-datacenter replication. That is normally a very difficult one because of the network latency involved.

In the end, we can lose multiple nodes, or even an entire datacenter, and still be online.
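For anyone trying to picture how the replication factor, consistency levels and timestamp-based repair fit together, here is a simplified sketch in plain Python. It is a toy model and not how Cassandra is implemented internally. The function names, the dict-per-replica layout and the sample data are all assumptions made for illustration.

```python
def quorum(replication_factor):
    """Majority of the replicas, e.g. 2 of 3."""
    return replication_factor // 2 + 1


def reads_see_latest_write(replication_factor, read_nodes, write_nodes):
    """True when every read set must overlap every write set (R + W > RF)."""
    return read_nodes + write_nodes > replication_factor


def read_with_repair(replicas, key, read_count):
    """Read from the first read_count replicas and repair stale copies.

    Each replica is modelled as a dict mapping key -> (timestamp, value).
    The newest timestamp wins and is written back to the contacted replicas.
    """
    contacted = replicas[:read_count]
    versions = [r[key] for r in contacted if key in r]
    if not versions:
        return None
    newest = max(versions, key=lambda v: v[0])
    for r in contacted:
        if r.get(key) != newest:
            r[key] = newest  # toy version of repair-on-read
    return newest[1]


# Three replicas (replication factor 3); the first one holds a stale copy.
replicas = [
    {"user:1": (1, "old@example.com")},
    {"user:1": (2, "new@example.com")},
    {"user:1": (2, "new@example.com")},
]
value = read_with_repair(replicas, "user:1", quorum(3))
```

With a replication factor of 3, quorum reads and quorum writes each touch 2 nodes, so the read always overlaps the latest write. In this toy run the stale first replica is overwritten with the newer value as a side effect of the read.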

gnarlyhat
05-30-2011, 09:34 AM
I need this setup :D