Pruning Columns in Cassandra | ansaurus

+1 Q:

Pruning Columns in Cassandra

Hi,

I'm thinking about using Cassandra for a large data project. The data will be sourced from a traditional data warehouse. Cassandra will host the data, formatted in a way my application can read correctly.

I don't quite understand how I will prune the data from Cassandra.

For example, I want to count the number of visits a particular IP address has made to a website in the past 24 hours. I plan on generating this data every hour, and I'd like to keep 2 weeks of it per IP address. My column structure looks like:

127.0.0.1: {
  visitorsLast24Hours: {
    1279554672: 30,
    1279553072: 24,
    etc...
  }
}
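
As a rough illustration of this layout (not Cassandra client code), the row can be modelled in Python as a nested dictionary keyed by IP address, then column name, then Unix timestamp. The helper name `record_hourly_count` is hypothetical; the sketch only shows that an hourly write adds one entry, and that two weeks of hourly samples comes to 336 entries per IP.

```python
# In-memory model of the layout above; this does not talk to Cassandra.
HOURS_PER_DAY = 24
RETENTION_DAYS = 14

# row key (IP) -> column name -> {unix timestamp: visit count}
rows = {
    "127.0.0.1": {
        "visitorsLast24Hours": {
            1279554672: 30,
            1279553072: 24,
        }
    }
}


def record_hourly_count(rows, ip, timestamp, count):
    """Add one hourly sample (hypothetical helper); existing entries are untouched."""
    samples = rows.setdefault(ip, {}).setdefault("visitorsLast24Hours", {})
    samples[timestamp] = count
    return samples


# Entries retained per IP once two weeks of hourly samples have accumulated.
max_entries_per_ip = RETENTION_DAYS * HOURS_PER_DAY  # 336
```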

How do I remove rows from the visitorsLast24Hours column?

So far, the best solution I've come up with is to:

1. Get the column I want to work with
2. Prune the values I no longer want to keep
3. Delete the column from the database
4. Re-insert the new pruned column

This seems like a poor method for working with the database. I'm assuming my data sizes will balloon, based on the way storage is done in Cassandra.

Is there a more efficient way of doing it?

I'm currently working with phpcassa as my interface to Cassandra.

Thanks!

+1 A:

You don't have to delete and rewrite the entire column. Assuming visitorsLast24Hours is a SuperColumn, you can delete an individual subcolumn (one timestamp key) from within it. So you would slice the supercolumn for the keys that are older than your cutoff time and delete each of those. With a supercolumn you don't have to rewrite the entire dataset each time you add or delete a subcolumn. Items of interest are the slicing and deleting sections of the API docs: http://wiki.apache.org/cassandra/API06
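
A minimal sketch of that slice-then-delete pattern, written in Python against an in-memory dictionary rather than a real Cassandra client. The function name `prune_old_subcolumns`, the two-week retention constant, and the third sample entry are assumptions for illustration; with a real client, each `del` would instead be a remove call scoped to one subcolumn of the supercolumn.

```python
RETENTION_SECONDS = 14 * 24 * 60 * 60  # two weeks, as in the question


def prune_old_subcolumns(row, super_column, now, retention=RETENTION_SECONDS):
    """Delete subcolumns whose timestamp name is older than the cutoff.

    `row` is a dict standing in for one Cassandra row (hypothetical model).
    Returns the sorted list of subcolumn names that were deleted.
    """
    cutoff = now - retention
    subcolumns = row.get(super_column, {})
    # "Slice": pick only the subcolumn names older than the cutoff.
    expired = sorted(name for name in subcolumns if name < cutoff)
    # "Delete": remove each expired subcolumn; the rest are never rewritten.
    for name in expired:
        del subcolumns[name]
    return expired


# The first two entries come from the question; the third is made-up stale data.
row = {"visitorsLast24Hours": {1279554672: 30, 1279553072: 24, 1278000000: 7}}
deleted = prune_old_subcolumns(row, "visitorsLast24Hours", now=1279554672)
```

Only the expired entry is touched here; the two recent samples stay in place without being read back and re-inserted.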

Unoti 2010-07-23 04:42:15
