bigtable

Bigtable vs Cassandra vs SimpleDB vs Dynamo vs CouchDB vs Hypertable vs Riak vs HBase: what do they have in common?

Sorry if this question is somewhat subjective. I am new to 'cloud store', 'distributed store', and similar concepts. I would like to know what they have in common and get an overview of all of them. What do I need to prepare if I want to write a product similar to these? ...

What is an SSTable?

In BigTable/GFS and Cassandra terminology, what is the definition of an SSTable? ...

What is a commit log?

In the context of Google's Bigtable, what does a commit log mean, and what is it used for? ...

What is the best way to create a running integer id on the AppEngine data storage?

For various reasons, I need a unique running integer id for my entities stored on the Google AppEngine. The automatically generated key sort of has this behaviour, but it doesn't start from 1 (or 0) and doesn't guarantee that the generated integer part will come from a continuous sequence. What would be the best way to efficiently impl...

Database that consumes less disk space

I'm looking at solutions to store a massive quantity of information while consuming the least possible disk space. The information structure is very simple and the queries will also be very simple. I've looked at solutions like Apache Cassandra and relational databases but couldn't find a comparison where disk usage is mentioned. Any ideas on ...

Hadoop Map/Reduce - simple use example to do the following...

I have a MySQL database where I store the following BLOB (which contains a JSON object) and an ID (for this JSON object). The JSON object contains a lot of different information, say, "city:Los Angeles" and "state:California". There are about 500k such records for now, but the number is growing, and each JSON object is quite big. My goal is to do ...

MySQL query performance help

Hi, I have a fairly large table storing the words contained in email messages:
mysql> explain t_message_words;
+----------------+---------+------+-----+---------+----------------+
| Field          | Type    | Null | Key | Default | Extra          |
+----------------+---------+------+-----+---------+----------------+
| mwr_key        | int(11...

Google App-Engine Java Batch Update

I need to upload a .csv file and save the records in Bigtable. My application successfully parses 200 records in the CSV file and saves them to the table. Here is my code to save the data:
    for (int i=0;i<lines.length -1;i++) // lines holds all records in the CSV file
    {
        String line = lines[i]; // each record has 3 columns: integer, integer, Tex...

Display image stored as blob using GWT RPC

I'd like to display images I've stored as Blobs in a GWT-rendered page using RPC. I don't want to use a servlet because then loading the images is synchronous, and having many images can slow down page load times. Any ideas? ...

How to model a follower stream in App Engine?

I am trying to design tables to build out a follower relationship. Say I have a stream of 140-character records that have a user, hashtag, and other text. Users follow other users, and can also follow hashtags. I am outlining the way I've designed this below, but there are two limitations in my design. I was wondering if others had smarter ways...

HTTP application to GET, PUT, DELETE

Hello there, do you know of an application that lets me use the GET, PUT, and DELETE HTTP methods in a simple way? I want to run it against Google's BigTable. Thanks a lot. ...

App Engine downtime

I've noticed that Google App Engine seems to have a fair amount of downtime during which the datastore is placed into read-only mode. Frequently this downtime is in the middle of the day. Is this something that is happening only during early development, or is this something that I can expect to always occur? I'm developing an app...

Scalability comparison between different DBMSs

By what factor does the performance (read queries/sec) increase when a machine is added to a cluster of machines running either a Bigtable-like database or MySQL? Google's research paper on Bigtable suggests that "near-linear" scaling can be achieved with Bigtable. This page here featuring MySQL's marketing jargon suggest...

Alternative databases to use when putting IIS Logs into a database using LogParser

We have run some scripts that use LogParser to dump our IIS logs into a SQL Server database. We can then query this to get simple stats on hits, usage, etc. It's also good for linking to error log databases and performance counter databases to compare usage with errors, etc. Having implemented this for just one system and for the las...

Database design - Google App Engine

I am working with Google App Engine and using the low-level Java API to access Bigtable. I'm building a SaaS application with 4 layers: client web browser, RESTful resources layer, business layer, and data access layer. I'm building an application to help manage my mobile auto detailing company (and others like it). I have to represent the...

Does having a large number of properties in an Entity affect datastore read/write performance?

I have a couple of entities with 40 to 50 properties each. All these properties are unindexed. These entities are part of a larger entity group tree structure, and are always retrieved by their key. None of the properties (except the key property) are indexed. I am using Objectify to work with entities on BigTabl...

Denormalization in Google App Engine?

Background: I'm working with Google App Engine (GAE) for Java. I'm struggling to design a data model that plays to Bigtable's strengths and weaknesses. These are two previous related posts: http://stackoverflow.com/questions/3120192/database-design-google-app-engine http://stackoverflow.com/questions/3125115/appointments-and-line...

Batch put with pre-defined keys on Google App Engine

I would like to do a batch put of entities with pre-defined keys using the low-level API for Java. You can do a batch get: Map<Key,Entity> get(Iterable<Key> keys) However, the batch puts all seem to want to allocate their own keys: List<Key> put(Iterable<Entity> entities) Documentation page: http://code.google.com/appengine/docs...

How to store tag cloud on Google App Engine for Java

Hi, I'm looking to store entities using GAE Java which have 1-many tags. I would like to display a tag cloud, so will need to know how many times each tag occurs (can't use aggregate functions and group by on GAE to do this like I would in SQL). And when I click a tag I would like to retrieve all entities with the selected tag. Does ...

Bigtable vs NoSQL

May I know whether 'NoSQL' has a limitation like Bigtable's, where we should 'denormalize' our tables/entities? Is there any API wrapper that allows us to write code once and use it for both Google App Engine Bigtable and NoSQL (something like Hibernate)? ...