sharding

Database sharding and Rails

What's the best way to deal with a sharded database in Rails? Should the sharding be handled at the application layer, the active record layer, the database driver layer, a proxy layer, or something else altogether? What are the pros and cons of each? ...

MySQL Partitioning / Sharding / Splitting - which way to go?

We have an InnoDB database that is about 70 GB and we expect it to grow to several hundred GB in the next 2 to 3 years. About 60 % of the data belong to a single table. Currently the database is working quite well as we have a server with 64 GB of RAM, so almost the whole database fits into memory, but we’re concerned about the future wh...

When people talk about scaling a website with 'shards', what do they mean?

I have heard the 'shard' technique mentioned several times with regard to solving scaling problems for large websites. What is this 'shard' technique and why is it so good? ...

Extreme Sharding: One SQLite Database Per User

I'm working on a web app that is somewhere between an email service and a social network. I feel it has the potential to grow really big in the future, so I'm concerned about scalability. Instead of using one centralized MySQL/InnoDB database and then partitioning it when that time comes, I've decided to create a separate SQLite databa...

Searching across shards?

Short version: If I split my users into shards, how do I offer a "user search"? Obviously, I don't want every search to hit every shard. Long version: By shard, I mean having multiple databases where each contains a fraction of the total data. For (a naive) example, the databases UserA, UserB, etc. might contain users whose names begin ...
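
The naive scheme in the excerpt above can be sketched as follows. This is only an illustration: it assumes the truncated sentence means "names beginning with A go to UserA, with B to UserB, and so on", and the function names `shard_for_username` and `shards_for_prefix_search` are hypothetical. It also shows why search is the hard part: a name-prefix search maps to one shard, while anything else has to fan out to all of them.

```python
import string

# Hypothetical shard names following the excerpt's "UserA, UserB, ..." pattern.
# Assumption: one shard per first letter of the user name.
SHARDS = {letter: f"User{letter}" for letter in string.ascii_uppercase}


def shard_for_username(username: str) -> str:
    """Return the shard that would hold this user under first-letter sharding."""
    first = username.strip()[:1].upper()
    if first not in SHARDS:
        raise ValueError(f"no shard for user name {username!r}")
    return SHARDS[first]


def shards_for_prefix_search(prefix: str) -> list[str]:
    """Return the shards a name search has to query.

    A search anchored on the start of the name touches a single shard.
    Without a usable prefix, every shard has to be queried.
    """
    first = prefix.strip()[:1].upper()
    if first in SHARDS:
        return [SHARDS[first]]
    return sorted(SHARDS.values())
```

Searching on any attribute other than the sharding key (an e-mail address, say) falls into the fan-out case, which is exactly the problem the question raises.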

Resources for Database Sharding and Partitioning

I'm working with a database schema that is running into scalability issues. One of the tables in the schema has grown to around 10 million rows, and I am exploring sharding and partitioning options to allow this schema to scale to much larger datasets (say, 1 billion to 100 billion rows). Our application must also be deployable onto se...

Best way to move a data row to another shard?

The question says it all. Example: I'm planning to shard a database table. The table contains customer orders which are flagged as "active", "done" and "deleted". I also have three shards, one for each flag. As far as I understand, a row has to be moved to the right shard when the flag is changed. Am I right? What's the best way to ...

Web scripting language with parallel non-blocking database access?

My webapp will need to use multiple database shards, and will occasionally need to query these shards in parallel. Are there any web scripting languages that have mature, stable support for parallel non-blocking database access? If so, can you point me in the right direction? Free open source is preferred, but I mostly want something that ...

Sharding with ASP.NET's SqlMembershipProvider?

I'm considering writing a blog hosting app in ASP.NET MVC. I'm new to .NET, but I'm reasonably competent in the LAMP world. My question concerns horizontal scaling of user data. Each user with a blog would have something like 6 tables in a database. I'd like to plan for horizontal scaling so that 20% of the users could be on one data...

MySQL Simple Table Synchronization?

OK, so I'm developing a website which to begin with will have 3 clear sub sites: Forum, News and a Calendar. Each sub site will have its own database, and common to all of these databases will be a user table which needs to be in each database so that joins can be done. How can I synchronize all the user tables so that it doesn't matte...

Database Sharding Support in Propel

I'm wondering how good Propel's support for database sharding is. I am thinking about creating my application in PHP, using MySQL as the database server and Propel as the ORM. I figure that it may be good to keep the architecture scalable right from the start, just in case my application takes off. What's your take? ...

How to create unique row ID in sharded databases?

In a non-sharded DB, I could just use auto-increment to generate a unique ID to reference a specific row. I want to shard my DB, say into 12 shards. Now when I insert into a specific shard, the auto-increment ID is no longer unique across shards. I would like to hear anyone's experience in dealing with this problem. ...
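
One common way to keep per-shard counters from colliding is to interleave them: every shard counts in steps of the shard count, starting from a different offset. The sketch below illustrates that arithmetic for the 12 shards mentioned in the question. It is one possible approach, not the answer given in the thread, and the names `make_id_generator` and `shard_of` are hypothetical. MySQL exposes the same idea through the `auto_increment_increment` and `auto_increment_offset` system variables.

```python
from itertools import count

NUM_SHARDS = 12  # shard count taken from the question


def make_id_generator(shard_index: int, num_shards: int = NUM_SHARDS):
    """Return an iterator of IDs that only this shard will ever produce.

    Shard 0 yields 1, 13, 25, ...; shard 1 yields 2, 14, 26, ...; and so on.
    """
    if not 0 <= shard_index < num_shards:
        raise ValueError("shard_index out of range")
    return count(start=shard_index + 1, step=num_shards)


def shard_of(row_id: int, num_shards: int = NUM_SHARDS) -> int:
    """Recover the shard index that issued a given ID."""
    return (row_id - 1) % num_shards
```

A side effect of this layout is that the ID itself tells you which shard holds the row. The drawback is that changing the number of shards later means changing the step, so the shard count is effectively baked into the IDs.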

MySQL Proxy Alternatives for Database Sharding

Are there any alternatives to MySQL Proxy? I don't want to use it since it's still in alpha. I will have 10 MySQL servers with table_1 table_2 table_3 table_4 ... table_10 spread across the 10 servers. The tables are identical in structure; they're just shards with different data sets. Is there an alternative to MySQL Proxy, where...

How do I determine if an object exists for a given key in the Google AppEngine datastore using Java?

I'm trying to port the Sharding Counters example (code.google.com/appengine/articles/sharding_counters.html) to Java. The only problem is that the Java API does not have a call similar to Python's 'get_by_key_name'. This is the basic idea: Transaction tx = pm.currentTransaction(); Key key = KeyFactory.createKey(C...

What is a good way to shard horizontally in PostgreSQL?

Hey guys, what is a good way to shard horizontally in PostgreSQL? 1. pgpool-II 2. GridSQL. Which is the better way to do sharding? Also, is it possible to partition without changing client code? It would be great if someone could share a simple tutorial or cookbook example of how to set up and use sharding. ...

Have any good links/articles on job queuing & db sharding?

Does anyone have good links on how/when/why to use job queuing to scale web apps? Also, articles on db sharding would be useful too :) ...

How do we do horizontal sharding/partitioning in PostgreSQL using pgpool-II?

GridSQL and pgpool-II are partitioning tools for PostgreSQL. GridSQL is designed for reporting business applications, pgpool-II for transactional systems. Can someone show me how to do a horizontal partition on the bigint column uid on table users? Thanks ...

Can someone show how to set up pgpool-II in parallel *query* mode, i.e. horizontal partitioning?

I did read both extensively, but after finishing all the steps, parallel mode (horizontal partitioning mode) does not work! This is my conf file (backend_hostname, backend_port, backend_weight); here are examples: backend_hostname0 = 'localhost' backend_port0 = 5432 backend_weight0 = 1 backend_data_directory0 = '/mnt/work/database' backend...

Data Sharding

I'm interested in sharding my website's user data across multiple servers. For example, users will log in from the same place, but the login script needs to figure out what server that user's data resides on. So the login script would query the master registry for that user name, and it might return that it's on server B. The login scrip...
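
The lookup step described in this excerpt (ask a master registry which server holds a user, then talk to that server) can be sketched as below. This is a minimal illustration under stated assumptions: the registry is an in-memory dict standing in for a real registry database, and the server labels, host names, and the function `locate_user` are all hypothetical.

```python
# Hypothetical master registry: user name -> server label.
# In a real deployment this would be a small, heavily cached database table.
MASTER_REGISTRY = {
    "alice": "A",
    "bob": "B",
}

# Hypothetical mapping from server label to connection target.
SERVER_HOSTS = {
    "A": "db-a.example.internal",
    "B": "db-b.example.internal",
}


def locate_user(username: str) -> str:
    """Return the host holding this user's data, per the master registry."""
    try:
        label = MASTER_REGISTRY[username]
    except KeyError:
        raise LookupError(f"user {username!r} is not in the registry") from None
    return SERVER_HOSTS[label]
```

With a registry like this, moving a user to another server only requires copying the data and updating one registry entry, at the cost of an extra lookup on every login and a registry that must stay available.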

Distributed Key-Value Data Store with Offline Access (Static Partitioning)

Need to be able to set server(s) that replicate all information, as a master data store that has all the data. Also need servers that specifically store/replicate certain data, available in local LANs, so that when the internet connection goes down, they can still access their local data. Under normal circumstances, the clients will ac...