partitioning

How to scale out by evolving from database partitions to sharding?

Say I have a MySQL table: CREATE TABLE tweets ( tweet_id INT NOT NULL AUTO_INCREMENT, author_id INT NOT NULL, text CHAR(140) NOT NULL, PRIMARY KEY (tweet_id) ) PARTITION BY HASH(tweet_id) PARTITIONS 12; All is good. The table lives on a single server - Server1. But eventually I may want to scale out. So I'd want to shard the table and...

Data load to huge partitioned table

I have a huge table, first range partitioned by price_date, then hash partitioned by fund_id. The table has 430 million rows. Every day a batch job inserts 1.5 million to 3 million rows. We are looking to enable and disable local indexes (not all indexes but based on data which partitions are touched by d...

How to create a PostgreSQL partitioned sequence?

Is there a simple (i.e. non-hacky) and race-condition-free way to create a partitioned sequence in PostgreSQL? Example:

Using a normal sequence in Issue:

| Project_ID | Issue |
| 1 | 1 |
| 1 | 2 |
| 2 | 3 |
| 2 | 4 |

Using a partitioned sequence in Issue:

| Project_ID | Issue |
| 1 ...

MySQL 5.1 partitioning - do I have to remove the index/key element?

I have a table with several indexes. All of them contain a specific integer column. I'm moving to MySQL 5.1 and am about to partition the table by this column. Do I still have to keep this column as a key in my indexes, or can I remove it since partitioning will take care of searching only in the relevant keys data efficiently without need t...

What is the "m-bridge technique" for partitioning binary trees for parallel processing?

How does it work? Please explain in enough detail in English or pseudocode so that I can implement in any language. It is mentioned and briefly described in this paper: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.4.3643&rep=rep1&type=pdf but there isn't enough detail there to implement myself. (the weights in Fig...

Which granularity to choose for database table partitioning?

I have a 20-million-record table in a MySQL database. SELECTs work really fast because I have set up good indexes, but INSERT and UPDATE operations are getting really slow. The database is the back end of a web application under heavy load. INSERTs and UPDATEs are really slow because there are some 5 indexes on this table and index size i...

fdisk fix mis-aligned SSD to correct head and sector count

maletor@denmark:~$ sudo fdisk /dev/sdc

WARNING: DOS-compatible mode is deprecated. It's strongly recommended to
switch off the mode (command 'c') and change display units to
sectors (command 'u').

Command (m for help): p

Disk /dev/sdc: 64.0 GB, 64023257088 bytes
255 heads, 63 sectors/track, 7783 cylinders
Units =...

Picking ranges for splitting up a dataset

I have a few million integers between 0 and 64K. I'd like to split them up into N buckets, where each bucket contains about the same number of items from a contiguous range. So for example, if I only had a single datapoint with each possible value, and 64 buckets, ideally I'd end up with a bucket for 0-1024, one for 1025-2048, etc. ...
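One way to picture what is being asked for is an equal-frequency split driven by a histogram. The sketch below is only an illustration under assumptions: the function name `bucket_upper_bounds` and the `max_value` parameter are hypothetical, values are assumed to be integers in 0..65535, and the "close a bucket once it holds its share" rule is one possible heuristic, not a known answer to the question.

```python
def bucket_upper_bounds(values, n_buckets, max_value=65535):
    """Return the inclusive upper bound of each contiguous bucket.

    Hypothetical helper: assumes every value is an int in 0..max_value.
    """
    # Histogram: how many items carry each possible value.
    counts = [0] * (max_value + 1)
    for v in values:
        counts[v] += 1
    total = sum(counts)

    bounds = []
    running = 0
    for value, count in enumerate(counts):
        running += count
        # Cumulative item count at which the current bucket is "full".
        target = total * (len(bounds) + 1) / n_buckets
        if len(bounds) < n_buckets - 1 and running >= target:
            bounds.append(value)

    # The last bucket always ends at the top of the value range.
    if not bounds or bounds[-1] != max_value:
        bounds.append(max_value)
    return bounds


# One data point per possible value, 64 buckets: each bucket spans 1024 values.
bounds = bucket_upper_bounds(range(65536), 64)
print(bounds[:3])
```

With heavily skewed data a single value can hold more than one bucket's share, so buckets may be uneven and, in the extreme case, fewer than `n_buckets` bounds are returned.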

Partitioning table in sql server

Hi, I have a table with columns like empid, empname, cityname, statename, countryname. How should I split this table structure? Regards, Bharathi ...

Table with 80 million records and adding an index takes more than 18 hours (or forever)! Now what?

A short recap of what happened. I am working with 71 million records (not much compared to billions of records processed by others). On a different thread, someone suggested that the current setup of my cluster is not suitable for my need. My table structure is: CREATE TABLE `IPAddresses` ( `id` int(11) unsigned NOT NULL auto_incremen...

How to partition a MySQL table based on char column?

Is it possible to partition based on char column? After reviewing the MySQL 5.1 documentation it appears that only integer types can be used. Is this correct? Or can I use some function to convert the char into an integer? The char field in question contains a unique identifier. ...

How to generate a random partition from an iterator in Python

Given the desired number of partitions, the partitions should be nearly equal in size. This question handles the problem for a list. They do not have the random property, but that is easily added. My problem is, that I have an iterator as input, so shuffle does not apply. The reason for that is that I want to randomly partition the nodes...
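A minimal sketch of one single-pass approach, assuming the input length is unknown in advance: items are dealt out in rounds of k, and each round uses a freshly shuffled order of the partitions. The name `random_partition` and the `rng` parameter are hypothetical. Partition sizes differ by at most one, but this is not claimed to sample uniformly from all balanced partitions.

```python
import random


def random_partition(iterable, k, rng=random):
    """Split an iterable into k lists of nearly equal size, in one pass.

    Hypothetical helper: rng may be the random module or a random.Random.
    """
    parts = [[] for _ in range(k)]
    order = []  # partition indices still unused in the current round
    for item in iterable:
        if not order:
            # Start a new round: every partition gets exactly one item,
            # in a random order.
            order = list(range(k))
            rng.shuffle(order)
        parts[order.pop()].append(item)
    return parts


# Works on a plain iterator; no len() or shuffle of the input is needed.
parts = random_partition(iter(range(10)), 3)
print([len(p) for p in parts])
```

Within each partition the items keep their original relative order; shuffle each list afterwards if that matters.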

LINQ Partition List into Lists of 8 members.

How would one take a List (using LINQ) and break it into a List of Lists, partitioning the original list on every 8th entry? I imagine something like this would involve Skip and/or Take, but I'm still pretty new to LINQ. Edit: Using C# / .NET 3.5 ...

How to calculate axis ticks at a uniform distribution?

Given a data range by its minimum and maximum value, how can one calculate the distribution of buckets on an axis? The requirements are: - The axis ticks shall be evenly distributed (equal size). - Further, all ticks shall appear as numbers like 10, 20, 30, ... or -0.3, 0.1, 0.2, 0.4, ... - The calculation shall accept a parameter that...
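A common way to get evenly spaced, "round-looking" ticks is to snap the raw step to 1, 2, 5, or 10 times a power of ten. The sketch below is an illustration under assumptions, not an answer taken from the question: the names `nice_step` and `axis_ticks` and the `max_ticks` parameter are hypothetical, and `hi > lo` is assumed.

```python
import math


def nice_step(raw_step):
    """Round a positive step up to 1, 2, 5, or 10 times a power of ten."""
    exponent = math.floor(math.log10(raw_step))
    fraction = raw_step / 10 ** exponent  # normally in [1, 10)
    for nice in (1, 2, 5, 10):
        if fraction <= nice:
            return nice * 10 ** exponent
    # Guard against floating-point drift pushing fraction just above 10.
    return 10 * 10 ** exponent


def axis_ticks(lo, hi, max_ticks=10):
    """Return evenly spaced tick values that cover [lo, hi]. Assumes hi > lo."""
    step = nice_step((hi - lo) / max(max_ticks - 1, 1))
    first = math.floor(lo / step)  # index of the tick at or below lo
    last = math.ceil(hi / step)    # index of the tick at or above hi
    # round() trims float noise such as 0.6000000000000001.
    return [round(i * step, 10) for i in range(first, last + 1)]


print(axis_ticks(3, 97, max_ticks=6))  # → [0, 20, 40, 60, 80, 100]
```

Because the ticks are extended outward to cover the whole range, the result can contain one tick more than `max_ticks`; treat the parameter as a target rather than a hard limit.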

Partitioning the Users - multiple OpenIDs

There is a database of users. Let's say that I want to allow some users to have multiple OpenIDs and use them to log in, and let's say that I want to partition Users into multiple databases. Is there some solution for this? StackOverflow supports two OpenIDs per user, how would they do this? If the users could use only one ...

MySQL partition: how to deal with month and hash

Hi all, I have a specific question on MySQL sub-partitioning using hash on a date/datetime column. I have partitioned by site_id, and I now want to subpartition by month (1 to 12), so the number of partitions is fixed in time (sp_dec, sp_jan, ...). Current table structure (columns and other partitions cut): CREATE TABLE IF NOT EXISTS `...

SQL Server 2005 Partition table by foreign referenced data

Is there a canonical way to partition a table by data referenced in another table? For example:

timetable
    id
    datetime

bigtable
    id
    timetable_id -- foreign key
    .. other data ..

I want to partition bigtable by the datetime in timetable. Thanks. ...

digraph partitioning to subgraphs

Hello. Given a DAG with |V| = n and s sources, we have to present subgraphs such that each subgraph has approximately k1=√|s| sources and approximately k2=√|n| nodes. Define the height of the DAG to be the maximum path length from some source to some sink. We require that all subgraphs generated will have approximately the sa...

Automatically sharding MySQL?

Right now, I'm dealing with a TON (trust me) of data that needs to be available in real-time for fast reads and writes to customers. The backend storage system that we're using is Oracle, but we'd like to replace our big, beefy machines with a leaner system. For various reasons, we can't use Cassandra, and we're testing (but I'm scared...