partitioning

Partitioning a table in MySQL after creation

I have a table with a bunch of data already in it. I know how to create a partitioned table or alter an already existing partitioned table, but can I add partitions to a table after it has been created, has data in it, without losing the data? The other option is to dump all the data, recreate the table with the partitions and then ins...

Find all possible partitions of n elements with k-sized subsets, where two elements share same set only once

I have n elements that need to be partitioned into x sets, each set has to hold exactly k=4 elements. I need to find all possible partitions with the constraint that each pair of elements only shares the same set once. So if I start with [1 2 3 4] [5 6 7 8] [...], all consecutive partitions cannot hold e.g. [1 2 X X] or [X X 1 3]. sets...

How do I INSERT and SELECT data with partitioned tables?

I set up a set of partitioned tables per the docs at http://www.postgresql.org/docs/8.1/interactive/ddl-partitioning.html CREATE TABLE t (year, a); CREATE TABLE t_1980 ( CHECK (year = 1980) ) INHERITS (t); CREATE TABLE t_1981 ( CHECK (year = 1981) ) INHERITS (t); CREATE RULE t_ins_1980 AS ON INSERT TO t WHERE (year = 1980) DO INSTEA...

How do I ALTER a set of partitioned tables in Postgres?

I created a set of partitioned tables in Postgres, and started inserting a lot of rows via the master table. When the load process blew up on me, I realized I should have declared the id column BIGSERIAL (BIGINT with a sequence, behind the scenes), but inadvertently set it as SERIAL (INTEGER). Now that I have a couple of billion rows loaded...

Optimizing a Partition Function

Here is the code, in python: # function for pentagonal numbers def pent (n): return int((0.5*n)*((3*n)-1)) # function for generalized pentagonal numbers def gen_pent (n): return pent(int(((-1)**(n+1))*(round((n+1)/2)))) # array for storing partitions - first ten already stored partitions = [1, 1, 2, 3, 5, 7, 11, 15, 22, 30, 42] #...
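For comparison, here is a minimal sketch of the same idea, assuming the goal is to compute the partition numbers p(n) with Euler's pentagonal number recurrence. The function name partition_numbers and its limit parameter are illustrative and not taken from the question. It uses integer arithmetic only, which avoids the rounding risk of expressions like 0.5*n for large n.

```python
def partition_numbers(limit):
    """Return [p(0), ..., p(limit)] via Euler's pentagonal number recurrence."""
    p = [1] + [0] * limit  # p(0) = 1
    for n in range(1, limit + 1):
        total = 0
        k = 1
        while True:
            g1 = k * (3 * k - 1) // 2  # generalized pentagonal number for +k
            if g1 > n:
                break
            sign = 1 if k % 2 == 1 else -1  # signs alternate: +, -, +, ...
            total += sign * p[n - g1]
            g2 = k * (3 * k + 1) // 2  # generalized pentagonal number for -k
            if g2 <= n:
                total += sign * p[n - g2]
            k += 1
        p[n] = total
    return p
```

Calling partition_numbers(10) reproduces the eleven seed values listed in the question. Each p(n) needs only about sqrt(n) earlier terms, so the whole table costs roughly O(n^1.5) additions.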

Partitioned Data Map/Reduce

Hello everyone, I have written a custom partitioner for partitioning datasets. I want to partition two datasets using the same partitioner and then, in the next MapReduce job, have each mapper handle the same partition from the two sources and perform some operation such as a join. How can I ensure that one mapper gets the sp...

Tools for optimizing scalability of an Hadoop application?

I'm working with a team of mine on a small application that takes a lot of input (logfiles of a day) and produces useful output after several (now 4, in the future perhaps 10) map-reduce steps (Hadoop & Java). Now I've done a partial POC of this app and ran it on 4 old desktops (my Hadoop test cluster). What I've noticed is that if you ...

How to make a database appear partitioned by some columns with Active::Record

Suppose a column client_id is ubiquitous throughout our database, and for a given session or request, we will be 'in the context' of a client the whole time. Is there a way to simulate having each client's data stored in a separate database, while keeping them in the same table for simpler database management? What I want is similar ...

Is partitioning easier than sorting?

This is a question that's been lingering in my mind for some time ... Suppose I have a list of items and an equivalence relation on them, and comparing two items takes constant time. I want to return a partition of the items, e.g. a list of linked lists, each containing all equivalent items. One way of doing this is to extend the equiv...
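A minimal sketch of the baseline the question seems to start from, assuming only a constant-time pairwise test is available (no ordering, no hash). The names partition_by_equivalence and equivalent are hypothetical. Each item is compared against one representative per existing class, so the cost is O(n * k) comparisons for n items and k classes.

```python
def partition_by_equivalence(items, equivalent):
    """Group items into equivalence classes using only pairwise tests."""
    classes = []  # each class is a list; its first element is the representative
    for item in items:
        for cls in classes:
            # Transitivity means one representative per class is enough.
            if equivalent(cls[0], item):
                cls.append(item)
                break
        else:
            # No existing class matched, so the item starts a new class.
            classes.append([item])
    return classes
```

For example, partition_by_equivalence(range(7), lambda a, b: a % 3 == b % 3) groups the numbers by remainder modulo 3. When the relation can be extended to a total order or a hash, sorting or hashing would normally beat this quadratic worst case.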

Puzzle: Need an example of a "complicated" equivalence relation / partitioning that disallows sorting and/or hashing

From the question "Is partitioning easier than sorting?": Suppose I have a list of items and an equivalence relation on them, and comparing two items takes constant time. I want to return a partition of the items, e.g. a list of linked lists, each containing all equivalent items. One way of doing this is to extend th...

postgresql: using NEW.* in dynamic command for EXECUTE

Hi, I am trying to create a PL/pgSQL trigger for PostgreSQL 8.3 that automatically partitions a table on BEFORE INSERT by the id column. If the destination table doesn't exist, it is created, and the insert goes there. So I built the insert statement with the new table name like this: exec_insert := 'INSERT INTO '||TG_TABLE_SCHEMA||'.'||T...

Oracle Partition Pruning with bind variables

I have a large (150m+ row) table, which is partitioned into quarters using a DATE partition key. When I query the table using something like... SELECT * FROM LARGE_TABLE WHERE THE_PARTITION_DATE >= TO_DATE('1/1/2009', 'DD/MM/YYYY') AND THE_PARTITION_DATE < TO_DATE('1/4/2009', 'DD/MM/YYYY'); ... partition pruning works correctly...

Optimizing daily data storage in a relational db

Update: There was a comment that the question was not clear, and that I made a leap of logic in claiming that I would have 118 billion rows. I have edited the text below to clarify things. See the italicized text below. I have been struggling with this for a while now and have even gone down a few paths, but I turn now to the community for ideas. ...

Sql Server Dynamic Database Partitioning

Hi, is there anything called dynamic partitioning in SQL Server? If so, how can I implement it? ...

SQL Server 2008 Table Partitioning

I have a huge database that has several tables that hold several million records. It's holding engineering data for a client and continually grows. This is impacting performance, even with optimised indexing. So I've been looking at partitioning. However, I would be looking at partitioning on a version held in a table. In its most sim...

Partitioning in MySQL for SELECT and UPDATE queries

I am using MySQL and have a table with 3 integer columns: c1, c2, c3. The value of column c2 is different on every row. The value of column c1 changes around every 3 million rows. c3 has only two values (1 or 2). The primary key of the table is (c1, c2). The table can have millions of records and we are required to perform followin...

Number of all possible groupings of a set of values?

I want to find a combinatorial formula that, given a certain number of integers, gives the number of all possible groupings of these integers (such that all values belong to a single group). Say I have 3 integers: 1, 2, 3. There would be 5 groupings: 1 2 3; 1|2|3; 1 2|3; 1|2 3; 2|1 3. I have calculated these computationally for N = 3 t...
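The five groupings listed for three values are the set partitions of {1, 2, 3}, and their count matches the Bell number B(3) = 5. Assuming that is the quantity being asked for, here is a minimal sketch that generates Bell numbers with the Bell triangle. The function name bell_numbers is illustrative, not from the question.

```python
def bell_numbers(count):
    """Return the first `count` Bell numbers B(0), B(1), ... via the Bell triangle."""
    bells = []
    row = [1]  # first row of the Bell triangle
    for _ in range(count):
        bells.append(row[0])  # the first entry of row n is B(n)
        # The next row starts with the last entry of the current row; each
        # further entry adds the entry above-left to the one just written.
        next_row = [row[-1]]
        for value in row:
            next_row.append(next_row[-1] + value)
        row = next_row
    return bells
```

With this sketch, bell_numbers(6)[3] is 5, in agreement with the count above. There is no simple closed form for B(n), so a recurrence like this one is the usual way to compute it.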

How to do automatic data archiving in SQL Server?

Hi, I have a table that I want to archive automatically every day. To be clear, every day I want to take the information generated during that day and move it into another partition (of the same table), not into a separate archive table. That's because I want old data to be accessible with the same query as new data. I'm using SQL Server 2005, I'v...

MySQL partitioning and pruning issue, version 5.1.40

Hi all, I am running MySQL 5.1.40. I am able to partition the table as I wanted, but when I query using the key I partitioned on, pruning does not take effect: the query still considers all the partitions. Below are the code I used and its output, respectively. ALTER TABLE testTable REMOVE PARTITIONING; ALTER T...

Does it not make sense to have more than a few dozen partitions?

I store time-series simulation results in PostgreSQL. The db schema is like this. table SimulationInfo ( simulation_id integer primary key, simulation_property1, simulation_property2, .... ) table SimulationResult ( // The size of one row would be around 100 bytes simulation_id integer, res_date Date, res_...