views:

431

answers:

3

I am not very familiar with databases, and so I do not know how to partition a table using sqlachemy...

Your help would be greatly appreciated.

+3  A: 

It's quite an advanced subject for somebody not familiar with databases, but try Essential SQLAlchemy (you can read the key parts on Google Book Search -- p 122 to 124; the example on p. 125-126 is not freely readable online, so you'd have to purchase the book or read it on commercial services such as O'Reilly's Safari -- maybe on a free trial -- if you want to read the example).

Perhaps you can get better answers if you mention whether you're talking about vertical or horizontal partitioning, why you need partitioning, and what underlying database engines you are considering for the purpose.

Alex Martelli
sorry, hehe :) i think it`s not about vertical or horizontal partitioning ?My situation is that i have a database table which is >billion level, so i think it`s better to make partitions to this table so that my query and insert can be faster...But i do not know how to use sqlalchemy for table-partitioning
Andy
vertical and horizontal partitioning (i.e., by rows and by columns!) are what you can do in SQL. Oracle and other specific (typically commercial) "enterprise level" offerings may offer other options -- but it's impossible to go further with no idea of what Enterprise DB you're using (and then it will probably take tools specific to that particular DB to use its non-standard extras).
Alex Martelli
Btw, ">billion level" means nothing to me -- what's a level? A billion bytes is nothing (easily and rapidly handled by any engine, just make good indices!), a billion rows, it depends on how much data per row, but may be worth partitioning -- but how to partition sensibly depends on your data and queries of interest. It IS an advanced subject, you know!-) SQLAlchemy's the least of your worries.
Alex Martelli
ye..i know...i am using oracle DB, and the table rows are >billion, data per row is not too much. I can not describe how the program insert and query clearly(advanced to me :( ), but if i want to make some partitions for the table from date , how should i do with sqlalchmey? :(...
Andy
You shouldn't do Oracle-specific partitioning (interval, reference, list, range, composite, region, ...) with SQLAlchemy, but with Oracle-specific tools, see http://www.oracle.com/solutions/business_intelligence/partitioning.html -- actually you should have a highly skilled Oracle DBA do it for you, it's an extremely specialized job (I'm a very experienced programmer, not a DBA, and I wouldn't trust myself with such a task). Would you have a cardiologist drill your root canal -- or, do it yourself?-) No, you'd go to a dentist. Same here.
Alex Martelli
hehe, got that! thsnks a lot Alex!
Andy
+2  A: 

There are two kinds of partitioning: Vertical Partitioning and Horizontal Partitioning.

From the docs:

Vertical Partitioning

Vertical partitioning places different kinds of objects, or different tables, across multiple databases:

engine1 = create_engine('postgres://db1')
engine2 = create_engine('postgres://db2')
Session = sessionmaker(twophase=True)
# bind User operations to engine 1, Account operations to engine 2
Session.configure(binds={User:engine1, Account:engine2})
session = Session()

Horizontal Partitioning

Horizontal partitioning partitions the rows of a single table (or a set of tables) across multiple databases.

See the “sharding” example in attribute_shard.py

Just ask if you need more information on those, preferably providing more information about what you want to do.

nosklo
thanks a lot ! nosklo!But i what i wanted to do is database level partitioning...
Andy
+1  A: 

Automatic partitioning is a very database engine specific concept and SQLAlchemy doesn't provide any generic tools to manage partitioning. Mostly because it wouldn't provide anything really useful while being another API to learn. If you want to do database level partitioning then do the CREATE TABLE statements using custom Oracle DDL statements (see Oracle documentation how to create partitioned tables and migrate data to them). You can use a partitioned table in SQLAlchemy just like you would use a normal table, you just need the table declaration so that SQLAlchemy knows what to query. You can reflect the definition from the database, or just duplicate the table declaration in SQLAlchemy code.

Very large datasets are usually time-based, with older data becoming read-only or read-mostly and queries usually only look at data from a time interval. If that describes your data, you should probably partition your data using the date field.

There's also application level partitioning, or sharding, where you use your application to split data across different database instances. This isn't all that popular in the Oracle world due to the exorbitant pricing models. If you do want to use sharding, then look at SQLAlchemy documentation and examples for that, for how SQLAlchemy can support you in that, but be aware that application level sharding will affect how you need to build your application code.

Ants Aasma
thanks a lot Ants!
Andy