views: 220 | answers: 4
I am working on a project that has a huge database (around 32 GB of data per week). We are using DB2 with the Spring Framework and JDBC. Is JDBC capable of handling this much data, or should I use something else? And if JDBC can handle it, is there a specific technique I should use?

+6  A: 

JDBC is just the interface between the database and the Java program; it's up to the database to handle that amount of data. In the Java world, there is hardly an alternative to JDBC when it comes to database connectivity.
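As a minimal sketch of what that interface looks like in practice (the host, port, database name, table, and credentials below are placeholders):

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.Timestamp;

public class Db2JdbcSketch {
    public static void main(String[] args) throws Exception {
        // IBM's type-4 JDBC driver; requires db2jcc.jar (and its license jar) on the classpath.
        Class.forName("com.ibm.db2.jcc.DB2Driver");

        // Host, port, database name, and credentials are placeholders.
        try (Connection conn = DriverManager.getConnection(
                     "jdbc:db2://dbhost:50000/MYDB", "user", "password");
             PreparedStatement ps = conn.prepareStatement(
                     "INSERT INTO events (event_time, payload) VALUES (?, ?)")) {
            ps.setTimestamp(1, new Timestamp(System.currentTimeMillis()));
            ps.setString(2, "example payload");
            ps.executeUpdate(); // the driver just ships this over the wire; DB2 does the work
        }
    }
}
```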

ammoQ
+8  A: 

JDBC is just the connection - it doesn't care how much data is in the database. I'd expect it to be more of an issue at the database side, if anywhere. If you've got indexes which are expensive to create, etc., you're more likely to have issues - but to be honest, 32 GB in a week isn't really that big. I'd expect any "real" server-side database to handle it fairly easily.

I suggest you try it before committing yourself too far down any particular path. Chuck data at it as quickly as you can; I'd be slightly worried if you couldn't create 32 GB of data in a few hours.
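If you do load-test it that way, batching the inserts keeps JDBC round trips from becoming the bottleneck long before the database does. A minimal sketch using Spring's JdbcTemplate (the table and columns are invented; batchUpdate with an argument list requires Spring 3.0 or later):

```java
import java.util.ArrayList;
import java.util.List;

import javax.sql.DataSource;

import org.springframework.jdbc.core.JdbcTemplate;

public class BulkLoader {
    private final JdbcTemplate jdbcTemplate;

    public BulkLoader(DataSource dataSource) {
        this.jdbcTemplate = new JdbcTemplate(dataSource);
    }

    /** Inserts rows in chunks so each JDBC round trip carries many statements. */
    public void load(List<Object[]> rows, int batchSize) {
        String sql = "INSERT INTO events (event_time, payload) VALUES (?, ?)";
        List<Object[]> batch = new ArrayList<Object[]>();
        for (Object[] row : rows) {
            batch.add(row);
            if (batch.size() == batchSize) {
                jdbcTemplate.batchUpdate(sql, batch); // one round trip per batch
                batch.clear();
            }
        }
        if (!batch.isEmpty()) {
            jdbcTemplate.batchUpdate(sql, batch); // flush the final partial batch
        }
    }
}
```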

Jon Skeet
So, Chuck... oops, I mean Jon :D. According to you, it is just the DBA who will have to care about this, not me?
Rakesh Juyal
Almost certainly. JDBC just represents the pipe through which you're pumping the data. Obviously it incurs *some* overhead to process the data, but that's unlikely to be significant.
Jon Skeet
A: 

It all depends on the processing you will be doing on the database: how many tables you will be accessing at any time, and whether the workload is mostly writes or mostly reads. You can design for it based on that. You could also look at an ORM solution like Hibernate, which integrates well with Spring; it gives you options such as caching to avoid hitting the database directly every time. You should also set up connection pooling to reuse connections.
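For the connection pooling mentioned above, here is a minimal sketch using Apache Commons DBCP; the connection details are placeholders, and in a Spring application you would normally declare this as a bean instead:

```java
import javax.sql.DataSource;

import org.apache.commons.dbcp.BasicDataSource;

public class PooledDataSourceFactory {
    /** Builds a pooled DataSource so connections are reused rather than reopened per query. */
    public static DataSource create() {
        BasicDataSource ds = new BasicDataSource();
        ds.setDriverClassName("com.ibm.db2.jcc.DB2Driver");
        ds.setUrl("jdbc:db2://dbhost:50000/MYDB"); // placeholder host/database
        ds.setUsername("user");
        ds.setPassword("password");
        ds.setMaxActive(20); // cap concurrent connections
        ds.setMaxIdle(5);    // keep a few warm connections around
        return ds;
    }
}
```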

Arvind
"How many tables will you be accessing at any time", we are not having more than 20 tables. At any point of time we are fetching from at max 6-7 tables.
Rakesh Juyal
+1  A: 

Although your SQL API and database abstraction layer are important, the biggest impact on the performance and maintainability of your database will be the indexing, clustering, and partitioning scheme your DBA will use for managing the significant amounts of data being inserted every week. The most powerful features in these areas are available in the enterprise version of the DB2 data engine for Linux, UNIX, and Windows. I would recommend looking at a combination of multi-dimensional clustering (MDC), range table partitioning, and deep compression to manage the table as it grows, facilitate easy roll-in/roll-out, and most importantly, to quickly zero in on the data requested with a minimum of scanning. You may also benefit from materialized query tables (MQTs). Version 9.7 of DB2, which IBM released very recently, offers noteworthy enhancements to several of those features, most notably an aggressive compression scheme for indexes.
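To make those features concrete, here is a hedged sketch of what such a table definition might look like when issued through JDBC. The sales table, its columns, and the date range are invented, and the exact DDL clauses and their availability depend on your DB2 version and edition:

```java
import javax.sql.DataSource;

import org.springframework.jdbc.core.JdbcTemplate;

public class PartitionedTableSetup {
    /** Illustrative only: clause syntax and availability vary by DB2 edition/version. */
    public static void createSalesTable(DataSource dataSource) {
        String ddl =
              "CREATE TABLE sales ("
            + "  sale_date DATE NOT NULL,"
            + "  region    INTEGER NOT NULL,"
            + "  amount    DECIMAL(12,2)"
            + ") "
            // Range partitioning by month: old partitions can be rolled out cheaply.
            + "PARTITION BY RANGE (sale_date) "
            + "  (STARTING '2009-01-01' ENDING '2009-12-31' EVERY 1 MONTH) "
            // Multi-dimensional clustering on a frequently filtered column.
            + "ORGANIZE BY DIMENSIONS (region) "
            // Deep row compression to keep storage for the growing table in check.
            + "COMPRESS YES";
        new JdbcTemplate(dataSource).execute(ddl);
    }
}
```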

Fred Sobotka
Thanks Fred. Actually, database optimization is not in my hands, but I will definitely let that mate know about these things.
Rakesh Juyal