I have a Java application that needs to display large amounts of data (on the order of 1 million data points). The data doesn't all need to be displayed at the same time but rather only when requested by a user. The app is a desktop app that is not running with an app server or hitting any centralized database.

My thought was to run a database on the machine and load the data in there. The DB will be read-only most of the time, so I should be able to add indexes to help optimize queries. Since I'm running on a local system, I'm not sure whether I should also implement some caching (I don't yet know how fast the queries will run; I'm currently working on them).

Is this a logical way to approach the problem, or would there be better approaches?

Thanks, Jeff

A: 

Well, it depends on the data size. 1 million integers, for example, isn't that much, but 1 million data structures or objects of, let's say, 1000 bytes each is a lot.

For small data: keep it in memory. For large data: I think using the DB would be good.
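To put rough numbers on that comparison, here is a back-of-the-envelope sketch. The class and method names are made up for illustration, and the estimate counts raw payload bytes only; real Java objects add headers, references, and padding on top.

```java
// Rough payload estimate only; ignores JVM object headers, references, and padding.
final class FootprintEstimate {
    /** Returns recordCount * bytesPerRecord, converted to mebibytes. */
    static double estimateMiB(long recordCount, long bytesPerRecord) {
        return (recordCount * bytesPerRecord) / (1024.0 * 1024.0);
    }

    public static void main(String[] args) {
        long count = 1_000_000L;
        // An int is 4 bytes, so one million of them come to roughly 3.8 MiB.
        System.out.printf("ints:    %.1f MiB%n", estimateMiB(count, Integer.BYTES));
        // One million 1000-byte records come to roughly 954 MiB, close to 1 GB.
        System.out.printf("records: %.1f MiB%n", estimateMiB(count, 1000));
    }
}
```

A few megabytes fits comfortably in a default heap; something near a gigabyte usually does not, which is where the DB starts to look attractive.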

Just my opinion :)

edit:

Of course it also depends on the speed you want to achieve. If you really need high speed and the data is big, you could also cache some of it in memory and leave the rest in the DB.
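One way to do the "cache some, leave the rest in the DB" part is a small least-recently-used cache in front of whatever does the real lookup. This is only a sketch: `LruCache` and its `loader` function are hypothetical names, and the loader stands in for the actual database query.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

/** Small LRU cache in front of a slower lookup (for example, a database query). */
final class LruCache<K, V> {
    private final Function<K, V> loader;   // hypothetical: stands in for the DB query
    private final Map<K, V> entries;

    LruCache(int capacity, Function<K, V> loader) {
        this.loader = loader;
        // accessOrder = true keeps the least recently used entry first.
        this.entries = new LinkedHashMap<K, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                return size() > capacity;   // evict once we exceed the capacity
            }
        };
    }

    /** Returns the cached value, calling the loader only on a miss. */
    synchronized V get(K key) {
        V value = entries.get(key);
        if (value == null) {
            value = loader.apply(key);
            entries.put(key, value);
        }
        return value;
    }

    synchronized int cachedCount() {
        return entries.size();
    }

    public static void main(String[] args) {
        int[] loads = {0};
        LruCache<Integer, String> cache =
                new LruCache<>(2, id -> { loads[0]++; return "point-" + id; });
        cache.get(1); cache.get(2); cache.get(1);   // the second get(1) is a hit
        cache.get(3);                               // evicts 2, the least recently used
        cache.get(2);                               // miss again, so it is reloaded
        System.out.println("loads = " + loads[0]);  // prints "loads = 4"
    }
}
```

Whether this is worth having depends on how fast the plain queries turn out to be, so it is probably best to measure those first.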

George B.
+1  A: 

It really depends on your data. Do multiple instances request the data? If not, it is definitely worth looking at a simple SQLite database for storage. It is just a single file on your file system. No need to set up a server.

Christian Stade-Schuldt
Thanks for the input. It will be a single instance for now, so I'll take a look at SQLite. Could I use Derby or another embedded Java DB? I haven't used SQLite before.
Jeff Storey
It is pretty straightforward. Check out the following thread for an introduction: http://stackoverflow.com/questions/41233/java-and-sqlite Another database you could check out is HSQLDB (linked in another answer in this thread). That one is a relational database engine written in Java, also based on a single file.
Christian Stade-Schuldt
+2  A: 

Display and data are two different things.

You don't give any details about either, but it could be possible to generate the display in the background, bringing in the data one slice at a time, and then displaying when it's ready. Lots of anything could cause memory issues, so you'll need to be careful. The database will help persist things, but it won't help you get ten pounds of data into your five pound memory bag.
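A minimal sketch of the "one slice at a time, in the background" idea, assuming a made-up `PointSource` interface in place of the real query code. Only the requested slice is ever held in memory, and the fetch runs on a worker thread.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

final class SliceLoader {
    /** Hypothetical data source: returns up to `limit` points starting at `offset`. */
    interface PointSource {
        List<Double> fetch(int offset, int limit);
    }

    private final PointSource source;
    private final ExecutorService worker = Executors.newSingleThreadExecutor();

    SliceLoader(PointSource source) {
        this.source = source;
    }

    /** Loads one slice off the calling thread; the future completes when it is ready. */
    CompletableFuture<List<Double>> loadSlice(int offset, int limit) {
        return CompletableFuture.supplyAsync(() -> source.fetch(offset, limit), worker);
    }

    void shutdown() {
        worker.shutdown();
    }

    public static void main(String[] args) {
        // Stand-in for a database query: point i simply has the value i * 0.5.
        PointSource fake = (offset, limit) -> {
            List<Double> slice = new ArrayList<>();
            for (int i = offset; i < offset + limit; i++) {
                slice.add(i * 0.5);
            }
            return slice;
        };
        SliceLoader loader = new SliceLoader(fake);
        loader.loadSlice(100, 5)
              .thenAccept(slice -> System.out.println("ready: " + slice))
              .join();   // a GUI would update the view in the callback instead of blocking
        loader.shutdown();
    }
}
```

In a Swing app the callback would need to hand the result to the event dispatch thread (for example via `SwingUtilities.invokeLater`) before touching any components.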

UPDATE: If individuals are only reading a few points at a time, and display isn't an issue, then I'd say that any database will be able to handle it if you index the table appropriately. One million rows isn't a lot for a capable database.

duffymo
The display part will be minimal. Only a few points of data will be displayed at a time so I'm not too concerned about that part now. Basically I'll be retrieving a small amount of data (the records are made of a few numbers and strings, nothing too large) from an overall large set.
Jeff Storey
I think if you're only showing a few of these at any point in time, a database would be great. Querying a table for tens of points out of millions should be pretty quick for any DB implementation (correctly indexed, of course).
Matt
+1  A: 

Embedded DB seems reasonable. Check out JavaDB/Derby or H2 or HSQLDB.

SQLite with a Java wrapper is fine too.

ykaganovich