views:

117

answers:

3

I need to store about 100 thousands of objects representing users. Those users have a username, age, gender, city and country.

The users should be searchable by a range of age and any of the other attributes, but also a combination of attributes (e.g. women between 30 and 35 from Brussels). The results should be found quickly as it is one of the Server's services for many connected Clients). Users may only be deleted or added, not updated.

I've thought of a fast database with indexed attributes (like h2 db which seems to be pretty fast, and I've seen they have a in-memory mode)

I was wondering if any other option was possible before going for the DB.

Thank you for any ideas !

+2  A: 

How much memory does your server have? How much memory would these objects take up? Is it feasible to keep them all in memory, or not? Do you really need the speedup of keeping in memory, vs shoving in a database? It does make it more complex to keep in memory, and it does increase hardware requirements... are you sure you need it?

Because all of what you describe could be ran on a very simple server and put in a very simple database and give you the results you want in the order of 100ms per request. Do you need faster than 100ms response time? Why?

bwawok
The objects are simple POJO's containing some integers and Strings, maybe a small List of Strings also. Not too expensive I guess, however there may be 100 thousands of them. I really can't guess if that will take up a huge amount of RAM on a decent coputer.I was thinking about alternatives because SQL queries will mostly involve I/O disk operations. Getting the result from memory will be a lot faster. Now if there aren't any easy alternatives (maybe I was missing something easy to use), then of course I'll go for the DB.
Matthew
The database will naturally keep the in use stuff in memory. It will also use indexes to speed up your queries. For a few 100k simple records, you can query and retrieve the info in, on the order of, 100ms. Is 1/10th of a second too long? There is nothing wrong with doing this in memory, but you really need a requirement of fast (maybe 1/100th of a second vs 1/10th of a second) to bother with it.
bwawok
+1  A: 

Most definitely a relational database. With that size you'll want a client-server system, not something embedded like Sqlite. Pick one system depending on further requirements. Indexing is a basic feature, most systems support it. Personally I'd try something that's popular and free such as MySQL or PostgreSQL so you can more easily google your way out of problems. If you make your SQL queries generic enough (no vendor-specific constructs), you can switch systems without much pain. I agree with bwawok, try whether a standard setup is good enough and think of optimizations later.

reinierpost
Why not something embedded ? Isn't it faster ? Could you clarify that ? I was going for something like H2 DB.
Matthew
H2 may or may not be faster. But you really need the business need before you go down that path, as you could end up boxing yourself into a corner in the future.
bwawok
I must say I've never tried a 100K row table with Sqlite 3, maybe it works fine, as long as you never have multiple users concurrently trying to update the database. But it will all be in a single regular file on your regular file system, it just seems fishy to me. By all means try it out; you can also try Firebird which supports both embedded and client-server access and has some great features, but is not as popular as some other systems.
reinierpost
+2  A: 

I would use a RDBMS - there are plenty of good ORMs available, such as Hibernate, which allow you to transparently stuff the POJOs into a db. Once you've got the data access abstracted, you then have the freedom to decide how best to persist the data.

For this size of project, I would use the H2 database. It has both embedded and client/server modes, and can operate from disk or entirely in memory.

mdma
+1 for in-memory DB if storing in memory is a requirement. Wouldn't recommend using hibernate for this case as the object model is trivial (1 table/class).
Mike Q
I was thinking of the searchability factor - the hibernate criteria API makes building searches over arbitrary attributes and values easier than dynamically constructing an SQL query. Also hibernate grows with your project bringing useful features, especially when combined with Spring (declarative transactions, auditing, and various hooks to plug into the persistence layer - Interceptors) which help enforce a good structure.
mdma