I have a domain object that stores some metadata and some raw bytes. This is used for storing binary objects such as PDF documents and images.

I would like to persist the metadata in a database so it can be easily queried, but I want to store the raw bytes in the file system for performance reasons. What is a good design for achieving this?

Should I have a domain object representing the raw bytes with its own DAO to perform CRUD and a separate JPA DAO to do the same for the metadata?

If that is the case, would the domain object for the metadata contain a reference to the raw byte object, marked as transient so that JPA won't attempt to persist it?
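For illustration, here is a minimal sketch of the split described above, not a recommended implementation. The names `DocumentMetadata` and `FileBlobStore` are hypothetical, and the JPA annotations are left out so the code compiles with the JDK alone. It relies on the fact that JPA does not persist fields declared with Java's `transient` modifier (the `@Transient` annotation has the same effect), and it assumes the id is safe to use as a file name.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical metadata object. In a real JPA mapping this class would
// carry @Entity and @Id; they are omitted here so the sketch is JDK-only.
class DocumentMetadata {
    private String id;           // assumed safe to use as a file name
    private String contentType;  // e.g. "application/pdf"

    // Declared transient: JPA skips transient fields when persisting.
    private transient byte[] content;

    DocumentMetadata(String id, String contentType) {
        this.id = id;
        this.contentType = contentType;
    }

    String getId() { return id; }
    String getContentType() { return contentType; }
    byte[] getContent() { return content; }
    void setContent(byte[] content) { this.content = content; }
}

// Hypothetical file-backed DAO for the raw bytes, one file per document id.
class FileBlobStore {
    private final Path root;

    FileBlobStore(Path root) { this.root = root; }

    // Writes the document's bytes to <root>/<id>, replacing any existing file.
    void save(DocumentMetadata doc) throws IOException {
        Files.write(root.resolve(doc.getId()), doc.getContent());
    }

    // Reads <root>/<id> back into the document's transient field.
    void load(DocumentMetadata doc) throws IOException {
        doc.setContent(Files.readAllBytes(root.resolve(doc.getId())));
    }

    // Returns true if a file was actually removed.
    boolean delete(String id) throws IOException {
        return Files.deleteIfExists(root.resolve(id));
    }
}
```

With this shape the metadata DAO never sees the bytes, so the caller has to invoke both DAOs and keep them consistent itself.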

Am I following an overly complex design for little benefit over storing raw bytes in the database? I'm using PostgreSQL 8.x if that makes a difference.

Many thanks.

+1  A: 

I really wouldn't do this. Have you measured the supposed performance hit? How are you going to maintain transactionality between your data in the database and your data on the filesystem? For example, are you going to write to the filesystem, write to the database, and, if that fails, roll back your filesystem change (which isn't as easy as simply deleting the file: do you have a previous version of the binary data)? How do you manage database backups and keep everything in sync? I would strongly recommend keeping all the data in one place.
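To make the rollback concern concrete, here is a rough sketch of what a "file first, then database" save might look like. `FileThenDbWriter` and `DbAction` are hypothetical names, and the database work is stood in for by a callback. The new bytes go to a temporary file in the same directory, so a failed database step leaves the previous version of the target untouched.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

class FileThenDbWriter {

    // Hypothetical stand-in for "insert/update the metadata row and commit".
    interface DbAction {
        void run() throws Exception;
    }

    static void save(Path target, byte[] bytes, DbAction dbAction) throws Exception {
        // Stage the new content next to the target instead of overwriting it.
        Path temp = Files.createTempFile(
                target.toAbsolutePath().getParent(), "upload-", ".tmp");
        try {
            Files.write(temp, bytes);
            dbAction.run(); // if this throws, the old target file is still intact
            // Only now replace the previous version with the new one.
            Files.move(temp, target, StandardCopyOption.REPLACE_EXISTING);
        } finally {
            // No-op after a successful move; cleans up the staged file otherwise.
            Files.deleteIfExists(temp);
        }
    }
}
```

Even this leaves a gap: if the move fails after the database has committed, the two stores disagree, and nothing here addresses backups. That is the kind of bookkeeping you take on by splitting the data.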

Since you're talking about storing PDFs and the like, perhaps you need a document management system?

Brian Agnew
Thanks for your comments. Yeah, I was planning on wrapping the save in a transaction and, if there was an exception, deleting the file and rolling back the database. Do you think I should store the bytes in the DB?
Stephen
I do. What happens if you modify the document? You'll have to roll back to a previous file. I would store all document data in the DB (in the first instance, anyway). You *may* want to separate the metadata and the binary data into different tables for manageability or performance, but that's a different question, and one for someone more knowledgeable about databases.
Brian Agnew