I have a HashMap that I am serializing and deserializing to an Oracle DB, in a BLOB field. I want to query on this field. For example, the application will create a new HashMap with some key-value pairs, and I want to query the DB to see whether a HashMap with this data already exists. I do not know how to do this; it seems strange to have to go through every record in the DB, deserialize it, and then compare. Does SQL handle comparing BLOBs, so that I could write select * from PROCESSES where foo = ?, where foo is a BLOB column and the ? is an instance of the new HashMap? Thanks

A: 

I haven't had the need to compare BLOBs, but it appears that it's supported through the dbms_lob package.

See dbms_lob.compare() at http://www.psoug.org/reference/dbms_lob.html

BQ
+4  A: 

Here's an article for you to read: Pounding a Nail: Old Shoe or Glass Bottle

I haven't heard much about your application's underlying architecture, but I can tell you immediately that there is never a reason why you should need to use a HashMap in this way. It's a bad technique, plain and simple.

The answer to your question is not a clever Oracle query; it's a redesign of your application's architecture.

For a start, you should not serialize a HashMap to a database (more generally, you shouldn't serialize anything that you need to query against). It's much easier to create a table to represent hashmaps in your application as follows:

HashMaps
--------
MapID (pk int)
Key   (pk varchar)
Value

Once you have the content of your hashmaps in your database, it's trivial to query the database to see if the data already exists or to produce any other kind of aggregate data:

SELECT Count(*) FROM HashMaps where MapID = ? AND Key = ?
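
The query above checks a single key. To ask whether a whole map already exists, one possible approach is to look for a MapID whose rows match every pair and which has no extra rows. The sketch below only builds the SQL and its bind parameters; the class name ExactMapQuery is hypothetical, and it assumes Value is a varchar column, that no values are NULL, and that (MapID, Key) is the primary key as in the table above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Hypothetical helper: SQL plus bind parameters for an exact-match lookup. */
final class ExactMapQuery {
    final String sql;
    final List<Object> params;

    private ExactMapQuery(String sql, List<Object> params) {
        this.sql = sql;
        this.params = params;
    }

    /** Builds a query returning every MapID whose rows equal the given map. */
    static ExactMapQuery forMap(Map<String, String> map) {
        if (map.isEmpty()) {
            throw new IllegalArgumentException("map must not be empty");
        }
        StringBuilder pairs = new StringBuilder();
        List<Object> params = new ArrayList<>();
        params.add(map.size()); // binds the first placeholder: COUNT(*) = ?
        for (Map.Entry<String, String> e : map.entrySet()) {
            if (pairs.length() > 0) {
                pairs.append(" OR ");
            }
            pairs.append("(Key = ? AND Value = ?)");
            params.add(e.getKey());
            params.add(e.getValue());
        }
        params.add(map.size()); // binds the last placeholder: matched rows = ?
        // A stored map is equal when it has exactly n rows and all n match.
        String sql = "SELECT MapID FROM HashMaps GROUP BY MapID"
                + " HAVING COUNT(*) = ?"
                + " AND SUM(CASE WHEN " + pairs + " THEN 1 ELSE 0 END) = ?";
        return new ExactMapQuery(sql, params);
    }
}
```

The parameters are listed in the same order as the placeholders, so they can be bound to a PreparedStatement with setObject(i + 1, params.get(i)).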
Juliet
+1  A: 

I cannot disagree, but I'm being told to do so. I appreciate your solution, and that's sort of what I had previously. Thanks

bmw0128
+2  A: 

Storing serialized objects in a database is almost always a bad idea, unless you know ahead of time that you don't need to query against them.

How are you serializing the HashMap? There are lots of ways to serialize data and an object like a HashMap. Comparing two maps, especially in serialized form, is not trivial, unless your serialization technique guarantees that two equivalent maps always serialize the same way.
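
To illustrate what "always serialize the same way" could look like, here is a minimal sketch, assuming a Map<String, String> with no null keys or values. The class name CanonicalMapSerializer and the length-prefixed format are made up for this example; this is not what Java's built-in serialization does.

```java
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical helper: bytes that depend only on the map's contents. */
final class CanonicalMapSerializer {

    private CanonicalMapSerializer() {}

    /** Writes entries in sorted key order, each string length-prefixed. */
    static byte[] serialize(Map<String, String> map) {
        StringBuilder out = new StringBuilder();
        // Copying into a TreeMap fixes the iteration order, so insertion
        // order and the map implementation no longer affect the output.
        for (Map.Entry<String, String> e : new TreeMap<>(map).entrySet()) {
            append(out, e.getKey());
            append(out, e.getValue());
        }
        return out.toString().getBytes(StandardCharsets.UTF_8);
    }

    // The length prefix keeps ("ab", "c") distinct from ("a", "bc").
    private static void append(StringBuilder out, String s) {
        out.append(s.length()).append(':').append(s);
    }
}
```

With a format like this, two maps holding the same pairs produce identical bytes, so a byte-for-byte comparison of the stored values becomes meaningful.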

One way you can get around this mess is to use XML serialization for some objects that rarely need to be queried. For example, where I work we have a log table where a certain log message is stored as an XML file in a CLOB field. This XML data represents a serialized Java object. Normally we query against other columns in the record, and only read/write the blob in single atomic steps. However, once or twice it was necessary to do some deep inspection of the blob, and using XML allowed this to happen (Oracle supports querying XML in varchar2 or CLOB fields as well as native XML objects). It's a useful technique if used sparingly.

Mr. Shiny and New
+1  A: 

Look into dbms_crypto.hash to make a hash of your blob. Store the hash alongside the blob and it will give you something to narrow down the search to something manageable. I'm not recommending storing the hash map, but this is a general technique for searching for an exact match between blobs. See also http://stackoverflow.com/questions/110587/sql-how-do-you-compare-a-clob
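
As an application-side variant of the same idea (not dbms_crypto itself), the hash could be computed in Java before the insert and stored in a separate column. This is a sketch under assumptions: the class name MapDigest is hypothetical, the map is a Map<String, String> with no nulls, and the entries are hashed in sorted key order so that equal maps give equal digests. Hashing the raw Java-serialized bytes is not guaranteed to have that property.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical helper: content-based SHA-256 digest of a map. */
final class MapDigest {

    private MapDigest() {}

    /** Returns a 64-character lowercase hex digest of the map's entries. */
    static String sha256Hex(Map<String, String> map) {
        MessageDigest md;
        try {
            md = MessageDigest.getInstance("SHA-256");
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 must be supported by every Java platform implementation.
            throw new IllegalStateException(e);
        }
        // Sorted key order makes the digest independent of insertion order.
        for (Map.Entry<String, String> e : new TreeMap<>(map).entrySet()) {
            update(md, e.getKey());
            update(md, e.getValue());
        }
        StringBuilder hex = new StringBuilder();
        for (byte b : md.digest()) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }

    // Feeds a 4-byte length followed by the UTF-8 bytes, so that
    // ("ab", "c") and ("a", "bc") do not hash to the same value.
    private static void update(MessageDigest md, String s) {
        byte[] bytes = s.getBytes(StandardCharsets.UTF_8);
        md.update(ByteBuffer.allocate(4).putInt(bytes.length).array());
        md.update(bytes);
    }
}
```

A lookup would then compare the digest first, for example against a hypothetical FOO_HASH column, and fall back to comparing the blobs only for rows whose digest matches.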

Gary
Two hashmap instances have different hashcodes regardless of their contents, unless you write your own structural hashing algorithm. This isn't a very good solution because it's brittle and inefficient, and a hash used this way doesn't allow a user to search for a subset of a hashmap in the database.
Juliet
A: 

Oracle can have new data types defined with Java (or .NET on Windows). You could define a data type for your serialized object and define how queries work on it.

Good luck if you try this...

Ian Ringrose
A: 

If you serialize your data to XML and store it as XML, you can then use XPath expressions within your SQL query. (Sorry, as I am more of a SqlServer person, I don’t know the details of how to do this in Oracle.)

  • If you EVER need to update only part of the serialized data, don’t do this.
  • Likewise, if any of the data is pointed to by other data or points to other data, don’t do this.
Ian Ringrose