views:

42

answers:

2

If I have a set of tags (<100), and a set of objects (~25000), where each object has some sub-set of the tags, do you know of an existing data-structure that would allow for fast retrieval of those objects that satisfy some boolean function of the tags?

Addition/deletion of tags and objects need not be particularly fast, but selection of those objects with tags that satisfy the boolean function should be.

Now that I have written my question down, it looks as if I'm describing an in-memory database, but originally I was thinking of some binary tree like structure for the objects where, for each branch, taking the left/right branch would be equivalent to deciding on have/have-not some tag. But that would not allow don't-care tags? i am asking as I wondered if this has been done before and find it hard to google for data structures.

  • Thanks in advance - Paddy.
A: 

How fast you would need? How complex you boolean function are i.e. how many tags are used in single typical function?

How about using some in memory SQL database? You could then express the boolean function with simple SELECT query.

Juha Syrjälä
+2  A: 

Here's a suggestion: Use a bit-array for each tag, with as many elements as there are objects; each index of which represents one object. The value at each index is 1 if that object has that tag.

Boolean functions on tags are then fast set operations on this bit-array. And the resulting bit-array gives you the documents that satisfy the criteria.

This not very efficient if the tags or objects keep changing frequently but is perhaps applicable for you.

Hemal Pandya
Dooh, Of course!Thanks.
Paddy3118
@paddy3118 Glad you find it useful.
Hemal Pandya