views:

103

answers:

5

I have expressions like this:

(cat OR cats OR kitten OR kitty) AND (dog OR dogs) NOT (pigeon OR firefly)

Anyone having idea how to make tables to save those?

Before I got request for usage of brackets, I limited usage of operators to avoid ambiguous situations. So only ANDs and NOTs or only ORs and saved those in this manner:

operators

id | name  
1  | AND  
2  | OR  
3  | NOT  

keywords

id | keyword  
1  | cat  
2  | dog  
3  | firefly  

expressions

id | operator | keywordId  
1  | 0        | 1   
1  | 1        | 2  
1  | 3        | 3  

which was: cat AND dog NOT firefly

But now, I'm really puzzled...

+1  A: 

Saving them as strings or serialized data structures (a parse tree for example) is most likely the best solution unless you really need to modify parts of the expression in the database itself.

ThiefMaster
let's say that I need those broken into references because of a lot of other reasons. e.g. keywordId 1 is cat but it could be word "cat" in Spanish or German etc...
zarko.susnjar
A: 

I think you need a possibility to concat "subexpressions". Why not have a nullable foreignkey in Table Expressions linking to itself(to the parent expression)?

Tim Schmelter
Nice idea, this would give me an opportunity to implement nested expressions inside (( ) OR ) NOT ( AND ())...I'll have to make POC for this one.
zarko.susnjar
+1  A: 

I'd store them as reverse Polish in text format with operators/operands by blanks, for your examples:

cat cats OR dog dogs OR AND
pigeon firefly OR NOT

This allows you to implement an boolean expression evaluator really easily and simply, and I presume is what you want to with them.

If you wanted to make it even easier to evaluate, I'd store bindings of object names to a small vocabulary (e.g., A-Z) and a similar vocabulary for AND, OR, NOT:

cat A cats B dog C dogs : DAB+CD+&
pigeon A firefly : AB+~

Then the basic expression evaluator only has to work on invidual characters and is really, really easy to code.

Ira Baxter
This would be great solution for parsing and evaluation. Unfortunately my bigger concern is database model since i already have thing working and I'd just use my current evaluator method recursively.
zarko.susnjar
If you don't have to evaluate them very fast, you can store them as "bad trees" in multiple database records. If you have to evaluate them quickly, the difference between these two methods will be enormous. I don't know your specific application requirements.
Ira Baxter
A: 

(cat OR cats) AND (dog OR dogs) NOT (pigeon OR firefly)

and

cat AND dog NOT firefly

Are both invalid boolean expressions as NOT is a unary operator (it only takes 1 operand).

Having said that, this is a toughie. It's hard enough to do localization on a database level. Hmmm.. The below comes to mind:

Expressions

Id|Operator|Keyword1|Keyword2|Expression1|Expression2

So here, keyword1, keyword2, expression1 and expression2 are all nullable, and every expression is stored as either a unary operation on a keyword or another expression, or a binary operation on 0, 1 or 2 keywords and 0, 1, 2 other expressions. 1 record per expression, with aditional records for all subexpressions. You can have the evaluation in code done recursively.

This way you won't have duplicate Ids. The only downside I see is that it would be hard to preserve things like (CAT AND CATS) AND DOG vs CAT AND (CATS AND DOG). Both of these evaluate to the same, but the order of computation changes.

Pretty sure this would work. Hit me up in the comments if you need more details.

Plamen

Big Endian
Thanks for an answer, there are two problems here. First, these are going to be created by nontechnical personnel and that's why these are not REAL Boolean expressions, more common language translated into Google like Boolean. Second is that I also could get large lists of chained keywords, "cats OR kitten OR cat OR pussy OR kitty"...
zarko.susnjar
[cats or kitten or cat or pussy or kitty] would work fine it would save as exp = cats or [exp1] exp1 = kitten or [exp2] exp2 = cat or [exp3] exp3 = pussy or kitty Your code would be responsible for parsing out the user input back and forth from the db, maybe I am not understanding the application correctly. In the event that a binary operator is missing, I suggest designating one (configurably) as default. Either OR or AND, probably AND if it's for online searching.Cheers
Big Endian
+1  A: 

In situations like this in the past I have created an integer column that bitwise operations can be performed against. Explanation below:

Begin by assigning a single binary digit to each value-

Cat  Dog  Firefly
---  ---  ------
1     2     4

Next you will add an integer column to your main table, we will call it options. When the number is converted to binary each digit will represent weather cats, dogs or fireflys are allowed. Example:

5 = 101 in binary = Cats allowed, dogs not allowed, fireflys allowed.

id | locationName | options
---------------------------
1  | loc 1        | 5
2  | loc 2        | 2
3  | loc 3        | 7
4  | loc 4        | 6

We can now use bitwise operations against the options column to determine what options are allowed. Examples:

To get all records that allow dogs when we don't about cats or fireflys you would perform the following bitwise operation:

2 & options = 2

This would return records 2,3 and 4.


To get all records that allow dogs and fireflys we would perform he following bitwise operation:

6 & options = 6

This would return records 3 and 4


To get all records that allow cats and fireflys we would perform the following bitwise operation:

5 & options = 5

This would return records 1 and 3.


ONLY accepts fireflys:

4 | Options = 4


Doesn't accept fireflys:

4 & options = 0


This is probably a hard concept to grasp so please let me know if you have any questions. It seems to me that it might be the simplest way to accomplish what you are trying to do once you can grasp the concept.

Abe Miessler
Yeah, I always admired those solutions with bit-masks and bitwise operations. I think I have an idea how to "upgrade" current model but I'll definitely give your idea a chance this weekend at home ;)
zarko.susnjar
Glad to hear it. Let me know how it works out.
Abe Miessler