views:

173

answers:

4

Hello,

I need to store scientific information in a database (sql server). What is the best way to store a value in a database where "smaller than", larger than" is part of the information.

Example:

PatientId: 123 Mutation rate: <3%

PatientId: 999 Mutation rate: 3%

PatientId: 456 Mutation rate: 10%

I need to be able to sort and filter the data. A mutation rate of <3% is better than 3%

How can I sove this?

Thank you for your help

+9  A: 

How about adding a 3rd column for a clarifying int?

0 = Less than
1 = Less than or equal to
2 = Equal to
3 = Greater than or equal to
4 = Greater than

yankee2905
If you were to do this but reorder it to 0=Less than, 1=Less than or equal to, 2=Equal to, 3=Greater than or equal to, and 4=Greater than, you could do an order by on this with the other column to get the desired results.
StrixVaria
updated to reflect StrixVaria's suggestion!
yankee2905
I like that idea
cafenervosa
In the math optimization world, it is custom to note that A > B <=> -B < -A, so you could even keep only 3 values in the columns and cover all cases: <, <=, =.
Mathias
or you could just use the actual < > <=, =, >= and store them as text (varcahr) as long as you properly escape these values. Create far less confusion than trying to remember what 3 equals to and what 4 equels to. AVOID creating new conventions if older ones will suffice.
Nick Gorbikoff
+1  A: 

The easiest, is typically to use predefined values for these cases, for example here, a value of 3.0 means 3% whereby 2.99 means "less than 3%".

Since these "Less than" and "More than" values typically only apply at the ends of the range, such a convention allows handling all filtering and ordering with a single field value,in a standard fashion. The main drawback to this approach is that it implies hard-coding these limit values at the level of the application, for display purposes and such.

The alternative is to introduce a two columns value, with a numeric value and a "qualifier" column which contains a code indicating "exact value" or "Less than" or "more than". While this approach appears to be more generic (and it is, at the level of the display etc.), some hard-coding is often required at the level of the "input".

mjv
But shouldn't such limits be in the application? The database is for storing data. The business logic applies rules to the data and the UI shows the data to the user in a meaningful format.
Corin
@Corin, Yes, such limits and generally implicit info about the scale etc. should be handled at the application level.
mjv
A: 

Unless you want to break up data into 3 fields like less_equal_more(varchar) (<.=, > or text equivalents) , measurement (deciaml) (3.5 or whatever is your level of precision) and units_of_measurment(varchar) (like percentage, or absolute in mile or meters - whateever) - you should store in varchar. But I would breakup your data - will make your search quesries easier. Dealing with full-text search ( which you will have to do if you store everything in one field) is pretty difficult compared to queries you would have to do if you break up your data.

Nick Gorbikoff
A: 

Hi

I'd suggest storing your data as intervals, ie <3% would become [0,3), 3% would become [3,3], and so on. This might take 4 columns in your database, one for each of the endpoints and one to denote whether the interval is open or closed at each end point. Keep the numeric data numeric so you can do arithmetic on it, encode the values 'open' and 'closed' in the way that best suits your access and manipulation requirements.

Regards

Mark

High Performance Mark