tags:

views:

68

answers:

2

This is continuing questions from http://stackoverflow.com/questions/3539673/sqlite-selecting-the-highest-salary

Assuming a table 'wagetable'

  name     lowhours  highhours  wage  priority 
  Default  0.0       40.0       100   0        
  Default  40.0      50.0       150   0        
  Default  50.0      70.5       154   0        
  Default  70.5      100.0      200   0        
  Brian    0.0       40.0       200   1        
  Brian    40.0      50.0       250   1        
  Brian    50.0      60.0       275   1        
  Brian    60.0      70.0       300   1        
  Brian    70.0      80.0       325   1        
  Brian    80.0      9999.0     350   1        
  Chad     0.0       40.0       130   1        
  Chad     40.0      9999.0     170   1  

I currently performing two types of queries 3-4 times a second, so performance is key, not really readability (but preferred).

This query choose the wage if and only if the $Hour is between lowhours and highhours:

SELECT wage 
FROM wagetable 
WHERE name LIKE '$Employee' OR name LIKE 'Default' 
AND '$Hour' BETWEEN lowhours AND highhours 
ORDER BY priority DESC 
LIMIT 1

This 2nd query takes off if the first failed to find $Hour between lowhours and highhours:

SELECT wage 
FROM wagetable 
WHERE name LIKE '$Employee' OR name LIKE 'Default' 
ORDER BY priority DESC, highhours DESC
LIMIT 1

I am looking to see if there could be a query where I could combine both to do the same thing in just one query

Edit:

Because I didn't correctly test the above queries.. I will tell you what I want in english and examples of answers.

It will check rather $Employee exists on the table, then use it. If it doesnt, then check for 'Default' instead. Once it knows the name, it will check to see if $Hour is between a known lowhours and highhours, if it does, SELECT that wage, if its higher than anything listed, automatically take the wage for the highest hour.

Here are some values. Please note that "sam" is not on the table, so he will be 'Default' The examples follow this format: (Name, Hour) EXPECTED ANSWER

(Sam, 1)        100
(Sam, 51)       154
(Sam, 999999)   200

(Brian, 1)      200
(Brian, 51)     275
(Brian, 999999) 350

Here is the table again so you can quick reference, remember Sam will be 'Default'

name     lowhours  highhours  wage  priority 
Default  0.0       40.0       100   0        
Default  40.0      50.0       150   0        
Default  50.0      70.5       154   0        
Default  70.5      100.0      200   0        
Brian    0.0       40.0       200   1        
Brian    40.0      50.0       250   1        
Brian    50.0      60.0       275   1        
Brian    60.0      70.0       300   1        
Brian    70.0      80.0       325   1        
Brian    80.0      9999.0     350   1      

Edit 2:

The point of this is so that you can define a table of wages where if someone gets more money than the rest, they get paid a certain amount plus overtime. If it is a standard worker, then will be get Default wages. If the $Employee is on the list, then he gets special wages.

The lowest lowhours will -always- be 0.0, The lowhours and highhours pair will -never- have any gaps (It won't allow 40-50 and then skip and do 60-70). What is uncertain is the highest highhours. Therefore is $Hour is higher than the $Employee's highest highhours, then it should use the $Employee's highest highhours's wage.

For example; If "Betty Sue" worked 200 hours, it will get Default's highest highhours wage... which is 200. Therefore the wage "Betty Sue" makes on her 200th hour is 200 per hour.

Now if Brian works 10000 hours, he will NOT earn Default's highest highhours wage...but instead he will earn Brian's highest highhours wage, which is 350.

The results of the benchmark:

This is based on 2000 queries (1000 for a match and 1000 for a default non match):

Timwi's Method: 4348 milliseconds

OMG Ponie's Method: 5843 milliseconds

My method using up to 5 queries and atleast 2: 5844 milliseconds

+1  A: 

Here's what I mentioned in my comment to Alex:

SELECT s.wage
  FROM (SELECT x.wage
          FROM WAGETABLE x
         WHERE x.name LIKE '$Employee'
           AND '$Hour' BETWEEN x.lowhours AND x.highhours
        UNION ALL
        SELECT y.wage AS wage
          FROM WAGETABLE y
          JOIN (SELECT wt.name,
                       MAX(wt.highhours) AS max_hours
                  FROM WAGETABLE wt
              GROUP BY wt.name) bb ON bb.name = y.name
                                  AND bb.max_hours = y.highhours
         WHERE y.name LIKE '$Employee'
        UNION ALL
        SELECT t.wage
          FROM WAGETABLE t
         WHERE t.name = 'Default'
           AND '$Hour' BETWEEN t.lowhours AND t.highhours
        UNION ALL
        SELECT z.wage
          FROM WAGETABLE z
          JOIN (SELECT wt.name,
                       MAX(wt.highhours) AS max_hours
                  FROM WAGETABLE wt
              GROUP BY wt.name) aa ON aa.name = z.name
                                  AND aa.max_hours = z.highhours
         WHERE z.name LIKE 'Default') s
 LIMIT 1

Previously:

SELECT s.wage
  FROM (SELECT x.wage
          FROM WAGETABLE x
         WHERE x.name LIKE '$Employee'
           AND '$Hour' BETWEEN x.lowhours AND x.highhours
        UNION ALL
        SELECT y.wage
          FROM WAGETABLE y
         WHERE y.name LIKE 'Default'
           AND y.highhours = (SELECT MAX(highhours)
                                FROM WAGETABLE wt
                               WHERE wt.name = y.name)) s
 LIMIT 1

Because of selecting for a match on the user/etc is in the upper portion of the UNION, if such a record exists - it will be the first row. If there are no matches, the Default match will be the first row. The Default will always be in the resultset, hence the need for the LIMIT...

OMG Ponies
I don’t believe this query does what the question is asking for.
Timwi
@Timwi: See the previous question - the OP wants to get a default value if there's no match, within a single query.
OMG Ponies
OK, I posted a new answer with a much shorter solution than yours.
Timwi
Unless I am doing something wrong; Using "Brian2" (this isnt found, so it should go on Default) returns 100, not the expect 200 (200 because its the highest and hours are 99999)SELECT s.wage FROM (SELECT x.wage FROM WAGETABLE x WHERE x.name LIKE 'Brian2' AND '999999' BETWEEN x.lowhours AND x.highhours UNION ALL SELECT y.wage FROM WAGETABLE y WHERE y.name LIKE 'Default' ) s LIMIT 1;@@ Debug(540) 8/21/2010 11:10:26 PM $msg = Wage='100'
BHare
@Brian: Doesn't make sense - you're getting the default (correct so far), but the subquery isn't working? Try running the last half of the UNION separately -- the default portion. I can rewrite it as a JOIN...
OMG Ponies
In your new edit WAGETABLE wg should be wt
BHare
Everything fits the bill on my example EXCEPT for when Brian works 99999 hours, it should reply with 350...Instead yours replies with 200.
BHare
@Brian: Thx, just noticed typo too cuz I was testing. So how do you want to handle when someone works more than their hour limit?
OMG Ponies
It will take that person's highest hour wage. For example, when Brian works 9999999 hours, it will take 350.... I assume that if the hour isnt defined between low and high...yours takes the default's highest hour... It should take the person's highest hour (default if not found)..
BHare
@Brian: See update
OMG Ponies
The edit seems to do worse than before, now everything is 200 except for when Brian works 51 hours, its correct.
BHare
@Brian: You are aware that BETWEEN is inclusive - it means you can't use the highhour value as the next lowhour value because both will satisfy. If you take off the inclusiveness in the query, you won't get edge cases like 40/50/etc.
OMG Ponies
@OMG Ponies, Yes I understand that. The `$Hour` will be a float thats almost never going to be exactly 40.0000000. For the testing, I am using numbers that go in the range.. Which is why I used **51** instead of 50
BHare
OMG Ponies
@OMG Ponies, please read edit 2 of the post. I can't really explain it much better? The problem your newest edit has is that it says everything is 200....the problem you had with the edit before that is that it for over 9999 hours, it said brian was making 200...not the expected 350. So basically on Brian's 9999th hour he was making 350 an hour, and then when he reached 10000, he dropped down to 200 an hour...that is **not** correct. He should make atleast the highest `highhours` wage. I should not have used 9999 in my original example because its large and unrealistic.
BHare
OMG Ponies
If you are testing the queries with the exact table dataset I posted, and testing them again the exact scenarios that I posted, then let me know. It's my mistake. Edit: I posted this before you had posted you were going to recreate the data.
BHare
OMG Ponies
Because this answer was perfected first and also because you had the most effort put it, I am going to accept it. I will reply explain in the edit that both solutions work with benchmarks, once I get his code working.
BHare
+2  A: 

I believe this is the most straightforward way to do what you want:

SELECT IFNULL(
    (SELECT wage FROM WAGETABLE WHERE name LIKE '$Employee'
                 AND '$Hour' BETWEEN lowhours AND highhours),
    (SELECT wage FROM WAGETABLE WHERE name LIKE 'Default'
                 ORDER BY highhours DESC LIMIT 1)
)

It does what you describe in the other question:

  • Retrieve the wage for the $Employee;
  • If there is no row for the $Employee, this results in NULL, so the ISNULL kicks in;
  • Only if the first result was NULL, the Default row is retrieved.

HOWEVER, I feel that I should point out that the queries you have posted in this question do something different. My query does not find a Default row that has a matching lowhours / highhours interval, it only ever returns the one with the highest highhours for Default. I don’t know whether the query in the question is really what you want, but it seems more likely, therefore here is a query that is equivalent to what you’re actually asking in this question:

SELECT IFNULL(
    (SELECT wage FROM WAGETABLE WHERE (name LIKE '$Employee' OR name='Default')
                 AND '$Hour' BETWEEN lowhours AND highhours
                 ORDER BY priority DESC LIMIT 1),
    (SELECT wage FROM WAGETABLE WHERE name LIKE '$Employee' OR name='Default'
                 ORDER BY priority DESC, highhours DESC LIMIT 1)
)
Timwi
OMG Ponies
Very simple looking, however upon testing; I think something is not working right. SQLite using IFNULL over ISNULL...Maybe that is the issue... also the x.lowhours and x.highhours, the x. shouldnt be there.
BHare
@Brian: Thanks, fixed.
Timwi
Sorry; I guess I should have tested my examples better first...It was a straight continuation from the previous question. I didn't want to ask so many questions and only able to reward 1 person... I will edit what I am looking for...because what you gave me works better, but still off.
BHare
Based on the example in my edit, this is what your current thing returns me...Format is EXPECTED ANSWER, YOUR ANSWER::100,100154,154200,200200,200275,200350,200...... So when it expected 275, this gave me 200.. When it expected 350, this gave me 200.
BHare
I can’t see how it can return 200 for the input `(Brian, 51)` and the data you quoted, but of course I could have made a mistake. I’m afraid I’ll have to let you debug this on your own. For example, make sure that the `BETWEEN` operator actually does what you think (that `$Hour` is not treated as a string, for example).
Timwi
@Timwi, No problem. I still intend to do a performance benchmark as requested by OMG ponies and I will make an account on Stack and up you the rep either way. If/when I find the bug (mine or yours); I will update accordingly :) thanks for your effort.
BHare
Could Operator precedence be an issue here.Is it reading WHERE `name LIKE '$Employee'` OR `name='Default' AND '$Hour' BETWEEN lowhours AND highhours` rather than............... say WHERE `name LIKE '$Employee' OR name='Default'` AND '$Hour' BETWEEN lowhours AND highhours
BHare
Well spotted! You are right! `AND` has higher precedence. You need to put `name LIKE '$Employee' OR name='Default'` into parentheses. I feel really incompetent now ☺
Timwi
I cannot get this to work at all when putting double quotes around what you stated. The wage is always blank, ""
BHare
I have no idea what double-quotes you are talking about, but yeah, I forgot to edit the post to fix the missing parentheses. I’ve edited it now.
Timwi
I confused parentheses with double quotes. LOL
BHare