tags:

views:

72

answers:

6

I have a simple report sending framework that basically does the following things: It performs a SELECT query, it makes some text-formatted tables based on the results, it sends an e-mail, and it performs an UPDATE query.

This system is a generalization of an older one, in which all of the operations were hard coded. However, in pushing all of the logic of what I'd like to do into the SELECT query, I've run across a problem.

Before, I could get most of the information for my text tables by saying:

SELECT Name, Address FROM Databas.Tabl WHERE Status='URGENT';

Then, when I needed an extra number for the e-mail, also do:

SELECT COUNT(*) FROM Databas.Tabl WHERE Status='URGENT' AND TimeLogged='Noon';

Now, I no longer have the luxury of multiple SELECT queries. What I'd like to do is something like:

SELECT Tabl.Name, Tabl.Address, COUNT(Results.UID) AS Totals
FROM Databas.Tabl
LEFT JOIN Databas.Tabl Results
    ON Tabl.UID = Results.UID
    AND Results.TimeLogged='Noon'
WHERE Status='URGENT';

This, at least in my head, says to get a total count of all the rows that were SELECTed and also have some conditional.

In reality, though, this gives me the Mixing of GROUP columns with no GROUP columns illegal if no GROUP BY error. The problem is, I don't want to GROUP BY. I want this COUNT to redundantly repeat the number of results that SELECT found whose TimeLogged='Noon'. Or I want to remove the AND clause and include, as a column in the result of the SELECT statement, the number of results that that SELECT statement found.

GROUP BY is not the answer, because that causes it to get the COUNT of only the rows who have the same value in some column. And COUNT might not even be the way to go about this, although it's what comes to mind. FOUND_ROWS() won't do the trick, since it needs to be part of a secondary query, and I only get one (plus there's no LIMIT involved), and ROW_COUNT() doesn't seem to work since it's a SELECT statement.

I may be approaching it from the wrong angle entirely. But what I want to do is get COUNT-type information about the results of a SELECT query, as well as all the other information that the SELECT query returned, in one single query.

=== Here's what I've got so far ===

SELECT Tabl.Name, Tabl.Address, Results.Totals
FROM Databas.Tabl
LEFT JOIN (SELECT COUNT(*) AS Totals, 0 AS Bonus
           FROM Databas.Tabl
           WHERE TimeLogged='Noon'
           GROUP BY NULL) Results
     ON 0 = Results.Bonus
WHERE Status='URGENT';

This does use sub-SELECTs, which I was initially hoping to avoid, but now realize that hope may have been foolish. Plus it seems like the COUNTing SELECT sub-queries will be less costly than the main query since the COUNT conditionals are all on one table, but the real SELECT I'm working with has to join on multiple different tables for derived information.

The key realizations are that I can GROUP BY NULL, which will return a single result so that COUNT(*) will actually catch everything, and that I can force a correlation to this column by just faking a Bonus column with 0 on both tables.

It looks like this is the solution I will be using, but I can't actually accept it as an answer until tomorrow. Thanks for all the help.

A: 

Does this do what you need?

SELECT   Tabl.Name                   ,
         Tabl.Address                ,
         COUNT(Results.UID) AS GrandTotal,
         COUNT(CASE WHEN Results.TimeLogged='Noon' THEN 1 END) AS NoonTotal
FROM     Databas.Tabl
         LEFT JOIN Databas.Tabl Results
         ON       Tabl.UID = Results.UID
WHERE    Status            ='URGENT'
GROUP BY Tabl.Name,
         Tabl.Address
WITH ROLLUP;
Martin Smith
Not quite. I don't even 100% understand the logic of it, but I ran it for a quick test, and that counts up two distinct and useful values. But I'm after counts based on the result of the SELECT query. If I add LIMIT 10 to the end (and ignore the TimeLogged bit), then I'd want each row to contain a GrandTotal of 10. The number I want isn't based on something in the table as much as something based on what I selected from it, and I don't want to have to do a subquery since I might want multiple different counts, and that gets ugly.
Hammer Bro.
@Hammer - What if you add `WITH ROLLUP`?
Martin Smith
Adding it to the example there doesn't seem to change the results for me, but WITH ROLLUP might actually be the key. If I get all the results I want and a column that merely counts by 1 (results grouped by the unique identifier), and WITH ROLLUP can get them to count the number of results, then my framework allows (actually enforces) me to get the last row's column value when I want a single value and not a list of them. Give me a few to try to adapt that -- I'm a bit new to both GROUPING and ROLLUP.
Hammer Bro.
A: 

The API you're using to access the database should be able to report to you how many rows were returned - say, if you're running perl, you could do something like this:

my $sth  = $dbh->prepare("SELECT Name, Address FROM Databas.Tabl WHERE Status='URGENT'");
my $rv   = $sth->execute();
my $rows = $sth->rows;
zigdon
I don't think that's his problem. He needs to run two queries and return them in a single result set.
Phil Gilmore
I'm not actually using an API -- just something that periodically scans a directory for .properties files which specify a SELECT query, table formatting information, e-mail formatting information, and an UPDATE query. If I want a table to include all of the Names, I can just put ${Name} on that line, and it'll pick up "Name" from the ResultSet. But that only gets information from one SELECT query, so I'm trying to find a way to tack on this unusual COUNT information to the single SELECT statement, so I can get, say, ${ResultCount} for the e-mail subject.
Hammer Bro.
A: 

You can always put a scalar subquery into the select-list:

SELECT Name, Address, 
  (SELECT COUNT(*) FROM Databas.Tabl 
   WHERE Status='URGENT' AND TimeLogged='Noon') AS C
FROM Databas.Tabl WHERE Status='URGENT';

But this seems fairly idiotic. Why do you need to do this in one single query? Do you write all your programs in a single line of code?

Bill Karwin
In terms of optimizing, it is beneficial to not call queries so many times. It is better to keep it all consolidated
Ascherer
You are micro-optimizing, solving a problem that does not exist. The cost of executing the query virtually always dwarfs the additional overhead of submitting two separate queries.
Bill Karwin
I need to do this in a single query because the report system is designed around SELECT, E-MAIL, UPDATE, Done, and I'd like to avoid redesigning it to accept a variable number of select queries, because it seems like I'd practically have to create my own scripting language to make sure they were properly addressable. I may use sub-queries if I have to, but it's actually quite a large query I'm doing (takes about 4 seconds), and I'd need a handful of these sub-queries, which were all just copies of the big original but with COUNT(*) on them.
Hammer Bro.
+1  A: 

You could probably do a union instead. You'd have to add a column to the original query and select 0 in it, then UNION that with your second query, which returns a single column. To do that, the second query must also select empty fields to match the first.

SELECT Cnt = 0, Name, Address FROM Databas.Tabl WHERE Status='URGENT'
UNION ALL
SELECT COUNT(*) as Cnt, Name='', Address='' FROM Databas.Tabl WHERE Status='URGENT' AND TimeLogged='Noon';

It's a bit of a hack, but what you're trying to do isn't ideal...

Phil Gilmore
What he's trying to do is horrible and wrong, but UNION is a remarkably good tool for doing wrong things. It can glue any data together as long as it's the right shape; it doesn't care about semantics.
Tom Anderson
This doesn't seem to work (It gives an error about not being able to find 'Cnt' in 'field list'), but it seems like the second SELECT would also give the error about having a GROUP BY function (COUNT) together with non GROUP columns. In typing a response, I realized I could GROUP BY NULL, which is actually quite helpful to my cause. I think I can get something worked out with LEFT JOINs on nested SELECTs that end up duplicating the query once for each count, but that may be necessary, and looking back the things I have to count are never as complex as the main query.
Hammer Bro.
@Hammer Bro: Try changing `Cnt = 0` in Phil's query to `0 as Cnt`.
Mark Bannister
That gets past the 'field list' errors but does give me the expected Mixing of GROUP Columns (COUNT) with non GROUP Columns Illegal error. But that's all right, the LEFT JOIN GROUP BY NULL trick in my edit seems to work, so I'll give points where points are due and figure out how to mark it as resolved.
Hammer Bro.
A: 

Grouping by Tabl.id i dont believe would mess up the results. Give it a try and see if thats what you want.

Ascherer
That's what I tried first, but that would return a count of how many Results rows matched up with a single Tabl row, giving me a bunch of 1s. What I want is for each Tabl row to have the entire count of all different Results rows, trying to trick SQL into counting the number of rows returned by a SELECT query using a self-join. I doubt this is the best way to go about getting this info, but it's the best I can think of, and the single-SELECT-query restraint is a bit unusual.
Hammer Bro.
+1  A: 
SELECT Tabl.Name, Tabl.Address, Results.Totals
FROM Databas.Tabl
LEFT JOIN (SELECT COUNT(*) AS Totals, 0 AS Bonus
           FROM Databas.Tabl
           WHERE TimeLogged='Noon'
           GROUP BY NULL) Results
     ON 0 = Results.Bonus
WHERE Status='URGENT';

I figured this out thanks to ideas generated by multiple answers, although it's not actually the direct result of any one. Why this does what I need has been explained in the edit of the original post, but I wanted to be able to resolve the question with the proper answer in case anyone else wants to perform this silly kind of operation. Thanks to all who helped.

Hammer Bro.