views:

315

answers:

2

I want to combine

Select top 2 scouting.* From scouting Where scouting.astroLoc Like 'D01%' AND scouting.ownerGuild = 'SWARM' Order By scouting.jumpGate Desc

with

Select top 2 scouting.* From scouting Where scouting.astroLoc Like 'D02%' scouting.ownerGuild = 'SWARM' Order By scouting.jumpGate Desc

with

Select top 2 scouting.* From scouting Where scouting.astroLoc Like 'D03%' scouting.ownerGuild = 'SWARM' Order By scouting.jumpGate Desc

continued until

Select top 2 scouting.* From scouting Where scouting.astroLoc Like 'D79%' scouting.ownerGuild = 'SWARM' Order By scouting.jumpGate Desc

... into 1 SQL query whereby the TOP 3 records are grouped by scouting.astroLoc ascending.

+1  A: 

This is the "greatest-n-per-group" problem that is posted frequently on StackOverflow. Here's a solution:

SELECT s1.*
FROM scouting s1
LEFT OUTER JOIN scouting s2
  ON (s1.astroLoc = s2.astroLoc AND s1.jumpGate < s2.jumpGate)
WHERE s1.astroLoc LIKE 'D[3-7][0-9]%' AND s1.astroLoc NOT LIKE 'D3[0-2]%'
GROUP BY s1.* -- here you need to name all fields in the select-list
HAVING COUNT(*) < 2;

This works because the query tries to match a given row s1 to the set of rows s2 that have the same astroLoc and a greater jumpGate value. The HAVING clause restricts the result to s1 rows that match fewer than two, which means that the row would be in the top 2.

This assumes rows are unique over [astroLoc, jumpGate]. If not, you may need to add another term to the join condition to resolve ties.


Re your comment, try the following alteration:

SELECT s1.*
FROM scouting s1
LEFT OUTER JOIN scouting s2
  ON (SUBSTRING(s1.astroLoc, 1, 3) = SUBSTRING(s2.astroLoc, 1, 3)
      AND (s1.jumpGate < s2.jumpGate OR (s1.jumpGate = s2.jumpGate AND s1.ID < s2.ID))
WHERE s1.astroLoc LIKE 'D[3-7][0-9]%' AND s1.astroLoc NOT LIKE 'D3[0-2]%'
GROUP BY s1.* -- here you need to name all fields in the select-list
HAVING COUNT(*) < 2;

This compares only the first three characters of astroLoc for purposes of testing a row is in the same "group" as the other, and it also resolves ties in jumpGate by using the primary key.


Re your other answer with new requirements:

where does the scouting.ownerGuild = 'SWARM' go?

It's hard to follow what you're asking for, since I don't know what are your table definitions or the meanings of columns. Do you want the outer query to be matched to the top three jumpgates that are owned by the SWARM guild?

SELECT s1.astroLoc, g.[galaxy_aename], s1.jumpGate, s1.ownerGuild
FROM galaxy g INNER JOIN scouting s1 ON g.[galaxy_ID] = s1.galaxy 
WHERE s1.jumpGate IN (SELECT TOP 3 s2.jumpGate FROM scouting AS s2
        WHERE s2.galaxy = g.[galaxy_ID] AND s2.ownerGuild = 'SWARM'
        ORDER BY s2.jumpGate DESC)
ORDER BY scouting.astroLoc DESC, scouting.jumpGate DESC

That would be a different query from this one, which makes the outer query return the jumpgates owned by SWARM that match the top three jumpgates owned by anyone.

SELECT s1.astroLoc, g.[galaxy_aename], s1.jumpGate, s1.ownerGuild
FROM galaxy g INNER JOIN scouting s1 ON g.[galaxy_ID] = s1.galaxy 
WHERE s1.jumpGate IN (SELECT TOP 3 s2.jumpGate FROM scouting AS s2
        WHERE s2.galaxy = g.[galaxy_ID]
        ORDER BY s2.jumpGate DESC)
  AND s1.ownerGuild = 'SWARM'
ORDER BY scouting.astroLoc DESC, scouting.jumpGate DESC

It's possible the second query will return an empty result, if none of the SWARM jumpgates are in the top three.

PS: It's customary on StackOverflow to edit your original question post at the top, when you need to add more detail or followup questions.

Bill Karwin
This is now grouping properly, but it is displaying more than 2 records per scouting.astroloc
jtromans
What is the primary key of the scouting table?
Bill Karwin
ok so i understand why. The astroloc value may be D33:12:12:12i.e. It is a string. It is just the First 3 characters that can be equal. Any ideas?
jtromans
To clarify:ON (s1.astroLoc = s2.astroLoc)is incorrect. It plain english it should be on FIRST THREE CHARACTERS OF s1.astroloc = FIRST THREE CHARACTERS OF s2.astroLocNot if the entire string matches, just the 1st 3 characters.
jtromans
and the primary key is 'ID'
jtromans
RE: Your update. I not get at Syntax Error in Join Operation when I try and run this on the Access database for testing (ultimately it will be an ASP script).
jtromans
Hmm. I don't do MS Access -- does it not have `SUBSTRING()`? I was assuming you're using MS SQL Server from your `TOP` syntax.
Bill Karwin
I think i should replace substring() with mid()
jtromans
Even with this function I recieve the same error - it is something to do with AND (s1.jumpGate < s2.jumpGate OR (s1.jumpGate = s2.jumpGate AND s1.ID < s2.ID))
jtromans
Could be a bracket missing one sec
jtromans
OK I think this has resolved it, thank you for your help. Replacing the SUBSTRING function with MID and adding a closing ) did the trick :)
jtromans
Before I close this thread with a correct answer, an outstanding issue has now arisen whereby the query takes a very very long time to execute because there are a lot of records in the database. Is there anything I can do to speed up the query. By its design I can see why it is slow, but it would appear as though this is necessary for it to work.
jtromans
If this were one of the RDBMS products I'm familiar with, I'd recommend that you split `astrLoc` into two fields, so you can do the comparison without using `MID()` and you could create an index on it. But I have such little experience with MS Access that I'm not sure if that would make a difference. Sorry I can't be of any more help.
Bill Karwin
Hey Bill, tbanks for all the help. I have found *almost* the soluition to the problem. I hear what you're saying regarding your experience with MS Access, but wondered whether you could cast your eyes over my solution, which I will post separately.
jtromans
A: 

The problem has changed only very slightly with more astroloc D[0-7][0-9] values required, constrained by owner.Guild = 'SWARM' ... I did something very similar to splitting astroloc into 2 fields. Instead I created a new table (galaxy) which contains the part of the astroloc data I require (the first 3 characters, minus the 'D'), and I reference this with a unique ID:

Select scouting.astroLoc, galaxy.[galaxy_aename], scouting.jumpGate, scouting.ownerGuild From galaxy Inner Join scouting On galaxy.[galaxy_ID] = scouting.galaxy Where (scouting.jumpGate) In (Select Top 3 scouting.jumpGate From scouting Where scouting.galaxy = galaxy.[galaxy_ID] Order By scouting.jumpGate Desc) Order By scouting.astroLoc Desc, scouting.jumpGate Desc

The problem is it MISSES OUT SOME OF THE CATEGORIES (galaxy.[galaxy_ID]) values. The galaxy table has 0 to 79 entries, and in the scouting table, each galaxy is referenced and cross-linked to at least once, so you would not expect it to miss out a category (galaxy.[galaxy_ID]). Any thoughts?

jtromans
The question is where does the ownerGuild = 'SWARM' go ? 'cos Where I put it now it misses records :(
jtromans
This questions continues: http://stackoverflow.com/questions/1803396/complicated-ms-access-greatest-n-per-group-problem
Remou