views:

61

answers:

4

Ok, I have over a million records and I need to select all information from the database where the count of a column is greater or equal to 10.

I know that using having I must use group by and using where I cannot use aggregate functions with operators so what I want in pseudo code is

Select column1,column2,column3
From MYdatabase
Where count(column1) > 10

This is where I get stuck. Do I use a subquery and add a column called count? I am not sure how I would code that off hand and being that I am a beginner SQL user I might be looking in the completely wrong direction.

My whole query is below, I want to modify if so only records with a count(callid) >=10 will be selected.

Select
    FiscalMonthYear,
    'MyCenter' = Case EP.Center
        When 'Livermore Call Center' Then 'LCC'
        When 'Natomas Call Center' Then 'NCC'
        When 'Concord Call Center' Then 'CCC'
        When 'Virtual Call Center' Then 'VCC'
        When 'Morgan Hill Call Center' Then 'MHCC'
        Else Center
        End,
    ECH.segstart,
    ECH.consulttime,
    EP.Queue,
    (EP.MgrFName +' '+ EP.MgrLName)AS Manager,
    (EP.SupFName +' '+ EP.SupLName)AS Supervisor,
    (EP.RepFName +' '+ EP.RepLName)As Agent,
    EP.RepPERNR,
    LEFT(ECH.segstart, 19) as SegmentDateTime,
    ECH.origlogin,
    ECH.dialed_num, 
    ECH.segment,
    ECH.SegStart_Date,
    ECH.callid
FROM CMS_ECH.dbo.CaliforniaECH ECH 
    INNER JOIN CAPLESQL02.InfoQuest.dbo.IQ_Employee_Avaya_Id A ON ECH.origlogin = A.AvayaID
    AND getdate () BETWEEN StartDate AND EndDate
    INNER JOIN CAPLESQL02.InfoQuest.dbo.IQ_Employee_Profiles_v3 EP ON A.IQID = EP.RepID
    AND getdate () BETWEEN RepToSup_StartDate and RepToSup_EndDate
    AND getdate () BETWEEN SupToMgr_StartDate and SupToMgr_EndDate
    AND getdate () BETWEEN RepQueue_StartDate and RepQueue_EndDate 
    INNER JOIN Cal_RemReporting.dbo.udFiscalMonthTable f on ECH.SegStart_Date = f.Tdate
Where dialed_num not like '______' 
AND dialed_num not like '' 
AND dialed_num not like '_______' 
and EP.Center is NOT Null  
and EP.Center not like 'Comm Ops' 
and EP.Center not like 'Same Day Group' 
and MgrLName not like 'Hollman'
and consulttime > 0 
and ECH.SegStart_Date between getdate()-90 and getdate()-1 
and EP.Queue not IN ('BST','Collections', 'DST','DSR','Escalations','Cable Store')

I have solved the problem myself by using inner joining a query which selects the top 10 and joins all the data on the dialed_num column. Works Flawlessly, thank you all for your help!

A: 

You can do this:

Select column1,column2,column3 
From MyTable
Where (Select count(*) From MyTable Group By Column1) > 10 

More specifically to your example, add:

AND 
 (SELECT count(*) 
  FROM CMS_ECH.dbo.CaliforniaECH iECH 
  WHERE iECH.callid = ECH.callid) > 10
Graphain
Thank you for your help! Adding that at the bottom of query yields this error: Subquery returned more than 1 value. This is not permitted when the subquery follows =, !=, <, <= , >, >= or when the subquery is used as an expression.
CodingIsAwesome
My mistake amending now.
Graphain
Amended - please see the correction.
Graphain
Thank you again, my query now returns 0 results and it should return around 140,000. What is interesting is that I ran into this problem earlier. I used a HAVING clause: Having count(callid) >= 10 but then my results have to be grouped. I need individual rows not groups. :(What can I do to help you help me?
CodingIsAwesome
A: 

First determine which ECH data have more than 10 callid (pseudo-code):

select ECH.callid
from CMS_ECH.dbo.CaliforniaECH ECH
group by ECH.callid
having count(*) > 10

Then you can use this as a subquery to filter your results:

...
from ECH
...
where ECH.callid IN (
 select ECH.callid
 from CMS_ECH.dbo.CaliforniaECH ECH
 group by ECH.callid
 having count(*) > 10
)
RMorrisey
Thanks, but your query returns 0 results where there should be around 140,000. I ran into this problem earlier. I used a HAVING clause: Having count(callid) >= 10 but then my results have to be grouped. I need individual rows not groups. :( What can I do to help you help me?
CodingIsAwesome
Try pulling away some of the complexity of the query, and run the top statement first:select ECH.callid, count(*)from CMS_ECH.dbo.CaliforniaECH ECHgroup by ECH.callidhaving count(*) > 10Verify that this gives you the correct result.Once that's done, do just a plain select from ECH using the inner query as a filter. If that works, then the problem stems from some other part of your query. You can add joins and where clauses in one at a time until the results drop off, and that will help you identify the source of the issue.
RMorrisey
A: 

Corrected:

SELECT * FROM (SELECT 
    FiscalMonthYear,
    'MyCenter' = CASE EP.Center
        WHEN 'Livermore Call Center' THEN 'LCC'
        WHEN 'Natomas Call Center' THEN 'NCC'
        WHEN 'Concord Call Center' THEN 'CCC'
        WHEN 'Virtual Call Center' THEN 'VCC'
        WHEN 'Morgan Hill Call Center' THEN 'MHCC'
        ELSE Center
        END,
    ECH1.segstart,
    ECH1.consulttime,
    EP.Queue,
    (EP.MgrFName +' '+ EP.MgrLName)AS Manager,
    (EP.SupFName +' '+ EP.SupLName)AS Supervisor,
    (EP.RepFName +' '+ EP.RepLName)AS Agent,
    EP.RepPERNR,
    LEFT(ECH1.segstart, 19) AS SegmentDateTime,
    ECH1.origlogin,
    ECH1.dialed_num, 
    ECH1.segment,
    ECH1.SegStart_Date,
    ECH1.callid
FROM CMS_ECH.dbo.CaliforniaECH ECH1 GROUP BY ECH1.dialed_num HAVING COUNT(ECH1.callid) >= 10) ECH 
    INNER JOIN CAPLESQL02.InfoQuest.dbo.IQ_Employee_Avaya_Id A ON ECH.origlogin = A.AvayaID
        AND getdate () BETWEEN StartDate AND EndDate
    INNER JOIN CAPLESQL02.InfoQuest.dbo.IQ_Employee_Profiles_v3 EP ON A.IQID = EP.RepID
        AND getdate () BETWEEN RepToSup_StartDate AND RepToSup_EndDate
        AND getdate () BETWEEN SupToMgr_StartDate AND SupToMgr_EndDate
        AND getdate () BETWEEN RepQueue_StartDate AND RepQueue_EndDate 
    INNER JOIN Cal_RemReporting.dbo.udFiscalMonthTable f ON ECH.SegStart_Date = f.Tdate
WHERE dialed_num NOT LIKE '______' 
    AND dialed_num NOT LIKE '' 
    AND dialed_num NOT LIKE '_______' 
    AND EP.Center IS NOT NULL  
    AND EP.Center NOT LIKE 'Comm Ops' 
    AND EP.Center NOT LIKE 'Same Day Group' 
    AND MgrLName NOT LIKE 'Hollman'
    AND consulttime > 0 
    AND ECH.SegStart_Date BETWEEN getdate()-90 AND getdate()-1 
    AND EP.Queue NOT IN ('BST','Collections', 'DST','DSR','Escalations','Cable Store')
laurent-rpnet
Thanks, but that returns an error: An expression of non-boolean type specified in a context where a condition is expected, near ')'.
CodingIsAwesome
yes... sorry for the AND... this is the table you have to select from... I'm putting it right and will post soon adapted
laurent-rpnet
OK - looks right now - not sure about some mispelled things as I can't run the query to check but I checked on a simpler structure (with less fields and no joins) and it works
laurent-rpnet
A: 

I have solved the problem myself by using inner joining a query which selects the top 10 and joins all the data on the dialed_num column. Works Flawlessly, thank you all for your help!

CodingIsAwesome
I work with mysql that has no TOP (uses LIMIT) but if I remember well, TOP selects the TOP 10 results at maximum but if there are less results than 10, it will return all of them so you won't select only the calls with `callid > 10` so I think TOP is not usable but I may be wrong on that. Anyways, I think doing a join with a TOP 10 dialed_num will restrict data to the 10 most dialed numbers and not remove data below 10 calls.
laurent-rpnet