tags:

views:

1343

answers:

5

Hi, 2 tables:

Employees
- EmployeeID
- LeadCount


Leads
- leadID
- employeeID

I want to update the Employees.LeadCount column by counting the # of leads in the Leads table that have the same EmployeeID.

Note: There may be more than 1 lead with the same employeeID, so I have to do a Distinct(SUM(employeeID)).

A: 
UPDATE Employees SET LeadCount = (
  SELECT Distinct(SUM(employeeID)) FROM Leads WHERE Leads.employeeId = Employees.employeeId
)
Jason Cohen
sum(employeeid) doesn't make any sense, and distinct around a single value will always return the same value, so it's redundant.
Torsten Marek
IIRC that will be really slow (MySQL is all I've used)
BCS
+5  A: 
UPDATE
    Employees E
SET
    E.LeadCount = (
        SELECT COUNT(L.EmployeeID)
        FROM Leads L
        WHERE L.EmployeeID = E.EmployeeID
    )
insin
A: 

Steeling from above and removing the dependent subquery.

// create tmp -> TBL (EmpID, count)

insert into TBL 
   SELECT employeeID COUNT(employeeID) Di
   FROM Leads WHERE Leads.employeeId = Employees.employeeId GROUP BY EmployeeId
UPDATE Employees SET LeadCount = (
  SELECT count FROM TBL WHERE TBL.EmpID = Employees.employeeId
)

// drop TBL

EDIT It's "group By" not "distinct" :b (thanks Mark Brackett)

BCS
+3  A: 

Joins work the same for updates (and deletes) just like they do for selects (edit: in some popular RDBMS', at least*):

UPDATE Employees SET
  LeadCount = Leads.LeadCount
FROM Employee
JOIN (
  SELECT EmployeeId, COUNT(*) as LeadCount 
  FROM Leads 
  GROUP BY EmployeeId
) as Leads ON
  Employee.EmployeeId = Leads.EmployeeId

The SUM(DISTINCT EmployeeId) makes no sense - you just need a COUNT(*).

  • MS SQL Server supports UPDATE...FROM, and DELETE...FROM syntax, as does MySql, but the SQL-92 standard does not. SQL-92 would have you use a row expression. I know that DB2 supports this syntax, but not sure of any others. Frankly, I find the SQL-92 version confusing - but standards and theory wonks will argue that the FROM syntax violates relational theory and can lead to unpredictable results with imprecise JOIN clauses or when switching RDBMS vendors.
Mark Brackett
Do you have a link? I'd like to learn more.
BCS
for some reason the update query is more complicated, hopefully now I can think of them the same way!
public static
Joins in UPDATE and DELETE are not standard SQL but some implementations support them as an extension. Which brand of RDBMS supports the syntax above?
Bill Karwin
+5  A: 

You're setting yourself up for a data synchronization problem. As rows in the Leads table are inserted, updated, or deleted, you need to update the Employees.LeadCount column constantly.

The best solution would be not to store the LeadCount column at all, but recalculate the count of leads with a SQL aggregate query as you need the value. That way it'll always be correct.

SELECT employeeID, COUNT(leadId) AS LeadCount
FROM Leads
GROUP BY employeeID;

The other solution is to create triggers on the Leads table for INSERT, UPDATE, and DELETE, so that you keep the Employees.LeadCount column current all the time. For example, using MySQL trigger syntax:

CREATE TRIGGER leadIns AFTER INSERT ON Leads
FOR EACH ROW BEGIN
  UPDATE Employees SET LeadCount = LeadCount + 1 WHERE employeeID = NEW.employeeID;
END

CREATE TRIGGER leadIns AFTER UPDATE ON Leads
FOR EACH ROW BEGIN
  UPDATE Employees SET LeadCount = LeadCount - 1 WHERE employeeID = OLD.employeeID;
  UPDATE Employees SET LeadCount = LeadCount + 1 WHERE employeeID = NEW.employeeID;
END

CREATE TRIGGER leadIns AFTER DELETE ON Leads
FOR EACH ROW BEGIN
  UPDATE Employees SET LeadCount = LeadCount - 1 WHERE employeeID = OLD.employeeID;
END

Another option, if you are using MySQL, is to use multi-table UPDATE syntax. This is a MySQL extension to SQL, it's not portable to other brands of RDBMS. First, reset the LeadCount in all rows to zero, then do a join to the Leads table and increment the LeadCount in each row produced by the join.

UPDATE Employees SET LeadCount = 0;
UPDATE Employees AS e JOIN Leads AS l USING (employeeID)
  SET e.LeadCount = e.LeadCount+1;
Bill Karwin
if the exact value is not needed then what was asked for might do very well if run as a scheduled task in some way.
BCS
+ 1 for not storing the LeadCount, but -1 for the the trigger. I think they ought to be separate replies really.
David Aldridge