tags:

views:

324

answers:

4

I have a table like this (MySQL 5.0.x, MyISAM):

response{id, title, status, ...} (status: 1 new, 3 multi)

I would like to update the status from new (status=1) to multi (status=3) of all the responses if at least 20 have the same title.

I have this one, but it does not work :

UPDATE response SET status = 3 WHERE status = 1 AND title IN (
  SELECT title FROM (
   SELECT DISTINCT(r.title) FROM response r WHERE EXISTS (
     SELECT 1 FROM response spam WHERE spam.title = r.title LIMIT 20, 1)
   )
  as u)

Please note:

  • I do the nested select to avoid the famous You can't specify target table 'response' for update in FROM clause
  • I cannot use GROUP BY for performance reasons. The query cost with a solution using LIMIT is way better (but it is less readable).

EDIT:

  • It is possible to do SELECT FROM an UPDATE target in MySQL. See solution here
  • The issue is on the data selected which is totaly wrong.
  • The only solution I found which works is with a GROUP BY:

    UPDATE response SET status = 3
     WHERE status = 1 AND title IN (SELECT title 
                                      FROM (SELECT title 
                                              FROM response 
                                          GROUP BY title 
                                            HAVING COUNT(1) >= 20)
    

as derived_response)

Thanks for your help! :)

A: 

This is a funny peculiarity with MySQL - I can't think of a way to do it in a single statement (GROUP BY or no GROUP BY).

You could select the appropriate response rows into a temporary table first then do the update by selecting from that temp table.

Eric Petroelje
Actually it is possible to do it in one query: http://www.xaprb.com/blog/2006/06/23/how-to-select-from-an-update-target-in-mysql/ :)
Toto
@Toto - I think your case falls under the "Problems this doesn't avoid" situation. I could be mistaken there though.
Eric Petroelje
+1  A: 

MySQL doesn't like it when you try to UPDATE and SELECT from the same table in one query. It has to do with locking priorities, etc.

Here's how I would solve this problem:

SELECT CONCAT('UPDATE response SET status = 3 ',
  'WHERE status = 1 AND title = ', QUOTE(title), ';') AS sql
FROM response
GROUP BY title
HAVING COUNT(*) >= 20;

This query produces a series of UPDATE statements, with the quoted titles that deserve to be updated embedded. Capture the result and run it as an SQL script.

I understand that GROUP BY in MySQL often incurs a temporary table, and this can be costly. But is that a deal-breaker? How frequently do you need to run this query? Besides, any other solutions are likely to require a temporary table too.


I can think of one way to solve this problem without using GROUP BY:

CREATE TEMPORARY TABLE titlecount (c INTEGER, title VARCHAR(100) PRIMARY KEY);

INSERT INTO titlecount (c, title)
 SELECT 1, title FROM response
 ON DUPLICATE KEY UPDATE c = c+1;

UPDATE response JOIN titlecount USING (title)
SET response.status = 3
WHERE response.status = 1 AND titlecount.c >= 20;

But this also uses a temporary table, which is why you try to avoid using GROUP BY in the first place.

Bill Karwin
A: 

you'll have to use a temporary table:

create temporary table r_update (title varchar(10));

insert r_update
select title
  from response
group
    by title
having count(*) < 20;

update response r
left outer
  join r_update ru
    on ru.title = r.title
   set status = case when ru.title is null then 3 else 1;
longneck
A: 

I would write something straightforward like below

UPDATE `response`, (
  SELECT title, count(title) as count from `response`
     WHERE status = 1
     GROUP BY title
  ) AS tmp
  SET response.status = 3
  WHERE status = 1 AND response.title = tmp.title AND count >= 20;

Is using GROUP BY really that slow ? The solution you tried to implement looks like requesting again and again on the same table and should be way slower than using GROUP BY if it worked.

kriss
Hi, thanks for your response. Here are some results (1st value your query, 2nd "mine"): `**Created_tmp_tables** 3,1 - Handler_read_key 3415 0 - Handler_read_next: 3414,10003 - Handler_read_rnd: 393,0 - Handler_read_rnd_next: 788,2092544 - Handler_update: 3021,0 - Handler_write: 786 609 - Key_read_requests 259,746 - **Last_query_cost**: 12396,162 - Select_scan: 1,2`
Toto
Looks like your request could be faster using a temporary table and setting and index on title. That should be enough to lower the Handler_read_round_next.
kriss