I have the following SQL query:

select expr1, operator, expr2, count(*) as c 
from log_keyword_fulltext 
group by expr1, operator, expr2 
order by c desc limit 2000;

Problem: the count(*) in my ORDER BY is killing my application, probably because it can't use an index. I would like to know if there is any way to make it faster, for example with a SELECT inside another SELECT, or something like that.

My SELECT explained:

+----+-------------+----------------------+-------+---------------+-------+---------+------+--------+----------------------------------------------+
| id | select_type | table                | type  | possible_keys | key   | key_len | ref  | rows   | Extra                                        |
+----+-------------+----------------------+-------+---------------+-------+---------+------+--------+----------------------------------------------+
|  1 | SIMPLE      | log_keyword_fulltext | index | NULL          | expr1 | 208     | NULL | 110000 | Using index; Using temporary; Using filesort | 
+----+-------------+----------------------+-------+---------------+-------+---------+------+--------+----------------------------------------------+

UPDATED:

I tried a subquery like this:

select * from (select b.expr1,b.operator,b.expr2,count(*) as c 
from log_keyword_fulltext b group by b.expr1,b.operator,b.expr2) x 
order by x.c desc limit 2000;

It works, but it is not faster. Here is the EXPLAIN:

+----+-------------+------------+-------+---------------+-------+---------+------+--------+----------------+
| id | select_type | table      | type  | possible_keys | key   | key_len | ref  | rows   | Extra          |
+----+-------------+------------+-------+---------------+-------+---------+------+--------+----------------+
|  1 | PRIMARY     | <derived2> | ALL   | NULL          | NULL  | NULL    | NULL |  38398 | Using filesort | 
|  2 | DERIVED     | b          | index | NULL          | expr1 | 208     | NULL | 110000 | Using index    | 
+----+-------------+------------+-------+---------------+-------+---------+------+--------+----------------+

You can see that it is no longer using a temporary table, but the performance is the same. Any recommendations?

A: 

Counting and then sorting by the count is going to be a killer. I would suggest building a temporary table with the counts, and then doing a select ... order by on that.

Not sure if this works in MySQL, but in PostgreSQL or Oracle, that would be

create table foo as 
   select expr1, operator, expr2, count(*) as c
   from log_keyword_fulltext 
   group by expr1, operator, expr2;
select * from foo order by c desc limit 2000;

Also, you're going to have to compute all the counts in order to sort them, so the LIMIT clause isn't going to save you from doing all those calculations.

Paul Tomblin
That's already what MySQL does automagically in this case.
niXar
@niXar, that would explain the "Using temporary" in the query plan? OK, never mind.
Paul Tomblin
+1  A: 

Always try taking the count of a single column instead of count(*), as count(*) takes a count over the permutation of each column of each row, so it takes longer.

Eg:

select expr1, operator, expr2, count(expr1) as c 
from log_keyword_fulltext 
group by expr1, operator, expr2 
order by c desc limit 2000;
Samiksha
? That doesn't make any sense at all; if it's a single table then the number of rows is the same whether you use (*) or (expr1) and the query translator/optimizer is going to produce the same action no matter which of the two you choose.
Jason S
It doesn't make sense because, if expr1 accepts NULL, it won't represent all the records.
VP
It's documented as part of the SQL standard that "COUNT(*)" calculates without reference to any fields.
le dorfier
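
To make the NULL distinction above concrete (the table and its three rows are hypothetical, not from the original post): COUNT(*) counts rows, while COUNT(expr1) skips rows where expr1 is NULL, so the two can differ:

    -- hypothetical table sample_rows with expr1 values: 'a', NULL, 'a'
    SELECT COUNT(*), COUNT(expr1) FROM sample_rows;
    -- COUNT(*) = 3, COUNT(expr1) = 2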
A: 

What do you mean by "killing your application"? What is the context? How often do you run this query? What is going on on the database while you're running this query? Does this particular result have to be real-time? What are the conditions (inserts / s, selects / s, db size, etc)

Here's what you could do:

  1. Store the count in a separate table, which you'd update with triggers on insert/delete

  2. If you can't coerce MySQL into doing this in a single pass with some magic, build an indexed temporary table of counts. The update-or-insert loop boils down to an upsert per row, which in MySQL is (EXP_T and OP_T are placeholders for your actual column types):

    CREATE TEMPORARY TABLE t (
        e1 EXP_T, op OP_T, e2 EXP_T,
        count INTEGER NOT NULL DEFAULT 1,
        PRIMARY KEY (e1, op, e2),
        INDEX (count)
    );
    -- one pass over the log table: insert each row, bumping the count on duplicates
    INSERT INTO t (e1, op, e2)
        SELECT expr1, operator, expr2 FROM log_keyword_fulltext
        ON DUPLICATE KEY UPDATE count = count + 1;
    SELECT * FROM t ORDER BY count DESC LIMIT 2000;
    

Option 1 is probably what you want. And forget indexes: this query has to scan the whole table anyway.
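
A minimal sketch of option 1 for inserts, assuming MySQL triggers; the summary table name (keyword_counts), trigger name, and column types are illustrative, not from the original post:

    -- summary table keyed on the grouped columns, with an index on the count
    CREATE TABLE keyword_counts (
        expr1    VARCHAR(100) NOT NULL,
        operator VARCHAR(20)  NOT NULL,
        expr2    VARCHAR(100) NOT NULL,
        c        INT NOT NULL DEFAULT 1,
        PRIMARY KEY (expr1, operator, expr2),
        INDEX (c)
    );

    DELIMITER //
    CREATE TRIGGER trg_log_keyword_insert
    AFTER INSERT ON log_keyword_fulltext
    FOR EACH ROW
    BEGIN
        INSERT INTO keyword_counts (expr1, operator, expr2)
        VALUES (NEW.expr1, NEW.operator, NEW.expr2)
        ON DUPLICATE KEY UPDATE c = c + 1;
    END//
    DELIMITER ;

    -- the expensive daily query then becomes a cheap indexed read
    SELECT expr1, operator, expr2, c
    FROM keyword_counts ORDER BY c DESC LIMIT 2000;

A matching AFTER DELETE trigger that decrements c would be needed if rows are ever removed from the log table.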

niXar
It is a daily cron job. It hangs with more than 200,000 records. I will check your tip!
VP
Isn't your "2" pretty much exactly what I said, and what you voted me down for?
Paul Tomblin
BTW: I'm not challenging you, I just want to know if I didn't make myself clear.
Paul Tomblin
I didn't vote you down...
VP
Well, I was assuming it was @niXar who voted it down, but that's not really important compared to whether his "2" is any different from "create foo as select expr1, operator, expr2, count(*) as c from log_keyword_fulltext group by expr1, operator, expr2"
Paul Tomblin
Did I vote you down? I can't vote down, AFAIK
niXar
+2  A: 

You are running a query which needs to scan the whole table, and this does not scale. There is no WHERE clause, so it absolutely must scan the whole thing.

Consider maintaining some summary table(s) instead of doing this query often.
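
Since the query runs from a daily cron anyway, one simple way to maintain such a summary is to rebuild it off-peak and serve reads from it; the table name keyword_summary is illustrative:

    -- nightly rebuild; readers hit the small indexed summary, not the log
    DROP TABLE IF EXISTS keyword_summary;
    CREATE TABLE keyword_summary AS
        SELECT expr1, operator, expr2, COUNT(*) AS c
        FROM log_keyword_fulltext
        GROUP BY expr1, operator, expr2;
    ALTER TABLE keyword_summary ADD INDEX (c);

    SELECT * FROM keyword_summary ORDER BY c DESC LIMIT 2000;

This pays the full-scan cost once per day instead of on every read.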

MarkR
A: 

The best way to prevent a table scan would be to add a cover index for those fields you regularly access. There is a one-time cost to create the index. There is also some additional cost for INSERT and DELETE operations on the table so the index can be updated.

Cover indexes prevent the database from having to read the entire record into memory in order to access the values for the few fields you care about. The entire query can be run off the index.

ALTER TABLE log_keyword_fulltext ADD INDEX idx_name(expr1, operator, expr2)

If these are not actual fields, but rather operations on the fields, such as left(foo,20), you can actually index the part of the field that you will use in future SELECT or WHERE clauses.
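
For example, a prefix index covering only the leading characters of the string columns; the index name and the 20-character prefix length are illustrative:

    -- index only the first 20 characters of each expression column
    ALTER TABLE log_keyword_fulltext
        ADD INDEX idx_prefix (expr1(20), operator, expr2(20));

Note that a prefix index cannot act as a covering index for this query, since the full column values are not stored in it.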

See this page for other optimization ideas: http://dev.mysql.com/doc/refman/5.0/en/query-speed.html

Chris
Hi, I already do that; those are my indexes. My problem is that my production table has 12,277,737 records. Regards, Victor
VP
+1  A: 

What am I missing? I don't see a WHERE clause. It looks to me like you're requesting a table scan.

If you are counting on your LIMIT clause to save you, you're out of luck: the LIMIT is applied after the COUNT aggregate has been calculated for every group.

le dorfier