views:

690

answers:

3

I have a table with the following structure:

id bigNumber text

1 1200321030011010233 "an item with some text"

2 1200321030011014563 "another item with some more text"

3 3120323434432211133 "more...."

... ...

The table contains approximately 50.000 records.

I want to do the following query but it is slow:

  SELECT COUNT(*), 
         bigNumber 
    FROM items 
GROUP BY substr(bigNumber, 1, X)

...where X varies from 2 and 19.

Would it be quicker to create 19 columns with parts of the number in each column to be able to use an index?

A: 

I think the result you are looking for is LIKE _X%. This will not use the index though.

SELECT count(*) FROM items WHERE bignumber LIKE "_2%"
SELECT count(*) FROM items WHERE bignumber LIKE "_19%"

_ signifies one character

% signifies any number of characters

See MySQL docs for more information.

Kevin Peno
Thank you, but i need to use the GROUP BY method. X varies from 1 to 19, i.e. I want to group the items based on the first X numbers in bigNumber.
Eric
I would suggest seperating important parts of big number then and using order by.
Kevin Peno
A: 

You can use an index without adding any columns, just create an index on your bigNumber column:

create index bignum_index on items (bigNumber);

The index on a varchar (bigNumber is a varchar, right?) can be used to look up any string prefix.

That said, you'll need to do a full table scan to answer your query, so an index won't help you all that much.

Keith Randall
Thank you all for your unbelievably quick answers!!Does anyone know if I could justify using 19 columns for this instead? I know it's a dirty solution, but my only concern at the moment is speed :)Maybe there is another solution out there, something else than substr?To clarify: the actual query i'm running is a bit different and does require group by since i also request avg for a few cols based on the group.Thanks again!
Eric
As long as you still need a full table scan, adding any more data to your table is just going to make your query slower. So whether it is "dirty" isn't relevant - it isn't a solution at all.
Keith Randall
A: 

I would suggest instead of 19 separate columns, perhaps just 3 or 4, like so:

alter table items
add column bignumber_l1 char(1)
, add column bignumber_l3 varchar(3)
, add column bignumber_l6 varchar(6);

update items
set bignumber_l1 = left(bignumber, 1)
 , bignumber_l3 = left(bignumber, 3)
 , bignumber_l6 = left(bignumber, 6);

alter table items
add index bignumber_l1
, add index bignumber_l3
, add index bignumber_l6;

then when you are querying for strings of x length, write the query with the longest match without going longer:

  SELECT COUNT(*), 
         bigNumber 
    FROM items 
GROUP BY bignumber_l3, substr(bigNumber, 1, 4)

this query can use an index and might improve your performance significantly. note that since you're asking for the whole table, mysql may decide you need a table scan even with the indexes above so you may need to use FORCE INDEX.

longneck