ansaurus

Question

Answer 1

+1 A:

Part of the issue is that compound keys (such as your Date,Name PK) are created by concatenating the indexed values (see http://dev.mysql.com/doc/refman/5.1/en/create-index.html), and the name (the main thing you're looking up by here) comes second. That makes it much more work to look rows up by name, because the index isn't sorted by name: it's sorted by date, THEN name. So mysqld has to search the whole index instead of just grabbing the section where the key is between "Jack, 0000-00-00" and "Jack, 9999-12-31", which is what it could do if the name came first.

If you added an index just for the name, or at least switched the PK to (Name, Date), you'd probably find your original table working much better.
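To make the ordering argument concrete, here is a minimal Python sketch. It is only a toy model, not how mysqld stores or walks an index: each compound index is a sorted list of tuples, and the sample rows and the helper names `lookup_scan` and `lookup_range` are made up for illustration.

```python
from bisect import bisect_left, bisect_right

# Toy rows: (date, name). ISO date strings sort chronologically.
rows = [
    ("2010-07-10", "Jack"),
    ("2010-07-10", "Kate"),
    ("2010-07-11", "Jack"),
    ("2010-07-11", "Zoe"),
    ("2010-07-12", "Adam"),
    ("2010-07-12", "Jack"),
]

# A compound index modeled as a sorted list of key tuples (an assumption
# made for illustration; a real index is a B-tree, but the ordering is the point).
index_date_name = sorted(rows)                     # like PK (Date, Name)
index_name_date = sorted((n, d) for d, n in rows)  # like PK (Name, Date)


def lookup_scan(index, name):
    """(Date, Name) order: matches are scattered, so every entry is checked."""
    hits = [(d, n) for d, n in index if n == name]
    return hits, len(index)


def lookup_range(index, name):
    """(Name, Date) order: matches are contiguous, so bisect to the slice."""
    lo = bisect_left(index, (name, "0000-00-00"))
    hi = bisect_right(index, (name, "9999-12-31"))
    return index[lo:hi], hi - lo


scan_hits, scanned = lookup_scan(index_date_name, "Jack")
range_hits, touched = lookup_range(index_name_date, "Jack")
print(scanned, touched)
```

With the name first, the lookup only reads the slice between the two bounds; with the date first, it has to look at every entry to find the same rows.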

Alternatively, if you made the same change to your Date,ID table (that is, a PK of (ID, Date)), it should be faster still, because you're all but eliminating string comparisons.

cHao 2010-07-12 04:47:52

Actually, the new PK is indeed (ID,Date) and not (Date,ID). Right now I have (ID,Date) as PK, plus an individual index on each of ID and Date, because it is as common to query by name as it is to query by date. But do you think that, essentially, PK (ID,Date) is generally better than PK (Name,Date)?

JSmaga 2010-07-12 05:25:58

@JSmaga: If you're into performance, yeah, it'll be faster. It does slightly complicate data retrieval, though, so I'd check the performance of both and see whether the difference is worth adding an extra table (and thus, an extra join and/or lookup).

cHao 2010-07-12 05:29:18

@cHao: ok thanks. Any idea of a good reference I could give for string comparison complexity in MySQL?

JSmaga 2010-07-12 05:37:58

@JSmaga: The only reference I could give is to the MySQL manual. It explains a bit about how it compares strings (namely, that VARCHAR and CHAR columns are compared using the collation specified by the column, table, session, or DB defaults, in that order of preference).

cHao 2010-07-12 05:48:50

Answer 2

+1 A:

Assuming that there is a lot of duplication in the "Name" field, your query performance improved because integer comparisons are faster than string comparisons and you significantly reduced the size of the date table. This means less memory paging and less disk seeking.

If the name table has N rows, then you are doing N string comparisons, plus 40 million integer comparisons, instead of 40 million string comparisons. To increase query performance even more, you should add an index for the ID field of the date table.

CREATE INDEX date_id_index ON date_table (ID)
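The comparison count above can be sketched in Python. This is a rough model of plain unindexed scans under made-up data, not a description of MySQL's executor; the function names `scan_denormalized` and `scan_normalized` and the sample tables are hypothetical.

```python
def scan_denormalized(rows, name):
    """Scan (name, date) rows: one string comparison per row."""
    string_cmp = 0
    hits = []
    for row_name, date in rows:
        string_cmp += 1
        if row_name == name:
            hits.append(date)
    return hits, string_cmp


def scan_normalized(names, dates, name):
    """Scan the small names table with string comparisons,
    then the large dates table with integer comparisons."""
    string_cmp = int_cmp = 0
    wanted = None
    for name_id, row_name in names:
        string_cmp += 1
        if row_name == name:
            wanted = name_id
    hits = []
    for row_id, date in dates:
        int_cmp += 1
        if row_id == wanted:
            hits.append(date)
    return hits, string_cmp, int_cmp


# Hypothetical sample data: 2 names (N = 2), 5 date rows.
names = [(1, "Jack"), (2, "Kate")]
dates = [
    (1, "2010-07-10"),
    (2, "2010-07-10"),
    (1, "2010-07-11"),
    (2, "2010-07-11"),
    (1, "2010-07-12"),
]
id_to_name = dict(names)
denormalized = [(id_to_name[i], d) for i, d in dates]

old_hits, old_string_cmp = scan_denormalized(denormalized, "Jack")
new_hits, new_string_cmp, new_int_cmp = scan_normalized(names, dates, "Jack")
print(old_string_cmp, new_string_cmp, new_int_cmp)
```

Both scans return the same dates; the normalized one trades one string comparison per date row for N string comparisons plus one integer comparison per date row.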

2010-07-12 04:57:40

Ok seems I was right then. Any idea about the complexity of string comparison in MySQL?

JSmaga 2010-07-12 05:27:28

Answer 3

+1 A:

As for books, "Applied Mathematics for Database Professionals" by Lex de Haan and Toon Koppelaars is a really good book if you want advanced SQL knowledge. I should point out that you don't just "mention" books, you read them and use them as references. Just referencing books because they sound cool but not reading them will come back to bite you in the ass.

Hagge 2010-07-12 06:05:15

@Hagge: Well, I think you read the section you're interested in and then you cite it as a reference. Thanks. I was not planning to do some book fishing...

JSmaga 2010-07-12 06:39:12

Sorry, didn't mean to imply you were - I realize my language was a little harsh. I meant it in a more friendly manner :).

Hagge 2010-07-12 08:15:15


Literature about Database performance
