tags:

views:

87

answers:

5

HI all

I am using single database and near about 7 tables. do have data s filled with all tables. say near about 10k as of now. but will grow further and may strike millions but will take time.

my question is why my query is slow fetching results. its taking near about 10 to 12 seconds for a query on non load conditions. I am worried if what happens under load conditions say thousands of queries at one time??

here is my sample query...

$result = $db->sql_query("SELECT * FROM table1,table2,table3,table4,table5 WHERE table1.url = table2.url AND table1.url = table3.url AND table1.url = table4.url AND table1.url = table5.url  AND table1.url='".$uri."'")or die(mysql_error());

$row = $db->sql_fetchrow($result);

$daysA = $row['regtime'];
$days = (strtotime(date("Y-m-d")) - strtotime($row['regtime'])) / (60 * 60 * 24);
if($row > 0 && $days < 2){

$row['data'];
$row['data1'];
//remaining 

}else{ //some code}
A: 

It sounds likely that some of the columns in your WHERE clause are not indexed. Indexes are used to find rows with specific column values quickly. Without an index, MySQL must begin with the first row and then read through the entire table to find the relevant rows.

You might find EXPLAIN helpful in analyzing your queries.

Johannes Gorset
I have done Indexing for all tables on column 'url' and I have queried EXPLAIN and it returns the whole table structure...but what I need to understand from it??
mathew
oh I have added UNIQUE INDEX for url column. because I dont want any repetition...is this make any interfere??
mathew
UNIQUE INDEX is an index. As for understanding EXPLAIN, check out http://www.databasejournal.com/features/mysql/article.php/1382791/Optimizing-MySQL-Queries-and-Indexes.htm
Johannes Gorset
well it seems there is a indexing problem with table 4 it doesnt have primary key..I dont know what happend it was there when it created..can any one tell me how to copy data from table 4 to table4-1??
mathew
oops sorry everything is ok it seems I will post the result
mathew
well I will post a new question
mathew
A: 

definitely looks like an index on the url field in each table is the way to go

Rob Cooney
A: 

Look up JOINs and especially look at the difference between INNER JOINS, LEFT JOINS and OUTER JOINS. Also INDEX all the fields on which you are going to do a lookup.

Elf King
A: 

Probably something wrong with your indexes!

In any case long character strings like urls make for poorly performing primary keys. The take up a lot of room in the index and so the indexes are not as dense as they could be and less row pointers are loaded per IO. Also with urls the chances are that 99% of your strings start with "http://www." so the database engine has to compare 13 characters before it decides a row does not match.

One solution to this is to use some hash finction like MD5, SHA1 or even CRC32 to get a raw binary value from your strings and to use this value as the primary key for your tables. CRC32 makes a nice integer sized primary key but its almost certain that at some stage you will encounter two urls that hash to the same CRC32 value so you will need to store and compare the "url" string to be sure. The other hash functions return longer values (16 bytes and 20 bytes respectively in "raw" mode) but the chances of a collision are so small that its not worth bothering about.

.

James Anderson
actually http://www. is trim off before it goes to database
mathew
@mathew it's a mistake to strip off the www. For example, www.somedomain.com is NOT the same as somedomain.com. Yes, most websites these days have their DNS entries point both to the same web server, but that's just a convention. Regardless, stripping 3 characters from a URL isn't going to shorten most URLs by much.
George Marian
A: 

I'm not sure if you have resolved the problem or not, but here's some test data that I have produced. There are a number of factors that can affect the speed of your queries, so my simple test cases may not accurately reflect your tables or data. However, they serve as a useful starting point.

First, create 5 simple tables, each with the same structure. As with your tables, I have used a UNIQUE index on the url column:

CREATE TABLE `table1` (
    `id` int(11) NOT NULL auto_increment,
    `url` varchar(255) default NULL,
    PRIMARY KEY  (`id`),
    UNIQUE KEY `url` (`url`)
) ENGINE=InnoDB;

CREATE TABLE table2 LIKE table1;
CREATE TABLE table3 LIKE table1;
CREATE TABLE table4 LIKE table1;
CREATE TABLE table5 LIKE table1;

The following script creates a stored procedure which is used to fill each table with 10,000 rows of data:

DELIMITER //
DROP PROCEDURE IF EXISTS test.autofill//
CREATE PROCEDURE test.autofill()
BEGIN
    DECLARE i INT DEFAULT 5;
    WHILE i < 10000 DO
        INSERT INTO table1 (url) VALUES (CONCAT('wwww.stackoverflow.com/', i ));
        INSERT INTO table2 (url) VALUES (CONCAT('wwww.stackoverflow.com/', 10000 - i ));
        INSERT INTO table3 (url) VALUES (CONCAT('wwww.stackoverflow.com/', i + 6000 ));
        INSERT INTO table4 (url) VALUES (CONCAT('wwww.stackoverflow.com/', i + 3000 ));
        INSERT INTO table5 (url) VALUES (CONCAT('wwww.stackoverflow.com/', i + 2000 ));
        SET i = i + 1;
    END WHILE;
END;
//
DELIMITER ;

CALL test.autofill();

Each table now contains 10,000 rows. Your SELECT statement can now be used to query the data:

SELECT * 
FROM table1,table2,table3,table4,table5
WHERE table1.url = table2.url
AND table1.url = table3.url
AND table1.url = table4.url
AND table1.url = table5.url
AND table1.url = 'wwww.stackoverflow.com/8000';

This gives the following result almost instantly:

+------+-----------------------------+------+-----------------------------+------+-----------------------------+------+-----------------------------+------+-----------------------------+
| id   | url                         | id   | url                         | id   | url                         | id   | url                         | id   | url                         |
+------+-----------------------------+------+-----------------------------+------+-----------------------------+------+-----------------------------+------+-----------------------------+
| 7996 | wwww.stackoverflow.com/8000 | 1996 | wwww.stackoverflow.com/8000 | 1996 | wwww.stackoverflow.com/8000 | 4996 | wwww.stackoverflow.com/8000 | 5996 | wwww.stackoverflow.com/8000 |
+------+-----------------------------+------+-----------------------------+------+-----------------------------+------+-----------------------------+------+-----------------------------+

An EXPLAIN SELECT shows why the query is very fast:

EXPLAIN SELECT * 
FROM table1,table2,table3,table4,table5
WHERE table1.url = table2.url
AND table1.url = table3.url
AND table1.url = table4.url
AND table1.url = table5.url
AND table1.url = 'wwww.stackoverflow.com/8000';

+----+-------------+--------+-------+---------------+------+---------+-------+------+-------------+
| id | select_type | table  | type  | possible_keys | key  | key_len | ref   | rows | Extra       |
+----+-------------+--------+-------+---------------+------+---------+-------+------+-------------+
|  1 | SIMPLE      | table1 | const | url           | url  | 258     | const |    1 | Using index |
|  1 | SIMPLE      | table2 | const | url           | url  | 258     | const |    1 | Using index |
|  1 | SIMPLE      | table3 | const | url           | url  | 258     | const |    1 | Using index |
|  1 | SIMPLE      | table4 | const | url           | url  | 258     | const |    1 | Using index |
|  1 | SIMPLE      | table5 | const | url           | url  | 258     | const |    1 | Using index |
+----+-------------+--------+-------+---------------+------+---------+-------+------+-------------+

select_type is SIMPLE, which means that there are no JOIN statements to slow things down.

type is const, which means that the table has at most one possible match - this is thanks to the UNIQUE index, which guarantees no two URLs will be the same (see mysql 5.0 indexes - Unique vs Non Unique for a good description of UNIQUE INDEX). A const value in the type column is about as good as you can get.

possible_keys and key use the url key. That means that the correct index is being used for each table.

ref is const, which means that MySQL is comparing a constant value (one that does not change) with the index. Again, this is very fast.

rows equals 1. MySQL only needs to look at one row from each table. Once again, this is very fast.

Extra is Using index. MySQL does not have to do any additional non-indexed searches of the tables.

Provided you have an index on the url column of each table, your query should be extremely fast.

Mike
I am getting this errorErrorSQL query:CREATE PROCEDURE test.autofill() BEGIN DECLARE i INT DEFAULT 5;MySQL said: Documentation#1064 - You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near '' at line 3
mathew
hey done....I got exact result what you shown...is this means everything ok?? well then I need to change the whole structure of my table....but if it is fast then I am very happy
mathew
@mathew: Error 1064 refers to the use of a [reserved word](http://dev.mysql.com/doc/refman/5.0/en/reserved-words.html). What version of MySQL are you using? Did you copy and paste the function as-is, complete with `DELIMITER` statements?
Mike
Thank you very much for getting me this test and updating me with little SQL knowledge..and I am current ly using MyISAM so do I create a new table and populate it??if so how do I do it??
mathew
I will create new table but how do I populate it with data on the other table
mathew
@mathew: It doesn't mean that you need to change the whole structure of your tables. It just means that there's something not quite right with them. If you edit your question, and add the output from `SHOW CREATE TABLE <table name>` (for each table), someone might be able to spot the problem.
Mike