views:

78

answers:

3

Hi, I work at a fairly big website; we have ~400-500 million page views a month. We use PHP and MySQL.

Currently our page flow works like this (semi pseudo code for clarity):

mysql_connect();
mysql_select_db('red');
mysql_query('SELECT * FROM apples');
mysql_query('SELECT * FROM cakes');

One of my co-workers suggested that mysql_select_db slows down page loads and increases the load of the DB server, and suggested to change our "flow" to be like:

mysql_connect();
mysql_query('SELECT * FROM red.apples');
mysql_query('SELECT * FROM red.cakes');

Is it true that it will make a difference? I'm especially concerned about the time it will take to make this change; will we see any noticeable results?

Is it worth it?

+6  A: 

Benchmark:

$ cat 1.php 
<?php
mysql_connect('localhost') or die(mysql_error());
mysql_select_db('test') or die(mysql_error());
mysql_query('SELECT SQL_NO_CACHE * FROM foo') or die(mysql_error());
mysql_query('SELECT SQL_NO_CACHE * FROM bar') or die(mysql_error());
?>
$ cat 2.php 
<?php
mysql_connect('localhost') or die(mysql_error());
mysql_query('SELECT SQL_NO_CACHE * FROM test.foo') or die(mysql_error());
mysql_query('SELECT SQL_NO_CACHE * FROM test.bar') or die(mysql_error());
?>
$ time ( for i in {1..100} ; do php 1.php; done;)

real    0m3.554s
user    0m2.300s
sys     0m1.188s
$ time ( for i in {1..100} ; do php 2.php; done;)

real    0m3.555s
user    0m2.292s
sys     0m1.208s

I call bullshit.

Is your coworker perhaps confused with the mysql command-line client, which will load all table & columnnames if you switch to a database (which can be avoided with the -A switch)?

Wrikken
thank you for your answer =)
anon
A: 

i don't know for sure about your question but i doubt it would make significant difference.

but there's other things that can make. i suggest you write all the column names instead of using *. this i know for sure that will speed up your queries.

ex:

mysql_query('SELECT id, size, color FROM apples');

other thing you can do is to use LIMIT correctly. for instance, if you are selecting an user from database, and you KNOW for sure that it's unique, use LIMIT 1 at the end of the query.

ex:

SELECT id, username, access_level FROM users WHERE id = ? LIMIT 1
hugo_leonardo
I can't imagine this would ever make enough of a difference to be worthwhile, especially if you have more than three or four columns.
bnaul
Not from a performance point of view perhaps, but certainly with bugtracking (i.e.: table changes fail much more transparantly). Giving you that much more time to work on better causes :P
Wrikken
i ran some tests over big loops and the difference appeared to be significant. like half the waiting time. maybe i'm not testing it right. how should i ran the tests?
hugo_leonardo
@hugo_leonardo: at preferable an idle server, with query_cache disabled, etc. fwiw: if I run the test while selecting _all_ columns as in my other question, `select *` is about 2% slower, not really that much gain. When you can avoid BLOB / TEXT columns when you don't need them the gain could however be significant.
Wrikken
A: 

I doubt very much it would make a difference performance wise if you add the database name.

Persistent connections and refactoring your queries to never, EVER select * would be a good first step, I think. Then you might want to think about using query caching, and having a look at your slow query log. Those are going to help you more than some small semantic difference like specifying the schema name ever could.

Anthony Levensalor