tags:
views: 90
answers: 6

I am using MySQL (and PHP).

It's easiest if I give an example:
I make a query, say SELECT * FROM a_table WHERE type='visible', then return the count. Later I query again, but WHERE type='hidden', and so on. I may query again to get the full contents. All the queries are unique, but they hit the same table. I also query several other tables with unique queries (plus the execution time of mysql_real_escape_string() sprinkled through the queries).

I'm not sure how many queries I perform altogether, but I am worried that as I keep adding queries, execution times will start to get large.

My question is: does a large number of queries have a noticeable impact on performance? If so, is there a better way to get at that data? Would performing one initial query and storing all the rows in an array (a large one could be thousands of rows), then manipulating the array, be faster (and/or feasible)? At how many queries does this gray area become black and white?
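For what it's worth, the "one query, then manipulate in PHP" approach I'm imagining would look roughly like this (just a sketch using the old mysql_* functions I'm on; assumes an open connection, and the table/column names are from my example above):

```php
<?php
// One query instead of one COUNT(*) query per type.
$result = mysql_query("SELECT `type` FROM `a_table`");

// Tally the rows per type in a PHP array.
$counts = array();
while ($row = mysql_fetch_assoc($result)) {
    $type = $row['type'];
    $counts[$type] = isset($counts[$type]) ? $counts[$type] + 1 : 1;
}

// $counts['visible'], $counts['hidden'], etc. are now available
// without any further round trips to the database.
?>
```

The obvious trade-off is that this pulls every row over the wire just to count them, which is where my "is this even feasible" question comes from.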


Results from 'optimizing my queries'.

These are execution times in seconds:

  • No optimization: .358 s
  • Return COUNT(*): .321 s
  • Reducing my column selection: .266 s

The biggest win, though, was not opening a connection for every query. I was idiotically opening and closing a DB connection on each poll. By changing to one continuous connection, I got execution time down to .085 s!
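In other words, the fix amounted to something like this (a sketch with the mysql_* API; host, credentials, and database name are placeholders):

```php
<?php
// Before: every query helper did its own mysql_connect()/mysql_close().
// After: open one connection up front and reuse it everywhere.
$link = mysql_connect('localhost', 'user', 'pass');
mysql_select_db('my_database', $link);

$visible = mysql_result(mysql_query(
    "SELECT COUNT(*) FROM `a_table` WHERE `type`='visible'", $link), 0);
$hidden  = mysql_result(mysql_query(
    "SELECT COUNT(*) FROM `a_table` WHERE `type`='hidden'", $link), 0);

// ... every later query reuses $link ...

mysql_close($link); // once, at the end of the script
?>
```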

Lesson learned. Thanks for all the input!

+4  A: 

If you did this all in one query, PHP would have to do the rest of the work. So the question is where the bottleneck is. Is the DB server machine slower than the webserver machine? What about the network connection between them? The best way to optimize for your conditions is, I'd guess, to experiment.

joni
Generally speaking, which one is slower? Does it take more time to parse through queries or to seek through arrays?
Jason
+2  A: 

The fewer the queries, the better. Of those, only get the columns you actually need.
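For example (the column names here are made up):

```sql
-- wasteful: pulls every column, including any BLOBs
SELECT * FROM a_table WHERE type = 'visible';

-- better: only the columns the page actually uses
SELECT id, title FROM a_table WHERE type = 'visible';

-- best, if all you need is the count -- let the server do it
SELECT COUNT(*) FROM a_table WHERE type = 'visible';
```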

OMG Ponies
Do you happen to know the difference between the SQL server regurgitating one column as opposed to all of them? Seems like it would be all the same to it: it either has to find the column (in the row) you're asking for, or spit out all the columns.
Jason
@Jason: Each data type has a byte value associated with it -- the more columns, the more bytes over the wire. Worse, binary data (BLOB, CLOB) will *really* inflate that...
OMG Ponies
@OMG - true true, I didn't consider it actually transmitting that data (duh!), just seek-and-find. I only have one BLOB in my tables, but most of my columns are 100 bytes.
Jason
@Jason: Yea, it's less of a concern if the database is on the same box as the rest of the application, which isn't common in medium to large scale applications.
OMG Ponies
@OMG - at large scale, are they communicating over TCP/IP? I don't know much about the IT side of it.
Jason
@Jason: Yes, pretty much everything goes over TCP/IP. Some databases support other protocols or have their own custom ones, so it depends on the setup.
OMG Ponies
@OMG - Now the question is: if I do a query for multiple rows, does it send one row per packet (assuming a row is less than 2k), or does it fit as many rows as it can into a packet? I know this is well beyond the original optimization question; just curious.
Jason
@Jason there could be a difference between selecting one column vs. all columns, depending on whether the queried table(s) have a covering index. A covering index includes additional columns beyond the indexed columns. I'm not a MySQL user, but I googled "MySQL covering index" and got some results. I looked at the first couple and they looked useful, but I'll let you google and pick what looks good, since I'm no MySQL expert. Also, the size of each row might make a difference -- and its potential future size too, since someone might come along and add some columns with large datatypes.
Adam Porad
@Jason: Depends on the lowest Maximum Transmission Unit (MTU) setting on the interfaces between the database and the destination, but there's still overhead (and additional overhead if piping through a VPN).
OMG Ponies
@Adam Porad: An index, covering or otherwise, can help get the data out of the database faster, but it has nothing to do with speed over the wire. Having an index doesn't ensure it will be used -- MySQL generally uses only one index per table per statement, which you can check via the EXPLAIN plan.
OMG Ponies
@OMG Ponies - in @Jason's comment he asked about "regurgitating one column as opposed to all of them". The thread started discussing the network performance of selecting all columns without covering other possible performance considerations, which is why I mentioned the index.
Adam Porad
A: 

Definitely, large queries and multiple queries are a performance hit, since each DB query involves a connect, a query, and a disconnect.

For a performance improvement, make sure you have the right flags in place in my.cnf; there are variables for the query cache that will keep you from hitting the database when you run the same query again and again.

--Cheers

Koteswara sarma
db usually only connects once per script
knittl
@knittl - haha, actually I was called out in an earlier answer because I really was opening and closing a connection per query. I was trying to make it easier for developers later on: I was afraid a developer would open a connection with the same name. My excuse is that I've had no formal teacher. I am a hobbyist and am teaching/taught myself!
Jason
+3  A: 

Don't pre-optimize. Obviously the fewer queries, the less overhead in running them, but this overhead is usually very small. The number of queries is not anywhere near as big a factor as properly indexing the tables and optimizing the actual queries themselves.

Generally, each query has to send some data over the network to the listening port of the database process, which parses, compiles, and runs the query. Usually, query plan caching will ensure that a minimal amount of time is spent doing this. As long as your network isn't bottlenecking anything, the RPC overhead is quite low, on the order of a few milliseconds or less.

Really, it will be dependent on the queries themselves and the situation you're in.

womp
+2  A: 

If you want to count all your distinct types at once, use the following query:

  SELECT `type`, COUNT(*) AS `count`
    FROM `a_table`
GROUP BY `type`
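Reading the result back in PHP might look like this (a sketch using the mysql_* functions the question implies; assumes an open connection, and the example values in the comment are purely illustrative):

```php
<?php
// One round trip returns every type's count at once.
$result = mysql_query(
    "SELECT `type`, COUNT(*) AS `count` FROM `a_table` GROUP BY `type`");

$counts = array();
while ($row = mysql_fetch_assoc($result)) {
    $counts[$row['type']] = (int) $row['count'];
}

// $counts is now e.g. array('visible' => ..., 'hidden' => ...)
?>
```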
knittl
+2  A: 

I agree with the other users. The fewer queries, the better.

On the other hand, consider this. Nowadays, the average CPU speed is more than 1 GHz, which is about 1,000,000,000 clock cycles per second, and in each clock cycle it can get multiple things done. For example, an ALU (Arithmetic Logic Unit) can perform more than one addition per clock cycle.

Taking this into account, having more than one query is not such a big deal. It would take a couple more milliseconds (at most) to process an extra one, two, or three queries. The user will probably not even notice.

A warning, though: if you have something like 100 extra queries, that probably will cause some noticeable delays. So the idea is to do things in moderation.

miki725
100 probably isn't even enough. I recently consolidated 2000 queries into one query (on SQL Server 2008) and the RPC overhead savings was less than 500 ms. The savings in reducing the extraneous logical reads and getting all the data in bulk was very large though.
womp
I'd be much more impressed if you knew how many clock cycles an average query takes! ;) In the grand scheme of things, a millisecond is pretty slow, considering a clock cycle is on the order of nanoseconds (given your aforementioned 1 GHz). In my early days of engineering, when it came to microprocessor programming, hundreds of milliseconds were a significant speed hit that I could not afford unless it was important (with a clock around 8 MHz). Proportionality says the same (to me).
Jason