My question is in regards to MySQL, but I also wonder how this affects other databases. I have several fields that are varchar(255) but my coworker insists if they were varchar(30) -- or any smaller size -- then queries would run faster. I'm not so sure, but if it's so I'll admit to it. :) Thanks.
If you're only ever using the first 30 characters, then there won't be a difference between a varchar(30) and a varchar(255) (although there would be a difference with varchar(1000), which would take an extra byte).
Of course, if you end up using more than 30 characters, it will be slower as you have more data to pass to the client, and your indexes will be larger.
It depends on the query and the data, but you're probably optimizing too soon to even be worried.
For SELECT queries, the statement itself will run just as fast within MySQL, and as long as the stored data is no larger than it would be in the smaller field, it will transmit just as fast. If the smaller field forces you to store the information in less space (would you use the extra 225 chars?), then you will get faster transmission to other programs.
For INSERT queries the size of the field isn't an issue, but using variable length fields will slow the process down. INSERTs with fixed length rows are notably faster (at least in MySQL 5.0 and earlier).
Generally, use the size you need for the data. If you don't know if you need 255 chars or 30 chars you're probably optimizing too soon. Are large data fields causing a bottleneck? Is your program suffering from database performance problems at all? Find your bottlenecks first, then solve the problems they cause. I'd guess the difference in time you're looking at here is unimportant to whatever problem you are trying to solve.
Anything up to VARCHAR(255) (in a single-byte character set) will use one byte to store its size, so VARCHAR(30) and VARCHAR(255) won't make a difference.
But check whether your data is consistent, that is, always the same size. In that case a CHAR would be more useful, because you won't spend time on size information and finding the data becomes simpler (leaving indexes aside here).
Even if your data isn't perfectly consistent but varies by, say, one byte, a CHAR would be better, because you would spend one byte on size information anyway.
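The length-prefix arithmetic above can be sketched in a few lines of Python. This is only a rough model, not MySQL code: the function names are made up for illustration, and it assumes a single-byte character set and ignores per-row overhead that the storage engine adds.

```python
def varchar_storage_bytes(value, max_chars, max_bytes_per_char=1, encoding="latin-1"):
    """Bytes used by one VARCHAR value: length prefix plus the actual data.

    Hypothetical helper for illustration; assumes a single-byte charset by default.
    """
    # The prefix is 1 byte if the column can never need more than 255 bytes, else 2.
    max_column_bytes = max_chars * max_bytes_per_char
    prefix = 1 if max_column_bytes <= 255 else 2
    return prefix + len(value.encode(encoding))


def char_storage_bytes(max_chars, bytes_per_char=1):
    """Bytes used by one fixed-width CHAR value, padded to the declared length."""
    return max_chars * bytes_per_char


name = "hello"
print(varchar_storage_bytes(name, 30))    # 6
print(varchar_storage_bytes(name, 255))   # 6
print(varchar_storage_bytes(name, 1000))  # 7
print(char_storage_bytes(30))             # 30
```

Under these assumptions the same five-character value costs the same in VARCHAR(30) and VARCHAR(255), one byte more in VARCHAR(1000), and a CHAR(30) always pays for the full declared width.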
Very rarely will column width affect query performance. Certainly if you're using larger objects (BLOBs, LONGBLOBs, TEXTs, LONGTEXTs), there is the potential for a lot of data to get pulled. That could possibly affect performance, but it won't necessarily. Otherwise, column width really only affects storage. If you care about storage size by data type, you can reference http://dev.mysql.com/doc/refman/5.0/en/storage-requirements.html to see the details.
And to reiterate: storage size of data does not necessarily impact the speed of queries. There are many other design considerations that will impact query speed: design of the tables and relationships, key structure, indexes, query and join architecture, etc.
A few years ago many people suggested using tinytext instead of varchar in MySQL for performance, since row by row search was supposedly faster with constant row data size. Surely MySQL's query, storage and index handling algorithms evolved since then and it may not have that much of an impact now.
But you're probably optimizing too soon and shouldn't be worried about performance at this level.
Since you asked about other databases…
It ABSOLUTELY does affect query time.
In Oracle when data is moved from Server to Client, it’s done through a buffer. Nothing revolutionary there. The number of rows it puts in that buffer is based on the maximum row size. Say your query returns 4 columns of varchars. If the size of the columns is 100 and it should be 10, Oracle will fit 10x fewer rows in each fetch than it otherwise could with right-sized column definitions. This results in blocks being re-read unnecessarily. It forces more network traffic, more round trips.
In Oracle (from SQL*Plus) you can change the size of the buffer with SET ARRAYSIZE. Try it sometime: do a query with one size and then do it again with 10% of the space. You’ll see reads go up, network trips go up, and performance go down. Making columns way too big is just like making that buffer way too small.
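To put numbers on the buffer argument, here is a toy model in Python. It is only a sketch of the claim as stated in this answer (rows per fetch are sized by the declared maximum row width), not Oracle's actual fetch algorithm; the 4000-byte buffer and the function names are assumptions chosen for illustration.

```python
import math


def rows_per_fetch(buffer_bytes, declared_widths):
    """Rows that fit in one fetch if each row is sized by its declared maximum width."""
    max_row_bytes = sum(declared_widths)
    return max(1, buffer_bytes // max_row_bytes)


def round_trips(total_rows, buffer_bytes, declared_widths):
    """Number of fetches needed to return all rows under the toy model."""
    return math.ceil(total_rows / rows_per_fetch(buffer_bytes, declared_widths))


BUFFER_BYTES = 4000        # hypothetical buffer size
oversized = [100] * 4      # four columns declared as 100 wide
right_sized = [10] * 4     # the same four columns declared as 10 wide

print(rows_per_fetch(BUFFER_BYTES, oversized))       # 10
print(rows_per_fetch(BUFFER_BYTES, right_sized))     # 100
print(round_trips(10_000, BUFFER_BYTES, oversized))    # 1000
print(round_trips(10_000, BUFFER_BYTES, right_sized))  # 100
```

In this model, declaring the columns ten times wider than needed means ten times as many round trips for the same result set.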
But the real reason for accurately sized columns is data integrity. You keep bad stuff out. That’s just as important as performance.
Remember:
- It’s never too early to design for performance
- 99% of what you say you'll come back to, you won’t
- It’s far easier, better, and cheaper to get something right the first time.
Most other answers here are focused on the fact that VARCHAR is stored in a variable-length manner, so it stores the number of bytes of the string you enter on a given row, not the maximum length of the field.
But during queries, there are some circumstances where MySQL converts a VARCHAR into a CHAR -- and hence the size goes up to the maximum length. This happens, for instance, when MySQL creates a temporary table during some JOIN or ORDER BY or GROUP BY operations.
Telling all the cases where it would do this is complicated, because it depends on how the optimizer treats the query, it depends on other table structure and indexes you define, it depends on the type of query, and it even depends on the version of MySQL because the optimizer is improved with each version.
The short answer is yes, it can make a difference whether you use VARCHAR(255) or VARCHAR(30). So define the column maximum length according to what you need, not a "big" length like 255 for the sake of tradition.
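As a rough illustration of the temporary-table case described in this answer, the sketch below estimates how much space one column takes when every value is padded to its declared maximum. It assumes a single-byte character set and counts only that one column; the function name and row count are hypothetical, and whether MySQL actually pads in a given query depends on the version and the optimizer, as noted above.

```python
def temp_table_bytes(row_count, declared_chars, bytes_per_char=1):
    """Bytes for one column if every value is padded to the declared length."""
    return row_count * declared_chars * bytes_per_char


ROWS = 1_000_000  # hypothetical number of rows in the temporary table

wide = temp_table_bytes(ROWS, 255)
narrow = temp_table_bytes(ROWS, 30)

print(wide)           # 255000000
print(narrow)         # 30000000
print(wide / narrow)  # 8.5
```

Even if no value is longer than 30 characters, the padded VARCHAR(255) column would take 8.5 times the space of the VARCHAR(30) one in this estimate.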