views:

1387

answers:

3

Apologies if this is a silly question, could use an opinion:

I've got a messages table in MySQL which records messages between users. Apart from the typical ids and message types (all integer types) I need to save the actual message text as either VARCHAR or TEXT. I'm setting a front-end limit of 3000 characters which means the messages would never be inserted into the db as longer than this.

Is there a rationale for going with either VARCHAR(3000) or TEXT? There's something about just writing VARCHAR(3000) that feels somewhat counter-intuitive. I've been through other similar posts on Stack Overflow but would be good to get views specific to this type of common message storing.

Any help would be appreciated. Thanks.

+6  A: 

TEXT and BLOB is stored off the table with the table just having a pointer to the location of the actual storage.

VARCHAR is stored inline with the table. VARCHAR is faster when the size is reasonable, the tradeoff of which would be faster depends upon your data and your hardware, you'd want to benchmark a realworld senerio with your data.

MindStalker
+1: Good explanation.
James
+1  A: 

Disclaimer: I'm not a MySQL expert ... but this is my understanding of the issues.

I think TEXT is stored outside the mysql row, while I think VARCHAR is stored as part of the row. There is a maximum row length for mysql rows .. so you can limit how much other data you can store in a row by using the VARCHAR.

Also due to VARCHAR forming part of the row, I suspect that queries looking at that field will be slightly faster than those using a TEXT chunk.

Michael Anderson
The row length limit is 65,535 bytes [ http://dev.mysql.com/doc/refman/5.0/en/column-count-limit.html ]. If your column is utf8-encoded, that means a 3000-character `varchar` column can take up to 9000 bytes.
Jan Fabry
Thanks for the utf8 clarification.
Tom
+2  A: 

Just to clarify the best practice:

1) Text format messages should almost always be stored as TEXT (they end up being arbitrarily long)

2) String attributes should be stored as VARCHAR (the destination user name, the subject, etc...).

I understand that you've got a front end limit, which is great until it isn't. grin The trick is to think of the DB as separate from the applications that connect to it. Just because one application puts a limit on the data, doesn't mean that the data is intrinsically limited.

What is it about the messages themselves that forces them to never be more then 3000 characters? If it's just an arbitrary application constraint (say, for a text box or something), use a TEXT field at the data layer.

James
Thanks, that's very useful.
Tom