Basically, for a plugin on a dynamic site (the site can be fairly large), I am caching the results of a search against an external service; each cached result is 400-1500 characters long.

The results come back as an array, so I use json_encode (faster than serialize) to store them in the database. But at roughly 1.5 KB per entry, and with potentially 10,000 entries, that's about 15 MB, which seems a little large to me.
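For illustration, here is a minimal sketch of that kind of cache, assuming a PDO connection and a hypothetical `search_cache` table; the table, column, and function names are made up for the example:

```php
<?php
// Hypothetical cache table:
//   search_cache(query_hash CHAR(32) PRIMARY KEY, payload TEXT, created_at DATETIME)

function cache_results(PDO $db, $query, array $results) {
    // Encode the result array once and upsert it under a hash of the query.
    $stmt = $db->prepare(
        'REPLACE INTO search_cache (query_hash, payload, created_at)
         VALUES (?, ?, NOW())'
    );
    $stmt->execute(array(md5($query), json_encode($results)));
}

function fetch_cached(PDO $db, $query) {
    $stmt = $db->prepare('SELECT payload FROM search_cache WHERE query_hash = ?');
    $stmt->execute(array(md5($query)));
    $payload = $stmt->fetchColumn();
    // Decode back to the original array, or return null on a cache miss.
    return $payload === false ? null : json_decode($payload, true);
}
```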

My questions are:
* Is this (in your opinion) an acceptable size per entry?
* Would gzipping the data and storing it in a binary column in MySQL be more efficient, or would it cost too much CPU time in the end? Is there a standard approach for this? (A sketch follows this list.)
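For the second question, one way to try it is to gzcompress the JSON before writing and gzuncompress on read, with the `payload` column switched to a BLOB. This is only a sketch under the same hypothetical table as above; the compression level of 6 is an arbitrary middle ground:

```php
<?php
// Same hypothetical search_cache table as above, but with payload as a BLOB.

function cache_results_compressed(PDO $db, $query, array $results) {
    // Level 6 is an arbitrary middle ground between speed (1) and size (9).
    $blob = gzcompress(json_encode($results), 6);
    $stmt = $db->prepare(
        'REPLACE INTO search_cache (query_hash, payload, created_at)
         VALUES (?, ?, NOW())'
    );
    $stmt->execute(array(md5($query), $blob));
}

function fetch_cached_compressed(PDO $db, $query) {
    $stmt = $db->prepare('SELECT payload FROM search_cache WHERE query_hash = ?');
    $stmt->execute(array(md5($query)));
    $blob = $stmt->fetchColumn();
    // Decompress, then decode; null on a cache miss.
    return $blob === false ? null : json_decode(gzuncompress($blob), true);
}
```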

I'd prefer not to use memcached or the like because this needs to stay portable (though would that be the better option?). This is mostly a theory question for me; I just want some input before I implement anything solid.

A: 

For me it would be better to cache files on disk, not in the database. You should always assume that your database is in China and your webserver is in the USA, so you would have to travel around the globe to get the data. Accessing files on your webserver's own filesystem is more convenient, but it also requires good drives and good load balancing so you don't kill them.
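A minimal sketch of what that on-disk approach might look like, with a hypothetical cache directory and a simple mtime-based expiry (paths and TTL are assumptions for the example):

```php
<?php
// One JSON file per query, keyed by an md5 of the query string.
define('CACHE_DIR', '/var/cache/myplugin'); // hypothetical, must be writable
define('CACHE_TTL', 3600);                  // seconds before an entry is stale

function file_cache_put($query, array $results) {
    $path = CACHE_DIR . '/' . md5($query) . '.json';
    file_put_contents($path, json_encode($results), LOCK_EX);
}

function file_cache_get($query) {
    $path = CACHE_DIR . '/' . md5($query) . '.json';
    if (!is_file($path) || filemtime($path) < time() - CACHE_TTL) {
        return null; // missing or stale entry
    }
    return json_decode(file_get_contents($path), true);
}
```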

Tomasz Kowalczyk
Even though I also think that storing big chunks of data is better done on the filesystem (as long as you only need lookups and no complex searching, sorting, joining, and so on), I think it is a really bad idea to assume that your database server and your webserver are far apart. That will practically never happen, and designing around it means you never use the performance benefits of a database (which are BIG!). -1
nikic
I meant designing as if that were the case, not actually doing it. Use the database for data that is "databaseable" and the filesystem for file management. Please don't downvote just because you have other thoughts; we are both still correct.
Tomasz Kowalczyk
Upvoted for some sensible thinking, but I don't want to use a flat-file system for this. It would be somewhat of a nuisance for the end user given how heavily it would be used, and nikic is right about the sheer range of things the queries can do.
John
By that logic you'd also have to assume that the filesystem is mounted via NFS over DTN to the moon. And the solution would be to make the deployment extra simple so you can move away from a host with that sub-optimal setup ;-)
VolkerK
+2  A: 

There is always a CPU cost for any kind of compression; it comes down to whether you have the resources to handle it without any noticeable slowdown. Space is cheap and abundant, so 15 MB is fine.

But if you really want to compress the field, check out MySQL's COMPRESS() and UNCOMPRESS() functions.

These could be dropped into your queries and would work without changing any of your PHP logic.
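As a sketch of that drop-in idea, assuming the same kind of hypothetical `search_cache` table with a BLOB `payload` column, the compression can be pushed into the queries so the PHP side keeps handling plain JSON strings:

```php
<?php
// MySQL compresses on the way in and decompresses on the way out;
// PHP still only sees plain JSON.

function cache_results_mysql_compress(PDO $db, $query, array $results) {
    $stmt = $db->prepare(
        'REPLACE INTO search_cache (query_hash, payload, created_at)
         VALUES (?, COMPRESS(?), NOW())'
    );
    $stmt->execute(array(md5($query), json_encode($results)));
}

function fetch_cached_mysql_compress(PDO $db, $query) {
    $stmt = $db->prepare(
        'SELECT UNCOMPRESS(payload) FROM search_cache WHERE query_hash = ?'
    );
    $stmt->execute(array(md5($query)));
    $payload = $stmt->fetchColumn();
    return $payload === false ? null : json_decode($payload, true);
}
```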

Ollie
Ah. I read that MySQL's `COMPRESS()` is slower than PHP's gzcompress, so I benchmarked gzcompress. 40,000 iterations of `gzcompress(700kbstr, 1)` took 1.2 seconds and the result was 394 KB; `gzcompress(700kbstr, 9)` took 2.5 seconds and the result was 388 KB. I'm sure using `UNHEX` into a `BINARY` field will make it smaller; I shall test. I'll accept this one because it looks the most promising. __EDIT:__ I misworded that, but you know what I mean.
John
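A rough sketch of that kind of benchmark, with a hypothetical sample file standing in for the real payload; the absolute numbers will of course differ per machine:

```php
<?php
// Compare gzcompress at levels 1 and 9 over many iterations.
$payload = file_get_contents('sample_results.json'); // hypothetical sample data

foreach (array(1, 9) as $level) {
    $start = microtime(true);
    for ($i = 0; $i < 40000; $i++) {
        $compressed = gzcompress($payload, $level);
    }
    printf(
        "level %d: %.2fs, %d bytes -> %d bytes\n",
        $level,
        microtime(true) - $start,
        strlen($payload),
        strlen($compressed)
    );
}
```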