I need to store a multi-dimensional associative array of data in a flat file for caching purposes. I might occasionally come across the need to convert it to JSON for use in my web app but the vast majority of the time I will be using the array directly in PHP.

Would it be more efficient to store the array as JSON or as a PHP serialized array in this text file? I've looked around and it seems that in the newest versions of PHP (5.3), json_decode is actually faster than unserialize.

I'm currently leaning towards storing the array as JSON because I feel it's easier for a human to read if necessary, it can be used in both PHP and JavaScript with very little effort, and from what I've read, it might even be faster to decode (not sure about encoding, though).

Does anyone know of any pitfalls? Anyone have good benchmarks to show the performance benefits of either method?

Thanks in advance for any assistance.

+17  A: 

JSON is simpler and faster than PHP's serialization format and should be used unless:

  • You're storing deeply nested arrays: json_decode(): "This function will return false if the JSON encoded data is deeper than 127 elements."
  • You're storing objects that need to be unserialized as the correct class
  • You're interacting with old PHP versions that don't support json_decode
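To make the first two exceptions concrete, here is a minimal sketch. `CacheEntry` is a hypothetical class invented for the illustration, and the depth limit of 2 is chosen artificially low to trigger the failure. On current PHP versions the default limit is 512 and a failed decode returns null rather than false.

```php
<?php
// Hypothetical class used only for this illustration.
class CacheEntry
{
    public $key = 'user:42';
    public $ttl = 300;
}

$entry = new CacheEntry();

// serialize() records the class name, so the object comes back as CacheEntry.
$fromSerialize = unserialize(serialize($entry));

// JSON carries only the data, so the object comes back as stdClass.
$fromJson = json_decode(json_encode($entry));

echo get_class($fromSerialize), "\n"; // CacheEntry
echo get_class($fromJson), "\n";      // stdClass

// json_decode() accepts a depth limit as its third argument (PHP 5.3+).
// Three levels of nesting exceed a limit of 2, so decoding fails.
$tooDeep = json_decode('[[[1]]]', true, 2);
var_dump($tooDeep); // NULL
```

If you only store plain nested arrays a few levels deep, neither limitation should matter.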
Greg
Great answer. Haha, 127 levels deep seems a bit insane; thankfully I'm only going like 2-3 at the most. Do you have any data to back up the fact that json_decode/json_encode is faster than unserialize/serialize?
KyleFarris
I did test it a while ago and json came out faster - I don't have the data any more though.
Greg
@Kyle - I added a speed test to my answer. On my server, json_encode() is averaging about 100% faster than serialize()
Peter Bailey
+28  A: 

Depends on your priorities.

If performance is your absolute driving characteristic, then by all means use the fastest one. Just make sure you have a full understanding of the differences before you make a choice:

  • json_encode() converts non-ASCII UTF-8 characters to Unicode escape sequences by default. serialize() does not.
  • JSON will have no memory of what the object's original class was (they are always restored as instances of stdClass).
  • You can't leverage __sleep() and __wakeup() with JSON
  • Only public properties are serialized with JSON
  • JSON is more portable

And there's probably a few other differences I can't think of at the moment.
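A small sketch of the escaping and public-properties differences, assuming a hypothetical `Account` class made up for this example. The name is written with explicit UTF-8 bytes so the result does not depend on the file's encoding.

```php
<?php
// Hypothetical class for illustration only.
class Account
{
    public $name = "Zo\xC3\xAB";   // "Zoë" as UTF-8 bytes
    protected $token = 'secret';
    private $attempts = 3;
}

$account = new Account();

// Only the public property is encoded, and the non-ASCII character is escaped.
$json = json_encode($account);
echo $json, "\n"; // {"name":"Zo\u00eb"}

// serialize() keeps the raw bytes and all three properties,
// so the restored object compares equal to the original.
$restored = unserialize(serialize($account));
var_dump($restored == $account); // bool(true)
```

Since PHP 5.4 the escaping can be turned off by passing the JSON_UNESCAPED_UNICODE flag to json_encode().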

EDIT

A simple speed test to compare the two

<?php

ini_set( 'display_errors', 1 );
error_reporting( E_ALL );

//  Make a big, honkin test array
//  You may need to adjust this depth to avoid memory limit errors
$testArray = fillArray( 0, 5 );

//  Time json encoding
$start = microtime( true );
json_encode( $testArray );
$jsonTime = microtime( true ) - $start;
echo "JSON encoded in $jsonTime seconds<br>";

//  Time serialization
$start = microtime( true );
serialize( $testArray );
$serializeTime = microtime( true ) - $start;
echo "PHP serialized in $serializeTime seconds<br>";

//  Compare them
if ( $jsonTime < $serializeTime )
{
    echo "json_encode() was roughly " . number_format( ($serializeTime / $jsonTime - 1 ) * 100, 2 ) . "% faster than serialize()";
}
else if ( $serializeTime < $jsonTime )
{
    echo "serialize() was roughly " . number_format( ($jsonTime / $serializeTime - 1 ) * 100, 2 ) . "% faster than json_encode()";
} else {
    echo 'Unpossible!';
}

//  Recursively build a nested array: 10 keys per level, $max levels deep
function fillArray( $depth, $max )
{
    static $seed;
    if ( is_null( $seed ) )
    {
        $seed = array( 'a', 2, 'c', 4, 'e', 6, 'g', 8, 'i', 10 );
    }
    if ( $depth < $max )
    {
        $node = array();
        foreach ( $seed as $key )
        {
            $node[$key] = fillArray( $depth + 1, $max );
        }
        return $node;
    }
    return 'empty';
}
Peter Bailey
You make some great points. Fortunately, for my case, I'm storing simple arrays (of other arrays, ints, bools, and strings) no objects. If I were storing objects, IMO, serialize would be the way to go.
KyleFarris
Excellent work dude. This will benefit everyone. I ran the test about 30 times and json_encode won every single time with around 100% (average) performance increase over serialize. I added json_decode and unserialize tests and json_decode won every time in about 10 tests with an average performance benefit of ~20% over unserialize. Thanks for this.
KyleFarris
Too bad json_encode/json_decode is php 5.2 and above only.
dreftymac
useless tests and useless caching
Col. Shrapnel
@Col. Shrapnel - Useless? Care to expand on that?
Peter Bailey
Imagine it is a real case. Would you use such a silly cache instead of caching HTML, using HTTP caching, or using APC/memcache? And these stupid microtime tests again. Why not profile your app first and see whether such a task even needs any optimization?
Col. Shrapnel
@Col. Shrapnel. So you downvoted *me* because you disagree with the merits of the question? Thanks.
Peter Bailey
Not to mention that not every PHP developer has the access or even technical know-how to use a profiler. And caching large data structures into native PHP isn't exactly a novel concept - quite a few frameworks do it out of the box.
Peter Bailey
+4  A: 

If you are caching information that you will ultimately want to "include" at a later point in time, you may want to try using var_export. That way you only take the hit in the "serialize" and not in the "unserialize".
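A minimal sketch of that approach, assuming a hypothetical cache file in the system temp directory. It only suits plain arrays and scalars, and only files you fully control, since the cache is executed as PHP.

```php
<?php
$data = array(
    'site'  => array('name' => 'example', 'debug' => false),
    'pages' => array(1, 2, 3),
);

// Hypothetical cache location for this illustration.
$cacheFile = sys_get_temp_dir() . '/cache_example.php';

// Write the array as PHP source that returns it.
file_put_contents($cacheFile, '<?php return ' . var_export($data, true) . ';');

// Reading it back is a plain include; no decoding function is called.
$cached = include $cacheFile;

var_dump($cached === $data); // bool(true)
```

If an opcode cache is installed, the included file may not even need to be re-parsed on later requests, which is where the speed advantage would come from.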

Jordan S. Jones
This is most probably the fastest way possible. I wrote an example on the SO "PHP - *fast* serialize/unserialize": http://stackoverflow.com/questions/2545455/php-fast-serialize-unserialize/3369942#3369942
dave1010
+7  A: 

I've written a blogpost about this subject: "Cache a large array: JSON, serialize or var_export?". In this post it is shown that serialize is the best choice for small to large sized arrays. For very large arrays (> 70MB) JSON is the better choice.

Takkie
Very cool man. Thanks. ;-)
KyleFarris
A: 

This is really awesome, and I think json_encode/json_decode is absolutely superb for me!

A: 

Before you make your final decision, be aware that the JSON format is not safe for associative arrays - json_decode() will return them as objects instead:

$config = array(
    'Frodo'   => 'hobbit',
    'Gimli'   => 'dwarf',
    'Gandalf' => 'wizard',
    );
print_r($config);
print_r(json_decode(json_encode($config)));

Output is:

Array
(
    [Frodo] => hobbit
    [Gimli] => dwarf
    [Gandalf] => wizard
)
stdClass Object
(
    [Frodo] => hobbit
    [Gimli] => dwarf
    [Gandalf] => wizard
)
too much php
Indeed, you are right. I mean, it *is* Javascript **object** notation after all! Thankfully, if you *know* that what you encoded using `json_encode` was an associative array, you can easily force it back into an array like so: `$json = json_encode($some_assoc_array); $back_to_array = (array)json_decode($json);` Also it's good to note that you can access objects the same way as arrays in PHP so in a typical scenario, one wouldn't even know the difference. Good point though!
KyleFarris
@toomuchphp, sorry but you are wrong. There is a second parameter for json_decode 'bool $assoc = false' that makes json_decode produce an array. @KyleFarris, this should also be faster than using the typecast to array.
Jan P.
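To illustrate the two approaches from the comments above, here is a short sketch using a nested variant of the same sample data (the nesting is added for illustration). It shows one more reason to prefer the second parameter: the cast only converts the top level.

```php
<?php
$config = array(
    'Frodo' => array('race' => 'hobbit'),
    'Gimli' => array('race' => 'dwarf'),
);
$json = json_encode($config);

// The (array) cast converts only the top level; nested values stay stdClass.
$cast = (array) json_decode($json);
var_dump(is_array($cast));           // bool(true)
var_dump(is_object($cast['Frodo'])); // bool(true)

// Passing true as the second argument converts every level to arrays.
$assoc = json_decode($json, true);
var_dump($assoc === $config);        // bool(true)
```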
@Jan thanks for the correction
too much php
+2  A: 

I augmented the test to include unserialization performance. Here are the numbers I got.

Serialize

JSON encoded in 2.5738489627838 seconds
PHP serialized in 5.2861361503601 seconds
Serialize: json_encode() was roughly 105.38% faster than serialize()


Unserialize

JSON decode in 10.915472984314 seconds
PHP unserialized in 7.6223039627075 seconds
Unserialize: unserialize() was roughly 43.20% faster than json_decode()

So JSON seems to be faster for encoding but slower for decoding. Which one is better could depend on your application and which operation you expect to do the most.
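For anyone who wants to repeat the decoding half on their own data, a rough sketch follows. The `averageTime()` helper and the sample array are assumptions made up for this example, not the code that produced the numbers above, and closures require PHP 5.3 or later.

```php
<?php
// Returns the average time in seconds of $iterations calls to $fn.
function averageTime($fn, $iterations)
{
    $start = microtime(true);
    for ($i = 0; $i < $iterations; $i++) {
        call_user_func($fn);
    }
    return (microtime(true) - $start) / $iterations;
}

// Small hypothetical sample; substitute the data you actually cache.
$data = array();
for ($i = 0; $i < 1000; $i++) {
    $data["key$i"] = array('id' => $i, 'name' => "item $i", 'active' => ($i % 2 === 0));
}

$json = json_encode($data);
$serialized = serialize($data);

$jsonTime = averageTime(function () use ($json) {
    json_decode($json, true);
}, 100);
$serializeTime = averageTime(function () use ($serialized) {
    unserialize($serialized);
}, 100);

printf("json_decode(): %.6f s per call\n", $jsonTime);
printf("unserialize(): %.6f s per call\n", $serializeTime);
```

Averaging over many iterations smooths out some of the noise of a single microtime() measurement, but the result still depends on the PHP version and the shape of the data.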

Jeff Whiting
A: 

Thanks all for comparing the two methods :)

Usually, I use JSON just for Ajax, and serialize for storing in a database. Regards

Ferri Sutanto
A: 

Seems like serialize is the one I'm going to use for 2 reasons:

  • Someone pointed out that unserialize is faster than json_decode and a 'read' case sounds more probable than a 'write' case.

  • I've had trouble with json_encode when having strings with invalid UTF-8 characters. When that happens the string ends up being empty causing loss of information.
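A sketch of how the UTF-8 problem can at least be detected. The exact failure mode depends on the PHP version: as far as I know, PHP 5.5 and later return false from json_encode() for the whole value, while older versions emitted a warning and encoded the bad string as null. The sample string is an assumption chosen to be invalid UTF-8.

```php
<?php
// "\xE9" is "é" in ISO-8859-1, which is not valid UTF-8 on its own.
$data = array('name' => "Caf\xE9");

$json = json_encode($data);
if ($json === false) {
    // json_last_error() reports why encoding failed.
    $reason = json_last_error() === JSON_ERROR_UTF8 ? 'invalid UTF-8' : 'other error';
    echo "Encoding failed: $reason\n";
}

// serialize() treats strings as raw bytes, so the value survives unchanged.
$roundTrip = unserialize(serialize($data));
var_dump($roundTrip === $data); // bool(true)
```

Checking the return value this way avoids silently writing an empty or broken cache file.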

urraka
+2  A: 

I just tested serialize and JSON encode and decode, plus the size of the resulting stored string.

JSON encoded in 0.067085981369 seconds. Size (1277772)
PHP serialized in 0.12110209465 seconds. Size (1955548)
JSON decode in 0.22470498085 seconds
PHP unserialized in 0.211947917938 seconds
json_encode() was roughly 80.52% faster than serialize()
unserialize() was roughly 6.02% faster than json_decode()
Serialized string was roughly 53.04% larger than JSON string

We can conclude that JSON encodes faster and results in a smaller string, but unserialize is faster at decoding the string.
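The size comparison is easy to reproduce. Below is a minimal sketch on a made-up sample array (not the data measured above) that reports the difference both ways, since "X% smaller" and "Y% larger" are different numbers for the same pair of sizes.

```php
<?php
// Hypothetical sample data; substitute the array you actually cache.
$data = array();
for ($i = 0; $i < 100; $i++) {
    $data["key$i"] = array('id' => $i, 'label' => "item $i");
}

$jsonSize = strlen(json_encode($data));
$serializedSize = strlen(serialize($data));

// How much smaller JSON is, relative to the serialized size.
$smaller = (1 - $jsonSize / $serializedSize) * 100;
// How much larger the serialized string is, relative to the JSON size.
$larger = ($serializedSize / $jsonSize - 1) * 100;

printf("JSON: %d bytes, serialized: %d bytes\n", $jsonSize, $serializedSize);
printf("JSON is %.2f%% smaller; serialized is %.2f%% larger\n", $smaller, $larger);
```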

Blunk
Thanks for your input Blunk. Interesting study.
KyleFarris
A: 

THX - for this benchmark code:

My results on the array I use for configuration are as follows:

JSON encoded in 0.0031511783599854 seconds
PHP serialized in 0.0037961006164551 seconds
json_encode() was roughly 20.47% faster than serialize()

JSON decoded in 0.0070841312408447 seconds
PHP unserialized in 0.0035839080810547 seconds
unserialize() was roughly 97.66% faster than json_decode()

So - test it on your own data.

mk182
both methods are bad, so, your tests are useless
Col. Shrapnel