tags:

views:

137

answers:

7

On the Facebook FQL pages it shows the FQL table structure, here is a screenshot below to show some of it.

You will notice that some items are an array, such as meeting_sex, meeting_for current_location. I am just curious, do you think they are storing this as an array in mysql or just returning it as one, from this data it really makes me think it is stored as an array. IF you think it is, or if you have done similar, what is a good way to store these items as an array into 1 table field and then retrieve it as an array on a PHP page?

alt text

+4  A: 

There are two options for storing as an array:

The first, which you mentioned, is to make one, or several, tables, and enumerate each possible key you intend to store. This is the best for searching and having data that makes sense.

However, for what you want to do, use serialize(). Note: DO NOT EVER EVER EVER try to search against this data in its native string form. It is much faster (and saner) to just reload it, call unserialize(), and then search for your criteria than to develop some crazy search pattern to do your bidding.

EDIT: If it were me, and this were something I was seriously developing for others to use (or even for myself to use, to be completely honest), I would probably create a second lookup table to store all the keys as columns; Heck, if you did that, mysql_fetch_assoc() could give you the array you wanted just by running a quick second query (or you could extract them out via a JOINed query). However, if this is just quick-and-dirty to get whatever job done, then a serialized array may be for you. Unless you really, really don't care about ever searching that data, the proper column-to-key relationship is, I think most would agree, superior.

Dereleased
oh, but when he eventually gets around to fixing all the little bugs and extra features he wants to implement, and discovers that searching those "serialized" fields is one of them, it's pretty much too late to try to fix the problem without a cost that exceeds the savings of just making it "simple" in the first place.
Lasse V. Karlsen
Yeah, I added a little addendum to the end of my post to that effect, I definitely agree with you on that one. It's going to be six kinds of a nightmare to do anything with that data once you've gone down that road, except for the super-fun task of just loading all of it up and parsing it yourself in PHP/whatever, at which point you're really just implementing your own very sloppy and inefficient DBMS.
Dereleased
so if it is just a few key values anyways instead of array or another table maybe I should just add them into the current table?
jasondavis
If the structure of the data is "2 values" or "N values" where 2 or N is "constant, forever, in all cases, for everyone", then go ahead and make the data a part of the main table, as extra columns (address1, address2). However, you need to consider how many rows will have most of these as NULL. If many, then perhaps you should use a separate table anyway. And if you cannot guarantee, beyond any doubt, that the number of columns will never change, you should just go for the extra table. Adding address3 at some point will be a *real* pain.
Lasse V. Karlsen
What's up with the questionable ideas of this serialize-unserialize? Does that mean, that all I've been taught about normal forms and such was a big lie? Maybe next time I write a web app, I should just store all the data in ini-files or something?
shylent
One thing I can say for sure is that there is many ways to do this incorrectly, and trying to store many values inside a single field, in whatever format, is definitely one of them. However, there are many "right" ways to do it, and which one is most right depends entirely on your data and their meaning, so without knowing more about the case, it's hard to give a definite answer. Personally I would go for a separate table in nearly all cases, because I know that the future is unknown, and planning for it today might be easy, but changing the past is not.
Lasse V. Karlsen
Well now I've been downvoted, even though I advocate using a normalized secondary table to store data which might change in format on rare occasions, but I was sure to include references to serialize and unserialize because I, if one would care to notice, can read the OP's question, which states, and this is copypasta here, "what is a good way to store these items as an array into 1 table field and then retrieve it as an array on a PHP page?" The two possible answers are "serialize/unserialize" and "use that one column to point at another table." What's your beef?
Dereleased
hey thanks this all makes perfect sense to me, in fact this is how I normally do it, it's just when I saw that on facebook FQL tables, it made me wonder if using a serialised array was a good idea possibly but looks like the concensus is to stick with seperate row for each key/value, thanks.
jasondavis
Glad to see you're going with the voice (consensus) of reason on this one. You might never have the "Aha!" moment with this particular app of extreme gratitude you went in this direction over any other, but never having that still makes the extra 3 minutes of effort worth it to never have the "D'oh!" moment of wishing you hadn't done it the sloppy way.
Dereleased
+1  A: 

you can serialize the array, insert it, and then unserialize it when you retrieve it.

Galen
A: 

OR how about an cross-reference table?

ctshryock
+5  A: 

The correct way to store an array in a database is by storing it as a table, where each element of the array is a row in the table.

Everything else is a hack, and will eventually make you regret your decision to try to avoid an extra table.

Lasse V. Karlsen
A: 

They might be using multiple tables with many-to-many relationships, but use joins and MySql's GROUP_CONCAT function to return the values as an array for those columns in one query.

Ryan B
+2  A: 

Split it out into other tables. You can serialize it but that will guarantee that you will want to query against that data later. Save yourself the frustration later and just split it out now.

Stephen
+4  A: 

I guarantee you that Facebook is not storing that data in arrays inside their database.

The thing you have to realize about FQL is that you are not querying Facebook's main data servers directly. FQL is a shell, designed to provide you access to basic social data without letting you run crazy queries on real servers that have performance requirements. Arbitrary user-created queries on the main database would be functional suicide.

FQL provides a well-designed data return structure that is convenient for the type of data that you are querying, so as such, any piece of data that can have multiple associations (such as "meeting_for") gets packaged into an array before it gets returned as an API result.

As other posters have mentioned, the only way to store a programming language structure (such as an array or an object) inside a database (which has no concept of these things), is to serialize it. Serializing is expensive, and as soon as you serialize something, you effectively make it unusable for indexing and searching. Being a social network, Facebook needs to index and search almost everything, so this data would never exist in array form inside their main schemas.

Usually the only time you ever want to store serialized data inside a database is if it's temporary, such as session data, or where you have a valid performance requirement to do so. Otherwise, your data quickly becomes useless.

zombat
yeah I see that now, it just made me stop and think for a moment when I saw that so I fugured I would ask now, in the early stages of my new project
jasondavis
There are definitely valid reasons to store serialized data, but you don't want to make a habit out of it. ;)
zombat