ansaurus

Question

How to store a list in a column of a database table.

Answer 1

A:

I'd just store it as CSV. If the values are simple, that should be all you need. (XML is very verbose, and serializing to and from it would probably be overkill, but it is an option as well.)

Here's a good answer for how to pull out CSVs with LINQ.

David Neale 2010-06-18 14:23:07

I thought about that. It still means that I would have to serialize and deserialize, but I suspect that's doable. I wish there were some *condoned* way to do what I want, but I suspect there isn't.

John Berryman 2010-06-18 14:38:32

Answer 2

+10 A:

No, there is no "better" way to store a sequence of items in a single column. Relational databases are designed specifically to store one value per row/column combination. In order to store more than one value, you must serialize your list into a single value for storage, then deserialize it upon retrieval. There is no other way to do what you're talking about (because what you're talking about is a bad idea that should, in general, never be done).

I understand that you think it's silly to create another table to store that list, but this is exactly what relational databases do. You're fighting an uphill battle and violating one of the most basic principles of relational database design for no good reason. Since you state that you're just learning SQL, I would strongly advise you to avoid this idea and stick with the practices recommended to you by more seasoned SQL developers.

The principle you're violating is called first normal form, which is the first step in database normalization.

At the risk of oversimplifying things, database normalization is the process of defining your database based upon what the data is, so that you can write sensible, consistent queries against it and be able to maintain it easily. Normalization is designed to limit logical inconsistencies and corruption in your data, and there are a lot of levels to it. The Wikipedia article on database normalization is actually pretty good.

Basically, the first rule (or form) of normalization states that your table must represent a relation. This means that:

You must be able to differentiate one row from any other row (in other words, your table must have something that can serve as a primary key). This also means that no row should be duplicated.
Any ordering of the data must be defined by the data, not by the physical ordering of the rows (SQL is based upon the idea of a set, meaning that the only ordering you should rely on is that which you explicitly define in your query).
Every row/column intersection must contain one and only one value.

The last point is obviously the salient point here. SQL is designed to store your sets for you, not to provide you with a "bucket" for you to store a set yourself. Yes, it's possible to do. No, the world won't end. You have, however, already crippled yourself in understanding SQL and the best practices that go along with it by immediately jumping into using an ORM. LINQ to SQL is fantastic, just like graphing calculators are. In the same vein, however, they should not be used as a substitute for knowing how the processes they employ actually work.

Your list may be entirely "atomic" now, and that may not change for this project. But you will, however, get into the habit of doing similar things in other projects, and you'll eventually (likely quickly) run into a scenario where you're now fitting your quick-n-easy list-in-a-column approach where it is wholly inappropriate. There is not much additional work in creating the correct table for what you're trying to store, and you won't be derided by other SQL developers when they see your database design. Besides, LINQ to SQL is going to see your relation and give you the proper object-oriented interface to your list automatically. Why would you give up the convenience offered to you by the ORM so that you can perform nonstandard and ill-advised database hackery?

Adam Robinson 2010-06-18 14:25:01

So you believe strongly that storing a list in a column is a bad idea, but you fail to mention why. Since I'm just starting out with SQL, a little bit of the "why" would be very helpful indeed. For instance, you say that I'm "fighting an uphill battle and violating one of the most basic principles of relational database design for no good reason" ... so what is the principle? Why are the reasons that I cited "no good"? (specifically, the sorted and atomic nature of my lists)

John Berryman 2010-06-18 14:35:12

Basically, it comes down to years of experience condensed into best practices. The basic principle in question is known as 1st [Normal Form](http://en.wikipedia.org/wiki/Database_normalization#Normal_forms).

Toby 2010-06-18 14:46:11

@John: See if the edit helps explain some things.

Adam Robinson 2010-06-18 14:53:24

Thanks Adam. Very informative. Good point with your last question.

John Berryman 2010-06-18 15:05:01

I can only agree with this. In my view there is only one exception to the rule of not storing multiple values in a single column: an attribute that is a set of individual enumerated values. E.g. TEnum = (mcUgly, mcEvil, mcBad) and a property Character: set of TEnum. So Character can be [mcUgly, mcBad] or [mcEvil, mcBad], or ... Character can be stored as an integer. Storing it as a CSV of the individual enumerated values can be more self-explanatory. The only reason it would be acceptable is that Character is still a single attribute and the enum values are finite (hardcoded).
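To make the "stored as an integer" idea concrete, here is a minimal sketch in Python rather than the Pascal-style notation above. The names `Trait`, `to_column`, and `from_column` are made up for illustration, and the bit values are simply whatever `auto()` assigns; nothing here comes from a particular database API.

```python
from enum import Flag, auto


class Trait(Flag):
    """Hypothetical stand-in for TEnum = (mcUgly, mcEvil, mcBad)."""
    UGLY = auto()  # bit value 1
    EVIL = auto()  # bit value 2
    BAD = auto()   # bit value 4


def to_column(traits: Trait) -> int:
    """Collapse a set of enum members into one integer for a single column."""
    return traits.value


def from_column(stored: int) -> Trait:
    """Rebuild the set of enum members from the stored integer."""
    return Trait(stored)


character = Trait.UGLY | Trait.BAD   # corresponds to [mcUgly, mcBad]
stored = to_column(character)        # one integer, one column
restored = from_column(stored)
```

The column still holds a single attribute value, which is why this case is arguably different from a general list. The trade-off is that the integer is opaque to anyone reading the table without the enum definition.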

Marjan Venema 2010-06-19 11:25:44

Answer 3

+1 A:

If you need to query on the list, then store it in a table.

If you always want the list, you could store it as a delimited list in a column. Even in this case, unless you have VERY specific reasons not to, store it in a lookup table.
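A small sketch of why querying matters, using Python's built-in `sqlite3` module. The table names `doc_csv` and `doc_tag` and the sample data are invented for the example. It compares a substring search on a delimited column with an equality test on a one-item-per-row table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical schema: the same data stored both ways.
    CREATE TABLE doc_csv (doc_id INTEGER PRIMARY KEY, tags TEXT);
    CREATE TABLE doc_tag (doc_id INTEGER, tag TEXT, PRIMARY KEY (doc_id, tag));

    INSERT INTO doc_csv VALUES (1, 'cat,dog'), (2, 'catfish,eel');
    INSERT INTO doc_tag VALUES (1, 'cat'), (1, 'dog'), (2, 'catfish'), (2, 'eel');
""")

# Searching the delimited column needs a substring match,
# which also matches 'catfish'.
csv_hits = [row[0] for row in conn.execute(
    "SELECT doc_id FROM doc_csv WHERE tags LIKE '%cat%' ORDER BY doc_id")]

# Searching the table is a plain equality test on one value per row.
table_hits = [row[0] for row in conn.execute(
    "SELECT doc_id FROM doc_tag WHERE tag = 'cat' ORDER BY doc_id")]
```

Here `csv_hits` contains both documents while `table_hits` contains only the first. The substring query can be patched up with delimiter tricks, but the table version needs no such care and can use an index.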

hometoast 2010-06-18 14:28:01

Answer 4

+1 A:

You can just forget SQL altogether and go with a "NoSQL" approach. RavenDB, MongoDB and CouchDB jump to mind as possible solutions. With a NoSQL approach, you are not using the relational model; you aren't even constrained to schemas.

jaltiere 2010-06-18 14:34:44

Answer 5

A:

If you really wanted to store it in a column and have it queryable, a lot of databases support XML now. If you are not querying it, you can store the values as comma-separated values and parse them out with a function when you need them separated. I agree with everyone else, though: if you are looking to use a relational database, a big part of normalization is the separating of data like that. I am not saying that all data fits a relational database, though. You could always look into other types of databases if a lot of your data doesn't fit the model.

David Daniel 2010-06-18 14:35:18

Answer 6

+2 A:

In addition to what everyone else has said, I would suggest you analyze your approach in longer terms than just now. It is currently the case that items are unique. It is currently the case that resorting the items would require a new list. It is almost required that the lists are currently short. Even though I don't have the domain specifics, it is not much of a stretch to think those requirements could change. If you serialize your list, you are baking in an inflexibility that is not necessary in a more-normalized design. Btw, that does not necessarily mean a full Many:Many relationship. You could just have a single child table with a foreign key to the parent and a character column for the item.
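As a rough sketch of that single child table, here is one way it could look with Python's built-in `sqlite3` module. The table names `parent` and `parent_item`, the helper functions, and in particular the `position` column are assumptions added for illustration; the position column is there only because the lists in question are ordered, and ordering has to live in the data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    -- Hypothetical names; only the shape matters.
    CREATE TABLE parent (
        parent_id INTEGER PRIMARY KEY,
        name      TEXT NOT NULL
    );
    CREATE TABLE parent_item (
        parent_id INTEGER NOT NULL REFERENCES parent(parent_id),
        position  INTEGER NOT NULL,   -- assumed: explicit ordering
        item      TEXT    NOT NULL,   -- the character column
        PRIMARY KEY (parent_id, position)
    );
""")


def save_items(conn, parent_id, items):
    """Replace the stored list for one parent, one row per item."""
    with conn:  # commit on success, roll back on error
        conn.execute("DELETE FROM parent_item WHERE parent_id = ?", (parent_id,))
        conn.executemany(
            "INSERT INTO parent_item (parent_id, position, item) VALUES (?, ?, ?)",
            [(parent_id, pos, item) for pos, item in enumerate(items)],
        )


def load_items(conn, parent_id):
    """Read the list back in its stored order."""
    rows = conn.execute(
        "SELECT item FROM parent_item WHERE parent_id = ? ORDER BY position",
        (parent_id,),
    )
    return [row[0] for row in rows]


conn.execute("INSERT INTO parent (parent_id, name) VALUES (1, 'example')")
save_items(conn, 1, ["b", "a", "c"])
items = load_items(conn, 1)
```

If the requirements later change (longer lists, duplicate items, reordering), the schema absorbs that with at most a constraint change, whereas a serialized column would need its format revisited.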

If you still want to go down this road of serializing the list, you might consider storing the list in XML. Some databases such as SQL Server even have an XML data type. The only reason I'd suggest XML is that almost by definition, this list needs to be short. If the list is long, then serializing it in general is an awful approach. If you go the CSV route, you need to account for the values containing the delimiter, which means you are compelled to quote the values. Presuming that the lists are short, it probably will not make much difference whether you use CSV or XML.
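For the delimiter problem specifically, a minimal sketch using Python's standard `csv` module shows what quoting looks like in practice. The helper names `list_to_csv` and `csv_to_list` are hypothetical, and the sketch assumes the whole list is stored as a single CSV record in one text column.

```python
import csv
import io


def list_to_csv(values):
    """Serialize a list of strings into one CSV record, quoting where needed."""
    buf = io.StringIO()
    csv.writer(buf).writerow(values)
    return buf.getvalue().rstrip("\r\n")  # drop the record terminator


def csv_to_list(text):
    """Parse one CSV record back into a list of strings."""
    return next(csv.reader(io.StringIO(text, newline="")), [])


values = ["plain", "has,comma", 'has "quote"']
stored = list_to_csv(values)    # plain,"has,comma","has ""quote"""
restored = csv_to_list(stored)
```

A naive `",".join(values)` followed by `split(",")` would turn the second item into two, which is the failure the quoting avoids.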

Thomas 2010-06-18 14:39:48

+1 for anticipating future changes - always design your data model to be extensible.

coolgeek 2010-06-18 15:27:41

Answer 7

A:

One option hasn't been mentioned in the answers: you can de-normalize your DB design. For that you need two tables. One table contains the proper list, one item per row; the other contains the whole list in one column (comma-separated, for example).

Here is the 'traditional' DB design:

List(ListID, ListName) 
Item(ItemID,ItemName) 
List_Item(ListID, ItemID, SortOrder)

Here is the de-normalized table:

Lists(ListID, ListContent)

The idea here is that you maintain the Lists table using triggers or application code. Every time you modify the List_Item content, the appropriate rows in Lists get updated automatically. If you mostly read lists, it could work quite well. Pros: you can read a list in one statement. Cons: updates take more time and effort.
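A rough sketch of the application-code variant, using Python's built-in `sqlite3` module. The table and column names follow the design above; the column types, the helpers `refresh_list_content` and `add_item`, and the sample data are assumptions for illustration. It also assumes item names never contain a comma.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE List (ListID INTEGER PRIMARY KEY, ListName TEXT);
    CREATE TABLE Item (ItemID INTEGER PRIMARY KEY, ItemName TEXT);
    CREATE TABLE List_Item (
        ListID    INTEGER REFERENCES List(ListID),
        ItemID    INTEGER REFERENCES Item(ItemID),
        SortOrder INTEGER,
        PRIMARY KEY (ListID, ItemID)
    );
    -- De-normalized copy: one row per list, whole list in one column.
    CREATE TABLE Lists (ListID INTEGER PRIMARY KEY, ListContent TEXT);
""")


def refresh_list_content(conn, list_id):
    """Rebuild the comma-separated copy from the normalized rows."""
    names = [row[0] for row in conn.execute(
        """SELECT Item.ItemName
             FROM List_Item
             JOIN Item ON Item.ItemID = List_Item.ItemID
            WHERE List_Item.ListID = ?
            ORDER BY List_Item.SortOrder""",
        (list_id,))]
    conn.execute(
        "INSERT OR REPLACE INTO Lists (ListID, ListContent) VALUES (?, ?)",
        (list_id, ",".join(names)))


def add_item(conn, list_id, item_id, sort_order):
    """Modify List_Item and refresh the copy in the same transaction."""
    with conn:
        conn.execute("INSERT INTO List_Item VALUES (?, ?, ?)",
                     (list_id, item_id, sort_order))
        refresh_list_content(conn, list_id)


conn.execute("INSERT INTO List VALUES (1, 'groceries')")
conn.executemany("INSERT INTO Item VALUES (?, ?)", [(10, "milk"), (11, "eggs")])
add_item(conn, 1, 11, 2)
add_item(conn, 1, 10, 1)
content = conn.execute(
    "SELECT ListContent FROM Lists WHERE ListID = 1").fetchone()[0]
```

Doing the refresh inside the same transaction as the change keeps the two tables from drifting apart. A trigger could do the same job inside the database, at the cost of logic that is harder to see from the application.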

Alsin 2010-06-18 15:01:36
