If I want to store some strings or matrices of different sizes in a single variable, I can think of two options: I could make a struct array and have one of the fields hold the data,

structArray(structIndex).structField

or I could use a cell array,

cellArray{cellIndex}

but is there a general rule of thumb for when to use which data structure? I'd like to know whether either one has downsides in certain situations.

+5  A: 

In my opinion it's more a matter of convenience and code clarity. Ask yourself whether you would prefer to refer to the elements of your variable by number(s) or by name. Use a cell array in the former case and a struct array in the latter. Think of it as the difference between a table without headers and one with headers.

By the way you can easily convert between structures and cells with CELL2STRUCT and STRUCT2CELL functions.
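As a minimal sketch of that conversion (the sample data and the field names 'name' and 'data' are made up for illustration):

```matlab
% Hypothetical sample data: one row per record, one column per "header".
c = {'alice', [1 2 3]; 'bob', magic(4)};
fields = {'name', 'data'};

% The columns (dimension 2) of c become the fields; s is a 2x1 struct array.
s = cell2struct(c, fields, 2);
disp(s(2).name)              % access by name instead of by column number

% struct2cell returns the fields along dimension 1, so transpose
% to get back the original row-per-record layout.
c2 = struct2cell(s).';
disp(isequal(c, c2))
```

The third argument of cell2struct selects which dimension of the cell array is folded into the field names, so it has to match the number of names supplied.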

yuk
With cell arrays you need some metadata to identify what each cell contains. Carefully chosen field names make your code self-explanatory.
zellus
A: 

If you use it for computation within a function, I suggest you use cell arrays, since they're more convenient to handle, thanks e.g. to CELLFUN.
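For example, a small sketch of what CELLFUN buys you (the variables mats and strs are hypothetical sample data):

```matlab
% Hypothetical data: matrices of different sizes and a few strings.
mats = {magic(3), ones(2, 5), 7};
strs = {'a', 'bcd', 'ef'};

% Scalar outputs are collected into an ordinary numeric array.
nElems = cellfun(@numel, mats);

% Non-scalar outputs need 'UniformOutput', false and come back as a cell array.
sizes = cellfun(@size, mats, 'UniformOutput', false);
upperStrs = cellfun(@upper, strs, 'UniformOutput', false);

disp(nElems)
```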

However, if you use it to store data (and return output), it's better to return structures, since the field names are (should be) self-documenting, so you don't need to remember what information you had in column 7 of your cell array. Also, you can easily include a field 'help' in your structure where you can put some additional explanation of the fields, if necessary.
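A rough sketch of such an output structure, with a 'help' field describing the others (all field names and values here are invented for illustration):

```matlab
% Hypothetical measurement; the values are arbitrary.
trace = [0.1 0.7 0.3];

% Collect the results in a struct with descriptive field names.
result = struct();
[result.peakValue, result.peakIndex] = max(trace);
result.trace = trace;

% Free-text description of the fields, stored alongside the data.
result.help = ['peakValue: maximum of trace; ', ...
               'peakIndex: position of that maximum; ', ...
               'trace: raw samples'];

disp(result.help)
```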

Structures are also useful for data storage because, if you want to update your code at a later date, you can replace them with objects without needing to change the code that uses them (at least if you pre-assigned your structure). They have the same syntax, but objects will allow you to add more functionality, such as dependent properties (i.e. properties that are calculated on the fly based on other properties).

Finally, note that cells and structures add some bytes of overhead to every field or element. Thus, if you want to use them to handle large amounts of data, you're much better off using structures/cells that contain arrays than large arrays of structures/cells whose fields/elements each contain only a scalar.
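One way to see the overhead for yourself is to compare the sizes reported by whos. This is only a sketch: n is an arbitrary size, the variable names are made up, and the exact byte counts depend on your MATLAB version and platform.

```matlab
n = 1e4;   % arbitrary size chosen for illustration

% Array of structs: n elements, each holding one scalar.
aos = struct('value', num2cell(1:n));

% Struct of arrays: one scalar struct holding a single 1-by-n vector.
soa.value = 1:n;

% Cell array of scalars, for comparison.
cellOfScalars = num2cell(1:n);

infoAos  = whos('aos');
infoSoa  = whos('soa');
infoCell = whos('cellOfScalars');
fprintf('array of structs: %d bytes\n', infoAos.bytes);
fprintf('struct of arrays: %d bytes\n', infoSoa.bytes);
fprintf('cell of scalars:  %d bytes\n', infoCell.bytes);
```

The raw data is 8 bytes per double in all three cases, so anything beyond 8*n is bookkeeping.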

Jonas
A: 

First and foremost, I second yuk's answer. Clarity is generally more important in the long run.

However, you may have two more options depending on how irregularly shaped your data is:

Option 3: structScalar.structField(fieldIndex)

Option 4: structScalar.structField{cellIndex}

Among the four, option 3 has the least memory overhead for large numbers of elements (it minimizes the total number of matrices), and by large numbers I mean more than 100,000. If your code lends itself to vectorizing on structField, it is probably a performance win, too. If you can't collect each element of structField into a single matrix, option 4 has the notational benefits without the memory and performance advantages of option 3. Both of these options make it easier to use arrayfun or cellfun on the entire dataset, at the expense of requiring you to add or remove elements from each field individually. The choice depends on how you use your data, which brings us back to yuk's answer: choose what makes for the clearest code.
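To make options 3 and 4 concrete, here is a small sketch (the field names weight, height and samples, and all values, are hypothetical):

```matlab
% Option 3: scalar struct whose fields are plain numeric arrays.
% Works when every element has the same size (here, scalars).
s3.weight = [61.5 72.0 80.3];
s3.height = [1.60 1.75 1.82];
bmi = s3.weight ./ s3.height.^2;      % vectorized over the whole field

% Option 4: scalar struct whose field is a cell array.
% Works when the elements differ in size or type.
s4.samples = {[1 2 3], magic(2), 'text'};
counts = cellfun(@numel, s4.samples);

% The cost: removing record 2 has to be done field by field.
s3.weight(2) = [];
s3.height(2) = [];
```

With a struct array (option 1) the same deletion would be a single structArray(2) = [], which is the trade-off described above.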

Arthur Ward