views:

550

answers:

6

I am maintaining a fixed-length table of 10 entries. Each item is a structure of like 4 fields. There will be insert, update and delete operations, specified by numeric position. I am wondering which is the best data structure to use to maintain this table of information:

  1. array - insert/delete takes linear time due to shifting; update takes constant time; no space is used for pointers; accessing an item using [] is faster.

  2. stl vector - insert/delete takes linear time due to shifting; update takes constant time; no space is used for pointers; accessing an item is slower than an array since it is a call to operator[].

  3. stl list - insert and delete takes linear time since you need to iterate to a specific position before applying the insert/delete; additional space is needed for pointers; accessing an item is slower than an array since it is a call to operator[].

Right now, my choice is to use an array. Is it justifiable? Or did I miss something?

Which is faster: traversing a list, then inserting a node or shifting items in an array to produce an empty position then inserting the item in that position?

What is the best way to measure this performance? Can I just display the timestamp before and after the operations?

+5  A: 

Use STL vector. It provides an equally rich interface as list and removes the pain of managing memory that arrays require.

You will have to try very hard to expose the performance cost of operator[] - it usually gets inlined.

I do not have any number to give you, but I remember reading performance analysis that described how vector<int> was faster than list<int> even for inserts and deletes (under a certain size of course). The truth of the matter is that these processors we use are very fast - and if your vector fits in L2 cache, then it's going to go really really fast. Lists on the other hand have to manage heap objects that will kill your L2.

Frank Krueger
I read somewhere in some post it was looping through more than 1000 items comparing array and vector and in the vector's case, it was mentioned it took longer... and the answers said that it is because it is a template, using operator[] produces overhead...?
jasonline
@jasonline I use vectors all over a game I wrote for the iPhone. I have never once run into a performance issue with the use of operator[]. These are what we call "micro optimizations" - they're micro because they have little effect on the overall performance of the app. While I spent years learning the lessons of micro optimization, I highly recommend you avoid them and focus on creating a great app.
Frank Krueger
Actually vector<int> is always faster than list<int> for inserts and deletes **if you have to search for the element**, especially on large datasets. While vector operations are always O(N) and list operations are on average O(N/2), the list operations are dominated by unordered memory access time which makes them individually much more expensive than the extra N/2 operations the vector version uses.
Don Neufeld
Ok, you mentioned vector<int> is faster under a certain size. In what case will it be that list<int> will become faster than the vector<int>?
jasonline
I also said I had no numbers. :-) At the time, the number was around 10,000. But that was years ago and processors and memory buses have changed since then. Fortunately, these are trivial performance tests to run. Write a test app and find out for yourself. Also take into account the @don.neufeld.
Frank Krueger
Profile it and find out, really. There's no way to know. We can only make educated guesses, and often those are wrong anyway.
GMan
@Frank Krueger: Thanks. I was just curious because I never considered that if a vector<int> is faster than list<int>, there will come a time when list<int> will become faster than the vector<int> due to its size.
jasonline
+7  A: 

Premature optimization is the root of all evil.

Based on your post, I'd say there's no reason to make your choice of data structure here a performance based one. Pick whatever is most convenient and return to change it if and only if performance testing demonstrates it's a problem.

Don Neufeld
Yes, I know the performance doesn't matter so much in this case since it is only a few data but I was just wondering...
jasonline
When your writing small application your faced with Log(1) problems. When you've gotten the basics down, you can start to know what you need to scale too. When you know what you need to scale too, not some theoretical limit eg..."We need to scale in the future". Then you can choose algorithms and data structures to suit. Until then, just use std::vector.
Chad
+2  A: 

You're making assumptions you shouldn't be making, such as "accessing an item is slower than an array since it is a call to operator[]." I can understand the logic behind it, but you nor I can know until we profile it.

If you do, you'll find there is no overhead at all, when optimizations are turned on. The compiler inlines the function calls. There is a difference in memory performance. An array is statically allocated, while a vector dynamically allocates. A list allocates per node, which can throttle cache if you're not careful.

Some solutions are to have the vector allocate from the stack, and have a pool allocator for a list, so that the nodes can fit into cache.

So rather than worry about unsupported claims, you should worry about making your design as clean as possible. So, which makes more sense? An array, vector, or list? I don't know what you're trying to do so I can't answer you.

The "default" container tends to be a vector. Sometimes an array is perfectly acceptable too.

GMan
Yes, I can agree with the default container being the vector.
jasonline
+3  A: 

Prefer an std::vector over and array. Some advantages of vector are:

  • They allocate memory from the free space when increasing in size.
  • They are NOT a pointer in disguise.
  • They can increase/decrease in size run-time.
  • They can do range checking using at().
  • A vector knows its size, so you don't have to count elements.

The most compelling reason to use a vector is that it frees you from explicit memory management, and it does not leak memory. A vector keeps track of the memory it uses to store its elements. When a vector needs more memory for elements, it allocates more; when a vector goes out of scope, it frees that memory. Therefore, the user need not be concerned with the allocation and deallocation of memory for vector elements.

Vijay Mathew
Regarding the size, I don't think it matters in this case because I'm maintaining a fixed size of 10. I agree though that at() provides security in accessing elements.
jasonline
@jasonline Well, if all you want is to maintain a list of 10 elements and don't need the conveniences offered by the standard library (like templates), just use a statically allocated array. C++ is a flexible language, anyway!
Vijay Mathew
+1  A: 

For data size as small as 10 entries, i dont think you would see any noticeable difference in performance. Try with a data size of 1 mil numbers and check. If possible post the numbers here as well :)

Pigol
+2  A: 

First a couple of notes:

A good rule of thumb about selecting data structures: Generally, if you examined all the possibilities and determined that an array is your best choice, start over. You did something very wrong.

STL lists don't support operator[], and if they did the reason that it would be slower than indexing an array has nothing to do with the overhead of a function call.

Those things being said, vector is the clear winner here. The call to operator[] is essentially negligible since the contents of a vector are guaranteed to be contiguous in memory. It supports insert() and erase() operations which you would essntially have to write yourself if you used an array. Basically it boils down to the fact that a vector is essentially an upgraded array which already supports all the operations you need.

Graphics Noob
Yes, you're right. Lists don't have operator[], my mistake. What made my choice of array wrong? Because of the vector operator[] and the thought that I still need to implement the operations in an array compared to the vector. Right?
jasonline
Vijay Mathew did a good job outlining the benefits of the vector over the array, but also keep in mind that since the vector keeps it's contents contiguous in memory the array has no benefits over the vector. You aren't entirely wrong to assume that operator[] has worse performance than indexing an array, but under the hood the complier will probably reduce them to exactly the same thing.
Graphics Noob