views:

422

answers:

4

My application makes heavy use of TList, so I was wondering if there are any alternative implementations that are faster or optimized for particular use case.

I know of RtlVCLOptimize.pas 2.77, which has optimized implementations of several TList methods.

But I'd like to know if there is anything else out there. I also don't require it to be a TList descendant, I just need the TList functionality regardless of how it's implemented.

It's entirely possible, given the rather basic functionality TList provides, that there is not much room for improvement, but would still like to verify that, hence this question.

edit: In my particular use case no lists are sorted. There are lots of lists, with various number of elements in. I did replace TList with my own class in order to log number of Add/Remove calls and count of elements. It reports (toatal for all lists):

ListAdd = 15766012; ListRemove = 10630000; ListCount = 5136012

I could also find out what the highest number of elements in a single list is.

I have no particular issue, I just wonder if there is a way to make it faster all around as with these numbers even small improvement would add up.

+1  A: 

The fastest data structure is usually not a data structure at all, but instead a mock that pulls data only as it's needed, much like Virtual Treeview does. Perhaps you can write some sort of TVirtualList that calls the appropriate functions to gather the required data when the elements are requested.

Ignacio Vazquez-Abrams
From a performance point of view that can't be a good idea: TList reads from a block of memory, calling a "callback" method to make up the data as you're working with the list can't possibly be faster. The second problem with this idea is that usually data is not "made up", and I doubt it can be generated "on the fly". VirtualTree-like GUI's assume there's some underlying data structure that holds the data and they call a method to get the data as needed. They do that to avoid duplication of data in memory (and they're faster because they don't duplicate data).
Cosmin Prund
+7  A: 

It looks like you're doing a lot of adds. I don't know how many lists that's spread over, but if your individual lists are growing very large, you might want to implement a list that grows faster.

Take a look at TList.Grow, which is called when you try to add an item to a list where all its array elements are in use, and you'll see that it grows by 25%. This is to keep memory use down to a reasonable level. But if you need really large lists, make your own descendant class and override Grow so that in the second line, instead of Delta := FCapacity div 4 it says Delta := FCapacity. This makes your list grow twice as large each time, which means less reallocs and less copies.

But the thing that's probably killing you is all those Remove calls. Remove has to find the item before it can remove it, which involves a call to IndexOf, which is a linear scan of the entire array. If you've got a large list and you're doing a lot of removes, that's gonna kill your performance.

What are you using these lists for, especially the big ones? Depending on what you're doing with them, there may be better data structures for the job.

Mason Wheeler
Overriding Grow is a nice suggestion. Alternately, List.Capacity could be set when the list is created which would save a lot of realloc as well.
Ken Bourassa
Yeah, that's a good idea too, especially if you have a good estimate as to how many items the list will end up holding.
Mason Wheeler
Related idea: there may be an opportunity to mark items as deleted, and then re-use them instead of actually deleting. Only useful if you don't have to spend a lot of time searching for an "open" (marked deleted) item.
Chris Thornton
+6  A: 

One of the biggest bottleneck I know about TList is the Delete/Extract on large list. Removing item[0] is a lot slower than removing Item[Count-1] because of the memory move that follows it.

For exemple, on a list containing 65536 elements:

while list.Count > 0 do List.Delete(0) //Takes 2 mins to complete

for I := List.Count-1 downto 0 do List.Delete(I) //Takes less than 1 sec

So if you have a TList with millions of elements, deleting a low index item can be expensive performance-wise. Also, consider that having a list that isn't sorted makes it very slow to find an element in it. IndexOf is very slow on large list. You might want to consider keeping the list sorted.

Also, considering your item count can be pretty large, you might want to consider using a List of TList to store your elements, which will help reduce the Delete/Extract overhead I already mentioned.

Ken Bourassa
+5  A: 

Are you using the Notify procedure? If not, make your own TList implementation. Because of the Notify procedure, the TList.Clear (which is called at destruction) is a O(n) operation. The TList.Clear method calls SetCount which in turn calls Delete for all the items it contains so the Notify procedure will be called for every removed item. When you don't need to override the Notify method, you can adjust the SetCount procedure to not call Delete. This can save you the time of 15.766.012 - 10.630.000 = 5.136.012 Delete calls.

NB: the performance gain you get will never be as big as the performance gain you get by sorting your list and optimizing your remove procedure, as suggested by Mason Wheeler. Unless the list you have contain a very small number of items and the compare function takes a lot of time.

The_Fox
Good point, though it's worth noting that if this is a TObjectList and not just a vanilla TList, that Notify does have to be there. If not, though, there's most likely no good reason to use Notify.
Mason Wheeler