My application builds many objects in memory based on filenames (among other string-based information). I was hoping to optimise memory usage by storing the path and filename separately, and then sharing the path string between objects in the same path. I am not trying to use a string pool or anything like that; my objects are sorted, so if I have 10 objects with the same path I want objects 2-10 to have their path "pointed" at object 1's path (e.g. object[2].Path := object[1].Path).

I have a problem, though: I don't believe my objects are in fact sharing a reference to the same string after I tell them to (with the object[2].Path := object[1].Path assignment).

When I experiment with a string list and set all the values to point to the first value in the list, I can see the "memory conservation" in action. When I use objects, I see absolutely no change at all. Admittedly, I am only using Task Manager (private working set) to watch for changes in memory use.

Here's a contrived example, I hope this makes sense.

I have an object:

type
  TfileObject = class(TObject)
    FpathPart: string;
    FfilePart: string;
  end;

Now I create 1,000,000 instances of the object, using a new string for each one:

var x: integer;
MyFilePath: string;
fo: TfileObject;
begin
  for x := 1 to 1000000 do
  begin
    // create a new string for every iteration of the loop
    MyFilePath:=ExtractFilePath(Application.ExeName);
    fo:=TfileObject.Create;
    fo.FpathPart:=MyFilePath;
    FobjectList.Add(fo);
  end;
end;

Run this and Task Manager reports roughly 68MB of memory in use. (Note that if I assigned MyFilePath outside the loop I would save memory, because there would be only one instance of the string, but this is a contrived example and not how it would actually happen in the app.)

Now I want to "optimise" my memory usage by making all objects share the same instance of the path string, since it's the same value:

var x: integer;
begin
  for x := 1 to FobjectList.Count - 1 do
  begin
    TfileObject(FobjectList[x]).FpathPart := TfileObject(FobjectList[0]).FpathPart;
  end;
end;

Task Manager shows absolutely no change.

However if I do something similar with a TstringList:

var x: integer;
begin
  for x := 1 to 1000000 do
  begin
    FstringList.Add(ExtractFilePath(Application.ExeName));
  end;
end;

Task Manager says 60MB memory use.

Now optimise with:

var x: integer;
begin
  for x := 1 to FstringList.Count - 1 do
    FstringList[x]:=FstringList[0];
end;

Task Manager shows the drop in memory usage that I would expect, now 10MB.

So I seem to be able to share strings in a string list, but not in objects. I am obviously missing something conceptually, in code or both!

I hope this makes sense. I can really see the potential to conserve memory using this technique: I have a lot of objects, all with lots of string information, and that data is sorted in many different ways. I would like to iterate over this data once it is loaded into memory and free some of that memory again by sharing strings in this way.

Thanks in advance for any assistance you can offer.

PS: I am using Delphi 2007, but I have just tested on Delphi 2010 and the results are the same, except that Delphi 2010 uses twice as much memory due to Unicode strings.

A: 

Because task manager does not tell you the whole truth. Compare with this code:

var
  x: integer;
  MyFilePath: string;
  fo: TfileObject;
begin  
  MyFilePath:=ExtractFilePath(Application.ExeName);
  for x := 1 to 1000000 do
  begin
    fo:=TfileObject.Create;
    fo.FpathPart:=MyFilePath;
    FobjectList.Add(fo);
  end;
end;
Pham
Pham, thanks, I know about this one. That definitely works: each object only stores the one instance of "MyFilePath" and memory usage is massively reduced. What I am looking at doing is setting the path AFTER it has been allocated the first time. One of the primary reasons for this is that I have a lot of objects that get loaded from a stream, and at that time each object gets its own copy of the same string. I was simply hoping to iterate over the list and set them to the "shared" version of the string. Task Manager recognises when this happens with a TStringList, but not with the objects.
Jenakai
+4  A: 

When your Delphi program allocates and deallocates memory, it does not call Windows API functions directly but goes through the memory manager. What you are observing is that the memory manager does not release all allocated memory back to the OS as soon as your program no longer needs it. It keeps some or all of it for later, to speed up subsequent memory requests in the application. So the system tools list the memory as allocated by the program even though it is not in active use: internally it is marked as available and kept in lists of usable memory blocks, which the MM will draw on for further allocations in your program before it goes to the OS and requests more memory.

If you really want to check how changes to your program affect its memory consumption, you should not rely on external tools but use the diagnostics the memory manager provides. Download the full FastMM4 version and use it in your program by putting it as the first unit in the DPR file. You can get detailed information by using the GetMemoryManagerState() function, which will tell you how many small, medium and large memory blocks are used and how much memory is allocated for each block size. For a quick check, however (which will be completely sufficient here), you can simply call the GetMemoryManagerUsageSummary() function. It will tell you the total allocated memory, and if you call it you will see that your reassignment of FPathPart does indeed free several MB of memory.

You will observe different behaviour when a TStringList is used and all strings are added sequentially. Memory for these strings is allocated from larger blocks, and those blocks contain nothing else, so they can be released again when the string list elements are freed. If, on the other hand, you create your objects, then the strings are allocated alternating with other data elements. Freeing the strings creates empty regions inside the larger blocks, but the blocks themselves are not released because they still contain valid memory for other things. You have basically increased memory fragmentation, which could be a problem in itself.
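As a rough sketch of that kind of check without any extra download, the program below uses the GetMemoryManagerState procedure that, as far as I know, is also exposed by the System unit in Delphi 2006 and later (treat that availability as an assumption for your version). AllocatedBytes and ObjectCount are hypothetical names made up for this illustration, and ParamStr(0) stands in for Application.ExeName because this is a console program.

```pascal
program MemCheckSketch;
{$APPTYPE CONSOLE}

uses
  SysUtils;

type
  TFileObject = class(TObject)
    FPathPart: string;
    FFilePart: string;
  end;

// Hypothetical helper: sums the bytes the memory manager currently counts
// as allocated (small + medium + large blocks). Memory that is merely kept
// in reserve by the memory manager is not included.
function AllocatedBytes: Int64;
var
  State: TMemoryManagerState;
  i: Integer;
begin
  GetMemoryManagerState(State);
  Result := Int64(State.TotalAllocatedMediumBlockSize) +
            Int64(State.TotalAllocatedLargeBlockSize);
  for i := Low(State.SmallBlockTypeStates) to High(State.SmallBlockTypeStates) do
    Result := Result +
      Int64(State.SmallBlockTypeStates[i].UseableBlockSize) *
      Int64(State.SmallBlockTypeStates[i].AllocatedBlockCount);
end;

const
  ObjectCount = 100000; // smaller than the question's 1,000,000 to keep the run short

var
  Objects: array of TFileObject;
  BytesBefore, BytesAfter: Int64;
  i: Integer;
begin
  SetLength(Objects, ObjectCount);
  for i := 0 to ObjectCount - 1 do
  begin
    Objects[i] := TFileObject.Create;
    // Each call should yield a separately allocated copy of the path
    Objects[i].FPathPart := ExtractFilePath(ParamStr(0));
  end;
  BytesBefore := AllocatedBytes;

  // Point every object at the first object's string
  for i := 1 to ObjectCount - 1 do
    Objects[i].FPathPart := Objects[0].FPathPart;
  BytesAfter := AllocatedBytes;

  Writeln('Allocated before sharing: ', BytesBefore);
  Writeln('Allocated after sharing:  ', BytesAfter);

  for i := 0 to ObjectCount - 1 do
    Objects[i].Free;
end.
```

If the second figure is lower than the first, the strings really were released inside the memory manager, even when Task Manager shows no difference.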

mghie
Thanks mghie, that explains the different behaviour between the TStringList and the objects.
Jenakai
+1  A: 

As noted by another answer, memory that is not being used is not always immediately released to the system by the Delphi Memory Manager.

Your code guarantees a large quantity of such memory by dynamically growing the object list.

A TObjectList (in common with a TList and a TStringList) uses an incremental memory allocator. A new instance of one of these containers allocates memory for 4 items (the Capacity) on the first Add. When the number of items added exceeds the Capacity, additional memory is allocated, at first in small fixed steps and then, once the capacity has passed 64 items, by increasing the capacity by 25%.

Each time the Count exceeds the Capacity, additional memory is allocated, the current memory copied to the new memory and the previously used memory released (it is this memory which is not immediately returned to the system).

When you know how many items are to be loaded into one of these types of list you can avoid this memory re-allocation behaviour (and achieve a significant performance improvement) by pre-allocating the Capacity of the list accordingly.

You do not necessarily have to set the precise capacity needed: a best guess (one that is more likely to be near, or higher than, the actual figure required) is still going to be better than the initial, default capacity of 4 if the number of items is significantly greater than 64.
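A minimal sketch of pre-allocating the list, assuming the item count is known (or can be estimated) up front. ExpectedCount is a hypothetical constant, and plain TObject instances stand in for the real objects.

```pascal
program CapacitySketch;
{$APPTYPE CONSOLE}

uses
  Contnrs;

const
  ExpectedCount = 1000000; // hypothetical: the known or estimated item count

var
  List: TObjectList;
  i: Integer;
begin
  List := TObjectList.Create(True); // True: the list owns and frees its items
  try
    // Reserve room once, so Add never has to grow and copy the item array
    List.Capacity := ExpectedCount;
    for i := 1 to ExpectedCount do
      List.Add(TObject.Create);
    Writeln('Count: ', List.Count, ', Capacity: ', List.Capacity);
  finally
    List.Free;
  end;
end.
```

Setting Capacity before the loop means the list's internal array is allocated once, so no intermediate arrays are left behind in the memory manager.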

Deltics
+1 for that nice hint!
Smasher
A: 

To share a reference, strings need to be assigned directly and be of the same type (Obviously, you can't share a reference between UnicodeString and AnsiString).
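One quick way to see whether two string variables share a reference is to compare the pointers they hold. This is only a sketch: A and B are throwaway names, and ParamStr(0) stands in for Application.ExeName because this is a console program.

```pascal
program ShareCheckSketch;
{$APPTYPE CONSOLE}

uses
  SysUtils;

var
  A, B: string;
begin
  // Two separate calls: each one should produce its own string allocation
  A := ExtractFilePath(ParamStr(0));
  B := ExtractFilePath(ParamStr(0));
  Writeln('Same reference before assignment: ', Pointer(A) = Pointer(B));

  // Direct assignment between variables of the same string type copies
  // only the reference and increments the reference count
  B := A;
  Writeln('Same reference after assignment:  ', Pointer(A) = Pointer(B));
end.
```

Equal contents do not imply equal pointers; only the direct assignment makes both variables refer to the same string data.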

The best way I can think of to achieve what you want is as follows:

var  StrReference : TStringlist; //Sorted

function GetStrReference(const S : string) : string;
var idx : Integer;
begin
  if not StrReference.Find(S,idx) then
    idx := StrReference.Add(S);
  Result := StrReference[idx];
end;

procedure YourProc;
var x: integer;
MyFilePath: string;
fo: TfileObject;
begin
  for x := 1 to 1000000 do
  begin
    // create a new string for every iteration of the loop
    MyFilePath    := GetStrReference(ExtractFilePath(Application.ExeName));
    fo            := TfileObject.Create;
    fo.FpathPart  := MyFilePath;
    FobjectList.Add(fo);
  end;
end;

To make sure it has worked correctly, you can call the StringRefCount function (in the System unit). I don't know in which version of Delphi it was introduced, so here's the current implementation.

function StringRefCount(const S: UnicodeString): Longint;
begin
  Result := Longint(S);
  if Result <> 0 then
    Result := PLongint(Result - 8)^;
end;

Let me know if it worked as you wanted.

EDIT: If you are afraid of the stringlist growing too big, you can safely scan it periodically and delete from the list any string with a StringRefCount of 1.

The list could be wiped clean too... But that will make the function reserve a new copy of any new string passed to the function.

Ken Bourassa