tags:

views:

245

answers:

5

Using System.Move() to insert/delete item(s) from an array of string is not as easy as insert/delete it from other array of simple data types. The problem is ... string is reference counted in Delphi. Using Move() on reference-counted data types needs deeper knowledge on internal compiler behaviour.

Can someone here explain the needed steps for me to achieve that, or better with some snippet codes, or direct me to a good reference on the internet?

Oh, Please don't tell me to use the "lazy-but-slow way", that is, for loop, I know that.

+2  A: 

To insert a string, simply add a string (the lazy way) to the end of the array (which is an array of pointers), and then use Move to change the order of the elements of this array (of pointers).

Andreas Rejbrand
+2  A: 

If I wanted to insert a string into the middle of a list of strings, I'd use TStringList.Insert. (It does it quickly using System.Move.)

Any particular reason why you're using an array instead of a TStringList?

Mason Wheeler
For performance reason because my array would be possible to be filled with huge amount of items. I realize that TStringList.Capacity will help, but TStringList will still do unnecessary checking--according to the documentation-- "... assigning a value smaller than Count removes strings from the end of the list. Assigning a value greater than Count allocates space for more strings to be added." In my case, I don't need such a check.
Phantom
@Phantom: Well, as long as you measure the actual overhead of the `TStringList` approach. Many people are surprisingly tolerant to delays of a few milliseconds.
Andreas Rejbrand
@Phantom: Fair enough, but there are other performance implications to take into account. For example, if you have to fill your array with a huge number of strings, are you using `SetLength(MyArray, length(MyArray) + 1)` every time? That's a huge number of reallocs and copies, and it can really drag performance down. To head this issue off, you have to allocate more space than you're using, so you need something to keep track of capacity and current count. To manage that, you should encapsulate it within an object... and then you're well on the road to reinventing the TStringList. :P
Mason Wheeler
Yes, Mason is absolutely right. The example in his comment is one of the most common and severe Delphi performance issues. But, @Phantom, since you are talking about `Move`, I'd guess that you already are aware of this issue.
Andreas Rejbrand
@Mason, No, I will allocate memory, just said, for 16384 items, and will allocate more 16384 if it is apparently needed.
Phantom
@Mason, I doesn't intend to reinvent TStringList. There are so many feature in TStringList which I don't need to work in this array. To be honest, the array will be fild with all files in all hard-drives attached to a computer.
Phantom
@Phantom: But you do need to reorder the items of the array?
Andreas Rejbrand
@Andreas, No, the items doesn't need to be sorted.
Phantom
@Phantom: But then I do not see what the problem is. Simply `SetLength(strs, 16384)` (and increase the length in blocks when needed) and then use a for loop to set each string. There is no (reasonable) faster way.
Andreas Rejbrand
AFAIK, because either deletion or insertion items to an array doesn't need any sorting/indexing mechanism.
Phantom
@Andreas, TStringList use System.Move() instead of such a for loop. Looking TStringList implementation at a glance, I couldn't grasp the real requirements to use Move() on reference-counted data types. However, it seems I need to study it deeper, and also the @PatrickvL's comment.
Phantom
@Phantom: `TStringList` (of course!) only uses `Move` if it needs to change the order of the strings, for instance, if you insert a string in the middle of the array (for then all strings after the newly inserted one need to move down the array). If you simply add a string to the bottom of the array, no `Move` is needed, neither in a `array of string` nor in a `TStringList`!
Andreas Rejbrand
@Andreas, oops, I misunderstood what you meant, now I catch it. :-)
Phantom
A: 

If you use System.Move to put items into an array of string, you should be aware that the strings that where there before the Move (and now overwritten), had a reference count of either -1 for constant strings, or > 0 for variable strings. Constant strings should not be altered, but variable strings should be treated accordingly: You should manually lower their reference-count (before they're overwritten!). To do that, you should try something like this:

Dec(PStrRec(IntPtr(SomeString)-12).refCnt);

But if the reference-count reached zero, you should also finalize the associated memory - something Delphi itself does a whole lot better if you let it work it's compiler-magic for strings. Oh, and also : if the strings you're copying come from the same array as your writing into, the needed administration becomes very cumbersome, very quickly!

So if it's in some way possible to avoid all this manual housekeeping, I would advise to let Delphi handle it itself.

PatrickvL
@PatrickvL: So, do you still recommend me to use TStringList even if for my case, that is, the array will be filled with all files in all hard-drives attached to a computer. Will TStringList be fast enough?
Phantom
@Phantom: TStringList was designed for handling large numbers of strings. If you have Delphi you've got Classes.pas; browse through the code to TStringList and TStrings and see how many bits of code you can find that are there specifically for performance reasons. It's one of the most used classes in the entire RTL, and it's been tested extensively for 15 years now. You'd have to work pretty hard to come up with any solution that's noticeably faster than TStringList at string handling. If you want to know if it's slow, try using it and see if it feels slow or not.
Mason Wheeler
@Mason, I will try it.
Phantom
-1 for suggesting manual modification of internal, implementation-defined data structures. You could have just assigned `SomeString := ''` and let the compiler take care of the zero-reference-count situation, as well as thread-safety.
Rob Kennedy
@Rob : I was just answering Phantom's question how to decrease a reference count manually. I believe that's a valid question to answer - even when it's not advisable to do so! (Didn't I mention that in my last statement?) Sure, setting `Somestring := ''` works too, but that relies on compiler magic again, so my answer has some credibility, don't you think?
PatrickvL
No, Patrick, the question did *not* ask how to reduce the reference count of a string. Phantom merely indicated that he knew reference-counting was an issue to be concerned with when using Move, so he asked how to use Move to adjust an array of strings. There was nothing to suggest the array was actually just a blob of untyped memory, so there's no reason to think compiler magic's not available.
Rob Kennedy
+1  A: 

Call UniqueString() on it, before messing with it.

http://docwiki.embarcadero.com/VCL/en/System.UniqueString

Then you have a string with a single reference.

Fat chance that that is what delete and insert do too, and I doubt you'll be faster.

Marco van de Voort
@Marco, thanks, I will also experiment with this stuff as you told.
Phantom
"Fat chance" means something is *unlikely*. For example: "Do you think our boss will give us a bonus this year?" "Fat chance!" Calling UniqueString is *exactly* what Delete and Insert will do when changing the character contents of a string. There's no need to call it when changing the element contents of a dynamic array, though.
Rob Kennedy
Oops, expressions are always pain in foreign languages :-) Anyway, I missed the "array of" bit. Then just finalizing the deleted bits and initializing the non deleted bits (or filling them with zero) would do
Marco van de Voort
+2  A: 

I've demonstrated how to delete items from a dynamic array before:

In that article, I start with the following code:

type
  TXArray = array of X;

procedure DeleteX(var A: TXArray; const Index: Cardinal);
var
  ALength: Cardinal;
  i: Cardinal;
begin
  ALength := Length(A);
  Assert(ALength > 0);
  Assert(Index < ALength);
  for i := Index + 1 to ALength - 1 do
    A[i - 1] := A[i];
  SetLength(A, ALength - 1);
end;

You cannot go wrong with that code. Use whatever value for X you want; in your case, replace it with string. If you want to get fancier and use Move, then there's way to do that, too.

procedure DeleteX(var A: TXArray; const Index: Cardinal);
var
  ALength: Cardinal;
  TailElements: Cardinal;
begin
  ALength := Length(A);
  Assert(ALength > 0);
  Assert(Index < ALength);
  Finalize(A[Index]);
  TailElements := ALength - Index;
  if TailElements > 0 then
    Move(A[Index + 1], A[Index], SizeOf(X) * TailElements);
  Initialize(A[ALength - 1]);
  SetLength(A, ALength - 1);
end;

Since X is string, the Finalize call is equivalent to assigning the empty string to that array element. I use Finalize in this code, though, because it will work for all array-element types, even types that include records, interfaces, strings, and other arrays.

For inserting, you just shift things the opposite direction:

procedure InsertX(var A: TXArray; const Index: Cardinal; const Value: X);
var
  ALength: Cardinal;
  TailElements: Cardinal;
begin
  ALength := Length(A);
  Assert(Index <= ALength);
  SetLength(A, ALength + 1);
  Finalize(A[ALength]);
  TailElements := ALength - Index;
  if TailElements > 0 then begin
    Move(A[Index], A[Index + 1], SizeOf(X) * TailElements);
  Initialize(A[Index]);
  A[Index] := Value;
end;

Use Finalize when you're about to do something that's outside the bounds of the language, such as using the non-type-safe Move procedure to overwrite a variable of a compiler-managed type. Use Initialize when you're re-entering the defined part of the language. (The language defines what happens when an array grows or shrinks with SetLength, but it doesn't define how to copy or delete strings without using a string-assignment statement.)

Rob Kennedy
@Rob, thank you very much. :-) I will try your snippet code, and will accept this as accepted answer if it succeeds. But I have to go to the office now. :-)
Phantom
@Rob, just a suggestion, or maybe a request, we will be happy if you please to update your mentioned article to also include insertion. I like your webpage design.
Phantom