Over the last few days we have had some strange problems with our database components, which are developed by a third party. There have been no changes to these components for months. The code that HAS changed in the last few days is our own code, and we have also updated our GUI components, which are developed by another third party.

After debugging I have found that a call to System.Move in one of the database component procedures occasionally gives wrong results!

Please take a look at the code below from the database components and read my comments. How can this inconsistent behaviour happen? Can anyone give me an idea of how to proceed to find the cause? NB! I don't think there is anything wrong with THIS code; it is only shown to explain the problem "symptoms". My guess is that there is some sort of memory corruption, caused by our code or by the updated GUI component code.

Edit: Take a look at the blog post linked below. It seems that it could be related to my problem. At least as I read it, it confirms that System.Move can give wrong results: http://blog.excastle.com/2007/08/28/delphi-bug-of-the-day-fpu-stack-leak/

Edit: Sorry for not posting my "solution" earlier, but here it comes: when using Delphi 2007, my problem was solved by using FastMove, which replaces System.Move. After upgrading to Delphi 2010 I have yet to encounter the problem, and we are no longer using FastMove.

Procedure InternalDescribe;
var 
  cbufl: sb4; //sb4=LongInt
  cbuf: array[0..30] of char;
  cbufp: PChar;
  //....
begin
  //..Some code
  repeat
    //...Some code to initialize cbufp and cbufl

    //On the 15th iteration the values immediately before Move are always these:
    //cbufp = 'STDPRODUCTSTOREDELEMENTSCOUNT'
    //cbuf = ('S', 'T', 'A', 'T', 'U', 'S', #0, 'E', 'V', 'A', 'R', 'R', 'E', 'C', 'I', 'D', #0, 'D', 'U', 'C', 'T', 'I', 'D', #0, #0, #0, #0, #0, #0, #0, #0)
    //cbufl = 29

    Move(cbufp^, cbuf, cbufl);

    //Values immediately After Move should then be:
    //cbuf = ('S', 'T', 'D', 'P', 'R', 'O', 'D', 'U', 'C', 'T', 'S', 'T', 'O', 'R', 'E', 'D', 'E', 'L', 'E', 'M', 'E', 'N', 'T', 'S', 'C', 'O', 'U', 'N', 'T', #0, #0)

    //But sometimes this Move results in this value (1 in 5..15 times):
    //cbuf = ('S', 'T', 'D', 'P', 'R', 'O', 'D', 'U', 'C', 'T', 'S', 'T', 'O', 'R', 'E', 'D', #0, #0, #0, #0, #0, 'N', 'T', 'S', 'C', 'O', 'U', 'N', 'T', #0, #0)

  until SomeCondition; 
  //...Some more code
end;
A: 

Is it possible to revert to the older GUI component code without modifying your current code? That way, you could find out whether the cause is your code or the GUI components.

Another question is whether you're using multiple threads or not.

Edit: I only wanted you to revert the GUI components for testing purposes. You should still update them to the newest version. But I have another thing for you to try. Have you tried zeroing the buffer before the move operation? See the FillChar procedure to achieve this. Does this help?

Scoregraphic
Yes, it is possible to revert. I had planned to try this but I never got to it. I am actually quite sure the problem comes from the GUI components, because when I "detach" the GUI from my code I have yet to see the problem. However, I don't want to just revert and be happy with it, because we upgraded the components for a reason. But I will try it just to confirm or disprove my suspicions. Too bad I will not be at work for a few days.
Fredrik Loftheim
+5  A: 

Move doesn't give wrong results, or at least I've never seen any situation in which it did. It's more likely that you've got something unexpected in the buffer. Try adding calls to Windows.OutputDebugString in this routine to see what you're copying before and after.
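A minimal sketch of that kind of tracing, assuming a Windows target where the output is viewed in the IDE event log or a tool such as DebugView. `HexDump` is a hypothetical helper invented for this illustration (it is not part of the RTL or the database components); it prints bytes as hex so that embedded #0 bytes stay visible in the log.

```delphi
program MoveTrace;
{$APPTYPE CONSOLE}
uses
  Windows, SysUtils;

// Hypothetical helper: renders Count bytes of Buf as hex text.
function HexDump(const Buf; Count: Integer): string;
var
  P: PByte;
  I: Integer;
begin
  Result := '';
  P := @Buf;
  for I := 0 to Count - 1 do
  begin
    Result := Result + IntToHex(P^, 2) + ' ';
    Inc(P);
  end;
end;

var
  Src, Dst: array[0..30] of Byte;
begin
  FillChar(Src, SizeOf(Src), $41);  // recognisable source pattern
  FillChar(Dst, SizeOf(Dst), 0);    // known destination state

  OutputDebugString(PChar('before: ' + HexDump(Dst, 29)));
  Move(Src, Dst, 29);
  OutputDebugString(PChar('after:  ' + HexDump(Dst, 29)));
end.
```

Logging both buffers this way records what was actually copied on each call, independent of what the debugger's watch window shows.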

Mason Wheeler
I know System.Move doesn't usually give wrong results, but this time it does as far as I can see. If you look at my comments in the code, I have actually watched the correct input to System.Move sometimes give the wrong result.
Fredrik Loftheim
+2  A: 

Careful - you're assuming that a Char is 1 byte. That was fine before D2009, but in D2009 and D2010 a Char is 2 bytes. Move always works with bytes. Is it possible these problems started after you upgraded to D2009 or D2010?
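To illustrate the point, here is a small sketch (not the database component's code) that scales the character count by `SizeOf(Char)` before calling Move, so the same source compiles correctly whether Char is 1 or 2 bytes. The variable names are made up for this example.

```delphi
program CharSizeMove;
{$APPTYPE CONSOLE}
var
  Src: string;
  Buf: array[0..30] of Char;
  CharCount: Integer;
begin
  Src := 'STDPRODUCTSTOREDELEMENTSCOUNT';
  CharCount := Length(Src);  // number of characters, not bytes
  FillChar(Buf, SizeOf(Buf), 0);

  // Move counts bytes, so multiply the character count by the element size.
  Move(Src[1], Buf, CharCount * SizeOf(Char));

  Writeln(SizeOf(Char), ' byte(s) per Char, copied ',
    CharCount * SizeOf(Char), ' bytes');
end.
```

Passing the bare character count would copy only half of the string on a compiler where Char is 2 bytes.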

Jim
We are using Delphi 2007, so this is not the reason for my problems.
Fredrik Loftheim
A: 

I can confirm that it does fail sometimes. I've just spent a few days tracking it down and could not believe it. In our case we have a .NET 2.0 web site running under IIS 6 or IIS 7 calling some COM components written in Delphi 2007, and under a moderate load it would all of a sudden start failing to move bytes 16-19 of 28 bytes - sometimes. Most of the time it works. You are most likely going to strike problems with moves of sizes in the range 9..31 bytes.

We ended up putting a CompareMem() check after each System.Move() and found that the CompareMem failed sometimes - and this was moving between two buffers/arrays/structures allocated on the stack! Boy, was I surprised!

It took ages to reproduce. In essence, System.Move from D2006 onwards is unreliable due to stuff getting left on the FPU stack. Everything would be fine if the FPU stack were clear.

The blog post noted above is correct. However, whatever the fix is, it does not affect System.Move(), and therefore if you have a DLL/COM component written in Delphi 2006 or later you will have problems at some stage.

I checked out D2010 and the code in System.Move has not been changed. In our case, I'm going to revert System.Move to the Delphi 7 version - just recompile all the system units using the make file.
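A rough sketch of what such a check might look like. `CheckedMove` is a hypothetical wrapper written for this illustration, not the code we actually used; it assumes the source and destination do not overlap, since the comparison would not be meaningful otherwise.

```delphi
program MoveCheck;
{$APPTYPE CONSOLE}
uses
  SysUtils;

// Hypothetical wrapper: copies Count bytes, then verifies the copy.
// Assumes Source and Dest do not overlap.
procedure CheckedMove(const Source; var Dest; Count: Integer);
begin
  Move(Source, Dest, Count);
  if not CompareMem(@Source, @Dest, Count) then
    raise Exception.CreateFmt('Move corrupted a %d-byte copy', [Count]);
end;

var
  Src, Dst: array[0..27] of Byte;
  I: Integer;
begin
  // Distinct byte values make a partially failed copy easy to spot.
  for I := Low(Src) to High(Src) do
    Src[I] := I + 1;
  FillChar(Dst, SizeOf(Dst), 0);

  CheckedMove(Src, Dst, SizeOf(Src));
  Writeln('copy verified');
end.
```

On a healthy process the exception never fires; the check only detects a bad copy, it does not explain or prevent one.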

Myles Penlington
A: 

I have the same problem: it seems the FPU stack is not always properly cleared by PNG + StretchBlt?
See: Memory corruption in System.Move due to changed 8087CW mode (png + stretchblt)

I think System.Move has to clear the FPU stack before moving?

André
See my added "solution" in my edited question.
Fredrik Loftheim