I'm working with a team on a large application in Delphi 2007. It uses a large legacy framework to access the data. Both the app and the framework use String as the datatype for strings. I have started to modify the framework code to support Delphi 2009 strings; see my previous questions about this.

I see 2 alternatives now:

Alt 1 - Continue to use String as before. This is probably the cleanest solution, as the framework will then support Unicode. But the framework code must be modified a lot to make this work. This requires an in-depth understanding of the framework's internal algorithms. There is also a bigger chance of introducing new bugs.

Alt 2 - Replace String with AnsiString and Char with AnsiChar. This is probably a much easier solution, and it is how I started to modify the code (but then I started thinking, and asked this question...). The downside is that there is no Unicode support. Unicode support is not a requirement, since the application worked without it before, but it would be nice to have and could be useful in the future. Another problem is that the application must pass AnsiString variables as parameters to the framework's methods instead of String as before. There are thousands of calls to change...

So I don't know right now. Both options require a lot of work, but Alt 1 is probably more risky and time-consuming. What I want from this forum is feedback and comments, as I guess I am not the first to have this problem.

EDIT Another issue is memory consumption. I wrote a quick test that allocates an array of one million strings. Each string was filled with the 26 chars from A to Z.

With Delphi 2007 it took 40,011,600 bytes and the time was 4:15 minutes. With Delphi 2009 it took 72,015,580 bytes and the time was 4:45 minutes. The memory consumption was measured with GetHeapStatus.TotalAllocated.
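A test along these lines could be sketched as follows. This is a reconstruction, not the original test program: the program name, the `Count` constant and the variable names are assumptions, and the timing part is left out. Since Char is two bytes in Delphi 2009, the 26 characters take 52 bytes of payload instead of 26, plus a fixed per-string overhead, which would fit a ratio somewhat below 2x.

```delphi
program StringMemTest;

{$APPTYPE CONSOLE}

const
  Count = 1000000; // one million strings, as in the test described above

var
  // string is AnsiString in Delphi 2007 and UnicodeString in Delphi 2009
  Strings: array of string;
  Before, After: Cardinal;
  I, J: Integer;
begin
  SetLength(Strings, Count);

  // Measure only the string payloads, not the array of references itself
  Before := GetHeapStatus.TotalAllocated;
  for I := 0 to Count - 1 do
  begin
    SetLength(Strings[I], 26);
    for J := 1 to 26 do
      Strings[I][J] := Chr(Ord('A') + J - 1); // 'A'..'Z'
  end;
  After := GetHeapStatus.TotalAllocated;

  Writeln('SizeOf(Char): ', SizeOf(Char));
  Writeln('Bytes allocated for strings: ', After - Before);
end.
```

Changing the element type to AnsiString and compiling with Delphi 2009 would show whether the AnsiString footprint stays at the Delphi 2007 level.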

I don't think we can afford to have the strings allocate twice as much memory. It is not unusual to have 500 MB of memory consumption for each client now, and I guess much of this is strings. We will probably try to use AnsiString as much as possible.

Regards

+1  A: 

It'll be more work, but I'd really recommend that you upgrade to Unicode strings, because that's the native string type of the VCL and so all your controls will be dealing with Unicode strings anyway. Trying to convert everything back and forth will cause you all sorts of hassles.

Mason Wheeler
As I said, this is the cleanest solution. What worries me is that we have to change code in the framework that we do not fully understand. This requires a lot of testing to ensure quality.
Roland Bengtsson
+3  A: 

Either stay with the old version of Delphi, or go all the way. You'll have to sooner or later anyway.

Note that the "replace everything with AnsiString" scheme is also not entirely foolproof, especially if you touch streams and your file formats need to stay the same. There are no explicit AnsiString versions of TStringList, TStringStream, etc. anymore.

The same probably goes for Datasnap, Indy and other frameworks.

You can try to use this trick for certain string-intensive parts at first, to avoid changing too much code directly. E.g. I had my own XML library, which I patched to remain mostly AnsiString. The library was only used on the side, and Unicode was of no importance to it.

Marco van de Voort
Yes, I have noticed that the AnsiString replacement route does not cover 100% of the cases.
Roland Bengtsson
+1  A: 

Start with "Alt 2", then gradually add Unicode support to your framework, then move over to Unicode.

Rationale: you want a stable app; switching over to Delphi 2009+ will eventually require you to really support Unicode.

Edit: 20100125

While doing "Alt 2", watch the Delphi compiler hints and warnings.
The situation that Andreas describes will generate such hints and warnings.
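As a minimal sketch of the kind of code that triggers these warnings in Delphi 2009 (the program and variable names are made up for illustration): assigning between string and AnsiString without a cast is what the compiler reports as W1057 (implicit string cast) and W1058 (implicit string cast with potential data loss).

```delphi
program CastWarningsDemo;

{$APPTYPE CONSOLE}

var
  U: string;      // UnicodeString in Delphi 2009
  A: AnsiString;
begin
  U := 'some text';

  A := U;             // expected: W1058, implicit cast with potential data loss
  U := A;             // expected: W1057, implicit cast from AnsiString to string

  A := AnsiString(U); // explicit cast: the conversion is intentional, no warning
  U := string(A);     // explicit cast in the other direction

  Writeln(Length(U), ' ', Length(A));
end.
```

Each warning marks a place where a conversion happens behind your back, so the list of warnings doubles as a to-do list of the boundaries between Ansi and Unicode code.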

I have explained this in my CodeRage 4 session about Unicode and other encodings.
The above link points to a page where you can view the replay of that session.

If you still have questions, just drop them here.

--jeroen

Jeroen Pluimers
Alt 2 could cause him more trouble than Alt 1 because the RTL and VCL are Unicode. For example, if he uses the "Pos" function, the compiler will prefer the Unicode version unless both parameters are AnsiStrings. If the first is a string literal or a char literal, the compiler will use the Unicode version. And there are some multibyte strings that are represented in Unicode by a different number of characters, which causes Pos to return the wrong index.
Andreas Hausladen
Interesting, do you have any references or real cases where this can happen?
Roland Bengtsson
I think he means that the conversion from literals to Unicode is preferred over literals to AnsiString when the compiler evaluates which version to call. I think that is correct, though it shouldn't be overestimated. I'd expect that (AnsiString, Char) would pick the AnsiString version though.
Marco van de Voort
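To make the Pos discussion above concrete, here is a small sketch for Delphi 2009 (program and variable names are hypothetical). Which overload the compiler really picks for the first call is exactly what the comments above disagree on, so treat that comment in the code as an assumption; the other two calls simply make sure that both arguments are AnsiString.

```delphi
program PosOverloadDemo;

{$APPTYPE CONSOLE}

var
  Haystack: AnsiString;
  Needle: AnsiString;
begin
  Haystack := 'legacy framework';

  // Literal first argument: according to the comments above, the compiler
  // may pick the Unicode overload and convert Haystack implicitly.
  Writeln(Pos('frame', Haystack));

  // Both arguments are AnsiString variables, so no Unicode conversion is needed.
  Needle := 'frame';
  Writeln(Pos(Needle, Haystack));

  // An explicit cast on the literal has the same effect.
  Writeln(Pos(AnsiString('frame'), Haystack));
end.
```

For plain ASCII data like this, all three calls give the same index. A difference could only show up with multibyte data, where the character count changes during conversion.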
+2  A: 

We evaluated the transition 2007 -> 2009 a year ago and tried it on a smaller project (200k lines). The result was that wherever you do not use "fancy" things like pointers, set of char, etc., the porting is really not that difficult. The GUI units in particular were ported within a day or so. This is equivalent to Alt 1.

The library units with low-level routines, access to measurement systems, etc. were a whole different story. Here we chose to translate string -> AnsiString, char -> AnsiChar, etc. Porting these units is a pain to get correct, and the customer won't pay for the transition. Hence Alt 2 for those units.
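One way to do that translation in a low-level unit is to route it through type aliases, so the choice is made in one place. This is only a sketch: the unit name, the `TLegacy...` aliases and the `CountDigits` routine are made up for illustration and are not from our code base.

```delphi
unit LegacyStrings;

interface

type
  // Hypothetical aliases: the low-level units use these instead of
  // string/Char, so they keep single-byte semantics in Delphi 2009.
  TLegacyString = AnsiString;
  TLegacyChar = AnsiChar;
  PLegacyChar = PAnsiChar;
  TLegacyCharSet = set of AnsiChar;

// Example of a "fancy" routine that relies on set of char.
function CountDigits(const S: TLegacyString): Integer;

implementation

function CountDigits(const S: TLegacyString): Integer;
const
  Digits: TLegacyCharSet = ['0'..'9'];
var
  I: Integer;
begin
  Result := 0;
  for I := 1 to Length(S) do
    if S[I] in Digits then // S[I] is an AnsiChar, so the set test is exact
      Inc(Result);
end;

end.
```

If such a unit is moved to Unicode later, the aliases are the only declarations that need to change first, and the compiler warnings then point to the routines that need real work.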

This mixed method gave us the best of both worlds, but we will keep some larger projects on Delphi 2007 and probably only port them when a 64-bit version of the compiler comes out.

Ritsaert Hornstra
This will probably be the way we do the conversion, but the whole team has to discuss and plan this.
Roland Bengtsson