ansaurus

Question

How can I use zlib on a Japanese machine?

Answer 1

+1 A:

Stop using a String when you mean to pass a Byte array?

Of course you're going to get automatic ANSI conversions and data recopying when you use Strings as you do here.
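A minimal sketch of the byte-oriented approach, written in Python rather than VB6 only because it is easy to run. The function names are hypothetical and nothing here comes from the asker's code. The text is encoded to bytes explicitly, once, and the compressed data stays a byte buffer the whole way, so no implicit code-page conversion can touch it.

```python
import zlib

def compress_text(text: str, encoding: str = "cp932") -> bytes:
    """Encode the text explicitly, then compress. The result is opaque bytes."""
    raw = text.encode(encoding)
    return zlib.compress(raw)

def decompress_text(blob: bytes, encoding: str = "cp932") -> str:
    """Decompress the opaque bytes, then decode back to the original text."""
    return zlib.decompress(blob).decode(encoding)

original = "日本語 abc"
blob = compress_text(original)

# The compressed data is never treated as text, so it round-trips exactly.
assert isinstance(blob, bytes)
assert decompress_text(blob) == original
```

The VB6 equivalent would be to declare the buffers passed to the DLL as Byte arrays, so the runtime has no string to convert.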

Bob Riemersma 2009-08-27 18:25:43

Using a byte array is annoying. Copying with StrConv (to convert to ANSI) doesn't work perfectly, and copying with CopyMemory includes extraneous nulls.

Brian 2009-08-27 19:59:58

If you absolutely must use strings, use StrPtr so the runtime doesn't convert them to ANSI and back. Buffer lengths must then be doubled, because the string data is passed as two-byte characters.

rpetrich 2009-08-30 11:40:50

Avoiding the byte array doesn't mean you avoid the ANSI conversion. It often just means VB6 does it *implicitly* - for instance when you call a DLL via a Declare and pass a string.

MarkJ 2009-09-02 15:06:56

Answer 2

A:

Bob is right. These are just my footnotes to his answer. Be warned I'm totally unfamiliar with zlib - I'm assuming you're calling a DLL using a Declare for compress.

Using a String instead of a Byte array doesn't mean you avoid the ANSI conversion. It often just means VB6 does the conversion implicitly and you can't control it - for instance when you call a DLL with a Declare statement and pass a string.

It's possible that the magic sequence of bytes returned from the compression is not a valid "ANSI" string on the Japanese code page. Some character sequences are undefined in the MSDN table for that code page. If you are calling a DLL with a Declare statement and expecting a string to be returned into sCompressed, that DLL had better write a valid "ANSI" string into the corresponding buffer. If it writes an invalid sequence of bytes, anything might happen. You will also have trouble on the Chinese (936 and 950) and Korean (949) code pages.

What you're describing might well happen: when compress returns, the invalid sequence of bytes might be converted into a "Unicode" string without any error being reported, perhaps a truncated Unicode string that matches only the first portion of your byte sequence. Then, when you later attempt to decompress, that Unicode string is converted back into an ANSI string, and it doesn't match the original byte sequence you started from. It can't possibly match: no Unicode string converts to a sequence of bytes that isn't a valid "ANSI" string on code page 932.
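The round-trip failure can be reproduced outside VB6. The sketch below uses Python only as a convenient way to poke at code page 932 (which Python calls `cp932`). The helper names are hypothetical, and the `errors="replace"` behaviour is an assumption standing in for whatever substitution the VB6 runtime actually performs.

```python
import zlib

CODE_PAGE = "cp932"  # Python's name for Windows code page 932 (Japanese)

def survives_ansi_round_trip(data: bytes, code_page: str = CODE_PAGE) -> bool:
    """Return True if bytes -> text -> bytes through the code page is lossless."""
    try:
        # Strict decoding raises on byte sequences the code page does not define.
        return data.decode(code_page).encode(code_page) == data
    except UnicodeError:
        return False

def lossy_round_trip(data: bytes, code_page: str = CODE_PAGE) -> bytes:
    """Mimic a conversion that silently substitutes instead of failing."""
    text = data.decode(code_page, errors="replace")
    return text.encode(code_page, errors="replace")

# A lone lead byte with no trail byte is not a valid code page 932 string,
# so it cannot come back unchanged from a trip through text.
bad = b"\x81"
print(survives_ansi_round_trip(bad))
print(lossy_round_trip(bad) == bad)

# Compressed output is arbitrary bytes; whether it happens to survive
# depends entirely on the input, which is why the bug can look intermittent.
blob = zlib.compress("日本語のテキスト".encode(CODE_PAGE))
print(survives_ansi_round_trip(blob))
```

Plain ASCII data survives the trip, which is consistent with this kind of code appearing to work on Western machines with some inputs and failing on others.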

Here's some more info on the terrible mishmash that is VB6's implementation of Unicode: a free chapter from Michael Kaplan's excellent book Internationalization With Visual Basic.

I also suspect you may be confusing the number of characters in a string with the number of bytes it occupies in ANSI representation (I'm suspicious of lStringLen and lcompressedlen). Again, Japanese is a double-byte character set so the ANSI string may take up to 2*N bytes for N characters.

MarkJ 2009-09-02 15:13:09

@MarkJ: I have a dead-tree copy of the Kaplan Book sitting next to me. It's the only book on my desk, actually.

Brian 2009-09-02 17:15:11

Excellent! In that case you should be fine.

MarkJ 2009-09-03 08:22:53
