We have an Oracle database that contains Traditional Chinese characters and English text. Its environment is:

PARAMETER   VALUE
NLS_LANGUAGE    AMERICAN
NLS_TERRITORY   AMERICA
NLS_CURRENCY    $
NLS_ISO_CURRENCY    AMERICA
NLS_NUMERIC_CHARACTERS  .,
NLS_CHARACTERSET    WE8PC850
NLS_CALENDAR    GREGORIAN
NLS_DATE_FORMAT DD-MON-RR
NLS_DATE_LANGUAGE   AMERICAN
NLS_SORT    BINARY
NLS_TIME_FORMAT HH.MI.SSXFF AM
NLS_TIMESTAMP_FORMAT    DD-MON-RR HH.MI.SSXFF AM
NLS_TIME_TZ_FORMAT  HH.MI.SSXFF AM TZR
NLS_TIMESTAMP_TZ_FORMAT DD-MON-RR HH.MI.SSXFF AM TZR
NLS_DUAL_CURRENCY   $
NLS_COMP    BINARY
NLS_LENGTH_SEMANTICS    BYTE
NLS_NCHAR_CONV_EXCP FALSE
NLS_NCHAR_CHARACTERSET  UTF8
NLS_RDBMS_VERSION   9.2.0.4.0

I exported all the data in this database to a *.sql file with "ANSI" encoding, and when I open that file on the same computer, all the Chinese characters are corrupted.

I then imported the file into another Oracle database, whose environment is:

> NLS_LANGUAGE|AMERICAN
> NLS_TERRITORY|AMERICA
> NLS_CURRENCY|$
> NLS_ISO_CURRENCY|AMERICA
> NLS_NUMERIC_CHARACTERS|.,
> NLS_CHARACTERSET|WE8MSWIN1252
> NLS_CALENDAR|GREGORIAN
> NLS_DATE_FORMAT|DD-MON-RR
> NLS_DATE_LANGUAGE|AMERICAN
> NLS_SORT|BINARY
> NLS_TIME_FORMAT|HH.MI.SSXFF AM
> NLS_TIMESTAMP_FORMAT|DD-MON-RR HH.MI.SSXFF AM
> NLS_TIME_TZ_FORMAT|HH.MI.SSXFF AM TZR
> NLS_TIMESTAMP_TZ_FORMAT|DD-MON-RR HH.MI.SSXFF AM TZR
> NLS_DUAL_CURRENCY|$
> NLS_COMP|BINARY
> NLS_LENGTH_SEMANTICS|BYTE
> NLS_NCHAR_CONV_EXCP|FALSE
> NLS_NCHAR_CHARACTERSET|AL16UTF16
> NLS_RDBMS_VERSION|10.2.0.1.0

After the import, all the Chinese characters are still corrupted. Could someone give me some advice?

I also have another question: why can we sometimes save Traditional Chinese or Simplified Chinese text into a file with "ANSI" encoding without the characters getting corrupted, while at other times they do get corrupted? Can someone explain this strange behavior?

Thanks in advance!

A: 

Hi Hooligan,

The reason the Chinese characters get "corrupted" is simply that they are absent from the "ANSI" character set (windows-1252). This character set only includes Latin characters.
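As a rough illustration of that point (not Oracle-specific), the following Python sketch checks which code pages can represent a sample string at all. The sample text and the `can_encode` helper are hypothetical, chosen only for this example; `cp1252` and `cp850` are Python's codec names for windows-1252 and the code page behind WE8PC850.

```python
# Sample Traditional Chinese text; purely illustrative, not taken from the database.
text = "中文"

def can_encode(s, codec):
    """Return True if every character of s exists in the given character set."""
    try:
        s.encode(codec)
        return True
    except UnicodeEncodeError:
        return False

# Two Latin-only code pages versus two character sets that include Chinese.
results = {codec: can_encode(text, codec)
           for codec in ("cp1252", "cp850", "big5", "utf-8")}
print(results)
```

The Latin-only code pages have no slot for these characters, so any conversion into them has to substitute or drop the characters.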

By chance, some applications will sometimes display them correctly, because they are clever enough to notice that the wrong character set was declared (i.e. they guess the real encoding; Notepad does this, for example). In your case, however, you should either:

  • use a character set that includes the Chinese characters, e.g. UTF-8,
  • set up your client application to use a character set that includes the Chinese characters (NLS_LANG=AMERICAN_AMERICA.UTF8) and use a utility that does the work for you (SQL*Loader, database link, ...)
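
The reason the declared client character set matters can be sketched in Python, under the assumption that the exported file really contains UTF-8 bytes. This is only an analogy for what a reader of the file does, not a reproduction of Oracle's conversion logic; the variable names are made up for the example.

```python
# Illustrative sample text, not taken from the database.
text = "中文"

# Bytes as they would be written to a file saved as UTF-8.
raw = text.encode("utf-8")

# A reader that is told the correct character set recovers the text.
decoded_right = raw.decode("utf-8")

# A reader that assumes windows-1252 turns the same bytes into Latin gibberish.
# errors="replace" keeps the sketch from raising on bytes cp1252 leaves undefined.
decoded_wrong = raw.decode("cp1252", errors="replace")

print(decoded_right == text)   # the round trip is lossless
print(decoded_wrong == text)   # the misread text no longer matches
```

The bytes are identical in both cases; only the character set assumed by the reader differs, which is why the setting on the client side has to match what is actually in the file.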
Vincent Malgrat