questions about cjk | ansaurus

cjk

Why doesn't my CID (type 11) font work in GS8.61 on Windows?

I have a customer in Macau who uses Windows EUDC for custom Big5 glyphs. I used FontForge on Linux to convert the .TTE into a type 11 (CID type 2) font and created a custom CMap to map the Big5 code points to the correct glyphs in the font. This all works fine and dandy in GS8.60 on Windows and GS8.61 - GS8.63 on Linux. When loading t...

Validating Kana Input

I am working on an application that allows users to input Japanese language characters. I am trying to come up with a way to determine whether the user's input is Japanese (hiragana, katakana, or kanji). There are certain fields in the application where entering Latin text would be inappropriate and I need a way to limit certain ...

language-agnostic

File upload mojibake

How do you do a file upload in an HTML form without running into mojibake? I have a form that has three fields: a file field, a required text field, and a text field which accepts Japanese characters. I've set up my HTML form with the attribute enctype='multipart/form-data'. But when the form submission fails due to the missing required f...

character-encoding

Newline control characters in multi-byte character sets

I have some Perl code that translates newlines and line feeds to a normalized form. The input text is Japanese, so there will be multi-byte characters. Is it still possible to do this transformation on a byte-by-byte basis (which I think it currently does), or do I have to detect the character set and enable Unicode support? In ot...

character-encoding
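A minimal sketch of the byte-level question above, written in Python rather than the asker's Perl. It assumes the input is in an ASCII-compatible multi-byte encoding such as UTF-8, EUC-JP, or Shift_JIS, where the bytes 0x0D and 0x0A never occur inside a multi-byte character. The helper names `normalize_newlines_bytes` and `roundtrip` are hypothetical. The last lines show why the same byte-wise approach is not safe for UTF-16.

```python
def normalize_newlines_bytes(data: bytes) -> bytes:
    """Turn CRLF and lone CR into LF, working on raw bytes only."""
    return data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")


def roundtrip(text: str, encoding: str) -> str:
    """Encode, normalize at the byte level, and decode again."""
    return normalize_newlines_bytes(text.encode(encoding)).decode(encoding)


SAMPLE = "一行目\r\n二行目\r三行目\n"
EXPECTED = "一行目\n二行目\n三行目\n"

# Safe: in these encodings 0x0D and 0x0A only ever mean CR and LF.
for encoding in ("utf-8", "euc_jp", "shift_jis"):
    assert roundtrip(SAMPLE, encoding) == EXPECTED

# Not safe: in UTF-16LE the character 」 (U+300D) is stored as 0x0D 0x30,
# so a byte-wise CR -> LF replacement silently turns it into 《 (U+300A).
corrupted = roundtrip("」", "utf-16-le")
print(repr(corrupted))
```

If the encoding might be UTF-16 or is unknown, decoding to characters first and normalizing the decoded string avoids this kind of corruption.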

Japanese characters in a LaTeX \section{} cause an error.

I am working on getting Japanese documents created with LaTeX. I have installed the latest version of texlive-2008, which includes CJK. In my document I have the following: \documentclass{class} \usepackage{CJK} \begin{document} \begin{CJK*}{UTF8}{min} \title{[Japanese Characters here 1]} \maketitle \section{[Japanese Characters here 2]...

How to make Vim recognize CJK characters and render them larger than ASCII?

Hello there, I am using vim to work on both Chinese and Western text. The default font size is okay for Western text, but the Chinese characters, while readable, are too small for my taste. Can I tell vim to render CJK fonts with, say, 14pt while not affecting the font size of all other text? Thanks for your ideas/solutions! Guba ...

Zend_Lucene CJK support

Does anyone know whether the Zend_Lucene class supports CJK (Chinese, Japanese, Korean)? I want to use it on my own website; the only requirement is that it work for both English and Japanese. Also, any resources about CJK support in the Java version would be appreciated. Thanks ...

Word break in languages without spaces between words (e.g., Asian)?

I'd like to make MySQL full text search work with Japanese and Chinese text, as well as any other language. The problem is that these languages and probably others do not normally have white space between words. Search is not useful when you must type the sentence exactly as it appears in the text. I cannot just put a space between every charact...

full-text-search

Detect CJK characters in PHP

Hello, I've got an input box that allows UTF8 characters -- can I detect whether the characters are in Chinese, Japanese, or Korean programmatically (part of some Unicode range, perhaps)? I would change search methods depending on if MySQL's fulltext searching would work (it won't work for CJK characters). Thanks! ...

language-detection
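A rough sketch of the range-based idea the question hints at, written in Python rather than PHP. The set of Unicode blocks below is an assumption chosen for illustration and is not exhaustive; `CJK_PATTERN` and `contains_cjk` are hypothetical names. Because Han characters are unified across languages, a range check like this can tell that text contains CJK characters, but not whether a given ideograph is Chinese or Japanese.

```python
import re

# Assumed selection of Unicode blocks; extend it if other blocks matter.
CJK_PATTERN = re.compile(
    "["
    "\u3040-\u309f"          # Hiragana
    "\u30a0-\u30ff"          # Katakana
    "\u3400-\u4dbf"          # CJK Unified Ideographs Extension A
    "\u4e00-\u9fff"          # CJK Unified Ideographs
    "\uac00-\ud7af"          # Hangul Syllables
    "\uf900-\ufaff"          # CJK Compatibility Ideographs
    "\U00020000-\U0002a6df"  # CJK Unified Ideographs Extension B
    "]"
)


def contains_cjk(text: str) -> bool:
    """Return True if any character falls in one of the blocks above."""
    return CJK_PATTERN.search(text) is not None


for sample in ("hello world", "一天吃一個蘋果", "ひらがな", "한국어", "café"):
    print(sample, contains_cjk(sample))
```

A search layer could call `contains_cjk` on the query and fall back to a non-FULLTEXT strategy when it returns True.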

xelatex Invalid fontname

I want to use the OpenOffice Chinese fonts, e.g. AR PL SungtiL GB, but xelatex tells me that it is an invalid name (as shown below). It seems like the font name has spaces and so it doesn't recognize it? How should I get around this? (/usr/share/texmf-texlive/tex/latex/base/syntonly.sty)kpathsea: Invalid fontname `AR PL SungtiL GB', ...

MySQL search Chinese characters

Hello, Let's say I have a row: 一天吃一個蘋果 Someone enters as a query: 天蘋 Should I break up the characters in the query, and individually perform a LIKE % % match on each character against the row, or is there any easier way to get a row that contains one of the two characters? FULLTEXT won't work with CJK characters. Thanks! ...
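The per-character LIKE approach described in the question can be sketched as follows. This is an illustration only: it uses Python's built-in sqlite3 module and `?` placeholders so that it runs as-is, whereas the question is about MySQL, whose drivers use a different placeholder style. The function `build_any_char_query`, the table `phrases`, and the column `body` are hypothetical, and the table and column names are assumed to be trusted identifiers.

```python
import sqlite3


def build_any_char_query(table: str, column: str, query: str):
    """Build a query matching rows that contain ANY character of `query`.

    Whitespace and the LIKE wildcards % and _ are dropped rather than
    escaped, which keeps this sketch short.
    """
    chars = [c for c in dict.fromkeys(query) if not c.isspace() and c not in "%_"]
    if not chars:
        raise ValueError("query contains no searchable characters")
    clause = " OR ".join(f"{column} LIKE ?" for _ in chars)
    params = [f"%{c}%" for c in chars]
    return f"SELECT {column} FROM {table} WHERE {clause}", params


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE phrases (body TEXT)")
conn.executemany(
    "INSERT INTO phrases VALUES (?)",
    [("一天吃一個蘋果",), ("hello world",)],
)

sql, params = build_any_char_query("phrases", "body", "天蘋")
rows = conn.execute(sql, params).fetchall()
print(sql, params, rows)
```

Replacing `" OR "` with `" AND "` would require every character of the query to appear in the row instead of any one of them.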

Is it possible to use big5 character set in Drupal?

A potential client of ours wants to use the big5 character encoding on their new website. Is it possible to use the Drupal CMS to make a site in big5? It's possible this question does not make sense, because I don't find many Google results with big5 drupal as a keyword... Please help! ...

internationalization

chinese-characters

How to pdflatex with CJK characters/font/encoding

What's the best way to combine pdflatex with CJK characters/font/encoding? I'd like to generate PDF that includes CJK characters, and in the future all possible Unicode characters. I'm thinking about using 'The CJK package for LaTeX' for CJK characters specifically but it seems not to have been maintained since 2006. Can you suggest somethi...

How do I use Unicode Character Combining with Kanji/Hanzi ?

I'm trying to find a workaround to display old and rare characters in Unicode using character combining. Currently I'm converting some dictionaries from EPWING into text and there are 36 different characters which cannot be reproduced using normal UTF-8. Below is the problem section of the EPWING gaiji-to-Unicode mappings for one of the ...

Cleaning up UTF-16/CJK characters using PHP?

I have some files on my computer that are in UTF-16, though this seems to be because of errors or corruption of the files rather than intent - they're supposed to be plain English. I uploaded one of these (here). If I leave the encoding in Firefox (View > Character Encoding) at UTF-8 then I get tons of gibberish (see screenshot). If I chan...

character-encoding

How to classify Japanese characters as either Kanji or Kana

Given text such as 誰か確認上記これらのフ, how can I classify each character as kana or kanji? To get something like this: 誰 - kanji, か - kanji, 確 - kanji, 認 - kanji, 上 - kanji, 記 - kanji, こ - kana, れ - kana, ら - kana, の - kana, フ - kana (sorry if I did it incorrectly) ...
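One way to approach this, sketched in Python under the assumption that a Unicode block lookup is good enough: treat the Hiragana and Katakana blocks as kana and the main CJK Unified Ideographs blocks as kanji. The function name `classify` is hypothetical, and the ranges are approximate (the Katakana block, for instance, also contains a few punctuation marks). Note that with this scheme か falls in the Hiragana block and so comes out as kana.

```python
def classify(ch: str) -> str:
    """Label a single character as 'kana', 'kanji', or 'other' by code point."""
    cp = ord(ch)
    if 0x3040 <= cp <= 0x309F or 0x30A0 <= cp <= 0x30FF:
        return "kana"   # Hiragana or Katakana block
    if 0x4E00 <= cp <= 0x9FFF or 0x3400 <= cp <= 0x4DBF:
        return "kanji"  # CJK Unified Ideographs or Extension A
    return "other"


for ch in "誰か確認上記これらのフ":
    print(ch, "-", classify(ch))
```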

How do you sort CJK (Asian) characters in Perl, or with any other programming language?

How do you sort Chinese, Japanese and Korean (CJK) characters in Perl? As far as I can tell, sorting CJK characters by stroke count, then by radical, seems to be the way these languages are sorted. There are also some methods that sort by sounds, but this seems less common. I've tried using: perl -e 'print join(" ", sort qw(工然一人三 ...

chinese-characters
