Hi

While exploring globalsight.com, I came across the segmentation rules (link). They use the full stop (.) as a sentence delimiter. Which segmentation rules can be used for non-Latin languages in which a dot (.) means something other than a delimiter, or for languages that have no such delimiter at all, for example Chinese, Japanese and Korean?

What segmentation rules are used for these "non-Latin" languages (Chinese, Japanese)? How are the segmentation rules developed?

Thanks in advance, Manjushree

+1  A: 

Japanese uses kinsoku shori. Not sure about the other two though.

Ignacio Vazquez-Abrams
A: 

Trados, the leading translation memory application, uses the following segmentation rules:

For Japanese and Chinese:

Full Stop: 。

Colons: ： :

Punctuation: ？ ！ ? !
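
To make the idea of such rules concrete, here is a minimal Python sketch of punctuation-based segmentation for Japanese or Chinese text. It is not how Trados or GlobalSight implement it; the terminator set, the list of closing quotes and the function name `segment` are all assumptions chosen for illustration. Colons are left out of the terminator set here for simplicity.

```python
import re

# Assumed terminator set (illustrative, not taken from any tool's rule file):
# ideographic and fullwidth full stops, plus fullwidth and ASCII ? and !
_TERMINATORS = "。．？！?!"
# Closing quotes/brackets that should stay attached to the sentence they close.
_CLOSERS = "」』）】”’"

# Either: text, one or more terminators, then any closing quotes,
# or: trailing text that has no terminator at all.
_SEGMENT = re.compile(
    rf"[^{_TERMINATORS}]*[{_TERMINATORS}]+[{_CLOSERS}]*|[^{_TERMINATORS}]+"
)

def segment(text: str) -> list[str]:
    """Split text into sentence-like segments at the assumed terminators."""
    pieces = (m.group().strip() for m in _SEGMENT.finditer(text))
    return [p for p in pieces if p]

if __name__ == "__main__":
    for s in segment("今日は晴れです。明日は雨ですか？はい！"):
        print(s)
```

Real rule sets also need exceptions (abbreviations, numbers, quoted speech), which is why tools usually keep the rules in an editable file rather than hard-coding them.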

Mike Sickler
Could you please provide a more detailed explanation?
Manjushree