Hi everybody,

Does anybody know how the ICU Charset Detector's data is built? And is it difficult to add additional languages?

For example, I saw in the bug tracker that a ticket for Thai detection has been open since 2007, with no progress to date.

Thanks

A: 

I would ask your question on the ICU mailing list, or file a bug saying that you are willing to put in the work and data to do it. I couldn't find the ticket you referred to, but ICU is open source, so if you are willing to contribute time and data, that would make a real difference in getting it implemented.

Steven R. Loomis