Saturday, June 10, 2006

Word segmentation for Thai

When you work with languages that are unfamiliar to you, you run into all kinds of issues that are completely new. Thai is one such language for me: a sentence in Thai has no spaces between the words, and consequently it is not clear to me where I could break a sentence onto a new line. This is really relevant for the localisation of MediaWiki; when a text is translated, the translator does not know where it will have to fit. It is therefore important to know where Thai text can be broken.

Luckily there is software, even GPL software, that does word segmentation for Thai. The next questions are how we make it available in MediaWiki and how we are going to use the results.
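
To make the idea a bit more concrete, here is a minimal sketch in Python of what such software does at its simplest. It is not the GPL software mentioned above and has nothing to do with MediaWiki's code; it is a plain greedy longest-match against a word list. The names `TOY_DICTIONARY`, `segment` and `add_break_opportunities` are made up for this illustration, and the five-word dictionary is only a toy. One possible way to use the results, assumed here, is to put a zero-width space (U+200B) between the words so that a browser knows where it may wrap the line.

```python
# Toy word list, purely illustrative; a real segmenter would load a large dictionary.
TOY_DICTIONARY = {"ภาษา", "ไทย", "ไม่", "มี", "ช่องว่าง"}

ZERO_WIDTH_SPACE = "\u200b"  # invisible character that permits a line break


def segment(text, dictionary):
    """Greedy longest-match segmentation.

    Characters that do not start any dictionary word are emitted as
    single-character tokens (a crude fallback, see the note below).
    """
    max_len = max(len(word) for word in dictionary)
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, shrinking until a dictionary word matches.
        for length in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + length]
            if candidate in dictionary:
                tokens.append(candidate)
                i += length
                break
        else:
            # No dictionary word starts here: emit one character and move on.
            tokens.append(text[i])
            i += 1
    return tokens


def add_break_opportunities(text, dictionary):
    """Join the segmented words with zero-width spaces (an assumed way to use the result)."""
    return ZERO_WIDTH_SPACE.join(segment(text, dictionary))


if __name__ == "__main__":
    print(segment("ภาษาไทยไม่มีช่องว่าง", TOY_DICTIONARY))
```

Greedy matching can pick the wrong split when words overlap, and the single-character fallback can separate a vowel or tone mark from its consonant, which is part of why real segmentation software is considerably more involved than this.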


