Saturday, October 07, 2006

WiktionaryZ and RFC 4646

RFC 4646 provides many things that I will gladly follow, particularly the way it allows for extensions. I have discussed the issues that I have with it with Felix Sasaki (W3C) and Gerhard Budin (ISO), among others.
  • Many recognized languages have to "fit" as a subtag of another language. For many people who think "politically" about languages, this is not acceptable. The way it is implemented is also a travesty for people who know about languages. The motivation is backwards compatibility, even though ISO-639-2 was dismissed as useless and ISO-639-3 had to be created really quickly.
  • Some things not yet standardized in the ISO codes (e.g. dialects) are being worked on as new standards. The proposed codes are not known at this time, and this creates a mess of its own.
  • Orthographies are not supported in the planned extensions to ISO-639. This is acknowledged as an omission.
Consequently, it is much better to make a clean break while still conforming to standards. The standards compliance of WiktionaryZ is better than that of the current crop of Wikimedia language codes, which do not conform to any standard and break standards in many places. Some of the language codes in use were voted in with total disregard for what would be valid under the terms of use of the ISO-639 codes.

Let me repeat that WiktionaryZ does include the Wikimedia language codes; it makes sense to have backwards compatibility. At this time, the basis for including a language in WiktionaryZ is its ISO-639-3 code. Where need be, the ISO-639-3 codes are extended, with RFC 4646 providing the guidelines on how to do this. Where an issue falls outside what RFC 4646 answers, we will ask, and do ask, people in standards organizations how to solve it.
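
To make the idea of extending a code concrete, here is a minimal sketch in Python of composing a tag from an ISO-639-3 code plus optional script, region and private-use subtags. It is not WiktionaryZ code. The function name build_tag and the private-use subtag "orth1996" are made up for illustration. The sketch only checks the subtag shapes given in the RFC 4646 syntax (well-formedness); it does not check whether a subtag is actually registered, so it says nothing about whether a given three-letter code is valid under the RFC.

```python
import re

# Subtag shapes as given in the RFC 4646 syntax. Well-formedness only:
# nothing here consults the subtag registry.
LANGUAGE = re.compile(r"[A-Za-z]{2,3}")          # primary language subtag
SCRIPT = re.compile(r"[A-Za-z]{4}")              # script subtag
REGION = re.compile(r"[A-Za-z]{2}|[0-9]{3}")     # region subtag
PRIVATE = re.compile(r"[A-Za-z0-9]{1,8}")        # subtag after the "x" singleton


def build_tag(language, script=None, region=None, private=()):
    """Compose a tag such as 'nld-x-orth1996' (hypothetical example)."""
    if not LANGUAGE.fullmatch(language):
        raise ValueError(f"not a 2-3 letter language subtag: {language!r}")
    parts = [language.lower()]

    if script is not None:
        if not SCRIPT.fullmatch(script):
            raise ValueError(f"not a 4 letter script subtag: {script!r}")
        parts.append(script.title())    # conventional casing, e.g. 'Latn'

    if region is not None:
        if not REGION.fullmatch(region):
            raise ValueError(f"not a region subtag: {region!r}")
        parts.append(region.upper())    # conventional casing, e.g. 'RS'

    if private:
        for subtag in private:
            if not PRIVATE.fullmatch(subtag):
                raise ValueError(f"not a private-use subtag: {subtag!r}")
        # Private-use subtags follow a single 'x' and come last in the tag.
        parts.append("x")
        parts.extend(subtag.lower() for subtag in private)

    return "-".join(parts)


if __name__ == "__main__":
    print(build_tag("nld", private=["orth1996"]))
    print(build_tag("srp", script="latn", region="rs"))
```

The private-use part is the piece that matters for things like orthographies: anything after the "x" is by agreement between the parties using the tag, so it can carry a distinction that the ISO codes do not yet provide.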

We do have a code currently labelled "ISO-639-2" in our database. We could enter something there that is compatible with RFC 4646 once it is figured out what a valid code would be under this regime. The problem is that many of the ISO-639-2 codes are deprecated in ISO-639-3. Creating these codes would be a nice academic exercise, but I do not want to deal with the political fallout it would create.

