Wednesday, August 30, 2006

500 users

Just a short note: User Olli was the 500th person to subscribe to WiktionaryZ.

Tuesday, August 22, 2006

The simple wiktionary

As well as doing things to get WiktionaryZ going, I usually have something like 6 bots running on many wiktionaries. With some regularity I do the smaller wiktionaries too. One wiktionary I really like because of its crisp user interface is the simple wiktionary. Simple intends to be simply easy to use. It does not want to be difficult; when you want more information than it provides, it just refers you to the English wiktionary. It does not do translations, it does not do etymology, but it does inflections and it uses colours and layout to good effect.

When I look at this, I want to have something similar for WiktionaryZ. Now WiktionaryZ does have translations, it will have etymology, it will have everything under the sun, and so having a neat user interface is a real challenge.

Today I copied a Basic English alphabetical wordlist from the simple wiktionary. This wordlist, originally by Charles Kay Ogden, contains 850 words. There is bound to be a lot of theory behind it; for me it is a reminder that we need both the simple words and the difficult ones. The words that people often have difficulty pronouncing, like "immunoassay", we need them too, but we will also need a way to define them so that people can understand what is meant.


Saturday, August 19, 2006

Armenian and Georgian

Well, yes, I missed two languages. Also
  • Armenian
  • Georgian
were added :-) As you can see, the number of languages is really growing, but it is still far away from the approx. 7,000 that will be there one day (no, not kidding :-)

It's about new languages again :-)

I just had a look at a term where I thought I would add a translation and then I saw some more languages ... of course I now had to look which ones were added:
  • Afrikaans
  • Akan
  • Arabic (standard)
  • Ewe
  • Hausa (Arabic script)
  • Hausa (Latin script)
Quite a busy week when it comes to new languages, right? I hope I didn't miss any.

Thursday, August 17, 2006

TMX repository

The first TMX (Translation Memory eXchange) files were uploaded to WiktionaryZ. Translators use and create these TMX files during their daily work. Of course, confidential information will not be found there, but there are many translations that deal with absolutely neutral issues where you cannot tell who a translation was for, and there are plenty of OpenSource and OpenContent projects that can share their translation data in this way.

This repository can also be very helpful for the localization of Wikipedia content. Imagine all those languages (like many regional languages) where people have difficulty writing sentences, where the literacy rate is around 4% ... When they use such files in CAT tools like OmegaT, they can look up how single words were used in other sentences before. This means that they can imitate correct sentences and improve their language. They will face fewer hurdles when it comes to writing, and we can obviously get better quality and more content in less time.

Then imagine all those countries where people work offline anyway ... they can use the data for their translations and do the same. In many countries, for example in Africa, online time is very expensive. Working with OmegaT and sharing TMX files, they can ensure quality and consistent terminology.

I know, we have only two files for now ... I am sure: the repository will grow over time.
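
To make the "look up how a word was used before" idea concrete, here is a minimal sketch in Python. It is not how OmegaT works internally; it only assumes the usual TMX layout of translation units (tu) holding one variant (tuv) per language with the text in a seg element. The sample sentences and the function name find_usages are made up for illustration and are not taken from the uploaded files.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes the xml:lang attribute under this qualified name.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Made-up sample data, only for illustration.
SAMPLE_TMX = """<tmx version="1.4">
  <header creationtool="example" creationtoolversion="1.0" segtype="sentence"
          o-tmf="example" adminlang="en" srclang="en" datatype="plaintext"/>
  <body>
    <tu>
      <tuv xml:lang="en"><seg>The truce was signed yesterday.</seg></tuv>
      <tuv xml:lang="de"><seg>Der Waffenstillstand wurde gestern unterzeichnet.</seg></tuv>
    </tu>
    <tu>
      <tuv xml:lang="en"><seg>My cousin lives in Rome.</seg></tuv>
      <tuv xml:lang="de"><seg>Mein Cousin wohnt in Rom.</seg></tuv>
    </tu>
  </body>
</tmx>"""


def find_usages(tmx_text, word, lang):
    """Return every translation unit whose segment in `lang` contains `word`.

    Each hit is a dict mapping language code to segment text, so the
    sentence can be read side by side in all languages of that unit.
    """
    root = ET.fromstring(tmx_text)
    hits = []
    for tu in root.iter("tu"):
        segments = {}
        for tuv in tu.findall("tuv"):
            seg = tuv.find("seg")
            if seg is not None:
                # itertext() also collects text inside inline markup elements
                segments[tuv.get(XML_LANG)] = "".join(seg.itertext())
        # Simple case-insensitive substring match, not real word matching
        if word.lower() in segments.get(lang, "").lower():
            hits.append(segments)
    return hits


if __name__ == "__main__":
    for unit in find_usages(SAMPLE_TMX, "truce", "en"):
        for language, text in unit.items():
            print(language, text)
```

A plain substring search like this is about the simplest thing that could work; a real CAT tool does fuzzy matching on whole segments, which is far more forgiving.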

Useful .. in some ways yes

When there is a word that I think should be in WiktionaryZ, I add it. Given that we know from Wikipedia that people want to know about sex, sport and the news, I add words that are in the news. Words like "truce" or "Somaliland" could be found on my favorite news site.

The insanely great thing is that when I add a translation, I know it will stay with that meaning of a word, and that for every translation I add there is an article to access the concept in that language as well. As I want to demonstrate that WiktionaryZ does do other scripts, I have added a fair number of words in Greek, Russian, Hebrew and Japanese. Also a fair number of words in Esperanto, but that is a different story.

Having many people add words means that at some stage you cannot add a word: it is already there. This means that WiktionaryZ starts to become useful in a VERY modest way. The good news is that the content only gets richer. It is no news that many of the issues with our content will have to be addressed when we get richer functionality.

The thing that truly amazes me is that we are already at a stage where many ordinary words are there, in many languages.

NB since the last post we added Catalan to the list of languages that are supported :)

Tuesday, August 15, 2006

Word of the Day

Today we added the Word of the Day template to the main page. This means that from now on we plan to have a daily change there and will use, if possible, terms that are related to the news. Of course you are invited to have your daily portion of "WZ".

Friday, August 11, 2006

Piedmontese - Ukrainian - Venetian - three new languages added

As of yesterday evening, people on WiktionaryZ can also work on the languages Piedmontese, Ukrainian and Venetian.

Thursday, August 03, 2006

Identical meanings and not ...

Well, an example is the easiest way to describe things, so let's take the word: cousin

When you look at this word you will find three defined meanings:

- The child of a person's uncle or aunt
- A daughter of a person's uncle or aunt
- A son of a person's uncle or aunt

Now when translating from English to other languages, let's say German, you have (at least) four translations for the first defined meaning and (at least) two for each of the other two.

This means in the first case you will for example add the German translations:
Cousin, Vetter, Cousine, Base
and the check box for identical meaning must remain empty.

In the second case you will have: Cousine, Base
And in the third case: Cousin, Vetter

When you look at this now (3 August 2006, 13:49 UTC) you will see that people did not add the translations to the first defined meaning. I suppose this is because it is still not 100% clear how to handle these situations. (And this is the reason for this note :-)
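
For those who prefer to see it as data, the cousin example could be sketched like this in Python. This is only an illustration, not the real WiktionaryZ database model; the class and function names are made up. One assumption to flag loudly: the note above only says that the check box stays empty for the first defined meaning, so ticking it for the daughter and son meanings is my own reading.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Translation:
    language: str
    expression: str
    identical_meaning: bool  # the "identical meaning" check box


@dataclass
class DefinedMeaning:
    definition: str
    translations: List[Translation] = field(default_factory=list)

    def add(self, language, expression, identical_meaning):
        self.translations.append(
            Translation(language, expression, identical_meaning)
        )

    def expressions(self, language, identical_only=False):
        """List the expressions for a language, optionally only exact matches."""
        return [
            t.expression
            for t in self.translations
            if t.language == language
            and (t.identical_meaning or not identical_only)
        ]


def build_cousin_meanings():
    """Build the three defined meanings of 'cousin' with German translations."""
    child = DefinedMeaning("The child of a person's uncle or aunt")
    for word in ["Cousin", "Vetter", "Cousine", "Base"]:
        # Each German word covers only one gender, so the box stays empty.
        child.add("German", word, identical_meaning=False)

    daughter = DefinedMeaning("A daughter of a person's uncle or aunt")
    for word in ["Cousine", "Base"]:
        # Assumption: the box is ticked here, the note does not say so.
        daughter.add("German", word, identical_meaning=True)

    son = DefinedMeaning("A son of a person's uncle or aunt")
    for word in ["Cousin", "Vetter"]:
        # Assumption: the box is ticked here, the note does not say so.
        son.add("German", word, identical_meaning=True)

    return child, daughter, son


if __name__ == "__main__":
    for meaning in build_cousin_meanings():
        print(meaning.definition, meaning.expressions("German"))
```

The point of the flag is that all four German words belong with the first defined meaning, yet none of them means exactly the same thing as the gender-neutral English word.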