User submitted translations

The German word “Hase” (Engl. hare) is translated incorrectly into at least FR, ES, EN, and RU. It seems to be mixed up with “Hass” (Engl. hate, Russ. ненависть). The correct translations are hare (EN), liebre (ES), lièvre (FR), and заяц (RU).

Translations to non-English languages pivot through English first, so if de->en is wrong then all the others will be too. The way to correct issues like this is to have examples of correct translations in the training data. There’s been some work on this, but we currently don’t have a great system for collecting and publishing translation data.
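
To make the pivoting point concrete, here is a toy sketch. It is not how the actual models work internally: the lookup tables and the `pivot_translate` helper are made up for illustration and stand in for trained de->en and en->xx models.

```python
# Toy lookup tables standing in for trained models (hypothetical data).
# "Hase" is deliberately mapped to the wrong English word to mimic the bug.
DE_EN = {"Hase": "hate", "Hass": "hate"}
EN_FR = {"hate": "haine", "hare": "lièvre"}
EN_ES = {"hate": "odio", "hare": "liebre"}


def translate(word, table):
    """Look up a word; fall back to the input if it is unknown."""
    return table.get(word, word)


def pivot_translate(word, to_english, from_english):
    """Translate via English: source -> en -> target."""
    english = translate(word, to_english)
    return translate(english, from_english)


# The de->en error shows up in every target language.
print(pivot_translate("Hase", DE_EN, EN_FR))
print(pivot_translate("Hase", DE_EN, EN_ES))

# Fixing only the de->en step fixes all pivoted targets at once.
fixed_de_en = dict(DE_EN, Hase="hare")
print(pivot_translate("Hase", fixed_de_en, EN_FR))
```

The upside of pivoting is the same as the downside: one corrected de->en example helps every de->xx pair.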

Indeed, for the moment suggestions can be submitted, but nothing is planned to verify them or put them to use. Currently only the owner of an instance (@pierotofy in the case of libretranslate.com) can access and upload them, but we would have to work out where, how, and in what format.
@argosopentech adding this data to the argos-parallel-corpus data.json will make that one file very hard to maintain, I think. Maybe think about another system?

Yes agreed, I just wanted to get something up. A database like in your pull request seems reasonable to me but I can see a lot of systems working. We probably want to set up some centralized system for people to upload data automatically and then release it under an open license.

LibreTranslate could be set up to automatically (every day or every week) send a copy of the suggestions database to a URL, which could then manage this data.
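
A rough sketch of what such a periodic export might look like, using only the Python standard library. The table name `suggestions`, its columns (`q`, `s`, `source`, `target`), and the collector URL are assumptions made for illustration, not a confirmed schema or endpoint.

```python
import json
import sqlite3
import urllib.request


def load_suggestions(conn):
    """Read all suggestions from an open SQLite connection.

    Assumes a table `suggestions(q, s, source, target)`; adjust to the
    real schema of the instance.
    """
    rows = conn.execute(
        "SELECT q, s, source, target FROM suggestions"
    ).fetchall()
    return [
        {"q": q, "s": s, "source": src, "target": tgt}
        for q, s, src, tgt in rows
    ]


def build_upload_request(url, suggestions):
    """Build (but do not send) a JSON POST request with the suggestions."""
    body = json.dumps(suggestions, ensure_ascii=False).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

A daily or weekly cron job could open the database with `sqlite3.connect(...)`, call `load_suggestions`, and pass the result of `build_upload_request("https://collector.example/upload", data)` (a placeholder URL) to `urllib.request.urlopen`.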

As dingedi was pointing out some time ago, I’m not sure how validation / filtering could be done automatically though; while most suggestions are probably valid, some might be inaccurate or plain incorrect (trolling could also become a thing).

My thinking would be to start collecting first and then deal with issues as they arise. Eventually we’ll probably need a system to have people rate translations and sort out the high quality ones.

I had thought of an up / down voting system like Google Translate’s, but for an optimal system (I think) users would have to create accounts, so that nobody can vote several times on the same word and so that other words can be proposed for voting.
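
Just to illustrate the one-vote-per-account idea, here is a minimal in-memory sketch. The `Suggestion` class, the account ids, and the `rank` helper are all hypothetical; a real system would need persistent storage and authentication.

```python
from dataclasses import dataclass, field


@dataclass
class Suggestion:
    source_text: str
    translation: str
    # Maps account id -> +1 or -1, so each account counts at most once.
    votes: dict = field(default_factory=dict)

    def vote(self, account_id, value):
        """Record an up (+1) or down (-1) vote; a repeat vote overwrites."""
        if value not in (1, -1):
            raise ValueError("vote must be +1 or -1")
        self.votes[account_id] = value

    @property
    def score(self):
        return sum(self.votes.values())


def rank(suggestions):
    """Return suggestions ordered from highest to lowest score."""
    return sorted(suggestions, key=lambda s: s.score, reverse=True)
```

Keying votes by account id is what prevents double voting: voting again from the same account replaces the earlier vote instead of adding to it.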

That sounds reasonable to me. There’s probably going to be some tradeoff between making it easy to contribute and keeping the quality of the translations high.

I like the approach :+1:

This will then require big additions: a registration area, login, password reset, etc., and therefore configuring a mail server for LibreTranslate (to validate accounts and reset passwords), plus all the security questions around accounts and a privacy policy, which I think would have to be written.

We don’t need all of that to start collecting, right? With the current code we can have people upload translations now and then build a rating system once we have enough of them.

Yes, of course. I hope this will improve Argos, which is already great!
