I’m not entirely sure I’m posting in the right forum, so apologies in advance.

I started trying out LibreTranslate today and quickly, quite by accident, found out that some names get translated and others get mangled.

For example, translating a sentence with the common Nordic name “Patrik” into other languages gives me “Patrick”, the English form of the same name. Now, this may stem from some sort of auto-correction, or the model might actually be trying to translate the name, I dunno.

However, it gets stranger. The German name “Jürgen” isn’t translated (and it shouldn’t be). But the Swedish form, “Jörgen”, is translated into “the lake”, the English form “Jorgen” into “the shirt”, and the made-up form “Jårgen” becomes “the hair”. Et cetera.

This is bizarre. 🙂 If it’s some sort of auto-correction, can it be disabled? Otherwise, is some sort of huge name database needed?

We don’t have any explicit handling for names. The neural net decides which words should be translated and which should pass through unchanged as names.
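Since the pass-through decision is made by the model, one quick way to see which names survive is to translate the same sentence with different names and check whether each name appears unchanged in the output. Here is a minimal sketch of that check. The helper names `libretranslate` and `check_names` are made up for illustration, and the URL assumes a self-hosted instance on the default local port; adjust it (and add an API key if your instance requires one).

```python
import json
import urllib.request
from typing import Callable, Dict, Iterable


def libretranslate(text: str, source: str, target: str,
                   url: str = "http://localhost:5000/translate") -> str:
    """Translate text via a LibreTranslate /translate endpoint.

    The URL is an assumption (local self-hosted instance).
    """
    payload = json.dumps({
        "q": text,
        "source": source,
        "target": target,
        "format": "text",
    }).encode("utf-8")
    request = urllib.request.Request(
        url, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))["translatedText"]


def check_names(names: Iterable[str], template: str,
                translate: Callable[[str], str]) -> Dict[str, bool]:
    """Return {name: True if the name appears unchanged in the translation}."""
    results = {}
    for name in names:
        translated = translate(template.format(name=name))
        results[name] = name in translated
    return results
```

It could be used like this, for Swedish to English:

```python
report = check_names(
    ["Patrik", "Jürgen", "Jörgen"],
    "{name} bor i Stockholm.",
    lambda sentence: libretranslate(sentence, "sv", "en"),
)
print(report)
```

Names that map to `False` are the ones the model translated or altered instead of passing through.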

Using more powerful language models should reduce issues like this because the model will understand the sentence context better. We could also add or generate data for the LibreTranslate dataset that handles names in the expected way.
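As a rough illustration of what generated data could look like, the sketch below fills sentence templates with names so that each name is identical on the source and target side. Everything here is an assumption for illustration: the templates, the name list, the function names, and the output layout (two line-aligned text files, a common parallel-corpus convention, not necessarily the format the LibreTranslate training pipeline expects).

```python
import itertools
from typing import List, Sequence, Tuple

# Hypothetical Swedish/English templates; {name} is copied verbatim to both sides.
TEMPLATES = [
    ("{name} bor i Stockholm.", "{name} lives in Stockholm."),
    ("Jag träffade {name} i går.", "I met {name} yesterday."),
]

# Hypothetical name list; a real one would be much larger.
NAMES = ["Patrik", "Jürgen", "Jörgen"]


def make_name_pairs(templates: Sequence[Tuple[str, str]],
                    names: Sequence[str]) -> List[Tuple[str, str]]:
    """Build (source, target) sentence pairs with the name left unchanged."""
    pairs = []
    for (src_tpl, tgt_tpl), name in itertools.product(templates, names):
        pairs.append((src_tpl.format(name=name), tgt_tpl.format(name=name)))
    return pairs


def write_parallel(pairs: Sequence[Tuple[str, str]],
                   src_path: str, tgt_path: str) -> None:
    """Write pairs to two line-aligned files, one sentence per line."""
    with open(src_path, "w", encoding="utf-8") as src, \
            open(tgt_path, "w", encoding="utf-8") as tgt:
        for source_sentence, target_sentence in pairs:
            src.write(source_sentence + "\n")
            tgt.write(target_sentence + "\n")
```

Calling `write_parallel(make_name_pairs(TEMPLATES, NAMES), "names.sv", "names.en")` would produce six aligned sentence pairs. Varied templates matter here, so that the model learns to keep names from context rather than memorizing a handful of sentences.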

