Be inspired by features of other online translation services and discuss reverse engineering them

The best non-free online translation service I know of is DeepL. DeepL feeds user input back into its translation AI, which is what makes it so accurate. Specifically, DeepL doesn't just offer the user one possible translation but several. It displays only one at first, but if part of the translation doesn't read right, the user can click on that part and select from a list of alternatives. When the user does this, the choice is somehow reported back to DeepL and, according to DeepL's privacy policy, the AI is trained to translate differently in that case once someone working there has checked it. After the AI has been trained, the user input is deleted. In addition, clicking on a word shows its possible translations according to a dictionary, which is also helpful.

LibreTranslate could implement this in a much more privacy-friendly way: ask the user before sending a correction or improvement, show exactly what data would be sent, and let the user decide whether to send it. There could also be a review page where other users can review the corrections. I believe implementing this would greatly improve translation quality.
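
To make the consent step concrete, here is a minimal sketch in Python. Nothing in it is existing LibreTranslate code: the `Correction` fields, the `preview` and `submit_if_consented` helpers, and the `send` callback are all hypothetical names chosen for illustration. The point is only that the user sees the exact payload first and nothing leaves the client without an explicit yes.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class Correction:
    """A user-suggested improvement to a translation (hypothetical schema)."""
    source_lang: str
    target_lang: str
    source_text: str
    machine_translation: str
    user_translation: str


def preview(correction: Correction) -> str:
    """Return the exact JSON that would be sent, so the user can inspect it."""
    return json.dumps(asdict(correction), ensure_ascii=False, indent=2)


def submit_if_consented(correction: Correction, user_consents: bool, send) -> bool:
    """Call send(payload) only if the user explicitly agreed.

    `send` is an assumed callback (for example an HTTP POST to a review queue).
    Returns True if the payload was handed to `send`, False otherwise.
    """
    if not user_consents:
        return False
    send(preview(correction))
    return True
```

A UI would show `preview(correction)` in a dialog and pass the user's answer as `user_consents`. Because the previewed string is the same string that gets sent, what the user approves and what is transmitted cannot drift apart.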

What do you think about my idea?

There’s currently a repo for submitting community data but it’s not really used:

Multiple translations are available through Argos Translate's ITranslation.hypotheses. Automated data collection could be a cool idea, though.
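
As a rough sketch of how those hypotheses could become a DeepL-style list of alternatives: the `Hypothesis` class below is only a stand-in that assumes each hypothesis carries a translated string and a score, and `alternatives` is a hypothetical helper, not part of Argos Translate. With Argos Translate installed, the input list would presumably come from ITranslation.hypotheses instead.

```python
from dataclasses import dataclass


@dataclass
class Hypothesis:
    """Stand-in for a translation hypothesis: a candidate string and its score."""
    value: str
    score: float


def alternatives(hypotheses, limit=4):
    """Return distinct candidate translations, best score first, capped at `limit`."""
    seen = set()
    ranked = []
    # Higher score is treated as better (assumption: scores are log-probabilities).
    for hyp in sorted(hypotheses, key=lambda h: h.score, reverse=True):
        if hyp.value not in seen:
            seen.add(hyp.value)
            ranked.append(hyp.value)
    return ranked[:limit]
```

The first entry would be shown as the default translation and the rest offered in a dropdown. Duplicates are dropped because different hypotheses can come out as the same string.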

How about implementing a dictionary (Wiktionary?) that shows relevant information about the word where the cursor is at that time?

The models are currently trained with Wiktionary data but making more explicit use of definition data would also be possible.

But what if a word has multiple meanings? Wouldn’t it be good to show them to the user like DeepL does for the word where the cursor is?

CTranslate2 and Argos Translate support multiple hypotheses, so we could do something like that for entire translations. It might also be possible to give users more explicit dictionary data (definitions, synonyms, translations, word origins, pronunciations).
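
For the word-under-the-cursor idea discussed above, a minimal sketch could look like this. The `DICTIONARY` contents and the `word_at` and `senses_at` helpers are made up for illustration; real entries might be derived from Wiktionary data, and real tokenization would need more care for languages that do not separate words with spaces.

```python
import re

# Hypothetical dictionary data: lowercase word -> list of senses.
DICTIONARY = {
    "bank": ["financial institution", "edge of a river"],
}


def word_at(text: str, cursor: int) -> str:
    """Return the word touching the cursor offset, or '' if there is none."""
    for match in re.finditer(r"\w+", text):
        if match.start() <= cursor <= match.end():
            return match.group()
    return ""


def senses_at(text: str, cursor: int, dictionary=DICTIONARY):
    """Return every known sense of the word at the cursor (empty list if unknown)."""
    return dictionary.get(word_at(text, cursor).lower(), [])
```

Returning all senses rather than picking one leaves the choice to the user, which is the behavior being asked for when a word has multiple meanings.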