Help Wanted: Estonian model for Argos Translate

I’m glad to introduce English↔Estonian translation models to the community.
The models use a Transformer DEEP architecture with 159M parameters, which is about 30% smaller than Transformer BIG and almost twice as fast at inference (in my observations it is comparable to Transformer Base, since the decoder has similar dimensions).

EN_ET

| Model                   | BLEU  | COMET-22 | Note |
|-------------------------|-------|----------|------|
| LT1.0                   | 29.1  | 0.9017   |      |
| GoogleTranslate         | 31    | 0.918    |      |
| eng-est/opus…2022-03-13 | 28.30 | -        |      |
| facebook/m2m100_1.2B    | 24.5  | -        |      |

ET_EN

| Model                   | BLEU  | COMET-22 | Note |
|-------------------------|-------|----------|------|
| LT1.0                   | 38.7  | 0.8903   |      |
| GoogleTranslate         | 42.4  | 0.8997   |      |
| est-eng/opus…2022-03-09 | 38.6  | -        |      |
| facebook/nllb-200-3.3B  | 35.7  | -        |      |
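
If you want to reproduce metrics like these, a minimal sketch along the following lines should work with the sacrebleu and unbabel-comet packages (file names here are placeholders, not the actual test sets I used):

```python
# Hypothetical evaluation sketch: score a hypothesis file against references
# with sacreBLEU and the COMET-22 model. File names are placeholders.
import sacrebleu
from comet import download_model, load_from_checkpoint

with open("hypotheses.et") as f:
    hyps = [line.strip() for line in f]
with open("references.et") as f:
    refs = [line.strip() for line in f]
with open("sources.en") as f:
    srcs = [line.strip() for line in f]

# Corpus-level BLEU with sacreBLEU's default tokenization.
bleu = sacrebleu.corpus_bleu(hyps, [refs])
print(f"BLEU: {bleu.score:.1f}")

# COMET-22 (Unbabel/wmt22-comet-da) uses source, hypothesis, and reference.
model_path = download_model("Unbabel/wmt22-comet-da")
comet_model = load_from_checkpoint(model_path)
data = [{"src": s, "mt": h, "ref": r} for s, h, r in zip(srcs, hyps, refs)]
result = comet_model.predict(data, batch_size=16, gpus=1)  # gpus=0 for CPU
print(f"COMET-22: {result.system_score:.4f}")
```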

Back translation was also used to train the models:

  • All of Estonian Wikipedia plus two years of Estonian news.
  • About half of English Wikipedia, one year of English news, and news commentary.

In total, this gives about 35M back-translated sentence pairs.
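
For anyone unfamiliar with the technique, here is a minimal sketch of the back-translation idea (not my exact pipeline): monolingual Estonian sentences are translated into English with a reverse (ET→EN) model, and the synthetic English is paired with the authentic Estonian as extra EN→ET training data. `translate_et_en` is a placeholder for whatever reverse model is available.

```python
# Sketch of back translation: pair synthetic source text with authentic
# target text to create additional training data for the EN->ET direction.
def build_back_translation_pairs(et_sentences, translate_et_en):
    pairs = []
    for et in et_sentences:
        synthetic_en = translate_et_en(et)   # synthetic source side
        pairs.append((synthetic_en, et))     # authentic target side
    return pairs

# Example using Argos Translate as the reverse model (assumes the ET->EN
# package is already installed and a recent argostranslate version):
# import argostranslate.translate
# pairs = build_back_translation_pairs(
#     ["Tere, maailm!"],
#     lambda s: argostranslate.translate.translate(s, "et", "en"),
# )
```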

To be honest, both models were still training and improving their metrics (although progress had slowed) when they exhausted the GPU time limit.

Models:

translate-en_et-1_0.argosmodel

translate-et_en-1_0.argosmodel
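
If you would like to try them, here is a minimal usage sketch, assuming the two .argosmodel files above have been downloaded to the current directory and a recent argostranslate version is installed:

```python
# Install the downloaded packages and translate in both directions.
import argostranslate.package
import argostranslate.translate

argostranslate.package.install_from_path("translate-en_et-1_0.argosmodel")
argostranslate.package.install_from_path("translate-et_en-1_0.argosmodel")

print(argostranslate.translate.translate("Hello, world!", "en", "et"))
print(argostranslate.translate.translate("Tere, maailm!", "et", "en"))
```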
