BLEU scores of LibreTranslate models

With jieba tokenization, the FLORES-200 dataset, and this config for CTranslate2:

output = self.translator.translate_batch(
    source_tokenized,
    replace_unknowns=True,
    max_batch_size=32,
    beam_size=2,
    num_hypotheses=1,
    length_penalty=0.2,
    return_scores=False,
    return_alternatives=False,
    target_prefix=None
)

en-ru 61.23
en-es 44.03
ca-en 59.65
en-ca 51.58
en-cs 41.87
pl-en 32.58
ga-en 50.27
fr-en 57.68
en-he 62.01
en-tr 43.57
en-id 52.46
sv-en 61.38
pt-en 61.61
en-uk 50.85
en-ko 32.84
ko-en 31.5
en-el 56.57
en-hi 62.86
id-en 47.26
nl-en 41.0
he-en 51.91
en-de 50.06
en-sk 42.61
eo-en 51.42
da-en 56.46
fi-en 43.04
en-hu 45.13
es-en 44.78
hu-en 32.61
de-en 55.6
ja-en 33.73
en-da 54.21
cs-en 43.54
it-en 48.1
ru-en 44.06
en-pt 62.19
uk-en 45.99
sk-en 44.31
en-ga 48.9
en-nl 38.64
en-ja 30.68
en-it 45.28
hi-en 47.15
en-sv 60.38
en-eo 44.95
en-fr 59.26
en-zh 12.0
en-fi 37.67
tr-en 43.21
en-pl 32.02
el-en 35.3
zh-en 33.23
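
For anyone who wants to sanity-check numbers like these, here is a minimal sketch of corpus-level BLEU in plain Python. It is not the script used for the scores above (that script is not shown in this post). The assumptions are: one reference per sentence, no smoothing, and pre-tokenized input. The function names `ngrams` and `corpus_bleu` are made up for this example, and `str.split()` stands in for jieba in the demo. Tools such as sacreBLEU apply their own tokenization and smoothing, so their results will likely differ from this sketch.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return a Counter of all n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def corpus_bleu(hypotheses, references, max_n=4):
    """Corpus-level BLEU (0-100) over pre-tokenized sentences, one reference each."""
    matches = [0] * max_n
    totals = [0] * max_n
    hyp_len = ref_len = 0
    for hyp, ref in zip(hypotheses, references):
        hyp_len += len(hyp)
        ref_len += len(ref)
        for n in range(1, max_n + 1):
            hyp_counts = ngrams(hyp, n)
            ref_counts = ngrams(ref, n)
            # Clip each hypothesis n-gram count by its count in the reference.
            matches[n - 1] += sum((hyp_counts & ref_counts).values())
            totals[n - 1] += sum(hyp_counts.values())
    if hyp_len == 0 or min(matches) == 0:
        return 0.0  # no smoothing: any n-gram order with zero matches gives 0
    # Geometric mean of the n-gram precisions, computed in log space.
    log_precision = sum(math.log(m / t) for m, t in zip(matches, totals)) / max_n
    # Brevity penalty punishes output that is shorter than the reference.
    brevity_penalty = 1.0 if hyp_len >= ref_len else math.exp(1 - ref_len / hyp_len)
    return 100 * brevity_penalty * math.exp(log_precision)


# Whitespace splitting is an assumption here; swap in jieba tokens to match the post.
hyps = ["the cat sat on the mat".split()]
refs = ["the cat sat on the mat".split()]
print(corpus_bleu(hyps, refs))
```

Scores are accumulated over the whole corpus before the precisions are computed, which is why per-sentence BLEU averages do not match a corpus score.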