Language Support

Yep, it should be fixed. I want to automate more of the index JSON generation so I don’t keep introducing typos.

1 Like

Azerbaijani

1 Like

Czech

1 Like

Just updated libretranslate.com with the latest models :clinking_glasses:

1 Like

Isn’t there a problem with Czech in the JSON?

Czech → English = cs
English → Czech = zh
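
A mix-up like this could probably be caught automatically. Here is a minimal sketch, assuming each index entry is a JSON object with `from_name`, `from_code`, `to_name` and `to_code` fields (that layout is an assumption, and `find_code_conflicts` is a hypothetical helper, not part of Argos Translate). It flags any language name that appears with more than one code:

```python
import json
from collections import defaultdict


def find_code_conflicts(entries):
    """Return {language name: sorted codes} for names mapped to more than one code."""
    codes_by_name = defaultdict(set)
    for entry in entries:
        # Field names are assumed; adjust them to the real index schema.
        codes_by_name[entry["from_name"]].add(entry["from_code"])
        codes_by_name[entry["to_name"]].add(entry["to_code"])
    return {
        name: sorted(codes)
        for name, codes in codes_by_name.items()
        if len(codes) > 1
    }


# Illustrative data reproducing the typo reported above.
index_json = """
[
  {"from_name": "Czech", "from_code": "cs", "to_name": "English", "to_code": "en"},
  {"from_name": "English", "from_code": "en", "to_name": "Czech", "to_code": "zh"}
]
"""

conflicts = find_code_conflicts(json.loads(index_json))
print(conflicts)
```

An empty result would mean every language name maps to exactly one code. This only catches inconsistencies between entries; a code that is wrong everywhere would still slip through.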

1 Like

@dingedi Good catch, thanks. It should be fixed now.

1 Like

Greek

Esperanto

1 Like

Persian

1 Like

w00t! Awesome. :clinking_glasses: Love seeing more languages getting added.

2 Likes

Hebrew

1 Like

Danish

Trained with CCAligned, Paracrawl, Europarl and WikiMatrix. Getting about ready for a retrain including CCMatrix…

I haven’t completely figured out this IPFS yet…

2 Likes

Great, once you’re done with Danish I can add it to the package index. No need to figure out ipfs, I can do it. I’ve been running ipfs add my-pkg.argosmodel to generate IPFS links but they’re mostly not used currently.

Well… I’m not sure what “done” means :slight_smile:
I ditched CCMatrix and added OpenSubtitles instead. Trained to 30000.

Installed locally at https://translate.fortytwo-it.com/ and the translations do seem a lot better. Are there plus/minus votes? (Obviously there’s suggest :slight_smile: )

For the time being I have the Danish data files and training JSON here, but I can/will move the .argosdata files to the main /argosdata/ URL permanently.

The new Danish models are here

(This may also be useful for Section IV of my tutorial… what to do with them now! :D)

(I can probably ipfs add them. I was trying to fetch the current files to help share them, but it just sat there and nothing downloaded.)

1 Like

I added the Danish .argosmodel packages to the package index and the .argosdata packages to the Argos Train data index.

Thanks for training these and helping document the process!

1 Like

Of course.
Could you adjust the argosdata links to point to just Index of /argosdata? I stuck them in the /training/ sub-directory to keep them separate for, well, training :slight_smile:

I’ll leave them in both locations for the time being and just make a /training_no/ for my next project, Norwegian.

1 Like

What are the validation rules for a new language? I’m finishing Brazilian Portuguese, which is very different from the Portuguese of Portugal. After several days of organizing the data sources it is almost complete, and I’m validating the translations now. As soon as it’s complete I could use help with how to send/share it.

2 Likes

There aren’t any specific validation rules. If you have a trained model or data you want to submit you can make a post or pull request.

Currently for Portuguese there are already trained models but no data packages. To replace the current models we would want a demonstration that the new models are an improvement over the existing ones.
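
One way such a demonstration might look: translate the same held-out sentences with both models and score each output against reference translations. The sketch below is an assumption-laden illustration, not an established Argos procedure. It uses a crude unigram-overlap F1 as a stand-in for a proper metric such as BLEU, and the sentences are made up rather than real model output.

```python
from collections import Counter


def unigram_f1(hypothesis, reference):
    """Crude token-overlap F1 between one hypothesis and one reference."""
    hyp = hypothesis.lower().split()
    ref = reference.lower().split()
    if not hyp or not ref:
        return 0.0
    # Counter intersection keeps the minimum count of each shared token.
    overlap = sum((Counter(hyp) & Counter(ref)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(hyp)
    recall = overlap / len(ref)
    return 2 * precision * recall / (precision + recall)


def mean_score(hypotheses, references):
    """Average unigram F1 over a test set."""
    if not references or len(hypotheses) != len(references):
        raise ValueError("need equally sized, non-empty lists")
    scores = [unigram_f1(h, r) for h, r in zip(hypotheses, references)]
    return sum(scores) / len(scores)


# Made-up sentences for illustration only, not real model output.
references = ["the cat is on the table", "good morning everyone"]
old_output = ["the cat on table", "good day all"]
new_output = ["the cat is on the table", "good morning all"]

old_score = mean_score(old_output, references)
new_score = mean_score(new_output, references)
print(f"old={old_score:.3f} new={new_score:.3f} improved={new_score > old_score}")
```

A real comparison would want a much larger test set that neither model saw during training, plus some manual spot checks, since overlap metrics can miss fluency problems.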

2 Likes

I would like to know whether there is a version of the argos-train repository that uses TensorFlow, or whether the training checkpoints on Google Drive can be used with OpenNMT-py. After a lot of research I was in doubt, because the checkpoints were made with TensorFlow.
I found them here: Checkpoint Exports – Google Drive

After searching a lot I found the command

ct2-opennmt-tf-converter  

and it worked fine.
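
For anyone scripting this step, here is a rough sketch of wrapping the converter call in Python. The `--model_path` and `--output_dir` flags are my assumption about the tool's options, which vary between CTranslate2 versions, so check `ct2-opennmt-tf-converter --help` first. The two paths and both helper functions are hypothetical placeholders.

```python
import shutil
import subprocess


def build_convert_command(model_path, output_dir):
    """Build the argument list for ct2-opennmt-tf-converter."""
    # --model_path and --output_dir are assumed flag names; confirm them with
    # `ct2-opennmt-tf-converter --help`, since options differ between versions.
    return [
        "ct2-opennmt-tf-converter",
        "--model_path", str(model_path),
        "--output_dir", str(output_dir),
    ]


def run_conversion(command):
    """Run the converter, failing early if the tool is not installed."""
    if shutil.which(command[0]) is None:
        raise FileNotFoundError(f"{command[0]} not found on PATH")
    subprocess.run(command, check=True)


# Placeholder paths; replace with the downloaded checkpoint and a target folder.
command = build_convert_command("checkpoint_dir", "ctranslate2_model")
print(" ".join(command))
```

Calling `run_conversion(command)` would then perform the actual conversion. Depending on the version, the converter may also need the model configuration or vocabulary files passed in.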

1 Like