Complete previous training

Herve · November 27, 2023, 2:20pm

Dear all,

The 1.9 French translation model is terrific. Thank you very much for your hard work on it.

Nevertheless, it is quite generic, and I would like to specialize some translations to match the specific vocabulary used in my company. Would it be possible to continue training an existing model with some specialized tokens, or would I need to train the model from scratch using an argosdata file that contains my specific tokens?

Thanks in advance

argosopentech · November 29, 2023, 8:11pm

Argos Train is designed to train each model from scratch on a single GPU in 24 hours and doesn’t support fine tuning. I also normally don’t save the model checkpoints after I quantize them and package them as a .argosmodel file.

github.com/argosopentech/argos-translate

Fine-tuning pretrained models

opened 11:53AM - 09 Jul 21 UTC

closed 01:16PM - 09 Jul 21 UTC

JakobEdding

First, thank you for this project! I would like to improve translation performance for my specific use case using transfer learning. How would I go about fine-tuning a pretrained model from the [index](https://www.argosopentech.com/argospm/index/) to adapt it better to a specific domain? I did some digging myself and I guess I would need to "unconvert" the model.bin file (I'm looking at the German-to-English model) generated with ctranslate2 to get to the actual weights. However, I can't see how I could accomplish this.

If you want to train with custom data you should make your own argosdata file and then train from scratch. Since all of the existing data can be accessed automatically by Argos Train this should be pretty straightforward.

The 1.9 French model is actually an Opus-MT model and isn’t trained using Argos Train. So if you want to modify it you would have to find a way to modify their model and then re-convert it for Argos Translate.

Herve · November 30, 2023, 5:04pm

Thank you very much for your answer.

Just to play and learn, I downloaded the NLLB corpus from OPUS (40 GB of French sentences, 34 GB of English sentences) and ran argos-train on it (French to English).

After consuming 100 GB of RAM and 60 GB of swap, with a lot of disk I/O, it generated a 28 GB argosdata file.

Then, I read the “Sampled 1000000 sentences from 657187427 sentences” log line. So I believe that argos-train limits the training to 1 million sentences.

If this is the case, then I need to limit the number of general-purpose sentences to ensure my specific sentences are taken into account.
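To put rough numbers on that concern, here is a minimal Python sketch. It assumes the sampling step is uniform over all sentence pairs, which the log line suggests but which nothing in this thread confirms. The function names (`expected_domain_pairs`, `reservoir_sample`, `build_mixed_corpus`) are hypothetical helpers for preparing data before building an argosdata file; they are not part of argos-train.

```python
import random
from typing import Iterable, List, Tuple

Pair = Tuple[str, str]  # (source sentence, target sentence)


def expected_domain_pairs(domain_pairs: int, total_pairs: int, sample_size: int) -> float:
    """Expected number of domain pairs kept by a uniform sample without replacement.

    ASSUMPTION: the training tool samples uniformly at random.
    """
    if sample_size >= total_pairs:
        return float(domain_pairs)
    return domain_pairs * sample_size / total_pairs


def reservoir_sample(pairs: Iterable[Pair], k: int, rng: random.Random) -> List[Pair]:
    """Uniformly sample at most k pairs from a stream in one pass (Algorithm R)."""
    reservoir: List[Pair] = []
    for i, pair in enumerate(pairs):
        if i < k:
            reservoir.append(pair)
        else:
            j = rng.randint(0, i)  # inclusive on both ends
            if j < k:
                reservoir[j] = pair
    return reservoir


def build_mixed_corpus(
    general: Iterable[Pair], domain: Iterable[Pair], budget: int, seed: int = 0
) -> List[Pair]:
    """Keep every domain pair and fill the rest of the budget from the general corpus."""
    domain_list = list(domain)
    rng = random.Random(seed)
    general_budget = max(0, budget - len(domain_list))
    mixed = domain_list + reservoir_sample(general, general_budget, rng)
    rng.shuffle(mixed)
    return mixed


if __name__ == "__main__":
    # 50,000 is a made-up size for the company-specific data;
    # the other two figures come from the log line quoted above.
    print(expected_domain_pairs(50_000, 657_187_427, 1_000_000))
```

Under the uniform-sampling assumption, only about 0.15% of any dataset survives the sample, so a small domain corpus mixed into the full NLLB data would contribute very few training sentences. Capping the general data at or below the sample size, as `build_mixed_corpus` does, is one way to keep all domain sentences in play. Because the reservoir sampler reads the general corpus as a stream, it does not need to hold the whole corpus in memory.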

I will also have a look at the Opus-MT models and how to convert them for Argos Translate.