Is it possible to train a new language model in our own environment?

Hi all,

I’ve just discovered LibreTranslate today and I’m very keen on trying it out. We have our own dedicated environment for this with enough GPU power, so we’d like to run it and, more importantly, train it within our environment. We’d use our own corpora, which is one more argument for using our own environment. The video tutorial for training a new language model shows it being done on vast.ai, if I understand correctly. So, can we use our environment for training as well?

Secondly, our aim would be to use it as a backend service with the API enabled, which we would integrate with our translation tools. This is also possible, right?

Best, seba


The tutorial uses vast.ai because it’s easy for people who don’t have a GPU. If you have your own environment with a GPU, you can follow the tutorial exactly as shown, just without the vast.ai-specific steps.

Then you can install the resulting model, and it will be available in both Argos Translate and LibreTranslate.
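
As a rough sketch of that install step, assuming the training run leaves one or more `.argosmodel` package files in a directory and that the `argostranslate` Python package is installed for the same user account that runs LibreTranslate, something like this could work. The helper names are made up for illustration; `argostranslate.package.install_from_path` is the only library call used.

```python
from pathlib import Path


def find_model_packages(directory):
    """Return the .argosmodel files directly inside `directory`, sorted by path."""
    return sorted(Path(directory).glob("*.argosmodel"))


def install_model_packages(paths):
    """Install each package file so Argos Translate can load it."""
    # Imported lazily so find_model_packages works even without argostranslate.
    import argostranslate.package

    for path in paths:
        argostranslate.package.install_from_path(path)
```

For example, `install_model_packages(find_model_packages("trained_models"))`, where `trained_models` is a placeholder for wherever your packages end up. LibreTranslate may need a restart before it lists the new language pair.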


So, just to confirm: we can train the model without exposing our proprietary data outside our environment?

Yes, argos-train can be run on your own computer.


So long as you respect the terms of the AGPLv3 license (see the LICENSE file in the LibreTranslate/LibreTranslate repository on GitHub), yes. If you modify the software, you’ll need to make the source code of your modifications available to your users. (I’m not a lawyer and this does not constitute legal advice.)


Please excuse my poor wording. We wouldn’t change the software at all. We would train LibreTranslate with our own data and have it running separately. Then our own translation tool would call the LibreTranslate API. There wouldn’t be any modification or integration of the code. I was referring to integration from the process point of view.
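
To illustrate the kind of call I mean, here is a minimal sketch of a client that posts to the `/translate` endpoint of a self-hosted instance. The base URL (`http://localhost:5000`), the language codes and the function names are assumptions for the example; the `api_key` field only matters if the server is configured to require keys.

```python
import json
import urllib.request


def build_translate_request(base_url, text, source, target, api_key=None):
    """Build a POST request for the LibreTranslate /translate endpoint."""
    payload = {"q": text, "source": source, "target": target, "format": "text"}
    if api_key:
        # Only sent when a key is supplied (assumed optional on our own server).
        payload["api_key"] = api_key
    return urllib.request.Request(
        base_url.rstrip("/") + "/translate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def translate(base_url, text, source, target, api_key=None, timeout=30):
    """Send the request and return the translated string from the JSON reply."""
    request = build_translate_request(base_url, text, source, target, api_key)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))["translatedText"]
```

Our tool would then call something like `translate("http://localhost:5000", "Hello", "en", "de")`, so the text never leaves our own network.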
