Some Argos Translate updates

I’ve been working on Argos Translate 2 in the v2 branch but I’m planning to stay on 1.X for at least a decent while. I’m pretty happy with the current version (1.7) and have been focusing more on model training than upgrades to the core Python library.

I’m considering releasing a version 1.8 with some minor upgrades and new features including:

  • Support for upgrading packages by checking if there’s a newer version of the same package and installing it if available
  • Deleting the .argosmodel package files in ~/.local/cache/argos-translate after they’ve been unzipped and installed
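To make the two ideas above concrete, here is a minimal sketch using only the standard library. It is not the Argos Translate implementation: `PackageInfo`, `find_upgrades`, and `clean_cache` are hypothetical names, and treating "the same package" as "the same language pair" is an assumption made for illustration.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class PackageInfo:
    """Hypothetical stand-in for a package record (not the Argos Translate class)."""
    from_code: str
    to_code: str
    version: str


def parse_version(version: str) -> tuple[int, ...]:
    """Turn '1.10' into (1, 10) so versions compare numerically, not as strings."""
    return tuple(int(part) for part in version.split("."))


def find_upgrades(installed, available):
    """Return (old, new) pairs where a newer version of an installed package exists.

    Assumption: two records describe the same package when their language
    pair matches.
    """
    newest = {}
    for pkg in available:
        key = (pkg.from_code, pkg.to_code)
        current = newest.get(key)
        if current is None or parse_version(pkg.version) > parse_version(current.version):
            newest[key] = pkg

    upgrades = []
    for pkg in installed:
        candidate = newest.get((pkg.from_code, pkg.to_code))
        if candidate and parse_version(candidate.version) > parse_version(pkg.version):
            upgrades.append((pkg, candidate))
    return upgrades


def clean_cache(cache_dir: Path) -> int:
    """Delete leftover .argosmodel archives in cache_dir; return the bytes freed."""
    freed = 0
    for archive in cache_dir.glob("*.argosmodel"):
        freed += archive.stat().st_size
        archive.unlink()
    return freed
```

Comparing versions as integer tuples avoids the string-comparison trap where "1.10" sorts before "1.9". The cleanup only touches files ending in `.argosmodel`, so anything else in the cache directory is left alone.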

I’d appreciate any feedback on these ideas or other possible features!

The cache cleanup would be really nice; as the number of models grows this can become significant.

Feature-wise, I think speed is one of the most important areas: ways to improve translation throughput and reduce CPU usage.

I agree the cache cleanup would be good.

It’s good to know that CPU usage is a bottleneck; you probably run the biggest LibreTranslate instance at LibreTranslate.com. CPU performance is probably best improved in CTranslate2, though. I think the best way to improve performance is to make sure CTranslate2 is optimized for your hardware. If your CPU supports Intel MKL or oneDNN, making sure they are being used could yield speed improvements.

I think CTranslate2 is able to utilize multiple cores in parallel well so upgrading to a better CPU should help performance. LibreTranslate can use multiple CPU cores for different requests while CTranslate2 is also using ~4 cores per request so we’re able to effectively make use of 32+ cores. Using a GPU should also help, although I’ve generally just used CPUs since CTranslate2 works very well on CPUs in my experience.
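As a rough illustration of the core arithmetic above, here is a small sketch of how a fixed core budget could be split between concurrent requests and threads per request. `plan_threads` is a hypothetical helper, not part of LibreTranslate or CTranslate2. The two numbers it returns correspond to the `inter_threads` and `intra_threads` options that CTranslate2's `Translator` accepts, but how LibreTranslate actually sets them is not covered here.

```python
def plan_threads(total_cores: int, intra_threads: int = 4) -> tuple[int, int]:
    """Split a core budget into (parallel translations, threads per translation).

    Assumption: each translation uses about `intra_threads` cores (the ~4
    mentioned above), and the remaining parallelism comes from handling
    several requests at once.
    """
    if total_cores < 1 or intra_threads < 1:
        raise ValueError("total_cores and intra_threads must be positive")
    intra = min(intra_threads, total_cores)  # never ask for more threads than cores
    inter = max(1, total_cores // intra)     # how many translations fit side by side
    return inter, intra
```

On a 32-core machine `plan_threads(32)` gives 8 parallel translations with 4 threads each, which matches the "32+ cores" figure. On a machine with fewer cores than `intra_threads`, it falls back to a single translation using all of them.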

Inference performance is the primary reason I’m currently keeping Argos Translate V1 in the master branch. I think in the future I want to move to multilingual models but for now single language pair models are smaller and faster.

Oh, another good feature would be language detection. Currently we rely on polyglot, but having a dedicated set of functions would allow improvements in language detection.

I just released Argos Translate 1.8.0:

  • Delete cached package files after they’ve been installed
  • Automatically update packages to a newer version with package.Package.update

I’m planning to add support for language detection eventually but probably not soon. Likely language detection will be a feature in Argos Translate 2.

Hi! Kudos for the great job on Argos-Translate! I’m stoked at the prospect of automatic quality translations between multiple languages!

I actually came across this project after using Whisper, which is also awesome.
I have done a first transcription of a 1h long video in Spanish using Whisper. With the large model V2 it took almost 24h. It then takes me another hour to review the generated subtitles and make sure everything is OK.
I would like to use the .srt file generated by Whisper and ask LibreTranslate to translate those subtitles into English. Having made sure that the transcription is good, I would expect LibreTranslate to do a good job, and at a fraction of the computational cost of having Whisper do both the speech recognition and the translation.

Therefore here are 2 feature requests:

  1. Implement in LibreTranslate the capability of translating .srt files. They are essentially .txt files, but with timestamps inserted between the sentences. The feature would therefore temporarily remove the timestamps for translation and then somehow reinsert them.

I am well aware that there are differences between an interpretation (which is translating almost in real time, and therefore making an effort to keep the semantic elements at more or less the same relative positions in the sentences) and a translation, which is not bound by the limitations of interpretation.

And I understand that Argos-Translate does translations, not interpretations. And I would rather have translations of the subtitles than interpretations.

  2. I’m not sure the second feature request is advisable, since if you have opted to provide LibreTranslate through a web server, that may be for a reason. But I would like to know if it is sensible to ask for a desktop version of LibreTranslate, or of argos-translate-files for that matter.
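For the first request, the remove-translate-reinsert idea could be sketched roughly as follows. This is only an illustration, not an existing LibreTranslate feature: `translate_srt` is a hypothetical function, and `translate` is any callable that maps a string to its translation, so the sketch does not depend on a particular translation API.

```python
import re
from typing import Callable

# Matches an SRT timing line such as "00:00:01,000 --> 00:00:02,500".
TIMESTAMP = re.compile(
    r"\d{2}:\d{2}:\d{2}[,.]\d{3}\s*-->\s*\d{2}:\d{2}:\d{2}[,.]\d{3}"
)


def translate_srt(srt_text: str, translate: Callable[[str], str]) -> str:
    """Translate the text of each subtitle cue, keeping indexes and timestamps."""
    blocks = re.split(r"\n\s*\n", srt_text.strip())
    output = []
    for block in blocks:
        lines = block.splitlines()
        # Locate the timing line; everything up to and including it is kept as-is.
        ts_index = next(
            (i for i, line in enumerate(lines) if TIMESTAMP.search(line)), None
        )
        if ts_index is None:
            output.append(block)  # not a recognizable cue, leave it unchanged
            continue
        header = lines[: ts_index + 1]
        # Join wrapped subtitle lines so the translator sees one piece of text.
        text = " ".join(line.strip() for line in lines[ts_index + 1:])
        translated = translate(text) if text else ""
        output.append("\n".join(header + ([translated] if translated else [])))
    return "\n\n".join(output) + "\n"
```

Because each cue is translated on its own, a sentence that spans several cues loses context, which is the translation-versus-interpretation trade-off mentioned above. Merging cues into full sentences before translating and redistributing the result afterwards would be harder and is not attempted here.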

I’ve been working on the v2 branch for some time now. It has some nice features; mainly support for multilingual models, better automated testing, more thorough typing, code refactoring for readability, improved config system, and more. However, maintaining and reconciling two separate branches is becoming difficult. I’m now planning to merge the v2 branch into the master branch and release it as version 1.10 with backwards compatibility instead of releasing an Argos Translate 2 version with breaking changes.