I’ve been working on Argos Translate 2 in the v2 branch but I’m planning to stay on 1.X for at least a decent while. I’m pretty happy with the current version (1.7) and have been focusing more on model training than upgrades to the core Python library.
I’m considering releasing a version 1.8 with some minor upgrades and new features including:
Support for upgrading packages by checking if there’s a newer version of the same package and installing it if available
Deleting the .argosmodel package files in ~/.local/cache/argos-translate after they’ve been unzipped and installed
I’d appreciate any feedback on these ideas or other possible features!
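For what it's worth, the upgrade check described above could be sketched roughly like this. It is only an illustration: `PackageInfo`, `parse_version`, and `find_upgrades` are hypothetical names, not part of the Argos Translate API, and it assumes package versions are dotted numbers such as "1.9".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PackageInfo:
    """Hypothetical stand-in for a package record (not the real Argos class)."""
    from_code: str
    to_code: str
    version: str


def parse_version(version):
    """Turn '1.10' into (1, 10) so versions compare numerically, not as text."""
    return tuple(int(part) for part in version.split("."))


def find_upgrades(installed, available):
    """Return (old, new) pairs where a newer package exists for the same language pair."""
    # Keep only the newest available package for each language pair.
    newest = {}
    for pkg in available:
        key = (pkg.from_code, pkg.to_code)
        current = newest.get(key)
        if current is None or parse_version(pkg.version) > parse_version(current.version):
            newest[key] = pkg

    # An installed package is upgradable if the newest available one is strictly newer.
    upgrades = []
    for pkg in installed:
        candidate = newest.get((pkg.from_code, pkg.to_code))
        if candidate and parse_version(candidate.version) > parse_version(pkg.version):
            upgrades.append((pkg, candidate))
    return upgrades
```

Comparing parsed tuples rather than raw strings matters here, because as plain text "1.10" would sort before "1.9".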
It’s good to know that CPU usage is a bottleneck; you probably run the biggest LibreTranslate instance at LibreTranslate.com. CPU performance is probably best improved in CTranslate2, though. I think the best way to improve performance is to make sure CTranslate2 is optimized for your hardware. If your CPU supports Intel MKL or oneDNN, making sure they are actually being used could yield speed improvements.
I think CTranslate2 is able to utilize multiple cores in parallel well so upgrading to a better CPU should help performance. LibreTranslate can use multiple CPU cores for different requests while CTranslate2 is also using ~4 cores per request so we’re able to effectively make use of 32+ cores. Using a GPU should also help, although I’ve generally just used CPUs since CTranslate2 works very well on CPUs in my experience.
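As a rough illustration of the core arithmetic above, here is a small sketch that splits a machine's cores into "requests in parallel" times "threads per request". The names `inter_threads` and `intra_threads` match the options CTranslate2's `Translator` accepts, but `plan_threads` itself is a hypothetical helper, and whether LibreTranslate wires things up this way is an assumption on my part.

```python
import os


def plan_threads(total_cores, cores_per_request=4):
    """Split cores into (parallel requests) x (threads per request)."""
    if total_cores < 1 or cores_per_request < 1:
        raise ValueError("core counts must be positive")
    # Never ask for more threads per request than the machine has cores.
    intra_threads = min(cores_per_request, total_cores)
    # Whatever is left determines how many requests can run side by side.
    inter_threads = max(1, total_cores // intra_threads)
    return {"inter_threads": inter_threads, "intra_threads": intra_threads}


if __name__ == "__main__":
    cores = os.cpu_count() or 1
    print(plan_threads(cores))
```

With 32 cores and 4 threads per request this gives 8 parallel requests, which is the kind of layout that lets a 32+ core machine stay busy.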
Inference performance is the primary reason I’m currently keeping Argos Translate V1 in the master branch. I think in the future I want to move to multilingual models but for now single language pair models are smaller and faster.
Oh, another good feature would be language detection. Currently we rely on polyglot, but having a dedicated set of functions would allow improvements in language detection.
Hi! Kudos for the great job on Argos-Translate! I’m stoked at the prospect of automatic quality translations between multiple languages!
I actually came across this project after using Whisper, which is also awesome.
I have done a first transcription of a 1h long video in Spanish using Whisper. With the large V2 model it took almost 24h. It then takes me another hour to review the generated subtitles and make sure everything is OK.
I would like to use the .srt file generated by Whisper to ask LibreTranslate to translate those subtitles into English. Having made sure that the transcription is of good quality, I would expect LibreTranslate to do a good job, and at a fraction of the computational cost of running Whisper for both speech recognition and translation.
Therefore here are 2 feature requests:
Implement in LibreTranslate the capability of translating .srt files. They are essentially .txt files, but with timestamps inserted in between the sentences. Therefore, the feature would temporarily remove the timestamps for translation purposes, and then somehow reinsert them.
I am well aware that there are differences between an interpretation (which is translating almost in real time and therefore making an effort to keep the semantic elements at more or less the same relative positions in the sentences) and a translation, which is not bound by the limitations of interpretation.
And I understand that Argos-Translate does translations, not interpretations. And I would rather have translations of the subtitles than interpretations.
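To make the first request more concrete, the "remove the timestamps, translate, reinsert them" idea could be sketched like this. It is a minimal sketch, not how LibreTranslate does or would do it: `translate_srt` is a hypothetical name, and `translate` is any function that takes a string and returns its translation (a call to LibreTranslate or Argos Translate could be plugged in there).

```python
import re

# Matches an SRT timing line such as "00:00:01,000 --> 00:00:04,000".
TIMING = re.compile(r"^\d{2}:\d{2}:\d{2},\d{3} --> \d{2}:\d{2}:\d{2},\d{3}")


def translate_srt(srt_text, translate):
    """Translate only the caption text of an SRT document.

    Cue numbers and timing lines are kept verbatim; `translate` is any
    callable mapping a source string to a translated string.
    """
    # Cues are separated by one or more blank lines.
    blocks = re.split(r"\n\s*\n", srt_text.strip())
    out = []
    for block in blocks:
        lines = block.splitlines()
        # Locate the timing line; everything up to and including it is kept as-is.
        timing_at = next(
            (i for i, line in enumerate(lines) if TIMING.match(line.strip())),
            None,
        )
        if timing_at is None:
            out.append(block)  # not a recognisable cue, leave it untouched
            continue
        header = lines[: timing_at + 1]
        # Join wrapped caption lines so the translator sees one piece of text.
        caption = " ".join(line.strip() for line in lines[timing_at + 1:])
        translated = translate(caption) if caption else ""
        out.append("\n".join(header + ([translated] if translated else [])))
    return "\n\n".join(out) + "\n"
```

Note that this translates each cue on its own and drops the original line wrapping inside a cue. Sentences that span several cues would lose context, so the result would sit closer to the "interpretation" end than to a full translation; merging cues into sentences before translating would be a further step.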
I’m not sure the second feature request is advisable, since you may have chosen to provide LibreTranslate through a web server for a reason. But I would like to know whether it is sensible to ask for a desktop version of LibreTranslate, or of argos-translate-files for that matter.
I’ve been working on the v2 branch for some time now. It has some nice features; mainly support for multilingual models, better automated testing, more thorough typing, code refactoring for readability, improved config system, and more. However, maintaining and reconciling two separate branches is becoming difficult. I’m now planning to merge the v2 branch into the master branch and release it as version 1.10 with backwards compatibility instead of releasing an Argos Translate 2 version with breaking changes.