Any high load recommendations?

aveDenis · October 23, 2021, 2:31pm

Hi

Thank you for the awesome project. Can someone please provide any recommendations for getting maximum performance?

I deployed LibreTranslate to Kubernetes. Is it better to scale it horizontally (e.g. to 5-10 pods), or to run one instance with the maximum ‘requests’ (k8s terminology) allocated? Is a GPU server required, or can it run fast on CPU only? Can you recommend a particular GPU (and why that one exactly)?

argosopentech · October 23, 2021, 8:38pm

The underlying translation engine is CTranslate2, which supports both CPU and GPU execution and has benchmarking information. In my experience, CPU works well, and the best way to scale up is to make many simultaneous requests to LibreTranslate from different threads.
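
A minimal sketch of the "many simultaneous requests" pattern in Python, using only the standard library. The address `http://localhost:5000`, the language pair, and the helper names `translate` and `translate_many` are assumptions for illustration; some instances also require an `api_key` field in the request body.

```python
import json
import urllib.request
from concurrent.futures import ThreadPoolExecutor

# Assumption: a LibreTranslate instance is reachable at this address.
BASE_URL = "http://localhost:5000"


def translate(text, source="en", target="es", base_url=BASE_URL):
    """Send one text to the /translate endpoint and return the translation."""
    payload = json.dumps(
        {"q": text, "source": source, "target": target, "format": "text"}
    ).encode("utf-8")
    request = urllib.request.Request(
        f"{base_url}/translate",
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.loads(response.read())["translatedText"]


def translate_many(texts, worker=translate, max_workers=8):
    """Run `worker` on each text from a thread pool.

    Requests are issued concurrently; results come back in input order.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(worker, texts))
```

With a server running, `translate_many(["Hello", "Good morning", "Thank you"])` sends the three requests in parallel. A reasonable `max_workers` probably depends on how many CPU cores or pods serve the requests, so it is worth measuring rather than guessing.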

pierotofy · October 25, 2021, 5:42am

Ah, very cool! Would you be interested in sharing the Kubernetes scripts (or instructions on how you did that) with others? I know there’s been some interest from others. If you publish those on GitHub, we can then link your repo in the README.

aveDenis · October 25, 2021, 7:10am

Yep! I will soon post my own example with a Docker build, an autoscaling config, and deployment to DigitalOcean K8s via GitHub Actions CI.

aveDenis · October 25, 2021, 5:13pm

Done