Hello,
I am trying to self-host a LibreTranslate service on a GPU server, but translation does not seem to be any faster than on my CPU server, which surprises me.
My first hypothesis was that the GPU was not being used by my LibreTranslate setup, but when I run a tool like nvtop while translating, it shows that the GPU is in use.
Maybe someone here has already worked on a similar setup and knows what I am doing wrong?
I am using the following command:
    docker-compose -f docker-compose.cuda.yml up -d --build
and I tried adding these lines to the docker-compose.cuda.yml file:
    runtime: nvidia
    environment:
      - NVIDIA_VISIBLE_DEVICES=all
      - NVIDIA_DRIVER_CAPABILITIES=all
Here is a description of my current setup:
Ubuntu: 20.04.5 LTS
Docker: 20.10.21 (+ nvidia-docker)
Docker-compose: 1.29.2
GPU: NVIDIA A100-SXM4-40GB
Driver: nvidia-driver-520
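In case it helps to reproduce the comparison, here is a minimal sketch of how the two servers could be timed against each other. It assumes LibreTranslate's /translate endpoint on its default port 5000 and no API key; the host names in the usage note, the sample text, and the function names are all hypothetical placeholders, not part of my actual setup.

```python
import json
import statistics
import time
import urllib.request


def build_request(base_url, text, source="en", target="fr"):
    """Build a POST request for LibreTranslate's /translate endpoint."""
    payload = json.dumps(
        {"q": text, "source": source, "target": target, "format": "text"}
    ).encode("utf-8")
    return urllib.request.Request(
        base_url.rstrip("/") + "/translate",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def time_calls(call, runs=5, warmup=1):
    """Return the median wall-clock time in seconds of `call` over `runs` runs."""
    for _ in range(warmup):
        call()  # untimed: the first request may include model loading
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        call()
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)


def translate_once(base_url, text):
    """Send one translation request and return the translated text."""
    request = build_request(base_url, text)
    with urllib.request.urlopen(request, timeout=120) as response:
        return json.load(response)["translatedText"]


def compare(servers, text, runs=5):
    """Print the median request latency for each named server URL."""
    for name, url in servers.items():
        median = time_calls(lambda: translate_once(url, text), runs=runs)
        print(f"{name}: median {median:.3f} s over {runs} runs")
```

For example, compare({"cpu": "http://cpu-server:5000", "gpu": "http://gpu-server:5000"}, "The quick brown fox jumps over the lazy dog. " * 20) would time both servers with the same input. The warm-up request is excluded so that one-time model loading does not skew the result, and the median is used so a single slow request does not dominate.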
Thank you for your help!