Why does CPU mode appear to be faster than GPU mode?

I have an Nvidia graphics card (and therefore CUDA support) in my computer. I ran a series of speed tests to see whether the ARGOS_DEVICE_TYPE setting actually did anything, and I was surprised to find that cpu is about one second faster for the same test.

I translated the text “This is a test to check the speed difference between various device types to see if there is any difference.” from en to es:

With nothing set: 4.285677909851074 seconds.
With auto: 4.503474950790405 seconds.
With cuda: 4.536334037780762 seconds.
With cpu: 3.598237991333008 seconds.

The value cpu was pure guesswork on my part: the accepted values are not documented, and I could not find them anywhere in the source code when I searched, even though the program knows how to respond when you pass a value it doesn’t recognize, such as none.
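For reference, here is a minimal sketch of how such a setting is typically applied, assuming the environment variable has to be set before argostranslate is imported (the value names below are the ones tried in the timings above; the import and translate calls are left commented out as illustration only):

```python
import os

# Assumption: ARGOS_DEVICE_TYPE is read when the package loads, so it
# must be set before importing argostranslate. Values tried above:
# "cpu", "cuda", "auto".
os.environ["ARGOS_DEVICE_TYPE"] = "cpu"

# import argostranslate.translate  # import only after setting the variable
# translated = argostranslate.translate.translate("This is a test.", "en", "es")
```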

Does this mean that my GPU is really bad? I cannot detect much GPU usage while it’s running (in Task Manager), but CPU and I/O usage are low as well. As with so many other programs, I get the feeling that Argos Translate isn’t using anywhere near my machine’s full power. It’s frustrating!

There’s more per-request latency on the GPU, but as you increase the size of each request or process more translations at once, the GPU becomes faster.

E.g. using CTranslate2 (the library used to run the models) with batching provides a significant speedup, because translations execute in parallel rather than one after another.

As an example, batching a 1000-sentence corpus and translating it on GPU (a 3090) vs. CPU takes 1.6 s vs. 74 s.

Translating sequentially, one sentence at a time, would most likely take significantly longer on each device.
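The trade-off described above can be sketched as a toy cost model: the GPU pays a large fixed setup cost per request but a tiny per-sentence cost, while the CPU is the opposite. All the numbers below are hypothetical, chosen only to illustrate the shape of the trade-off, not measured from Argos Translate:

```python
# Illustrative fixed + per-sentence costs in seconds (hypothetical numbers).
GPU_FIXED, GPU_PER_SENTENCE = 3.0, 0.001
CPU_FIXED, CPU_PER_SENTENCE = 0.5, 0.070

def gpu_time(n_sentences: int) -> float:
    """Modelled GPU time: big fixed overhead, cheap per sentence."""
    return GPU_FIXED + GPU_PER_SENTENCE * n_sentences

def cpu_time(n_sentences: int) -> float:
    """Modelled CPU time: small fixed overhead, expensive per sentence."""
    return CPU_FIXED + CPU_PER_SENTENCE * n_sentences

# A single short request: the CPU wins, matching the timings in the question.
assert cpu_time(1) < gpu_time(1)

# A 1000-sentence batch: the GPU wins by a wide margin.
assert gpu_time(1000) < cpu_time(1000)
```

This is why a one-sentence benchmark makes the GPU look slower: the fixed overhead dominates, and the per-sentence advantage never gets a chance to pay off.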
