Why does CPU mode appear to be faster than GPU mode?

I have an Nvidia graphics card (and therefore CUDA support) in my computer. I ran a series of speed tests to see whether the ARGOS_DEVICE_TYPE setting actually did anything, and I was surprised to find that cpu is about one second faster for the same test.

I translated the text “This is a test to check the speed difference between various device types to see if there is any difference.” from en to es:

With nothing set: 4.285677909851074 seconds.
With auto: 4.503474950790405 seconds.
With cuda: 4.536334037780762 seconds.
With cpu: 3.598237991333008 seconds.

The value cpu was pure guesswork on my part: the accepted values are not documented, and I could not find them anywhere in the source code when I searched, even though the program knows how to respond when you pass a value it doesn’t recognize, such as none.
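For reference, here is a minimal sketch of how such a setting is typically applied, assuming the environment variable has to be set before argostranslate is imported (the value names below are the ones tried in the timings above; the import and translate calls are left commented out as illustration only):

```python
import os

# Assumption: ARGOS_DEVICE_TYPE is read when the package loads, so it
# must be set before importing argostranslate. Values tried above:
# "cpu", "cuda", "auto".
os.environ["ARGOS_DEVICE_TYPE"] = "cpu"

# import argostranslate.translate  # import only after setting the variable
# translated = argostranslate.translate.translate("This is a test.", "en", "es")
```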

Does this mean that my GPU is really bad? I cannot detect much GPU usage while it’s running (in Task Manager), but CPU and I/O usage are low as well. As with so many other programs, I get the feeling that Argos Translate isn’t using anywhere near my machine’s full power. It’s frustrating!

There’s more per-request latency on the GPU, but as you increase the size of each request or process more translations at once, the GPU becomes faster.

E.g. using CTranslate2 (the library used to run the models) with batching provides a significant speedup, because translations execute in parallel rather than one after another.

As an example, batching a 1000-sentence corpus and translating it on GPU (a 3090) vs. CPU takes 1.6 s vs. 74 s.

Translating sequentially, one sentence at a time, would most likely take significantly longer on each device.
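The trade-off described above can be sketched as a toy cost model: the GPU pays a large fixed setup cost per request but a tiny per-sentence cost, while the CPU is the opposite. All the numbers below are hypothetical, chosen only to illustrate the shape of the trade-off, not measured from Argos Translate:

```python
# Illustrative fixed + per-sentence costs in seconds (hypothetical numbers).
GPU_FIXED, GPU_PER_SENTENCE = 3.0, 0.001
CPU_FIXED, CPU_PER_SENTENCE = 0.5, 0.070

def gpu_time(n_sentences: int) -> float:
    """Modelled GPU time: big fixed overhead, cheap per sentence."""
    return GPU_FIXED + GPU_PER_SENTENCE * n_sentences

def cpu_time(n_sentences: int) -> float:
    """Modelled CPU time: small fixed overhead, expensive per sentence."""
    return CPU_FIXED + CPU_PER_SENTENCE * n_sentences

# A single short request: the CPU wins, matching the timings in the question.
assert cpu_time(1) < gpu_time(1)

# A 1000-sentence batch: the GPU wins by a wide margin.
assert gpu_time(1000) < cpu_time(1000)
```

This is why a one-sentence benchmark makes the GPU look slower: the fixed overhead dominates, and the per-sentence advantage never gets a chance to pay off.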
