Yes, of course. To clarify: I was referring to the cooling of the VRAM modules on the graphics card. If they get hotter than 84 degrees Celsius, the card seems to throttle. I’ve already replaced the thermal pads and installed a heat sink and fan on the backplate, which has lowered the temperature by about 6 degrees Celsius, but it still gets hot. When NOT all 15 threads are used, the VRAM temperature rises above 84 degrees.
Here are the specifications. The other PC components may not be as important, as the main load is on the graphics card.
Regarding cooling: consumer graphics cards have vertical cooling slots that are not designed for server cases. I got an RTX 4060 Ti 16GB, and it ran above 80 degrees in a normal PC tower case, even with extra fans. I put it into a 4U server case with adequate cooling, which brought the GPU down to 70 degrees at full load. At 70 degrees the GPU quickly ramps its fans up to more than 45%, so the temperature only rarely reaches 71 degrees. You need to modify the GPU’s fan curve to favor low temperature over “quiet”. I’ll try this soon and will report back.
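For anyone wondering what "favoring low temperature over quiet" could look like in practice, here is a rough sketch of a fan curve as a piecewise-linear mapping from temperature to fan duty. The curve points and the names `AGGRESSIVE_CURVE` and `fan_duty` are made up for illustration, not values from any vendor tool; the sketch only computes a target percentage and does not talk to the GPU, so you would still enter the points in whatever fan-control utility you use.

```python
from bisect import bisect_right

# Hypothetical (temperature in °C, fan duty in %) points.
# These are illustrative assumptions; tune them for your own card and case.
AGGRESSIVE_CURVE = [(30, 30), (50, 45), (60, 65), (70, 90), (75, 100)]


def fan_duty(temp_c, curve=AGGRESSIVE_CURVE):
    """Return the fan duty (%) for temp_c by linear interpolation."""
    temps = [t for t, _ in curve]
    # Clamp below the first point and above the last point.
    if temp_c <= temps[0]:
        return float(curve[0][1])
    if temp_c >= temps[-1]:
        return float(curve[-1][1])
    # Find the segment that contains temp_c and interpolate within it.
    i = bisect_right(temps, temp_c)
    (t0, d0), (t1, d1) = curve[i - 1], curve[i]
    return d0 + (d1 - d0) * (temp_c - t0) / (t1 - t0)


if __name__ == "__main__":
    for temp in (40, 55, 65, 70, 80):
        print(f"{temp} °C -> {fan_duty(temp):.1f} % fan")
```

With these example points the fans would already be at 90% by 70 degrees, instead of only starting to ramp up there. Whether that noise level is acceptable is a matter of taste.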
Edit:
How much RAM does a llama.cpp server need? 2x GPU RAM?