Last year I upgraded to an AMD RX 7900 XTX, mainly to play Alan Wake II. Just like my previous card, the XTX has a “Zero RPM” feature: it turns its fans off completely while the junction temperature, measured at the hottest point of the GPU, is below a certain threshold. With the fans off, the GPU relies on its massive heatsink for passive cooling. Even in a very well-ventilated case, however, this means that the area around the GPU heats up considerably. For me the fans turn off at around 55°C; the component closest to the GPU, an NVMe M.2 SSD, usually creeps up to around 48°C whilst idling.
Even under load the SSD never exceeds any temperature threshold, so realistically it should be fine, but I’m simply not happy with that much heat sitting around in there when it could be expelled easily by turning on the fans. Worse still, the logic for toggling the fans is not very well thought-out: in the worst case the fans run for one minute only to stop for the next, ad nauseam.
With my previous GPU, turning off “Zero RPM” was pretty simple. Using the upp(1) tool you could toggle the feature in the GPU’s so-called PowerPlay tables. From there it was a simple job to write a systemd service that turned off “Zero RPM” on system boot.
Sadly this is no longer possible on 7000-series cards, as there is no longer any direct access to the PowerPlay tables. Instead, a new sysfs-based framework for managing PowerPlay features was introduced. Fan curve controls were added after a while (and a lot of moaning by users), but there was no such knob for the “Zero RPM” feature. A couple of months ago a feature request was opened for it, but nothing much happened on AMD’s side.
Initially hopeful for a reasonably quick resolution, I was getting more and more annoyed after a while by the lack of this seemingly simple toggle, so I finally caved and proceeded to have a look at it myself. The hardest part was getting started with reading amdgpu code. The code base is absolutely massive and I had no real idea where to start. Since fan curve controls already existed I thought it best to find the commit that introduced them. After a quick search I found the relevant commit and had a better understanding of which parts of the code to change.
So, after a while of tweaking and twiddling I had a working prototype and I could finally have my GPU run its fans at all times. I knew a lot of people were also waiting for this feature, so I sent a patch upstream. After some short feedback and the addition of another feature the series was accepted, and is going to be part of the kernel sometime soon.
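For anyone wondering what using such a sysfs knob might look like, here is a minimal sketch in Python. The path and file name below (`gpu_od/fan_ctrl/fan_zero_rpm_enable` under `card0`) and the convention of writing the value followed by `c` to commit are assumptions on my part, modelled on how the existing fan curve controls behave; the card index will differ between systems, the directory may only show up with overdrive enabled, and the amdgpu documentation for your kernel has the final say. The helper names are made up for illustration.

```python
from pathlib import Path

# ASSUMPTION: location of the Zero RPM knob. The card index and the
# file name may differ on your system and kernel version.
DEFAULT_KNOB = Path(
    "/sys/class/drm/card0/device/gpu_od/fan_ctrl/fan_zero_rpm_enable"
)


def zero_rpm_writes(enable: bool) -> list[str]:
    """Return the strings to write, in order.

    ASSUMPTION: the knob takes "1" or "0" as the new value and then
    a separate "c" to commit the change.
    """
    return ["1" if enable else "0", "c"]


def set_zero_rpm(enable: bool, knob: Path = DEFAULT_KNOB) -> None:
    """Write the value and the commit command as two separate writes.

    On a real system this needs root privileges.
    """
    for command in zero_rpm_writes(enable):
        knob.write_text(command + "\n")
```

Calling `set_zero_rpm(False)` as root would then ask the driver to keep the fans spinning. sysfs settings do not survive a reboot, so something like a small systemd service running this at boot would still be needed, much like with the old upp(1) approach.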
With the fans now running at all times, I can happily report that ambient temperatures inside the case have dropped by more than 10°C and the SSD usually does not exceed 40°C when idling. Even better, I do feel quite proud to have finally contributed code to the kernel.