High GPU temperature after kernel 4.18.0.9.10 on AMD RX460

Bug #1796720 reported by Daniel
This bug affects 3 people
Affects          Status         Importance   Assigned to   Milestone
Linux            Fix Released   Low
linux (Ubuntu)   Confirmed      High         Unassigned

Bug Description

Hello,

Today I received a Linux kernel update that upgraded the kernel on my Ubuntu 18.10 installation to 4.18.0.9.10. Immediately after restarting I noticed that the GPU temperature is abnormally high.

Before Update:
- GPU fan set to off on idle
- GPU idle temperature: 29-32 C

After Update:
- GPU fan set to off on idle
- GPU idle temperature: 47-50 C (I have to run the fans at 50% to keep it at 33 C)

I have also tried to boot from the older kernel (4.18.0.8) and everything is normal there!

System spec:
- MB: ASUS PRIME X299-DELUXE
- CPU: Intel Core i7-7820X (Skylake-X)
- GPU: AMD RX460

glxinfo | grep OpenGL:

OpenGL vendor string: X.Org
OpenGL renderer string: AMD Radeon (TM) RX 460 Graphics (POLARIS11, DRM 3.26.0, 4.18.0-9-generic, LLVM 7.0.0)
OpenGL core profile version string: 4.5 (Core Profile) Mesa 18.2.1
OpenGL core profile shading language version string: 4.50
OpenGL core profile context flags: (none)
OpenGL core profile profile mask: core profile
OpenGL core profile extensions:
OpenGL version string: 4.4 (Compatibility Profile) Mesa 18.2.1
OpenGL shading language version string: 4.40
OpenGL context flags: (none)
OpenGL profile mask: compatibility profile
OpenGL extensions:
OpenGL ES profile version string: OpenGL ES 3.2 Mesa 18.2.1
OpenGL ES profile shading language version string: OpenGL ES GLSL ES 3.20
OpenGL ES profile extensions:

sudo cat /sys/kernel/debug/dri/0/amdgpu_pm_info:

Clock Gating Flags Mask: 0x37bcf
 Graphics Medium Grain Clock Gating: On
 Graphics Medium Grain memory Light Sleep: On
 Graphics Coarse Grain Clock Gating: On
 Graphics Coarse Grain memory Light Sleep: On
 Graphics Coarse Grain Tree Shader Clock Gating: Off
 Graphics Coarse Grain Tree Shader Light Sleep: Off
 Graphics Command Processor Light Sleep: On
 Graphics Run List Controller Light Sleep: On
 Graphics 3D Coarse Grain Clock Gating: Off
 Graphics 3D Coarse Grain memory Light Sleep: Off
 Memory Controller Light Sleep: On
 Memory Controller Medium Grain Clock Gating: On
 System Direct Memory Access Light Sleep: Off
 System Direct Memory Access Medium Grain Clock Gating: On
 Bus Interface Medium Grain Clock Gating: Off
 Bus Interface Light Sleep: On
 Unified Video Decoder Medium Grain Clock Gating: On
 Video Compression Engine Medium Grain Clock Gating: On
 Host Data Path Light Sleep: Off
 Host Data Path Medium Grain Clock Gating: On
 Digital Right Management Medium Grain Clock Gating: Off
 Digital Right Management Light Sleep: Off
 Rom Medium Grain Clock Gating: On
 Data Fabric Medium Grain Clock Gating: Off

GFX Clocks and Power:
 1750 MHz (MCLK)
 1212 MHz (SCLK)
 214 MHz (PSTATE_SCLK)
 300 MHz (PSTATE_MCLK)
 1081 mV (VDDGFX)
 18.65 W (average GPU)

GPU Temperature: 39 C
GPU Load: 0 %

UVD: Disabled

VCE: Disabled

ProblemType: Bug
DistroRelease: Ubuntu 18.10
Package: linux-image-4.18.0-9-generic 4.18.0-9.10
ProcVersionSignature: Ubuntu 4.18.0-9.10-generic 4.18.12
Uname: Linux 4.18.0-9-generic x86_64
ApportVersion: 2.20.10-0ubuntu11
Architecture: amd64
AudioDevicesInUse:
 USER PID ACCESS COMMAND
 /dev/snd/controlC1: shadowfax 3078 F.... pulseaudio
 /dev/snd/controlC0: shadowfax 3078 F.... pulseaudio
CurrentDesktop: ubuntu:GNOME
Date: Mon Oct 8 20:27:13 2018
InstallationDate: Installed on 2018-09-26 (11 days ago)
InstallationMedia: Ubuntu 18.10 "Cosmic Cuttlefish" - Beta amd64 (20180925.1)
MachineType: System manufacturer System Product Name
ProcFB: 0 amdgpudrmfb
ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-4.18.0-9-generic root=UUID=a58662c7-39e2-4f05-9da6-d63173c94d53 ro quiet splash vt.handoff=1
RelatedPackageVersions:
 linux-restricted-modules-4.18.0-9-generic N/A
 linux-backports-modules-4.18.0-9-generic N/A
 linux-firmware 1.175
SourcePackage: linux
UpgradeStatus: No upgrade log present (probably fresh install)
dmi.bios.date: 08/03/2018
dmi.bios.vendor: American Megatrends Inc.
dmi.bios.version: 1503
dmi.board.asset.tag: Default string
dmi.board.name: PRIME X299-DELUXE
dmi.board.vendor: ASUSTeK COMPUTER INC.
dmi.board.version: Rev 1.xx
dmi.chassis.asset.tag: Default string
dmi.chassis.type: 3
dmi.chassis.vendor: Default string
dmi.chassis.version: Default string
dmi.modalias: dmi:bvnAmericanMegatrendsInc.:bvr1503:bd08/03/2018:svnSystemmanufacturer:pnSystemProductName:pvrSystemVersion:rvnASUSTeKCOMPUTERINC.:rnPRIMEX299-DELUXE:rvrRev1.xx:cvnDefaultstring:ct3:cvrDefaultstring:
dmi.product.family: To be filled by O.E.M.
dmi.product.name: System Product Name
dmi.product.sku: SKU
dmi.product.version: System Version
dmi.sys.vendor: System manufacturer

Revision history for this message
In , quirin.blaeser (quirin.blaeser-linux-kernel-bugs) wrote :

Created attachment 278829
lsusb lspci cpuinfo config url sensors

sensors reports a power consumption of 13 W idle for 4.18.10
sensors reports a power consumption of 7 W idle for 4.18.9

Attached:
lsusb
lspci
cpuinfo
URL to mainboard and graphics card
kernel config
output of sensors while idle

In , alexdeucher (alexdeucher-linux-kernel-bugs) wrote :

Can you bisect?

In , alexdeucher (alexdeucher-linux-kernel-bugs) wrote :

Git bisect howto:
https://www.kernel.org/doc/html/v4.18/admin-guide/bug-bisect.html

In , quirin.blaeser (quirin.blaeser-linux-kernel-bugs) wrote :

On 28.09.18 at 20:14, <email address hidden> wrote:
> https://bugzilla.kernel.org/show_bug.cgi?id=201275
>
> --- Comment #2 from Alex Deucher (<email address hidden>) ---
> Git bisect howto:
> https://www.kernel.org/doc/html/v4.18/admin-guide/bug-bisect.html
>
Sounds practicable, but may take 1-2 days.
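
As a rough illustration of what git bisect automates (a sketch, not the tool itself): it binary-searches an ordered list of commits for the first one that shows the regression. The commit names and the `is_bad` predicate below are hypothetical placeholders; in practice each test means building and booting a kernel and checking the idle power with sensors.

```python
from typing import Callable, Sequence, Tuple


def first_bad_commit(commits: Sequence[str],
                     is_bad: Callable[[str], bool]) -> Tuple[str, int]:
    """Return (first bad commit, number of commits tested).

    Assumes commits[0] is known good, commits[-1] is known bad, and
    every commit after the first bad one is bad as well.
    """
    good, bad = 0, len(commits) - 1
    tests = 0
    while bad - good > 1:
        mid = (good + bad) // 2
        tests += 1
        if is_bad(commits[mid]):
            bad = mid   # regression is at mid or earlier
        else:
            good = mid  # regression is after mid
    return commits[bad], tests


# Hypothetical history: "c0" is the last good release, "c16" the bad one,
# and the regression was (by assumption) introduced in "c11".
history = [f"c{i}" for i in range(17)]
culprit, tests = first_bad_commit(history, lambda c: int(c[1:]) >= 11)
print(culprit, tests)
```

With N commits between the good and the bad tag this needs roughly log2(N) build-and-boot cycles, which is why the estimate is given in days rather than hours.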

In , Dieter (dieter-linux-kernel-bugs) wrote :

I can second that for RX580 (Polaris20)

It rose to 60 W 'idle' (from ~31/32 W with 4.18.9).

Bisect? Not so fast, because I use openSUSE Tumbleweed 'Kernel:stable' whenever 'amd-staging-drm-next' does NOT work for me, and it has NOT worked for me since 21/22 August... but that belongs in another ticket.

amdgpu-pci-0100
Adapter: PCI adapter
vddgfx: +1.20 V
fan1: 886 RPM
temp1: +43.0°C (crit = +94.0°C, hyst = -273.1°C)
power1: 59.16 W (cap = 175.00 W)

In , quirin.blaeser (quirin.blaeser-linux-kernel-bugs) wrote :

Created attachment 278841
bisect result

Includes bisect steps + sensors output

In , quirin.blaeser (quirin.blaeser-linux-kernel-bugs) wrote :

RX560 is Polaris11, so Bug may be ported from Polaris20

In , quirin.blaeser (quirin.blaeser-linux-kernel-bugs) wrote :

For now I have resolved this problem by simply reverting the patch on 4.18.11:

git diff 93b100ddda3be284be160e9ccba28c7f8f21ab73^1..93b100ddda3be284be160e9ccba28c7f8f21ab73
and patch -R -p1

or without "-R":

git diff 93b100ddda3be284be160e9ccba28c7f8f21ab73..93b100ddda3be284be160e9ccba28c7f8f21ab73^1

Maybe the specs for Vega and Polaris just differ at this point?

In , quirin.blaeser (quirin.blaeser-linux-kernel-bugs) wrote :

4.18.12
Bug still present, removing 93b100ddda3be284be160e9ccba28c7f8f21ab73 solves this problem for now.

In , alexdeucher (alexdeucher-linux-kernel-bugs) wrote :

I don't think this is a bug. The problem is, prior to that patch, the display component was requesting minimum clocks that were 10x too low. This saved power, but led to display problems on some systems because the clocks were too low to sustain the display requirements.
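
To make the "10x too low" concrete, here is a small arithmetic sketch. It assumes that one side reports clocks in units of 10 kHz while the other side expects kHz; the function and variable names are hypothetical and are not taken from the driver.

```python
def ten_khz_units_to_khz(value_10khz: int) -> int:
    """Convert a clock expressed in units of 10 kHz to kHz."""
    return value_10khz * 10


# 300 MHz = 300_000 kHz, which is 30_000 when counted in units of 10 kHz.
reported = 30_000

correct_khz = ten_khz_units_to_khz(reported)  # 300_000 kHz, i.e. 300 MHz
misread_khz = reported                        # unit mix-up: taken as kHz, i.e. 30 MHz

print(correct_khz // 1000, "MHz vs", misread_khz // 1000, "MHz")
print("ratio:", correct_khz // misread_khz)
```

A minimum-clock request that is misread this way asks for a tenth of what the display actually needs.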

In , Dieter (dieter-linux-kernel-bugs) wrote :

(In reply to Alex Deucher from comment #9)
> I don't think this is a bug. The problem is, prior to that patch, the
> display component was requesting minimum clocks that were 10x too low. This
> saved power, but led to display problems on some systems because the clocks
> were too low to sustain the display requirements.

Sorry Alex,
what?

_All_ was fine _before_ this commit for ages with stable upstream and all 'amd-staging-drm-next'.

Now, I get ~60 W raised from ~30 W with 1920x1080 (even dual display was good before).

In , alexdeucher (alexdeucher-linux-kernel-bugs) wrote :

Can you attach the output of `cat /sys/kernel/debug/dri/0/amdgpu_pm_info` before and after the patch?

In , Dieter (dieter-linux-kernel-bugs) wrote :

openSUSE Tumbleweed Kernel:stable 4.18.12-2.ga880bd8-default

After the patch.
(For 'before' I have to reboot to broken 'amd-staging-drm-next')
https://bugs.freedesktop.org/show_bug.cgi?id=108096

Clock Gating Flags Mask: 0x37bcf
        Graphics Medium Grain Clock Gating: On
        Graphics Medium Grain memory Light Sleep: On
        Graphics Coarse Grain Clock Gating: On
        Graphics Coarse Grain memory Light Sleep: On
        Graphics Coarse Grain Tree Shader Clock Gating: Off
        Graphics Coarse Grain Tree Shader Light Sleep: Off
        Graphics Command Processor Light Sleep: On
        Graphics Run List Controller Light Sleep: On
        Graphics 3D Coarse Grain Clock Gating: Off
        Graphics 3D Coarse Grain memory Light Sleep: Off
        Memory Controller Light Sleep: On
        Memory Controller Medium Grain Clock Gating: On
        System Direct Memory Access Light Sleep: Off
        System Direct Memory Access Medium Grain Clock Gating: On
        Bus Interface Medium Grain Clock Gating: Off
        Bus Interface Light Sleep: On
        Unified Video Decoder Medium Grain Clock Gating: On
        Video Compression Engine Medium Grain Clock Gating: On
        Host Data Path Light Sleep: Off
        Host Data Path Medium Grain Clock Gating: On
        Digital Right Management Medium Grain Clock Gating: Off
        Digital Right Management Light Sleep: Off
        Rom Medium Grain Clock Gating: On
        Data Fabric Medium Grain Clock Gating: Off

GFX Clocks and Power:
        2000 MHz (MCLK)
        1411 MHz (SCLK)
        600 MHz (PSTATE_SCLK)
        1000 MHz (PSTATE_MCLK)
        1200 mV (VDDGFX)
        61.254 W (average GPU)

GPU Temperature: 44 C
GPU Load: 0 %

UVD: Disabled

VCE: Disabled

In , Dieter (dieter-linux-kernel-bugs) wrote :

amd-staging-drm-next (with broken SDDM and then 'init 3')

Why is 'GPU Load:' so high?
 => Take it with a grain of salt.

Clock Gating Flags Mask: 0x3fbcf
        Graphics Medium Grain Clock Gating: On
        Graphics Medium Grain memory Light Sleep: On
        Graphics Coarse Grain Clock Gating: On
        Graphics Coarse Grain memory Light Sleep: On
        Graphics Coarse Grain Tree Shader Clock Gating: Off
        Graphics Coarse Grain Tree Shader Light Sleep: Off
        Graphics Command Processor Light Sleep: On
        Graphics Run List Controller Light Sleep: On
        Graphics 3D Coarse Grain Clock Gating: Off
        Graphics 3D Coarse Grain memory Light Sleep: Off
        Memory Controller Light Sleep: On
        Memory Controller Medium Grain Clock Gating: On
        System Direct Memory Access Light Sleep: Off
        System Direct Memory Access Medium Grain Clock Gating: On
        Bus Interface Medium Grain Clock Gating: Off
        Bus Interface Light Sleep: On
        Unified Video Decoder Medium Grain Clock Gating: On
        Video Compression Engine Medium Grain Clock Gating: On
        Host Data Path Light Sleep: On
        Host Data Path Medium Grain Clock Gating: On
        Digital Right Management Medium Grain Clock Gating: Off
        Digital Right Management Light Sleep: Off
        Rom Medium Grain Clock Gating: On
        Data Fabric Medium Grain Clock Gating: Off

GFX Clocks and Power:
        300 MHz (MCLK)
        303 MHz (SCLK)
        600 MHz (PSTATE_SCLK)
        1000 MHz (PSTATE_MCLK)
        831 mV (VDDGFX)
        32.176 W (average GPU)

GPU Temperature: 29 C
GPU Load: 84 %

UVD: Disabled

VCE: Disabled

In , Dieter (dieter-linux-kernel-bugs) wrote :

Diff !!!

BAD
Host Data Path Light Sleep: Off

GOOD
Host Data Path Light Sleep: On
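
The comparison can also be done mechanically. Below is a minimal sketch that parses two amdgpu_pm_info dumps into flag dictionaries and reports the flags whose state differs. The function names are made up for illustration, and the sample text is a small subset of lines copied from the two dumps in this thread.

```python
def parse_flags(dump: str) -> dict:
    """Map the 'Name: On' / 'Name: Off' lines of a dump to {name: state}."""
    flags = {}
    for line in dump.splitlines():
        name, sep, state = line.strip().rpartition(": ")
        if sep and state in ("On", "Off"):
            flags[name] = state
    return flags


def diff_flags(bad: str, good: str) -> dict:
    """Return {name: (bad_state, good_state)} for flags that differ."""
    b, g = parse_flags(bad), parse_flags(good)
    return {k: (b[k], g[k]) for k in b if k in g and b[k] != g[k]}


BAD = """\
Clock Gating Flags Mask: 0x37bcf
Memory Controller Light Sleep: On
Host Data Path Light Sleep: Off
Host Data Path Medium Grain Clock Gating: On
"""
GOOD = """\
Clock Gating Flags Mask: 0x3fbcf
Memory Controller Light Sleep: On
Host Data Path Light Sleep: On
Host Data Path Medium Grain Clock Gating: On
"""

changed = diff_flags(BAD, GOOD)
mask_delta = 0x3fbcf ^ 0x37bcf  # bits that differ between the two masks
print(changed, hex(mask_delta))
```

The two flag masks differ in exactly one bit (0x8000), which is consistent with a single flag having changed state.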

In , Dieter (dieter-linux-kernel-bugs) wrote :

(In reply to Dieter Nützel from comment #14)
> Diff !!!
>
> BAD
> Host Data Path Light Sleep: Off

card0/device> cat pp_dpm_mclk
0: 300Mhz
1: 1000Mhz
2: 2000Mhz *

card0/device> cat pp_dpm_sclk
0: 300Mhz
1: 600Mhz
2: 900Mhz
3: 1145Mhz
4: 1215Mhz
5: 1257Mhz
6: 1300Mhz
7: 1411Mhz *

> GOOD
> Host Data Path Light Sleep: On

card0/device> cat pp_dpm_mclk
0: 300Mhz *
1: 1000Mhz
2: 2000Mhz

card0/device> cat pp_dpm_sclk
0: 300Mhz
1: 600Mhz *
2: 900Mhz
3: 1145Mhz
4: 1215Mhz
5: 1257Mhz
6: 1300Mhz
7: 1411Mhz

But SCLK changed a lot.

I badly need some sleep.
Saturday morning off for family vacation.

Greetings,
Dieter
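
In the pp_dpm_mclk and pp_dpm_sclk listings above, the active level is the one marked with `*`. A minimal sketch for extracting it, assuming the text has the same layout as the output quoted in this thread (the helper name is hypothetical; on a live system the text would be read from the corresponding file under card0/device):

```python
import re


def active_level(pp_dpm_text: str):
    """Return (index, MHz) of the level marked with '*', or None."""
    for line in pp_dpm_text.splitlines():
        m = re.match(r"\s*(\d+):\s*(\d+)Mhz\s*\*", line)
        if m:
            return int(m.group(1)), int(m.group(2))
    return None


# pp_dpm_mclk as quoted for the BAD and the GOOD case.
BAD_MCLK = """\
0: 300Mhz
1: 1000Mhz
2: 2000Mhz *
"""
GOOD_MCLK = """\
0: 300Mhz *
1: 1000Mhz
2: 2000Mhz
"""

print("bad: ", active_level(BAD_MCLK))
print("good:", active_level(GOOD_MCLK))
```

At idle the bad case sits at the highest memory clock level and the good case at the lowest one.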

In , quirin.blaeser (quirin.blaeser-linux-kernel-bugs) wrote :

Created attachment 278933
amdgpu_pm_info

content of /sys/kernel/debug/dri/1/amdgpu_pm_info
for 4.18.12 +/- 93b100ddda3be284be160e9ccba28c7f8f21ab73

In , grmat (grmat-linux-kernel-bugs) wrote :

I can confirm the issue with Polaris10. Idle power consumption is roughly 30 W higher than it used to be, and higher than under Windows. DPM is stuck in the highest power states for both SCLK and MCLK.

The reporter has already bisected so I haven't. If you still need more info, please ping.

In , quirin.blaeser (quirin.blaeser-linux-kernel-bugs) wrote :

(In reply to Alex Deucher from comment #9)
> I don't think this is a bug. The problem is, prior to that patch, the
> display component was requesting minimum clocks that were 10x too low. This
> saved power, but led to display problems on some systems because the clocks
> were too low to sustain the display requirements.

so
 - 93b100ddd... fools DC
 + 93b100ddd... fools PM

From my point of view, the scaling of clock values simply happens in the wrong place.
So we may have to find the different points in the code from which smuX_get_XXX gets called by PM _or_ DC, or maybe in firmware.

In , michel (michel-linux-kernel-bugs) wrote :

People on the Phoronix forum mentioned that this doesn't seem to happen with 4.19-rc kernels. If people here can confirm that, maybe there are other corresponding changes that need to be backported as well.

In , quirin.blaeser (quirin.blaeser-linux-kernel-bugs) wrote :

(In reply to Michel Dänzer from comment #19)
> People on the Phoronix forum mentioned that this doesn't seem to happen with
> 4.19-rc kernels. If people here can confirm that, maybe there are other
> corresponding changes that need to be backported as well.

4.19-rc1:

amdgpu-pci-0100
Adapter: PCI adapter
vddgfx: +0.81 V
fan1: 1602 RPM
temp1: +22.0°C (crit = +94.0°C, hyst = -273.1°C)
power1: 6.10 W (cap = 48.00 W)

In , thomas-lange2 (thomas-lange2-linux-kernel-bugs) wrote :

I can confirm that only 4.18.x (x > 9) is affected. 4.19-rc6 reports the same clock and power values as with 4.18.9. At least that's the case for my RX 560.

In , quirin.blaeser (quirin.blaeser-linux-kernel-bugs) wrote :

Created attachment 278939
bisect

The author of 93b100ddda3be284be160e9ccba28c7f8f21ab73 simply forgot to remove the scaling of values for powerplay.
Have a look at 23ec3d1479fd79658cd52c47618d8ddd2f32550b, where the same scaling was applied to Vega.
You may have to patch drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_services.c too.
Have a look at needed_patch.txt

In , quirin.blaeser (quirin.blaeser-linux-kernel-bugs) wrote :

drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_services.c (v4.18.12)
There is a lot of work to do:

for (i = 0; i < dc_clks->num_levels; i++) {
        DRM_INFO("DM_PPLIB:\t %d\n", pp_clks->clock[i]);
        /* translate 10kHz to kHz */
        dc_clks->clocks_in_khz[i] = pp_clks->clock[i] * 10;
}

for (i = 0; i < clk_level_info->num_levels; i++) {
        DRM_DEBUG("DM_PPLIB:\t %d in 10kHz\n", pp_clks->data[i].clocks_in_khz);
        /* translate 10kHz to kHz */
        clk_level_info->data[i].clocks_in_khz
                = pp_clks->data[i].clocks_in_khz * 10;
        clk_level_info->data[i].latency_in_us
                = pp_clks->data[i].latency_in_us;
}

and maybe

/* Translate 10 kHz to kHz. */
validation_clks.engine_max_clock *= 10;
validation_clks.memory_max_clock *= 10;

since 2017-09-12 15:58:20

bool dm_pp_get_clock_levels_by_type_with_voltage(
        const struct dc_context *ctx,
        enum dm_pp_clock_type clk_type,
        struct dm_pp_clock_levels_with_voltage *clk_level_info)
{
        /* TODO: to be implemented */
        return false;
}

bool dm_pp_notify_wm_clock_changes(
        const struct dc_context *ctx,
        struct dm_pp_wm_sets_with_clock_ranges *wm_with_clock_ranges)
{
        /* TODO: to be implemented */
        return false;
}

bool dm_pp_apply_power_level_change_request(
        const struct dc_context *ctx,
        struct dm_pp_power_level_change_request *level_change_req)
{
        /* TODO: to be implemented */
        return false;
}

bool dm_pp_apply_clock_for_voltage_request(
        const struct dc_context *ctx,
        struct dm_pp_clock_for_voltage_req *clock_for_voltage_req)
{
        /* TODO: to be implemented */
        return false;
}

bool dm_pp_get_static_clocks(
        const struct dc_context *ctx,
        struct dm_pp_static_clock_info *static_clk_info)
{
        /* TODO: to be implemented */
        return false;
}

In , alexdeucher (alexdeucher-linux-kernel-bugs) wrote :

This patch shouldn't have been applied to 4.18. It looks like it was autoselected:
https://lkml.org/lkml/2018/9/15/172
It should be reverted.

Daniel (khazaei.danial) wrote :
description: updated
description: updated
Daniel (khazaei.danial) wrote :

This output is from running lm-sensors while the system is idle:

kernel 4.18.0.8

amdgpu-pci-6500
Adapter: PCI adapter
vddgfx: +0.77 V
fan1: 974 RPM
temp1: +30.0°C (crit = +94.0°C, hyst = -273.1°C)
power1: 6.11 W (cap = 48.00 W)

kernel 4.18.0.9.10

amdgpu-pci-6500
Adapter: PCI adapter
vddgfx: +1.08 V
fan1: 974 RPM
temp1: +41.0°C (crit = +94.0°C, hyst = -273.1°C)
power1: 18.47 W (cap = 48.00 W)

Clearly either the voltage increase is the cause, or dynamic power management is broken...
I tried booting with amdgpu.dpm=1, with no luck.
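
The two sensors readings can be compared numerically. A minimal sketch, assuming the `label: value unit` layout shown above; the helper name is hypothetical and the sample text is copied from the two readings in this comment.

```python
import re


def parse_sensors(block: str) -> dict:
    """Extract the leading numeric reading of each 'label: value' line."""
    readings = {}
    for line in block.splitlines():
        m = re.match(r"(\w+):\s*\+?(-?\d+(?:\.\d+)?)", line)
        if m:
            readings[m.group(1)] = float(m.group(2))
    return readings


OLD = """\
amdgpu-pci-6500
Adapter: PCI adapter
vddgfx: +0.77 V
fan1: 974 RPM
temp1: +30.0°C (crit = +94.0°C, hyst = -273.1°C)
power1: 6.11 W (cap = 48.00 W)
"""
NEW = """\
amdgpu-pci-6500
Adapter: PCI adapter
vddgfx: +1.08 V
fan1: 974 RPM
temp1: +41.0°C (crit = +94.0°C, hyst = -273.1°C)
power1: 18.47 W (cap = 48.00 W)
"""

old, new = parse_sensors(OLD), parse_sensors(NEW)
for key in ("vddgfx", "temp1", "power1"):
    print(f"{key}: {old[key]} -> {new[key]} ({new[key] - old[key]:+.2f})")
```

On these readings the idle power roughly triples (6.11 W to 18.47 W) while the fan speed stays the same.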

Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote : Status changed to Confirmed

This change was made by a bot.

Changed in linux (Ubuntu):
status: New → Confirmed
In , quirin.blaeser (quirin.blaeser-linux-kernel-bugs) wrote :

(In reply to Alex Deucher from comment #24)
> This patch shouldn't have been applied to 4.18. It looks like it was
> autoselected:
> https://lkml.org/lkml/2018/9/15/172
> It should be reverted.

So "[...]the display component was requesting minimum clocks[...]" isn't an issue with Polaris?
Is there any QA left?
Avoiding unusual units is a good idea generally, but it should happen very early in development.

In , alexdeucher (alexdeucher-linux-kernel-bugs) wrote :

(In reply to quirin.blaeser from comment #25)
> (In reply to Alex Deucher from comment #24)
> > This patch shouldn't have been applied to 4.18. It looks like it was
> > autoselected:
> > https://lkml.org/lkml/2018/9/15/172
> > It should be reverted.
>
> So "[...]the display component was requesting minimum clocks[...]" isn't an
> issue with Polaris?
> Is there any QA left?
> Avoiding unusual units is a good idea generally, but it should happen very
> early in development.

It was a fix for fallout from an interface refactor we did in 4.19 that mixed up the units between display and power. We did not intend to have the patch applied to 4.18 and we did not flag the patch for 4.18, it was flagged for 4.18 by someone else outside of AMD.

In , quirin.blaeser (quirin.blaeser-linux-kernel-bugs) wrote :

I can't find "c3df50abc84b" from https://lkml.org/lkml/2018/9/15/172
but "drm/amd/pp: Convert clock unit to KHz as defined" is 23ec3d1479fd79658cd52c47618d8ddd2f32550b

In , alexdeucher (alexdeucher-linux-kernel-bugs) wrote :

(In reply to quirin.blaeser from comment #27)
> I can't find "c3df50abc84b" from https://lkml.org/lkml/2018/9/15/172
> but "drm/amd/pp: Convert clock unit to KHz as defined" is
> 23ec3d1479fd79658cd52c47618d8ddd2f32550b

That was the commit id in our amd-staging-drm-next branch:
https://cgit.freedesktop.org/~agd5f/linux/commit/?h=amd-staging-drm-next&id=c3df50abc84b289be8e7b96968d7d7e006576880

In , quirin.blaeser (quirin.blaeser-linux-kernel-bugs) wrote :

Created attachment 278965
git diff -p 93b100ddda3be284be160e9ccba28c7f8f21ab73..93b100ddda3be284be160e9ccba28c7f8f21ab73^1

git diff -p 93b100ddda3be284be160e9ccba28c7f8f21ab73..93b100ddda3be284be160e9ccba28c7f8f21ab73^1

Apply to v4.18.10 .. v4.18.12 to revert

In , quirin.blaeser (quirin.blaeser-linux-kernel-bugs) wrote :

(In reply to Alex Deucher from comment #28)
> (In reply to quirin.blaeser from comment #27)
> > I can't find "c3df50abc84b" from https://lkml.org/lkml/2018/9/15/172
> > but "drm/amd/pp: Convert clock unit to KHz as defined" is
> > 23ec3d1479fd79658cd52c47618d8ddd2f32550b
>
> That was the commit id in our amd-staging-drm-next branch:
> https://cgit.freedesktop.org/~agd5f/linux/commit/?h=amd-staging-drm-next&id=c3df50abc84b289be8e7b96968d7d7e006576880

thx

Cristian Aravena Romero (caravena) wrote :

Would it be possible for you to test the latest upstream kernel? Refer to https://wiki.ubuntu.com/KernelMainlineBuilds . Please test the latest v4.19-rc7 kernel[0].

If this bug is fixed in the mainline kernel, please add the following tag 'kernel-fixed-upstream'.

If the mainline kernel does not fix this bug, please add the tag: 'kernel-bug-exists-upstream'.

Once testing of the upstream kernel is complete, please mark this bug as "Confirmed".

Thanks in advance.

[0] http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.19-rc7/

Changed in linux (Ubuntu):
status: Confirmed → Incomplete
Cristian Aravena Romero (caravena) wrote :

@Daniel:

https://bugzilla.kernel.org/show_bug.cgi?id=201275

Best regards,
--
Cristian Aravena Romero (caravena)

Cristian Aravena Romero (caravena) wrote :

Report in kernel.org
"Power consumption RX560 idle raised from 7 W to 13 W"
--
Cristian Aravena Romero (caravena)

Changed in linux (Ubuntu):
importance: Undecided → High
tags: added: kernel-da-key
Changed in linux (Ubuntu):
status: Incomplete → Triaged
Daniel (khazaei.danial) wrote :

@Cristian

Thank you for the heads up... so this is an upstream mainline kernel bug?
From the link you posted I can see that the latest 4.19-rc doesn't seem to be affected!

Do you still need me to install the latest kernel and confirm whether this is fixed or not, using the instructions you mentioned earlier?

- If this bug is fixed in the mainline kernel, please add the following tag 'kernel-fixed-upstream'.

- If the mainline kernel does not fix this bug, please add the tag: 'kernel-bug-exists-upstream'.

- Once testing of the upstream kernel is complete, please mark this bug as "Confirmed".

Regards,

Cristian Aravena Romero (caravena) wrote :

Help Joseph!

:)
--
Cristian Aravena Romero (caravena)

In , caravena (caravena-linux-kernel-bugs) wrote :

Hello,

This error is also registered on launchpad.net
https://bugs.launchpad.net/bugs/1796720

Best regards,
--
Cristian Aravena Romero (caravena)

Cristian Aravena Romero (caravena) wrote :

Would it be possible for you to test the latest upstream kernel? Refer to https://wiki.ubuntu.com/KernelMainlineBuilds . Please test the latest v4.18.13 kernel[0].

If this bug is fixed in the mainline kernel, please add the following tag 'kernel-fixed-upstream'.

If the mainline kernel does not fix this bug, please add the tag: 'kernel-bug-exists-upstream'.

Once testing of the upstream kernel is complete, please mark this bug as "Confirmed".

Thanks in advance.

[0]http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.18.13/

Changed in linux (Ubuntu):
status: Triaged → Incomplete
In , caravena (caravena-linux-kernel-bugs) wrote :

Hello,

Does kernel 4.18.13 work correctly?

Best regards,
--
Cristian Aravena Romero (caravena)

In , harry.wentland (harry.wentland-linux-kernel-bugs) wrote :

Yes, it should. GregKH reverted the offending commit in 4.18.13.
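
Taken together with the earlier reports in this thread (4.18.9 good, 4.18.10 through 4.18.12 bad), that bounds the affected upstream stable releases. A minimal sketch of the check, with hypothetical helper names; it only understands plain upstream `x.y.z` version strings and says nothing about distribution package versions such as 4.18.0-9.10.

```python
def parse_version(version: str) -> tuple:
    """Turn an upstream 'x.y.z' version string into a tuple of ints."""
    return tuple(int(part) for part in version.split("."))


def upstream_stable_affected(version: str) -> bool:
    """True for 4.18.10 up to 4.18.12, the range reported as bad here."""
    return (4, 18, 10) <= parse_version(version) < (4, 18, 13)


for v in ("4.18.9", "4.18.10", "4.18.12", "4.18.13"):
    print(v, upstream_stable_affected(v))
```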

Daniel (khazaei.danial) wrote :

I can confirm that the problem is fixed in 4.18.13:

amdgpu-pci-6500
Adapter: PCI adapter
vddgfx: +0.77 V
fan1: 968 RPM
temp1: +29.0°C (crit = +94.0°C, hyst = -273.1°C)
power1: 6.11 W (cap = 48.00 W)

tags: added: kernel-fixed-upstream
Changed in linux (Ubuntu):
status: Incomplete → Confirmed
Cristian Aravena Romero (caravena) wrote :

Excellent! Thank you...
--
Cristian Aravena Romero (caravena)

Daniel (khazaei.danial) wrote :

I have tried the latest Ubuntu 18.10 kernel (4.18.0-10.11) available here:

https://launchpad.net/ubuntu/+source/linux

The problem is not fixed in this version and unfortunately it seems that Ubuntu 18.10 will be released with this kernel on October 18th as it is stated "pre-release freeze"...

Regards

Cristian Aravena Romero (caravena) wrote :

Hello Daniel,

Within the Cosmic [0]planning, the kernel [1]freeze is detailed (October 4th):
[0] https://wiki.ubuntu.com/CosmicCuttlefish/ReleaseSchedule
[1] https://wiki.ubuntu.com/KernelFreeze

The kernel is already frozen; a version that does not exist yet (Linux 4.18.0-11.x) will probably work correctly for you.

Best regards,
--
Cristian Aravena Romero (caravena)

Daniel (khazaei.danial) wrote :

Today I received an update to kernel version 4.18.0-11.12 and the problem is not fixed yet...

Cristian Aravena Romero (caravena) wrote :

Help Joseph!
--
Cristian

Cristian Aravena Romero (caravena) wrote :

Daniel,

See Bug #1801931 :)
--
Cristian

Daniel (khazaei.danial) wrote :

Hi Cristian,

thank you for the heads up...

Regards

Changed in linux:
importance: Unknown → Low
status: Unknown → Fix Released
Per-Inge (per-inge-hallin) wrote :

This kernel update didn't fix the high power consumption on my Radeon RX580 GPU.

Start-Date: 2018-11-14 06:51:30
Commandline: /usr/bin/unattended-upgrade
Install: linux-headers-4.18.0-11:amd64 (4.18.0-11.12, automatic), linux-image-4.18.0-11-generic:amd64 (4.18.0-11.12, automatic), linux-modules-extra-4.18.0-11-generic:amd64 (4.18.0-11.12, automatic), linux-modules-4.18.0-11-generic:amd64 (4.18.0-11.12, automatic), linux-headers-4.18.0-11-generic:amd64 (4.18.0-11.12, automatic)
Upgrade: linux-headers-generic:amd64 (4.18.0.10.11, 4.18.0.11.12), linux-image-generic:amd64 (4.18.0.10.11, 4.18.0.11.12), linux-generic:amd64 (4.18.0.10.11, 4.18.0.11.12)
End-Date: 2018-11-14 06:52:01

Daniel (khazaei.danial) wrote :

Dear Per-Inge,

Check for kernel version 4.18.0-12.13 here:

https://launchpad.net/ubuntu/+source/linux

It's released as a proposed package and it has the fix! I'm using it right now, but you need to enable the proposed repository on your system, or wait for it to complete the testing phase and become available on the regular channel.

Regards,
D. Khazaei

Per-Inge (per-inge-hallin) wrote :

Thanks a lot.
This worked OK. I used kernel 4.19.1, but now I have removed it and I have also removed the "proposed package" setting.
The power draw from the wall outlet at idle is now back to the normal 63W-65W.

Brad Figg (brad-figg)
tags: added: cscc