[arch-general] Steam hard-locks my PC (amdgpu fault)

David Runge dave at sleepmap.de
Sat Apr 13 15:11:21 UTC 2019


Hi!

On 2019-04-13 23:40:23 (+1000), Stephen Gregoratto via arch-general wrote:
> I've been having this problem for a while (since late 4.??) and it's
> been driving me up the wall. Basically, opening Steam 9/10 times hard
> locks my PC. I can still ssh into it, but the display is frozen and it
> hangs on shutdown, requiring a manual reset. Here's what comes up when
> viewing the dmesg:
We have a bug tracker. Please use it (for searches and reporting):
https://bugs.archlinux.org/

> [ 5191.955414] amdgpu 0000:01:00.0: GPU fault detected: 147 0x0ef1c801 for process vulkandriverque pid 11510 thread vulkandriverque pid 11510
> [ 5191.955416] amdgpu 0000:01:00.0:   VM_CONTEXT1_PROTECTION_FAULT_ADDR   0x0FDEFDDE
> [ 5191.955417] amdgpu 0000:01:00.0:   VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x021C8001
> [ 5191.955419] amdgpu 0000:01:00.0: VM fault (0x01, vmid 1, pasid 32776) at page 266272222, read from 'TC6' (0x54433600) (456)
> [ 5202.015490] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx timeout, signaled seq=196445, emitted seq=196447
> [ 5202.015588] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process vulkandriverque pid 11510 thread vulkandriverque pid 11510
> [ 5202.015610] amdgpu 0000:01:00.0: GPU reset begin!
> [ 5209.913537] audit: type=1006 audit(1555161631.600:68): pid=11659 uid=0 old-auid=4294967295 auid=1000 tty=(none) old-ses=4294967295 ses=3 res=1
> [ 5212.032315] [drm:amdgpu_dm_atomic_check [amdgpu]] *ERROR* [CRTC:49:crtc-1] hw_done or flip_done timed out
> [ 5406.595049] INFO: task kworker/u16:3:16913 blocked for more than 120 seconds.
> [ 5406.595052]       Not tainted 5.0.4-arch1-1-ARCH #1
> [ 5406.595053] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> [ 5406.595055] kworker/u16:3   D    0 16913      2 0x80000080
> [ 5406.595074] Workqueue: events_unbound commit_work [drm_kms_helper]
> [ 5406.595075] Call Trace:
> [ 5406.595085]  ? __schedule+0x30b/0x8b0
> [ 5406.595089]  schedule+0x32/0x80
> [ 5406.595093]  schedule_timeout+0x311/0x4a0
> [ 5406.595205]  ? dce110_timing_generator_get_crtc_scanoutpos+0x88/0x130 [amdgpu]
> [ 5406.595210]  dma_fence_default_wait+0x204/0x280
> [ 5406.595213]  ? dma_fence_wait_timeout+0x120/0x120
> [ 5406.595215]  dma_fence_wait_timeout+0x105/0x120
> [ 5406.595218]  reservation_object_wait_timeout_rcu+0x1f2/0x370
> [ 5406.595224]  ? preempt_count_add+0x79/0xb0
> [ 5406.595331]  amdgpu_dm_do_flip+0x14a/0x4a0 [amdgpu]
> [ 5406.595337]  ? _raw_spin_unlock_irqrestore+0x20/0x40
> [ 5406.595445]  ? amdgpu_dm_atomic_commit_tail+0x5f9/0xbc0 [amdgpu]
> [ 5406.595547]  amdgpu_dm_atomic_commit_tail+0x5f9/0xbc0 [amdgpu]
> [ 5406.595561]  commit_tail+0x3d/0x70 [drm_kms_helper]
> [ 5406.595566]  process_one_work+0x1eb/0x410
> [ 5406.595570]  worker_thread+0x2d/0x3d0
> [ 5406.595573]  ? process_one_work+0x410/0x410
> [ 5406.595576]  kthread+0x112/0x130
> [ 5406.595578]  ? kthread_park+0x80/0x80
> [ 5406.595581]  ret_from_fork+0x1f/0x40
Seems like you hit a bug in the driver or firmware for your AMD GPU.

Make sure to look into microcode updates for your CPU:
https://wiki.archlinux.org/index.php/Microcode

Also, look into any pitfalls regarding your GPU:
https://wiki.archlinux.org/index.php/AMDGPU

That said, you're likely better off searching for similar issues reported
by users of your graphics card and/or reporting this directly to (your
card's) upstream.

> Graphics:  Device-1: Advanced Micro Devices [AMD/ATI] Fiji [Radeon R9 FURY / NANO Series] driver: amdgpu v: kernel 
>            Display: server: X.org 1.20.4 driver: amdgpu tty: 228x62 
This is the relevant data.

> And here's all my packages:
There's no reason to post them.

Best,
David

P.S.: Please refrain from sending extensive output.
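
As a rough illustration of trimming a kernel log before posting it, here
is a minimal Python sketch. The function name filter_gpu_lines and the
keyword pattern are assumptions made up for this example, not part of any
existing tool. It keeps only lines that mention amdgpu or a "[drm" tag,
which would drop entries such as the unrelated audit line quoted above.

```python
import re

# Assumed keyword pattern for this example: lines mentioning the amdgpu
# driver or carrying a "[drm..." tag. Adjust it for other drivers.
GPU_PATTERN = re.compile(r"amdgpu|\[drm")


def filter_gpu_lines(log_text):
    """Return only the log lines that match GPU_PATTERN, in original order."""
    return [line for line in log_text.splitlines() if GPU_PATTERN.search(line)]
```

Called on the text of a saved dmesg dump, it returns a list of the
matching lines, which can then be joined with newlines and pasted.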

-- 
https://sleepmap.de