[arch-general] IOMMU spam with kernel 5.15.2.arch1-1 when starting VM

jk jk at vin.ovh
Sun Nov 14 16:59:48 UTC 2021


Hello, after the recent Linux kernel update I started getting a lot of
errors in dmesg when starting a virtual machine that has a GPU passed
through to it. The gist of the error spam is this:

[  +0.000001] DMAR: ERROR: DMA PTE for vPFN 0x813ee8 already set (to 
105eff003 not 105eff003)
[  +0.000002] ------------[ cut here ]------------
[  +0.000001] WARNING: CPU: 8 PID: 1960 at 
drivers/iommu/intel/iommu.c:2381 __domain_mapping.cold+0x32/0x39
[  +0.000001] Modules linked in: vhost_net vhost vhost_iotlb tap tun 
wireguard curve25519_x86_64 libchacha20poly1305 chacha_x86_64 
poly1305_x86_64 libblake2s blake2s_x86_64 libcurve25519_generic 
libchacha libblake2s_generic ip6_udp_tunnel udp_tunnel bridge stp llc 
nft_masq nft_chain_nat nf_nat nft_reject_inet nf_reject_ipv4 
nf_reject_ipv6 nft_reject nft_counter nft_limit nft_ct nf_conntrack 
nf_defrag_ipv6 nf_defrag_ipv4 nf_tables nfnetlink nct6775 hwmon_vid 
snd_hda_codec_realtek snd_hda_codec_generic intel_rapl_msr iTCO_wdt 
intel_pmc_bxt mei_hdcp ee1004 iTCO_vendor_support wmi_bmof 
snd_sof_pci_intel_tgl snd_sof_intel_hda_common soundwire_intel 
soundwire_generic_allocation soundwire_cadence snd_sof_intel_hda 
intel_rapl_common mousedev snd_sof_pci snd_usb_audio snd_sof_xtensa_dsp 
snd_sof snd_usbmidi_lib x86_pkg_temp_thermal joydev snd_soc_hdac_hda 
intel_powerclamp snd_rawmidi coretemp snd_hda_ext_core 
snd_soc_acpi_intel_match snd_seq_device mc snd_soc_acpi kvm_intel 
soundwire_bus
[  +0.000017]  ledtrig_audio kvm snd_soc_core intel_cstate snd_compress 
intel_spi_pci amdgpu intel_spi intel_uncore pcspkr ac97_bus 
snd_hda_codec_hdmi spi_nor snd_pcm_dmaengine mtd snd_hda_intel i2c_i801 
snd_intel_dspcfg i2c_smbus r8169 snd_intel_sdw_acpi snd_hda_codec 
realtek mei_me mdio_devres snd_hda_core gpu_sched libphy mei nouveau 
snd_hwdep snd_pcm snd_timer snd soundcore mxm_wmi vfat drm_ttm_helper 
wmi fat mac_hid acpi_pad acpi_tad kvmfr(OE) fuse ip_tables x_tables ext4 
crc16 mbcache jbd2 usbhid dm_thin_pool dm_persistent_data libcrc32c 
crc32c_generic dm_bio_prison dm_bufio dm_crypt cbc encrypted_keys 
trusted asn1_encoder tee i915 crct10dif_pclmul crc32_pclmul dm_mod 
crc32c_intel ghash_clmulni_intel aesni_intel crypto_simd tpm_crb cryptd 
intel_gtt xhci_pci ttm xhci_pci_renesas tpm_tis tpm_tis_core tpm 
rng_core video vfio_pci vfio_pci_core irqbypass vfio_virqfd 
vfio_iommu_type1 vfio
[  +0.000020] CPU: 8 PID: 1960 Comm: CPU 6/KVM Tainted: G     U W  
OE     5.15.2-arch1-1 #1 e3bfbeb633edc604ba956e06f24d5659e31c294f
[  +0.000001] Hardware name: To Be Filled By O.E.M. To Be Filled By 
O.E.M./Z590 Steel Legend, BIOS P1.80 07/22/2021
[  +0.000000] RIP: 0010:__domain_mapping.cold+0x32/0x39
[  +0.000001] Code: f6 48 c7 c7 80 0f af a0 4c 89 54 24 08 e8 e8 aa fd 
ff 8b 05 49 b1 1d 01 4c 8b 54 24 08 85 c0 74 09 83 e8 01 89 05 37 b1 1d 
01 <0f> 0b e9 2f a5 b8 ff 4c 89 c9 44 89 da 48 c7 c6 30 32 74 a0 48 c7
[  +0.000000] RSP: 0018:ffffb32ac27dfbf0 EFLAGS: 00010246
[  +0.000001] RAX: 0000000000000000 RBX: ffff972a598b5740 RCX: 
0000000000000000
[  +0.000000] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 
0000000000000000
[  +0.000001] RBP: 0000000000000001 R08: 0000000000000000 R09: 
0000000000000000
[  +0.000000] R10: ffff972a02213700 R11: 0000000000000000 R12: 
ffff972a02213700
[  +0.000001] R13: 0000000000105eff R14: 0000000000813ee8 R15: 
0000000105eff003
[  +0.000000] FS:  00007f1f699ff640(0000) GS:ffff97313fc00000(0000) 
knlGS:0000000000000000
[  +0.000001] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  +0.000001] CR2: 00007f1f51bab000 CR3: 000000018a69a002 CR4: 
0000000000772ee0
[  +0.000000] PKRU: 55555554
[  +0.000000] Call Trace:
[  +0.000001]  intel_iommu_map_pages+0xb0/0x100
[  +0.000002]  __iommu_map+0xde/0x2c0
[  +0.000001]  iommu_map+0x41/0x80
[  +0.000001]  vfio_iommu_type1_ioctl+0x85f/0x1670 [vfio_iommu_type1 
1ddc5c5dceb1dace15f0dee1b2f809fe6662c7c4]
[  +0.000003]  __x64_sys_ioctl+0x8b/0xd0
[  +0.000001]  do_syscall_64+0x59/0x90
[  +0.000002]  ? do_user_addr_fault+0x20b/0x6b0
[  +0.000002]  ? exc_page_fault+0x72/0x180
[  +0.000000]  entry_SYSCALL_64_after_hwframe+0x44/0xae
[  +0.000002] RIP: 0033:0x7f218b9d559b
[  +0.000000] Code: ff ff ff 85 c0 79 9b 49 c7 c4 ff ff ff ff 5b 5d 4c 
89 e0 41 5c c3 66 0f 1f 84 00 00 00 00 00 f3 0f 1e fa b8 10 00 00 00 0f 
05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d a5 a8 0c 00 f7 d8 64 89 01 48
[  +0.000001] RSP: 002b:00007f1f699fdfa8 EFLAGS: 00000246 ORIG_RAX: 
0000000000000010
[  +0.000000] RAX: ffffffffffffffda RBX: 000055d7a02181d0 RCX: 
00007f218b9d559b
[  +0.000001] RDX: 00007f1f699fdfb0 RSI: 0000000000003b71 RDI: 
000000000000002b
[  +0.000000] RBP: 0000000812000000 R08: 0000000000000000 R09: 
0000000006000000
[  +0.000001] R10: 0000000812000000 R11: 0000000000000246 R12: 
0000000002000000
[  +0.000000] R13: 0000000812000000 R14: 00007f1f699fdfb0 R15: 
0000000000000000

Here's the full dmesg log, too big for pasting sites: 
https://cloud.vin.ovh/s/3TpSgCgYs9oq9jY
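
To get an overview of a large log like this one, a small script can
extract the DMAR lines and compare the two PTE values in each message.
This is only a rough sketch: the helper names are made up for
illustration, and it assumes that the first value in the message is the
PTE already installed, the second is the one being written, and that
the low two bits are the VT-d read/write permission flags.

```python
import re
from collections import Counter

# Matches the DMAR message even when the mail client wrapped it across lines.
DMAR_RE = re.compile(
    r"DMA PTE for vPFN 0x([0-9a-f]+) already set "
    r"\(to\s+([0-9a-f]+)\s+not\s+([0-9a-f]+)\)"
)

# Assumed VT-d second-level PTE layout: bit 0 = read, bit 1 = write,
# remaining upper bits = 4 KiB-aligned physical address.
PTE_READ = 0x1
PTE_WRITE = 0x2
PAGE_SHIFT = 12


def parse_dmar_errors(text):
    """Return (vpfn, existing_pte, new_pte) integer tuples found in dmesg text."""
    return [tuple(int(g, 16) for g in m.groups()) for m in DMAR_RE.finditer(text)]


def describe_pte(pte):
    """Split a PTE value into physical frame number and permission flags."""
    return {
        "pfn": pte >> PAGE_SHIFT,
        "read": bool(pte & PTE_READ),
        "write": bool(pte & PTE_WRITE),
    }


def summarize(text):
    """Count reported remaps that write an identical vs. a different PTE."""
    counts = Counter()
    for _vpfn, existing, new in parse_dmar_errors(text):
        counts["identical" if existing == new else "different"] += 1
    return counts


# The line quoted above, including the line break added by the mail client.
sample = (
    "[  +0.000001] DMAR: ERROR: DMA PTE for vPFN 0x813ee8 already set (to \n"
    "105eff003 not 105eff003)"
)

if __name__ == "__main__":
    for vpfn, existing, new in parse_dmar_errors(sample):
        print(hex(vpfn), describe_pte(existing), describe_pte(new))
    print(dict(summarize(sample)))
```

For the line quoted above, both values decode to the same frame
(0x105eff, matching R13 in the register dump) with read and write set,
so the warning is about re-installing an identical mapping. To check a
whole log, save the dmesg output to a file and pass its contents to
summarize().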

I looked around for these error messages and found the following
mailing list thread:

https://www.spinics.net/lists/kernel/msg4097048.html

At first a patch was proposed there to silence the error messages, since
they aren't fatal, but after some investigation another patch was
proposed which, I think, actually solves the issue:

https://lists.linuxfoundation.org/pipermail/iommu/2021-October/059955.html

It is quite small, and it seems it didn't get the attention it needed.
I added the patch to the linux PKGBUILD and the errors no longer appear.
Would it be possible to add that patch to the linux package? I don't
really want to rebuild the kernel every time it is updated.
