On 08/27/2015 02:29 PM, Casey Peter wrote:
> I'm running a Gigabyte 970A-D3P, and with the "iommu=soft" kernel parameter set, I don't have those errors either. (I did have them before turning the IOMMU on in the BIOS and setting the kernel parameter.)

I think we are getting somewhere. There is an mce line near the CPU count:

[ 0.000000] Initializing cgroup subsys cpuset
[ 0.000000] Initializing cgroup subsys cpu
[ 0.000000] Initializing cgroup subsys cpuacct
[ 0.000000] smpboot: Allowing 8 CPUs, 0 hotplug CPUs
[ 0.000000] Booting paravirtualized kernel on bare hardware
[ 0.000000] setup_percpu: NR_CPUS:128 nr_cpumask_bits:128 nr_cpu_ids:8 nr_node_ids:1
[ 0.000000] PERCPU: Embedded 33 pages/cpu @ffff88042ec00000 s95576 r8192 d31400 u262144
[ 0.000000] pcpu-alloc: s95576 r8192 d31400 u262144 alloc=1*2097152
[ 0.000000] pcpu-alloc: [0] 0 1 2 3 4 5 6 7
[ 0.000000] Kernel command line: BOOT_IMAGE=/vmlinuz-linux root=UUID=515ef9dc-769f-4548-9a08-3a92fa83d86b rw quiet
[ 0.000000] Memory: 16395952K/16740972K available (5699K kernel code, 893K rwdata, 1732K rodata, 1180K init, 1152K bss, 345020K reserved, 0K cma-reserved)
[ 0.000000] SLUB: HWalign=64, Order=0-3, MinObjects=0, CPUs=8, Nodes=1
[ 0.000000] RCU restricting CPUs from NR_CPUS=128 to nr_cpu_ids=8.
[ 0.000000] RCU: Adjusting geometry for rcu_fanout_leaf=16, nr_cpu_ids=8
[ 0.009332] CPU: Physical Processor ID: 0
[ 0.009333] CPU: Processor Core ID: 0
[ 0.009334] mce: CPU supports 7 MCE banks
[ 0.230921] smpboot: CPU0: AMD FX(tm)-8350 Eight-Core Processor (fam: 15, model: 02, stepping: 00)
[ 0.247684] NMI watchdog: enabled on all CPUs, permanently consumes one hw-PMU counter.
[ 0.254353] .... node #0, CPUs: #1 #2 #3 #4 #5 #6 #7
[ 0.364267] x86: Booted up 1 node, 8 CPUs
[ 0.391139] cpuidle: using governor ladder
[ 0.404490] cpuidle: using governor menu
[ 0.405039] mtrr: your CPUs had inconsistent variable MTRR settings
[ 0.405040] mtrr: probably your BIOS does not setup all CPUs.

I've tried setting "amd_iommu=on" in default/grub. I'll try iommu=soft and report back. Is there anything else to check?

Funny, my IOMMU doesn't seem to trigger any issue:

[ 0.792454] Unpacking initramfs...
[ 0.843735] Freeing initrd memory: 3924K (ffff880037846000 - ffff880037c1b000)
[ 0.844350] AMD-Vi: Found IOMMU at 0000:00:00.2 cap 0x40
[ 0.844351] AMD-Vi: Interrupt remapping enabled
[ 0.855146] AMD-Vi: Lazy IO/TLB flushing enabled

My issue explodes after xhci_hcd:

[ 1.159635] ohci-pci: OHCI PCI platform driver
[ 1.165660] ehci-pci 0000:00:12.2: USB 2.0 started, EHCI 1.00
[ 1.165859] hub 1-0:1.0: USB hub found
[ 1.165868] hub 1-0:1.0: 5 ports detected
[ 1.166060] xhci_hcd 0000:02:00.0: xHCI Host Controller
[ 1.166068] xhci_hcd 0000:02:00.0: new USB bus registered, assigned bus number 2
[ 1.166126] AMD-Vi: Event logged [IO_PAGE_FAULT device=02:00.0 domain=0x0016 address=0x00000000ce9f9880 flags=0x0010]
[ 1.167066] AMD-Vi: Event logged [IO_PAGE_FAULT device=02:00.0 domain=0x0016 address=0x00000000ce9f9880 flags=0x0010]
[ 1.168025] AMD-Vi: Event logged [IO_PAGE_FAULT device=02:00.0 domain=0x0016 address=0x00000000ce9f9880 flags=0x0010]
<snip repeated>
[ 1.202519] AMD-Vi: Event logged [
[ 1.202571] input: AT Translated Set 2 keyboard as /devices/platform/i8042/serio0/input/input0
[ 1.202829] IO_PAGE_FAULT device=02:00.0 domain=0x0016 address=0x00000000ce9f9880 flags=0x0010]
[ 1.202843] AMD-Vi: Event logged [IO_PAGE_FAULT device=02:00.0 domain=0x0016 address=0x00000000ce9f9880 flags=0x0010]
<snip repeated>
[ 1.216256] AMD-Vi: Event logged [
[ 1.216326] firewire_ohci 0000:04:0e.0: added OHCI v1.10 device as card 0, 4 IR + 8 IT contexts, quirks 0x11
[ 1.216547] IO_PAGE_FAULT device=02:00.0 domain=0x0016 address=0x00000000ce9f9880 flags=0x0010]
[ 1.216563] AMD-Vi: Event logged [IO_PAGE_FAULT device=02:00.0 domain=0x0016 address=0x00000000ce9f9880 flags=0x0010]
<snip repeated>
[ 1.716168] firewire_core 0000:04:0e.0: created device fw0: GUID 0014aafc64aa2c00, S400
[ 1.716813] AMD-Vi: Event logged [IO_PAGE_FAULT device=02:00.0 domain=0x0016 address=0x00000000ce9f9880 flags=0x0010]
<snip repeated>
[ 1.932839] tsc: Refined TSC clocksource calibration: 4018.289 MHz
[ 1.932842] clocksource tsc: mask: 0xffffffffffffffff max_cycles: 0x39ebd986d5e, max_idle_ns: 440795317543 ns
[ 1.935061] AMD-Vi: Event logged [IO_PAGE_FAULT device=02:00.0 domain=0x0016 address=0x00000000ce9f9880 flags=0x0010]
<snip repeated>
[ 2.937205] AMD-Vi: Event logged [
[ 2.937208] Switched to clocksource tsc
[ 2.937495] IO_PAGE_FAULT device=02:00.0 domain=0x0016 address=0x00000000ce9f9880 flags=0x0010]
[ 2.941453] AMD-Vi: Event logged [IO_PAGE_FAULT device=02:00.0 domain=0x0016 address=0x00000000ce9f9880 flags=0x0010]
<snip repeated>
[ 20.090108] xhci_hcd 0000:02:00.0: can't setup: -110
[ 20.094746] xhci_hcd 0000:02:00.0: USB bus 2 deregistered
[ 20.094771] ehci-pci 0000:00:13.2: EHCI Host Controller
[ 20.094778] ehci-pci 0000:00:13.2: new USB bus registered, assigned bus number 2
[ 20.094783] ehci-pci 0000:00:13.2: applying AMD SB700/SB800/Hudson-2/3 EHCI dummy qh workaround
[ 20.094791] ehci-pci 0000:00:13.2: debug port 1
[ 20.094796] xhci_hcd 0000:02:00.0: init 0000:02:00.0 fail, -110
[ 20.094837] ehci-pci 0000:00:13.2: irq 17, io mem 0xfe507000
[ 20.099716] xhci_hcd: probe of 0000:02:00.0 failed with error -110
[ 20.104621] ehci-pci 0000:00:13.2: USB 2.0 started, EHCI 1.00
[ 20.104805] hub 2-0:1.0: USB hub found
[ 20.104811] hub 2-0:1.0: 5 ports detected
[ 20.105034] ehci-pci 0000:00:16.2: EHCI Host Controller
[ 20.105039] ehci-pci 0000:00:16.2: new USB bus registered, assigned bus number 3
[ 20.105042] ehci-pci 0000:00:16.2: applying AMD SB700/SB800/Hudson-2/3 EHCI dummy qh workaround
[ 20.105050] ehci-pci 0000:00:16.2: debug port 1
[ 20.105073] ehci-pci 0000:00:16.2: irq 17, io mem 0xfe504000
[ 20.114633] ehci-pci 0000:00:16.2: USB 2.0 started, EHCI 1.00
[ 20.114787] hub 3-0:1.0: USB hub found
[ 20.114794] hub 3-0:1.0: 4 ports detected
[ 20.115031] ohci-pci 0000:00:12.0: OHCI PCI host controller
[ 20.115039] ohci-pci 0000:00:12.0: new USB bus registered, assigned bus number 4
[ 20.115065] ohci-pci 0000:00:12.0: irq 18, io mem 0xfe50a000
[ 20.172168] hub 4-0:1.0: USB hub found
[ 20.172177] hub 4-0:1.0: 5 ports detected
[ 20.172396] ohci-pci 0000:00:13.0: OHCI PCI host controller
[ 20.172401] ohci-pci 0000:00:13.0: new USB bus registered, assigned bus number 5
[ 20.172418] ohci-pci 0000:00:13.0: irq 18, io mem 0xfe508000
[ 20.228880] hub 5-0:1.0: USB hub found
[ 20.228889] hub 5-0:1.0: 5 ports detected
[ 20.229111] ohci-pci 0000:00:14.5: OHCI PCI host controller
[ 20.229117] ohci-pci 0000:00:14.5: new USB bus registered, assigned bus number 6
[ 20.229134] ohci-pci 0000:00:14.5: irq 18, io mem 0xfe506000
[ 20.285567] hub 6-0:1.0: USB hub found
[ 20.285575] hub 6-0:1.0: 2 ports detected
[ 20.285739] ohci-pci 0000:00:16.0: OHCI PCI host controller
[ 20.285744] ohci-pci 0000:00:16.0: new USB bus registered, assigned bus number 7
[ 20.285759] ohci-pci 0000:00:16.0: irq 18, io mem 0xfe505000
<snip boot continues normally>

I'll keep digging, but this has got me stumped.

--
David C. Rankin, J.D.,P.E.
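
P.S. For anyone comparing a boot with and without iommu=soft, a rough Python sketch for tallying the IO_PAGE_FAULT events in saved dmesg text might save some scrolling. The names `summarize_faults` and `SAMPLE` are illustrations only, not part of any existing tool, and the regex simply assumes the event format seen in this thread. It matches on the IO_PAGE_FAULT body rather than the "AMD-Vi: Event logged [" prefix, since that prefix sometimes ends up on a separate line when other messages interleave.

```python
import re
from collections import Counter

# Body of an AMD-Vi IO_PAGE_FAULT event, as it appears in this thread's dmesg.
# The "AMD-Vi: Event logged [" prefix is intentionally not required, because
# interleaved kernel messages can push it onto a different line.
FAULT_RE = re.compile(
    r"IO_PAGE_FAULT device=(?P<device>[0-9a-f:.]+) "
    r"domain=(?P<domain>0x[0-9a-f]+) "
    r"address=(?P<address>0x[0-9a-f]+) "
    r"flags=(?P<flags>0x[0-9a-f]+)"
)


def summarize_faults(dmesg_text):
    """Return a Counter of IO_PAGE_FAULT events keyed by (device, address)."""
    counts = Counter()
    for match in FAULT_RE.finditer(dmesg_text):
        counts[(match["device"], match["address"])] += 1
    return counts


# Illustrative sample: a few lines copied from the log in this message.
SAMPLE = """\
[ 1.166126] AMD-Vi: Event logged [IO_PAGE_FAULT device=02:00.0 domain=0x0016 address=0x00000000ce9f9880 flags=0x0010]
[ 1.202519] AMD-Vi: Event logged [
[ 1.202571] input: AT Translated Set 2 keyboard as /devices/platform/i8042/serio0/input/input0
[ 1.202829] IO_PAGE_FAULT device=02:00.0 domain=0x0016 address=0x00000000ce9f9880 flags=0x0010]
"""

# Print one line per (device, address) pair, most frequent first.
for (device, address), n in summarize_faults(SAMPLE).most_common():
    print(f"device={device} address={address} faults={n}")
```

To run it on a real boot, replace `SAMPLE` with the contents of a saved dmesg file. A single device and address dominating the tally, as 02:00.0 does here, would point at one controller rather than the IOMMU setup as a whole.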