[arch-general] Getting freeze on early start with linux 4.9-1 kernel.

Carsten Mattner carstenmattner at gmail.com
Sun Dec 25 01:33:28 UTC 2016


On Sat, Dec 24, 2016 at 4:33 PM, Mauro Santos
<registo.mailling at gmail.com> wrote:
> On 24-12-2016 14:14, Carsten Mattner wrote:
>> On Fri, Dec 23, 2016 at 3:17 PM, Mauro Santos via arch-general
>> <arch-general at archlinux.org> wrote:
>>> On 23-12-2016 13:58, Carsten Mattner via arch-general wrote:
>>>> On Fri, Dec 23, 2016 at 1:59 PM, fredbezies via arch-general
>>>> <arch-general at archlinux.org> wrote:
>>>>> Hello.
>>>>>
>>>>> I'm facing an annoying bug with the linux 4.9-1 kernel on my 6- or
>>>>> 7-year-old Toshiba laptop. When I try to boot it with the linux 4.9-1
>>>>> kernel, it freezes right after loading the initramfs.
>>>>>
>>>>> 4.8.xx kernel was working flawlessly. My eeePC (nearly 9 years old)
>>>>> and my desktop computer (which is AMD based) are both starting with
>>>>> linux 4.9.
>>>>>
>>>>> I opened a bug : https://bugs.archlinux.org/task/52246
>>>>>
>>>>> Here is my lspci. If someone can help me finding what is happening,
>>>>> I'll be very happy :
>>>>>
>>>>> 00:00.0 Host bridge: Intel Corporation Mobile 4 Series Chipset Memory
>>>>> Controller Hub (rev 07)
>>>>> 00:02.0 VGA compatible controller: Intel Corporation Mobile 4 Series
>>>>> Chipset Integrated Graphics Controller (rev 07)
>>>>> 00:02.1 Display controller: Intel Corporation Mobile 4 Series Chipset
>>>>> Integrated Graphics Controller (rev 07)
>>>>> 00:1a.0 USB controller: Intel Corporation 82801I (ICH9 Family) USB
>>>>> UHCI Controller #4 (rev 03)
>>>>> 00:1a.1 USB controller: Intel Corporation 82801I (ICH9 Family) USB
>>>>> UHCI Controller #5 (rev 03)
>>>>> 00:1a.7 USB controller: Intel Corporation 82801I (ICH9 Family) USB2
>>>>> EHCI Controller #2 (rev 03)
>>>>> 00:1b.0 Audio device: Intel Corporation 82801I (ICH9 Family) HD Audio
>>>>> Controller (rev 03)
>>>>> 00:1c.0 PCI bridge: Intel Corporation 82801I (ICH9 Family) PCI Express
>>>>> Port 1 (rev 03)
>>>>> 00:1c.1 PCI bridge: Intel Corporation 82801I (ICH9 Family) PCI Express
>>>>> Port 2 (rev 03)
>>>>> 00:1c.4 PCI bridge: Intel Corporation 82801I (ICH9 Family) PCI Express
>>>>> Port 5 (rev 03)
>>>>> 00:1d.0 USB controller: Intel Corporation 82801I (ICH9 Family) USB
>>>>> UHCI Controller #1 (rev 03)
>>>>> 00:1d.1 USB controller: Intel Corporation 82801I (ICH9 Family) USB
>>>>> UHCI Controller #2 (rev 03)
>>>>> 00:1d.2 USB controller: Intel Corporation 82801I (ICH9 Family) USB
>>>>> UHCI Controller #3 (rev 03)
>>>>> 00:1d.3 USB controller: Intel Corporation 82801I (ICH9 Family) USB
>>>>> UHCI Controller #6 (rev 03)
>>>>> 00:1d.7 USB controller: Intel Corporation 82801I (ICH9 Family) USB2
>>>>> EHCI Controller #1 (rev 03)
>>>>> 00:1e.0 PCI bridge: Intel Corporation 82801 Mobile PCI Bridge (rev 93)
>>>>> 00:1f.0 ISA bridge: Intel Corporation ICH9M LPC Interface Controller (rev 03)
>>>>> 00:1f.2 SATA controller: Intel Corporation 82801IBM/IEM
>>>>> (ICH9M/ICH9M-E) 4 port SATA Controller [AHCI mode] (rev 03)
>>>>> 00:1f.3 SMBus: Intel Corporation 82801I (ICH9 Family) SMBus Controller (rev 03)
>>>>> 02:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd.
>>>>> RTL8101/2/6E PCI Express Fast/Gigabit Ethernet controller (rev 02)
>>>>> 03:00.0 Ethernet controller: Qualcomm Atheros AR242x / AR542x Wireless
>>>>> Network Adapter (PCI-Express) (rev 01)
>>>>
>>>> Does the fallback boot entry work?
>>>>
>>>> Have you tried reinstalling the kernel?
>>>>
>>>> I wish arch would (like other distros) keep 2 or three old kernel
>>>> versions around because it doesn't take any space to do so
>>>> and works around boot bugs in new kernels.
>>>>
>>>
>>> Care to explain how "doesn't take any space" works? Last time I checked
>>> files do take up space. There is an LTS kernel in the repos, which you
>>> can have installed exactly for things like this.
>>
>> While writing that I knew somebody would read it in strict interpretation mode.
>> s/no space/not enough space in \/boot to matter/
>
> The kernel alone might not take much space, but you have to take into
> account the initramfs images and kernel modules too. All together it
> should amount to over 100MiB per kernel.
>
> What other distros do is recommend a 1GB /boot or changing the
> configuration to reduce the number of older kernels installed[1]. People
> have complained about small libraries needing to be installed as being
> wasteful, at a grand total 100MiB+ for each kernel that would start a
> nice flamewar.
>
>>> There is also the matter of automagic bootloader configuration change to
>>> support that, not to mention people that use efistub to boot their
>>> system, how do you propose to handle that?
>>
>> If you have installed archlinux, it's reasonable to expect that one knows
>> how to configure this.
>>
>
> It is you who said "I wish arch would (like other distros) keep 2 or
> three old kernel versions around" not me. Other distributions
> automagically take care of updating the bootloader configuration, as
> much would be expected of arch.
>
> Some people already have trouble managing to update one kernel properly,
> imagine the chaos it would be with more than one if manual steps were
> involved, not to mention old kernels have _known_ security issues and
> having old stuff around is not the Arch way.

Fair point. My suggestion was, of course, for the default arch kernel
to have at least one alternative version there as a fallback, and I think
one of the older LTS branches would make sense, unless you happen to
require 4.10 for AMD Ryzen or have a similar requirement.

Knowing that there's a 4.1-lts or 3.18-lts entry in the boot menu that
works well enough to access the system and fix any boot issue with 4.9
is invaluable, provided we stop suggesting that people make small /boot
partitions. Unless you're in a constrained environment, it isn't
a big deal to allocate 2G for /boot, and even 1G would be plenty
for one stable kernel and a fallback lts kernel as the
emergency boot option, with room to spare.

I have 4 different kernels, all with two initrds in a 256MB /boot
partition and still over 100MB free. The biggest initrds are the
fallback ones and those are around 20MB.

For one standard arch config kernel:
5MB kernel image (vmlinuz)
8MB initrd
22MB initrd.fallback

5MB + 8MB + 22MB = 35MB per kernel, so two kernels take 35MB * 2 = 70MB.

With a 256MB /boot partition we can have two versions per
variant (linux, linux-old, linux-lts, linux-lts-old) and still be good
in a constrained /boot.

It's less space needed than I thought.
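
As a rough sanity check of the arithmetic above, here is a small Python
sketch. The per-file sizes are the approximate figures quoted above; the
function name and the kernel counts are made up for illustration, and
bootloader files or anything else living in /boot are not counted, so
treat the results as an estimate only.

```python
# Approximate per-kernel file sizes in MB, taken from the figures above.
KERNEL_FILES_MB = {
    "kernel image": 5,
    "initrd": 8,
    "initrd.fallback": 22,
}


def boot_usage_mb(num_kernels, files_mb=KERNEL_FILES_MB):
    """Return the approximate /boot space (in MB) used by num_kernels kernels.

    Hypothetical helper: assumes every kernel ships the same three files
    with the same sizes, which is only roughly true in practice.
    """
    return num_kernels * sum(files_mb.values())


if __name__ == "__main__":
    partition_mb = 256  # the constrained /boot size discussed above
    for n in (1, 2, 4):
        used = boot_usage_mb(n)
        print(f"{n} kernel(s): {used}MB used, {partition_mb - used}MB free")
```

Under those assumptions, four kernels come to about 140MB, which is in
line with having over 100MB free on a 256MB /boot.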

>>>> If this is a regression you will have to post dmesg. If you don't see
>>>> errors/warnings, then kernel developers would usually ask to enable
>>>> debug flags for printing more information during boot.
>>>>
>>>> That said, I have one old machine with a Core2Duo and GM4xx and
>>>> ever since DRM's atomic modesetting was introduced in 4.2, I can
>>>> only use 4.1 warning free. Regressions do happen but you had no
>>>> warnings or errors in 4.8 so yours looks like a different regression.
>>>>
>>>
>>> If you don't report the bugs upstream they don't get fixed, if you have
>>> reported it and no one got around to take a look at it then fine,
>>> otherwise don't be lazy and report those bugs and help get them fixed.
>>
>> I did report it and it's been written off as "why do you care about the new
>> warning/stacktrace?". Given that, I didn't bother trying to convince the
>> DRM devs of its importance, since I don't pay for a RHEL or SLES support
>> contract.
>>
>
> Then you've done your part and no one can fault you, if the drm devs
> don't think it is a problem there isn't much you can do. That said, if
> the only problem is seeing some spam in the output of dmesg I can
> understand why they wouldn't give it top priority.

Xorg wasn't working properly, and seeing stacktraces caused by
atomic modesetting regressions brushed off as warnings to ignore
means either their printk messages are misleading or they (Intel)
don't have a lab with all the supported GPUs. It's one thing to stop
supporting a chip in a driver, but if you claim support, it's important
not to regress. This is why people like RHEL and SLES, and I
cannot blame them, but I think the better solution is a
stable driver ABI, where you limit the security exposure to just one
old driver, which is bad but not as bad as having to run a completely
old kernel rather than just one old driver.

