[arch-general] [RFC] Potentially deprecating primus, bumblebee, virtualGL and primus_vk


Emil Velikov

4 Dec 2020

12:27 p.m.

Hello all,

As some of you may know, I have been an Arch user for over 5 years and a Linux graphics developer working on the whole stack, from the kernel DRM all the way up to Mesa and X.

I would like to propose partially deprecating, and potentially removing, some of the said packages in light of the development that has happened over the last few years.

tl;dr: these solutions have been partially or completely obsolete for a while now.

## OpenGL/GLX/etc

# Background

As you know, in the early days both Mesa and vendors would provide their own libGL.so.1 and, in some cases, libglx.so (xorg module). Alongside that, drivers lacked dynPM, which allows you to power off the GPU when it's not used. To address those, primus and bumblebee came about.

Note: dynPM is mentioned for completeness; it won't be covered any more.

# Since then

Nvidia and open source drivers support client-side glvnd:
 - Mesa drivers - since v17.2.0 (at least), packaged 7 Sep 2017
 - Nvidia drivers - since v390.25 (at least), packaged 29 Jan 2018

... and server-side glvnd:
 - xorg-xserver - since v1.20.6, packaged 23 Nov 2019
 - Nvidia drivers - since v435.21, packaged 30 Aug 2019

## Vulkan

# Background

On the Vulkan side the drivers are clearly separated from the start - libvulkan_$vendor.so. Yet very few applications would bother selecting the GPU, plus server-side support (when using WSI_X11) wasn't complete. GL/Vulkan interop was also tricky, with libGL.so.1 being provided by vendors. This resulted in primus_vk.

# Since then

 - Mesa drivers - since 20.1.0, packaged 28 May 2020
   Driver selection layer was added, falling back to DRI_PRIME [1]
 - Nvidia drivers - since v390.25 (at least), packaged 29 Jan 2018
   Vulkan implicit layer was added - see [2]

## My proposal:

Clean up the packages, as outlined below, and update the wiki. Please check out the Extra notes below. I will be more than happy to tackle this, although maintainers will have to push/rebuild the packages for obvious reasons.

 - primus: replace with a simple compat script primusrun, which a) finds the ID_PATH_TAG for the Nvidia GPU and sets DRI_PRIME=$tag $@
 - bumblebee:
   optirun - replace with script calling primusrun
   xorg config - should not be needed, double-check and drop
   bumblebeed - (optionally) remove no longer applicable code, config etc
   bumblebee - remove no longer applicable group
 - virtualGL: move to AUR, since it has no more users
 - primus_vk: replace with a simple compat script pvkrun akin to primusrun, making use of [1] and [2]

## Extra notes:

 - primus/bumblebee - seemingly abandoned upstream for 4+ years, with Arch maintainers having to carry fixes locally.
 - primus/virtualgl/primus_vk performance is worse than native
 - compat primusrun/optirun/pvkrun are there to minimise breakage
 - some legacy Nvidia drivers may lack support - those drivers live in the AUR, thus a solution for them should also be packaged there.

## Feedback

I would love to hear the input from the respective maintainers and the overall Arch developer base as a whole. Is there a particular use-case that I've missed? How would you like this handled? As mentioned above, I'm happy to submit patches, be that via Flyspray, email, MR or otherwise.

Looking forward to your input
-Emil

[1] https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/1766
[2] http://download.nvidia.com/XFree86/Linux-x86_64/435.17/README/primerenderoff...
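
For illustration, the primusrun compat script described in the proposal might look roughly like the sketch below. It is not the packaged script: the function names are hypothetical, the GPU is assumed to be identifiable by the Nvidia PCI vendor ID (0x10de) in sysfs, and udev is assumed to export ID_PATH_TAG for the DRM card device. Only the DRI_PRIME path from the proposal is covered.

```shell
#!/bin/sh
# Sketch of a compat "primusrun" wrapper (hypothetical, not the packaged script).

# Read `udevadm info --query=property` output on stdin and print the
# value of the first ID_PATH_TAG property, if any.
id_path_tag() {
    sed -n 's/^ID_PATH_TAG=//p' | head -n 1
}

# Print the ID_PATH_TAG of the first DRM card whose PCI vendor ID is
# 0x10de (Nvidia). Returns non-zero when no such card is found.
find_nvidia_tag() {
    for card in /sys/class/drm/card[0-9]*; do
        [ -r "$card/device/vendor" ] || continue
        [ "$(cat "$card/device/vendor")" = "0x10de" ] || continue
        udevadm info --query=property --path="$card" | id_path_tag
        return 0
    done
    return 1
}

# Only act when a command was given, so sourcing the file is harmless.
if [ "$#" -gt 0 ]; then
    tag=$(find_nvidia_tag) && [ -n "$tag" ] || {
        echo "primusrun: no Nvidia GPU with an ID_PATH_TAG found" >&2
        exit 1
    }
    DRI_PRIME=$tag
    export DRI_PRIME
    exec "$@"
fi
```

Saved as primusrun, it would be invoked as `primusrun glxgears`; an optirun compat script could then simply call it.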


Giancarlo Razzolini

4 Dec

12:50 p.m.

On December 4, 2020 at 9:27, Emil Velikov via arch-general wrote:

...

I would love to hear the input from the respective maintainers and the overall Arch developer base as a whole.

As the maintainer for both bumblebee and prime-run, I don't see the need for deprecation, yet. Bumblebee still has some uses, and it also has the appeal of keeping the card completely powered off, something that doesn't happen with prime render offload.

Having said that, I do think bumblebee/primus/primus_vk days are numbered.

Regards,
Giancarlo Razzolini

Lone_Wolf

12:55 p.m.

On 04-12-2020 13:50, Giancarlo Razzolini via arch-general wrote:

...

On December 4, 2020 at 9:27, Emil Velikov via arch-general wrote:

...
I would love to hear the input from the respective maintainers and the overall Arch developer base as a whole.

As the maintainer for both bumblebee and prime-run, I don't see the need for deprecation, yet. Bumblebee still has some uses, and it also has the appeal of keeping the card completely powered off, something that doesn't happen with prime render offload.

Having said that, I do think bumblebee/primus/primus_vk days are numbered.

Regards, Giancarlo Razzolini

For clarity:

Does this affect people without an nvidia card?

Are users with an nvidia card that only use the nouveau kernel module affected?

Lone_Wolf

Emil Velikov

1:06 p.m.

On Fri, 4 Dec 2020 at 12:55, Lone_Wolf <lone_wolf@klaas-de-kat.nl> wrote:

...

On 04-12-2020 13:50, Giancarlo Razzolini via arch-general wrote:

...
On December 4, 2020 at 9:27, Emil Velikov via arch-general wrote:

...
I would love to hear the input from the respective maintainers and the overall Arch developer base as a whole.

As the maintainer for both bumblebee and prime-run, I don't see the need for deprecation, yet. Bumblebee still has some uses, and it also has the appeal of keeping the card completely powered off, something that doesn't happen with prime render offload.

Having said that, I do think bumblebee/primus/primus_vk days are numbered.

Regards, Giancarlo Razzolini

For clarity :

Does this affect people without an nvidia card ?

Are users with an nvidia card that only use nouveau kernel module affected ?

There should be no user visible changes with my proposal - both GL and VK should work as normal. The power management side of things is completely unchanged. -Emil

Archange

12 Jan

3:07 p.m.

On 04/12/2020 at 14:06, Emil Velikov via arch-general wrote:

...

On Fri, 4 Dec 2020 at 12:55, Lone_Wolf <lone_wolf@klaas-de-kat.nl> wrote:

...
On 04-12-2020 13:50, Giancarlo Razzolini via arch-general wrote:

...
On December 4, 2020 at 9:27, Emil Velikov via arch-general wrote:

...
I would love to hear the input from the respective maintainers and the overall Arch developer base as a whole.

As the maintainer for both bumblebee and prime-run, I don't see the need for deprecation, yet. Bumblebee still has some uses, and it also has the appeal of keeping the card completely powered off, something that doesn't happen with prime render offload.

Having said that, I do think bumblebee/primus/primus_vk days are numbered.

Regards, Giancarlo Razzolini

For clarity :

Does this affect people without an nvidia card ?

Are users with an nvidia card that only use nouveau kernel module affected ?

There should be no user visible changes with my proposal - both GL and VK should work as normal. The power management side of things is completely unchanged.

Regarding that last statement, I’m not sure. Can you confirm that you can unload the nvidia modules in this configuration (using PRIME offloading with the proprietary driver through nvidia-prime)?

If so, we are definitely willing to integrate this in Bumblebee upstream, so that people with pre-Turing cards or pre-Coffee Lake platforms can still enjoy power management while finally getting the full power from their card.

Regards,
Bruno/Archange

Emil Velikov

5:33 p.m.

On Tue, 12 Jan 2021 at 15:07, Archange <archange@archlinux.org> wrote:

...

On 04/12/2020 at 14:06, Emil Velikov via arch-general wrote:

...
On Fri, 4 Dec 2020 at 12:55, Lone_Wolf <lone_wolf@klaas-de-kat.nl> wrote:

...
On 04-12-2020 13:50, Giancarlo Razzolini via arch-general wrote:

...
On December 4, 2020 at 9:27, Emil Velikov via arch-general wrote:

...
I would love to hear the input from the respective maintainers and the overall Arch developer base as a whole.

As the maintainer for both bumblebee and prime-run, I don't see the need for deprecation, yet. Bumblebee still has some uses, and it also has the appeal of keeping the card completely powered off, something that doesn't happen with prime render offload.

Having said that, I do think bumblebee/primus/primus_vk days are numbered.

Regards, Giancarlo Razzolini

For clarity :

Does this affect people without an nvidia card ?

Are users with an nvidia card that only use nouveau kernel module affected ?

There should be no user visible changes with my proposal - both GL and VK should work as normal. The power management side of things is completely unchanged.

Regarding that last statement I’m not sure. Can you confirm that you can unload the nvidia modules in this configuration (using PRIME offloading with the proprietary driver through nvidia-prime)?

Should work fine, although I cannot try it at the moment on my Intel/Nvidia box. Did you try it and see issues, or is there something in particular that causes doubt?

...

If so, we are definitely willing to integrate this in Bumblebee upstream, so that people with pre-Turing cards or pre-Coffee Lake platforms can still enjoy power management while finally getting the full power from their card.

With all respect to the Bumblebee project and its developers, I think the project is dead. I love the upstream-first mentality, but with 7 local patches in Arch and the last (merged) MR upstream from 2018, I'm not too hopeful.

-Emil

Archange

5:53 p.m.

On 12/01/2021 at 18:33, Emil Velikov wrote:

...

On Tue, 12 Jan 2021 at 15:07, Archange <archange@archlinux.org> wrote:

...
On 04/12/2020 at 14:06, Emil Velikov via arch-general wrote:

...
On Fri, 4 Dec 2020 at 12:55, Lone_Wolf <lone_wolf@klaas-de-kat.nl> wrote:

...
On 04-12-2020 13:50, Giancarlo Razzolini via arch-general wrote:

...
On December 4, 2020 at 9:27, Emil Velikov via arch-general wrote:

...
I would love to hear the input from the respective maintainers and the overall Arch developer base as a whole.

As the maintainer for both bumblebee and prime-run, I don't see the need for deprecation, yet. Bumblebee still has some uses, and it also has the appeal of keeping the card completely powered off, something that doesn't happen with prime render offload.

Having said that, I do think bumblebee/primus/primus_vk days are numbered.

Regards, Giancarlo Razzolini

For clarity :

Does this affect people without an nvidia card ?

Are users with an nvidia card that only use nouveau kernel module affected ?

There should be no user visible changes with my proposal - both GL and VK should work as normal. The power management side of things is completely unchanged.

Regarding that last statement I’m not sure. Can you confirm that you can unload the nvidia modules in this configuration (using PRIME offloading with the proprietary driver through nvidia-prime)?

Should work fine, although I cannot try it at the moment on my Intel/Nvidia box. Did you try it and see issues, or is there something in particular that causes doubt?

I have no laptop at hand to try, but I can ask a friend to check, maybe tomorrow. I seem to remember that for the proprietary driver to work with PRIME, it had to be loaded before Xorg and could not be unloaded afterwards, which would prevent Bumblebee/bbswitch PM from working.

...

...
If so, we are definitely willing to integrate this in Bumblebee upstream, so that people with pre-Turing cards or pre-Coffee Lake platforms can still enjoy power management while finally getting the full power from their card.

With all respect to the Bumblebee project and its developers, I think the project is dead.

Kind of yes, mostly because we all lost interest in this (no affected laptops anymore) and time to work on it (as it moved to low priority).

...

I love the upstream-first mentality, but with 7 local patches in Arch and the last (merged) MR upstream from 2018 I'm not too hopeful.

Well, look again: we just merged all the patches Debian was carrying on top of the develop branch, while most Arch patches have been part of the develop branch for years. I have been supposed to handle the release of 4.0 for years, but that never happened. However, with the recent pushes around Optimus, it might be worth having a look at it and finally releasing that, maybe with a new transport way better than primus or virtualgl (that should have been gone for years…).

Regards,
Bruno/Archange

Emil Velikov

6:15 p.m.

On Tue, 12 Jan 2021 at 17:53, Archange <archange@archlinux.org> wrote:

...

On 12/01/2021 at 18:33, Emil Velikov wrote:

...
On Tue, 12 Jan 2021 at 15:07, Archange <archange@archlinux.org> wrote:

...
On 04/12/2020 at 14:06, Emil Velikov via arch-general wrote:

...
On Fri, 4 Dec 2020 at 12:55, Lone_Wolf <lone_wolf@klaas-de-kat.nl> wrote:

...
On 04-12-2020 13:50, Giancarlo Razzolini via arch-general wrote:

...
On December 4, 2020 at 9:27, Emil Velikov via arch-general wrote:

I would love to hear the input from the respective maintainers and the overall Arch developer base as a whole.

As the maintainer for both bumblebee and prime-run, I don't see the need for deprecation, yet. Bumblebee still has some uses, and it also has the appeal of keeping the card completely powered off, something that doesn't happen with prime render offload.

Having said that, I do think bumblebee/primus/primus_vk days are numbered.

Regards, Giancarlo Razzolini

For clarity :

Does this affect people without an nvidia card ?

Are users with an nvidia card that only use nouveau kernel module affected ?

There should be no user visible changes with my proposal - both GL and VK should work as normal. The power management side of things is completely unchanged.

Regarding that last statement I’m not sure. Can you confirm that you can unload the nvidia modules in this configuration (using PRIME offloading with the proprietary driver through nvidia-prime)?

Should work fine, although I cannot try it at the moment on my Intel/Nvidia box. Did you try it and see issues, or is there something in particular that causes doubt?

I have no laptop at hand to try, but I can ask a friend to check, maybe tomorrow. I seem to remember that for the proprietary driver to work with PRIME, it had to be loaded before Xorg and could not be unloaded afterwards, which would prevent Bumblebee/bbswitch PM from working.

...
...
If so, we are definitely willing to integrate this in Bumblebee upstream, so that people with pre-Turing cards or pre-Coffee Lake platforms can still enjoy power management while finally getting the full power from their card.

With all respect to the Bumblebee project and its developers, I think the project is dead.

Kind of yes, mostly because we all lost interest in this (no affected laptops anymore) and time to work on it (as it moved to low priority).

Last time I looked X supported "hotplugged" GPUs, so unloading the driver (userspace and kernel) should be doable.

...

...
I love the upstream-first mentality, but with 7 local patches in Arch and the last (merged) MR upstream from 2018 I'm not too hopeful.

Well, look again: we just merged all the patches Debian was carrying on top of the develop branch, while most Arch patches have been part of the develop branch for years. I have been supposed to handle the release of 4.0 for years, but that never happened. However, with the recent pushes around Optimus, it might be worth having a look at it and finally releasing that, maybe with a new transport way better than primus or virtualgl (that should have been gone for years…).

Indeed, the develop branch has 2 build warning fixes and the manpages are moved. I cannot see many of the local Arch patches merged, though?

AFAICT, regardless of the transport, I presume that GLVND performance will be superior. So I would urge you to report (or even fix) any of the GLVND/PRIME issues that you see.

-Emil

Archange

6:24 p.m.

On 12/01/2021 at 19:15, Emil Velikov wrote:

...

On Tue, 12 Jan 2021 at 17:53, Archange <archange@archlinux.org> wrote:

...
On 12/01/2021 at 18:33, Emil Velikov wrote:

...
On Tue, 12 Jan 2021 at 15:07, Archange <archange@archlinux.org> wrote:

...
On 04/12/2020 at 14:06, Emil Velikov via arch-general wrote:

...
On Fri, 4 Dec 2020 at 12:55, Lone_Wolf <lone_wolf@klaas-de-kat.nl> wrote:

...
On 04-12-2020 13:50, Giancarlo Razzolini via arch-general wrote:

On December 4, 2020 at 9:27, Emil Velikov via arch-general wrote:

I would love to hear the input from the respective maintainers and the overall Arch developer base as a whole.

As the maintainer for both bumblebee and prime-run, I don't see the need for deprecation, yet. Bumblebee still has some uses, and it also has the appeal of keeping the card completely powered off, something that doesn't happen with prime render offload.

Having said that, I do think bumblebee/primus/primus_vk days are numbered.

Regards, Giancarlo Razzolini

For clarity :

Does this affect people without an nvidia card ?

Are users with an nvidia card that only use nouveau kernel module affected ?

There should be no user visible changes with my proposal - both GL and VK should work as normal. The power management side of things is completely unchanged.

Regarding that last statement I’m not sure. Can you confirm that you can unload the nvidia modules in this configuration (using PRIME offloading with the proprietary driver through nvidia-prime)?

Should work fine, although I cannot try it at the moment on my Intel/Nvidia box. Did you try it and see issues, or is there something in particular that causes doubt?

I have no laptop at hand to try, but I can ask a friend to check, maybe tomorrow. I seem to remember that for the proprietary driver to work with PRIME, it had to be loaded before Xorg and could not be unloaded afterwards, which would prevent Bumblebee/bbswitch PM from working.

...
...
If so, we are definitely willing to integrate this in Bumblebee upstream, so that people with pre-Turing cards or pre-Coffee Lake platforms can still enjoy power management while finally getting the full power from their card.

With all respect to the Bumblebee project and its developers, I think the project is dead.

Kind of yes, mostly because we all lost interest in this (no affected laptops anymore) and time to work on it (as it moved to low priority).

Last time I looked X supported "hotplugged" GPUs, so unloading the driver (userspace and kernel) should be doable.

OK, will try to check tomorrow with a friend that has an Optimus laptop.

...

...
...
I love the upstream-first mentality, but with 7 local patches in Arch and the last (merged) MR upstream from 2018, I'm not too hopeful.

Well, look again: we just merged all the patches Debian was carrying on top of the develop branch, while most Arch patches have been part of the develop branch for years. I have been supposed to handle the release of 4.0 for years, but that never happened. However, with the recent pushes around Optimus, it might be worth having a look at it and finally releasing that, maybe with a new transport way better than primus or virtualgl (that should have been gone for years…).

Indeed the develop branch has 2 build warning fixes and the manpages are moved. Cannot see many of the local Arch patches merged though?

They are not local Arch patches; they are old commits already in the develop branch (from ~2014 or 2015 IIRC). The exception is the GLVND one, which is an upstream PR but not merged (Luca is not sure whether that’s still required). In fact, there are ~60 unreleased commits upstream that were supposed to be part of a 4.0 release long ago, most of which have been used downstream (Debian, Arch…) for a long time now.

...

AFAICT regardless of the transport, I presume that GLVND performance will be superior. So I would urge you to report (or even fix) any of the GLVND/PRIME issues that you see.

By transport, I mean the use of PRIME offloading vs primus call interception or VirtualGL transport.

Bruno/Archange

Emil Velikov

8:45 p.m.

On Tue, 12 Jan 2021 at 18:24, Archange <archange@archlinux.org> wrote:

...

OK, will try to check tomorrow with a friend that has an Optimus laptop.

Perfect, thanks in advance.

To trigger the event one has to echo into a sysfs file... I don't recall exactly; some of the following should be it.

a) If the nvidia module is driving an fbcon:
 - echo "0" > /sys/class/vtconsole/vtconX/bind

b) and/or a combination of the following:
 - echo "remove" (or was it "unbind") > /sys/class/drm/cardX/device/uevent
 - echo "remove" (or was it "unbind") > /sys/class/drm/cardX/device/driver/uevent
 - echo "1" > /sys/class/drm/cardX/device/driver/unbind

...

By transport, I mean use of PRIME offloading vs primus calls interception or VirtualGL transport.

Didn't think that by transport you meant PRIME - silly me. Thanks Emil

Archange

13 Jan

4:52 p.m.

On 12/01/2021 at 21:45, Emil Velikov wrote:

...

On Tue, 12 Jan 2021 at 18:24, Archange <archange@archlinux.org> wrote:

...
OK, will try to check tomorrow with a friend that has an Optimus laptop.

Perfect, thanks in advance.

To trigger the event one has to echo into a sysfs file... Don't recall exactly, some of the following should be it. a) If the nvidia module is driving an fbcon: - echo "0" > /sys/class/vtconsole/vtconX/bind b) and/or a combination of the following - echo "remove" (or was it "unbind") > /sys/class/drm/cardX/device/uevent - echo "remove" (or was it "unbind") > /sys/class/drm/cardX/device/driver/uevent - echo "1" > /sys/class/drm/cardX/device/driver/unbind

Bad news, as expected:

1) If the driver is not loaded before X, the card is not taken into account. However, unbinding/rebinding works, and so does PM once the card is unbound and power is set to auto on the PCI link.

2) If the driver is loaded before X, it works for PRIME. However, the card refuses to unbind because it’s in use by X, even if nothing is actually running on it.

So unless I missed something, power management is not achievable for pre-Turing cards or on pre-Coffee Lake platforms when using PRIME offloading. At this point I can’t see how changing Bumblebee would work for every case. The fact is that there is still no good solution for older platforms that delivers both PM and performance.

Regards,
Bruno/Archange
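
As an aside, the "power set to auto" state mentioned in case 1) can be inspected through the kernel's runtime PM files in sysfs. The helper below is only a sketch: the function name is hypothetical, and the PCI address in the usage note is an example that will differ per machine.

```shell
#!/bin/sh
# Print the runtime PM policy and state of a PCI device from sysfs.
# $1: PCI address of the device
# $2: sysfs root, overridable for testing (defaults to /sys)
pci_pm_status() {
    dev="${2:-/sys}/bus/pci/devices/$1"
    [ -d "$dev" ] || { echo "no such device: $1" >&2; return 1; }
    # power/control holds "auto" or "on"; runtime_status holds the current state.
    control=$(cat "$dev/power/control" 2>/dev/null || echo unknown)
    status=$(cat "$dev/power/runtime_status" 2>/dev/null || echo unknown)
    echo "$1 control=$control runtime_status=$status"
}
```

For example, `pci_pm_status 0000:01:00.0` reports the current policy; writing `auto` to the device's power/control file (as root) is what lets the kernel suspend it when idle.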

Emil Velikov

14 Jan

6:15 p.m.

On Wed, 13 Jan 2021 at 16:52, Archange <archange@archlinux.org> wrote:

...

On 12/01/2021 at 21:45, Emil Velikov wrote:

...
On Tue, 12 Jan 2021 at 18:24, Archange <archange@archlinux.org> wrote:

...
OK, will try to check tomorrow with a friend that has an Optimus laptop.

Perfect, thanks in advance.

To trigger the event one has to echo into a sysfs file... Don't recall exactly, some of the following should be it. a) If the nvidia module is driving an fbcon: - echo "0" > /sys/class/vtconsole/vtconX/bind b) and/or a combination of the following - echo "remove" (or was it "unbind") > /sys/class/drm/cardX/device/uevent - echo "remove" (or was it "unbind") > /sys/class/drm/cardX/device/driver/uevent - echo "1" > /sys/class/drm/cardX/device/driver/unbind

...

2) If the driver is loaded before X, it works for PRIME. However, the card refuses to unbind because it’s in use by X, even if nothing is actually running on it.

It just works here, so I suspect that a command or the order was off.

My setup:
 - Intel/Nvidia system, monitor plugged/powered by Intel GPU
 - Therefore fb0 is for Intel (alongside card0), and Nvidia does not have an fb device, only card1.

What I've done:
 - Toggle render offload ON via xrandr, try glxgears
 - Toggle render offload OFF via xrandr
 - Issue the removal - echo "remove" > /sys/class/drm/card1/uevent
 - Confirm that it works - Xorg.0.log should list "removing GPU device ...blabla/card1"
 - No Nvidia fb - skipping the vtcon magic
 - Double-check nothing else is using the module - lsmod | grep nvidia_drm -> returns 0
 - Remove the nvidia module(s) - rmmod nvidia_drm (+ rest)

In practice, the above approach should work with reverse PRIME, but honestly I haven't tried.

Can you give it a try, step by step, and let me know of any issues you encounter along the way?

Thanks
Emil
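
The removal and module check steps above could be sketched as the dry-run helpers below. The function names are hypothetical; nothing is written to sysfs or unloaded, the commands are only printed, and card1 and nvidia_drm are taken from the setup described in this message.

```shell
#!/bin/sh
# Print the "Used by" count for module $1, reading lsmod-style output on stdin.
# Exits non-zero if the module is not listed.
module_use_count() {
    awk -v m="$1" '$1 == m { print $3; found = 1 } END { exit !found }'
}

# Print (do not execute) the removal steps for DRM card $1 and module $2.
print_removal_steps() {
    echo "echo remove > /sys/class/drm/$1/uevent"
    echo "lsmod | grep $2"
    echo "rmmod $2"
}
```

On a live system, `lsmod | module_use_count nvidia_drm` should print 0 before the rmmod step is attempted; `print_removal_steps card1 nvidia_drm` lists the commands to run as root.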

Emil Velikov

6:41 p.m.

On Thu, 14 Jan 2021 at 18:15, Emil Velikov <emil.l.velikov@gmail.com> wrote:

...

On Wed, 13 Jan 2021 at 16:52, Archange <archange@archlinux.org> wrote:

...
On 12/01/2021 at 21:45, Emil Velikov wrote:

...
On Tue, 12 Jan 2021 at 18:24, Archange <archange@archlinux.org> wrote:

...
OK, will try to check tomorrow with a friend that has an Optimus laptop.

Perfect, thanks in advance.

To trigger the event one has to echo into a sysfs file... Don't recall exactly, some of the following should be it. a) If the nvidia module is driving an fbcon: - echo "0" > /sys/class/vtconsole/vtconX/bind b) and/or a combination of the following - echo "remove" (or was it "unbind") > /sys/class/drm/cardX/device/uevent - echo "remove" (or was it "unbind") > /sys/class/drm/cardX/device/driver/uevent - echo "1" > /sys/class/drm/cardX/device/driver/unbind

...
2) If the driver is loaded before X, it works for PRIME. However, the card refuses to unbind because it’s in use by X, even if nothing is actually running on it.

It just works here so I suspect that a command or the order was off.

My setup: - Intel/Nvidia system, monitor plugged/powered by Intel GPU - Therefore fb0 is for Intel (alongside card0) and Nvidia does not have fb device only card1.

What I've done: - Toggle render offload ON via xrandr, try glxgears - Toggle render offload OFF via xrandr - Issue the removal - echo "remove" > /sys/class/drm/card1/uevent - Confirm that it works - Xorg.0.log should list "removing GPU device ...blabla/card1" - No Nvidia fb - skipping the vtcon magic - Double-check nothing else is using the module - lsmod | grep nvidia_drm -> returns 0 - Remove the nvidia module(s) - rmmod nvidia_drm (+ rest)

In practise the above approach should work with reverse PRIME but honestly I haven't tried.

Can you give it a try, step-by-step and let me know if any issues you encounter along the way.

Completely forgot: there is no need for an Optimus machine. The above is applicable to any multi-GPU setup - dual AMD, dual Nvidia or any random 3+ combination. The only hardcoded assumption of a) dual and b) Optimus machine seems** to be in BB.

-Emil

** From a quick read through the code a while ago.

Archange

15 Jan

11:16 p.m.

On 14/01/2021 at 19:15, Emil Velikov wrote:

...

On Wed, 13 Jan 2021 at 16:52, Archange <archange@archlinux.org> wrote:

...
On 12/01/2021 at 21:45, Emil Velikov wrote:

...
On Tue, 12 Jan 2021 at 18:24, Archange <archange@archlinux.org> wrote:

...
OK, will try to check tomorrow with a friend that has an Optimus laptop.

Perfect, thanks in advance.

To trigger the event one has to echo into a sysfs file... Don't recall exactly, some of the following should be it. a) If the nvidia module is driving an fbcon: - echo "0" > /sys/class/vtconsole/vtconX/bind b) and/or a combination of the following - echo "remove" (or was it "unbind") > /sys/class/drm/cardX/device/uevent - echo "remove" (or was it "unbind") > /sys/class/drm/cardX/device/driver/uevent - echo "1" > /sys/class/drm/cardX/device/driver/unbind

2) If the driver is loaded before X, it works for PRIME. However, the card refuses to unbind because it’s in use by X, even if nothing is actually running on it.

It just works here so I suspect that a command or the order was off.

My setup: - Intel/Nvidia system, monitor plugged/powered by Intel GPU - Therefore fb0 is for Intel (alongside card0) and Nvidia does not have fb device only card1.

Same on the laptop of my friend.

...

What I've done: - Toggle render offload ON via xrandr, try glxgears - Toggle render offload OFF via xrandr

Not sure how I’m supposed to do that, didn’t find any reference online.

...

- Issue the removal - echo "remove" > /sys/class/drm/card1/uevent

This segfaults Xorg right away.

- Confirm that it works - Xorg.0.log should list "removing GPU device ...blabla/card1"

It does appear, with the segfault:

[117777.574] (II) config/udev: removing GPU device /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0/drm/card1 /dev/dri/card1
[117777.575] xf86: remove device 0 /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0/drm/card1
[117777.593] (II) UnloadModule: "nvidia"
[117777.593] (II) UnloadSubModule: "glxserver_nvidia"
[117777.593] (II) Unloading glxserver_nvidia
[117777.593] (II) UnloadSubModule: "wfb"
[117777.593] (II) UnloadSubModule: "fb"
[117777.594] (EE)
[117777.594] (EE) Backtrace:
[117777.594] (EE) 0: /usr/lib/Xorg (xorg_backtrace+0x53) [0x55efd7ad1f63]
[117777.594] (EE) 1: /usr/lib/Xorg (0x55efd798b000+0x151da5) [0x55efd7adcda5]
[117777.594] (EE) 2: /usr/lib/libc.so.6 (0x7f105e9f1000+0x3d6a0) [0x7f105ea2e6a0]
[117777.595] (EE) 3: /usr/lib/Xorg (RRTellChanged+0x969) [0x55efd7a3e3d9]
[117777.595] (EE) 4: /usr/lib/Xorg (0x55efd798b000+0x1bb9d4) [0x55efd7b469d4]
[117777.595] (EE) 5: /usr/lib/Xorg (0x55efd798b000+0x1bbb28) [0x55efd7b46b28]
[117777.595] (EE) 6: /usr/lib/Xorg (0x55efd798b000+0x14a331) [0x55efd7ad5331]
[117777.595] (EE) 7: /usr/lib/Xorg (WaitForSomething+0x250) [0x55efd7ad0c00]
[117777.595] (EE) 8: /usr/lib/Xorg (0x55efd798b000+0x39914) [0x55efd79c4914]
[117777.595] (EE) 9: /usr/lib/libc.so.6 (__libc_start_main+0xf2) [0x7f105ea19152]
[117777.595] (EE) 10: /usr/lib/Xorg (_start+0x2e) [0x55efd79c55de]
[117777.595] (EE)
[117777.595] (EE) Segmentation fault at address 0x34
[117777.595] (EE) Fatal server error:
[117777.595] (EE) Caught signal 11 (Segmentation fault). Server aborting
[117777.595] (EE)
[117777.595] (EE) Please consult the The X.Org Foundation support at http://wiki.x.org for help.
[117777.595] (EE) Please also check the log file at "/var/log/Xorg.0.log" for additional information.
[117777.595] (EE)
[117777.595] (II) AIGLX: Suspending AIGLX clients for VT switch
[117777.682] (EE) Server terminated with error (1). Closing log file

Maybe I missed a fb? Not sure…

I might try again tomorrow but really not sure, I have a plane to catch and my friend might not be available early enough.

Regards,
Bruno

Emil Velikov

16 Jan

3:30 p.m.

On Fri, 15 Jan 2021 at 23:16, Archange <archange@archlinux.org> wrote:

...

On 14/01/2021 at 19:15, Emil Velikov wrote:

...
On Wed, 13 Jan 2021 at 16:52, Archange <archange@archlinux.org> wrote:

...
On 12/01/2021 at 21:45, Emil Velikov wrote:

...
On Tue, 12 Jan 2021 at 18:24, Archange <archange@archlinux.org> wrote:

...
OK, will try to check tomorrow with a friend that has an Optimus laptop.

Perfect, thanks in advance.

To trigger the event one has to echo into a sysfs file... Don't recall exactly, some of the following should be it. a) If the nvidia module is driving an fbcon: - echo "0" > /sys/class/vtconsole/vtconX/bind b) and/or a combination of the following - echo "remove" (or was it "unbind") > /sys/class/drm/cardX/device/uevent - echo "remove" (or was it "unbind") > /sys/class/drm/cardX/device/driver/uevent - echo "1" > /sys/class/drm/cardX/device/driver/unbind

2) If the driver is loaded before X, it works for PRIME. However, the card refuses to unbind because it’s in use by X, even if nothing is actually running on it.

It just works here so I suspect that a command or the order was off.

My setup: - Intel/Nvidia system, monitor plugged/powered by Intel GPU - Therefore fb0 is for Intel (alongside card0) and Nvidia does not have fb device only card1.

Same on the laptop of my friend.

...
What I've done: - Toggle render offload ON via xrandr, try glxgears - Toggle render offload OFF via xrandr

Not sure how I’m supposed to do that, didn’t find any reference online.

Oops sorry about that - disabling it is a hard requirement, actually. The reason why it's barely mentioned, is that people (tend to) use FOSS drivers, where rmmod isn't needed ;-)

...

...
- Issue the removal - echo "remove" > /sys/class/drm/card1/uevent This segfaults Xorg right away. - Confirm that it works - Xorg.0.log should list "removing GPU device ...blabla/card1"

It does appear, with the segfault:

Did you disable the offload as mentioned above? You may also want to close any apps you've started while offload was enabled.

To put it otherwise:
- xrandr enable offload
- X and newly started programs start using the other GPU
- echo "remove" > .../uevent sends an event to userspace that the GPU is being removed

Most programs do not handle GPU removal, so they crash. Therefore, using "remove" w/o disabling offload is akin to (physically) unplugging the GPU while running optirun $app. A scenario which I suspect is not supported with bumblebee.

...

Maybe I missed a fb? Not sure…

Seems like I should add some information there. Our systems usually have 8 VTs. On one of those we have X (intentionally ignoring Wayland/Weston and multiple X instances here) and on the rest we have vtcon (aka the VT framebuffer console). Even with X completely killed, a kernel module can still be in use by vtcon (you usually have one if a screen is physically connected to the GPU), thus rmmod will fail. You will see the reference counter in `lsmod | grep driver` show 1.
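The same reference counter can be read from sysfs, which is easier to script than parsing lsmod. A small sketch, assuming the kernel exposes /sys/module/<name>/refcnt (it does when module unloading is enabled); the function names and the list of Nvidia module names are assumptions, not something stated in this thread:

```shell
#!/bin/sh
# Print the reference count of a kernel module, or fail if it is not loaded.
# The second argument overrides the sysfs root, which is only useful for testing.
module_refcnt() {
    mod="$1"
    root="${2:-/sys/module}"
    if [ -r "$root/$mod/refcnt" ]; then
        cat "$root/$mod/refcnt"
    else
        echo "$mod: not loaded (or refcnt unavailable)" >&2
        return 1
    fi
}

# Report the counts for the proprietary Nvidia stack.
# The module names are an assumption; adjust them to what lsmod shows.
report_nvidia_refcnts() {
    for mod in nvidia_drm nvidia_modeset nvidia_uvm nvidia; do
        count=$(module_refcnt "$mod" "${1:-/sys/module}" 2>/dev/null) || continue
        printf '%s %s\n' "$mod" "$count"
    done
}
```

A non-zero count for the module that drives the GPU means rmmod will refuse to unload it until whatever holds the reference (X, vtcon, a dependent module) lets go.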

...

I might try again tomorrow but really not sure, I have a plane to catch and my friend might not be available early enough.

Sure, please don't miss your plane. Everything else can happen in due time. Thanks Emil

Giancarlo Razzolini

13 Jan

2:40 p.m.

Em janeiro 12, 2021 15:24 Archange via arch-general escreveu:

...

OK, will try to check tomorrow with a friend that has an Optimus laptop.

If upstream is willing to pick up these changes, I'll wait on that, instead of making the proposed changes as of now. Regards, Giancarlo Razzolini

Emil Velikov

14 Jan

7:01 p.m.

On Wed, 13 Jan 2021 at 14:40, Giancarlo Razzolini <grazzolini@archlinux.org> wrote:

...

Em janeiro 12, 2021 15:24 Archange via arch-general escreveu:

...
OK, will try to check tomorrow with a friend that has an Optimus laptop.

If upstream is willing to pick up these changes, I'll wait on that, instead of making the proposed changes as of now.

I'm not sure I follow - how is upstream supposed to "pick" those up? These are packaging related and upstream doesn't do packaging after all.

The only non-packaging piece is the optirun script. I guess one can teach BB about a "glvnd" bridge. In that case, the daemon stays to do power management and the executable correctly sets DRI_PRIME. None of the fancy Xorg instances or Xorg config management.

In the primus (primus_vk*) case they are completely removed in favour of a compat script, since I don't want to annoy people. Although if you prefer I can omit the compatibility scripts altogether :-P

Thanks
Emil

*Realised that I did not open a bug about primus_vk, but it's virtually identical to primus.

Emil Velikov

4 Feb

8:48 p.m.

On Thu, 14 Jan 2021 at 19:01, Emil Velikov <emil.l.velikov@gmail.com> wrote:

...

On Wed, 13 Jan 2021 at 14:40, Giancarlo Razzolini <grazzolini@archlinux.org> wrote:

...
Em janeiro 12, 2021 15:24 Archange via arch-general escreveu:

...
OK, will try to check tomorrow with a friend that has an Optimus laptop.

If upstream is willing to pick up these changes, I'll wait on that, instead of making the proposed changes as of now.

I'm not sure I follow - how is upstream supposed to "pick" those up? These are packaging related and upstream doesn't do packaging after all.

The only non-packaging piece is the optirun script. I guess one can teach BB about "glvnd" bridge. In that case, the daemon stays to do power management and the executable correctly sets DRI_PRIME. None of the fancy Xorg instances or Xorg config management.

In the primus (primus_vk*) case they are completely removed in favour of a compat script, since I don't want to annoy people. Although if you prefer I can omit the compatibility scripts altogether :-P

Humble poke, can you elaborate what you have in mind? Thanks Emil

Emil Velikov

16 Feb

11:40 a.m.

On Thu, 4 Feb 2021 at 20:48, Emil Velikov <emil.l.velikov@gmail.com> wrote:

...

On Thu, 14 Jan 2021 at 19:01, Emil Velikov <emil.l.velikov@gmail.com> wrote:

...
On Wed, 13 Jan 2021 at 14:40, Giancarlo Razzolini <grazzolini@archlinux.org> wrote:

...
Em janeiro 12, 2021 15:24 Archange via arch-general escreveu:

...
OK, will try to check tomorrow with a friend that has an Optimus laptop.

If upstream is willing to pick up these changes, I'll wait on that, instead of making the proposed changes as of now.

I'm not sure I follow - how is upstream supposed to "pick" those up? These are packaging related and upstream doesn't do packaging after all.

The only non-packaging piece is the optirun script. I guess one can teach BB about "glvnd" bridge. In that case, the daemon stays to do power management and the executable correctly sets DRI_PRIME. None of the fancy Xorg instances or Xorg config management.

In the primus (primus_vk*) case they are completely removed in favour of a compat script, since I don't want to annoy people. Although if you prefer I can omit the compatibility scripts altogether :-P

Humble poke, can you elaborate what you have in mind?

Another humble poke? -Emil

Giancarlo Razzolini

4:36 p.m.

Em fevereiro 16, 2021 8:40 Emil Velikov escreveu:

...

On Thu, 4 Feb 2021 at 20:48, Emil Velikov <emil.l.velikov@gmail.com> wrote: Another humble poke?

I haven't been using bumblebee for a while now and I have very little time to test these. I'll try to switch my configuration to bbswitch + bumblebee after 5.11 stabilizes. Regards, Giancarlo Razzolini

Emil Velikov

12 Jan

5:54 p.m.

On Tue, 12 Jan 2021 at 17:33, Emil Velikov <emil.l.velikov@gmail.com> wrote:

...

On Tue, 12 Jan 2021 at 15:07, Archange <archange@archlinux.org> wrote:

...
Le 04/12/2020 à 14:06, Emil Velikov via arch-general a écrit :

...
On Fri, 4 Dec 2020 at 12:55, Lone_Wolf <lone_wolf@klaas-de-kat.nl> wrote:

...
On 04-12-2020 13:50, Giancarlo Razzolini via arch-general wrote:

...
Em dezembro 4, 2020 9:27 Emil Velikov via arch-general escreveu:

...
I would love to hear the input from the respective maintainers and the overall Arch developer base as a whole.

As the maintainer for both bumblebee and prime-run, I don't see the need for deprecation, yet. Bumblebee still has some uses, and it also has the appeal of keeping the card completely powered off, something that doesn't happen with prime render offload.

Having said that, I do think bumblebee/primus/primus_vk days are numbered.

Regards, Giancarlo Razzolini

For clarity :

Does this affect people without an nvidia card ?

Are users with an nvidia card that only use nouveau kernel module affected ?

There should be no user visible changes with my proposal - both GL and VK should work as normal. The power management side of things is completely unchanged.

Regarding that last statement I’m not sure. Can you confirm that you can unload the nvidia modules in this configuration (using PRIME offloading with the proprietary driver through nvidia-prime)?

Should work fine, although I cannot try it at the moment on my Intel/Nvidia box. Did you try it and see issues, or is there something in particular that causes doubt?

...
If so we are definitely willing to integrate this in Bumblebee upstream, so that people with <Turing or <Coffee Lake platforms can still enjoy power management while finally getting the full power from their card.

With all respect to the Bumblebee project and its developers, I think the project is dead. I love the upstream-first mentality, but with 7 local patches in Arch and the last (merged) MR upstream dating from 2018, I'm not too hopeful.

I stand corrected - upstream has started to show signs of activity over the last two weeks. Apologies. -Emil

Emil Velikov

4 Dec

1:04 p.m.

On Fri, 4 Dec 2020 at 12:50, Giancarlo Razzolini <grazzolini@archlinux.org> wrote:

...

Em dezembro 4, 2020 9:27 Emil Velikov via arch-general escreveu:

...
I would love to hear the input from the respective maintainers and the overall Arch developer base as a whole.

As the maintainer for both bumblebee and prime-run, I don't see the need for deprecation, yet. Bumblebee still has some uses, and it also has the appeal of keeping the card completely powered off, something that doesn't happen with prime render offload.

Thanks for maintaining these, Giancarlo.

The power management side of Bumblebee will be untouched - my email explicitly covers only the file side of things.

I've seen far too many reports of people using primus on top of GLVND-enabled nvidia/mesa, causing all sorts of problems. Since tracking individual reports does not scale, I've put this proposal forward.

Note: having a compat primusrun/optirun/pvkrun makes sense - setting the respective environment variables is annoying.

-Emil
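To illustrate the note about compat wrappers, here is a minimal sketch of what such a script might boil down to. The function names are hypothetical. DRI_PRIME is the variable discussed in this thread for Mesa drivers; the three Nvidia variables are taken from Nvidia's PRIME render offload documentation, not from this thread, so treat them as an assumption about the proprietary driver:

```shell
#!/bin/sh
# Run a command on the Nvidia GPU via PRIME render offload (proprietary driver).
# Covers both GLX (vendor library selection) and Vulkan (implicit layer).
nvidia_offload_run() {
    env __NV_PRIME_RENDER_OFFLOAD=1 \
        __GLX_VENDOR_LIBRARY_NAME=nvidia \
        __VK_LAYER_NV_optimus=NVIDIA_only \
        "$@"
}

# Run a command on the secondary GPU when it is driven by Mesa (e.g. nouveau).
mesa_offload_run() {
    env DRI_PRIME=1 "$@"
}
```

A compat primusrun or optirun could then be little more than `nvidia_offload_run "$@"` (or `mesa_offload_run "$@"` on FOSS drivers), for example `nvidia_offload_run glxinfo -B` to check which GPU ends up rendering.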

Giancarlo Razzolini

2:11 p.m.

Em dezembro 4, 2020 10:04 Emil Velikov escreveu:

...

On Fri, 4 Dec 2020 at 12:50, Giancarlo Razzolini <grazzolini@archlinux.org> wrote:

...
Em dezembro 4, 2020 9:27 Emil Velikov via arch-general escreveu:

...
I would love to hear the input from the respective maintainers and the overall Arch developer base as a whole.

As the maintainer for both bumblebee and prime-run, I don't see the need for deprecation, yet. Bumblebee still has some uses, and it also has the appeal of keeping the card completely powered off, something that doesn't happen with prime render offload.

Thanks for maintaining these Giancarlo.

The power management side of Bumblebee will be untouched - my email explicitly covers only the file side of things.

I've seen far too many reports of people using primus, on top of GLVND enabled nvidia/mesa causing all sorts of problems. Since tracking individual reports does not scale - I've put this proposal.

Note: having a compat primusrun/optirun/pvkrun makes sense - setting the respective environment variables is annoying.

-Emil

Put your proposal changes and reasoning on flyspray tickets for each package and I'll take a look. I'm not sure all the changes are needed. And, it turns out that, due to some limitations/issues related to intel gvt-g and nvidia with prime render offload, I'm using bumblebee currently. Only issue I had recently was the xorg autodetection issue that they reverted, other than that, it works fine here. Regards, Giancarlo Razzolini

Emil Velikov

7 Dec

5:24 p.m.

On Fri, 4 Dec 2020 at 14:11, Giancarlo Razzolini <grazzolini@archlinux.org> wrote:

...

Em dezembro 4, 2020 10:04 Emil Velikov escreveu:

...
On Fri, 4 Dec 2020 at 12:50, Giancarlo Razzolini <grazzolini@archlinux.org> wrote:

...
Em dezembro 4, 2020 9:27 Emil Velikov via arch-general escreveu:

...
I would love to hear the input from the respective maintainers and the overall Arch developer base as a whole.

As the maintainer for both bumblebee and prime-run, I don't see the need for deprecation, yet. Bumblebee still has some uses, and it also has the appeal of keeping the card completely powered off, something that doesn't happen with prime render offload.

Thanks for maintaining these Giancarlo.

The power management side of Bumblebee will be untouched - my email explicitly covers only the file side of things.

I've seen far too many reports of people using primus, on top of GLVND enabled nvidia/mesa causing all sorts of problems. Since tracking individual reports does not scale - I've put this proposal.

Note: having a compat primusrun/optirun/pvkrun makes sense - setting the respective environment variables is annoying.

-Emil

Put your proposal changes and reasoning on flyspray tickets for each package and I'll take a look.

Doing GL (primus+bumblebee) as a start, trimmed only to the bare minimum changes: https://bugs.archlinux.org/task/68882 https://bugs.archlinux.org/task/68883

...

I'm not sure all the changes are needed. And, it turns out that, due to some limitations/issues related to intel gvt-g and nvidia with prime render offload, I'm using bumblebee currently.

Do you have details or a bug report about these? I haven't used intel gvt-g, and when I skimmed through the code, its quality was below that of i915.

...

Only issue I had recently was the xorg autodetection issue that they reverted, other than that, it works fine here.

Again, a bug report with details would be appreciated. Thanks Emil

Giancarlo Razzolini

6:47 p.m.

Em dezembro 7, 2020 14:24 Emil Velikov escreveu:

...

Doing GL (primus+bumblebee) as a start, trimmed only to the bare minimum changes: https://bugs.archlinux.org/task/68882 https://bugs.archlinux.org/task/68883

Thanks.

...

Do you have details or a bug report about these? I haven't used intel gvt-g and on the occasion of skimming through the code - quality was below i915.

There is a nasty bug where nvidia prime render offload causes the VM to simply segfault [0]. But intel gvt-g is better than using virgl for 3D stuff on a VM. Regards, Giancarlo Razzolini [0] https://github.com/intel/gvt-linux/issues/162

Emil Velikov

11 Dec

11:43 a.m.

On Mon, 7 Dec 2020 at 18:47, Giancarlo Razzolini <grazzolini@archlinux.org> wrote:

...

Em dezembro 7, 2020 14:24 Emil Velikov escreveu:

...
Doing GL (primus+bumblebee) as a start, trimmed only to the bare minimum changes: https://bugs.archlinux.org/task/68882 https://bugs.archlinux.org/task/68883

Thanks.

...
Do you have details or a bug report about these? I haven't used intel gvt-g and on the occasion of skimming through the code - quality was below i915.

There is a nasty bug where nvidia prime render offload causes the VM to simply segfault [0]. But intel gvt-g is better than using virgl for 3D stuff on a VM.

If I have to choose - segfault or slower rendering (via virgl) - I'd choose the latter :-P Nevertheless added a few comments which should help you move forward. I would suggest opting for virgl, until all the relevant bugs are fixed. This way we can proceed with the package cleanup quicker. -Emil

Emil Velikov

22 Dec

4:33 p.m.

On Fri, 11 Dec 2020 at 11:43, Emil Velikov <emil.l.velikov@gmail.com> wrote:

...

On Mon, 7 Dec 2020 at 18:47, Giancarlo Razzolini <grazzolini@archlinux.org> wrote:

...
Em dezembro 7, 2020 14:24 Emil Velikov escreveu:

...
Doing GL (primus+bumblebee) as a start, trimmed only to the bare minimum changes: https://bugs.archlinux.org/task/68882 https://bugs.archlinux.org/task/68883

Thanks.

...
Do you have details or a bug report about these? I haven't used intel gvt-g and on the occasion of skimming through the code - quality was below i915.

There is a nasty bug where nvidia prime render offload causes the VM to simply segfault [0]. But intel gvt-g is better than using virgl for 3D stuff on a VM.

If I have to choose - segfault or slower rendering (via virgl) - I'd choose the latter :-P Nevertheless added a few comments which should help you move forward.

I would suggest opting for virgl, until all the relevant bugs are fixed. This way we can proceed with the package cleanup quicker.

Is there any interest in the bugs opened, or must the gvt regression be fixed before those can move forward? My systems lack gvt support (nor do I have a Windows license) to reproduce the bug(s), yet I'm willing to help to get this moving. Thanks Emil

Giancarlo Razzolini

4:45 p.m.

Em dezembro 22, 2020 13:33 Emil Velikov escreveu:

...

Is there any interest in the bugs opened, or the gvt regression is a must for those to move forward? My systems lack gvt support (nor do I have Windows license) to reproduce the bug(s), yet I'm willing to help to get this moving.

Yes. I haven't had time to properly look at them, but I will. I'm not using bumblebee atm either. Regards, Giancarlo Razzolini

Emil Velikov

6 Jan

3:48 p.m.

On Tue, 22 Dec 2020 at 16:45, Giancarlo Razzolini <grazzolini@archlinux.org> wrote:

...

Em dezembro 22, 2020 13:33 Emil Velikov escreveu:

...
Is there any interest in the bugs opened, or the gvt regression is a must for those to move forward? My systems lack gvt support (nor do I have Windows license) to reproduce the bug(s), yet I'm willing to help to get this moving.

Yes. I haven't got time to proper look at them, but I will. I'm not using bumblebee atm as well.

Happy New Year Giancarlo. Saw you doing lots of devops work over the holidays and there is plenty of mkinitcpio work that you've been meaning to review/merge. Perhaps I can help with the latter or something else to free up some time for you? Aside: we have a bisection on the kernel regression. Fingers crossed upstream will be able to resolve the problem shortly. The i965 <-> iris regression is reported and people are working on it. Thanks Emil


participants (4)

Archange
Emil Velikov
Giancarlo Razzolini
Lone_Wolf
