[arch-general] user namespaces
Summary:
Arch Linux is one of the few distributions, if not the only one, that still disables or restricts the use of unprivileged user namespaces, a feature that is used by many applications and containers to provide secure sandboxing. There have been requests to turn this feature on since Linux 3.13 (in 2013), but they are still being denied. While there may have been some reason for doing so a few years ago, leading many distributions like Debian and Red Hat to restrict its use to privileged users via a kernel patch (they never disabled it completely), today Arch seems to be the only distribution to block this feature. Even conservative distros like Debian 8 and 9 have this feature fully enabled.
I would like to suggest that Arch stop disabling this feature in future kernel versions.
Reasoning:
The original reason to block user namespaces was a number of security issues that allowed unprivileged users to access features they should not have access to. Because user namespaces provide isolated user environments with access to privileged features like other namespaces (inside that isolated namespace only), it should be obvious that this feature had to be designed carefully in order not to harm security outside the namespace. Even though there have been issues, this feature is now considered stable enough for distros like Debian and Red Hat to allow its use even for unprivileged users.
Moreover, there are many applications that use this feature to provide or enhance security. Among them are:
lxc, systemd-nspawn, docker, flatpak, bubblewrap, firejail, firefox, chromium
After working with sandboxing applications for several months, it seems clear to me that disabling user namespaces decreases the security of the system significantly. Some of these applications cannot provide core features because user namespaces are missing. Others have significant security features disabled for this reason. But the worst part is how some of these projects dealt with the missing feature: many use suid bits to execute the application as root to get access to the features they would have inside a user namespace. And for those who have worked with suid applications and their security, it will not be surprising that they have failed to do this securely, leading to not just a few local root exploits.
Taking firejail just as an example: (CVE-2017-5207) (CVE-2017-5206) (CVE-2017-5180) (CVE-2016-10122) (CVE-2016-10118) (CVE-2016-9016)
And that is just from the last release...
None of these issues would have been possible if user namespaces could be used, which is, by the way, what bubblewrap does if the feature is available. Since it isn’t on Arch, bubblewrap has to use suid too (but bubblewrap is designed with security in mind for a change, so there are no known issues so far). Chromium is another case that has to use suid for its sandbox, and while I consider the developers very skilled in regards to security (they built a very nice broker architecture sandbox on Windows too), there have been local root exploits in the Linux version of Chromium because of this.
Even looking only at the surface of this problem, it becomes clear that this causes far more problems than it solves. Considering that Arch will be, or already is, the only Linux distribution to disable this feature, developers of future applications will have to choose between dropping support for Arch and continuing to use features like suid that, unlike user namespaces, pose a real security threat.
Therefore I urge the people responsible to reconsider their choice and enable user namespaces in future kernel versions of Arch Linux.
Bug reports regarding user namespaces:
https://bugs.archlinux.org/task/36969 https://bugs.archlinux.org/task/49337
On Wed, 2017-02-01 at 00:18 +0100, sivmu wrote:
Summary:
Arch Linux is one of the few distributions, if not the only one, that still disables or restricts the use of unprivileged user namespaces, a feature that is used by many applications and containers to provide secure sandboxing. There have been requests to turn this feature on since Linux 3.13 (in 2013), but they are still being denied. While there may have been some reason for doing so a few years ago, leading many distributions like Debian and Red Hat to restrict its use to privileged users via a kernel patch (they never disabled it completely), today Arch seems to be the only distribution to block this feature. Even conservative distros like Debian 8 and 9 have this feature fully enabled.
There are still endless unprivileged user namespace vulnerabilities and it's a nearly useless feature. The uid/gid mapping is poorly thought out and immature without the necessary environment (filesystem support, etc.) built around it, but no one really wants it for that reason. They want it because it started pretending that it can offer something that it can't actually deliver safely. There are much better ways to do unprivileged sandboxes with significantly less risk than CLONE_NEWUSER or setuid executables where the user controls the environment. Anything depending on this mechanism instead of properly designed plumbing for it is simply lazy garbage. Lack of a proper layer on top of the kernel providing infrastructure (systemd is so far from that) on desktop/server Linux is not going to be fixed by delegating everything to the kernel even when it massively increases attack surface.
I would like to suggest that Arch stop disabling this feature in future kernel versions.
Reasoning:
The original reason to block user namespaces was a number of security issues that allowed unprivileged users to access features they should not have access to. Because user namespaces provide isolated user environments with access to privileged features like other namespaces (inside that isolated namespace only), it should be obvious that this feature had to be designed carefully in order not to harm security outside the namespace. Even though there have been issues, this feature is now considered stable enough for distros like Debian and Red Hat to allow its use even for unprivileged users.
There's still an unrelenting torrent of security issues from it. Maybe wait until that stops before proposing this. I don't think it's going to stop because of how this feature is designed. It greatly increases the attack surface and there isn't going to be a mitigating factor that changes this situation. It's a fundamentally flawed, garbage feature and the arguments for it are nonsense. There are better ways to do this, by simply not tying your hands and refusing to implement anything in user space but instead pretending that all common features must happen in the kernel despite major security risks and poor semantics.
Moreover, there are many applications that use this feature to provide or enhance security. Among them are:
lxc, systemd-nspawn, docker, flatpak, bubblewrap, firejail, firefox, chromium
There's one well-written sandbox there (Chromium's usage) and it doesn't require this feature. They also don't need this feature on platforms where they have control like Android, since they can implement it in a saner way where it doesn't massively increase kernel attack surface.
After working with sandboxing applications for several months, it seems clear to me that disabling user namespaces decreases the security of the system significantly. Some of these applications cannot provide core features because user namespaces are missing. Others have significant security features disabled for this reason. But the worst part is how some of these projects dealt with the missing feature: many use suid bits to execute the application as root to get access to the features they would have inside a user namespace. And for those who have worked with suid applications and their security, it will not be surprising that they have failed to do this securely, leading to not just a few local root exploits.
There's no hard requirement that they have to do it that way. They can use a service where the user doesn't control the environment used to spawn the application (like setuid) or full control over the environment where it ends up being run. Application containers *really* do not need this feature. It's far better to do it in a more secure, saner way vs. exposing massive kernel attack surface.
Taking firejail just as an example: (CVE-2017-5207) (CVE-2017-5206) (CVE-2017-5180) (CVE-2016-10122) (CVE-2016-10118) (CVE-2016-9016)
A junk, insecure application is not a reason to greatly reduce kernel security for everyone.
And that is just from the last release...
None of these issues would have been possible if user namespaces could be used, which is, by the way, what bubblewrap does if the feature is available. Since it isn’t on Arch, bubblewrap has to use suid too (but bubblewrap is designed with security in mind for a change, so there are no known issues so far). Chromium is another case that has to use suid for its sandbox, and while I consider the developers very skilled in regards to security (they built a very nice broker architecture sandbox on Windows too), there have been local root exploits in the Linux version of Chromium because of this.
Chromium has had a couple vulnerabilities there. Can you point to any that are full blown privesc? I can point to 30+ kernel bugs from the past couple years that are privesc via user namespaces. Also those kernel vulnerabilities impact *everyone*.
Even looking only at the surface of this problem, it becomes clear that this causes far more problems than it solves. Considering that Arch will be, or already is, the only Linux distribution to disable this feature, developers of future applications will have to choose between dropping support for Arch and continuing to use features like suid that, unlike user namespaces, pose a real security threat.
Nope, you're just ignoring / misrepresenting the facts here and failing to present a real proposal. Try again, and propose something where attack surface is not increased beyond the cases where this feature is actually required. Enabling it globally when people install something like Chromium doesn't qualify. User namespaces are far more real of a security threat than these fears you're presenting here, and doing it as you propose would impose those risks on EVERYONE so that the few can have their poorly designed container features based on this.
Therefore I urge the people responsible to reconsider their choice and enable user namespaces in future kernel versions of Arch Linux.
Present a real proposal taking into account the very real reasons to avoid this that you are skirting around. If you aren't going to present technical solutions to the problems, which are certainly possible and could be implemented, then I don't think anything should be changed. I have thoughts on how to enable this while containing the attack surface but seeing as I have no interest in the feature and have a lot of far more important work to do than working on toy features, I don't plan on doing anything about this myself.
Bug reports regarding user namespaces:
https://bugs.archlinux.org/task/36969 https://bugs.archlinux.org/task/49337
Also worth noting that one of the first things any sandbox based on user namespaces will do is *disable* user namespaces. The programs using them acknowledge them to be a huge security problem. It doesn't work out well when only a subset of processes are running in that container env. The only sane way to approach this without taking a different path is implementing plumbing to only expose user namespaces to the sandbox spawning executables. Kernel infrastructure exists for doing that already. It just depends on whether anyone is willing to do any real work vs. complaining about it and denying the facts.
On Wed, Feb 01, 2017 at 01:20:41AM -0500, Daniel Micay via arch-general wrote:
There are still endless unprivileged user namespace vulnerabilities and it's a nearly useless feature. The uid/gid mapping is poorly thought out and immature without the necessary environment (filesystem support, etc.) built around it, but no one really wants it for that reason. They want it because it started pretending that it can offer something that it can't actually deliver safely. There are much better ways to do unprivileged sandboxes with significantly less risk than CLONE_NEWUSER or setuid executables where the user controls the environment. Anything depending on this mechanism instead of properly designed plumbing for it is simply lazy garbage. Lack of a proper layer on top of the kernel providing infrastructure (systemd is so far from that) on desktop/server Linux is not going to be fixed by delegating everything to the kernel even when it massively increases attack surface.
BTW, why can't one simply create a *privileged* lxc container on a host filesystem mounted with nosuid, then create an unprivileged user inside that container for browsing / viewing of untrusted pdfs, etc? But I still believe that the idea of sandboxing a web browser is idiotic... Cheers, -- Leonid Isaev
On Wed, 2017-02-01 at 00:21 -0700, Leonid Isaev wrote:
BTW, why can't one simply create a *privileged* lxc container on a host filesystem mounted with nosuid, then create an unprivileged user inside that container for browsing / viewing of untrusted pdfs, etc?
Application containers don't have a use for the user namespace quasi root and no one really needs the half baked uid/gid mapping feature. There's no real reason for stuff being done that way beyond desktop Linux having the disease of inability to do plumbing in userspace, but instead putting everything in the kernel simply to have it universally available rather than for technical reasons. It would make sense to simply have a service spawning on-demand unpriv users from a range of uid/gid pairs. That's exactly how this works on Android for both apps and isolatedProcess services (they each get a unique uid/gid pair assigned), although they also layer SELinux and mount namespaces on top. The only real use case for user namespaces is unprivileged, contained usage of OS containers since they actually need the quasi root. For application containers / sandboxes, it's just laziness and bad design.
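The uid/gid pool idea above can be sketched as follows. This is a minimal illustration of the bookkeeping only, not Android's implementation: the class name `UidPool`, the base id 100000 and the pool size are assumptions made for this example. A real service would additionally have to run privileged, switch to the leased ids (for example with `os.setgid` and `os.setuid`) before executing the sandboxed program, and layer further confinement on top.

```python
class UidPool:
    """Leases unique uid/gid pairs from a fixed range (hypothetical helper)."""

    def __init__(self, first_id, size):
        # Store ids in descending order so that pop() hands out the lowest first.
        self._free = list(range(first_id + size - 1, first_id - 1, -1))
        self._leased = {}

    def acquire(self, sandbox_name):
        """Reserve one id for sandbox_name and return it as a (uid, gid) pair."""
        if sandbox_name in self._leased:
            raise ValueError(f"{sandbox_name!r} already holds a lease")
        if not self._free:
            raise RuntimeError("uid/gid range exhausted")
        ident = self._free.pop()
        self._leased[sandbox_name] = ident
        return ident, ident

    def release(self, sandbox_name):
        """Return the id held by sandbox_name to the pool."""
        self._free.append(self._leased.pop(sandbox_name))


if __name__ == "__main__":
    pool = UidPool(first_id=100000, size=1000)  # assumed range, not a standard
    print(pool.acquire("browser-tab-1"))
    print(pool.acquire("pdf-viewer"))
```

Each sandbox gets an id that no other running sandbox holds, so ordinary file permissions already separate them without any user namespace being involved.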
On Wed, Feb 01, 2017 at 02:45:46AM -0500, Daniel Micay wrote:
Application containers don't have a use for the user namespace quasi root and no one really needs the half baked uid/gid mapping feature. There's no real reason for stuff being done that way beyond desktop Linux having the disease of inability to do plumbing in userspace, but instead putting everything in the kernel simply to have it universally available rather than for technical reasons.
It would make sense to simply have a service spawning on-demand unpriv users from a range of uid/gid pairs. That's exactly how this works on Android for both apps and isolatedProcess services (they each get a unique uid/gid pair assigned), although they also layer SELinux and mount namespaces on top.
Cool :) thx for the explanation... Cheers, L. -- Leonid Isaev
On 01.02.2017 at 07:20, Daniel Micay via arch-general wrote:
On Wed, 2017-02-01 at 00:18 +0100, sivmu wrote:
Summary:
Arch Linux is one of the few distributions, if not the only one, that still disables or restricts the use of unprivileged user namespaces, a feature that is used by many applications and containers to provide secure sandboxing. There have been requests to turn this feature on since Linux 3.13 (in 2013), but they are still being denied. While there may have been some reason for doing so a few years ago, leading many distributions like Debian and Red Hat to restrict its use to privileged users via a kernel patch (they never disabled it completely), today Arch seems to be the only distribution to block this feature. Even conservative distros like Debian 8 and 9 have this feature fully enabled.
There are still endless unprivileged user namespace vulnerabilities
You failed to name even one.
it's a nearly useless feature.
That's a baseless claim that was already proven wrong in my first post by the many applications that use this feature.
The uid/gid mapping is poorly thought out and immature without the necessary environment (filesystem support, etc.) built around it,
Something like mount namespaces, that are designed to be used in combination with user namespaces?
but no one really wants it for that reason. They want it because it started pretending that it can offer something that it can't actually deliver safely.
Again a claim without proof.
There are much better ways to do unprivileged sandboxes with significantly less risk than CLONE_NEWUSER or setuid executables where the user controls the environment.
And yet you fail to name even one alternative. Please do.
Anything depending on this mechanism instead of properly designed plumbing for it is simply lazy garbage.
Another baseless and arrogant claim.
Lack of a proper layer on top of the kernel providing infrastructure (systemd is so far from that) on desktop/server Linux is not going to be fixed by delegating everything to the kernel even when it massively increases attack surface.
I would like to suggest that Arch stop disabling this feature in future kernel versions.
Reasoning:
The original reason to block user namespaces was a number of security issues that allowed unprivileged users to access features they should not have access to. Because user namespaces provide isolated user environments with access to privileged features like other namespaces (inside that isolated namespace only), it should be obvious that this feature had to be designed carefully in order not to harm security outside the namespace. Even though there have been issues, this feature is now considered stable enough for distros like Debian and Red Hat to allow its use even for unprivileged users.
There's still an unrelenting torrent of security issues from it.
Name one
Maybe wait until that stops before proposing this.
Vulnerabilities in kernel features will never cease to exist. If we disabled everything with potential vulnerabilities, we would not have a kernel anymore.
I don't think it's going to stop because of how this feature is designed. It greatly increases the attack surface and there isn't going to be a mitigating factor that changes this situation. It's a fundamentally flawed, garbage feature and the arguments for it are nonsense. There are better ways to do this, by simply not tying your hands and refusing to implement anything in user space but instead pretending that all common features must happen in the kernel despite major security risks and poor semantics.
So this is actually about you not liking this feature, without naming any real reason, while making a bunch of baseless accusations and claims.
Moreover, there are many applications that use this feature to provide or enhance security. Among them are:
lxc, systemd-nspawn, docker, flatpak, bubblewrap, firejail, firefox, chromium
There's one well-written sandbox there (Chromium's usage) and it doesn't require this feature.
Wrong: https://chromium.googlesource.com/chromium/src/+/master/docs/linux_sandboxin... And for suid, quote: „The intention is if you want to run Chrome and only use the namespace sandbox, you can set --disable-setuid-sandbox. But if you do so on a host without appropriate kernel support for the namespace sandbox, Chrome will loudly refuse to run.“ Source: https://bugs.chromium.org/p/chromium/issues/detail?id=598454
They also don't need this feature on platforms where they have control like Android, since they can implement it in a saner way where it doesn't massively increase kernel attack surface.
Android uses minijail (default app sandbox in Android 7), which relies on user namespaces… I just opened a terminal on my Android device and checked it. It is inside a user namespace.
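One way to make such a check reproducible is to read `/proc/self/uid_map`. The sketch below is a heuristic, not proof: in the initial user namespace the map is normally the single identity line `0 0 4294967295`, but a privileged process can install the same map in a child namespace, so an identity map does not rule out a user namespace. The function names are made up for this example.

```python
def parse_id_map(text):
    """Parse uid_map/gid_map content into (inside_id, outside_id, count) tuples."""
    return [
        tuple(int(field) for field in line.split())
        for line in text.splitlines()
        if line.strip()
    ]


def has_identity_map(text):
    """True if the map is the full identity mapping typical of the initial namespace."""
    return parse_id_map(text) == [(0, 0, 4294967295)]


if __name__ == "__main__":
    try:
        with open("/proc/self/uid_map") as handle:
            content = handle.read()
    except OSError:
        print("uid_map not available on this system")
    else:
        print("identity map" if has_identity_map(content) else "remapped ids")
```

A map such as `0 1000 1` means that uid 0 inside the namespace is really uid 1000 outside, which is what an unprivileged process that created its own user namespace would typically see.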
After working with sandboxing applications for several months, it seems clear to me that disabling user namespaces decreases the security of the system significantly. Some of these applications cannot provide core features because user namespaces are missing. Others have significant security features disabled for this reason. But the worst part is how some of these projects dealt with the missing feature: many use suid bits to execute the application as root to get access to the features they would have inside a user namespace. And for those who have worked with suid applications and their security, it will not be surprising that they have failed to do this securely, leading to not just a few local root exploits.
There's no hard requirement that they have to do it that way. They can use a service where the user doesn't control the environment used to spawn the application (like setuid) or full control over the environment where it ends up being run. Application containers *really* do not need this feature. It's far better to do it in a more secure, saner way vs. exposing massive kernel attack surface.
Again, no real-life example of an alternative.
Taking firejail just as an example: (CVE-2017-5207) (CVE-2017-5206) (CVE-2017-5180) (CVE-2016-10122) (CVE-2016-10118) (CVE-2016-9016)
A junk, insecure application is not a reason to greatly reduce kernel security for everyone.
I do not really want to argue with you about this one, except to say that your claim of reduced kernel security is greatly exaggerated. And please note that the security of firejail would be greatly increa
And that is just from the last release...
None of these issues would have been possible if user namespaces could be used, which is, by the way, what bubblewrap does if the feature is available. Since it isn’t on Arch, bubblewrap has to use suid too (but bubblewrap is designed with security in mind for a change, so there are no known issues so far). Chromium is another case that has to use suid for its sandbox, and while I consider the developers very skilled in regards to security (they built a very nice broker architecture sandbox on Windows too), there have been local root exploits in the Linux version of Chromium because of this.
Chromium has had a couple vulnerabilities there. Can you point to any that are full blown privesc?
Local root exploit in Chromium: https://bugs.chromium.org/p/chromium/issues/detail?id=76542 You are welcome.
I can point to 30+ kernel bugs from the past couple years that are privesc via user namespaces. Also those kernel vulnerabilities impact *everyone*.
Please do point out some from the last 6 months.
Even looking only at the surface of this problem, it becomes clear that this causes far more problems than it solves. Considering that Arch will be, or already is, the only Linux distribution to disable this feature, developers of future applications will have to choose between dropping support for Arch and continuing to use features like suid that, unlike user namespaces, pose a real security threat.
Nope, you're just ignoring / misrepresenting the facts here and failing to present a real proposal. Try again, and propose something where attack surface is not increased beyond the cases where this feature is actually required. Enabling it globally when people install something like Chromium doesn't qualify.
User namespaces are far more real of a security threat than these fears you're presenting here, and doing it as you propose would impose those risks on EVERYONE so that the few can have their poorly designed container features based on this.
I do not share your assessment of the threat posed by userns, and you have given me no reason to share your opinion yet.
Therefore I urge the people responsible to reconsider their choice and enable user namespaces in future kernel versions of Arch Linux.
Present a real proposal taking into account the very real reasons to avoid this that you are skirting around. If you aren't going to present technical solutions to the problems, which are certainly possible and could be implemented, then I don't think anything should be changed.
Solutions to change user namespaces inside the kernel? This isn’t the kernel mailing list and arch won’t patch the kernel, so I do not get what you are proposing.
I have thoughts on how to enable this while containing the attack surface but seeing as I have no interest in the feature and have a lot of far more important work to do than working on toy features, I don't plan on doing anything about this myself.
Please share this either here or via direct mail and I will work on this as far as I am able.
Bug reports regarding user namespaces:
https://bugs.archlinux.org/task/36969 https://bugs.archlinux.org/task/49337
To make this short, please provide sources for your claim regarding the kernel attack surface of user namespaces, and alternatives that provide the same functionality.
To conclude: the people responsible for Linux distributions like Debian, Red Hat and pretty much all other distros, as well as many developers of sandboxing applications, including the Tails and Chromium people, all believe this feature is a useful tool for providing unprivileged sandbox applications and is worth the risk.
Without any real proof of the claims you made in your post, it seems you rather have a personal grudge against this feature while at the same time saying you know better than all these people. Sorry, but that is pretty rich.
Don’t get me wrong, I would love to discuss this with you all day long, but I would like to ask you to reconsider your tone, as you sound incredibly arrogant when you put yourself above all those voices/people without providing real proof for your arguments.
On Wed, Feb 01, 2017 at 07:51:49PM +0100, sivmu wrote:
The people responsible for Linux distributions like Debian, Red Hat and pretty much all other distros, as well as many developers of sandboxing applications, including the Tails and Chromium people, all believe this feature is a useful tool for providing unprivileged sandbox applications and is worth the risk.
But you see, sandboxing apps is by itself a misleading security feature. Why do I need to sandbox my browser if it is written properly and allows me to disable the unnecessary (for me) features? In the end, every sandbox uses DAC protection, no? And I already proposed a sandbox which is far better than firejail or the one used in chrome, and doesn't use userns.
Without any real proof of the claims you made in your post, it seems you rather have a personal grudge against this feature while at the same time saying you know better than all these people. Sorry, but that is pretty rich.
The issue is this: either enable userns fully, i.e. unprivileged users are able to create user namespaces, or don't enable them at all. The way Fedora does things, for example, is worse than the latter (of course, if you used Fedora you know that it sucks in general).
Don’t get me wrong, I would love to discuss this with you all day long, but I would like to ask you to reconsider your tone, as you sound incredibly arrogant when you put yourself above all those voices/people without providing real proof for your arguments.
So, why don't you just build your own kernel? It takes only 20 mins... Cheers, -- Leonid Isaev
-- Changed the topic to keep things clean --
On 01.02.2017 at 21:16, Leonid Isaev wrote:
But you see, sandboxing apps is by itself a misleading security feature. Why do I need to sandbox my browser if it is written properly and allows me to disable the unnecessary (for me) features?
Sorry to say this, but this is the most disturbingly naive thing I have read in quite some time.
As a rule of thumb, programs are said to contain one error for every 1000 lines of code. Firefox contains 14 million lines of code; Chromium has 15.3 million lines. Do the math.
No matter how much you focus on secure coding, there will always be vulnerabilities, and sandboxing can help to contain the consequences of their exploitation.
However, it must be said that sandboxing does not protect the contained application and the data it has access to. Sandboxing a browser will therefore not prevent the compromise of the data you access with it, so it is of only limited use.
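Spelled out, the "do the math" estimate above looks like this. It is only a sketch of the arithmetic: the rate of one defect per 1000 lines is the rule of thumb quoted in this message rather than a measured value, and the function name is made up for illustration.

```python
DEFECTS_PER_KLOC = 1  # rule of thumb from the message above, not a measurement


def estimated_defects(lines_of_code, defects_per_kloc=DEFECTS_PER_KLOC):
    """Estimate the defect count from the code size and a defects-per-1000-lines rate."""
    return lines_of_code * defects_per_kloc // 1000


for name, loc in (("Firefox", 14_000_000), ("Chromium", 15_300_000)):
    print(f"{name}: ~{estimated_defects(loc):,} estimated defects")
# Firefox: ~14,000 estimated defects
# Chromium: ~15,300 estimated defects
```

Only a fraction of such defects would be exploitable, but even a small fraction of numbers this large is the case sandboxing is meant to contain.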
In the end, every sandbox uses DAC protection, no? And I already proposed a sandbox which is far better than firejail or the one used in chrome, and doesn't use userns.
Please take a look at bubblewrap: https://github.com/projectatomic/bubblewrap On the default Arch kernel it does not use user namespaces. It is also used by Tails to sandbox Firefox, or rather the Tor Browser.
And Chromium actually uses some quite nice sandboxing and has become quite famous for being nearly unbreakable. They also have a bug bounty program, so if you find a way to break out of their sandbox you can get up to 100k. Good luck :)
Without any real proof of the claims you made in your post, it seems you rather have a personal grudge against this feature while at the same time saying you know better than all these people. Sorry, but that is pretty rich.
The issue is this: either enable userns fully, i.e. unprivileged users are able to create user namespaces, or don't enable them at all. The way Fedora does things, for example, is worse than the latter (of course, if you used Fedora you know that it sucks in general).
grsecurity has user namespaces enabled but restricted to privileged users only. This allows privileged apps like docker to use this feature. I think they know what they are doing. By the way, what does Fedora do exactly? (I think I found something about a kernel module parameter to enable userns; I am not sure why that is bad.)
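Which knob a given distribution uses varies, and this thread does not establish what Fedora does. As a hedged sketch, the snippet below looks for two sysctls that exist on some kernels: `kernel.unprivileged_userns_clone`, which comes from a distribution patch (Debian carries it) and is not in the upstream kernel, and `user.max_user_namespaces`, which is upstream since Linux 4.9. The function name and the returned dictionary layout are assumptions for this example.

```python
from pathlib import Path


def userns_status(proc_sys=Path("/proc/sys")):
    """Return the integer value of each userns-related sysctl, or None if it is absent."""
    knobs = {
        # Distribution patch; 0 means unprivileged users may not create user namespaces.
        "kernel.unprivileged_userns_clone": proc_sys / "kernel" / "unprivileged_userns_clone",
        # Upstream limit on the number of user namespaces; 0 disables creating them.
        "user.max_user_namespaces": proc_sys / "user" / "max_user_namespaces",
    }
    status = {}
    for name, path in knobs.items():
        try:
            status[name] = int(path.read_text().strip())
        except (OSError, ValueError):
            status[name] = None  # knob missing or unreadable on this kernel
    return status


if __name__ == "__main__":
    for name, value in userns_status().items():
        print(f"{name} = {value}")
```

A value of `None` for both entries does not by itself say whether user namespaces are available; it only means that neither knob is exposed on the running kernel.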
Don’t get me wrong, I would love to discuss this with you all day long, but I would like to ask you to reconsider your tone, as you sound incredibly arrogant when you put yourself above all those voices/people without providing real proof for your arguments.
So, why don't you just build your own kernel? It takes only 20 mins...
Cheers,
That's not helping. This is not about me getting this feature, but about a ton of applications that use suid and other shit to work around this problem, which creates a lot of security problems on arch only. And yes, as pointed out, without these apps a kernel without userns might be safer, but that still does not solve the general issue.
On 01.02.2017 at 22:22, Martin Kühne via arch-general wrote:
I have no actual knowledge of the details you guys are arguing over, but it seems to me OP has yet to learn that a simpler and more secure environment can only be achieved by using fewer, powerful components instead of many useless ones.
The features in question are inside the kernel, and apart from user namespaces there is no controversy that these features are helpful for building containers. But to give unprivileged users the ability to use namespaces, user namespaces are required.
Okay, there might be a point from which the amount of components will add enough obscurity to the overall system that simply nobody will bother trying to break it, but really, what's the big deal. I think sandboxing is a concept reminding too much of Windows tools such as bullguard, which simply doesn't translate well enough (read: at all) to unixes, so I recommend checking whether you can trust the few things you use instead of adding a whole bunch of Potemkin barriers. It's actually less work overall, too.
Not really sure what your point is here. Sandboxing is not a concept from Windows, and that bullguard looks like garbage after 0.1 seconds of looking at it, so no, that does not compare. Sandboxing has many aspects and is not bound to any platform. It should be said, though, that sandboxing is not a replacement for secure coding and has its limits.
On Thu, Feb 02, 2017 at 03:24:11AM +0100, sivmu wrote:
On 01.02.2017 at 21:16, Leonid Isaev wrote:
But you see, sandboxing apps is by itself a misleading security feature. Why do I need to sandbox my browser if it is written properly and allows me to disable the unnecessary (for me) features?
Sorry to say this, but this is the most disturbingly naive thing I have read in quite some time.
Why? Your browser is in effect an execution env for untrusted code. So, of course it requires isolation. The same applies to any PDF reader that is capable of doing javascript etc. But the question is do we need such features?
No matter how much you focus on secure coding, there will always be vulnerabilities and sandboxing can help to contain the consequences of their exploitation.
It's not about secure coding, but about feature creep. If you insist on executing arbitrary code, then the only sane strategy is to use that program in a VM.
Please take a look at bubblewrap https://github.com/projectatomic/bubblewrap On the default arch kernel it does not use user namespaces.
And? Why do you point out such projects? I already described an approach where one always runs browsers, pdf readers, etc, inside an lxc container, as an unprivileged user. That container resides on a filesystem mounted with nosuid (so things like ping, su, sudo won't work), and has a locked root account. On top of that, it connects to a xephyr session running on the host, to avoid X11 sniffing attacks. I have been using such a setup on all my desktops for over a year now. The only way to break out of such a container is a local kernel privilege escalation. Of course, having *privileged* userns *might* help because inside the container UID=0 will map to something like UID=123456 on the host, but this doesn't seem worth doing given all the issues with userns. I should also point out that linux upstream refuses to accept a patch providing a sysctl switch between unprivileged and privileged userns.
It is also used by tails to sandbox firefox, or rather the tor browser
Any distribution that says "we focus on security" is garbage because security depends on the user's threat model. A distro should provide the *basic* tools that enable the user to implement his security demands. But tails is worse than garbage -- it is malicious, because it also focuses on privacy, forgetting that user's privacy is almost synonymous with his education. So, there is no such thing as "easy privacy" or "easy security". And no, pls don't bring up the breakage that you call OpenBSD...
And chromium actually uses quite some nice sandboxing and has become quite famous for being nearly unbreakable. They also have a bug bounty program, so if you find a way to break out of their sandbox you can get up to 100k. Good luck :)
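The nosuid part of this setup is easy to verify from the host. The following is a small sketch, not anything from the original setup: the helper names are invented here, and it assumes findmnt(8) from util-linux is installed. It checks whether the filesystem holding a given path is mounted with nosuid.

```shell
#!/bin/sh
# Succeed if a comma-separated mount option string contains "nosuid".
has_nosuid() {
    case ",$1," in
        *,nosuid,*) return 0 ;;
        *)          return 1 ;;
    esac
}

# Print a one-line verdict for the filesystem that holds the path in $1.
# Uses findmnt(8); returns 2 if findmnt cannot resolve the path.
check_container_fs() {
    opts=$(findmnt -n -o OPTIONS --target "$1") || return 2
    if has_nosuid "$opts"; then
        echo "$1: nosuid is set ($opts)"
    else
        echo "$1: nosuid is NOT set ($opts)"
    fi
}
```

For example, `check_container_fs /var/lib/lxc` would report on the filesystem under that directory; the path is only an assumption about where the containers live.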
Why? My sandbox is better than that of chromium.
grsecurity has user namespaces enabled but restricted to privileged users only. This allows privileged apps like docker to use this feature. I think they know what they are doing.
Docker is not a security mechanism because its mission is totally different. A secure sandbox is always maximally isolated, while docker puts emphasis on as much sharing as possible, for efficiency. Also, SECURITY != TOOL. So, unless you understand what grsecurity does, don't use it. Cheers, -- Leonid Isaev
On 02.02.2017 at 19:28, Leonid Isaev wrote:
On Thu, Feb 02, 2017 at 03:24:11AM +0100, sivmu wrote:
Please take a look at bubblewrap https://github.com/projectatomic/bubblewrap On the default arch kernel it does not use user namespaces.
And? Why do you point out such projects?
I already described an approach when one always runs browsers, pdf readers, etc, inside an lxc container, as an unprivileged user. That container resides on a filesystem mounted with nosuid (so things like ping, su, sudo won't work), and has a locked root account. On top of that, it connects to a xephyr session running on the host, to avoid X11 sniffing attacks.
I have been using such a setup on all my desktops for over a year now. The only way to break out of such a container is a local kernel privilege escalation. Of course, having *privileged* userns *might* help because inside the container UID=0 will map to something like UID=123456 on the host, but this doesn't seem worth doing given all the issues with userns.
From what I have seen so far, it is pretty similar to what bubblewrap does and also provides isolation with namespaces. I just noticed this can be used by unprivileged users too, so it might be worth a try. Bubblewrap is however very lightweight, which is a nice feat I think. (Plus, with a few hundred lines of code, I can actually audit it to some extent.)
Any distribution that says "we focus on security" is garbage because security depends on the user's threat model. A distro should provide the *basic* tools that enable the user to implement his security demands.
But tails is worse than garbage -- it is malicious, because it also focuses on privacy, forgetting that user's privacy is almost synonymous with his education. So, there is no such thing as "easy privacy" or "easy security".
And no, pls don't bring up the breakage that you call OpenBSD...
I won't, trust me :) Although they do contribute many successful security innovations that get adopted by linux and others. openssh is also a great example of secure coding and sandboxing. Anyway, I somewhat share your opinion that something is missing without user involvement and threat model consideration. But for what they intend, tails does provide what they promise, and it's not that bad.
And chromium actually uses quite some nice sandboxing and has become quite famous for being nearly unbreakable. They also have a bug bounty program, so if you find a way to break out of their sandbox you can get up to 100k. Good luck :)
Why? My sandbox is better than that of chromium.
No, your sandbox, like mine, is a cage that surrounds the contained applications. Chromium's is a nice coat that fits perfectly and is adapted to the application. That is actually better.
grsecurity has user namespaces enabled but restricted to privileged users only. This allows privileged apps like docker to use this feature. I think they know what they are doing.
Docker is not a security mechanism because its mission is totally different.
I did not say that.
Also, SECURITY != TOOL. So, unless you understand what grsecurity does, don't use it.
Although I know quite well what they are doing, I disagree with you here. Grsecurity is a great feature in part because it does not need configuration or interaction to work. Everyone can use it, as long as they don't mess with it without understanding what they are doing.
On 02/02/2017 07:28 PM, Leonid Isaev wrote:
I already described an approach when one always runs browsers, pdf readers, etc, inside an lxc container, as an unprivileged user. That container resides on a filesystem mounted with nosuid (so things like ping, su, sudo won't work), and has a locked root account. On top of that, it connects to a xephyr session running on the host, to avoid X11 sniffing attacks.
I have been using such a setup on all my desktops for over a year now. The only way to break out of such a container is a local kernel privilege escalation. Of course, having *privileged* userns *might* help because inside the container UID=0 will map to something like UID=123456 on the host, but this doesn't seem worth doing given all the issues with userns.
This sounds cool. Do you happen to have written that up somewhere? :) -- GPG fingerprint: 871F 1047 7DB3 DDED 5FC4 47B2 26C7 E577 EF96 7808
On Thu, Feb 02, 2017 at 09:30:58PM +0100, Bennett Piater wrote:
On 02/02/2017 07:28 PM, Leonid Isaev wrote:
I already described an approach when one always runs browsers, pdf readers, etc, inside an lxc container, as an unprivileged user. That container resides on a filesystem mounted with nosuid (so things like ping, su, sudo won't work), and has a locked root account. On top of that, it connects to a xephyr session running on the host, to avoid X11 sniffing attacks.
I have been using such a setup on all my desktops for over a year now. The only way to break out of such a container is a local kernel privilege escalation. Of course, having *privileged* userns *might* help because inside the container UID=0 will map to something like UID=123456 on the host, but this doesn't seem worth doing given all the issues with userns.
This sounds cool. Do you happen to have written that up somewhere? :)
Hmm, there is really nothing to write up, because it is very simple.

Anyway, first you install any linux distro into an lxc container. I chose an arch guest because in that case I have good control over installed features, and I chose lxc over docker or systemd-nspawn because it is the most mature project. If you are conscious about the installation size (in GB), get rid of all unnecessary packages (anything that deals with hardware, man-pages, man-db, ...) and completely purge the package cache (with pacman -Scc). Configure networking in the container however is appropriate for you. I have a bridge on the host where I plug veth interface pairs for containers and VMs, and use NAT to hide the LAN thus created behind the host. Make this container start on boot by enabling the appropriate lxc@.service.

Second, configure the ssh server inside the container to accept X11 forwarding. Choose whatever user policy you like; for instance, you can lock all user accounts, even root (with passwd -l root, passwd -l <user_name>), and configure ssh keys. There is a way to generate one-time keys, similar to how archiso generates the pacman keyring on boot -- I can explain separately if you are interested (so there is no secret stored on the machine, not even hashed in /etc/shadow).

Install any additional software inside the container that you'll actually use, e.g. epiphany and evince. Also, install xorg-server-xephyr on the host. You can connect to the container with something like
---
host$ Xephyr -resizeable -screen 1000x1000 :3 &
host$ export DISPLAY=:3 && ssh -Y <container_ip>
...
guest$ <desktop> &
---
where as a <desktop> I used fluxbox. Now you'll run a complete desktop inside Xephyr over ssh. You need a window manager to manage multiple windows. Xephyr then protects your keystrokes from being sniffed by the containerized X11 clients (but of course, you won't be able to paste into apps inside the container). If this is not a concern for you, then simply do ssh -Y and forget about Xephyr and DISPLAY variables.

Yes, starting graphic apps over ssh incurs a performance overhead. You can use VNC or similar, but because all network connections are actually local this overhead is not noticeable on modern hardware (but you can't use a graphical browser on old machines anyway). Also, a container is simply a few processes on the host (after cold start, only systemd, journald, logind, dbus and sshd), so it doesn't waste too much memory.

Technicalities of all of the above are explained in archwiki (search for LXC, qemu networking and xephyr).

Finally, I found it convenient to script the above login procedure by using ssh tunnels in detached screen sessions with optional sshfs mounts for file exchange. A very quick example script that you can start from .bash_login or your DE session autostarter:
---
host$ cat local-c-setup.sh
#!/bin/bash

# systemd cleans /tmp
tmpdir="/run/user/$UID"

# Xephyr DISPLAY
xeph_d=":2"

# ssh tunnel
tuncmd="ssh -YNMS $tmpdir/ssh-%r@%h:%p"

# sshfs options and mountpoints
local_mnt_base="$HOME/obmennik"
sshfs_opt="-o noexec -o sshfs_sync -o workaround=rename"

# hosts and their sshfs shares
hosts=('rallidae' 'wolf')
rmt_dir=('/home/lisaev/obmennik' '/home/lisaev/obmennik')

# do X11 forwarding to Xephyr (at $xeph_d) only on wolf, and mount sshfs
i=0
while [ $i -lt 2 ]; do
    [[ $i -eq 1 ]] && export DISPLAY=$xeph_d
    /usr/bin/screen -S ${hosts[i]} -d -m $tuncmd ${hosts[i]}
    /usr/bin/sshfs $sshfs_opt ${hosts[i]}:${rmt_dir[i]} \
        $local_mnt_base/${hosts[i]}
    let i=i+1
done
host$
host$ screen -ls
There are screens on:
        3863.wolf       (Detached)
        3860.rallidae   (Detached)
2 Sockets in /run/screens/S-leis8574.
---
Hope this helps,
-- Leonid Isaev
hello
I've been postponing looking into browser isolation since I started using Wayland about a year ago. Does anyone have pointers, experiences or comments on this topic with regard to Xwayland? If I'd want to disassociate parts of Chromium's execution context, what are common, good options?
cheers, Bart
On 03.02.2017 at 17:49, Bart De Roy via arch-general wrote:
hello
I've been postponing looking into browser isolation since I started using Wayland about a year ago.
Does anyone have pointers, experiences or comments on this topic with regard to Xwayland? If I'd want to disassociate parts of chromiums execution context, what are common, good options?
cheers, Bart
As long as the application has access to the xwayland instance, which is by default the case when xwayland is available, it can influence all other applications that still use the X protocol. Only the input/output of applications using only the wayland protocol is somewhat safe from this attack vector. To fully close this risk, full adoption of wayland in all applications is necessary, because then you no longer need any X server. In the end this is really tricky and, as has been mentioned, there is currently no really good solution for sandboxing desktop applications that can be easily applied. For most isolation purposes, applications like bubblewrap, lxc or systemd-nspawn can help, but you will still need to take care of X11, dbus and some other issues that are not all that easy.
On Saturday 4 February 2017 7:28:31 AM IST sivmu wrote:
As long as the application has access to the xwayland instance, which is by default the case when xwayland is available, it can influence all other applications that still use the X protocol.
Just to understand, if there are two applications using xwayland, under a wayland session, will they be still able to look at each other's resources? If the answer is no, the security is equivalent to the wayland applications, since xwayland instance is essentially a sandbox?
Only the input/output of applications using only the wayland protocol is somewhat safe from this attack vector. To fully close this risk, full adoption of wayland in all applications is necessary, because then you no longer need any X server.
Again, if a wayland application and an xwayland application are running side by side, the xwayland application cannot peek into the resources of the wayland application, right? Thanks. -- Regards Shridhar
On 05.02.2017 at 05:16, Shridhar Daithankar wrote:
On Saturday 4 February 2017 7:28:31 AM IST sivmu wrote:
As long as the application has access to the xwayland instance, which is by default the case when xwayland is available, it can influence all other applications that still use the X protocol.
Just to understand, if there are two applications using xwayland, under a wayland session, will they be still able to look at each other's resources?
If the answer is no, the security is equivalent to the wayland applications, since xwayland instance is essentially a sandbox?
Not sure what you mean by resources. This point is about the insecurity of the X Window System architecture, which basically assumes that all applications are to be trusted. There is no built-in security, so it fails modern threat models completely. This explains it pretty well I guess: https://theinvisiblethings.blogspot.de/2011/04/linux-security-circus-on-gui-... All of that is equally true for xwayland, which is just a modified X server run alongside a wayland instance to allow X applications to run on wayland compositors like weston.
Only the input/output of applications using only the wayland protocol is somewhat safe from this attack vector. To fully close this risk, full adoption of wayland in all applications is necessary, because then you no longer need any X server.
Again, if a wayland application and an xwayland application are running side by side, the xwayland application cannot peek into the resources of the wayland application, right?
If I am not mistaken, it does not matter whether an application runs on xwayland or directly on wayland with regard to what it can capture. All applications can see the input/output of all other applications using the X server protocol, no matter what they themselves are using. You can test this by running xinput in a terminal, as explained in the linked article. No matter where you run it, you can capture the input of X applications. You cannot, however, capture the input of wayland applications (at least not that easily). So if you want to prevent other applications from snooping, e.g. on the terminal where you enter your root password, you need to use one that can work directly on wayland. Termite is a great terminal that supports wayland. Btw, to fully prevent keylogging on wayland you need to do more, e.g. by sandboxing, since there are ways to work around the security of wayland where the default linux security model is weaker than that of the wayland architecture. More info here: https://www.reddit.com/r/linux/comments/23mj49/wayland_is_not_immune_to_keyl... I hope I did not mess up that explanation; if I did, someone please hit me.
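The xinput test mentioned above can be sketched roughly as follows. This is only an illustration: the helper name is invented here, and it assumes that `xinput list` prints physical keyboards on lines containing `id=N` and `[slave  keyboard (...)]`, which is how current versions format their output.

```shell
#!/bin/sh
# Read `xinput list` output on stdin and print the numeric ids of all
# slave keyboard devices, one per line.
keyboard_ids() {
    awk '/slave[ \t]+keyboard/ {
        for (i = 1; i <= NF; i++)
            if ($i ~ /^id=[0-9]+$/)
                print substr($i, 4)   # strip the leading "id="
    }'
}
```

Running `xinput list | keyboard_ids` in a terminal gives candidate ids. Then `xinput test <id>` prints key press and release events for that device while you type into any other X client, including ones running through xwayland.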
On Sunday 5 February 2017 6:10:51 AM IST sivmu wrote:
On 05.02.2017 at 05:16, Shridhar Daithankar wrote:
On Saturday 4 February 2017 7:28:31 AM IST sivmu wrote:
As long as the application has access to the xwayland instance, which is by default the case when xwayland is available, it can influence all other applications that still use the X protocol.
Just to understand, if there are two applications using xwayland, under a wayland session, will they be still able to look at each other's resources?
If the answer is no, the security is equivalent to the wayland applications, since xwayland instance is essentially a sandbox?
Not sure what you mean with resources.
devices and events, mostly.
This point is about the insecurity of the X Window System architecture, which basically assumes that all applications are to be trusted. There is no built-in security, so it fails modern threat models completely.
This explains it pretty well I guess: https://theinvisiblethings.blogspot.de/2011/04/linux-security-circus-on-gui-isolation.html
ok. It confirms my understanding that X clients can listen to each other's events and modify them. But in xwayland, things are a bit different. https://lists.freedesktop.org/archives/wayland-devel/2014-January/012777.htm... As the thread suggests, if there is a separate X server instance per xwayland application, they won't be able to snoop on each other.
Btw, to fully prevent keylogging on wayland you need to do more, e.g. by sandboxing, since there are ways to work around the security of wayland where the default linux security model is weaker than that of the wayland architecture.
More info here: https://www.reddit.com/r/linux/comments/23mj49/wayland_is_not_immune_to_keyloggers/
Exactly. If I am running chromium with firejail, which whitelists what chromium can do to the file system (even better with --private), the browser cannot tamper with .profile/.bash_profile or .ssh. -- Regards Shridhar
On 05.02.2017 at 06:38, Shridhar Daithankar wrote:
This point is about the insecurity of the X Window System architecture, which basically assumes that all applications are to be trusted. There is no built-in security, so it fails modern threat models completely.
This explains it pretty well I guess: https://theinvisiblethings.blogspot.de/2011/04/linux-security-circus-on-gui-isolation.html
ok. It confirms my understanding that X clients can listen to each other's events and modify them.
But in xwayland, things are a bit different.
https://lists.freedesktop.org/archives/wayland-devel/2014-January/012777.htm...
As the thread suggests, if there is a separate X server instance per xwayland application, they won't be able to snoop on each other.
Sounds like what some sandboxing tools try to do with xpra and other additional X instances. However, the default on wayland/xwayland is as described. You can easily try it with weston. Just install it and enter 'weston', and you will get a weston instance where you can try this out with xinput.
Btw, to fully prevent keylogging on wayland you need to do more, e.g. by sandboxing, since there are ways to work around the security of wayland where the default linux security model is weaker than that of the wayland architecture.
More info here: https://www.reddit.com/r/linux/comments/23mj49/wayland_is_not_immune_to_keyloggers/
Exactly. If I am running chromium with firejail, which whitelists what chromium can do to the file system(even better with --private); the browser cannot tamper with .profile/.bash_profile or .ssh.
Not so sure using firejail will not actually decrease security in light of the recent wave of local root exploits...
On Sun, Feb 05, 2017 at 11:08:09AM +0530, Shridhar Daithankar wrote:
ok. It confirms my understanding that X clients can listen to each other's events and modify them.
But in xwayland, things are a bit different.
https://lists.freedesktop.org/archives/wayland-devel/2014-January/012777.htm...
As the thread suggests, if there is a separate X server instance per xwayland application, they won't be able to snoop on each other.
Yes, and you don't need wayland for that... If copy-paste between apps is not required, xephyr should be sufficient. AFAUI, selinux sandbox does that https://dwalsh.fedorapeople.org/SELinux/Presentations/sandbox.pdf .
Btw, to fully prevent keylogging on wayland you need to do more, e.g. by sandboxing, since there are ways to work around the security of wayland where the default linux security model is weaker than that of the wayland architecture.
More info here: https://www.reddit.com/r/linux/comments/23mj49/wayland_is_not_immune_to_keyloggers/
Exactly. If I am running chromium with firejail, which whitelists what chromium can do to the file system(even better with --private); the browser cannot tamper with .profile/.bash_profile or .ssh.
See, this is the problem: Why would a browser need these files? File access should only be possible with user interaction (via a file-open dialog). Cheers, -- Leonid Isaev
On Saturday 4 February 2017 11:00:12 PM IST Leonid Isaev wrote:
Exactly. If I am running chromium with firejail, which whitelists what chromium can do to the file system(even better with --private); the browser cannot tamper with .profile/.bash_profile or .ssh.
See, this is the problem: Why would a browser need these files? File access should only be possible with user interaction (via a file-open dialog).
Ideally, it doesn't. But programs have bugs, and it's nice to restrict them when those bugs happen. Chromium is just an example. Here is something firejail (again, an example sandbox) would prevent. https://blog.mozilla.org/security/2015/08/06/firefox-exploit-found-in-the-wi... -- Regards Shridhar
Based on the given links and comments I could not decide on a clear course of action. If only we had continuous builds of Chromium with the Ozone-Wayland implementation. Buying a Chromebook may not be the worst idea after all. At least this sounds promising: https://youtu.be/4PflCyiULO4?t=2h31m32s https://docs.google.com/document/d/1WPdUbaJ6_UVxsJ6hLnDpGR-eMvS6j-0tF_TZ62DM... Or maybe I'll decide on a read-only filesystem, which is inconvenient and unsuitable for me and my two simple little laptops running 'n rolling Arch. Maarten Baert wrote (in 2014):
As long as Wayland isn't used together with some form of sandboxing, holes in the underlying system won't protect you from keyloggers.
As an amateur, it is hard for me to identify likely attack vectors. I would like to see a package with a ran{somware,domness} detection daemon in the official repos, and learn more about machine learning security models. Have there been cases of X client mimicry or click-jacking? I'm sure a compositor doesn't care about that. I'm particularly cautious about GUI clicking... I often look at the source of a web page, or use a browser extension that allows me to automatically scrape the target url, as opposed to clicking, which could trigger anything beyond my control. So I'm not sure about the idea presented here: http://mupuf.org/blog/2014/03/18/managing-auth-ui-in-linux/ Steve D. Lazaro wrote:
It’s important to separate authentication from authorisation so that spoofing does not compromise valuable tokens. (...) An authorisation token is typically a one-time use object generated by a trusted authority (the compositor) and used by the system controlling access to privileged interfaces (the WSM). Such tokens can be distributed by having the user interact with an authorisation UI controlled by the compositor.
I've written down a silly idea (off topic) in a gist: "Can password typing in the browser be made less obvious for a keylogger?" https://gist.github.com/sharethewisdom/062da46347c93f778e0fae8d30e87090 I've been unsharing and chrooting for a while. I think I'll symlink most of my configs to a read-only folder, owned by a 'myname.conf' user, and I'll try to read more about SELinux, MACs etc. cheers, Bart
On Fri, 2017-02-03 at 17:49 +0100, Bart De Roy via arch-general wrote:
hello
I've been postponing looking into browser isolation since I started using Wayland about a year ago.
Does anyone have pointers, experiences or comments on this topic with regard to Xwayland? If I'd want to disassociate parts of chromiums execution context, what are common, good options?
cheers, Bart
The vast majority of Chromium's code already runs in far tighter sandboxes than can be made externally via namespaces, MAC, etc. Using an outer sandbox can help in the case where a sandbox bypass exploits the browser's broker process responsible for managing the renderers. The easiest way out is often a kernel exploit, despite the extremely restricted system call whitelist which doesn't even include calls like open(...). If you want to strengthen the existing sandbox, a hardened kernel goes a long way toward mitigating one of the two primary attack vectors for escaping the sandbox.

There might be some value in containing the file access of the outer sandbox even if it's not really contained, because an attacker might only be able to influence it to incorrectly open any file, etc. if they only have code execution in a renderer without a code execution exploit for the outer part. I don't think there have been many bugs like that though; it's mostly just a full sandbox escape, in which case the outer sandbox would actually need to contain usage of X11, pulseaudio, dbus, etc. So you definitely need more than simply MAC or namespaces + X11 / pulse access granted. Indirect access to X11 via another instance of it isn't great either.
On Wed, 1 Feb 2017 13:16:12 -0700, Leonid Isaev wrote:
So, why don't you just build your own kernel? It takes only 20 mins...
I agree that users should build the kernel on their own if they want special features, but on many old machines it takes much longer to build a kernel based on a default config. 180 minutes on my machine. It might be that some hardware is broken, but in the past even the quickest builds, when all the hardware was definitely OK, required around 90 minutes. FWIW I don't need this namespace thingy, but I build rt-patched kernels myself. I know what I'm talking about, because I last built a kernel on my 2.1 dual-core, 4 GiB RAM yesterday ...

[root@moonstudio weremouse]# systemd-nspawn -qD /mnt/archlinux/ 2>/dev/null pacman -Qi linux-rt-rosaplüsch | grep Build
Build Date : Wed Feb 1 14:24:47 2017

... and I build tons of kernels, not just for the connected hardware, but with more or less default configs.
On Wed, 2017-02-01 at 19:51 +0100, sivmu wrote:
On 01.02.2017 at 07:20, Daniel Micay via arch-general wrote:
On Wed, 2017-02-01 at 00:18 +0100, sivmu wrote:
Summary:
Arch Linux is one of the few, if not the only distribution that still disables or restricts the use of unprivileged user namespaces, a feature that is used by many applications and containers to provide secure sandboxing. There have been request to turn this feature on since Linux 3.13 (in 2013) but they are still being denied. While there may have been some reason for doing so a few year ago, leading to many distributions like Debian and Red Hat to restrict its use to privileged users via a kernel patch (they never disabled it completely), today arch seems to be the only distribution to block this feature. Even conservative distros like Debian 8 and 9 have this feature fully enabled.
There are still endless unprivileged user namespace vulnerabilities
You failed to name even one.
I already listed several in the linked issue reports.
it's a nearly useless feature.
That's a baseless claim, that was already proved wrong in my first post by the many applications that use this feature.
That doesn't demonstrate that it's useful relative to the alternatives. It enables unprivileged OS containers but isn't really any use for app containers.
The uid/gid mapping is poorly thought out and immature without the necessary environment (filesystem support, etc.) built around it,
Something like mount namespaces, that are designed to be used in combination with user namespaces?
That has nothing to do with this.
but no one really wants it for that reason. They want it because it started pretending that it can offer something that it can't actually deliver safely.
Again a claim without proof
The proof is easy to find. You're the one making a proposal but you clearly haven't done your research. It's not my job to spoon feed you.
There are much better ways to do unprivileged sandboxes with significantly less risk than CLONE_NEWUSER or setuid executables where the user controls the environment.
And yet you fail to name even one alternative. Please do
Uh, yeah, I did. M
Anything depending on this mechanism instead of properly designed plumbing for it is simply lazy garbage.
Another baseless and arrogant claim
Not baseless and it's not arrogant to point out that this is a bad feature for app containers. It's the truth.
Lack of a proper layer on top of the kernel providing infrastructure (systemd is so far from that) on desktop/server Linux is not going to be fixed by delegating everything to the kernel even when it massively increases attack surface.
I would like to suggest that Arch stop disabling this feature in future kernel versions.
Reasoning:
The original reason to block user namespaces was a number of security issues that allowed unprivileged users to access features they should not have access to. Because user namespaces provide isolated user environments with access to privileged features such as other namespaces (inside that isolated namespace only), it should be obvious that this feature had to be designed carefully in order not to harm security outside the namespace. Even though there have been issues, this feature is now considered stable enough for distros like Debian and Red Hat to allow its use even for unprivileged users.
There's still an unrelenting torrent of security issues from it.
Name one
Look at the discussion on the issue report or do basic research on the topic. It's your proposal, if you haven't done even basic research that's your problem.
Maybe wait until that stops before proposing this.
Vulnerabilities in kernel features will never cease to exist. If we disabled everything with potential vulnerabilities, we would not have a kernel anymore.
It's a very niche feature with better alternatives for sandboxes and app containers. It exposes all of the netfilter administration code and tons of other networking and mount code as new attack surface.
I don't think it's going to stop because of how this feature is designed. It greatly increases the attack surface and there isn't going to be a mitigating factor that changes this situation. It's a fundamentally flawed, garbage feature and the arguments for it are nonsense. There are better ways to do this, by simply not tying your hands and refusing to implement anything in user space but instead pretending that all common features must happen in the kernel despite major security risks and poor semantics.
So this is actually about you not liking this feature, without naming any real reason, while making a bunch of baseless accusations and claims.
There are no baseless claims / accusations here. I am not going to spoon feed you information that's already in the issue reports, easily found on oss-security, etc.
Moreover, there are many applications that use this feature to provide or enhance security. Among them are:
lxc, systemd-nspawn, docker, flatpak, bubblewrap, firejail, firefox, chromium
There's one well-written sandbox there (Chromium's usage) and it doesn't require this feature.
Wrong
https://chromium.googlesource.com/chromium/src/+/master/docs/linux_sandboxing.md
And for suid:
Quote: „The intention is if you want to run Chrome and only use the namespace sandbox, you can set --disable-setuid-sandbox. But if you do so on a host without appropriate kernel support for the namespace sandbox, Chrome will loudly refuse to run.“
That switch isn't passed, which should be pretty clear considering that it runs.
Source: https://bugs.chromium.org/p/chromium/issues/detail?id=598454
That doesn't do what you think it does.
They also don't need this feature on platforms where they have control like Android, since they can implement it in a saner way where it doesn't massively increase kernel attack surface.
Android uses minijail (default app sandbox in Android 7), which relies on user namespaces… Just opened a terminal on my Android device and checked it. It's inside a user namespace.
No, that's incorrect and you're just further demonstrating how far out of your depth you are here. Google doesn't even enable user namespaces in the kernel in AOSP / stock Android for Nexus/Pixel. Doubt that any other vendors are enabling it. It doesn't use any namespaces other than mount namespaces as part of the multi-user emulation for backwards compatibility. It certainly doesn't use minijail as the 'default app sandbox'. It uses minijail as a library to factor out common patterns involved in privilege dropping, like dropping capabilities. The app sandbox is done with uid/gid pairs (AIDs) and the full system SELinux policy (untrusted_app domain for regular non-platform apps and isolated_app for isolatedProcess services). Permissions are generally done with IPC checks but some are done with secondary groups. Before it had SELinux, it was just using the POSIX user/group/permission model to implement the app sandbox and that's still the base. It has no use case at all for user namespaces, and process namespaces would not really have much use either due to hidepid=2 since 7.x combined with uid isolation. It would just be a mess since they turn a process into a subreaper / secondary init. Trying to explain to me how Android works from skimming and misinterpreting news / documentation and making incorrect assumptions is not going to get you far.
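Whether a process really runs inside a non-initial user namespace can be checked from its uid_map file in procfs instead of being inferred from documentation. The following is a minimal sketch under stated assumptions: it assumes the usual procfs layout, and the helper names parse_id_map and looks_like_initial_userns are made up for illustration. It is only a heuristic: a single identity mapping over the full 32-bit range is what the initial namespace shows, but a privileged process could create a child namespace with the same mapping.

```python
from pathlib import Path

# Size of the mapping the initial user namespace reports (assumed heuristic marker).
FULL_RANGE = 4294967295


def parse_id_map(text):
    """Parse uid_map/gid_map contents into (inside_id, outside_id, count) tuples."""
    entries = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == 3:
            entries.append(tuple(int(field) for field in fields))
    return entries


def looks_like_initial_userns(text):
    """Heuristic: one identity mapping over the whole range suggests the initial namespace."""
    return parse_id_map(text) == [(0, 0, FULL_RANGE)]


if __name__ == "__main__":
    path = Path("/proc/self/uid_map")
    if path.exists():
        print("looks like initial user namespace:",
              looks_like_initial_userns(path.read_text()))
    else:
        # The file is typically absent when the kernel lacks user namespace support.
        print("no uid_map found; kernel likely built without user namespaces")
```

A mapping such as "0 1000 1" (uid 0 inside mapped to uid 1000 outside) is what an unprivileged sandbox would typically show, and the heuristic reports it as not being the initial namespace.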
After working with sandboxing applications for several months, it seems clear to me that disabling user namespaces decreases the security of the system significantly. Some of these applications cannot provide core features because user namespaces are missing. Others have significant security features disabled for this reason. But the worst part is how some of these projects dealt with the missing feature: many use suid bits to execute the application as root to get access to the features they would have inside a user namespace. And for those who have worked with suid applications and their security, it will not be surprising that they have failed to do this securely, leading to more than a few local root exploits.
There's no hard requirement that they have to do it that way. They can use a service where the user doesn't control the environment used to spawn the application (like setuid) or full control over the environment where it ends up being run. Application containers *really* do not need this feature. It's far better to do it in a more secure, saner way vs. exposing massive kernel attack surface.
Again no real life example for an alternative
Android, which was given as an example. You are going out of the way to ignore all of the information that's right in front of you.
Taking firejail just as an example: (CVE-2017-5207) (CVE-2017-5206) (CVE-2017-5180) (CVE-2016-10122) (CVE-2016-10118) (CVE-2016-9016)
A junk, insecure application is not a reason to greatly reduce kernel security for everyone.
I actually do not really want to argue with you about this one except that your claim for reduced kernel security is greatly exaggerated.
Not exaggerated at all. It adds a huge amount of attack surface. It's no joke to suddenly expect all of netfilter to handle untrusted administration, and that's just one of a bunch of API surfaces added as attack surface for unprivileged users.
And please note that the security of firejail would be greatly increa
And that is just from the last release...
None of these issues would have been possible if user namespaces could be used, which is, by the way, what bubblewrap does if the feature is available. Since it isn't on Arch, they have to use suid too (but bubblewrap is designed with security in mind for a change, so no known issues so far). Chromium is another case that has to use suid for its sandbox, and while I consider the developers very skilled with regard to security (they built a very nice broker architecture sandbox on Windows too), there have been local root exploits in the Linux version of Chromium because of this.
Chromium has had a couple vulnerabilities there. Can you point to any that are full blown privesc?
Local root exploit in chromium: https://bugs.chromium.org/p/chromium/issues/detail?id=76542
you are welcome
If you read past the initial information (seems to be a consistent problem for you), you'll see that they determined that it didn't seem to really be a privilege escalation bug after all. I was already aware of that issue, and it's exactly why I asked for a real privilege escalation bug caused by chrome-sandbox because I am not aware of one.
I can point to 30+ kernel bugs from the past couple years that are privesc via user namespaces. Also those kernel vulnerabilities impact *everyone*.
Please do point out some from the last 6 months.
CVE-2016-8655 is a simple one that comes to mind. Not accessible attack surface to unprivileged users without user namespaces. There are a bunch more though!
Even when looking only at the surface of this problem, it becomes clear that this causes far more problems than it solves. Considering that Arch will be, or already is, the only Linux distribution to disable this feature, developers of future applications will have to choose between dropping support for Arch and continuing to use features like suid that, unlike user namespaces, pose a real security threat.
Nope, you're just ignoring / misrepresenting the facts here and failing to present a real proposal. Try again, and propose something where attack surface is not increased beyond the cases where this feature is actually required. Enabling it globally when people install something like Chromium doesn't qualify.
User namespaces are far more real of a security threat than these fears you're presenting here, and doing it as you propose would impose those risks on EVERYONE so that the few can have their poorly designed container features based on this.
I do not share your assessment of the threat posed by userns and you have given me no reason to share your opinion yet
You haven't done any real research, so you're in no position to draw conclusions.
Therefore I urge the people responsible to reconsider their choice and enable user namespaces in future kernel versions of Arch Linux.
Present a real proposal taking into account the very real reasons to avoid this that you are skirting around. If you aren't going to present technical solutions to the problems, which are certainly possible and could be implemented, then I don't think anything should be changed.
Solutions to change user namespaces inside the kernel? This isn’t the kernel mailing list and arch won’t patch the kernel, so I do not get what you are proposing.
The kernel change that's required is already upstream
I have thoughts on how to enable this while containing the attack surface but seeing as I have no interest in the feature and have a lot of far more important work to do than working on toy features, I don't plan on doing anything about this myself.
Please share this either here or via direct mail and I will work on this as far as I am able.
Bug reports regarding user namespaces:
https://bugs.archlinux.org/task/36969 https://bugs.archlinux.org/task/49337
To make this short, please provide sources for your claim regarding the kernel attack surface of user namespaces, and name alternatives that provide the same functionality.
To conclude:
The people responsible for linux distributions like debian, red hat and pretty much all other distros, as well as many developers of sandboxing applications including the tails and chromium people all believe this feature is a useful tool to provide unprivileged sandbox applications worth the risk.
I haven't seen any such assessment by them about the risk vs. reward and comparing it to alternative solutions from a security perspective. The Chromium change has a lot more to do with them only really caring about ChromeOS (where they can disable userns everywhere but the spawning process) and Android (where it's not needed due to a better alternative and user namespaces aren't available). An argument from authority is worth nothing particularly when those people are not actually saying what you claim they are, and here is someone that works full time on infosec that's telling you otherwise.
Without any real proof of the claims you made in your post, it seems you rather have a personal grudge against this feature while at the same time saying you know better than all these people. Sorry, but that is pretty rich.
Don't get me wrong, I would love to discuss this with you all day long, but I would like to ask you to reconsider your tone, as you sound incredibly arrogant when you put yourself above all those voices/people without providing real proof for your arguments.
You're the one making a proposal without having done much research into it, and you're going out of your way to only skim the available info. Not spoon feeding you information != lack of sources. You're the one making a proposal about this. It's on you to get yourself up to speed about the recent bugs exposed as privilege escalation vulns due to user namespaces. It's easy to find a dozen of them from the past 6 months simply from basic Google searches / oss-security, but there are many more if you actually dig deeper into CVEs and bug fixes backported to stables for these issues without CVEs.
I have no actual knowledge of the details you guys are arguing over, but it seems to me OP has yet to learn that a simpler and more secure environment can only be achieved by using fewer, powerful components instead of many useless ones. Okay, there might be a point from which the number of components adds enough obscurity to the overall system that simply nobody will bother trying to break it, but really, what's the big deal. I think sandboxing is a concept that is too reminiscent of Windows tools such as bullguard, which simply doesn't translate well enough (read: at all) to unixes, so I recommend checking whether you can trust the few things you use instead of adding a whole bunch of Potemkin barriers. It's actually less work overall, too. cheers! mar77i
On 01.02.2017 at 21:21, Daniel Micay via arch-general wrote:
it's a nearly useless feature.
That's a baseless claim, that was already proved wrong in my first post by the many applications that use this feature.
That doesn't demonstrate that it's useful relative to the alternatives. It enables unprivileged OS containers but isn't really any use for app containers.
Pretty much all well-known container programs use this. I wonder why, if there is no use for it. Also, I would still like to see a simple alternative to unprivileged namespaces for sandboxing apps. How do you provide something like bubblewrap without user namespaces? And no, that Android example below is not the same as long as there is no simple way to use it (which I am not aware of)
but no one really wants it for that reason. They want it because it started pretending that it can offer something that it can't actually deliver safely.
Again a claim without proof
The proof is easy to find. You're the one making a proposal but you clearly haven't done your research. It's not my job to spoon feed you.
I do know some of the discussions about this feature on the kernel mailing list. But the opinions even there are not as clear as you want to make us believe.
There are much better ways to do unprivileged sandboxes with significantly less risk than CLONE_NEWUSER or setuid executables where the user controls the environment.
And yet you fail to name even one alternative. Please do
Uh, yeah, I did. M
Sorry but 'M' ? I don't get it.
Anything depending on this mechanism instead of properly designed plumbing for it is simply lazy garbage.
Another baseless and arrogant claim
Not baseless and it's not arrogant to point out that this is a bad feature for app containers. It's the truth.
even if that is correct, it is a pretty weird/funny argument to say it's the truth ... :)
There's still an unrelenting torrent of security issues from it.
Name one
Look at the discussion on the issue report or do basic research on the topic. It's your proposal, if you haven't done even basic research that's your problem.
I did, but we differ about the interpretations (see below)
Maybe wait until that stops before proposing this.
Vulnerabilities in kernel features will never cease to exist. If we disabled everything with potential vulnerabilities, we would not have a kernel anymore.
It's a very niche feature with better alternatives for sandboxes and app containers. It exposes all of the netfilter administration code and tons of other networking and mount code as new attack surface.
Point taken
Android uses minijail (default app sandbox in Android 7), which relies on user namespaces… Just opened a terminal on my Android device and checked it. It's inside a user namespace.
No, that's incorrect and you're just further demonstrating how far out of your depth you are here. Google doesn't even enable user namespaces in the kernel in AOSP / stock Android for Nexus/Pixel. Doubt that any other vendors are enabling it. It doesn't use any namespaces other than mount namespaces as part of the multi-user emulation for backwards compatibility. It certainly doesn't use minijail as the 'default app sandbox'. It uses minijail as a library to factor out common patterns involved in privilege dropping, like dropping capabilities. The app sandbox is done with uid/gid pairs (AIDs) and the full system SELinux policy (untrusted_app domain for regular non-platform apps and isolated_app for isolatedProcess services). Permissions are generally done with IPC checks but some are done with secondary groups. Before it had SELinux, it was just using the POSIX user/group/permission model to implement the app sandbox and that's still the base. It has no use case at all for user namespaces, and process namespaces would not really have much use either due to hidepid=2 since 7.x combined with uid isolation. It would just be a mess since they turn a process into a subreaper / secondary init.
Trying to explain to me how Android works from skimming and misinterpreting news / documentation and making incorrect assumptions is not going to get you far.
Considering what you do for a living I believe you here. However that also means that A LOT of documentation about how chromium, android and minijail work is completely wrong. Which is kinda disturbing...
Again no real life example for an alternative
Android, which was given as an example. You are going out of the way to ignore all of the information that's right in front of you.
I am talking about alternatives that provide the same functionality as the full set of namespaces, like bubblewrap does.
I can point to 30+ kernel bugs from the past couple years that are privesc via user namespaces. Also those kernel vulnerabilities impact *everyone*.
Please do point out some from the last 6 months.
CVE-2016-8655 is a simple one that comes to mind. Not accessible attack surface to unprivileged users without user namespaces. There are a bunch more though!
Now I get this. Your risk assessment includes all vulnerabilities in all parts of the kernel that are available to unprivileged users because of user namespaces. That does make sense, but: there are A LOT of features that provide similar access to these kernel parts and would make those vulnerabilities exploitable for normal users. That's why I do not share this assessment, although I have to admit that the attack surface provided by userns is by itself way larger than what other vectors offer. There was an interesting presentation somewhere that talked about this, but I cannot find it right now, so I concede this point for now and agree with your assessment of the risks involved
Solutions to change user namespaces inside the kernel? This isn’t the kernel mailing list and arch won’t patch the kernel, so I do not get what you are proposing.
The kernel change that's required is already upstream
Please provide a link, I would very much like to see this but could not find it so far.
The people responsible for linux distributions like debian, red hat and pretty much all other distros, as well as many developers of sandboxing applications including the tails and chromium people all believe this feature is a useful tool to provide unprivileged sandbox applications worth the risk.
I haven't seen any such assessment by them about the risk vs. reward and comparing it to alternative solutions from a security perspective. The Chromium change has a lot more to do with them only really caring about ChromeOS (where they can disable userns everywhere but the spawning process) and Android (where it's not needed due to a better alternative and user namespaces aren't available).
An argument from authority is worth nothing particularly when those people are not actually saying what you claim they are, and here is someone that works full time on infosec that's telling you otherwise.
You are right, there is no assessment by these people I can point to, but that was not what I was trying to say anyway. The point is: All those distros, everyone except arch has decided at some point to no longer restrict the use of unprivileged user namespaces. That's the result we have today; that cannot be denied. So in enabling this feature I do see a decision that takes the risks into account. You can of course claim they do not know what they are doing, but I think that would be pretty arrogant. In any case: Arch is the last distribution to disable this feature and I doubt this will go away anytime soon, plus more programs will rely on it. So even assuming that I am in no position to assess the risks involved, I think it would be obvious to question this decision when everyone else seems to think otherwise. Not that majorities are anything to go by, but the maintainers of other distros are not stupid either...
All those distros, everyone except arch has decided at some point to no longer restrict the use of unprivileged user namespaces.
In no way whatsoever does Arch restrict the use of unprivileged user namespaces. Rebuilding your kernel with them enabled is a trivial task for any user familiar with ABS. If you feel this strongly about it please write a wiki article about the benefits/tradeoffs and link it with the relevant application articles (Firejail, Security, etc.). Max
On 02.02.2017 at 05:10, Maxwell Anselm via arch-general wrote:
All those distros, everyone except arch has decided at some point to no longer restrict the use of unprivileged user namespaces.
In no way whatsoever does Arch restrict the use of unprivileged user namespaces. Rebuilding your kernel with them enabled is a trivial task for any user familiar with ABS. If you feel this strongly about it please write a wiki article about the benefits/tradeoffs and link it with the relevant application articles (Firejail, Security, etc.).
Max
This issue is about the default Arch kernel disabling user namespaces and the consequence that many applications have to use insecure workarounds like suid to still work on Arch. This has nothing to do with the general ability to use user namespaces on Arch; this is about the default kernel.
On Thu, 2 Feb 2017 05:13:46 +0100 sivmu <sivmu@web.de> wrote:
On 02.02.2017 at 05:10, Maxwell Anselm via arch-general wrote:
All those distros, everyone except arch has decided at some point to no longer restrict the use of unprivileged user namespaces.
In no way whatsoever does Arch restrict the use of unprivileged user namespaces. Rebuilding your kernel with them enabled is a trivial task for any user familiar with ABS. If you feel this strongly about it please write a wiki article about the benefits/tradeoffs and link it with the relevant application articles (Firejail, Security, etc.).
Max
This issue is about the default arch kernel disabling user namespaces and the consequence that many applications have to use insecure workarounds like suid to still work on arch.
This has nothing to do with the general ability to use user namespaces on Arch; this is about the default kernel.
You have said multiple times that Arch is restricting this. They're not. It's simply not there by default, like just about everything in Arch. Build your own kernel and move on.
On Thu, 2017-02-02 at 02:40 +0100, sivmu wrote:
On 01.02.2017 at 21:21, Daniel Micay via arch-general wrote:
it's a nearly useless feature.
That's a baseless claim, that was already proved wrong in my first post by the many applications that use this feature.
That doesn't demonstrate that it's useful relative to the alternatives. It enables unprivileged OS containers but isn't really any use for app containers.
Pretty much all well-known container programs use this. I wonder why, if there is no use for it.
Also, I would still like to see a simple alternative to unprivileged namespaces for sandboxing apps. How do you provide something like bubblewrap without user namespaces? And no, that Android example below is not the same as long as there is no simple way to use it (which I am not aware of)
Doing things properly is not easy.
but no one really wants it for that reason. They want it because it started pretending that it can offer something that it can't actually deliver safely.
Again a claim without proof
The proof is easy to find. You're the one making a proposal but you clearly haven't done your research. It's not my job to spoon feed you.
I do know some of the discussions about this feature on the kernel mailing list. But the opinions even there are not as clear as you want to make us believe.
The kernel configuration disables it by default. It enables UTS, IPC, PID and NET namespaces by default. That's the opinion from upstream on the sane default for a general purpose build: disabled. It is quite clear that it's a major security risk. It exposes an endless stream of privesc vulnerabilities from all of the attack surface it adds. That attack surface was never exposed like that before and the code is not at all robust against attackers, since it was only exposed to root users before. It is going to take years for it to settle down and become more like core kernel code that was already exposed, and it's always going to be a ton of extra attack surface.
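Which namespace types a given kernel was built with can be read from its configuration. The sketch below is illustrative only: the option names (CONFIG_UTS_NS, CONFIG_IPC_NS, CONFIG_PID_NS, CONFIG_NET_NS, CONFIG_USER_NS) are the kernel's Kconfig symbols, while the helper name namespace_options and the SAMPLE text are made up to mirror the defaults described above, not copied from any real kernel build.

```python
# Kconfig symbols for the namespace types mentioned above.
NAMESPACE_OPTIONS = (
    "CONFIG_UTS_NS",
    "CONFIG_IPC_NS",
    "CONFIG_PID_NS",
    "CONFIG_NET_NS",
    "CONFIG_USER_NS",
)

# Hypothetical excerpt of a kernel .config, for illustration only.
SAMPLE = """\
CONFIG_NAMESPACES=y
CONFIG_UTS_NS=y
CONFIG_IPC_NS=y
# CONFIG_USER_NS is not set
CONFIG_PID_NS=y
CONFIG_NET_NS=y
"""


def namespace_options(config_text):
    """Return {option: bool} telling which namespace options are built in."""
    state = {name: False for name in NAMESPACE_OPTIONS}
    for line in config_text.splitlines():
        name, sep, value = line.strip().partition("=")
        # Lines such as "# CONFIG_USER_NS is not set" contain no "=" and stay False.
        if sep and name in state:
            state[name] = value == "y"
    return state


if __name__ == "__main__":
    for option, enabled in namespace_options(SAMPLE).items():
        print(option, "enabled" if enabled else "disabled")
```

On a running system where the kernel exposes its configuration as /proc/config.gz, the same function could be fed the text read via gzip.open("/proc/config.gz", "rt").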
There are much better ways to do unprivileged sandboxes with significantly less risk than CLONE_NEWUSER or setuid executables where the user controls the environment.
And yet you fail to name even one alternative. Please do
Uh, yeah, I did. M
Sorry but 'M' ? I don't get it.
Anything depending on this mechanism instead of properly designed plumbing for it is simply lazy garbage.
Another baseless and arrogant claim
Not baseless and it's not arrogant to point out that this is a bad feature for app containers. It's the truth.
even if that is correct, it is a pretty weird/funny argument to say it's the truth ... :)
There's still an unrelenting torrent of security issues from it.
Name one
Look at the discussion on the issue report or do basic research on the topic. It's your proposal, if you haven't done even basic research that's your problem.
I did, but we differ about the interpretations (see below)
Maybe wait until that stops before proposing this.
Vulnerabilities in kernel features will never cease to exist. If we disabled everything with potential vulnerabilities, we would not have a kernel anymore.
It's a very niche feature with better alternatives for sandboxes and app containers. It exposes all of the netfilter administration code and tons of other networking and mount code as new attack surface.
Point taken
Android uses minijail (default app sandbox in Android 7), which relies on user namespaces… Just opened a terminal on my Android device and checked it. It's inside a user namespace.
No, that's incorrect and you're just further demonstrating how far out of your depth you are here. Google doesn't even enable user namespaces in the kernel in AOSP / stock Android for Nexus/Pixel. Doubt that any other vendors are enabling it. It doesn't use any namespaces other than mount namespaces as part of the multi-user emulation for backwards compatibility. It certainly doesn't use minijail as the 'default app sandbox'. It uses minijail as a library to factor out common patterns involved in privilege dropping, like dropping capabilities. The app sandbox is done with uid/gid pairs (AIDs) and the full system SELinux policy (untrusted_app domain for regular non-platform apps and isolated_app for isolatedProcess services). Permissions are generally done with IPC checks but some are done with secondary groups. Before it had SELinux, it was just using the POSIX user/group/permission model to implement the app sandbox and that's still the base. It has no use case at all for user namespaces, and process namespaces would not really have much use either due to hidepid=2 since 7.x combined with uid isolation. It would just be a mess since they turn a process into a subreaper / secondary init.
Trying to explain to me how Android works from skimming and misinterpreting news / documentation and making incorrect assumptions is not going to get you far.
Considering what you do for a living I believe you here.
However that also means that A LOT of documentation about how chromium, android and minijail work is completely wrong. Which is kinda disturbing...
The documentation isn't wrong. Chromium never claims to have a mandatory dependency on user namespaces since they've kept the setuid sandbox and Android's documentation *definitely* doesn't claim to use them. Android has never enabled user namespaces and has no use for them.
Again no real life example for an alternative
Android, which was given as an example. You are going out of the way to ignore all of the information that's right in front of you.
I am talking about alternatives that provide the same functionality as the full set of namespaces, like bubblewrap does.
I can point to 30+ kernel bugs from the past couple years that are privesc via user namespaces. Also those kernel vulnerabilities impact *everyone*.
Please do point out some from the last 6 months.
CVE-2016-8655 is a simple one that comes to mind. Not accessible attack surface to unprivileged users without user namespaces. There are a bunch more though!
Now I get this. Your risk assessment includes all vulnerabilities in all parts of the kernel that are available to unprivileged users because of user namespaces. That does make sense but:
There are A LOT of features that provide similar access to these kernel parts and would make those vulnerabilities exploitable for normal users. That's why I do not share this assessment, although I have to admit that the attack surface provided by userns is by itself way larger than what other vectors offer.
There was an interesting presentation somewhere that talked about this, but I cannot find it right now, so I concede this point for now and agree with your assessment of the risks involved
There's no other kernel feature exposing all of that attack surface.
Solutions to change user namespaces inside the kernel? This isn’t the kernel mailing list and arch won’t patch the kernel, so I do not get what you are proposing.
The kernel change that's required is already upstream
Please provide a link, I would very much like to see this but could not find it so far.
sysctl:
    user.max_cgroup_namespaces = 257166
    user.max_ipc_namespaces = 257166
    user.max_mnt_namespaces = 257166
    user.max_net_namespaces = 257166
    user.max_pid_namespaces = 257166
    user.max_user_namespaces = 257166
    user.max_uts_namespaces = 257166
A starting point is always setting max_user_namespaces to 0 by default, and enabling the feature at compile-time. The proper way to do this is not forcing people to toggle it on globally to use container software depending on it though. It's scoped per userns and it's meant to be only exposed where it's needed. The way the kernel implemented it makes this painful, but it's doable.
It would make sense to enable it, disabled by default via the sysctl, with a policy to not automatically enable it in packages, with the goal of a proper scoped implementation. Or just take a sane approach to sandboxing / app containers...
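As a rough illustration of the "compiled in, but disabled by default via the sysctl" idea, the sketch below parses sysctl-style "key = value" output and reports whether new user namespaces could be created under those limits. The helper names parse_sysctl and userns_creation_allowed are hypothetical, the SAMPLE text reuses the values quoted above, and treating a missing key as "not allowed" is an assumption made for this example.

```python
# Limits as quoted in the message above.
SAMPLE = """\
user.max_cgroup_namespaces = 257166
user.max_ipc_namespaces = 257166
user.max_mnt_namespaces = 257166
user.max_net_namespaces = 257166
user.max_pid_namespaces = 257166
user.max_user_namespaces = 257166
user.max_uts_namespaces = 257166
"""


def parse_sysctl(text):
    """Parse 'key = value' lines, as printed by sysctl, into a dict of ints."""
    limits = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if sep and value.strip().isdigit():
            limits[key.strip()] = int(value)
    return limits


def userns_creation_allowed(limits):
    """True if the limit permits at least one user namespace.

    A missing key is treated as 'not allowed' (assumption for this sketch).
    """
    return limits.get("user.max_user_namespaces", 0) > 0


if __name__ == "__main__":
    print("as quoted:", userns_creation_allowed(parse_sysctl(SAMPLE)))
    # The proposed default: feature compiled in, limit set to 0.
    print("proposed default:",
          userns_creation_allowed(parse_sysctl("user.max_user_namespaces = 0")))
```

With the proposed default, an administrator who needs the feature would raise user.max_user_namespaces explicitly; packages would not do it automatically.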
The people responsible for linux distributions like debian, red hat and pretty much all other distros, as well as many developers of sandboxing applications including the tails and chromium people all believe this feature is a useful tool to provide unprivileged sandbox applications worth the risk.
I haven't seen any such assessment by them about the risk vs. reward and comparing it to alternative solutions from a security perspective. The Chromium change has a lot more to do with them only really caring about ChromeOS (where they can disable userns everywhere but the spawning process) and Android (where it's not needed due to a better alternative and user namespaces aren't available).
An argument from authority is worth nothing particularly when those people are not actually saying what you claim they are, and here is someone that works full time on infosec that's telling you otherwise.
You are right there is no assessment of these people I can point to, but that was not what I was trying to say anyway.
The point is: All those distros, everyone except arch has decided at some point to no longer restrict the use of unprivileged user namespaces. That's the result we have today, that cannot be denied.
So I do see enabling this feature as a decision that took the risks into account. You can of course claim they do not know what they are doing, but I think that would be pretty arrogant.
The majority of people working on desktop Linux security definitely have no clue. It's a disaster and container tech is a terrible approach to addressing it that brings many drawbacks vs. better solutions elsewhere. I think it's laughable really. You can escape from all of these app container 'sandbox' implementations via pulseaudio / dbus. The only proper sandbox you named is the Chromium one, and it doesn't have a hard dependency on this. It's only one of the options to make it work, and they choose to do things differently elsewhere. It's all about the expedience of using an available feature for a platform that's not exactly first tier (desktop Linux) like ChromeOS, Android and Windows.
In any case: arch is the last distribution to disable this feature and I doubt this will go away anytime soon, plus more programs will rely on it.
It's not the last distribution to not have this enabled at all, and some of the distributions enabling it are constraining it to be disabled by default or only accessible to privileged users. The grsecurity patch set restricts user namespaces to privileged users too.
So even assuming that I am in no position to assess the risks involved, I think it would be obvious to question this decision when everyone else seems to think otherwise. Not that majorities are anything to go by but the maintainers of other distros are not stupid either...
On 02.02.2017 at 11:28, Daniel Micay via arch-general wrote:
On Thu, 2017-02-02 at 02:40 +0100, sivmu wrote:
On 01.02.2017 at 21:21, Daniel Micay via arch-general wrote:
it's a nearly useless feature.
That's a baseless claim, that was already proved wrong in my first post by the many applications that use this feature.
That doesn't demonstrate that it's useful relative to the alternatives. It enables unprivileged OS containers but isn't really any use for app containers.
Pretty much all well-known container programs use this. I wonder why, if there is no use for it.
Also I would still like to see a simple alternative for unprivileged namespaces to sandbox apps. How do you provide something like bubblewrap without user namespaces? And no that android example below is not the same as long as there is no simple way to use this (which I am not aware of)
Doing things properly is not easy.
That's a bad attitude. It sounds like proper implementations need to be difficult. That's not true. Security, and above all crypto, often fails because it is hard to apply. That is why people like Bruce Schneier have often talked about this. Dan Bernstein created the crypto library NaCl for that very reason: to allow the use of crypto without the overly complex and error-prone implementations needed with openssl. That is why this sentence is extremely wrong and dangerous. If there is no way to provide users or developers with easy tools to sandbox apps, then one has to be created. Just saying that doing things properly isn't easy will do more harm than features like user namespaces will ever be able to. And if I am not mistaken, that is pretty much what android does: it provides app developers with easy ways to drop privileges and sandbox their apps. Therefore I think the wish and need for easy ways to provide security is important. Bubblewrap is one of the concepts that I think do a great job of providing easy isolation of apps, even if they utilise namespaces for that purpose. (The Tor people seem to agree)
but no one really wants it for that reason. They want it because it started pretending that it can offer something that it can't actually deliver safely.
Again a claim without proof.
The proof is easy to find. You're the one making a proposal but you clearly haven't done your research. It's not my job to spoon feed you.
I do know some of the discussions about this feature on the kernel mailing list. But the opinions even there are not as clear as you want to make us believe.
The kernel configuration disables it by default. It enables UTS, IPC, PID and NET namespaces by default. That's the opinion from upstream on the sane default for a general purpose build: disabled.
That is news to me actually. I was under the impression that all namespaces were enabled by default. That changes things to some extent.
It is quite clear that it's a major security risk. It exposes an endless stream of privesc vulnerabilities from all of the attack surface it adds. That attack surface was never exposed like that before and the code is not at all robust against attackers, since it was only exposed to root users before. It is going to take years for it to settle down and become more like core kernel code that was already exposed, and it's always going to be a ton of extra attack surface.
I recently talked about this with IT Sec people with kernel inside knowledge and one strong opinion was that some enable this feature for that exact reason. Because like many other features it will not evolve if everyone disables it. Only the finding of vulnerabilities and design flaws will lead to secure kernel features.
The kernel change that's required is already upstream
Please provide a link, I would very much like to see this but could not find it so far.
sysctl:
    user.max_cgroup_namespaces = 257166
    user.max_ipc_namespaces = 257166
    user.max_mnt_namespaces = 257166
    user.max_net_namespaces = 257166
    user.max_pid_namespaces = 257166
    user.max_user_namespaces = 257166
    user.max_uts_namespaces = 257166
A starting point is always setting max_user_namespaces to 0 by default, and enabling the feature at compile-time. The proper way to do this is not forcing people to toggle it on globally to use container software depending on it though. It's scoped per userns and it's meant to be only exposed where it's needed. The way the kernel implemented it makes this painful, but it's doable. It would make sense to enable it, disabled by default via the sysctl, with a policy to not automatically enable it in packages, with the goal of a proper scoped implementation. Or just take a sane approach to sandboxing / app containers...
Thanks that is interesting. Is there a comprehensive documentation about how to use this somewhere?
The point is: All those distros, everyone except arch has decided at some point to no longer restrict the use of unprivileged user namespaces. That's the result we have today, that cannot be denied.
So I do see enabling this feature as a decision that took the risks into account. You can of course claim they do not know what they are doing, but I think that would be pretty arrogant.
The majority of people working on desktop Linux security definitely have no clue. It's a disaster and container tech is a terrible approach to addressing it that brings many drawbacks vs. better solutions elsewhere.
That's without a doubt correct, but kernel maintainers from distros like debian and red hat usually know what they are doing. That's why I wonder about the reason they enabled this.
I think it's laughable really. You can escape from all of these app container 'sandbox' implementations via pulseaudio / dbus.
Flatpak and firejail both block access to these services. Even with bubblewrap/namespace sandboxing alone you can easily block them by not granting access to the domain socket files and by using network namespaces to block abstract sockets. That does not seem like such a bad idea to me (I would not recommend firejail though). Not that there are not enough weak points in sandboxing as long as things like X11 are available. But especially for things like "parse this file and tell me if it is malicious", a sandbox in an empty user/mount/pid/ipc/network namespace with strong seccomp filters can be quite useful. Even more so if you need to do stuff like this anyway, sandboxed or not.
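To make the "empty namespace" idea above concrete, here is a rough sketch (not from the original message) that only builds a bubblewrap command line; it does not run anything. The helper name `bwrap_argv` and the chosen bind mounts are assumptions for illustration, the layout assumes a merged /usr as on Arch, and no seccomp filter is attached (bubblewrap would take a compiled filter via `--seccomp FD`).

```python
def bwrap_argv(command, ro_binds=("/usr",)):
    """Build an argv list for bwrap with user/pid/ipc/net/uts namespaces unshared."""
    argv = [
        "bwrap",
        "--unshare-user",
        "--unshare-pid",
        "--unshare-ipc",
        "--unshare-net",   # new network namespace: abstract sockets of the host are unreachable
        "--unshare-uts",
        "--die-with-parent",
        "--new-session",
    ]
    # Only the listed paths are visible read-only; socket directories such as
    # /run/user are never bound, so dbus/pulseaudio socket files do not exist inside.
    for path in ro_binds:
        argv += ["--ro-bind", path, path]
    # Compatibility symlinks for a merged-/usr layout (an assumption about the host).
    argv += ["--symlink", "usr/lib", "/lib", "--symlink", "usr/bin", "/bin"]
    argv += ["--proc", "/proc", "--dev", "/dev", "--tmpfs", "/tmp"]
    return argv + list(command)


if __name__ == "__main__":
    print(" ".join(bwrap_argv(["/usr/bin/file", "/tmp/sample.bin"])))
```

Whether this list is sufficient for a given program is untested here; a file to be parsed would still have to be made visible inside the sandbox, for example with an additional read-only bind.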
The only proper sandbox you named is the Chromium one, and it doesn't have a hard dependency on this. It's only one of the options to make it work, and they choose to do things differently elsewhere. It's all about the expedience of using an available feature for a platform that's not exactly first tier (desktop Linux) like ChromeOS, Android and Windows.
In any case: arch is the last distribution to disable this feature and I doubt this will go away anytime soon, plus more programs will rely on it.
It's not the last distribution to not have this enabled at all, and some of the distributions enabling it are constraining it to be disabled by default or only accessible to privileged users. The grsecurity patch set restricts user namespaces to privileged users too.
Is there any chance to get the arch main kernel to use such a patch for privileged user namespaces like with grsec? That would at least reduce the awful number of bug reports for many projects that use them this way.
On Thu, 2 Feb 2017 16:29:52 +0100, sivmu wrote:
Is there any chance to get the arch main kernel to use such a patch for privileged user namespaces like with grsec?
Hi, you could provide the kernel via the AUR and see how many votes it gets. Note "linux-grsec" is provided by "Community" and "linux" is part of "Core" and the "base" group. Your patched kernel might be migrated from the AUR to "Community" if a lot of users vote for it, and then somebody would have to be willing to maintain it. Regards, Ralf
On Thu, 2017-02-02 at 16:29 +0100, sivmu wrote:
On 02.02.2017 at 11:28, Daniel Micay via arch-general wrote:
On Thu, 2017-02-02 at 02:40 +0100, sivmu wrote:
On 01.02.2017 at 21:21, Daniel Micay via arch-general wrote:
it's a nearly useless feature.
That's a baseless claim, that was already proved wrong in my first post by the many applications that use this feature.
That doesn't demonstrate that it's useful relative to the alternatives. It enables unprivileged OS containers but isn't really any use for app containers.
Pretty much all well-known container programs use this. I wonder why, if there is no use for it.
Also I would still like to see a simple alternative for unprivileged namespaces to sandbox apps. How do you provide something like bubblewrap without user namespaces? And no that android example below is not the same as long as there is no simple way to use this (which I am not aware of)
Doing things properly is not easy.
That's a bad attitude. It sounds like proper implementations need to be difficult. That's not true. Security, and above all crypto, often fails because it is hard to apply. That is why people like Bruce Schneier have often talked about this. Dan Bernstein created the crypto library NaCl for that very reason: to allow the use of crypto without the overly complex and error-prone implementations needed with openssl.
That is why this sentence is extremely wrong and dangerous. If there is no way to provide users or developers with easy tools to sandbox apps, then one has to be created. Just saying that doing things properly isn't easy will do more harm than features like user namespaces will ever be able to.
Stop misrepresenting my position and arguing against strawmen. Providing easy tools to sandbox applications is important. Those tools should be robust and secure, which is the part that's not easy. That's clearly what I was talking about, but you're only interested in trying to score points without actually trying to understand but simply regurgitating talking points without having a clue. User namespaces are not at all necessary for providing unprivileged sandboxing. For example, SubgraphOS doesn't use user namespaces. You're the one doing harm, by wasting time and hurting the chances of someone actually getting what you want done. User namespaces had a higher chance of being enabled before you showed up, just like what happened with moving away from using md5/sha1 for hashes in PKGBUILDs and what happened with MAC.
And if I am not mistaken, that is pretty much what android does: it provides app developers with easy ways to drop privileges and sandbox their apps.
Therefore I think the wish and need for easy ways to provide security is important.
Bubblewrap is one of the concepts that I think do a great job on providing easy isolation of apps, even if they utilise namespaces for that purpose. (The Tor people seem to agree)
Another argument to authority? Tor tries to build a privacy-oriented browser on top of by far the least secure mainstream browser. Tails doesn't (currently) even use a PaX / grsecurity kernel despite it being fairly easy to integrate particularly for a distribution like that. It doesn't use a full system MAC policy either. Is that really the ivory tower you're looking to for advice?
but no one really wants it for that reason. They want it because it started pretending that it can offer something that it can't actually deliver safely.
Again a claim without proof.
The proof is easy to find. You're the one making a proposal but you clearly haven't done your research. It's not my job to spoon feed you.
I do know some of the discussions about this feature on the kernel mailing list. But the opinions even there are not as clear as you want to make us believe.
The kernel configuration disables it by default. It enables UTS, IPC, PID and NET namespaces by default. That's the opinion from upstream on the sane default for a general purpose build: disabled.
That is news to me actually. I was under the impression that all namespaces were enabled by default. That changes things to some extent.
All but user namespaces. The other namespaces do not present any real risk by default, since only privileged users have access to them. User namespaces greatly reduce security unless they are disabled at runtime. A system designed around user namespaces can make sure that EVERYTHING runs in a user namespace with user namespaces disabled within them, but on a general purpose system it's a big problem, and that's also the case for a general purpose system with a few sandboxes. Not to mention that user namespaces accomplish essentially nothing for security and the only reason you or anyone else wants them is the fact that they open up the unpriv use can of worms, but it's much less secure than alternative approaches to that. It is NOT necessary or even desirable for that. It is only wanted for the political reasons I've already gone into.
It is quite clear that it's a major security risk. It exposes an endless stream of privesc vulnerabilities from all of the attack surface it adds. That attack surface was never exposed like that before and the code is not at all robust against attackers, since it was only exposed to root users before. It is going to take years for it to settle down and become more like core kernel code that was already exposed, and it's always going to be a ton of extra attack surface.
I recently talked about this with IT Sec people with kernel inside knowledge and one strong opinion was that some enable this feature for that exact reason. Because like many other features it will not evolve if everyone disables it. Only the finding of vulnerabilities and design flaws will lead to secure kernel features.
It will never be secure. It will ALWAYS unnecessarily introduce a whole bunch of attack surface. Anyone making nonsense claims like that has absolutely no clue about security and should not be trusted with anything to do with it. Bug finding isn't ever going to change the fact that this approach is insane.
The kernel change that's required is already upstream
Please provide a link, I would very much like to see this but could not find it so far.
sysctl:
    user.max_cgroup_namespaces = 257166
    user.max_ipc_namespaces = 257166
    user.max_mnt_namespaces = 257166
    user.max_net_namespaces = 257166
    user.max_pid_namespaces = 257166
    user.max_user_namespaces = 257166
    user.max_uts_namespaces = 257166
A starting point is always setting max_user_namespaces to 0 by default, and enabling the feature at compile-time. The proper way to do this is not forcing people to toggle it on globally to use container software depending on it though. It's scoped per userns and it's meant to be only exposed where it's needed. The way the kernel implemented it makes this painful, but it's doable. It would make sense to enable it, disabled by default via the sysctl, with a policy to not automatically enable it in packages, with the goal of a proper scoped implementation. Or just take a sane approach to sandboxing / app containers...
Thanks that is interesting. Is there a comprehensive documentation about how to use this somewhere?
You know what to look for now.
The point is: All those distros, everyone except arch has decided at some point to no longer restrict the use of unprivileged user namespaces. That's the result we have today, that cannot be denied.
So I do see enabling this feature as a decision that took the risks into account. You can of course claim they do not know what they are doing, but I think that would be pretty arrogant.
The majority of people working on desktop Linux security definitely have no clue. It's a disaster and container tech is a terrible approach to addressing it that brings many drawbacks vs. better solutions elsewhere.
That's without a doubt correct, but kernel maintainers from distros like debian and red hat usually know what they are doing. That's why I wonder about the reason they enabled this.
More arguments to authority and again pretending that you can speak for developers of other projects.
I think it's laughable really. You can escape from all of these app container 'sandbox' implementations via pulseaudio / dbus.
Flatpak and firejail both block access to these services. Even with bubblewrap/namespace sandboxing alone you can easily block them by not granting access to the domain socket files and by using network namespaces to block abstract sockets. That does not seem like such a bad idea to me (I would not recommend firejail though).
Not that there are not enough weak points in sandboxing as long as things like X11 are available.
The point is that they are truly not usable for sandboxing desktop apps and that's the only niche that's not well covered already.
But especially for things like "parse this file and tell me if it is malicious", a sandbox in an empty user/mount/pid/ipc/network namespace with strong seccomp filters can be quite useful. Even more so if you need to do stuff like this anyway, sandboxed or not.
You don't need user namespaces to do sandboxing. You keep bringing up irrelevant points and assuming that sandboxing or unprivileged sandboxing depends on user namespaces. It does not. Other approaches to unprivileged use can be inherently safer too. Separately from that, generic sandboxing is a lot weaker than integrated sandboxing which you imply by stating *strong* seccomp filters. In that case namespaces are not really required.
The only proper sandbox you named is the Chromium one, and it doesn't have a hard dependency on this. It's only one of the options to make it work, and they choose to do things differently elsewhere. It's all about the expedience of using an available feature for a platform that's not exactly first tier (desktop Linux) like ChromeOS, Android and Windows.
In any case: arch is the last distribution to disable this feature and I doubt this will go away anytime soon, plus more programs will rely on it.
It's not the last distribution to not have this enabled at all, and some of the distributions enabling it are constraining it to be disabled by default or only accessible to privileged users. The grsecurity patch set restricts user namespaces to privileged users too.
Is there any chance to get the arch main kernel to use such a patch for privileged user namespaces like with grsec? That would at least reduce the awful number of bug reports for many projects that use them this way.
I doubt it. Anyway, linux-grsec is in the official repositories. No one really wants / needs user namespaces though. They want *unprivileged* access to namespaces. The uid/gid mapping stuff provided by user namespaces doesn't have much use when you are already using a different approach to unprivileged access. You can use dedicated uid/gids from reserved ranges instead. The uid/gid mapping features may become less half-baked in the future as support appears via filesystems, etc. for actually doing something useful with them but right now there's little point.
On 02.02.2017 at 17:45, Daniel Micay via arch-general wrote:
SubgraphOS doesn't use user namespaces.
It also is not a lightweight solution that compares to the tools in question for that matter. But I get your point.
I was under the impression that all namespaces were enabled by default. That changes things to some extent.
All but user namespaces. The other namespaces do not present any real risk by default, since only privileged users have access to them. User namespaces greatly reduce security unless they are disabled at runtime. A system designed around user namespaces can make sure that EVERYTHING runs in a user namespace with user namespaces disabled within them, but on a general purpose system it's a big problem, and that's also the case for a general purpose system with a few sandboxes. Not to mention that user namespaces accomplish essentially nothing for security and the only reason you or anyone else wants them is the fact that they open up the unpriv use can of worms, but it's much less secure than alternative approaches to that. It is NOT necessary or even desirable for that. It is only wanted for the political reasons I've already gone into.
I get it. User namespaces provide a wide attack surface to the kernel and decrease security and, as you pointed out, the only reason to use them is to get access to privileged features like other namespaces as an unprivileged user. The reason I, and I dare say likely many others, cling to this feature is the impressive isolation namespaces provide for sandboxing, and I do not see comparable ways to do this. But I agree that we need secure sandboxing solutions. Using bubblewrap with suid on a system without unprivileged user namespaces would be the closest thing I know right now.
Not that there are not enough weak points in sandboxing as long as things like X11 are available.
The point is that they are truly not usable for sandboxing desktop apps and that's the only niche that's not well covered already.
Would you care to elaborate on this point? As said before, I think things like flatpak/bubblewrap can be used if carefully designed. If your point is that they do not allow normal users to use them safely without detailed knowledge of sandboxing weak points, I would agree. But other than that, it is kind of lost on me what draws you to this conclusion. Or rather, what would qualify as a secure sandbox in your opinion?
Separately from that, generic sandboxing is a lot weaker than integrated sandboxing which you imply by stating *strong* seccomp filters. In that case namespaces are not really required.
In regards to sandboxing desktop applications, seccomp would not be enough. Even wayland cannot be secured without further isolation: https://github.com/MaartenBaert/wayland-keylogger And I have yet to find a successful attempt to filter socket communication for blocking dbus. Still hoping for bus1 to evolve soon.
I doubt it. Anyway, linux-grsec is in the official repositories. No one really wants / needs user namespaces though. They want *unprivileged* access to namespaces.
Then I will drop this topic as it has become clear to me that you know what you are talking about here.
On 02/02/2017 10:29 AM, sivmu wrote:
On 02.02.2017 at 11:28, Daniel Micay via arch-general wrote:
On Thu, 2017-02-02 at 02:40 +0100, sivmu wrote:
On 01.02.2017 at 21:21, Daniel Micay via arch-general wrote:
it's a nearly useless feature.
That's a baseless claim, that was already proved wrong in my first post by the many applications that use this feature.
That doesn't demonstrate that it's useful relative to the alternatives. It enables unprivileged OS containers but isn't really any use for app containers.
Pretty much all well-known container programs use this. I wonder why, if there is no use for it.
Also I would still like to see a simple alternative for unprivileged namespaces to sandbox apps. How do you provide something like bubblewrap without user namespaces? And no that android example below is not the same as long as there is no simple way to use this (which I am not aware of)
Doing things properly is not easy.
That's a bad attitude. It sounds like proper implementations need to be difficult. That's not true. Security, and above all crypto, often fails because it is hard to apply. That is why people like Bruce Schneier have often talked about this. Dan Bernstein created the crypto library NaCl for that very reason: to allow the use of crypto without the overly complex and error-prone implementations needed with openssl.
That is why this sentence is extremely wrong and dangerous. If there is no way to provide users or developers with easy tools to sandbox apps, then one has to be created. Just saying that doing things properly isn't easy will do more harm than features like user namespaces will ever be able to.
And if I am not mistaken, that is pretty much what android does: it provides app developers with easy ways to drop privileges and sandbox their apps.
Therefore I think the wish and need for easy ways to provide security is important.
Bubblewrap is one of the concepts that I think do a great job on providing easy isolation of apps, even if they utilise namespaces for that purpose. (The Tor people seem to agree)
Up until here, I was watching this thread with some interest, despite knowing very little about security myself. But I've finally realized you are blatantly trolling. It took a while, despite your extremely aggressive attitude towards people who actually know what they are talking about and disagree with you, but I like to give people the benefit of the doubt...
This is *so wrong*, for multiple meanings of the word wrong. You're not even comparing apples to oranges, you're comparing apples to... I don't know, maybe small decorative handcarved wooden knickknacks purporting to be sourced from a Native American reservation.
Having someone who works full time on infosec and is one of the core developers for Arch Linux tell you "designing properly-secure backends for sandboxing that don't have security holes -- either through design or bugs -- is hard work and therefore not easy to accomplish" and responding "OMG you're evil and dangerous and have a bad attitude and stuff, because you are promulgating the belief that security libraries should have inscrutable APIs which make it harder for downstream developers to make use of them" is just a flat-out mudslinging lie.
You have proven that your only interest in starting this thread is to troll, sling mud at the people responsible for disabling your precious features, and stir up trouble in the process. Please consider taking a break from the internet while you cool down.
Also I strongly urge everyone else here to do as I did, and add this thread to your spam filter. Continuing to reply to this trollish behavior can only cause more fighting, it will most assuredly not produce useful results.
-- Eli Schwartz
So, why don't you compile your own kernel? Using abs and changing the config-file is the only thing you'd have to do.
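For readers wondering what "changing the config-file" amounts to, here is a small sketch (not from the original message) that flips the kernel option for user namespaces in the text of a kernel config file. `CONFIG_USER_NS` is the upstream option name; the helper `enable_option` and the sample text are made up for illustration, and fetching the package sources and rebuilding the kernel are separate steps not shown here.

```python
import re


def enable_option(config_text, option="CONFIG_USER_NS"):
    """Return config_text with `option` set to y, replacing any existing setting."""
    pattern = re.compile(rf"^(# {option} is not set|{option}=.*)$", re.MULTILINE)
    wanted = f"{option}=y"
    if pattern.search(config_text):
        return pattern.sub(wanted, config_text)
    # Option not mentioned at all: append it at the end.
    return config_text.rstrip("\n") + "\n" + wanted + "\n"


if __name__ == "__main__":
    # Hypothetical excerpt of a kernel config with user namespaces disabled.
    sample = "CONFIG_UTS_NS=y\n# CONFIG_USER_NS is not set\nCONFIG_PID_NS=y\n"
    print(enable_option(sample), end="")
```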
participants (12)
- Bart De Roy
- Bennett Piater
- Daniel Micay
- Doug Newgard
- Eli Schwartz
- Leonid Isaev
- Martin Kühne
- Maxwell Anselm
- Ralf Mardorf
- Shridhar Daithankar
- sivmu
- Uwe