On 02.02.2017 at 11:28, Daniel Micay via arch-general wrote:
On Thu, 2017-02-02 at 02:40 +0100, sivmu wrote:
On 01.02.2017 at 21:21, Daniel Micay via arch-general wrote:
it's a nearly useless feature.
That's a baseless claim that was already proven wrong in my first post by the many applications that use this feature.
That doesn't demonstrate that it's useful relative to the alternatives. It enables unprivileged OS containers but isn't really any use for app containers.
Pretty much all well-known container programs use this. I wonder why, if there is no use for it.
Also, I would still like to see a simple alternative to unprivileged namespaces for sandboxing apps. How do you provide something like bubblewrap without user namespaces? And no, that Android example below is not the same as long as there is no simple way to use it (and I am not aware of one).
Doing things properly is not easy.
That's a bad attitude. It sounds as if proper implementations need to be difficult. That's not true. Security, and crypto above all, often fails because it is hard to apply. People like Bruce Schneier have often talked about this. Dan Bernstein created the crypto library NaCl for that very reason: to allow the use of crypto without overly complex and error-prone implementations like those OpenSSL requires. That is why this sentence is extremely wrong and dangerous. If there is no way to provide users or developers with easy tools to sandbox apps, then one has to be created. Just saying that doing things properly isn't easy will do more harm than features like user namespaces ever will. And if I am not mistaken, that is pretty much what Android does: it provides app developers with easy ways to drop privileges and sandbox their apps. Therefore I think the wish and need for easy ways to provide security is important. Bubblewrap is one of the concepts that I think does a great job of providing easy isolation of apps, even if it uses namespaces for that purpose. (The Tor people seem to agree.)
but no one really wants it for that reason. They want it because it started pretending that it can offer something that it can't actually deliver safely.
Again, a claim without proof.
The proof is easy to find. You're the one making a proposal but you clearly haven't done your research. It's not my job to spoon feed you.
I do know some of the discussions about this feature on the kernel mailing list. But even there the opinions are not as clear-cut as you would have us believe.
The kernel configuration disables it by default. It enables UTS, IPC, PID and NET namespaces by default. That's the opinion from upstream on the sane default for a general purpose build: disabled.
That is news to me, actually. I was under the impression that all namespaces were enabled by default. That changes things to some extent.
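For anyone who wants to check which namespace options a given kernel build has enabled, here is a minimal sketch in Python. It only parses kernel config text; the sample config below is illustrative and not taken from any real kernel, and the helper name namespace_options is made up for this example. On a running system the text would typically come from /proc/config.gz, assuming the kernel was built with CONFIG_IKCONFIG_PROC.

```python
import re

# Namespace options named in this thread (these are real Kconfig symbols).
NAMESPACE_OPTIONS = (
    "CONFIG_UTS_NS",
    "CONFIG_IPC_NS",
    "CONFIG_PID_NS",
    "CONFIG_NET_NS",
    "CONFIG_USER_NS",
)


def namespace_options(config_text):
    """Return {option: True/False} for the namespace options above.

    An option counts as enabled only if the config has 'OPTION=y'.
    Lines like '# CONFIG_USER_NS is not set' leave it at False.
    """
    enabled = {name: False for name in NAMESPACE_OPTIONS}
    for line in config_text.splitlines():
        match = re.match(r"^(CONFIG_[A-Z0-9_]+)=(.*)$", line.strip())
        if match and match.group(1) in enabled:
            enabled[match.group(1)] = match.group(2) == "y"
    return enabled


# Illustrative sample only, mirroring the defaults described above.
SAMPLE_CONFIG = """\
CONFIG_NAMESPACES=y
CONFIG_UTS_NS=y
CONFIG_IPC_NS=y
# CONFIG_USER_NS is not set
CONFIG_PID_NS=y
CONFIG_NET_NS=y
"""

if __name__ == "__main__":
    for option, is_on in namespace_options(SAMPLE_CONFIG).items():
        print(option, "enabled" if is_on else "disabled")
```

To try it against a real kernel, the sample string could be replaced with gzip.open("/proc/config.gz", "rt").read(), provided that file exists on the system.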
It is quite clear that it's a major security risk. It exposes an endless stream of privesc vulnerabilities from all of the attack surface it adds. That attack surface was never exposed like that before and the code is not at all robust against attackers, since it was only exposed to root users before. It is going to take years for it to settle down and become more like core kernel code that was already exposed, and it's always going to be a ton of extra attack surface.
I recently talked about this with IT security people who have inside knowledge of the kernel, and one strong opinion was that some enable this feature for exactly that reason: like many other features, it will not evolve if everyone disables it. Only finding vulnerabilities and design flaws will lead to secure kernel features.
The kernel change that's required is already upstream
Please provide a link, I would very much like to see this but could not find it so far.
sysctl:
user.max_cgroup_namespaces = 257166
user.max_ipc_namespaces = 257166
user.max_mnt_namespaces = 257166
user.max_net_namespaces = 257166
user.max_pid_namespaces = 257166
user.max_user_namespaces = 257166
user.max_uts_namespaces = 257166
A starting point is always setting max_user_namespaces to 0 by default, and enabling the feature at compile-time. The proper way to do this is not forcing people to toggle it on globally to use container software depending on it though. It's scoped per userns and it's meant to be only exposed where it's needed. The way the kernel implemented it makes this painful, but it's doable. It would make sense to enable it, disabled by default via the sysctl, with a policy to not automatically enable it in packages, with the goal of a proper scoped implementation. Or just take a sane approach to sandboxing / app containers...
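As a rough illustration of the "disabled by default via the sysctl" idea, the following Python sketch parses sysctl-style output and reports whether creating user namespaces is currently permitted. The function names are hypothetical, and the sample text simply repeats values in the format shown above. It does not attempt the per-userns scoping described here; it only reads the global limit.

```python
def parse_sysctl(text):
    """Parse 'key = value' lines into a dict of integer limits."""
    limits = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if not sep:
            continue  # not a 'key = value' line
        try:
            limits[key.strip()] = int(value.strip())
        except ValueError:
            continue  # skip non-numeric values
    return limits


def user_namespaces_allowed(limits):
    """True if the limit is above 0, False if it is 0.

    Returns None when the key is absent, since nothing can be
    concluded on a kernel that does not expose this sysctl.
    """
    value = limits.get("user.max_user_namespaces")
    if value is None:
        return None
    return value > 0


# Sample in the same format as the sysctl output quoted above.
SAMPLE_OUTPUT = """\
user.max_net_namespaces = 257166
user.max_pid_namespaces = 257166
user.max_user_namespaces = 257166
"""

if __name__ == "__main__":
    limits = parse_sysctl(SAMPLE_OUTPUT)
    print("user namespaces allowed:", user_namespaces_allowed(limits))
```

On a live system the same value can be read from /proc/sys/user/max_user_namespaces, and a line such as user.max_user_namespaces = 0 in a file under /etc/sysctl.d/ would make "off" the boot-time default.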
Thanks, that is interesting. Is there comprehensive documentation on how to use this somewhere?
The point is: all those distros, everyone except Arch, have decided at some point to no longer restrict the use of unprivileged user namespaces. That's the result we have today, and it cannot be denied.
So in enabling this feature I do see a decision that takes the risks into account. You can of course claim they do not know what they are doing, but I think that would be pretty arrogant.
The majority of people working on desktop Linux security definitely have no clue. It's a disaster and container tech is a terrible approach to addressing it that brings many drawbacks vs. better solutions elsewhere.
That's without a doubt correct, but kernel maintainers from distros like Debian and Red Hat usually know what they are doing. That's why I am asking about the reason they enabled this.
I think it's laughable really. You can escape from all of these app container 'sandbox' implementations via pulseaudio / dbus.
Flatpak and firejail both block access to these services. Even with bubblewrap/namespace sandboxing alone you can easily block them by not granting access to the domain socket files and by using network namespaces to block abstract sockets. That does not seem like such a bad idea to me (I would not recommend firejail, though). Not that there aren't enough weak points in sandboxing as long as things like X11 are available. But especially for tasks like "parse this file and tell me if it is malicious", a sandbox in an empty user/mount/pid/ipc/network namespace with strong seccomp filters can be quite useful. Even more so if you need to do this kind of thing anyway, sandboxed or not.
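To make the "parse this file" case concrete, here is a hedged sketch in Python that only assembles a bwrap command line; it does not run anything. The function name, the /sample mount point and the example paths are made up for illustration, and the layout assumes a merged-/usr system where /lib, /lib64 and /bin are symlinks into /usr. The seccomp filter is left out, since bwrap expects it as a compiled BPF program on a file descriptor (--seccomp FD).

```python
import shlex


def bwrap_command(sample_path, parser_argv):
    """Build a bwrap argv for parsing one untrusted file.

    Nothing from /run, /tmp or $HOME is bound, so the D-Bus and
    PulseAudio socket files are not reachable, and the new network
    namespace keeps the host's abstract sockets out of reach.
    """
    return [
        "bwrap",
        # bwrap always creates a new mount namespace; add the others.
        "--unshare-user",
        "--unshare-pid",
        "--unshare-ipc",
        "--unshare-net",
        "--unshare-uts",
        "--die-with-parent",
        "--new-session",
        # Read-only system files (assumes a merged-/usr layout).
        "--ro-bind", "/usr", "/usr",
        "--symlink", "usr/lib", "/lib",
        "--symlink", "usr/lib64", "/lib64",
        "--symlink", "usr/bin", "/bin",
        "--proc", "/proc",
        "--dev", "/dev",
        "--tmpfs", "/tmp",
        # The only host file exposed to the parser, read-only.
        "--ro-bind", sample_path, "/sample",
        *parser_argv,
    ]


if __name__ == "__main__":
    # Hypothetical example: inspect a download with file(1).
    argv = bwrap_command("/home/user/suspicious.pdf", ["/usr/bin/file", "/sample"])
    print(shlex.join(argv))
```

The printed command can be inspected before running it; whether it works as-is depends on the distribution's filesystem layout and on unprivileged user namespaces being available.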
The only proper sandbox you named is the Chromium one, and it doesn't have a hard dependency on this. It's only one of the options to make it work, and they choose to do things differently elsewhere. It's all about the expedience of using an available feature for a platform that's not exactly first tier (desktop Linux) like ChromeOS, Android and Windows.
In any case: Arch is the last distribution to disable this feature, and I doubt this will go away anytime soon; on top of that, more programs will rely on it.
It's not the last distribution to not have this enabled at all, and some of the distributions enabling it are constraining it to be disabled by default or only accessible to privileged users. The grsecurity patch set restricts user namespaces to privileged users too.
Is there any chance of getting the main Arch kernel to use such a patch that restricts user namespaces to privileged users, as grsec does? That would at least reduce the awful number of bug reports for the many projects that use them this way.