[arch-general] user namespaces

sivmu sivmu at web.de
Thu Feb 2 15:29:52 UTC 2017



On 02.02.2017 at 11:28, Daniel Micay via arch-general wrote:
> On Thu, 2017-02-02 at 02:40 +0100, sivmu wrote:
>>
>> On 01.02.2017 at 21:21, Daniel Micay via arch-general wrote:
>>>>> it's a nearly useless feature. 
>>>>
>>>> That's a baseless claim, that was already proved wrong in my first
>>>> post
>>>> by the many applications that use this feature.
>>>
>>> That doesn't demonstrate that it's useful relative to the
>>> alternatives.
>>> It enables unprivileged OS containers but isn't really any use for
>>> app
>>> containers.
>>>
>>
>> Pretty much all well-known container programs use this. I wonder why,
>> if there is no use for it.
>>
>> Also I would still like to see a simple alternative for unprivileged
>> namespaces to sandbox apps.
>> How do you provide something like bubblewrap without user namespaces?
>> And no that android example below is not the same as long as there is
>> no
>> simple way to use this (which I am not aware of)
> 
> Doing things properly is not easy.
> 

That's a bad attitude. It sounds as if proper implementations have to be
difficult. That's not true. Security, and crypto above all, often fails
precisely because it is hard to apply. That is why people like Bruce
Schneier have talked about this so often. Dan Bernstein created the
crypto library NaCl for that very reason: to allow the use of crypto
without overly complex and error-prone implementations like the ones
OpenSSL requires.

That is why this sentence is extremely wrong and dangerous.
If there is no way to provide users or developers with easy tools to
sandbox apps, then one has to be created. Just saying that doing things
properly isn't easy will do more harm than features like user namespaces
will ever be able to.

And if I am not mistaken, that is pretty much what android does: it
provides app developers with easy ways to drop privileges and sandbox
their apps.

Therefore I think the wish and need for easy ways to provide security is
important.

Bubblewrap is one of the concepts that I think do a great job of
providing easy isolation of apps, even if they utilise namespaces for
that purpose. (The Tor people seem to agree.)

>>>>> but no one really wants it for that reason. They
>>>>> want it because it started pretending that it can offer
>>>>> something
>>>>> that
>>>>> it can't actually deliver safely.
>>>>
>>>> Again a claim without proof
>>>
>>> The proof is easy to find. You're the one making a proposal but you
>>> clearly haven't done your research. It's not my job to spoon feed
>>> you.
>>>
>>
>> I do know some of the discussions about this feature on the kernel
>> mailing list. But the opinions even there are not as clear as you want
>> to make us believe.
> 
> The kernel configuration disables it by default. It enables UTS, IPC,
> PID and NET namespaces by default. That's the opinion from upstream on
> the sane default for a general purpose build: disabled.
> 

That is news to me actually. I was under the impression that all
namespaces were enabled by default. That changes things to some extent.
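
For anyone who wants to check what their own kernel was built with, here
is a rough Python sketch. It assumes the kernel exposes its configuration
at /proc/config.gz (which needs CONFIG_IKCONFIG_PROC); the helper name
namespace_options and the list of options it looks at are my own choice
for illustration, not anything from upstream.

```python
import gzip
import os
import re

# Namespace-related Kconfig symbols to report on (illustrative selection).
NAMESPACE_OPTIONS = (
    "CONFIG_UTS_NS",
    "CONFIG_IPC_NS",
    "CONFIG_PID_NS",
    "CONFIG_NET_NS",
    "CONFIG_USER_NS",
)


def namespace_options(config_text):
    """Map each namespace option to True if it is set to 'y', else False."""
    enabled = {name: False for name in NAMESPACE_OPTIONS}
    for line in config_text.splitlines():
        # Disabled options appear as "# CONFIG_FOO is not set" and never match.
        match = re.fullmatch(r"(CONFIG_\w+)=y", line.strip())
        if match and match.group(1) in enabled:
            enabled[match.group(1)] = True
    return enabled


if __name__ == "__main__":
    path = "/proc/config.gz"  # assumption: only present with CONFIG_IKCONFIG_PROC
    if os.path.exists(path):
        with gzip.open(path, "rt") as fh:
            for name, on in namespace_options(fh.read()).items():
                print(name, "enabled" if on else "disabled")
    else:
        print("no", path, "on this system")
```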

> It is quite clear that it's a major security risk. It exposes an endless
> stream of privesc vulnerabilities from all of the attack surface it
> adds. That attack surface was never exposed like that before and the
> code is not at all robust against attackers, since it was only exposed
> to root users before. It is going to take years for it to settle down
> and become more like core kernel code that was already exposed, and it's
> always going to be a ton of extra attack surface.
> 

I recently talked about this with IT security people with kernel insider
knowledge, and one strong opinion was that some enable this feature for
exactly that reason: like many other features, it will not evolve if
everyone disables it. Only finding vulnerabilities and design flaws will
lead to secure kernel features.



>>> The kernel change that's required is already upstream
>>>
>>
>> Please provide a link, I would very much like to see this but could
>> not
>> find it so far.
> 
> sysctl:
> 
> user.max_cgroup_namespaces = 257166
> user.max_ipc_namespaces = 257166
> user.max_mnt_namespaces = 257166
> user.max_net_namespaces = 257166
> user.max_pid_namespaces = 257166
> user.max_user_namespaces = 257166
> user.max_uts_namespaces = 257166
> 
> A starting point is always setting max_user_namespaces to 0 by default,
> and enabling the feature at compile-time. The proper way to do this is
> not forcing people to toggle it on globally to use container software
> depending on it though. It's scoped per userns and it's meant to be only
> exposed where it's needed. The way the kernel implemented it makes this
> painful, but it's doable. It would make sense to enable it, disabled by
> default via the sysctl, with a policy to not automatically enable it in
> packages, with the goal of a proper scoped implementation. Or just take
> a sane approach to sandboxing / app containers...
> 

Thanks, that is interesting.
Is there comprehensive documentation on how to use this somewhere?
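
In the meantime, a minimal sketch of how the current value could be
inspected from Python. It assumes the sysctl is exposed as
/proc/sys/user/max_user_namespaces, which is how sysctl names normally
map to procfs; the function names are made up for illustration.

```python
from pathlib import Path

# Assumed procfs location of the user.max_user_namespaces sysctl.
USERNS_SYSCTL = Path("/proc/sys/user/max_user_namespaces")


def userns_limit(path=USERNS_SYSCTL):
    """Return the configured limit as an int, or None if the file is missing."""
    try:
        return int(Path(path).read_text().strip())
    except FileNotFoundError:
        return None


def describe(limit):
    """Turn the raw limit into a short human-readable status."""
    if limit is None:
        return "sysctl not found (older kernel, or user namespaces may not be compiled in)"
    if limit == 0:
        return "user namespaces are disabled via sysctl (limit 0)"
    return f"up to {limit} user namespaces may be created"


if __name__ == "__main__":
    print(describe(userns_limit()))
```

Setting the limit to 0, as suggested above, should be possible with
`sysctl -w user.max_user_namespaces=0` as root, or persistently with a
line `user.max_user_namespaces = 0` in a file under /etc/sysctl.d/.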




>>
>> The point is:
>> All those distros, everyone except arch has decided at some point to
>> no
>> longer restrict the use of unprivileged user namespaces. That's the
>> result we have today, that cannot be denied.
>>
>> So by enabling this feature I do see a decision that involves the
>> risks. You can of course claim they do not know what they are doing
>> but
>> I think that would be pretty arrogant to do.
> 
> The majority of people working on desktop Linux security definitely have
> no clue. It's a disaster and container tech is a terrible approach to
> addressing it that brings many drawbacks vs. better solutions elsewhere.
> 

That's without a doubt correct, but kernel maintainers from distros like
Debian and Red Hat usually know what they are doing. That's why I
wonder about the reason they enabled this.


> I think it's laughable really. You can escape from all of these app
> container 'sandbox' implementations via pulseaudio / dbus. 

Flatpak and firejail both block access to these services.
Even with bubblewrap/namespace sandboxing alone you can easily block
them by not granting access to the domain socket files and using network
namespaces to block abstract sockets.
That does not seem like such a bad idea to me.
(I would not recommend firejail though.)

Not that there aren't enough weak points in sandboxing as long as
things like X11 are available.

But especially for things like "parse this file and tell me if it is
malicious", a sandbox in an empty user/mount/pid/ipc/network namespace
with strong seccomp filters can be quite useful. Even more so if you
need to do stuff like this anyway, sandboxed or not.
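
To make that a bit more concrete, here is a rough sketch of how such a
bubblewrap invocation could be assembled from Python. It only builds the
argument list and does not run anything. The function name bwrap_argv,
the /sample mount point and the merged-/usr layout (symlinks for /lib,
/lib64 and /bin) are assumptions for illustration. The seccomp filter is
left out, because bwrap's --seccomp option expects an already compiled
filter on a file descriptor.

```python
def bwrap_argv(sample, command):
    """Build a bwrap command line that runs `command` on `sample`, read-only."""
    return [
        "bwrap",
        "--unshare-all",        # user (if possible), ipc, pid, net, uts, cgroup
        "--die-with-parent",    # kill the sandbox when the caller exits
        "--new-session",        # no access to the controlling terminal
        # Minimal read-only system view; the layout is an assumption.
        "--ro-bind", "/usr", "/usr",
        "--symlink", "usr/lib", "/lib",
        "--symlink", "usr/lib", "/lib64",
        "--symlink", "usr/bin", "/bin",
        "--proc", "/proc",
        "--dev", "/dev",
        "--tmpfs", "/tmp",
        # Only the file under inspection is visible, and only read-only.
        # Nothing under /run or $XDG_RUNTIME_DIR is bound, so the pulseaudio
        # and dbus socket files are simply absent; the new network namespace
        # takes care of abstract sockets.
        "--ro-bind", sample, "/sample",
        *command, "/sample",
    ]


if __name__ == "__main__":
    # Hypothetical usage: let file(1) look at a suspicious document.
    print(" ".join(bwrap_argv("/home/user/suspect.pdf", ["/usr/bin/file"])))
```

The resulting list could then be handed to subprocess.run() on a system
where bwrap is installed and user namespaces are usable.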


> The only
> proper sandbox you named is the Chromium one, and it doesn't have a hard
> dependency on this. It's only one of the options to make it work, and
> they choose to do things differently elsewhere. It's all about the
> expedience of using an available feature for a platform that's not
> exactly first tier (desktop Linux) like ChromeOS, Android and Windows.
> 
>> In any case: arch is the last distribution to disable this feature and
>> I
>> doubt this will go away anytime soon, plus more programs will rely on
>> it.
> 
> It's not the last distribution to not have this enabled at all, and some
> of the distributions enabling it are constraining it to be disabled by
> default or only accessible to privileged users. The grsecurity patch set
> restricts user namespaces to privileged users too.
> 

Is there any chance to get the main Arch kernel to use such a patch that
restricts user namespaces to privileged users, like grsec does?
That would at least reduce the awful number of bug reports for the many
projects that use them this way.


