[arch-general] user namespaces

Wed Feb 1 20:21:39 UTC 2017

On Wed, 2017-02-01 at 19:51 +0100, sivmu wrote:
> 
> On 01.02.2017 at 07:20, Daniel Micay via arch-general wrote:
> > On Wed, 2017-02-01 at 00:18 +0100, sivmu wrote:
> > > Summary:
> > > 
> > > Arch Linux is one of the few, if not the only distribution that
> > > still
> > > disables or restricts the use of unprivileged user namespaces, a
> > > feature
> > > that is used by many applications and containers to provide secure
> > > sandboxing.
> > > There have been requests to turn this feature on since Linux 3.13
> > > (in
> > > 2013) but they are still being denied. While there may have been
> > > some
> > > reason for doing so a few years ago, leading many distributions
> > > like
> > > Debian and Red Hat to restrict its use to privileged users via a
> > > kernel
> > > patch (they never disabled it completely), today arch seems to be
> > > the
> > > only distribution to block this feature. Even conservative distros
> > > like
> > > Debian 8 and 9 have this feature fully enabled.
> > 
> > There are still endless unprivileged user namespace vulnerabilities
> 
> 
> You failed to name even one.

I already listed several in the linked issue reports.

> > it's a nearly useless feature. 
> 
> That's a baseless claim, that was already proved wrong in my first
> post
> by the many applications that use this feature.

That doesn't demonstrate that it's useful relative to the alternatives.
It enables unprivileged OS containers but isn't really any use for app
containers.

> > The uid/gid mapping is poorly thought out
> > and immature without the necessary environment (filesystem support,
> > etc.) built around it, 
> 
> Something like mount namespaces, that are designed to be used in
> combination with user namespaces?

That has nothing to do with this.

> > but no one really wants it for that reason. They
> > want it because it started pretending that it can offer something
> > that
> > it can't actually deliver safely.
> 
> Again a claim without proof

The proof is easy to find. You're the one making a proposal but you
clearly haven't done your research. It's not my job to spoon feed you.

> > There are much better ways to do
> > unprivileged sandboxes with significantly less risk than
> > CLONE_NEWUSER
> > or setuid executables where the user controls the environment.
> 
> And yet you fail to name even one alternative. Please do

Uh, yeah, I did. M

> > Anything
> > depending on this mechanism instead of properly designed plumbing
> > for it
> > is simply lazy garbage.
> 
> Another baseless and arrogant claim

Not baseless and it's not arrogant to point out that this is a bad
feature for app containers. It's the truth.

> > Lack of a proper layer on top of the kernel
> > providing infrastructure (systemd is so far from that) on
> > desktop/server
> > Linux is not going to be fixed by delegating everything to the
> > kernel
> > even when it massively increases attack surface.
> > 
> > > I would like to suggest that arch stops disabling this feature in
> > > future kernel versions.
> > > 
> > > Reasoning:
> > > 
> > > The original reason to block user namespaces was a number of
> > > security
> > > issues that allowed unprivileged users to access features they
> > > should
> > > not have access to. Due to the nature of user namespaces to
> > > provide
> > > isolated user environments with access to privileged features like
> > > other
> > > namespaces (inside that isolated namespace only), it should be
> > > obvious
> > > that this feature had to be designed carefully in order not to
> > > harm
> > > the
> > > security outside the namespace. Even though there have been
> > > issues,
> > > this
> > > feature is now considered stable enough for distros like debian
> > > and
> > > red
> > > hat to allow its use even for unprivileged users.
> > 
> > There's still an unrelenting torrent of security issues from it. 
> 
> Name one

Look at the discussion on the issue report or do basic research on the
topic. It's your proposal, if you haven't done even basic research
that's your problem.

> > Maybe wait until that stops before proposing this. 
> 
> Vulnerabilities in kernel features will never cease to exist. If we
> disabled everything with potential vulnerabilities, we would not have a
> kernel anymore.

It's a very niche feature with better alternatives for sandboxes and app
containers. It exposes all of the netfilter administration code and tons
of other networking and mount code as new attack surface.

> > I don't think it's going to
> > stop because of how this feature is designed. It greatly increases
> > the
> > attack surface and there isn't going to be a mitigating factor that
> > changes this situation. It's a fundamentally flawed, garbage feature
> > and
> > the arguments for it are nonsense. There are better ways to do
> > this, by
> > simply not tying your hands and refusing to implement anything in
> > user
> > space but instead pretending that all common features must happen in
> > the
> > kernel despite major security risks and poor semantics.
> > 
> 
> So this is actually about you not liking this feature, without naming
> any
> real reason and while making a bunch of baseless accusations and claims.

There are no baseless claims / accusations here. I am not going to spoon
feed you information that's already in the issue reports, easily found
on oss-security, etc.

> > > Moreover there are many applications that use this feature to
> > > provide
> > > or
> > > enhance security
> > > Among them are:
> > > 
> > > lxc, systemd-nspawn, docker, flatpak, bubblewrap, firejail,
> > > firefox,
> > > chromium
> > 
> > There's one well-written sandbox there (Chromium's usage) and it
> > doesn't
> > require this feature. 
> 
> Wrong
>
> https://chromium.googlesource.com/chromium/src/+/master/docs/linux_sandboxing.md
> 
> And for suid:
> 
> Quote:
> „The intention is if you want to run Chrome and only use the namespace
> sandbox, you can set --disable-setuid-sandbox.  But if you do so on a
> host without appropriate kernel support for the namespace sandbox,
> Chrome will loudly refuse to run.“

That switch isn't passed, which should be pretty clear considering that
it runs.

> Source:
> https://bugs.chromium.org/p/chromium/issues/detail?id=598454

That doesn't do what you think it does.
> 
> 
> > They also don't need this feature on platforms
> > where they have control like Android, since they can implement it in
> > a
> > saner way where it doesn't massively increase kernel attack surface.
> > 
> 
> Android uses minijail (default app sandbox in android 7), which relies
> on user namespaces…
> Just opened a terminal on my android and checked it. It's inside a user
> namespace.

No, that's incorrect and you're just further demonstrating how far out
of your depth you are here. Google doesn't even enable user namespaces
in the kernel in AOSP / stock Android for Nexus/Pixel. Doubt that any
other vendors are enabling it. It doesn't use any namespaces other than
mount namespaces as part of the multi-user emulation for backwards
compatibility. It certainly doesn't use minijail as the 'default app
sandbox'. It uses minijail as a library to factor out common patterns
involved in privilege dropping, like dropping capabilities. The app
sandbox is done with uid/gid pairs (AIDs) and the full system SELinux
policy (untrusted_app domain for regular non-platform apps and
isolated_app for isolatedProcess services). Permissions are generally
done with IPC checks but some are done with secondary groups. Before it
had SELinux, it was just using the POSIX user/group/permission model to
implement the app sandbox and that's still the base. It has no use case
at all for user namespaces, and process namespaces would not really have
much use either due to hidepid=2 since 7.x combined with uid isolation.
It would just be a mess since they turn a process into a subreaper /
secondary init.
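
For anyone who wants to check a claim like "it's inside a user namespace" rather than guess, here is a minimal sketch. It assumes a Linux /proc and relies on the fact that the initial user namespace exposes a single identity mapping, "0 0 4294967295", in /proc/<pid>/uid_map. The helper names are made up for illustration, and the check is only a heuristic: a privileged process can give a child namespace the same full-range mapping, and on a kernel built without user namespace support the file does not exist at all.

```python
from pathlib import Path

# Length of the identity mapping shown in the initial user namespace (2**32 - 1).
FULL_RANGE = 4294967295


def parse_id_map(text):
    """Parse uid_map/gid_map contents into (inside_id, outside_id, length) tuples."""
    entries = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == 3:
            entries.append(tuple(int(f) for f in fields))
    return entries


def looks_like_initial_userns(entries):
    """Heuristic: the initial namespace maps the whole id range onto itself."""
    return entries == [(0, 0, FULL_RANGE)]


def in_initial_userns(pid="self"):
    """Return True/False for the given pid, or None if uid_map cannot be read
    (for example when the kernel has no user namespace support)."""
    path = Path("/proc") / str(pid) / "uid_map"
    try:
        return looks_like_initial_userns(parse_id_map(path.read_text()))
    except OSError:
        return None


if __name__ == "__main__":
    print("initial user namespace:", in_initial_userns())
```

A result of None is the interesting case here: it means the uid_map file is not there to read, which is what a kernel without the feature enabled would be expected to show.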

Trying to explain to me how Android works from skimming and
misinterpreting news / documentation and making incorrect assumptions is
not going to get you far.

> > > After working with sandboxing applications for several months, it
> > > seems
> > > clear to me that disabling user namespaces decreases the security
> > > of
> > > the
> > > system significantly. Some of these applications can not provide
> > > core
> > > features due to user namespaces missing. Others have significant
> > > security features disabled for this reason. But the worst part is
> > > how
> > > some of these projects dealt with the missing feature: Many are
> > > using
> > > suid bits to execute the application as root to get access to the
> > > features they would have inside a user namespace. And for those
> > > who
> > > have
> > > worked with suid applications and their security it will not be
> > > surprising that they have failed to do this securely, leading to
> > > not
> > > just a few local root exploits.
> > 
> > There's no hard requirement that they have to do it that way. They
> > can
> > use a service where the user doesn't control the environment used to
> > spawn the application (like setuid) or full control over the
> > environment
> > where it ends up being run. Application containers *really* do not
> > need
> > this feature. It's far better to do it in a more secure, saner way
> > vs.
> > exposing massive kernel attack surface.
> > 
> 
> Again no real life example for an alternative

Android, which was given as an example. You are going out of the way to
ignore all of the information that's right in front of you.

> > > Taking firejail just as an example:
> > > (CVE-2017-5207)
> > > (CVE-2017-5206)
> > > (CVE-2017-5180)
> > > (CVE-2016-10122)
> > > (CVE-2016-10118)
> > > (CVE-2016-9016)
> > 
> > A junk, insecure application is not a reason to greatly reduce
> > kernel
> > security for everyone.
> > 
> 
> I actually do not really want to argue with you about this one except
> that your claim for reduced kernel security is greatly exaggerated.

Not exaggerated at all. It adds a huge amount of attack surface. It's no
joke to suddenly expect all of netfilter to handle untrusted
administration, and that's just one of a bunch of API surfaces added as
attack surface for unprivileged users.

> And please note that the security of firejail would be greatly increa
> 
> > > And that is just from the last release...
> > > 
> > > None of these issues would have been possible if user namespaces
> > > could
> > > be
> > > used, which is btw. what bubblewrap does if the feature is
> > > available,
> > > but since it isn’t on arch they have to use suid too (but
> > > bubblewrap
> > > is
> > > designed with security in mind for a change, so no known issues so
> > > far)
> > > Chromium is another case that has to use suid to use its sandbox
> > > and
> > > while I consider the developers very skilled in regards to
> > > security
> > > (they build a very nice broker architecture sandbox on windows
> > > too)
> > > there have been local root exploits in the linux version of
> > > chromium
> > > because of this.
> > 
> > Chromium has had a couple vulnerabilities there. Can you point to
> > any
> > that are full blown privesc?
> 
> Local root exploit in chromium:
> https://bugs.chromium.org/p/chromium/issues/detail?id=76542
> 
> you are welcome

If you read past the initial information (seems to be a consistent
problem for you), you'll see that they determined that it didn't seem to
really be a privilege escalation bug after all. I was already aware of
that issue, and it's exactly why I asked for a real privilege escalation
bug caused by chrome-sandbox because I am not aware of one.

> > I can point to 30+ kernel bugs from the
> > past couple years that are privesc via user namespaces. Also those
> > kernel vulnerabilities impact *everyone*.
> > 
> 
> Please do point out some from the last 6 months.

CVE-2016-8655 is a simple one that comes to mind. That attack surface
isn't accessible to unprivileged users without user namespaces. There are
a bunch more though!
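
Whether that attack surface is reachable on a given machine comes down to whether an unprivileged process can create a user namespace at all. A minimal sketch of a probe, assuming Linux and a libc that exports unshare(2): it forks a child, calls unshare(CLONE_NEWUSER) there and reports whether the call succeeded. The function name is made up for illustration. The result only means something when run as a regular user, since root can create the namespace regardless.

```python
import ctypes
import os

CLONE_NEWUSER = 0x10000000  # value from <linux/sched.h>


def can_create_userns():
    """Return True if unshare(CLONE_NEWUSER) succeeds in a forked child.

    The probe runs in a child process so the caller's own namespaces
    are left untouched.
    """
    if not hasattr(os, "fork"):
        return False
    pid = os.fork()
    if pid == 0:
        # Child: attempt the unshare and report the outcome via the exit status.
        status = 1
        try:
            libc = ctypes.CDLL(None, use_errno=True)
            if libc.unshare(CLONE_NEWUSER) == 0:
                status = 0
        except (OSError, AttributeError):
            # No usable libc handle or no unshare symbol: treat as unavailable.
            status = 1
        finally:
            os._exit(status)
    _, wait_status = os.waitpid(pid, 0)
    return os.WIFEXITED(wait_status) and os.WEXITSTATUS(wait_status) == 0


if __name__ == "__main__":
    print("unprivileged user namespaces usable:", can_create_userns())
```

On a kernel where the feature is disabled or restricted to privileged users, this would be expected to return False for a regular user, which is the state being argued over in this thread.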

> > > Even while looking at the surface of this problem it becomes clear
> > > this
> > > causes way more problems than it solves. Considering arch will be
> > > or
> > > already is the only linux distribution to disable this feature,
> > > developers of future applications will have to choose between
> > > dropping support for arch or continuing to use features like suid that
> > > pose
> > > a real security threat, as opposed to user namespaces.
> > 
> > Nope, you're just ignoring / misrepresenting the facts here and
> > failing
> > to present a real proposal. Try again, and propose something where
> > attack surface is not increased beyond the cases where this feature
> > is
> > actually required. Enabling it globally when people install
> > something
> > like Chromium doesn't qualify.
> > 
> > User namespaces are far more real of a security threat than these
> > fears
> > you're presenting here, and doing it as you propose would impose
> > those
> > risks on EVERYONE so that the few can have their poorly designed
> > container features based on this.
> > 
> 
> I do not share your assessment of the threat posed by userns and you
> have given me no reason to share your opinion yet

You haven't done any real research, so you're in no position to draw
conclusions.

> > > Therefore I urge the people responsible to reconsider their choice
> > > and
> > > enable user namespaces in future kernel versions of arch linux.
> > 
> > Present a real proposal taking into account the very real reasons to
> > avoid this that you are skirting around. If you aren't going to
> > present
> > technical solutions to the problems, which are certainly possible
> > and
> > could be implemented, then I don't think anything should be changed.
> > 
> 
> Solutions to change user namespaces inside the kernel? This isn’t the
> kernel mailing list and arch won’t patch the kernel, so I do not get
> what you are proposing.

The kernel change that's required is already upstream.

> > I have thoughts on how to enable this while containing the attack
> > surface but seeing as I have no interest in the feature and have a
> > lot
> > of far more important work to do than working on toy features, I
> > don't
> > plan on doing anything about this myself.
> > 
> 
> Please share this either here or via direct mail and I will work on
> this
> as far as I am able.
> 
> > > Bug reports regarding user namespaces:
> > > 
> > > https://bugs.archlinux.org/task/36969
> > > https://bugs.archlinux.org/task/49337
> 
> 
> To make this short, please provide sources for your claim regarding
> the
> kernel attack surface of user namespaces and alternatives to provide
> the
> same functionality.
> 
> 
> 
> To conclude:
> 
> 
> The people responsible for linux distributions like debian, red hat
> and
> pretty much all other distros, as well as many developers of
> sandboxing
> applications including the tails and chromium people all believe this
> feature is a useful tool to provide unprivileged sandbox applications
> worth the risk.

I haven't seen any such assessment by them about the risk vs. reward and
comparing it to alternative solutions from a security perspective. The
Chromium change has a lot more to do with them only really caring about
ChromeOS (where they can disable userns everywhere but the spawning
process) and Android (where it's not needed due to a better alternative
and user namespaces aren't available).

An argument from authority is worth nothing, particularly when those
people are not actually saying what you claim they are, and here is
someone who works full time on infosec telling you otherwise.

> Without any real proof of the claims you made in your post, it seems
> you
> rather have a personal grudge against this feature while at the same
> time saying you know better than all these people.
> Sorry but that is pretty rich.
> 
> Don’t get me wrong, I would love to discuss this with you all day
> long but I would like to ask you to reconsider your tone, as you sound
> incredibly arrogant when you put yourself above all those
> voices/people
> without providing real proof for your arguments.

You're the one making a proposal without having done much research into
it, and you're going out of your way to only skim the available info.

Not spoon feeding you information != lack of sources. You're the one
making a proposal about this. It's on you to get yourself up to speed
about the recent bugs exposed as privilege escalation vulns due to user
namespaces. It's easy to find a dozen of them from the past 6 months
simply from basic Google searches / oss-security, but there are many
more if you actually dig deeper into CVEs and bug fixes backported to
stables for these issues without CVEs.