[aur-general] The Impossible, or: Static analysis of PKGBUILDS (was Re: Enforcing AUR package quality)

Fri Mar 1 13:03:21 UTC 2019

On 2/28/19 5:41 PM, alad via aur-general wrote:
> That's the issue though, how do you do static analysis of a PKGBUILD - a random bash script which should include certain named functions and variables - without executing it? For example, mksrcinfo simply sources the PKGBUILD, i.e. evaluates it in bash.

You can't.
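
A minimal sketch of why sourcing is already execution. The PKGBUILD contents, the `side-effect` file name and the variable names are made up for illustration; the only point is that reading `pkgname` by sourcing also runs everything else at the top level.

```shell
#!/usr/bin/env bash
# Hypothetical PKGBUILD: the metadata looks harmless, but the top-level
# command runs as soon as the file is sourced.
tmpdir=$(mktemp -d)
cat > "$tmpdir/PKGBUILD" <<'EOF'
pkgname=demo
pkgver=1.0
touch "$PWD/side-effect"   # stands in for arbitrary code
EOF

# Read the metadata the way a sourcing tool would, inside a subshell.
pkgname_seen=$(cd "$tmpdir" && source ./PKGBUILD && printf '%s' "$pkgname")

# Check whether the "harmless" metadata read left something behind.
if [ -e "$tmpdir/side-effect" ]; then side_effect=yes; else side_effect=no; fi

echo "pkgname: $pkgname_seen"
echo "side effect happened: $side_effect"
```

The subshell keeps the variables out of the caller's environment, but it does nothing about the file system or anything else the sourced code touches.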

> The aura AUR helper has a side-project which tries to check PKGBUILDs for "security issues" in Haskell. I'm not sure how well this approach scales though.
> 
> https://github.com/aurapm/aura/blob/master/aura/lib/Aura/Pkgbuild/Security.hs

Please don't even consider "scaling" this approach, as it's based on broken assumptions about bash.

I recall tearing that one apart in #archlinux-aur when the developer was around.
Here's a short recap for anyone who seriously thinks this is a solid idea:

> https://github.com/aurapm/aura/blob/master/aura/lib/Aura/Pkgbuild/Security.hs#L53

All it does is effectively blacklist a couple of programs in PKGBUILD contexts.

The ScriptRunning test doesn't actually matter at all,
as `eval` and `bash` calls are redundant.
For a simple command, `eval $mycode` does the same thing as `$mycode`.

This undermines every one of the security checks,
as you can put whatever you're doing into a quoted string and then expand the variable to defeat any test.
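
A small sketch of that redundancy, assuming a simple command stored in a variable (the variable names are only illustrative). The first form is what a check for `eval` would flag; the second runs the same command without any blacklisted word.

```shell
#!/usr/bin/env bash
# A command held in a quoted string.
mycode='echo hax'

via_eval=$(eval $mycode)   # the form a blacklist of `eval` would flag
via_expansion=$($mycode)   # plain expansion: same command, nothing to flag

echo "eval:      $via_eval"
echo "expansion: $via_expansion"
```

Plain expansion only word-splits the string, so it covers simple commands; that is already enough to make a check for the literal word `eval` pointless.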

Here's something that will probably not be caught by anything right now.
The payload is stored in source=() as a URL fragment, which,
due to the nature of URL fragments, is irrelevant to the rest of the build process:
```
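# the fragment after '#' is base64 for a shell command
source=('https://coderobe.net/myprogram.tar.gz#ZWNobyBoYXg=')
# cut out the fragment, decode it, and run the decoded text as a command
<<< ${source[0]} cut -d'#' -f2 | $(base64 -d)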
```
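
To make the point concrete, here is a rough sketch that runs a word blacklist over those two lines and then actually executes them. The blacklist is hypothetical and is not aura's actual check; the temp file and variable names are likewise just for illustration. It assumes a `base64` that accepts `-d`, as the example above does.

```shell
#!/usr/bin/env bash
# The two lines from the example above, written to a scratch file.
pkgbuild=$(mktemp)
cat > "$pkgbuild" <<'EOF'
source=('https://coderobe.net/myprogram.tar.gz#ZWNobyBoYXg=')
<<< ${source[0]} cut -d'#' -f2 | $(base64 -d)
EOF

# Hypothetical blacklist of "dangerous" words, matched as whole words.
blacklist='eval|bash|sh|curl|wget|sudo'
if grep -Eqw "$blacklist" "$pkgbuild"; then
  verdict=flagged
else
  verdict=clean
fi

# Now evaluate the same text. Nothing is downloaded: the payload is
# decoded from the URL fragment and run as a command.
result=$(bash "$pkgbuild")
rm -f "$pkgbuild"

echo "scanner verdict: $verdict"
echo "output when run: $result"
```

The scanner sees only `cut` and `base64`, neither of which anyone would want to ban, while the command that really runs exists nowhere in the text being scanned.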

To my knowledge, there is no program that can statically parse bash in such a way
that it reliably figures out what code is actually executed, or even what code is present at all.
Given that, I think it becomes clear that any sort of automatic analysis of the security,
or even just the correctness, of a given PKGBUILD is futile.

This, plus what has already been said before about the reliability of namcap,
should be enough of an indicator that doing this without evaluation is currently effectively impossible.
Discussing it here, especially in a (derailed) TU application, just leads to bikeshedding and threads
long enough to repel anyone who isn't already involved.

Aside from that:
You can't even automatically figure out the dependencies of a given program.
Even ELF binaries can dlopen() arbitrary libraries at runtime, which may or may not be required.
A mandatory scoring system of the kind that has been proposed in this thread is anything but a good idea.
-- 
Rob (coderobe)

O< ascii ribbon campaign - stop html mail - www.asciiribbon.org
