On Mon, Feb 21, 2011 at 11:46:51AM -0500, keenerd wrote:
R.Daneel approves of these ideas! Some needed patches to my scraper (where did LocationID go?) but overall I'm very happy to see these changes.
The "LocationID" field is gone [1].
There are many ways to safely inspect tarballs, even to get around the zip bomb. I won't claim this is perfect, but it works for Aur3 and me.
Listing paths has already been mentioned. Dotfiles, dotdirs, src/, pkg/ are all simple red flags. For that matter, *any* directories are often a sign something is wrong. As mentioned, you can get the size of files before extracting. I don't know enough about tars to know if an attacker could lie about the size. But even if they can, time/memory quotas greatly limit the damage a DoS could achieve.
See [2] and [3].
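For illustration, the path and size checks discussed above might look roughly like the following Python sketch. This is not the scraper's or the AUR's actual code. The function name, the quota values and the assumed archive layout (a single top-level package directory containing the PKGBUILD) are all made up for the example, and the sizes it sums are simply whatever the tar headers declare.

```python
import io
import tarfile

MAX_TOTAL_SIZE = 5 * 1024 * 1024  # hypothetical quota, in bytes
MAX_MEMBERS = 200                 # hypothetical quota


def inspect_tarball(fileobj):
    """Return (path, reason) red flags by reading headers only; nothing is extracted."""
    flags = []
    total = 0
    with tarfile.open(fileobj=fileobj, mode="r:*") as tar:
        for count, member in enumerate(tar, start=1):
            if count > MAX_MEMBERS:
                flags.append((member.name, "too many members"))
                break
            # member.size is the size declared in the header, not a measured one.
            total += member.size
            if total > MAX_TOTAL_SIZE:
                flags.append((member.name, "declared size over quota"))
                break
            parts = [p for p in member.name.split("/") if p not in ("", ".")]
            # Assumption: parts[0] is the top-level package directory.
            inner = parts[1:]
            if any(p.startswith(".") for p in inner):
                flags.append((member.name, "dotfile or dotdir"))
            if any(p in ("src", "pkg") for p in inner):
                flags.append((member.name, "src/ or pkg/ present"))
            if len(inner) > 1 or (member.isdir() and inner):
                flags.append((member.name, "subdirectory"))
    return flags
```

The flags are only hints for a reviewer or a quota check. A time or memory limit around the whole call is still needed, since a compressed archive has to be decompressed just to walk its headers.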
Files can also be processed as streams. I originally did binary detection via "file" (which needs the contents to be extracted; it can't be streamed through stdin) but have since implemented a stream-based UTF-8 detector. Stream processing gets around disk attacks. Make the stream processor interruptible (when the time quota is exceeded) and it can return an estimate. By the way, I am not suggesting binary detection. It is just an example of something that lends itself very well to this method.
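As a rough sketch of the stream idea (not the detector mentioned above, whose code is not shown here), an interruptible UTF-8 check could be written in Python along these lines. The function name, the chunk size, the quota default and the "(verdict, complete)" return convention are assumptions made for the example. When the time quota runs out, it reports "valid so far" as its estimate.

```python
import codecs
import io
import time


def looks_like_utf8(stream, time_quota=1.0, chunk_size=65536):
    """Return (verdict, complete).

    verdict  -- False once invalid UTF-8 is seen, True otherwise
    complete -- False if the time quota ran out and verdict is only an estimate
    """
    # The incremental decoder copes with multi-byte sequences split across chunks.
    decoder = codecs.getincrementaldecoder("utf-8")()
    deadline = time.monotonic() + time_quota
    while True:
        if time.monotonic() > deadline:
            return True, False  # interrupted: nothing invalid seen so far
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        try:
            decoder.decode(chunk)
        except UnicodeDecodeError:
            return False, True
    try:
        # Flush: a truncated multi-byte sequence at end of input is an error.
        decoder.decode(b"", final=True)
    except UnicodeDecodeError:
        return False, True
    return True, True
```

Any object with a read() method will do as the stream. For a tarball, tarfile's extractfile() returns such an object for regular files, so the contents never need to be written to disk.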
That is already done with the "PKGBUILD" (which is still extracted).

[1] http://mailman.archlinux.org/pipermail/aur-dev/2011-January/001387.html
[2] https://bugs.archlinux.org/task/22991
[3] https://bugs.archlinux.org/task/22995