On Mon, Feb 21, 2011 at 11:46:51AM -0500, keenerd wrote:
R.Daneel approves of these ideas! My scraper needed some patches (where did LocationID go?), but overall I'm very happy to see these changes.
The "LocationID" field is gone.
There are many ways to safely inspect tarballs, even to get around the zip bomb. I won't claim this is perfect, but it works for Aur3 and me.
Listing paths has already been mentioned. Dotfiles, dotdirs, src/, pkg/ are all simple red flags. For that matter, *any* directories are often a sign something is wrong. As mentioned, you can get the size of files before extracting. I don't know enough about tars to know if an attacker could lie about the size. But even if they can, time/memory quotas greatly limit the damage a DoS could achieve.
See  and .
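For illustration, the header-only checks described above (listing paths, flagging dotfiles and src/ or pkg/, summing declared sizes) could be sketched roughly as follows with Python's standard tarfile module. This is not the code used by Aur3 or by the scraper; the function name inspect_tarball, the RED_FLAG_DIRS set and the MAX_TOTAL_BYTES quota are hypothetical names and values picked for the example, and it assumes the usual layout of one top-level directory per package. The sizes are the ones declared in the tar headers, so the caveat about an attacker lying about them still applies.

```python
import tarfile

# Hypothetical values for this sketch; tune them for a real deployment.
MAX_TOTAL_BYTES = 1_000_000
RED_FLAG_DIRS = {"src", "pkg"}


def inspect_tarball(fileobj, max_total=MAX_TOTAL_BYTES):
    """Return a list of warnings based on tar headers only; nothing is extracted."""
    warnings = []
    total = 0
    # "r:*" lets tarfile detect the compression; fileobj must be seekable.
    with tarfile.open(fileobj=fileobj, mode="r:*") as tar:
        for member in tar:
            parts = [p for p in member.name.split("/") if p not in ("", ".")]
            if any(p.startswith(".") for p in parts):
                warnings.append(f"dotfile or dotdir: {member.name}")
            if any(p in RED_FLAG_DIRS for p in parts):
                warnings.append(f"build directory: {member.name}")
            # Assumes one top-level package directory; anything deeper is flagged.
            if member.isdir() and len(parts) > 1:
                warnings.append(f"nested directory: {member.name}")
            if not (member.isfile() or member.isdir()):
                warnings.append(f"special member (link/device): {member.name}")
            # Size as declared in the header, before any extraction.
            total += member.size
            if total > max_total:
                warnings.append(f"declared size exceeds {max_total} bytes")
                break
    return warnings
```

Running this under external time/memory limits (ulimit or similar) would cover the case where the declared sizes cannot be trusted.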
Files can also be processed as streams. I originally did binary detection via "file" (which needs the contents to be extracted; it can't be streamed through stdin) but have since implemented a stream-based UTF8 detector. Stream processing gets around disk attacks. Make the stream processor interruptible (when the time quota is exceeded) and it can return an estimate. By the way, I am not suggesting binary detection. It is just an example of something that lends itself very well to this method.
That is already done with the "PKGBUILD" (which is still extracted).
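To make the stream-processing idea concrete, here is a minimal sketch of an interruptible UTF-8 check in Python. It is not keenerd's detector, which is not shown in this thread; the function name looks_like_utf8, the default quota and the chunk size are assumptions made for the example, and treating a NUL byte as a sign of binary data is an extra heuristic of this sketch.

```python
import codecs
import time


def looks_like_utf8(stream, time_quota=1.0, chunk_size=65536):
    """Return (is_utf8, finished).

    finished is False when the time quota ran out; is_utf8 is then only an
    estimate based on the bytes read so far.
    """
    decoder = codecs.getincrementaldecoder("utf-8")()
    deadline = time.monotonic() + time_quota
    while True:
        if time.monotonic() > deadline:
            # Interrupted: everything seen so far was valid.
            return True, False
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        try:
            # The incremental decoder copes with multi-byte sequences
            # that are split across chunk boundaries.
            text = decoder.decode(chunk)
        except UnicodeDecodeError:
            return False, True
        # Heuristic of this sketch: NUL is valid UTF-8 but rare in text.
        if "\x00" in text:
            return False, True
    try:
        # Flush: fails if the stream ended in the middle of a sequence.
        decoder.decode(b"", final=True)
    except UnicodeDecodeError:
        return False, True
    return True, True
```

Any object with a read() method works as the stream, including the file object returned by tarfile's extractfile(), so nothing has to be written to disk first.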