[aur-dev] Package blacklist for the AUR
Hi!
I recently started working on a package blacklist for the official AUR which can e.g. be used to prevent people from uploading packages in the official repos to [unsupported] (cf. FS#12902 [1]). Patches can be found in the "pkg-blacklist" branch of my working tree [2]. They currently include some code that adds a "PackageBlacklist" table and a hacky helper utility that can be used to update that table by sync'ing it with some binary repos.
Constructive criticism and suggestions welcome!
[1] https://bugs.archlinux.org/task/12902
[2] http://git.cryptocrack.de/aur.git/log/?h=pkg-blacklist
On Mon, Feb 7, 2011 at 4:06 AM, Lukas Fleischer <archlinux@cryptocrack.de> wrote:
Hi!
I recently started working on a package blacklist for the official AUR which can e.g. be used to prevent people from uploading packages in the official repos to [unsupported] (cf. FS#12902 [1]). Patches can be found in the "pkg-blacklist" branch of my working tree [2]. They currently include some code that adds a "PackageBlacklist" table and a hacky helper utility that can be used to update that table by sync'ing it with some binary repos.
Constructive criticism and suggestions welcome!
[1] https://bugs.archlinux.org/task/12902 [2] http://git.cryptocrack.de/aur.git/log/?h=pkg-blacklist
All around pretty awesome. I wouldn't have written it in C myself but more power to you! What about provides lists from packages as well? Can you add those to the blacklist? I think you (or someone) should but maybe I am wrong.
Two char indent? Eew ;) JK, it's better than tabs...
--
-Justin
On Mon, Feb 07, 2011 at 10:18:16AM -0500, Justin Davis wrote:
All around pretty awesome. I wouldn't have written it in C myself but more power to you! What about provides lists from packages as well? Can you add those to the blacklist? I think you (or someone) should but maybe I am wrong.
Adding packages' provides and replaces as well now.
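To make the "provides and replaces" idea concrete, here is a minimal sketch in Python (not the C helper from the branch). It assumes each package is given as a plain dict with "name", "provides" and "replaces" keys, which is a made-up representation for illustration, and that entries may carry a version constraint such as "vi=7.3" that should be stripped before blacklisting the bare name.

```python
import re

# Hypothetical helper: splits off a version constraint such as "foo>=1.2" or "bar=3".
_CONSTRAINT = re.compile(r"[<>=]")


def blacklist_names(packages):
    """Collect package names plus the bare names from provides/replaces."""
    names = set()
    for pkg in packages:
        names.add(pkg["name"])
        for dep in pkg.get("provides", []) + pkg.get("replaces", []):
            # Keep only the part before the first comparison operator.
            names.add(_CONSTRAINT.split(dep, maxsplit=1)[0])
    return names
```

Using a set means a name that shows up both as a real package and as a provides entry is only blacklisted once.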
Lukas Fleischer wrote:
Hi!
I recently started working on a package blacklist for the official AUR which can e.g. be used to prevent people from uploading packages in the official repos to [unsupported] (cf. FS#12902 [1]). Patches can be found in the "pkg-blacklist" branch of my working tree [2]. They currently include some code that adds a "PackageBlacklist" table and a hacky helper utility that can be used to update that table by sync'ing it with some binary repos.
Constructive criticism and suggestions welcome!
[1] https://bugs.archlinux.org/task/12902 [2] http://git.cryptocrack.de/aur.git/log/?h=pkg-blacklist
Why are you using LOCK TABLE instead of a transaction? Are you using MyISAM?
On Mon, Feb 7, 2011 at 3:06 AM, Lukas Fleischer <archlinux@cryptocrack.de> wrote:
Hi!
I recently started working on a package blacklist for the official AUR which can e.g. be used to prevent people from uploading packages in the official repos to [unsupported] (cf. FS#12902 [1]). Patches can be found in the "pkg-blacklist" branch of my working tree [2]. They currently include some code that adds a "PackageBlacklist" table and a hacky helper utility that can be used to update that table by sync'ing it with some binary repos.
Constructive criticism and suggestions welcome!
[1] https://bugs.archlinux.org/task/12902 [2] http://git.cryptocrack.de/aur.git/log/?h=pkg-blacklist
AUR side:
* Using CHAR as a datatype is absolutely silly in new code, use VARCHAR, and why do anything shorter than 255 or 512?
* On that note, almost all CHAR usages in the current schema are silly and should be using VARCHAR- anything on the Packages table, PackageCategories, PackageSources, TU_VoteInfo, AccountTypes, Username/Email/Passwd/IRCNick on Users.
* Why not just make "Name" your primary key? The ID column is never used.
* This is a slight step toward removing DummyPkg stuff from the Packages table- I'm trying to think out how to restructure that data to make things work and still be slightly sound from a relational and keys point of view.
Blacklist helper side:
* I won't lie, I think this is over-engineered a tad. This can be done in a much shorter and easier to hack shell script since all you need is package names- just pipe bsdtar output through some magic and you have package names. I've attached a sample starter script I use for archweb updates. I'd probably have it call bsdtar and then who knows what.
* Oh my, I forgot we are still on MyISAM for the AUR. Please to god switch to InnoDB and use transactions instead.
-Dan
On Mon, Feb 07, 2011 at 10:08:54AM -0600, Dan McGee wrote:
AUR side:
* Using CHAR as a datatype is absolutely silly in new code, use VARCHAR, and why do anything shorter than 255 or 512?
* On that note, almost all CHAR usages in the current schema are silly and should be using VARCHAR- anything on the Packages table, PackageCategories, PackageSources, TU_VoteInfo, AccountTypes, Username/Email/Passwd/IRCNick on Users.
I just copied that from another table schema, but you're absolutely right. I'll work on replacing all that CHAR stuff with VARCHARs later. Might become a kinda ugly updating process from 1.7.0 to 1.8.0 :)
* Why not just make "Name" your primary key? The ID column is never used.
I'm not really sure about this. Some people insist on always having an "ID" column. It would become useful if we add, e.g., some web frontend to the blacklist. Not sure if there are any coding guidelines about this at all, but it's correct that we don't necessarily need it here.
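For illustration, a minimal sketch of the "Name as primary key" variant and the lookup an upload check would need. It uses Python's sqlite3 as a stand-in for the AUR's MySQL database, and both the VARCHAR(255) column and the is_blacklisted() helper are assumptions made for this example, not code from the branch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in for the AUR's MySQL database
# Assumed schema: the package name itself is the primary key, no ID column.
conn.execute("CREATE TABLE PackageBlacklist (Name VARCHAR(255) PRIMARY KEY)")
conn.executemany(
    "INSERT INTO PackageBlacklist (Name) VALUES (?)",
    [("pacman",), ("glibc",)],
)


def is_blacklisted(conn, name):
    """Return True if an upload with this package name should be rejected."""
    row = conn.execute(
        "SELECT 1 FROM PackageBlacklist WHERE Name = ?", (name,)
    ).fetchone()
    return row is not None
```

With the name as the key, the lookup is a single indexed query and duplicate names are rejected by the database itself. An extra ID column would only matter if something else needs to reference rows by number.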
Blacklist helper side:
* I won't lie, I think this is over-engineered a tad. This can be done in a much shorter and easier to hack shell script since all you need is package names- just pipe bsdtar output through some magic and you have package names. I've attached a sample starter script I use for archweb updates. I'd probably have it call bsdtar and then who knows what.
Our consensus was not to fetch tarballs via HTTP and extract them here, but to use libalpm instead. We already discussed using a small PHP script that uses Archive::Tar (PEAR) but discarded that. Using libalpm, future database format changes won't affect us and it just seems cleaner.
Of course, we could do that with some shell script, which would have to do the following things tho:
- parse the AUR "config.inc" file: read MySQL host name, socket, user name, password
- use pacman(8) to sync local databases
- convert packages to MySQL queries using some sed(1)/awk(1) magic
- pipe stuff to mysql(1), ensure no errors occurred
I just think that it wouldn't be much shorter (that C helper has about 100 SLOC if you strip all that error handling stuff), nor cleaner, nor faster. Best thing would be to have PHP bindings for libalpm here... Well. Recommendations and patches welcome :p
* Oh my, I forgot we are still on MyISAM for the AUR. Please to god switch to InnoDB and use transactions instead.
Full ack. I'll talk to Loui about that.
On Mon, Feb 07, 2011 at 06:50:14PM +0100, Lukas Fleischer wrote:
Blacklist helper side:
* I won't lie, I think this is over-engineered a tad. This can be done in a much shorter and easier to hack shell script since all you need is package names- just pipe bsdtar output through some magic and you have package names. I've attached a sample starter script I use for archweb updates. I'd probably have it call bsdtar and then who knows what.
Our consensus was not to fetch tarballs via HTTP and extract them here, but use libalpm instead. We already discussed using a small PHP script that uses Archive::Tar (PEAR) but discarded that. Using libalpm, future database format changes won't affect us and it just seems cleaner.
Of course, we could do that with some shell script, which would have to do the following things tho:
- parse the AUR "config.inc" file: read MySQL host name, socket, user name, password
- use pacman(8) to sync local databases
- convert packages to MySQL queries using some sed(1)/awk(1) magic
- pipe stuff to mysql(1), ensure no errors occurred
I just think that it wouldn't be much shorter (that C helper has about 100 SLOC if you strip all that error handling stuff), nor cleaner, nor faster. Best thing would be to have PHP bindings for libalpm here... Well. Recommendations and patches welcome :p
Oh, and I'll probably make this script add package names listed in packages' "provides" arrays to the blacklist as well (thanks to Justin Davis, just read his reply).
On Mon 07 Feb 2011 18:56 +0100, Lukas Fleischer wrote:
On Mon, Feb 07, 2011 at 06:50:14PM +0100, Lukas Fleischer wrote:
Blacklist helper side:
* I won't lie, I think this is over-engineered a tad. This can be done in a much shorter and easier to hack shell script since all you need is package names- just pipe bsdtar output through some magic and you have package names. I've attached a sample starter script I use for archweb updates. I'd probably have it call bsdtar and then who knows what.
Our consensus was not to fetch tarballs via HTTP and extract them here, but use libalpm instead. We already discussed using a small PHP script that uses Archive::Tar (PEAR) but discarded that. Using libalpm, future database format changes won't affect us and it just seems cleaner.
Of course, we could do that with some shell script, which would have to do the following things tho:
- parse the AUR "config.inc" file: read MySQL host name, socket, user name, password
- use pacman(8) to sync local databases
- convert packages to MySQL queries using some sed(1)/awk(1) magic
- pipe stuff to mysql(1), ensure no errors occurred
I just think that it wouldn't be much shorter (that C helper has about 100 SLOC if you strip all that error handling stuff), nor cleaner, nor faster. Best thing would be to have PHP bindings for libalpm here... Well. Recommendations and patches welcome :p
Oh, and I'll probably make this script add package names listed in packages' "provides" arrays to the blacklist as well (thanks to Justin Davis, just read his reply).
Awesome :D. Rather than creating another config file, would it be possible to just point to the AUR's config.inc? Maybe a PHP wrapper around aurblup... Kind of like the cleanup script? (That script should probably be changed to point directly to the config file.)
Also I found a couple typos: something about dwm, and leightweight.
Thanks Lukas, you're the new hero of the AUR. :D
On Tue, Feb 08, 2011 at 07:46:27PM -0500, Loui Chang wrote:
Awesome :D. Rather than creating another config file, would it be possible to just point to the AUR's config.inc? Maybe a PHP wrapper around aurblup...
It already does. If you have a look at the "config.h.proto" file, the first define ("AUR_CONFIG") points to the location of the AUR "config.inc" file. aurblup extracts MySQL access data from there. We still need that additional config file tho, as it also defines which paths to use for libalpm DBs, which mirror to sync with and which repos to sync. Actually, it's not a real config file, as it's a C header file and will most likely never be changed after an initial setup.
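As a rough illustration of "extracting MySQL access data" from a PHP config file, here is a small Python sketch (aurblup itself does this in C). It assumes the settings are written as PHP define() calls with quoted string values. The constant names in the usage below, such as AUR_db_host, are hypothetical and only stand in for whatever config.inc really uses.

```python
import re

# Assumed line format, e.g.: define( "AUR_db_user", "aur" );
_DEFINE = re.compile(
    r"""define\(\s*["'](\w+)["']\s*,\s*["']([^"']*)["']\s*\)"""
)


def parse_aur_config(text):
    """Return a dict of all string constants defined in config.inc text."""
    return dict(_DEFINE.findall(text))
```

Called on the file contents, this gives something like parse_aur_config(text)["AUR_db_user"]. Anything that is not a simple quoted define(), such as computed values or non-string constants, is silently skipped, so a real helper would want to check that every required key was found.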
Also I found a couple typos: something about dwm, and leightweight.
Fixed in my local working tree.
Thanks Lukas, you're the new hero of the AUR. :D
Haha, thanks :D
On Mon, Feb 07, 2011 at 06:50:14PM +0100, Lukas Fleischer wrote:
On Mon, Feb 07, 2011 at 10:08:54AM -0600, Dan McGee wrote:
AUR side:
* Using CHAR as a datatype is absolutely silly in new code, use VARCHAR, and why do anything shorter than 255 or 512?
* On that note, almost all CHAR usages in the current schema are silly and should be using VARCHAR- anything on the Packages table, PackageCategories, PackageSources, TU_VoteInfo, AccountTypes, Username/Email/Passwd/IRCNick on Users.
I just copied that from another table schema, but you're absolutely right. I'll work on replacing all that CHAR stuff with VARCHARs later. Might become a kinda ugly updating process from 1.7.0 to 1.8.0 :)
Fixed.
* Oh my, I forgot we are still on MyISAM for the AUR. Please to god switch to InnoDB and use transactions instead.
Full ack. I'll talk to Loui about that.
Fixed. Everything should be compatible with InnoDB now (talking about what's in my working tree - didn't push to gerolde yet). Also, aurblup uses transactions by default.
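A minimal sketch of why a transaction is a better fit here than LOCK TABLE: the whole blacklist is replaced in one step, and if anything fails halfway the old contents stay in place. Python's sqlite3 is used as a stand-in for MySQL/InnoDB, and replace_blacklist() is a hypothetical name, not the function aurblup uses.

```python
import sqlite3


def replace_blacklist(conn, names):
    """Swap the table contents atomically: all new rows or none."""
    with conn:  # commits on success, rolls back if an exception escapes
        conn.execute("DELETE FROM PackageBlacklist")
        conn.executemany(
            "INSERT INTO PackageBlacklist (Name) VALUES (?)",
            [(n,) for n in names],
        )


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE PackageBlacklist (Name VARCHAR(255) PRIMARY KEY)")
replace_blacklist(conn, ["pacman", "glibc"])
```

If a later sync fails partway through (for example on a duplicate name), the DELETE is rolled back together with the partial inserts, so readers never see an empty or half-filled blacklist.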
participants (5)
- Dan McGee
- Justin Davis
- Linas
- Loui Chang
- Lukas Fleischer