I was thinking about defining a few SQL queries which would run regularly and could report things like
* duplicates (by name AND by other criteria such as sources), e.g. the sketch below
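
For instance, the name and shared-source-URL cases might look something like this (a rough sketch assuming hypothetical `packages(id, name)` and `sources(package_id, url)` tables; the real schema may well differ):

```sql
-- Duplicates by name: package names that appear more than once.
SELECT name,
       COUNT(*) AS copies
FROM   packages
GROUP  BY name
HAVING COUNT(*) > 1
ORDER  BY copies DESC, name;

-- Duplicates by source: distinct packages that share the same source URL
-- (assumes a sources table linking package_id to a download url).
SELECT url,
       COUNT(DISTINCT package_id) AS packages_sharing
FROM   sources
GROUP  BY url
HAVING COUNT(DISTINCT package_id) > 1;
```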
Hmm, when I read "sources" I was thinking of "a URL to download the source that is no longer accessible" (or that no longer downloads exactly the same thing as before, hence broken hashes...). Of course that's somewhat fuzzy, because sites can be down for short periods, and the hash case is rarer and more wasteful to check, since a script would have to download the whole thing. But maybe if a URL is broken 3 or 4 times in a row (or some similar algorithm), reporting it could be useful, as long as it doesn't create too much load on anybody's machine. (I've run into broken source URLs trying to use a few AUR packages, I think.) -Isaac
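
A rough sketch of what that "broken N times in a row" report could look like, assuming a hypothetical `source_checks` table that some periodic probe script appends a row to for each attempted fetch (the actual schema and tooling would be whatever gets decided):

```sql
-- Flag source URLs whose three most recent checks all failed.
-- Assumes a hypothetical source_checks(url, checked_at, ok) table that a
-- periodic probe appends to (ok = 1 for a successful fetch, 0 otherwise).
WITH recent AS (
    SELECT url,
           ok,
           ROW_NUMBER() OVER (PARTITION BY url
                              ORDER BY checked_at DESC) AS rn
    FROM   source_checks
)
SELECT url
FROM   recent
WHERE  rn <= 3
GROUP  BY url
HAVING COUNT(*) = 3     -- at least three checks on record
   AND SUM(ok) = 0;     -- and none of them succeeded
```

Probing reachability with something light (e.g. a HEAD request per URL per run) would keep the load small; re-verifying hashes means downloading the whole file, so that would have to be much rarer, as noted above.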