[arch-dev-public] [idea] global link database for all packages
Hi,

I just had an idea that might be worth thinking about. Namcap is quite useful, but because it only sees one package file at a time it cannot answer every question.

The idea is to create a database (similar to the file list we already create) which lists the files of each package and what they are linked against. Dumping this together with our pkgdb and file lists into a huge SQL db, we could answer questions such as:

* What are the hidden deps of a package?
* Which packages need a rebuild if I bump a certain package?
* Find missing deps
* What happens if I remove a dep from a package?
** e.g. if I remove openssl as a dep of Qt I need to review every package that directly or indirectly depends on Qt and check whether it needs openssl (which was hidden by the Qt dep before)
* Maybe check for soname conflicts or packages providing the same one? (libgl, java)
* Check a package's deps without the need to actually install them (like namcap)

I made a quick and dirty script based on the createFileLists script: http://users.archlinux.de/~pierre/tmp/createLinkLists.txt

It should work fine, even incrementally, but its runtime is awful. So, if you think this might be a good idea, there is a lot of room for improvement. I don't know if it's possible with bash, but we don't really need to extract every file:

* check if a file is an ELF file
* extract the header
* move to the next file

-- Pierre Schmitz, http://users.archlinux.de/~pierre
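As an illustration of the "check, read the header, move on" idea, here is a minimal Python sketch (not the bash script linked above). It walks a package tarball and reads only the first four bytes of each regular file to test for the ELF magic number, instead of extracting everything to disk. The function name `elf_members` is made up for this example, and reading the list of needed libraries from each ELF file (e.g. with readelf) is left out.

```python
import io
import tarfile

ELF_MAGIC = b"\x7fELF"  # the first four bytes of every ELF file


def elf_members(archive):
    """Yield the names of regular files in an open tarfile that look like ELF files.

    Only the first four bytes of each member are read; nothing is written to disk.
    """
    for member in archive:
        if not member.isfile():
            continue  # skip directories, symlinks, device nodes, ...
        handle = archive.extractfile(member)
        if handle is not None and handle.read(4) == ELF_MAGIC:
            yield member.name


if __name__ == "__main__":
    # Build a tiny in-memory archive with made-up contents to try the function.
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for name, data in [("usr/bin/tool", b"\x7fELF\x02\x01\x01"),
                           ("usr/share/doc/README", b"hello")]:
            info = tarfile.TarInfo(name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    buf.seek(0)
    with tarfile.open(fileobj=buf) as tar:
        print(list(elf_members(tar)))
```

Whether this is faster than the existing script depends mostly on how the archive is decompressed, so treat it as a sketch of the approach rather than a measured improvement.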
On Wednesday 24 June 2009 23:00:02 Daenyth Blank wrote:
file is several times slower than readelf. But I have found some other optimizations:

* Only extract files from /opt /lib /sbin /bin /usr/lib /usr/sbin /usr/bin
* Run the script for both arches in parallel
* Treat links to libs and executables as files (this way we can check for exact sonames)
* The script uses results from previous runs and as a result should be fast enough to be run on gerolde (I can provide an initial data set)

The resulting db files are quite small: 20 KB for core and 320 KB for extra.

So, what do you think about adding this to our cron jobs? We could run it on a daily basis and update the db files. It should be easy to write clients which check for possible rebuild candidates and do all kinds of integrity checks.

-- Pierre Schmitz, http://users.archlinux.de/~pierre
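The directory restriction from the list above could look roughly like this in Python. This is only a sketch: the names `SCAN_PREFIXES` and `worth_scanning` are invented for the example, and it assumes paths are given relative to the package root, as they usually appear inside a package archive.

```python
# Directories named in the message above, written relative to the package root.
SCAN_PREFIXES = (
    "opt/", "lib/", "sbin/", "bin/",
    "usr/lib/", "usr/sbin/", "usr/bin/",
)


def worth_scanning(path):
    """Return True if `path` lies under one of the directories to be scanned."""
    # Accept both "usr/bin/foo" and "/usr/bin/foo".
    return path.lstrip("/").startswith(SCAN_PREFIXES)


if __name__ == "__main__":
    for candidate in ("usr/lib/libssl.so.0.9.8", "usr/share/man/man1/ls.1.gz"):
        print(candidate, worth_scanning(candidate))
```

Because each prefix ends with a slash, a path such as usr/libexec/foo is not mistaken for something under usr/lib.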
On Sunday 28 June 2009 18:13:54 Pierre Schmitz wrote:
The resulting db files are quite small: 20 KB for core and 320 KB for extra.
I have uploaded some example files to http://users.archlinux.de/~pierre/tmp/extra.links.tar.gz (also for core, community and testing) -- Pierre Schmitz, http://users.archlinux.de/~pierre
Pierre Schmitz wrote:
You have given the links for every file examined. Do we need that much information? Is there use beyond a global list for the package? That would also have the advantage of simplifying the format (no need for % symbols) which would simplify writing clients to generate rebuild lists or integrity checks etc. Allan
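To make the "rebuild list" client concrete, here is a small Python sketch that works from one flat set of libraries per package. The data layout (two dictionaries mapping package names to sets of library names), the function name `rebuild_candidates`, and the sample data are all assumptions made for illustration; they are not the format of the uploaded db files.

```python
def rebuild_candidates(links, provides, bumped):
    """Return the packages that link against any library shipped by `bumped`.

    links    -- dict: package name -> set of library names it links against
    provides -- dict: package name -> set of library names it ships
    bumped   -- name of the package that is being updated
    """
    libs = provides.get(bumped, set())
    return sorted(
        pkg for pkg, needed in links.items()
        if pkg != bumped and needed & libs
    )


if __name__ == "__main__":
    # Made-up sample data, only to show the shape of the lookup.
    links = {
        "qt": {"libssl.so.0.9.8", "libc.so.6"},
        "curl": {"libssl.so.0.9.8"},
        "bash": {"libc.so.6"},
        "openssl": {"libc.so.6"},
    }
    provides = {
        "openssl": {"libssl.so.0.9.8", "libcrypto.so.0.9.8"},
        "glibc": {"libc.so.6"},
    }
    print(rebuild_candidates(links, provides, "openssl"))
```

A per-package list is enough for this kind of query; the per-file detail only matters for questions about which part of a package pulls in a library.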
On Sunday 28 June 2009 18:44:54 Allan McRae wrote:
I have thought about it. But the db files are already small, and if you don't need that information you can pipe it through "grep -v '%' | sort -u". On the other hand, one could use that information for optdepends, for possible split candidates, or to find out which feature of package a needs package b. -- Pierre Schmitz, http://users.archlinux.de/~pierre
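For clients not written in shell, the filter quoted above can be mirrored in a few lines of Python. This sketch assumes only what the pipeline itself implies: lines containing '%' carry the per-file detail and every other line is a library name. The exact layout of the db files is not described in this thread, and the function name `unique_links` is made up. Unlike the shell pipeline, it also drops empty lines.

```python
def unique_links(lines):
    """Drop lines containing '%' and return the remaining entries, sorted and unique."""
    return sorted({
        line.strip()
        for line in lines
        if "%" not in line and line.strip()
    })


if __name__ == "__main__":
    # Made-up input in the assumed layout: '%' lines mark files, others are libraries.
    sample = [
        "%usr/bin/foo%",
        "libc.so.6",
        "libssl.so.0.9.8",
        "%usr/lib/libfoo.so%",
        "libc.so.6",
    ]
    print(unique_links(sample))
```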
On Sunday 28 June 2009 18:52:45 Pierre Schmitz wrote:
I have thought about it.
Just forgot: I don't have a strong opinion about whether to keep that detailed information or not. If we think a simple list per package would be better, I am fine with that. -- Pierre Schmitz, http://users.archlinux.de/~pierre
On Sun, Jun 28, 2009 at 11:13 AM, Pierre Schmitz<pierre@archlinux.de> wrote:
Part of me feels like we should adopt Gerardo's script for this purpose, as it seems a little more robust. Would you have a problem with that?
On Monday 29 June 2009 17:54:39 Aaron Griffin wrote:
Did not know about that. Do you mean this one? http://github.com/djgera/pkgdyn/blob/92191b7cff428159c42080d36d7db936c13a5d2... What do you mean by more robust? -- Pierre Schmitz, http://users.archlinux.de/~pierre
On Mon, Jun 29, 2009 at 11:53 AM, Pierre Schmitz<pierre@archlinux.de> wrote:
I meant pkgdyn itself: http://github.com/djgera/pkgdyn/tree/master
On Monday 29 June 2009 19:02:08 Aaron Griffin wrote:
It does a lot more than my stupid script, so it's hard to compare. My goal was just to provide some raw data which can be used by and for a lot of things; pkgdyn could use that data, too. Anyway: the awk script in dynup does not seem to be a bad idea, so I might add something similar to my script. -- Pierre Schmitz, http://users.archlinux.de/~pierre
participants (4)
- Aaron Griffin
- Allan McRae
- Daenyth Blank
- Pierre Schmitz