[pacman-dev] Thoughts on database initialization and loading process

Sat Apr 16 20:53:36 EDT 2011

On 17/04/11 09:40, Dan McGee wrote:
> Hey guys,
>
> Hopefully someone has some insight, but I fear that this involves some
> pretty intimate knowledge with the alpm dirty internals.
>
> I implemented the local DB "version" check as something on the
> transaction process, due to the fact that we call get_pkgcache or
> find_package and all those fun methods all over the place with the
> expectation that they normally won't fail in any way. This was a bit
> of a hack, and has become more apparent as I look at some of the GPG
> issues.
>
> Due to the lazy loading of the package cache for databases, we never
> really know when a database may be initialized, or which call site
> will do so. This makes it hard for us to 1) do the local DB sanity
> check ASAP and in the right place, and 2) implement signature checking
> on databases each time it is accessed, which seems like the prudent
> decision to me.
>
> That leaves us with a few options:
> 1) Don't check database signatures every time, only when we "need to",
> e.g. right after download. Not a fan of this one.

I am not a fan of this either, but I was slowly leaning that way 
earlier...  In order for someone to tamper with your database once you 
have downloaded and verified it, they would need root access on your 
machine.  If I have root access to your machine, I am not going to spend 
my time altering a database that will be overwritten on the next update. 
  Saying that, I want to see this done properly, and checking a 
signature every time we load the db falls into what I define as "properly".

> 2) Only check database signatures if we go through the transaction
> layer- just like we do now with the local version check.

-1.  That could give interesting issues.  e.g.  "pacman -Ss pkg" -> 
everything seems fine.   "pacman -S pkg" -> your database is not signed 
properly.  That is just not consistent.

> 3) Add some explicit alpm_dbs_set_in_stone() method that needs to be
> called before databases can be used, but after they are registered,
> that takes care of some bookkeeping stuff like this. Stinks because it
> introduces a rather kludgy API that depends on methods being called in
> procedural order (not to mention all frontends will need
> modification), but by far the easiest way to resolve this.
> 4) Fix callsites everywhere that may force a load of the pkgcache to
> actually check error return values for the "local db too old" and "gpg
> sig check failed" cases, among other, rather than just assuming the
> pkgcache is NULL without looking at pm_errno. Best solution as it
> requires no API change, but will definitely have a large LOC impact.

I think this is the probably the best solution, not because lack of API 
change but because it seems the most correct thing to be doing.  I also 
think we should do more return value checking in a lot of places in the 
code base...

Allan