On Wed, Jun 8, 2011 at 1:28 PM, Rémy Oudompheng <remyoudompheng@gmail.com> wrote:
On Tue 07 June 2011 at 16:36 -0500, Dan McGee wrote:
This is a bitch of a diffstat, unfortunately, and the patches aren't all that fun to look through. I have it pushed to my repo as alpm-cleanups if you'd rather grab it there. The bright side is it should successfully build and test after each patch as long as they are applied in the order sent here (or in my repo).
Comments/suggestion/feedback welcome. Once these are applied, I think our API makes a lot more sense from a consumer standpoint as we have a single object, tracked by the "client", tracking all state in the library rather than the state being held in a global variable in the backend library.
This patch series (and the previous one sent and now applied) makes it *much* easier for the next few patches, which will ensure DBs are signature-checked and verified at the right time, and at a time all clients expect them to be checked so error codes can be checked.
It seems a bit odd to me that packages acquired a handle member in the process: databases are intrinsically stateful, or at least related to a state, since they have paths, locks, and are concerned by transactions.
But packages are twofold: - sometimes they come from a database: they at least inherit a global state through alpm_pkg_get_db(pkg)->handle. - sometimes they come from a file: then things are very different: the user must free the structure when finished with alpm_pkg_free(), and the package itself is much more pure: it doesn't have a related DBPath, no associated root, no options, just a file path. This last point applies iff it is loaded from a file- you stated it but just need to make this point clear before the next bit.
I support the idea that if a package is loaded from a file, there is not really a reason why it should be linked to any handle. The downside of that is that catching error codes would need some modifications in related functions, but it doesn't look like package functions throw many error codes.
The following needs to be addressed, and I am in no rush to do so: * alpm_pkg_compute_requiredby() uses the local DB for PKG_FROM_FILE packages, calls find_requiredby(), which calls db_get_pkgcache()... you get the point, this one needs either a handle passed in, or we need to keep the handle off the package. * _alpm_pkg_should_ignore() * commit_single_pkg() was using newpkg->handle, where newpkg was the PKG_FROM_FILE being installed, but this is unnecessary as we have a handle anyway; fixed. * The _package_changelog_*() implementation sets PM_ERR_LIBARCHIVE. Without breaking the abstraction I'm not sure how this one can be fixed. * alpm_pkg_check_pgp_signature() - heavy errno/logging deps here. * alpm_sync_newversion() - errno/logging deps here. When loading a package, even from a file, we definitely use the handle a lot. * parse_descfile() uses handle for logging purposes. Looking at them, they are all DEBUG level now but several seem like they should be WARNING or even ERROR level messages. * _alpm_pkg_load_internal() and alpm_pkg_load(). Oh my. It looks like most usages here are pm_errno and logging. Even though we could add a *pm_errno arg, logging is a different story. And to me, saying "you need to provide a handle but the resulting pkg won't be linked to it" just makes this confusing. Can I turn your question around now that I've pointed out a few reasons why not having a handle attached to every package would be a P in the A? What benefit would having a package without a handle be?
Another reason why it bothers me is that all transaction functions need a handle: alpm_trans*, alpm_sync_sysupgrade, but not alpm_add_pkg nor alpm_remove_pkg. I have no problem with adding a handle to these two methods- I just didn't because it seemed prudent to only change the API signatures of methods if necessary. So I am fine with adding this, regardless of anything else. My original hackjob before I split it into a patch set did in fact add a handle parameter.
One may argue that having a handle argument would be redundant with the pkg->handle thing, but that's a case where it seems really logical to me, and more consistent with the rest of the API, to declare explicitly the handle with holds the transaction where a package is added/removed. The signature alpm_add_pkg(pmpkg_t *pkg) would require some introspection to detect where the package will be installed.
Even with this all said, you would run into some rather catastrophic failures if you starting mixing packages from different handles. We could definitely add a few more asserts() to prevent such problems.
And actually, it really seems like it is *impossible* to know where alpm_add_pkg() will install the package, since the current API does not provide the necessary alpm_pkg_get_handle() to do so (but we of course could provide such a function: do we need it?) It's not impossible, given that if a package was loaded using a specific handle, it will be installed using that handle. The only caveat is that the frontend application needs to keep track of this. Not sure if we need it.
-Dan