[pacman-dev] [PATCH] Complete rework of package accessor logic
Hopefully we've finally arrived at package handling nirvana, or at least
this commit will get us a hell of a lot closer. The former method of getting
the depends list for a package was the following:
1. call alpm_pkg_get_depends()
2. this method would check if the package came from the cache
3. if so, ensure our cache level is correct, otherwise call db_load
4. finally return the depends list
Why did this suck? Because getting the depends list from the package
shouldn't care about whether the package was loaded from a file, from the
'package cache', or some other system which we can't even use because the
damn thing is so complicated. It should just return the depends list.
So what does this commit change? It adds a pointer to a struct of function
pointers to every package for all of these 'package operations' as I've
decided to call them (I know, sounds completely straightforward, right?). So
now when we call an alpm_pkg_get_* function, we don't do any of the cache
logic or anything else there; we let the actual backend handle it by
delegating all work to the method at pkg->ops->get_depends.
Now that be_package has achieved equal status with be_files, we can treat
packages from these completely different load points differently. We know a
package loaded from a pkg.tar.gz will have all of its fields populated, so
we can set up all its accessor functions to be direct accessors. On the
other hand, the packages loaded from the local and sync DBs are not always
fully-loaded, so their accessor functions are routed through the same logic
as before.
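A minimal sketch of the delegation described above, not the actual patch: the types are simplified stand-ins for pmpkg_t and struct pkg_operations, only one accessor is shown, and the names file_pkg_ops, db_pkg_ops and the load_count field are hypothetical, added purely for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified, hypothetical stand-ins for pmpkg_t and pkg_operations. */
typedef struct pkg pkg_t;

struct pkg_operations {
	const char *(*get_desc) (pkg_t *);
};

struct pkg {
	const char *name;
	const char *desc;   /* NULL until the backend has loaded it */
	int loaded;         /* has the lazy backend filled in the fields? */
	int load_count;     /* how often the (pretend) database was read */
	const struct pkg_operations *ops;
};

/* Direct accessor: enough for a package read from a pkg.tar.gz,
 * where every field is already populated. */
static const char *direct_get_desc(pkg_t *pkg)
{
	return pkg->desc;
}

/* Lazy accessor: stands in for the "check cache level, else load from
 * the db" logic used by the local and sync database backends. */
static const char *lazy_get_desc(pkg_t *pkg)
{
	if(!pkg->loaded) {
		pkg->desc = "loaded from database";
		pkg->loaded = 1;
		pkg->load_count++;
	}
	return pkg->desc;
}

static const struct pkg_operations file_pkg_ops = { direct_get_desc };
static const struct pkg_operations db_pkg_ops = { lazy_get_desc };

/* Public accessor: no cache logic here, only delegation to the backend. */
const char *pkg_get_desc(pkg_t *pkg)
{
	assert(pkg != NULL && pkg->ops != NULL);
	return pkg->ops->get_desc(pkg);
}
```

A caller only ever sees pkg_get_desc(); whether the value was already in memory or had to be loaded on first use is decided by which ops table the backend attached to the package.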
Net result? More code. However, this code now makes it roughly 52 times
easier to open the door to something like a read-only tar.gz database
backend.
Are you still reading? I'm impressed. Looking at the patch will probably be
clearer than this long-winded explanation.
Signed-off-by: Dan McGee
On Mon, May 12, 2008 at 3:28 AM, Dan McGee
So what does this commit change? It adds a pointer to a struct of function pointers to every package for all of these 'package operations' as I've decided to call them (I know, sounds completely straightforward, right?). So now when we call an alpm_pkg_get_* function, we don't do any of the cache logic or anything else there; we let the actual backend handle it by delegating all work to the method at pkg->ops->get_depends.
Now that be_package has achieved equal status with be_files, we can treat packages from these completely different load points differently. We know a package loaded from a pkg.tar.gz will have all of its fields populated, so we can set up all its accessor functions to be direct accessors. On the other hand, the packages loaded from the local and sync DBs are not always fully-loaded, so their accessor functions are routed through the same logic as before.
So functions that are tied down to a specific database shouldn't be in package.c at all, right? They should just be put in the corresponding be_* files? But I will go on below with the checkmd5sum() example.
Net result? More code. However, this code now makes it roughly 52 times easier to open the door to something like a read-only tar.gz database backend.
That's indeed quite a lot of code, but it's pretty straightforward, and it indeed seems like it would be easier to implement another backend.
+/* Default package accessor functions. These will get overridden by any
+ * backend logic that needs lazy access, such as the local database through
+ * a lazy-laod cache. However, the defaults will work just fine for fully-
+ * populated package structures. */
you mean load? :)
+/** Package operations struct. This struct contains function pointers to
+ * all methods used to access data in a package to allow for things such
+ * as lazy package intialization (such as used by the file backend). Each
+ * backend is free to define a stuct containing pointers to a specific
+ * implementation of these methods. Some backends may find using the
+ * defined default_pkg_ops struct to work just fine for their needs.
+ */
+struct pkg_operations {
+	const char *(*get_filename) (pmpkg_t *);
+	const char *(*get_name) (pmpkg_t *);
+	const char *(*get_version) (pmpkg_t *);
+	const char *(*get_desc) (pmpkg_t *);
+	const char *(*get_url) (pmpkg_t *);
+	time_t (*get_builddate) (pmpkg_t *);
+	time_t (*get_installdate) (pmpkg_t *);
+	const char *(*get_packager) (pmpkg_t *);
+	const char *(*get_md5sum) (pmpkg_t *);
+	const char *(*get_arch) (pmpkg_t *);
+	unsigned long (*get_size) (pmpkg_t *);
+	unsigned long (*get_isize) (pmpkg_t *);
+	pmpkgreason_t (*get_reason) (pmpkg_t *);
+
+	alpm_list_t *(*get_licenses) (pmpkg_t *);
+	alpm_list_t *(*get_groups) (pmpkg_t *);
+	alpm_list_t *(*get_depends) (pmpkg_t *);
+	alpm_list_t *(*get_optdepends) (pmpkg_t *);
+	alpm_list_t *(*get_conflicts) (pmpkg_t *);
+	alpm_list_t *(*get_provides) (pmpkg_t *);
+	alpm_list_t *(*get_replaces) (pmpkg_t *);
+	alpm_list_t *(*get_deltas) (pmpkg_t *);
+	alpm_list_t *(*get_files) (pmpkg_t *);
+	alpm_list_t *(*get_backup) (pmpkg_t *);
+
+	void *(*changelog_open) (pmpkg_t *);
+	size_t (*changelog_read) (void *, size_t, const pmpkg_t *, const void *);
+	int (*changelog_close) (const pmpkg_t *, void *);
+
+	/* still to add:
+	 * free()
+	 * dup()
+	 * checkmd5sum() ?
+	 * has_scriptlet()
+	 * compute_requiredby()
+	 */
+};
+
Well, checkmd5sum is tied down to the cache db, so it belongs neither in package.c nor in pkg_operations, right? Also let me remind you that this function isn't used anywhere :) But you and Aaron wanted to keep it because it might be useful for frontends.
About has_scriptlet, how do we implement that for the sync db? Or well, maybe it could just return false.. Because it seems like all these package ops can't fail anymore.
And what is the problem with compute_requiredby? It seems to be independent of the backends, it just uses normal accessors. Actually that function might fit better in deps.c, with all the other dep-computing things.
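To make the two points above concrete, here is a rough sketch under loud assumptions: the types are simplified stand-ins (not pmpkg_t or alpm_list_t), dependencies are plain package names in a NULL-terminated array with no version constraints, and count_requiredby, default_pkg_ops and sync_pkg_ops are hypothetical names. It shows a has_scriptlet op that simply reports "no" for a backend without that information, and a requiredby computation that only goes through accessors and so does not care which backend a package came from.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Simplified, hypothetical stand-ins for the libalpm types. */
typedef struct pkg pkg_t;

struct pkg_operations {
	const char *(*get_name) (pkg_t *);
	const char *const *(*get_depends) (pkg_t *); /* NULL-terminated */
	int (*has_scriptlet) (pkg_t *);
};

struct pkg {
	const char *name;
	const char *const *depends;
	int scriptlet;
	const struct pkg_operations *ops;
};

static const char *direct_get_name(pkg_t *pkg) { return pkg->name; }
static const char *const *direct_get_depends(pkg_t *pkg) { return pkg->depends; }
static int direct_has_scriptlet(pkg_t *pkg) { return pkg->scriptlet; }

/* A backend that cannot know about scriptlets just answers "no". */
static int sync_has_scriptlet(pkg_t *pkg) { (void)pkg; return 0; }

const struct pkg_operations default_pkg_ops = {
	direct_get_name, direct_get_depends, direct_has_scriptlet
};
const struct pkg_operations sync_pkg_ops = {
	direct_get_name, direct_get_depends, sync_has_scriptlet
};

/* Count the packages in pkgs[0..n) that list target as a dependency.
 * Packages are only touched through their ops, never directly. */
size_t count_requiredby(pkg_t *target, pkg_t **pkgs, size_t n)
{
	const char *name;
	size_t i, count = 0;

	assert(target != NULL && pkgs != NULL);
	name = target->ops->get_name(target);
	for(i = 0; i < n; i++) {
		const char *const *dep = pkgs[i]->ops->get_depends(pkgs[i]);
		for(; dep != NULL && *dep != NULL; dep++) {
			if(strcmp(*dep, name) == 0) {
				count++;
				break; /* count each requiring package once */
			}
		}
	}
	return count;
}
```

Because count_requiredby never looks inside struct pkg, it could live outside the backend files entirely, which is the argument for moving such code next to the other dependency helpers.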
On Sun, May 11, 2008 at 08:28:59PM -0500, Dan McGee
@@ -1,8 +1,7 @@
 /*
  *  be_files.c
  *
- *  Copyright (c) 2006 by Christian Hamar
- *  Copyright (c) 2006 by Miklos Vajna
+ *  Copyright (c) 2002-2008 by Judd Vinet
  *
  *  This program is free software; you can redistribute it and/or modify
  *  it under the terms of the GNU General Public License as published by
feel free to do this, but remember that i was even called a GPL violator (by Aaron) just because i did a similar change in pacman-g2 ;) (another example is Git: when something is rewritten from bash to C, the names are not removed.)
On Mon, May 12, 2008 at 5:20 PM, Miklos Vajna
Oh, so you do still read the list? I was just checking. :)
These patches aren't final at all, I actually think I copied things around too much as I have quite a bit of stuff locally that I'm working on. I'm not even close to committing these directly, and I can pretty much see that I'm going to completely rewrite the trio of cache/be_files/db (along with package) and rework just about everything, so that is what I was anticipating. Looks like I jumped the gun.
But before you go howling at the copyright police (which is completely different than a GPL violation, so I apologize for that mixup in the past), it sure would be nice if you gave us some credit too when you do port over our commits and work. And our project name is not pacman-g1.
http://git.frugalware.org/gitweb/gitweb.cgi?p=pacman-g2.git;a=commitdiff;h=f...
http://git.frugalware.org/gitweb/gitweb.cgi?p=pacman-g2.git;a=commitdiff;h=4... (ok, this one is a joke)
http://git.frugalware.org/gitweb/gitweb.cgi?p=pacman-g2.git;a=commitdiff;h=0... (you did give credit here, thanks, but still couldn't set the author right?)
http://git.frugalware.org/gitweb/gitweb.cgi?p=pacman-g2.git;a=commitdiff;h=9... (nice jab there, pretty mature)
-Dan
On Mon, May 12, 2008 at 09:09:00PM -0500, Dan McGee
Oh, so you do still read the list? I was just checking. :)
heh, barely.
But before you go howling at the copyright police (which is completely different than a GPL violation, so I apologize for that mixup in the past), it sure would be nice if you gave us some credit too when you do port over our commits and work. And our project name is not pacman-g1.
ok, i'll try to stop using it. it's just that mplayer was always called mplayer-g1 on the mplayer-g2 mailing list and i picked up that bad habit, my bad. (probably "old pacman" is even more arrogant :P)
http://git.frugalware.org/gitweb/gitweb.cgi?p=pacman-g2.git;a=commitdiff;h=f...
http://git.frugalware.org/gitweb/gitweb.cgi?p=pacman-g2.git;a=commitdiff;h=4... (ok, this one is a joke)
http://git.frugalware.org/gitweb/gitweb.cgi?p=pacman-g2.git;a=commitdiff;h=0... (you did give credit here, thanks, but still couldn't set the author right?)
yup, i think you discovered the --author switch of git-commit way before me! (nowadays i usually use it where appropriate)
http://git.frugalware.org/gitweb/gitweb.cgi?p=pacman-g2.git;a=commitdiff;h=9... (nice jab there, pretty mature)
oh well, different targets. this is just a decision. we try to keep a stable api, you don't. both have benefits. and in the end users can choose. that's the best :)
http://git.frugalware.org/gitweb/gitweb.cgi?p=pacman-g2.git;a=commitdiff;h=9...
(nice jab there, pretty mature)
oh well, different targets. this is just a decision. we try to keep a stable api, you don't. both have benefits. and in the end users can choose. that's the best :)
IIRC, this was reverted, but that is just one more argument for your 'not stable API' "campaign".
In my opinion the parameters of the callback functions should be reworked (warning, API change ;-). My preferred solution would be to pass a single parameter, a pointer to a complicated union or whatever, which can be accessed via alpm_info_get_xfered(ptr) etc., thus making it much more flexible. Thoughts?
Bye
2008/5/14 Nagy Gabor
IIRC, this was reverted, but that is just one more argument to your 'not stable API' "campaign".
It doesn't make any sense to stabilize such a crappy API. But just as a reminder, the API remains stable between minor releases (3.x.y), but not between major releases (3.x).
About the dltotal thing, Dan indeed reverted it, but we need to add that feature back in another way. We currently have this callback:
void cb_dl_progress(const char *filename, int xfered, int total)
Dan suggested adding another one 2 days ago, I believe it was something like:
void cb_dl_total_progress(int xfered, int total)
But I already forgot. Dan, can you confirm this? :)
In my opinion the parameters of callback functions should be reworked (warning, API change ;-). My preferred solution would be putting one param, a pointer to a complicated union or whatever, which can be accessed via alpm_info_get_xfered(ptr) etc. thus making it much more flexible. Thoughts?
This might not be a bad idea, any work that would allow us to get rid of these TODOs in callback.c is welcome:
/* TODO this is one of the worst ever functions written. void *data ? wtf */
/* TODO we take this route based on data2 being not null? WTF */
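One way the "single pointer plus accessors" idea could look, as a sketch only: nothing here is real libalpm API. The info_t type, the info_get_* accessors (modeled on the alpm_info_get_xfered name floated above), report_download and the example callback are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical event kinds a callback could be told about. */
typedef enum { INFO_DOWNLOAD, INFO_INSTALL } info_type_t;

/* One payload type for every callback; in a real library the layout
 * would stay private so fields can be added without breaking callers. */
typedef struct info {
	info_type_t type;
	union {
		struct { const char *filename; long xfered; long total; } download;
		struct { const char *pkgname; int percent; } install;
	} u;
} info_t;

info_type_t info_get_type(const info_t *info)
{
	assert(info != NULL);
	return info->type;
}

/* Accessors return -1 when the field does not apply to this event. */
long info_get_xfered(const info_t *info)
{
	return info->type == INFO_DOWNLOAD ? info->u.download.xfered : -1;
}

long info_get_total(const info_t *info)
{
	return info->type == INFO_DOWNLOAD ? info->u.download.total : -1;
}

/* The callback takes exactly one parameter, whatever the event is. */
typedef void (*info_cb_t) (const info_t *info);

/* Library side: package the values up and hand them to the frontend. */
void report_download(info_cb_t cb, const char *filename, long xfered, long total)
{
	info_t info;

	assert(cb != NULL);
	info.type = INFO_DOWNLOAD;
	info.u.download.filename = filename;
	info.u.download.xfered = xfered;
	info.u.download.total = total;
	cb(&info);
}

/* Frontend side: an example callback that just records what it saw. */
static long last_xfered = -1;
static long last_total = -1;

void example_frontend_cb(const info_t *info)
{
	if(info_get_type(info) == INFO_DOWNLOAD) {
		last_xfered = info_get_xfered(info);
		last_total = info_get_total(info);
	}
}
```

The callback signature never changes when a new value is added; only a new accessor appears. The price is an extra layer of types and functions compared with passing the numbers directly.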
On Wed, May 14, 2008 at 4:46 AM, Xavier
2008/5/14 Nagy Gabor
About the dltotal thing, Dan indeed reverted it, but we need to add that feature back in another way. We currently have this callback:
void cb_dl_progress(const char *filename, int xfered, int total)
Dan suggested adding another one 2 days ago, I believe it was something like:
void cb_dl_total_progress(int xfered, int total)
But I already forgot. Dan, can you confirm this? :)
Yes, this is what I was thinking. The frontend(s) could then use these two functions accordingly. Both would be called numerous times during the download as they are now.
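A small simulation of how the two callbacks could be driven side by side. The two signatures are the ones quoted in this thread; everything else (struct dl_file, simulate_downloads, the fixed chunk size and the example callbacks) is a hypothetical stand-in and not pacman's download code.

```c
#include <assert.h>
#include <stddef.h>

/* Callback signatures as quoted in the thread. */
typedef void (*cb_dl_progress_t) (const char *filename, int xfered, int total);
typedef void (*cb_dl_total_progress_t) (int xfered, int total);

/* Hypothetical description of one file to fetch. */
struct dl_file {
	const char *filename;
	int size;
};

/* Pretend to download each file in fixed-size chunks and call both
 * callbacks after every chunk. Returns the number of bytes "fetched". */
int simulate_downloads(const struct dl_file *files, size_t nfiles, int chunk,
		cb_dl_progress_t file_cb, cb_dl_total_progress_t total_cb)
{
	int grand_total = 0, grand_xfered = 0;
	size_t i;

	assert(files != NULL && chunk > 0);
	for(i = 0; i < nfiles; i++) {
		grand_total += files[i].size;
	}
	for(i = 0; i < nfiles; i++) {
		int xfered = 0;
		while(xfered < files[i].size) {
			int left = files[i].size - xfered;
			int step = left < chunk ? left : chunk;
			xfered += step;
			grand_xfered += step;
			if(file_cb) {
				file_cb(files[i].filename, xfered, files[i].size);
			}
			if(total_cb) {
				total_cb(grand_xfered, grand_total);
			}
		}
	}
	return grand_xfered;
}

/* Example frontend callbacks that only record what they were told. */
static int file_calls, total_calls, last_total_xfered, last_grand_total;

void example_file_cb(const char *filename, int xfered, int total)
{
	(void)filename; (void)xfered; (void)total;
	file_calls++;
}

void example_total_cb(int xfered, int total)
{
	total_calls++;
	last_total_xfered = xfered;
	last_grand_total = total;
}
```

A frontend could draw a per-file bar from the first callback and an overall bar from the second. The four numbers and one filename involved here are all plain parameters.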
In my opinion the parameters of callback functions should be reworked (warning, API change ;-). My preferred solution would be putting one param, a pointer to a complicated union or whatever, which can be accessed via alpm_info_get_xfered(ptr) etc. thus making it much more flexible. Thoughts?
I don't think this complexity is quite needed for the download functions. Maybe for the other functions later on though. For downloads we are dealing with 4 numbers and a filename.
-Dan
On Wed, May 14, 2008 at 7:46 PM, Dan McGee
Using a struct for this would be a shade more clear. But it's really just smoke-and-mirrors. Whether all that data is a struct or raw params means little.
participants (6)
- Aaron Griffin
- Dan McGee
- Dan McGee
- Miklos Vajna
- Nagy Gabor
- Xavier