[pacman-dev] [BUG] alpm_list_remove need an initalized pointer for void **data
hi. i am playing with alpm and db4 to make a little program for me. just for fun. and in this new release of libalpm (along pacman 3.2.0) i have found that alpm_list_remove behave differently than before. so the last parameter (void **data) need to be initialized to avoid a seg fault. it happens in alpm_db_unregister if you unregister a sync db if you run this, you will see it happen #include <stdio.h> #include <stdlib.h> #include <alpm.h> int main(void) { pmdb_t *db; pmpkg_t *pkg; const char *s; alpm_initialize(); alpm_option_set_root("/"); alpm_option_set_dbpath("/var/lib/pacman"); alpm_option_add_cachedir("/var/cache/pacman/pkg"); alpm_option_set_logfile("/dev/stdout"); db = alpm_db_register_sync("extra"); puts("alpm_db_unregister(db)"); alpm_db_unregister(db); puts("alpm_release()"); alpm_release(); exit(EXIT_SUCCESS); } i wonder how it has not generated more bug in libalpm. so i check all uninitalised pointer in libalpm before alpm_list_remove and i found 3 occurences see the patch diff --git a/lib/libalpm/cache.c b/lib/libalpm/cache.c index 032cc97..b7681db 100644 --- a/lib/libalpm/cache.c +++ b/lib/libalpm/cache.c @@ -135,7 +135,7 @@ int _alpm_db_add_pkgincache(pmdb_t *db, pmpkg_t *pkg) int _alpm_db_remove_pkgfromcache(pmdb_t *db, pmpkg_t *pkg) { - void *vdata; + void *vdata=NULL; pmpkg_t *data; ALPM_LOG_FUNC; diff --git a/lib/libalpm/db.c b/lib/libalpm/db.c index d9a3931..c0d322c 100644 --- a/lib/libalpm/db.c +++ b/lib/libalpm/db.c @@ -145,7 +145,7 @@ int SYMEXPORT alpm_db_unregister(pmdb_t *db) * databases by walking through the list returned by * alpm_option_get_syncdbs, because the db is removed from that list here. */ - void *data; + void *data=NULL; handle->dbs_sync = alpm_list_remove(handle->dbs_sync, db, _alpm_db_cmp, &data); if(data) { diff --git a/lib/libalpm/remove.c b/lib/libalpm/remove.c index 864fafa..6ea16c0 100644 --- a/lib/libalpm/remove.c +++ b/lib/libalpm/remove.c @@ -118,7 +118,7 @@ static void remove_prepare_keep_needed(pmtrans_t *trans, pmdb_t *db, alpm_list_t *i; for(i = lp; i; i = i->next) { pmdepmissing_t *miss = (pmdepmissing_t *)i->data; - void *vpkg; + void *vpkg=NULL; pmpkg_t *pkg = _alpm_pkg_find(trans->packages, miss->causingpkg); if(pkg == NULL) { continue;
On Thu, Aug 7, 2008 at 7:42 AM, solsTiCe d'Hiver <solstice.dhiver@gmail.com> wrote:
hi.
i am playing with alpm and db4 to make a little program for me. just for fun.
and in this new release of libalpm (along pacman 3.2.0) i have found that alpm_list_remove behave differently than before. so the last parameter (void **data) need to be initialized to avoid a seg fault.
it happens in alpm_db_unregister if you unregister a sync db if you run this, you will see it happen
#include <stdio.h> #include <stdlib.h> #include <alpm.h>
int main(void) { pmdb_t *db; pmpkg_t *pkg; const char *s;
alpm_initialize(); alpm_option_set_root("/"); alpm_option_set_dbpath("/var/lib/pacman"); alpm_option_add_cachedir("/var/cache/pacman/pkg"); alpm_option_set_logfile("/dev/stdout"); db = alpm_db_register_sync("extra"); puts("alpm_db_unregister(db)"); alpm_db_unregister(db); puts("alpm_release()"); alpm_release(); exit(EXIT_SUCCESS); }
i wonder how it has not generated more bug in libalpm.
I think the problem is something different; note the problem occurs in db_cmp: Program received signal SIGSEGV, Segmentation fault. _alpm_db_cmp (d1=0x97b60f0, d2=0x97b60f0) at db.c:363 363 return(strcmp(db1->treename, db2->treename)); (gdb) bt #0 _alpm_db_cmp (d1=0x97b60f0, d2=0x97b60f0) at db.c:363 #1 0xb8006a8c in alpm_list_remove (haystack=0x97b6140, needle=0x97b60f0, fn=0xb800d5b0 <_alpm_db_cmp>, data=0xbf841064) at alpm_list.c:314 #2 0xb800ead2 in alpm_db_unregister (db=0x97b60f0) at db.c:149 #3 0x08048770 in main () at test.c:17 (gdb) p d1 $1 = (const void *) 0x97b60f0 (gdb) p d2 $2 = (const void *) 0x97b60f0 (gdb) p d1->treename Attempt to dereference a generic pointer. (gdb) p ((pmdb_t)d1)->treename $3 = 0x97b60f0 "\020a{\t0a{\t\220p{\t" (gdb) p ((pmdb_t)d2)->treename $4 = 0x0 I'm not completely sure what is going on here. Here is the start of list_remove. If data is anything except null, we set it to null anyway: alpm_list_t SYMEXPORT *alpm_list_remove(alpm_list_t *haystack, const void *needle, alpm_list_fn_cmp fn, void **data) { alpm_list_t *i = haystack, *tmp = NULL; if(data) { *data = NULL; } -Dan
On Thu, Aug 7, 2008 at 3:41 PM, Dan McGee <dpmcgee@gmail.com> wrote:
(gdb) p d1 $1 = (const void *) 0x97b60f0 (gdb) p d2 $2 = (const void *) 0x97b60f0 (gdb) p d1->treename Attempt to dereference a generic pointer. (gdb) p ((pmdb_t)d1)->treename $3 = 0x97b60f0 "\020a{\t0a{\t\220p{\t" (gdb) p ((pmdb_t)d2)->treename $4 = 0x0
wtf, how is that even possible?
On Thu, Aug 7, 2008 at 9:07 AM, Xavier <shiningxc@gmail.com> wrote:
On Thu, Aug 7, 2008 at 3:41 PM, Dan McGee <dpmcgee@gmail.com> wrote:
(gdb) p d1 $1 = (const void *) 0x97b60f0 (gdb) p d2 $2 = (const void *) 0x97b60f0 (gdb) p d1->treename Attempt to dereference a generic pointer. (gdb) p ((pmdb_t)d1)->treename $3 = 0x97b60f0 "\020a{\t0a{\t\220p{\t" (gdb) p ((pmdb_t)d2)->treename $4 = 0x0
wtf, how is that even possible?
I'm not sure, I feel like GDB is doing something odd here considering d1 and d2 are identical addresses. Oh shoot, I see something now. :) I de-refed the instance type, not a pointer type. (gdb) p ((pmdb_t*)d1)->treename $1 = 0x891d130 "extra" (gdb) p ((pmdb_t*)d2)->treename $2 = 0x891d130 "extra" That makes more sense, but with those values, I have no idea why it is segfaulting. Anyone? I thought it might be due to strcmp() not liking the same exact string for both inputs, but a quick test put that fear to rest. -Dan
for me it is a problem with alpm_list_remove but i am not an expert. look at this #include <alpm.h> int compare_int(const void *a, const void *b) { const int* ia = (const int *)a; const int* ib = (const int *)b; return (*ia > *ib) - (*ia < *ib); } int main(void) { alpm_list_t *l=NULL; int a=2,a2=2; l= alpm_list_add(l, &a); l= alpm_list_add(l, &a2); void **data=NULL; l = alpm_list_remove(l, &a, compare_int, data); puts("still ok ?"); void **data2; l = alpm_list_remove(l, &a2, compare_int, data2); return 0; } it only seg fault at the second alpm_list_remove
On Thu, Aug 7, 2008 at 5:23 PM, solsTiCe d'Hiver <solstice.dhiver@gmail.com> wrote:
for me it is a problem with alpm_list_remove but i am not an expert.
And Dan is an expert so he must be right :)
look at this #include <alpm.h>
int compare_int(const void *a, const void *b) { const int* ia = (const int *)a; const int* ib = (const int *)b; return (*ia > *ib) - (*ia < *ib); }
int main(void) { alpm_list_t *l=NULL; int a=2,a2=2; l= alpm_list_add(l, &a); l= alpm_list_add(l, &a2); void **data=NULL; l = alpm_list_remove(l, &a, compare_int, data); puts("still ok ?"); void **data2; l = alpm_list_remove(l, &a2, compare_int, data2); return 0; }
it only seg fault at the second alpm_list_remove
Indeed, this second example segfault in list_remove. But not your first one, which segfaults on the fn(...) call inside list_remove. And the fn parameter is _alpm_db_cmp. Try this : void *data; l = alpm_list_remove(l, &a, compare_int, &data); void *data2; l = alpm_list_remove(l, &a2, compare_int, &data2);
On Thu, Aug 7, 2008 at 10:23 AM, solsTiCe d'Hiver <solstice.dhiver@gmail.com> wrote:
for me it is a problem with alpm_list_remove but i am not an expert.
look at this #include <alpm.h>
int compare_int(const void *a, const void *b) { const int* ia = (const int *)a; const int* ib = (const int *)b; return (*ia > *ib) - (*ia < *ib); }
int main(void) { alpm_list_t *l=NULL; int a=2,a2=2; l= alpm_list_add(l, &a); l= alpm_list_add(l, &a2); void **data=NULL; l = alpm_list_remove(l, &a, compare_int, data); puts("still ok ?"); void **data2; l = alpm_list_remove(l, &a2, compare_int, data2); return 0; }
it only seg fault at the second alpm_list_remove
OK, you've found a much better test case here, thanks. This points us to line 302, the NULL set below: if(data) { *data = NULL; } This check doesn't seem quite right- shouldn't we be checking *data for existance before we go setting things? Something like: if(data && *data) { *data = NULL; } Or what do we do here? Double pointers start confusing me. :) -Dan
On Thu, Aug 7, 2008 at 10:34 AM, Xavier <shiningxc@gmail.com> wrote:
Indeed, this second example segfault in list_remove. But not your first one, which segfaults on the fn(...) call inside list_remove. And the fn parameter is _alpm_db_cmp.
It appears we still have a problem with the db_cmp function, I agree. We might want to try and isolate this issue as well.
On Thu, Aug 7, 2008 at 5:36 PM, Dan McGee <dpmcgee@gmail.com> wrote:
OK, you've found a much better test case here, thanks. This points us to line 302, the NULL set below:
if(data) { *data = NULL; }
This check doesn't seem quite right- shouldn't we be checking *data for existance before we go setting things? Something like:
if(data && *data) { *data = NULL; }
Or what do we do here? Double pointers start confusing me. :)
We already have this kind of code everywhere in libalpm. It does not cause any problems when it is used correctly. And I don't know how to make it safer either. How can we know if data is actually a valid address, and so if we can access *data or not.
Or what do we do here? Double pointers start confusing me. i reach here the limit of my knowlegde of C too.
but why not : if(data) { data = NULL; } if have found by trial and error that's the only thing that is not seg faulting and void **data; data=NULL; is the same than void **data=NULL;
On Thu, Aug 7, 2008 at 3:41 PM, Dan McGee <dpmcgee@gmail.com> wrote:
I think the problem is something different; note the problem occurs in db_cmp:
Program received signal SIGSEGV, Segmentation fault. _alpm_db_cmp (d1=0x97b60f0, d2=0x97b60f0) at db.c:363 363 return(strcmp(db1->treename, db2->treename)); (gdb) bt #0 _alpm_db_cmp (d1=0x97b60f0, d2=0x97b60f0) at db.c:363 #1 0xb8006a8c in alpm_list_remove (haystack=0x97b6140, needle=0x97b60f0, fn=0xb800d5b0 <_alpm_db_cmp>, data=0xbf841064) at alpm_list.c:314 #2 0xb800ead2 in alpm_db_unregister (db=0x97b60f0) at db.c:149 #3 0x08048770 in main () at test.c:17 (gdb) p d1 $1 = (const void *) 0x97b60f0 (gdb) p d2 $2 = (const void *) 0x97b60f0 (gdb) p d1->treename Attempt to dereference a generic pointer. (gdb) p ((pmdb_t)d1)->treename $3 = 0x97b60f0 "\020a{\t0a{\t\220p{\t" (gdb) p ((pmdb_t)d2)->treename $4 = 0x0
-int _alpm_db_cmp(const void *db1, const void *db2) +int _alpm_db_cmp(const void *d1, const void *d2) { - ALPM_LOG_FUNC; - return(strcmp(((pmdb_t *)db1)->treename, ((pmdb_t *)db2)->treename)); + pmdb_t *db1 = (pmdb_t *)db1; + pmdb_t *db2 = (pmdb_t *)db2; + return(strcmp(db1->treename, db2->treename)); } Oh my god, who could have wrote such a stupid code :@ commit f43805d875ad5c672afbbfff48bded2087204773 Author: Chantry Xavier <shiningxc@gmail.com> Date: Sat May 10 18:47:42 2008 +0200 Oh my god, it was me. /me bangs his head against the wall 100 times.
On Fri, Aug 8, 2008 at 6:32 AM, Xavier <shiningxc@gmail.com> wrote:
On Thu, Aug 7, 2008 at 3:41 PM, Dan McGee <dpmcgee@gmail.com> wrote:
I think the problem is something different; note the problem occurs in db_cmp:
Program received signal SIGSEGV, Segmentation fault. _alpm_db_cmp (d1=0x97b60f0, d2=0x97b60f0) at db.c:363 363 return(strcmp(db1->treename, db2->treename)); (gdb) bt #0 _alpm_db_cmp (d1=0x97b60f0, d2=0x97b60f0) at db.c:363 #1 0xb8006a8c in alpm_list_remove (haystack=0x97b6140, needle=0x97b60f0, fn=0xb800d5b0 <_alpm_db_cmp>, data=0xbf841064) at alpm_list.c:314 #2 0xb800ead2 in alpm_db_unregister (db=0x97b60f0) at db.c:149 #3 0x08048770 in main () at test.c:17 (gdb) p d1 $1 = (const void *) 0x97b60f0 (gdb) p d2 $2 = (const void *) 0x97b60f0 (gdb) p d1->treename Attempt to dereference a generic pointer. (gdb) p ((pmdb_t)d1)->treename $3 = 0x97b60f0 "\020a{\t0a{\t\220p{\t" (gdb) p ((pmdb_t)d2)->treename $4 = 0x0
-int _alpm_db_cmp(const void *db1, const void *db2) +int _alpm_db_cmp(const void *d1, const void *d2) { - ALPM_LOG_FUNC; - return(strcmp(((pmdb_t *)db1)->treename, ((pmdb_t *)db2)->treename)); + pmdb_t *db1 = (pmdb_t *)db1; + pmdb_t *db2 = (pmdb_t *)db2; + return(strcmp(db1->treename, db2->treename)); }
Oh my god, who could have wrote such a stupid code :@
commit f43805d875ad5c672afbbfff48bded2087204773 Author: Chantry Xavier <shiningxc@gmail.com> Date: Sat May 10 18:47:42 2008 +0200
Oh my god, it was me. /me bangs his head against the wall 100 times.
Why on earth did that even compile? This seems to work just fine: diff --git a/lib/libalpm/db.c b/lib/libalpm/db.c index d9a3931..191c8ba 100644 --- a/lib/libalpm/db.c +++ b/lib/libalpm/db.c @@ -358,8 +358,8 @@ void _alpm_db_free(pmdb_t *db) int _alpm_db_cmp(const void *d1, const void *d2) { - pmdb_t *db1 = (pmdb_t *)db1; - pmdb_t *db2 = (pmdb_t *)db2; + pmdb_t *db1 = (pmdb_t *)d1; + pmdb_t *db2 = (pmdb_t *)d2; return(strcmp(db1->treename, db2->treename)); } diff --git a/lib/libalpm/db.h b/lib/libalpm/db.h index eb0af1a..96fac0d 100644 --- a/lib/libalpm/db.h +++ b/lib/libalpm/db.h @@ -50,7 +50,7 @@ struct __pmdb_t { /* db.c, database general calls */ pmdb_t *_alpm_db_new(const char *dbpath, const char *treename); void _alpm_db_free(pmdb_t *db); -int _alpm_db_cmp(const void *db1, const void *db2); +int _alpm_db_cmp(const void *d1, const void *d2); alpm_list_t *_alpm_db_search(pmdb_t *db, const alpm_list_t *needles); pmdb_t *_alpm_db_register_local(void); pmdb_t *_alpm_db_register_sync(const char *treename);
On Thu, Aug 7, 2008 at 6:10 PM, solsTiCe d'Hiver <solstice.dhiver@gmail.com> wrote:
Or what do we do here? Double pointers start confusing me. i reach here the limit of my knowlegde of C too.
but why not : if(data) { data = NULL; }
if have found by trial and error that's the only thing that is not seg faulting
and void **data; data=NULL; is the same than void **data=NULL;
If you do that and then call list_remove like this : alpm_list_remove(..., data); Then it is the same than alpm_list_remove(...,NULL); So what is the point? In this case, the list_remove can't give us back a pointer to the removed node. We don't want data to be NULL, we want it to be a valid address, where the list_remove function will be able to store something. I already gave you the usual way to provide that, used everywhere in pacman/libalpm, which is to define void *data (a pointer), and then give &data (the address of that pointer) as a parameter. If you insist on using a void **data (a pointer to another pointer) and passing data as a parameter, then you need the data value to be a valid memory address. If you let it undefined, it has usually some random value. And when you try to use that value as an address, by using *data, then boom, segfault. That is when you need malloc. malloc will allocate some memory and return a pointer to that allocated memory. data is then a valid memory address and *data can be used. Here is a more fun explanation : http://cslibrary.stanford.edu/104/
oops my bad...
participants (3)
-
Dan McGee
-
solsTiCe d'Hiver
-
Xavier