[pacman-dev] [BUG] alpm_list_remove needs an initialized pointer for void **data
Hi,
I am playing with alpm and db4 to write a little program for myself, just for fun.
In this new release of libalpm (shipped with pacman 3.2.0) I have found that
alpm_list_remove behaves differently than before: the last parameter (void
**data) needs to be initialized to avoid a segfault.
It happens in alpm_db_unregister if you unregister a sync db.
If you run this, you will see it happen:
#include <stdio.h>
#include <stdlib.h>
#include <alpm.h>
int main(void)
{
    pmdb_t *db;
    pmpkg_t *pkg;
    const char *s;
    alpm_initialize();
    alpm_option_set_root("/");
    alpm_option_set_dbpath("/var/lib/pacman");
    alpm_option_add_cachedir("/var/cache/pacman/pkg");
    alpm_option_set_logfile("/dev/stdout");
    db = alpm_db_register_sync("extra");
    puts("alpm_db_unregister(db)");
    alpm_db_unregister(db);
    puts("alpm_release()");
    alpm_release();
    exit(EXIT_SUCCESS);
}
I wonder how it has not generated more bugs in libalpm.
On Thu, Aug 7, 2008 at 7:42 AM, solsTiCe d'Hiver wrote:
I wonder how it has not generated more bugs in libalpm.
I think the problem is something different; note the problem occurs in db_cmp:
Program received signal SIGSEGV, Segmentation fault.
_alpm_db_cmp (d1=0x97b60f0, d2=0x97b60f0) at db.c:363
363         return(strcmp(db1->treename, db2->treename));
(gdb) bt
#0  _alpm_db_cmp (d1=0x97b60f0, d2=0x97b60f0) at db.c:363
#1  0xb8006a8c in alpm_list_remove (haystack=0x97b6140, needle=0x97b60f0, fn=0xb800d5b0 <_alpm_db_cmp>, data=0xbf841064) at alpm_list.c:314
#2  0xb800ead2 in alpm_db_unregister (db=0x97b60f0) at db.c:149
#3  0x08048770 in main () at test.c:17
(gdb) p d1
$1 = (const void *) 0x97b60f0
(gdb) p d2
$2 = (const void *) 0x97b60f0
(gdb) p d1->treename
Attempt to dereference a generic pointer.
(gdb) p ((pmdb_t)d1)->treename
$3 = 0x97b60f0 "\020a{\t0a{\t\220p{\t"
(gdb) p ((pmdb_t)d2)->treename
$4 = 0x0
I'm not completely sure what is going on here. Here is the start of list_remove. If data is anything except NULL, we set *data to NULL anyway:
alpm_list_t SYMEXPORT *alpm_list_remove(alpm_list_t *haystack, const void *needle, alpm_list_fn_cmp fn, void **data)
{
    alpm_list_t *i = haystack, *tmp = NULL;
    if(data) {
        *data = NULL;
    }
-Dan
On Thu, Aug 7, 2008 at 3:41 PM, Dan McGee wrote:
(gdb) p d1
$1 = (const void *) 0x97b60f0
(gdb) p d2
$2 = (const void *) 0x97b60f0
(gdb) p d1->treename
Attempt to dereference a generic pointer.
(gdb) p ((pmdb_t)d1)->treename
$3 = 0x97b60f0 "\020a{\t0a{\t\220p{\t"
(gdb) p ((pmdb_t)d2)->treename
$4 = 0x0
wtf, how is that even possible?
On Thu, Aug 7, 2008 at 9:07 AM, Xavier wrote:
wtf, how is that even possible?
I'm not sure, I feel like GDB is doing something odd here considering d1 and d2 are identical addresses. Oh shoot, I see something now. :) I dereferenced a cast to the instance type, not to a pointer type.
(gdb) p ((pmdb_t*)d1)->treename
$1 = 0x891d130 "extra"
(gdb) p ((pmdb_t*)d2)->treename
$2 = 0x891d130 "extra"
That makes more sense, but with those values, I have no idea why it is segfaulting. Anyone? I thought it might be due to strcmp() not liking the same exact string for both inputs, but a quick test put that fear to rest.
-Dan
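The quick test Dan mentions is not shown in the thread. A minimal sketch of such a check might look like the following; the helper names compare_with_itself and check_same_pointer are hypothetical, and the only point is that strcmp() accepts the same pointer for both arguments.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical check, not from the thread: pass the very same pointer
 * as both strcmp() arguments, which is what _alpm_db_cmp ends up doing
 * when d1 == d2. */
int compare_with_itself(const char *s)
{
	/* strcmp() only reads its arguments, so aliasing them is allowed;
	 * a string always compares equal to itself. */
	return strcmp(s, s);
}

void check_same_pointer(void)
{
	const char *treename = "extra";
	assert(compare_with_itself(treename) == 0);
}
```

Calling check_same_pointer() from a small main() is enough to exercise it; if the assertion holds, identical inputs are ruled out as the cause of the crash.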
For me it is a problem with alpm_list_remove, but I am not an expert.
Look at this:
#include
On Thu, Aug 7, 2008 at 5:23 PM, solsTiCe d'Hiver wrote:
For me it is a problem with alpm_list_remove, but I am not an expert.
And Dan is an expert so he must be right :)
Look at this:
#include <stdio.h>
#include <alpm_list.h>
int compare_int(const void *a, const void *b)
{
    const int *ia = (const int *)a;
    const int *ib = (const int *)b;
    return (*ia > *ib) - (*ia < *ib);
}
int main(void)
{
    alpm_list_t *l = NULL;
    int a = 2, a2 = 2;
    l = alpm_list_add(l, &a);
    l = alpm_list_add(l, &a2);
    void **data = NULL;   /* NULL: list_remove skips the *data store */
    l = alpm_list_remove(l, &a, compare_int, data);
    puts("still ok ?");
    void **data2;         /* uninitialized: list_remove writes through a garbage address */
    l = alpm_list_remove(l, &a2, compare_int, data2);
    return 0;
}
It only segfaults at the second alpm_list_remove.
Indeed, this second example segfaults in list_remove. But not your first one, which segfaults on the fn(...) call inside list_remove. And the fn parameter is _alpm_db_cmp.
Try this:
void *data;
l = alpm_list_remove(l, &a, compare_int, &data);
void *data2;
l = alpm_list_remove(l, &a2, compare_int, &data2);
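The calling convention Xavier suggests can be tried without libalpm. What follows is only a sketch under stated assumptions: node_t, list_push and list_remove_first are hypothetical stand-ins written for this note, not the libalpm implementation. It shows a callee that clears and then fills *out, and a caller that passes the address of a plain void *.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical minimal list, standing in for alpm_list_t. */
typedef struct node {
	void *data;
	struct node *next;
} node_t;

typedef int (*cmp_fn)(const void *, const void *);

/* Prepend an item and return the new head (a NULL head is an empty list). */
node_t *list_push(node_t *head, void *data)
{
	node_t *n = malloc(sizeof(*n));
	if(n == NULL) {
		return head;
	}
	n->data = data;
	n->next = head;
	return n;
}

/* Remove the first item matching needle. If out is non-NULL, *out receives
 * the removed item's data, or NULL when nothing matched. The function can
 * only compare out against NULL; it has no way to tell whether a non-NULL
 * out points to valid storage. */
node_t *list_remove_first(node_t *head, const void *needle, cmp_fn fn, void **out)
{
	node_t **link = &head;
	if(out) {
		*out = NULL;
	}
	while(*link) {
		node_t *cur = *link;
		if(fn(cur->data, needle) == 0) {
			*link = cur->next; /* unlink the matching node */
			if(out) {
				*out = cur->data;
			}
			free(cur);
			break;
		}
		link = &cur->next;
	}
	return head;
}

int compare_int(const void *a, const void *b)
{
	const int *ia = a, *ib = b;
	return (*ia > *ib) - (*ia < *ib);
}

/* Caller side: a plain void * on the stack, passed by address. */
void demo_remove(void)
{
	int a = 2, b = 3;
	node_t *l = NULL;
	void *data; /* may stay uninitialized: &data is already a valid address */
	l = list_push(l, &a);
	l = list_push(l, &b);
	l = list_remove_first(l, &a, compare_int, &data);
	assert(data == &a);
	l = list_remove_first(l, &b, compare_int, NULL); /* NULL: result not wanted */
	assert(l == NULL);
}
```

The design point is that the caller owns the storage for the result. Passing NULL is the one case the callee can detect, which is why it is the conventional way to say that no result is wanted.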
On Thu, Aug 7, 2008 at 10:34 AM, Xavier wrote:
Indeed, this second example segfaults in list_remove. But not your first one, which segfaults on the fn(...) call inside list_remove. And the fn parameter is _alpm_db_cmp.
It appears we still have a problem with the db_cmp function, I agree. We might want to try and isolate this issue as well.
On Thu, Aug 7, 2008 at 10:23 AM, solsTiCe d'Hiver wrote:
For me it is a problem with alpm_list_remove, but I am not an expert.
It only segfaults at the second alpm_list_remove.
OK, you've found a much better test case here, thanks. This points us to line 302, the NULL set below:
if(data) {
    *data = NULL;
}
This check doesn't seem quite right; shouldn't we be checking *data for existence before we go setting things? Something like:
if(data && *data) {
    *data = NULL;
}
Or what do we do here? Double pointers start confusing me. :)
-Dan
On Thu, Aug 7, 2008 at 5:36 PM, Dan McGee wrote:
OK, you've found a much better test case here, thanks. This points us to line 302, the NULL set below:
if(data) { *data = NULL; }
This check doesn't seem quite right; shouldn't we be checking *data for existence before we go setting things? Something like:
if(data && *data) { *data = NULL; }
Or what do we do here? Double pointers start confusing me. :)
We already have this kind of code everywhere in libalpm. It does not cause any problems when it is used correctly, and I don't know how to make it safer either. How can we know whether data is actually a valid address, and therefore whether we can access *data or not?
Or what do we do here? Double pointers start confusing me.
I have reached the limit of my knowledge of C here too.
But why not:
if(data) { data = NULL; }
I have found by trial and error that it's the only thing that does not segfault.
And
void **data; data=NULL;
is the same as
void **data=NULL;
On Thu, Aug 7, 2008 at 6:10 PM, solsTiCe d'Hiver wrote:
Or what do we do here? Double pointers start confusing me.
I have reached the limit of my knowledge of C here too.
But why not:
if(data) { data = NULL; }
I have found by trial and error that it's the only thing that does not segfault.
And
void **data; data=NULL;
is the same as
void **data=NULL;
If you do that and then call list_remove like this:
alpm_list_remove(..., data);
then it is the same as
alpm_list_remove(..., NULL);
So what is the point? In this case, list_remove can't give us back a pointer to the removed item's data.
We don't want data to be NULL; we want it to be a valid address where the list_remove function can store something. I already gave you the usual way to provide that, used everywhere in pacman/libalpm: define void *data (a pointer), and then pass &data (the address of that pointer) as the parameter.
If you insist on using a void **data (a pointer to another pointer) and passing data as the parameter, then the value of data needs to be a valid memory address. If you leave it uninitialized, it usually holds some random value, and when you try to use that value as an address by writing *data, then boom, segfault. That is when you need malloc: it allocates some memory and returns a pointer to it, so data is then a valid memory address and *data can be used.
Here is a more fun explanation: http://cslibrary.stanford.edu/104/
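The two points in this explanation can be put side by side in a small sketch. Everything in it is hypothetical and written for this note (store_nothing, store_result, removed_item and the two with_* helpers are not libalpm code): assigning to the parameter itself changes only the callee's local copy, while writing through it needs valid storage, which can come either from the address of a local void * or from malloc.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

static int removed_item = 42; /* stands in for a removed list item */

/* Assigning to the parameter only changes this function's local copy,
 * so nothing crashes, but the caller never sees a result either. */
void store_nothing(void **data)
{
	if(data) {
		data = NULL; /* no effect outside this function */
	}
	(void)data;
}

/* The usual idiom: write through the pointer. */
void store_result(void **data)
{
	if(data) {
		*data = &removed_item;
	}
}

/* Usual way: a void * on the stack, passed by address. */
void *with_stack_pointer(void)
{
	void *data = NULL;
	store_result(&data);
	return data;
}

/* The void ** variant: it first needs malloc'ed room for one void *. */
void *with_malloc(void)
{
	void **data = malloc(sizeof(*data));
	void *result = NULL;
	if(data) {
		store_result(data);
		result = *data;
		free(data);
	}
	return result;
}

void demo_storage(void)
{
	assert(with_stack_pointer() == &removed_item);
	assert(with_malloc() == &removed_item);
}
```

The stack version is shorter and has nothing to free, which is presumably why it is the form Xavier describes as being used throughout pacman and libalpm.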
oops my bad...
On Thu, Aug 7, 2008 at 3:41 PM, Dan McGee wrote:
I think the problem is something different; note the problem occurs in db_cmp:
Program received signal SIGSEGV, Segmentation fault.
_alpm_db_cmp (d1=0x97b60f0, d2=0x97b60f0) at db.c:363
363         return(strcmp(db1->treename, db2->treename));
(gdb) bt
#0  _alpm_db_cmp (d1=0x97b60f0, d2=0x97b60f0) at db.c:363
#1  0xb8006a8c in alpm_list_remove (haystack=0x97b6140, needle=0x97b60f0, fn=0xb800d5b0 <_alpm_db_cmp>, data=0xbf841064) at alpm_list.c:314
#2  0xb800ead2 in alpm_db_unregister (db=0x97b60f0) at db.c:149
#3  0x08048770 in main () at test.c:17
(gdb) p d1
$1 = (const void *) 0x97b60f0
(gdb) p d2
$2 = (const void *) 0x97b60f0
(gdb) p d1->treename
Attempt to dereference a generic pointer.
(gdb) p ((pmdb_t)d1)->treename
$3 = 0x97b60f0 "\020a{\t0a{\t\220p{\t"
(gdb) p ((pmdb_t)d2)->treename
$4 = 0x0
-int _alpm_db_cmp(const void *db1, const void *db2)
+int _alpm_db_cmp(const void *d1, const void *d2)
{
- ALPM_LOG_FUNC;
- return(strcmp(((pmdb_t *)db1)->treename, ((pmdb_t *)db2)->treename));
+ pmdb_t *db1 = (pmdb_t *)db1;
+ pmdb_t *db2 = (pmdb_t *)db2;
+ return(strcmp(db1->treename, db2->treename));
}
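Oh my god, who could have written such stupid code :@
commit f43805d875ad5c672afbbfff48bded2087204773
Author: Chantry Xavier
Date: Sat May 10 18:47:42 2008 +0200
Oh my god, it was me.
/me bangs his head against the wall 100 times.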
On Fri, Aug 8, 2008 at 6:32 AM, Xavier wrote:
-int _alpm_db_cmp(const void *db1, const void *db2)
+int _alpm_db_cmp(const void *d1, const void *d2)
{
- ALPM_LOG_FUNC;
- return(strcmp(((pmdb_t *)db1)->treename, ((pmdb_t *)db2)->treename));
+ pmdb_t *db1 = (pmdb_t *)db1;
+ pmdb_t *db2 = (pmdb_t *)db2;
+ return(strcmp(db1->treename, db2->treename));
}
Oh my god, who could have written such stupid code :@
commit f43805d875ad5c672afbbfff48bded2087204773
Author: Chantry Xavier
Date: Sat May 10 18:47:42 2008 +0200
Oh my god, it was me.
/me bangs his head against the wall 100 times.
Why on earth did that even compile? This seems to work just fine:
diff --git a/lib/libalpm/db.c b/lib/libalpm/db.c
index d9a3931..191c8ba 100644
--- a/lib/libalpm/db.c
+++ b/lib/libalpm/db.c
@@ -358,8 +358,8 @@ void _alpm_db_free(pmdb_t *db)
 int _alpm_db_cmp(const void *d1, const void *d2)
 {
-    pmdb_t *db1 = (pmdb_t *)db1;
-    pmdb_t *db2 = (pmdb_t *)db2;
+    pmdb_t *db1 = (pmdb_t *)d1;
+    pmdb_t *db2 = (pmdb_t *)d2;
     return(strcmp(db1->treename, db2->treename));
 }
diff --git a/lib/libalpm/db.h b/lib/libalpm/db.h
index eb0af1a..96fac0d 100644
--- a/lib/libalpm/db.h
+++ b/lib/libalpm/db.h
@@ -50,7 +50,7 @@ struct __pmdb_t {
 /* db.c, database general calls */
 pmdb_t *_alpm_db_new(const char *dbpath, const char *treename);
 void _alpm_db_free(pmdb_t *db);
-int _alpm_db_cmp(const void *db1, const void *db2);
+int _alpm_db_cmp(const void *d1, const void *d2);
 alpm_list_t *_alpm_db_search(pmdb_t *db, const alpm_list_t *needles);
 pmdb_t *_alpm_db_register_local(void);
 pmdb_t *_alpm_db_register_sync(const char *treename);
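On why it compiled: in C, a variable's name is already in scope inside its own initializer, so a declaration such as pmdb_t *db1 = (pmdb_t *)db1; is accepted and simply reads the new variable's indeterminate value. GCC documents a -Winit-self warning (used together with -Wuninitialized) for this pattern; whether it would have fired in the build discussed here is not something the thread establishes. A minimal sketch of the corrected comparator follows, using a hypothetical db_t stand-in for pmdb_t so that it builds without libalpm.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for pmdb_t; only the field used here. */
typedef struct {
	const char *treename;
} db_t;

/* Parameters and locals have distinct names, as in the fixed version.
 * Writing "const db_t *db1 = (const db_t *)db1;" would also compile,
 * because db1 is in scope inside its own initializer, but it would read
 * an indeterminate pointer instead of the argument. */
int db_cmp(const void *d1, const void *d2)
{
	const db_t *db1 = d1;
	const db_t *db2 = d2;
	return strcmp(db1->treename, db2->treename);
}

void check_db_cmp(void)
{
	db_t extra = { "extra" };
	db_t core = { "core" };
	assert(db_cmp(&extra, &extra) == 0); /* same object on both sides */
	assert(db_cmp(&core, &extra) < 0);   /* "core" sorts before "extra" */
}
```

Keeping the void * parameters and the typed locals under different names, as the patch above does with d1/d2 and db1/db2, is what makes this class of mistake hard to write by accident.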
participants (3): Dan McGee, solsTiCe d'Hiver, Xavier