[pacman-dev] [PATCH 1/2] Use macros in sync DB parsing
This simplifies a lot of the repetative code and makes it obvious where the tricky or different ones are (e.g. depends, dates). It also makes it significantly easier to change the way this code works in the future. There should be no functional change with this patch. Signed-off-by: Dan McGee <dan@archlinux.org> --- This is really just setup for the following patch, or that one would be much more of a mess and you wouldn't be able to seperate the actual functional changes from the necessary ones. This does reduce the possibility for making errors in this copy-paste code though. -Dan lib/libalpm/be_sync.c | 151 +++++++++++++++++-------------------------------- 1 files changed, 52 insertions(+), 99 deletions(-) diff --git a/lib/libalpm/be_sync.c b/lib/libalpm/be_sync.c index 72caa50..137fc1b 100644 --- a/lib/libalpm/be_sync.c +++ b/lib/libalpm/be_sync.c @@ -218,6 +218,24 @@ static int sync_db_populate(pmdb_t *db) return(count); } +#define READ_NEXT(s) do { \ + if(_alpm_archive_fgets(s, sizeof(s), archive) == NULL) goto error; \ + _alpm_strtrim(s); \ +} while(0) + +#define READ_AND_STORE(f) do { \ + READ_NEXT(line); \ + STRDUP(f, line, goto error); \ +} while(0) + +#define READ_AND_STORE_ALL(f) do { \ + char *linedup; \ + READ_NEXT(line); \ + if(strlen(line) == 0) break; \ + STRDUP(linedup, line, goto error); \ + f = alpm_list_add(f, linedup); \ +} while(1) /* note the while(1) and not (0) */ + static int sync_db_read(pmdb_t *db, struct archive *archive, struct archive_entry *entry) { char line[1024]; @@ -264,63 +282,35 @@ static int sync_db_read(pmdb_t *db, struct archive *archive, struct archive_entr while(_alpm_archive_fgets(line, sizeof(line), archive) != NULL) { _alpm_strtrim(line); if(strcmp(line, "%NAME%") == 0) { - if(_alpm_archive_fgets(line, sizeof(line), archive) == NULL) { - goto error; - } - if(strcmp(_alpm_strtrim(line), pkg->name) != 0) { + READ_NEXT(line); + if(strcmp(line, pkg->name) != 0) { _alpm_log(PM_LOG_ERROR, _("%s database is inconsistent: name " "mismatch on package %s\n"), db->treename, pkg->name); } } else if(strcmp(line, "%VERSION%") == 0) { - if(_alpm_archive_fgets(line, sizeof(line), archive) == NULL) { - goto error; - } - if(strcmp(_alpm_strtrim(line), pkg->version) != 0) { + READ_NEXT(line); + if(strcmp(line, pkg->version) != 0) { _alpm_log(PM_LOG_ERROR, _("%s database is inconsistent: version " "mismatch on package %s\n"), db->treename, pkg->name); } } else if(strcmp(line, "%FILENAME%") == 0) { - if(_alpm_archive_fgets(line, sizeof(line), archive) == NULL) { - goto error; - } - STRDUP(pkg->filename, _alpm_strtrim(line), goto error); + READ_AND_STORE(pkg->filename); } else if(strcmp(line, "%DESC%") == 0) { - if(_alpm_archive_fgets(line, sizeof(line), archive) == NULL) { - goto error; - } - STRDUP(pkg->desc, _alpm_strtrim(line), goto error); + READ_AND_STORE(pkg->desc); } else if(strcmp(line, "%GROUPS%") == 0) { - while(_alpm_archive_fgets(line, sizeof(line), archive) && strlen(_alpm_strtrim(line))) { - char *linedup; - STRDUP(linedup, _alpm_strtrim(line), goto error); - pkg->groups = alpm_list_add(pkg->groups, linedup); - } + READ_AND_STORE_ALL(pkg->groups); } else if(strcmp(line, "%URL%") == 0) { - if(_alpm_archive_fgets(line, sizeof(line), archive) == NULL) { - goto error; - } - STRDUP(pkg->url, _alpm_strtrim(line), goto error); + READ_AND_STORE(pkg->url); } else if(strcmp(line, "%LICENSE%") == 0) { - while(_alpm_archive_fgets(line, sizeof(line), archive) && - strlen(_alpm_strtrim(line))) { - char *linedup; - STRDUP(linedup, _alpm_strtrim(line), goto error); - pkg->licenses = alpm_list_add(pkg->licenses, linedup); - } + READ_AND_STORE_ALL(pkg->licenses); } else if(strcmp(line, "%ARCH%") == 0) { - if(_alpm_archive_fgets(line, sizeof(line), archive) == NULL) { - goto error; - } - STRDUP(pkg->arch, _alpm_strtrim(line), goto error); + READ_AND_STORE(pkg->arch); } else if(strcmp(line, "%BUILDDATE%") == 0) { - if(_alpm_archive_fgets(line, sizeof(line), archive) == NULL) { - goto error; - } - _alpm_strtrim(line); - + READ_NEXT(line); char first = tolower((unsigned char)line[0]); if(first > 'a' && first < 'z') { - struct tm tmp_tm = {0}; /* initialize to null in case of failure */ + /* initialize to null in case of failure */ + struct tm tmp_tm = {0}; setlocale(LC_TIME, "C"); strptime(line, "%a %b %e %H:%M:%S %Y", &tmp_tm); pkg->builddate = mktime(&tmp_tm); @@ -329,46 +319,28 @@ static int sync_db_read(pmdb_t *db, struct archive *archive, struct archive_entr pkg->builddate = atol(line); } } else if(strcmp(line, "%PACKAGER%") == 0) { - if(_alpm_archive_fgets(line, sizeof(line), archive) == NULL) { - goto error; - } - STRDUP(pkg->packager, _alpm_strtrim(line), goto error); + READ_AND_STORE(pkg->packager); } else if(strcmp(line, "%CSIZE%") == 0) { - /* NOTE: the CSIZE and SIZE fields both share the "size" field - * in the pkginfo_t struct. This can be done b/c CSIZE - * is currently only used in sync databases, and SIZE is - * only used in local databases. + /* Note: the CSIZE and SIZE fields both share the "size" field in the + * pkginfo_t struct. This can be done b/c CSIZE is currently only used + * in sync databases, and SIZE is only used in local databases. */ - if(_alpm_archive_fgets(line, sizeof(line), archive) == NULL) { - goto error; - } - pkg->size = atol(_alpm_strtrim(line)); + READ_NEXT(line); + pkg->size = atol(line); /* also store this value to isize if isize is unset */ if(pkg->isize == 0) { pkg->isize = pkg->size; } } else if(strcmp(line, "%ISIZE%") == 0) { - if(_alpm_archive_fgets(line, sizeof(line), archive) == NULL) { - goto error; - } - pkg->isize = atol(_alpm_strtrim(line)); + READ_NEXT(line); + pkg->isize = atol(line); } else if(strcmp(line, "%MD5SUM%") == 0) { - if(_alpm_archive_fgets(line, sizeof(line), archive) == NULL) { - goto error; - } - STRDUP(pkg->md5sum, _alpm_strtrim(line), goto error); + READ_AND_STORE(pkg->md5sum); } else if(strcmp(line, "%REPLACES%") == 0) { - while(_alpm_archive_fgets(line, sizeof(line), archive) && - strlen(_alpm_strtrim(line))) { - char *linedup; - STRDUP(linedup, _alpm_strtrim(line), goto error); - pkg->replaces = alpm_list_add(pkg->replaces, linedup); - } + READ_AND_STORE_ALL(pkg->replaces); } else if(strcmp(line, "%EPOCH%") == 0) { - if(_alpm_archive_fgets(line, sizeof(line), archive) == NULL) { - goto error; - } - pkg->epoch = atoi(_alpm_strtrim(line)); + READ_NEXT(line); + pkg->epoch = atoi(line); } else if(strcmp(line, "%FORCE%") == 0) { /* For backward compatibility, treat force as a non-zero epoch * but only if we didn't already have a known epoch value. */ @@ -376,39 +348,20 @@ static int sync_db_read(pmdb_t *db, struct archive *archive, struct archive_entr pkg->epoch = 1; } } else if(strcmp(line, "%DEPENDS%") == 0) { - while(_alpm_archive_fgets(line, sizeof(line), archive) && - strlen(_alpm_strtrim(line))) { - pmdepend_t *dep = _alpm_splitdep(_alpm_strtrim(line)); - pkg->depends = alpm_list_add(pkg->depends, dep); + /* Different than the rest because of the _alpm_splitdep call. */ + while(1) { + READ_NEXT(line); + if(strlen(line) == 0) break; + pkg->depends = alpm_list_add(pkg->depends, _alpm_splitdep(line)); } } else if(strcmp(line, "%OPTDEPENDS%") == 0) { - while(_alpm_archive_fgets(line, sizeof(line), archive) && - strlen(_alpm_strtrim(line))) { - char *linedup; - STRDUP(linedup, _alpm_strtrim(line), goto error); - pkg->optdepends = alpm_list_add(pkg->optdepends, linedup); - } + READ_AND_STORE_ALL(pkg->optdepends); } else if(strcmp(line, "%CONFLICTS%") == 0) { - while(_alpm_archive_fgets(line, sizeof(line), archive) && - strlen(_alpm_strtrim(line))) { - char *linedup; - STRDUP(linedup, _alpm_strtrim(line), goto error); - pkg->conflicts = alpm_list_add(pkg->conflicts, linedup); - } + READ_AND_STORE_ALL(pkg->conflicts); } else if(strcmp(line, "%PROVIDES%") == 0) { - while(_alpm_archive_fgets(line, sizeof(line), archive) && - strlen(_alpm_strtrim(line))) { - char *linedup; - STRDUP(linedup, _alpm_strtrim(line), goto error); - pkg->provides = alpm_list_add(pkg->provides, linedup); - } + READ_AND_STORE_ALL(pkg->provides); } else if(strcmp(line, "%DELTAS%") == 0) { - while(_alpm_archive_fgets(line, sizeof(line), archive) && strlen(_alpm_strtrim(line))) { - pmdelta_t *delta = _alpm_delta_parse(line); - if(delta) { - pkg->deltas = alpm_list_add(pkg->deltas, delta); - } - } + READ_AND_STORE_ALL(pkg->deltas); } } } else { -- 1.7.3.3
The old function was written in a time before we relied on it for nearly every operation. Since then, we have switched to the archive backend and now fast parsing is a big deal. The former function made a per-character call to the libarchive archive_read_data() function, which resulted in some 21 million calls in a typical "load all sync dbs" operation. If we instead do some buffering of our own and read the blocks directly, and then find our newlines from there, we can cut out the multiple layers of overhead and go from archive to parsed data much quicker. Both users of the former function are switched over to the new signature, made easier by the macros now in place in the sync backend parsing code. Performance: for a `pacman -Su` (no upgrades available), _alpm_archive_fgets() goes from being 29% of the total time to 12% The time spent on the libarchive function being called dropped from 24% to 6%. This pushes _alpm_pkg_find back to the title of slowest low-level function. Signed-off-by: Dan McGee <dan@archlinux.org> --- I wish this one was a bit easier to comprehend and grok, but the cost of performance here is a bit more complicated fgets function than the very simplistic one we had before. The new function revolves around the zero-copy, retrieve an entire block functinoality of libarchive. We create a structure to keep all the bookkeeping info we need around, including a buffer of our own where each line ends up and auto-resizes itself as we poll through an archive file entry, and also cleans itself up when the file is EOF. This means malloc/free is not necessary by the caller, which keeps most concerns in the fgets function. Please let me know if it is unclear how the new function works, so more comments can be put in there. -Dan lib/libalpm/be_package.c | 12 ++++-- lib/libalpm/be_sync.c | 14 ++++-- lib/libalpm/util.c | 97 +++++++++++++++++++++++++++++++++++---------- lib/libalpm/util.h | 18 ++++++++- 4 files changed, 109 insertions(+), 32 deletions(-) diff --git a/lib/libalpm/be_package.c b/lib/libalpm/be_package.c index 32ec993..4c1bb08 100644 --- a/lib/libalpm/be_package.c +++ b/lib/libalpm/be_package.c @@ -155,17 +155,21 @@ static struct pkg_operations *get_file_pkg_ops() */ static int parse_descfile(struct archive *a, pmpkg_t *newpkg) { - char line[PATH_MAX]; char *ptr = NULL; char *key = NULL; int linenum = 0; + struct archive_read_buffer buf; ALPM_LOG_FUNC; - /* loop until we reach EOF (where archive_fgets will return NULL) */ - while(_alpm_archive_fgets(line, PATH_MAX, a) != NULL) { + memset(&buf, 0, sizeof(buf)); + buf.max_line_size = PATH_MAX; + + /* loop until we reach EOF or other error */ + while(_alpm_archive_fgets(a, &buf) == ARCHIVE_OK) { + char *line = buf.line ? _alpm_strtrim(buf.line) : ""; + linenum++; - _alpm_strtrim(line); if(strlen(line) == 0 || line[0] == '#') { continue; } diff --git a/lib/libalpm/be_sync.c b/lib/libalpm/be_sync.c index 137fc1b..0e688b1 100644 --- a/lib/libalpm/be_sync.c +++ b/lib/libalpm/be_sync.c @@ -219,8 +219,8 @@ static int sync_db_populate(pmdb_t *db) } #define READ_NEXT(s) do { \ - if(_alpm_archive_fgets(s, sizeof(s), archive) == NULL) goto error; \ - _alpm_strtrim(s); \ + if(_alpm_archive_fgets(archive, &buf) != ARCHIVE_OK) goto error; \ + s = buf.line ? _alpm_strtrim(buf.line) : ""; \ } while(0) #define READ_AND_STORE(f) do { \ @@ -238,10 +238,10 @@ static int sync_db_populate(pmdb_t *db) static int sync_db_read(pmdb_t *db, struct archive *archive, struct archive_entry *entry) { - char line[1024]; const char *entryname = NULL; char *filename, *pkgname, *p, *q; pmpkg_t *pkg; + struct archive_read_buffer buf; ALPM_LOG_FUNC; @@ -260,6 +260,9 @@ static int sync_db_read(pmdb_t *db, struct archive *archive, struct archive_entr _alpm_log(PM_LOG_FUNCTION, "loading package data from archive entry %s\n", entryname); + memset(&buf, 0, sizeof(buf)); + buf.max_line_size = PATH_MAX; + /* get package and db file names */ STRDUP(pkgname, entryname, RET_ERR(PM_ERR_MEMORY, -1)); p = pkgname + strlen(pkgname); @@ -279,8 +282,9 @@ static int sync_db_read(pmdb_t *db, struct archive *archive, struct archive_entr if(strcmp(filename, "desc") == 0 || strcmp(filename, "depends") == 0 || strcmp(filename, "deltas") == 0) { - while(_alpm_archive_fgets(line, sizeof(line), archive) != NULL) { - _alpm_strtrim(line); + while(_alpm_archive_fgets(archive, &buf) == ARCHIVE_OK) { + char *line = _alpm_strtrim(buf.line); + if(strcmp(line, "%NAME%") == 0) { READ_NEXT(line); if(strcmp(line, pkg->name) != 0) { diff --git a/lib/libalpm/util.c b/lib/libalpm/util.c index 1291ea0..b007ae6 100644 --- a/lib/libalpm/util.c +++ b/lib/libalpm/util.c @@ -771,33 +771,86 @@ int _alpm_test_md5sum(const char *filepath, const char *md5sum) return(ret); } -char *_alpm_archive_fgets(char *line, size_t size, struct archive *a) +/* Note: does NOT handle sparse files on purpose for speed. */ +int _alpm_archive_fgets(struct archive *a, struct archive_read_buffer *b) { - /* for now, just read one char at a time until we get to a - * '\n' char. we can optimize this later with an internal - * buffer. */ - /* leave room for zero terminator */ - char *last = line + size - 1; - char *i; - - for(i = line; i < last; i++) { - int ret = archive_read_data(a, i, 1); - /* special check for first read- if null, return null, - * this indicates EOF */ - if(i == line && (ret <= 0 || *i == '\0')) { - return(NULL); + char *i = NULL; + int64_t offset; + int done = 0; + + while(1) { + /* have we processed this entire block? */ + if(b->block + b->block_size == b->block_offset) { + if(b->ret == ARCHIVE_EOF) { + /* reached end of archive on the last read, now we are out of data */ + goto cleanup; + } + + /* zero-copy- this is the entire next block of data. */ + b->ret = archive_read_data_block(a, (void*)&b->block, + &b->block_size, &offset); + b->block_offset = b->block; + + /* error or end of archive with no data read, cleanup */ + if(b->ret < ARCHIVE_OK || + (b->block_size == 0 && b->ret == ARCHIVE_EOF)) { + goto cleanup; + } } - /* check if read value was null or newline */ - if(ret <= 0 || *i == '\0' || *i == '\n') { - last = i + 1; - break; + + /* loop through the block looking for EOL characters */ + for(i = b->block_offset; i < b->block + b->block_size; i++) { + /* check if read value was null or newline */ + if(*i == '\0' || *i == '\n') { + done = 1; + break; + } } - } - /* always null terminate the buffer */ - *last = '\0'; + /* allocate our buffer, or ensure our existing one is big enough */ + if(!b->line) { + /* set the initial buffer to the read block_size */ + CALLOC(b->line, b->block_size + 1, sizeof(char), RET_ERR(PM_ERR_MEMORY, -1)); + b->line_size = b->block_size + 1; + b->line_offset = b->line; + } else { + size_t needed = (b->line_offset - b->line) + (i - b->block_offset) + 1; + if(needed > b->max_line_size) { + RET_ERR(PM_ERR_MEMORY, -1); + } + if(needed > b->line_size) { + /* need to realloc + copy data to fit total length */ + char *new; + CALLOC(new, needed, sizeof(char), RET_ERR(PM_ERR_MEMORY, -1)); + memcpy(new, b->line, b->line_size); + b->line_size = needed; + b->line_offset = new + (b->line_offset - b->line); + free(b->line); + b->line = new; + } + } + + if(done) { + size_t len = i - b->block_offset; + memcpy(b->line_offset, b->block_offset, len); + b->line_offset[len] = '\0'; + b->block_offset = ++i; + /* this is the main return point; from here you can read b->line */ + return(ARCHIVE_OK); + } else { + /* we've looked through the whole block but no newline, copy it */ + size_t len = b->block + b->block_size - b->block_offset; + memcpy(b->line_offset, b->block_offset, len); + } + } - return(line); +cleanup: + { + int ret = b->ret; + FREE(b->line); + memset(b, 0, sizeof(b)); + return(ret); + } } int _alpm_splitname(const char *target, pmpkg_t *pkg) diff --git a/lib/libalpm/util.h b/lib/libalpm/util.h index 0b80420..dfdc6a7 100644 --- a/lib/libalpm/util.h +++ b/lib/libalpm/util.h @@ -59,6 +59,22 @@ _alpm_log(PM_LOG_DEBUG, "returning error %d from %s : %s\n", err, __func__, alpm_strerrorlast()); \ return(ret); } while(0) +/** + * Used as a buffer/state holder for _alpm_archive_fgets(). + */ +struct archive_read_buffer { + char *line; + char *line_offset; + size_t line_size; + size_t max_line_size; + + char *block; + char *block_offset; + size_t block_size; + + int ret; +}; + int _alpm_makepath(const char *path); int _alpm_makepath_mode(const char *path, mode_t mode); int _alpm_copyfile(const char *src, const char *dest); @@ -76,7 +92,7 @@ char *_alpm_filecache_find(const char *filename); const char *_alpm_filecache_setup(void); int _alpm_lstat(const char *path, struct stat *buf); int _alpm_test_md5sum(const char *filepath, const char *md5sum); -char *_alpm_archive_fgets(char *line, size_t size, struct archive *a); +int _alpm_archive_fgets(struct archive *a, struct archive_read_buffer *b); int _alpm_splitname(const char *target, pmpkg_t *pkg); unsigned long _alpm_hash_sdbm(const char *str); -- 1.7.3.3
On 15/12/10 16:48, Dan McGee wrote:
Please let me know if it is unclear how the new function works, so more comments can be put in there.
This is a really nice improvement over the old character at a time implementation. I found the comments in the function to clearly describe what is happening. I only have one query and what will forever be known as "the worlds most trivial correction" to make. <snip>
+ /* loop until we reach EOF or other error */ + while(_alpm_archive_fgets(a,&buf) == ARCHIVE_OK) { + char *line = buf.line ? _alpm_strtrim(buf.line) : "";
<snip>
+ if(_alpm_archive_fgets(archive,&buf) != ARCHIVE_OK) goto error; \ + s = buf.line ? _alpm_strtrim(buf.line) : ""; \
<snip>
+ while(_alpm_archive_fgets(archive,&buf) == ARCHIVE_OK) { + char *line = _alpm_strtrim(buf.line);
I am unsure why the ?: NULL check is done with the first two and not the third. I might be missing something here... <snip> Prepare yourself for this insight I am about to impart:
+ /* zero-copy- this is the entire next block of data. */
Space between "zero-copy" and the hyphen. :P Allan
On Wed, Dec 15, 2010 at 7:48 AM, Dan McGee <dan@archlinux.org> wrote:
The old function was written in a time before we relied on it for nearly every operation. Since then, we have switched to the archive backend and now fast parsing is a big deal.
The former function made a per-character call to the libarchive archive_read_data() function, which resulted in some 21 million calls in a typical "load all sync dbs" operation. If we instead do some buffering of our own and read the blocks directly, and then find our newlines from there, we can cut out the multiple layers of overhead and go from archive to parsed data much quicker.
Both users of the former function are switched over to the new signature, made easier by the macros now in place in the sync backend parsing code.
Performance: for a `pacman -Su` (no upgrades available), _alpm_archive_fgets() goes from being 29% of the total time to 12% The time spent on the libarchive function being called dropped from 24% to 6%.
This pushes _alpm_pkg_find back to the title of slowest low-level function.
Signed-off-by: Dan McGee <dan@archlinux.org> ---
I wish this one was a bit easier to comprehend and grok, but the cost of performance here is a bit more complicated fgets function than the very simplistic one we had before.
The new function revolves around the zero-copy, retrieve an entire block functinoality of libarchive. We create a structure to keep all the bookkeeping info we need around, including a buffer of our own where each line ends up and auto-resizes itself as we poll through an archive file entry, and also cleans itself up when the file is EOF. This means malloc/free is not necessary by the caller, which keeps most concerns in the fgets function.
Please let me know if it is unclear how the new function works, so more comments can be put in there.
Since we have just one line, I would not have bothered trying to alloc exactly what we need. I would just have used a static or dynamic buf of max line size, and be done with it. But anyway you already implemented the more complex way and I don't see a problem with that as long as its tested (especially the re-alloc part) :) I could not spot a problem by simply reading the code. Another minor observation : in the !done case, the alloc block does a strict allocation and thus does not take into account that we will very likely have to extend it on the next iteration (if we did not reach the last archive block). Finally can it actually happen with our sync archives which only has very small files that a line is split across two blocks ? In any cases, this is another very welcome improvement.
On Sun, Dec 19, 2010 at 11:32 AM, Xavier Chantry <chantry.xavier@gmail.com> wrote:
On Wed, Dec 15, 2010 at 7:48 AM, Dan McGee <dan@archlinux.org> wrote:
The old function was written in a time before we relied on it for nearly every operation. Since then, we have switched to the archive backend and now fast parsing is a big deal.
The former function made a per-character call to the libarchive archive_read_data() function, which resulted in some 21 million calls in a typical "load all sync dbs" operation. If we instead do some buffering of our own and read the blocks directly, and then find our newlines from there, we can cut out the multiple layers of overhead and go from archive to parsed data much quicker.
Both users of the former function are switched over to the new signature, made easier by the macros now in place in the sync backend parsing code.
Performance: for a `pacman -Su` (no upgrades available), _alpm_archive_fgets() goes from being 29% of the total time to 12% The time spent on the libarchive function being called dropped from 24% to 6%.
This pushes _alpm_pkg_find back to the title of slowest low-level function.
Signed-off-by: Dan McGee <dan@archlinux.org> ---
I wish this one was a bit easier to comprehend and grok, but the cost of performance here is a bit more complicated fgets function than the very simplistic one we had before.
The new function revolves around the zero-copy, retrieve an entire block functinoality of libarchive. We create a structure to keep all the bookkeeping info we need around, including a buffer of our own where each line ends up and auto-resizes itself as we poll through an archive file entry, and also cleans itself up when the file is EOF. This means malloc/free is not necessary by the caller, which keeps most concerns in the fgets function.
Please let me know if it is unclear how the new function works, so more comments can be put in there.
Since we have just one line, I would not have bothered trying to alloc exactly what we need. I would just have used a static or dynamic buf of max line size, and be done with it. But anyway you already implemented the more complex way and I don't see a problem with that as long as its tested (especially the re-alloc part) :)
I'm not sure what you are saying you would do with a dynamic buffer- doesn't it make sense to pick the original size as whatever we get back first, rather than any other arbitrary size that we might have to resize anyway? And I've always hated static buffer sizes; see the PATH_MAX junk for more reason for that.
I could not spot a problem by simply reading the code.
Another minor observation : in the !done case, the alloc block does a strict allocation and thus does not take into account that we will very likely have to extend it on the next iteration (if we did not reach the last archive block).
Finally can it actually happen with our sync archives which only has very small files that a line is split across two blocks ?
It can with the smoke002.py pactest I'll include that just blew a hole through my realloc code; necessary changes below. Pactest created packages usually produce a block size of 65536 or so. I also bumped the max line size to 512KB rather than PATH_MAX (4096 bytes), so now the delayed allocation makes a bit more sense.
In any cases, this is another very welcome improvement.
Thanks! -Dan diff --git a/lib/libalpm/util.c b/lib/libalpm/util.c index 87880de..d34eab5 100644 --- a/lib/libalpm/util.c +++ b/lib/libalpm/util.c @@ -841,6 +842,8 @@ int _alpm_archive_fgets(struct archive *a, struct archive_read_buffer *b) /* we've looked through the whole block but no newline, copy it */ size_t len = b->block + b->block_size - b->block_offset; memcpy(b->line_offset, b->block_offset, len); + b->line_offset += len; + b->block_offset = i; } }
On 22/12/10 04:22, Dan McGee wrote:
On Sun, Dec 19, 2010 at 11:32 AM, Xavier Chantry <chantry.xavier@gmail.com> wrote:
On Wed, Dec 15, 2010 at 7:48 AM, Dan McGee<dan@archlinux.org> wrote:
The old function was written in a time before we relied on it for nearly every operation. Since then, we have switched to the archive backend and now fast parsing is a big deal.
The former function made a per-character call to the libarchive archive_read_data() function, which resulted in some 21 million calls in a typical "load all sync dbs" operation. If we instead do some buffering of our own and read the blocks directly, and then find our newlines from there, we can cut out the multiple layers of overhead and go from archive to parsed data much quicker.
Both users of the former function are switched over to the new signature, made easier by the macros now in place in the sync backend parsing code.
Performance: for a `pacman -Su` (no upgrades available), _alpm_archive_fgets() goes from being 29% of the total time to 12% The time spent on the libarchive function being called dropped from 24% to 6%.
This pushes _alpm_pkg_find back to the title of slowest low-level function.
Signed-off-by: Dan McGee<dan@archlinux.org> ---
I wish this one was a bit easier to comprehend and grok, but the cost of performance here is a bit more complicated fgets function than the very simplistic one we had before.
The new function revolves around the zero-copy, retrieve an entire block functinoality of libarchive. We create a structure to keep all the bookkeeping info we need around, including a buffer of our own where each line ends up and auto-resizes itself as we poll through an archive file entry, and also cleans itself up when the file is EOF. This means malloc/free is not necessary by the caller, which keeps most concerns in the fgets function.
Please let me know if it is unclear how the new function works, so more comments can be put in there.
Since we have just one line, I would not have bothered trying to alloc exactly what we need. I would just have used a static or dynamic buf of max line size, and be done with it. But anyway you already implemented the more complex way and I don't see a problem with that as long as its tested (especially the re-alloc part) :)
I'm not sure what you are saying you would do with a dynamic buffer- doesn't it make sense to pick the original size as whatever we get back first, rather than any other arbitrary size that we might have to resize anyway? And I've always hated static buffer sizes; see the PATH_MAX junk for more reason for that.
I could not spot a problem by simply reading the code.
Another minor observation : in the !done case, the alloc block does a strict allocation and thus does not take into account that we will very likely have to extend it on the next iteration (if we did not reach the last archive block).
Finally can it actually happen with our sync archives which only has very small files that a line is split across two blocks ?
It can with the smoke002.py pactest I'll include that just blew a hole through my realloc code; necessary changes below. Pactest created packages usually produce a block size of 65536 or so.
I also bumped the max line size to 512KB rather than PATH_MAX (4096 bytes), so now the delayed allocation makes a bit more sense.
In any cases, this is another very welcome improvement.
Thanks!
-Dan
diff --git a/lib/libalpm/util.c b/lib/libalpm/util.c index 87880de..d34eab5 100644 --- a/lib/libalpm/util.c +++ b/lib/libalpm/util.c @@ -841,6 +842,8 @@ int _alpm_archive_fgets(struct archive *a, struct archive_read_buffer *b) /* we've looked through the whole block but no newline, copy it */ size_t len = b->block + b->block_size - b->block_offset; memcpy(b->line_offset, b->block_offset, len); + b->line_offset += len; + b->block_offset = i; } }
Would this fix some sort of infinite loop in database reading? I just experienced pacman "freezing" with 100% CPU usage while reading the [extra] database. I figured out that I had not rebuilt my pacman-git package since applying this diff on my working branch and with a rebuild, everything went away. So I figure this was the fix... but I want to make sure that it makes sense to be the fix and I have not just ignored another issue that is no longer obviously exposed. Allan
On Thu, Dec 23, 2010 at 5:11 PM, Allan McRae <allan@archlinux.org> wrote:
On 22/12/10 04:22, Dan McGee wrote:
On Sun, Dec 19, 2010 at 11:32 AM, Xavier Chantry <chantry.xavier@gmail.com> wrote:
On Wed, Dec 15, 2010 at 7:48 AM, Dan McGee<dan@archlinux.org> wrote:
The old function was written in a time before we relied on it for nearly every operation. Since then, we have switched to the archive backend and now fast parsing is a big deal.
The former function made a per-character call to the libarchive archive_read_data() function, which resulted in some 21 million calls in a typical "load all sync dbs" operation. If we instead do some buffering of our own and read the blocks directly, and then find our newlines from there, we can cut out the multiple layers of overhead and go from archive to parsed data much quicker.
Both users of the former function are switched over to the new signature, made easier by the macros now in place in the sync backend parsing code.
Performance: for a `pacman -Su` (no upgrades available), _alpm_archive_fgets() goes from being 29% of the total time to 12% The time spent on the libarchive function being called dropped from 24% to 6%.
This pushes _alpm_pkg_find back to the title of slowest low-level function.
Signed-off-by: Dan McGee<dan@archlinux.org> ---
I wish this one was a bit easier to comprehend and grok, but the cost of performance here is a bit more complicated fgets function than the very simplistic one we had before.
The new function revolves around the zero-copy, retrieve an entire block functinoality of libarchive. We create a structure to keep all the bookkeeping info we need around, including a buffer of our own where each line ends up and auto-resizes itself as we poll through an archive file entry, and also cleans itself up when the file is EOF. This means malloc/free is not necessary by the caller, which keeps most concerns in the fgets function.
Please let me know if it is unclear how the new function works, so more comments can be put in there.
Since we have just one line, I would not have bothered trying to alloc exactly what we need. I would just have used a static or dynamic buf of max line size, and be done with it. But anyway you already implemented the more complex way and I don't see a problem with that as long as its tested (especially the re-alloc part) :)
I'm not sure what you are saying you would do with a dynamic buffer- doesn't it make sense to pick the original size as whatever we get back first, rather than any other arbitrary size that we might have to resize anyway? And I've always hated static buffer sizes; see the PATH_MAX junk for more reason for that.
I could not spot a problem by simply reading the code.
Another minor observation : in the !done case, the alloc block does a strict allocation and thus does not take into account that we will very likely have to extend it on the next iteration (if we did not reach the last archive block).
Finally can it actually happen with our sync archives which only has very small files that a line is split across two blocks ?
It can with the smoke002.py pactest I'll include that just blew a hole through my realloc code; necessary changes below. Pactest created packages usually produce a block size of 65536 or so.
I also bumped the max line size to 512KB rather than PATH_MAX (4096 bytes), so now the delayed allocation makes a bit more sense.
In any cases, this is another very welcome improvement.
Thanks!
-Dan
diff --git a/lib/libalpm/util.c b/lib/libalpm/util.c index 87880de..d34eab5 100644 --- a/lib/libalpm/util.c +++ b/lib/libalpm/util.c @@ -841,6 +842,8 @@ int _alpm_archive_fgets(struct archive *a, struct archive_read_buffer *b) /* we've looked through the whole block but no newline, copy it */ size_t len = b->block + b->block_size - b->block_offset; memcpy(b->line_offset, b->block_offset, len); + b->line_offset += len; + b->block_offset = i; } }
Would this fix some sort of infinite loop in database reading? I just experienced pacman "freezing" with 100% CPU usage while reading the [extra] database. I figured out that I had not rebuilt my pacman-git package since applying this diff on my working branch and with a rebuild, everything went away.
So I figure this was the fix... but I want to make sure that it makes sense to be the fix and I have not just ignored another issue that is no longer obviously exposed.
Probably- try with the new patch on my fgets-perf branch in my repo. -Dan
On 24/12/10 14:18, Dan McGee wrote:
On Thu, Dec 23, 2010 at 5:11 PM, Allan McRae<allan@archlinux.org> wrote:
On 22/12/10 04:22, Dan McGee wrote:
On Sun, Dec 19, 2010 at 11:32 AM, Xavier Chantry <chantry.xavier@gmail.com> wrote:
On Wed, Dec 15, 2010 at 7:48 AM, Dan McGee<dan@archlinux.org> wrote:
The old function was written in a time before we relied on it for nearly every operation. Since then, we have switched to the archive backend and now fast parsing is a big deal.
The former function made a per-character call to the libarchive archive_read_data() function, which resulted in some 21 million calls in a typical "load all sync dbs" operation. If we instead do some buffering of our own and read the blocks directly, and then find our newlines from there, we can cut out the multiple layers of overhead and go from archive to parsed data much quicker.
Both users of the former function are switched over to the new signature, made easier by the macros now in place in the sync backend parsing code.
Performance: for a `pacman -Su` (no upgrades available), _alpm_archive_fgets() goes from being 29% of the total time to 12% The time spent on the libarchive function being called dropped from 24% to 6%.
This pushes _alpm_pkg_find back to the title of slowest low-level function.
Signed-off-by: Dan McGee<dan@archlinux.org> ---
I wish this one was a bit easier to comprehend and grok, but the cost of performance here is a bit more complicated fgets function than the very simplistic one we had before.
The new function revolves around the zero-copy, retrieve an entire block functinoality of libarchive. We create a structure to keep all the bookkeeping info we need around, including a buffer of our own where each line ends up and auto-resizes itself as we poll through an archive file entry, and also cleans itself up when the file is EOF. This means malloc/free is not necessary by the caller, which keeps most concerns in the fgets function.
Please let me know if it is unclear how the new function works, so more comments can be put in there.
Since we have just one line, I would not have bothered trying to alloc exactly what we need. I would just have used a static or dynamic buf of max line size, and be done with it. But anyway you already implemented the more complex way and I don't see a problem with that as long as its tested (especially the re-alloc part) :)
I'm not sure what you are saying you would do with a dynamic buffer- doesn't it make sense to pick the original size as whatever we get back first, rather than any other arbitrary size that we might have to resize anyway? And I've always hated static buffer sizes; see the PATH_MAX junk for more reason for that.
I could not spot a problem by simply reading the code.
Another minor observation : in the !done case, the alloc block does a strict allocation and thus does not take into account that we will very likely have to extend it on the next iteration (if we did not reach the last archive block).
Finally can it actually happen with our sync archives which only has very small files that a line is split across two blocks ?
It can with the smoke002.py pactest I'll include that just blew a hole through my realloc code; necessary changes below. Pactest created packages usually produce a block size of 65536 or so.
I also bumped the max line size to 512KB rather than PATH_MAX (4096 bytes), so now the delayed allocation makes a bit more sense.
In any cases, this is another very welcome improvement.
Thanks!
-Dan
diff --git a/lib/libalpm/util.c b/lib/libalpm/util.c index 87880de..d34eab5 100644 --- a/lib/libalpm/util.c +++ b/lib/libalpm/util.c @@ -841,6 +842,8 @@ int _alpm_archive_fgets(struct archive *a, struct archive_read_buffer *b) /* we've looked through the whole block but no newline, copy it */ size_t len = b->block + b->block_size - b->block_offset; memcpy(b->line_offset, b->block_offset, len); + b->line_offset += len; + b->block_offset = i; } }
Would this fix some sort of infinite loop in database reading? I just experienced pacman "freezing" with 100% CPU usage while reading the [extra] database. I figured out that I had not rebuilt my pacman-git package since applying this diff on my working branch and with a rebuild, everything went away.
So I figure this was the fix... but I want to make sure that it makes sense to be the fix and I have not just ignored another issue that is no longer obviously exposed.
Probably- try with the new patch on my fgets-perf branch in my repo.
I should have been clearer. I do not see the issue with the newest version of your patch. This was mainly me confirming to myself that the changes made to that patch fixed the issue, which on further inspection appears to be true. So all is good. Allan
participants (4)
-
Allan McRae
-
Dan McGee
-
Dan McGee
-
Xavier Chantry