I haven't tried this yet, but you beat me to my own suggestion, well done. :) On Thu, Mar 27, 2008 at 6:13 AM, Chantry Xavier <shiningxc@gmail.com> wrote:
Signed-off-by: Chantry Xavier <shiningxc@gmail.com> ---
Well, I saw that comment, so I gave it a try, but I am not sure if I did it correctly. It doesn't seem too bad but maybe there are still cleaner ways. Also I wasn't sure which buf size to choose, and how to comment the code.
Xav
lib/libalpm/util.c | 30 ++++++++++++++++++++++++------ 1 files changed, 24 insertions(+), 6 deletions(-)
diff --git a/lib/libalpm/util.c b/lib/libalpm/util.c index 92a0a26..0b7750e 100644 --- a/lib/libalpm/util.c +++ b/lib/libalpm/util.c @@ -675,22 +675,40 @@ int _alpm_test_md5sum(const char *filepath, const char *md5sum)
char *_alpm_archive_fgets(char *line, size_t size, struct archive *a) { - /* for now, just read one char at a time until we get to a - * '\n' char. we can optimize this later with an internal - * buffer. */ + /* internal buffer */ + static char buf[BUFSIZ]; + /* index of the next char to read in the buffer */ + static int index = 0; + /* total number of char in the buffer */ + static int nread = 0; Can't index just be a char* pointer instead? That should make things a bit simpler below. if(index == &buf), for example, and line[loop] = index++;
/* leave room for zero terminator */ int want = size - 1; int loop;
for(loop = 0; loop < want; loop++) { - int ret = archive_read_data(a, line + loop, 1); + if(index == 0) { + /* fill the buffer */ + nread = archive_read_data(a, buf, BUFSIZ); + } + if(index < nread) { + /* get the next char from the buffer */ + line[loop] = buf[index++]; + } else { + /* nothing left to read */ + line[loop] = '\0'; + index = 0; + } + if(index == BUFSIZ) { + /* end of the buffer, so reset the index to refill it */ + index = 0; + }
Unless I'm mistaken, this should correctly handle the buffer running out in the middle of a fgets call. The best way to test your code here might be to set the buffer to something very small such as 4, and see what happens. This will also ensure you never overrun your buffer, etc.
/* special check for first read- if null, return null, * this indicates EOF */ - if(loop == 0 && (ret <= 0 || line[loop] == '\0')) { + if(loop == 0 && line[loop] == '\0') { return(NULL); } /* check if read value was null or newline */ - if(ret <= 0 || line[loop] == '\0' || line[loop] == '\n') { + if(line[loop] == '\0' || line[loop] == '\n') { want = loop + 1; break; } -- 1.5.4.4
The big issues: 1. Once you use archive_fgets, you can't do a manual archive_read* call since data is buffered here. This should get noted in a comment. 2. I can't call this function with one archive followed by a call with another archive. This would obviously cause issues, and we should probably find a way to address this. We might need a helper struct that the caller has to pass in containing anything static so this function is thread-safe, etc. 3. This can be optimized even more by removing the 1 char at a time loop and making good use of memchr and memcpy. This is what I originally planned to do when I got around to making this a buffered implementation. Good work though, glad to see someone else is interested in making improvements here, and this is a large component of the read directly from db.tar.gz files, although you can see from my other patch it can already be put to use elsewhere in the code. -Dan