[pacman-dev] Bug in libalpm: sizeof(off_t)

Dave Reisner d at falconindy.com
Mon Nov 25 11:20:17 EST 2013


On Mon, Nov 25, 2013 at 10:54:05AM -0500, Jeremy Heiner wrote:
> On Mon, Nov 25, 2013 at 10:27 AM, Dave Reisner <d at falconindy.com> wrote:
> > On Mon, Nov 25, 2013 at 09:30:01AM -0500, Jeremy Heiner wrote:
> >> On Mon, Nov 25, 2013 at 9:12 AM, Allan McRae <allan at archlinux.org> wrote:
> >> > From searching around, we are definitely not the first people to run
> >> > into this issue, but seem to be the only ones who are trying to code around.
> >> >
> >> > Allan
> >>
> >> I think the reason is that most libraries that face this issue have
> >> more exposure to it than libalpm does. For example GPGME passes file
> >> descriptors across its API, and that is a much bigger can of worms.
> >> It's not surprising that they choose to avoid that mess. But the
> >> libalpm API only passes off_t across. All the file stuff is completely
> >> encapsulated, so most of the largefile problems are simply not
> >> problems for libalpm.
> >> Jeremy
> >
> > A cursory examination of alpm.h shows that this is definitely not the
> > case -- it isn't only the "file stuff", and even that API isn't
> > abstracted away since we publicly expose off_t in some of our structs.
> > And, we have callbacks which pass off_t as well. At best, you read a
> > corrupt value. At worst, you crash after wrongly calculating offsets of
> > other struct members.
> >
> > This isn't a new problem for alpm or anyone else dealing with large
> > files. It's simply a trait of the target system which has to be adhered
> > to.
> >
> > I propose we do nothing.
> >
> > d
> 
> Hi, Dave. Thanks for your reply, but I am a bit confused.
> I said that the only exposure the libalpm API has to the largefile
> problems is that it passes off_t across. You seem to be saying there
> is greater exposure than that, but all the examples you pointed to are
> where off_t is being passed across the API. I did look carefully
> through the API for any other largefile issues and found zero. Of
> course I am only human. So if, as you said, that "is definitely not
> the case", then could you please point me to where the exposure (other
> than off_t) is in the libalpm API?
> Jeremy
> 

Your confusion confuses me, since you seem to point out the problem and
then dismiss it for reasons unknown. _FILE_OFFSET_BITS is explicitly for
determining the size of off_t and the usage of various *64 functions.
I'm not sure why you're expecting other cases that aren't directly
related to the usage of off_t and passing it across API calls.
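
As a rough illustration (not code from alpm), the macro has to be
defined before the first system header is included, after which the
width of off_t can be inspected. The helper name off_t_bits is made up
for this sketch, and the 64-bit result assumes a libc such as glibc
that honours _FILE_OFFSET_BITS:

```c
/* Must come before any system header, or it has no effect. */
#define _FILE_OFFSET_BITS 64

#include <assert.h>     /* only needed for the check in the note below */
#include <limits.h>     /* CHAR_BIT */
#include <sys/types.h>  /* off_t */

/* Hypothetical helper: width of off_t, in bits, as seen by this
 * translation unit. */
unsigned off_t_bits(void)
{
    return (unsigned)(sizeof(off_t) * CHAR_BIT);
}
```

With the define in place, assert(off_t_bits() == 64) should hold on
i686 as well as x86_64. Without the define, an i686 glibc build would
be expected to report 32 from the same helper.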

If the library and the application have different definitions of how
large an off_t is, then you run into potential corruption and crashes.

Consider the following struct:

  struct alpm_file_t {
    char *name;
    off_t size;
    mode_t mode;
  };

When compiling for i686, this struct changes its size (and therefore
changes ABI) based on the value of _FILE_OFFSET_BITS:

32: off_t is 4 bytes and the struct is 12 bytes
64: off_t is 8 bytes and the struct is 16 bytes

This isn't relevant for x86_64 where off_t is always 64 bits.
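
To make the arithmetic concrete, here is a sketch that models the two
i686 layouts with fixed-width fields, so it gives the same numbers on
any host. The struct, field, and function names are invented for the
illustration and are not part of alpm.h. The 64-bit off_t is written as
two 32-bit halves because the i686 ABI only aligns it to 4 bytes inside
a struct:

```c
#include <assert.h>   /* static_assert */
#include <stddef.h>   /* offsetof, size_t */
#include <stdint.h>   /* uint32_t */

/* Model of i686 with _FILE_OFFSET_BITS=32. */
struct file_off32 {
    uint32_t name;     /* stands in for the 4-byte char * */
    uint32_t size;     /* stands in for the 4-byte off_t */
    uint32_t mode;     /* stands in for the 4-byte mode_t */
};

/* Model of i686 with _FILE_OFFSET_BITS=64: same members, 8-byte off_t. */
struct file_off64 {
    uint32_t name;
    uint32_t size_lo;  /* low half of the 8-byte off_t */
    uint32_t size_hi;  /* high half of the 8-byte off_t */
    uint32_t mode;
};

static_assert(sizeof(struct file_off32) == 12, "12 bytes without LFS");
static_assert(sizeof(struct file_off64) == 16, "16 bytes with LFS");

/* Offset of 'mode' in each model: it moves when off_t grows. */
size_t mode_offset_off32(void) { return offsetof(struct file_off32, mode); }
size_t mode_offset_off64(void) { return offsetof(struct file_off64, mode); }
```

In this model mode sits at offset 8 in one layout and at offset 12 in
the other, which is the ABI change described above.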

If an i686 system compiles libalpm without large file support and then
compiles an application *with* large file support that links to alpm,
the application reads a full 64 bits wherever the library stored a
32-bit off_t (which is wrong), so half of the value comes from whatever
happens to follow it in memory. The inverse of this (where ALPM has LFS
and the application does not) is equally painful, as the application
reads only half of each off_t and still miscalculates the offsets of
the struct members that follow it.
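
The first case can be simulated on any host. The sketch below is only
a model under stated assumptions: lib_file and app_file are made-up
stand-ins for the library's and the application's view of the same
struct on i686, built from 32-bit fields, and the two helper functions
are hypothetical. The "library" fills an array of two entries and the
"application" reinterprets the start of that array using its own,
larger layout:

```c
#include <assert.h>   /* static_assert */
#include <stdint.h>   /* uint32_t, uint64_t */
#include <string.h>   /* memcpy */

/* Library view: built without LFS, so off_t is 4 bytes. */
struct lib_file { uint32_t name; uint32_t size; uint32_t mode; };

/* Application view: built with LFS, so off_t is 8 bytes. */
struct app_file { uint32_t name; uint32_t size_lo; uint32_t size_hi;
                  uint32_t mode; };

/* The copies below read 16 bytes out of a 24-byte array. */
static_assert(sizeof(struct app_file) <= 2 * sizeof(struct lib_file),
              "application view must fit inside the library's array");

/* The size the application believes files[0] has. */
uint64_t app_view_of_size(const struct lib_file files[2])
{
    struct app_file seen;
    memcpy(&seen, files, sizeof seen);
    return ((uint64_t)seen.size_hi << 32) | seen.size_lo;
}

/* The mode the application believes files[0] has. */
uint32_t app_view_of_mode(const struct lib_file files[2])
{
    struct app_file seen;
    memcpy(&seen, files, sizeof seen);
    return seen.mode;
}
```

If the library stores size 1000 and mode 0100644 in files[0], the
application's idea of the size has the mode bits in its upper half, and
its idea of the mode is really the name field of files[1].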

d

