[pacman-dev] Bug in libalpm: sizeof(off_t)

Jeremy Heiner scalaprotractor at gmail.com
Mon Nov 25 11:56:54 EST 2013


On Mon, Nov 25, 2013 at 11:20 AM, Dave Reisner <d at falconindy.com> wrote:
> On Mon, Nov 25, 2013 at 10:54:05AM -0500, Jeremy Heiner wrote:
>> On Mon, Nov 25, 2013 at 10:27 AM, Dave Reisner <d at falconindy.com> wrote:
>> > On Mon, Nov 25, 2013 at 09:30:01AM -0500, Jeremy Heiner wrote:
>> >> On Mon, Nov 25, 2013 at 9:12 AM, Allan McRae <allan at archlinux.org> wrote:
>> >> > From searching around, we are definitely not the first people to run
>> >> > into this issue, but seem to be the only ones who are trying to code around.
>> >> >
>> >> > Allan
>> >>
>> >> I think the reason is that most libraries that face this issue have
>> >> more exposure to it than libalpm does. For example GPGME passes file
>> >> descriptors across its API, and that is a much bigger can of worms.
>> >> It's not surprising that they choose to avoid that mess. But the
>> >> libalpm API only passes off_t across. All the file stuff is completely
>> >> encapsulated, so most of the largefile problems are simply not
>> >> problems for libalpm.
>> >> Jeremy
>> >
>> > A cursory examination of alpm.h shows that this is definitely not the
>> > case -- it isn't only the "file stuff", and even that API isn't
>> > abstracted away since we publicly expose off_t in some of our structs.
>> > And, we have callbacks which pass off_t as well. At best, you read a
>> > corrupt value. At worst, you crash after wrongly calculating offsets of
>> > other struct members.
>> >
>> > This isn't a new problem for alpm or anyone else dealing with large
>> > files. It's simply a trait of the target system which has to be adhered
>> > to.
>> >
>> > I propose we do nothing.
>> >
>> > d
>>
>> Hi, Dave. Thanks for your reply, but I am a bit confused.
>> I said that the only exposure the libalpm API has to the largefile
>> problems is that it passes off_t across. You seem to be saying there
>> is greater exposure than that, but all the examples you pointed to are
>> where off_t is being passed across the API. I did look carefully
>> through the API for any other largefile issues and found zero. Of
>> course I am only human. So if, as you said, that "is definitely not
>> the case", then could you please point me to where the exposure (other
>> than off_t) is in the libalpm API?
>> Jeremy
>>
>
> Your confusion confuses me, since you seem to point out the problem and
> then dismiss it for reasons unknown. _FILE_OFFSET_BITS is explicitly for
> determining the size of off_t and the usage of various *64 functions.
> I'm not sure why you're expecting other cases that aren't directly
> related to the usage of off_t and passing it across API calls.
>
> If the library and the application have different definitions of how
> large an off_t is, then you run into potential corruption and crashes.
>
> Consider the following struct:
>
>   struct alpm_file_t {
>     char *name;
>     off_t size;
>     mode_t mode;
>   };
>
> When compiling for i686, this struct changes its size (and therefore
> changes ABI) based on the value of _FILE_OFFSET_BITS:
>
> 32: the struct is 12 bytes
> 64: the struct is 16 bytes
>
> This isn't relevant for x86_64 where off_t is always 64 bits.
>
> If an i686 system compiles libalpm without large file support and then
> compiles an application *with* large file support that links to alpm,
> you read a full 64 bits when encountering an off_t (which is wrong). The
> inverse of this (where ALPM has LFS and the application does not) is
> equally painful as it can still result in miscalculation of the offsets
> of struct members.
>
> d
>

Yes, what you have described is the extent of the exposure to the
problem that the libalpm API has. Allan was pointing to other
libraries which have greater exposure (e.g. the file descriptors
passed across the GPGME API) and suggesting that their solution be
adopted universally. My point was that the two levels of exposure
differ in kind. The lesser one admits a solution, while the
greater one seems intractable. The solution I proposed addresses the
exposure that libalpm has, but would be inadequate for GPGME. You seem
to be suggesting that my proposal for libalpm be rejected because it
fails to solve GPGME's problem. That is the source of my confusion.
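
For concreteness, here is a minimal sketch of the layout mismatch in the
struct example quoted above. It is not code from libalpm. The two struct
names and both helper functions are hypothetical, and fixed-width
integers stand in for off_t (and uint32_t for mode_t, an assumption that
holds on Linux) so that both layouts can be compared in a single build
regardless of _FILE_OFFSET_BITS.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in: the struct as seen by a side built with a
 * 32-bit off_t (i686 without large file support). */
struct file_off32 {
    char *name;
    int32_t size;   /* stands in for a 32-bit off_t */
    uint32_t mode;  /* stands in for mode_t (assumed 32 bits) */
};

/* Hypothetical stand-in: the same struct as seen by a side built with
 * a 64-bit off_t (_FILE_OFFSET_BITS=64, or any x86_64 build). */
struct file_off64 {
    char *name;
    int64_t size;   /* stands in for a 64-bit off_t */
    uint32_t mode;
};

/* Returns nonzero only if both views agree on the total size and on
 * where the member following the offset field lives. */
int layouts_compatible(void)
{
    return sizeof(struct file_off32) == sizeof(struct file_off64)
        && offsetof(struct file_off32, mode)
           == offsetof(struct file_off64, mode);
}

/* Prints both layouts so the difference can be inspected on the
 * target platform; exact numbers depend on pointer size and ABI. */
void print_layouts(void)
{
    printf("32-bit off_t: sizeof=%zu offsetof(mode)=%zu\n",
           sizeof(struct file_off32),
           offsetof(struct file_off32, mode));
    printf("64-bit off_t: sizeof=%zu offsetof(mode)=%zu\n",
           sizeof(struct file_off64),
           offsetof(struct file_off64, mode));
}
```

Calling print_layouts() from a small main() shows the two views
disagreeing on where mode lives, which is exactly the exposure under
discussion: one side reads or writes mode at an offset the other side
does not expect.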
Jeremy

