[pacman-dev] Bug in libalpm: sizeof(off_t)

Tue Nov 26 02:14:55 EST 2013

On Mon, Nov 25, 2013 at 1:28 PM, Dave Reisner <d at falconindy.com> wrote:
> On Mon, Nov 25, 2013 at 11:56:54AM -0500, Jeremy Heiner wrote:
>> On Mon, Nov 25, 2013 at 11:20 AM, Dave Reisner <d at falconindy.com> wrote:
>> > On Mon, Nov 25, 2013 at 10:54:05AM -0500, Jeremy Heiner wrote:
>> >> On Mon, Nov 25, 2013 at 10:27 AM, Dave Reisner <d at falconindy.com> wrote:
>> >> > On Mon, Nov 25, 2013 at 09:30:01AM -0500, Jeremy Heiner wrote:
>> >> >> On Mon, Nov 25, 2013 at 9:12 AM, Allan McRae <allan at archlinux.org> wrote:
>> >> >> > From searching around, we are definitely not the first people to run
>> >> >> > into this issue, but seem to be the only ones who are trying to code around it.
>> >> >> >
>> >> >> > Allan
>> >> >>
>> >> >> I think the reason is that most libraries that face this issue have
>> >> >> more exposure to it than libalpm does. For example GPGME passes file
>> >> >> descriptors across its API, and that is a much bigger can of worms.
>> >> >> It's not surprising that they choose to avoid that mess. But the
>> >> >> libalpm API only passes off_t across. All the file stuff is completely
>> >> >> encapsulated, so most of the largefile problems are simply not
>> >> >> problems for libalpm.
>> >> >> Jeremy
>> >> >
>> >> > A cursory examination of alpm.h shows that this is definitely not the
>> >> > case -- it isn't only the "file stuff", and even that API isn't
>> >> > abstracted away since we publicly expose off_t in some of our structs.
>> >> > And, we have callbacks which pass off_t as well. At best, you read a
>> >> > corrupt value. At worst, you crash after wrongly calculating offsets of
>> >> > other struct members.
>> >> >
>> >> > This isn't a new problem for alpm or anyone else dealing with large
>> >> > files. It's simply a trait of the target system which has to be adhered
>> >> > to.
>> >> >
>> >> > I propose we do nothing.
>> >> >
>> >> > d
>> >>
>> >> Hi, Dave. Thanks for your reply, but I am a bit confused.
>> >> I said that the only exposure the libalpm API has to the largefile
>> >> problems is that it passes off_t across. You seem to be saying there
>> >> is greater exposure than that, but all the examples you pointed to are
>> >> where off_t is being passed across the API. I did look carefully
>> >> through the API for any other largefile issues and found zero. Of
>> >> course I am only human. So if, as you said, that "is definitely not
>> >> the case", then could you please point me to where the exposure (other
>> >> than off_t) is in the libalpm API?
>> >> Jeremy
>> >>
>> >
>> > Your confusion confuses me, since you seem to point out the problem and
>> > then dismiss it for reasons unknown. _FILE_OFFSET_BITS is explicitly for
>> > determining the size of off_t and the usage of various *64 functions.
>> > I'm not sure why you're expecting other cases that aren't directly
>> > related to the usage of off_t and passing it across API calls.
>> >
>> > If the library and the application have different definitions of how
>> > large an off_t is, then you run into potential corruption and crashes.
>> >
>> > Consider the following struct:
>> >
>> >   struct alpm_file_t {
>> >     char *name;
>> >     off_t size;
>> >     mode_t mode;
>> >   };
>> >
>> > When compiling for i686, this struct changes its size (and therefore
>> > changes ABI) based on the value of _FILE_OFFSET_BITS:
>> >
>> > 32: the struct is 12 bytes
>> > 64: the struct is 16 bytes
>> >
>> > This isn't relevant for x86_64 where off_t is always 64 bits.
>> >
>> > If an i686 system compiles libalpm without large file support and then
>> > compiles an application *with* large file support that links to alpm,
>> > you read a full 64 bits when encountering an off_t (which is wrong). The
>> > inverse of this (where ALPM has LFS and the application does not) is
>> > equally painful as it can still result in miscalculation of the offsets
>> > of struct members.
>> >
>> > d
>> >
>>
>> Yes, what you have described is the extent of the exposure to the
>> problem that the libalpm API has. Allan was pointing to other
>> libraries which have greater exposure (e.g. the file descriptors
>> passed across the GPGME API) and suggesting that their solution be
>> adopted universally. My point was that the two levels of exposure
>> are not equivalent. The lower level admits a solution, while the
>> greater level seems intractable. The solution I proposed addresses the
>> exposure that libalpm has, but would be inadequate for GPGME. You seem
>> to be suggesting that my proposal for libalpm be rejected because it
>> fails to solve GPGME's problem. That is the source of my confusion.
>> Jeremy
>>
>
> I think Allan was simply pointing out an instance where a library
> documents the effect of LFS on the library. The extent of the exposure
> doesn't seem worth quantifying. It's an ABI mismatch to compile some
> things with LFS and some without. IMO, this is a target machine concern,
> not one of every single library to perform due diligence. Do you also
> want to try and code around various compiler flags which can affect ABI
> (see gcc(1) for flags like -mstack-offset, or -m{soft,hard}-float)?
>
> If you'd still prefer to solve this, I'd suggest you take a look at
> /usr/include/curl/curl{build-*,rules}.h which solve similar mismatches
> at compile time.
>
> I'll also mention that rtorrent used to have a runtime check for this to
> ensure that sizeof(off_t) == 8, but this was mostly because the software
> hadn't been thoroughly tested with sizeof(off_t) == 4. So, do we know
> that ALPM behaves sanely when LFS isn't enabled?
>
> Really think this should just be solved by documenting...
>
> d
>

Yes, Allan cited GPGME's "just document it" approach as typical. Which
it is. I'm only pointing out that libalpm has a superior API when it
comes to largefile issues, and thus has options that are not available
to those more typical libraries. How is it not worth distinguishing
between having some options available versus having none?

I agree with you that libraries should _not_ bear the responsibility
to check every single potential ABI mismatch. But it is not an
all-or-nothing situation. There is no slippery slope from doing simple
checks for common stumbling blocks to checking every single potential
ABI mismatch. The impossibility of checking absolutely everything is
not a good reason to omit checks that are easy and appropriate.
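
To make "easy and appropriate" concrete, here is a minimal sketch of one
such check. It is only an illustration, not the patch under discussion:
the names hypothetical_lib_sizeof_off_t and hypothetical_check_off_t are
made up and do not exist in libalpm. The idea is that the library reports
the off_t width it was built with, and the caller compares that against
its own sizeof(off_t), much like the rtorrent check Dave mentioned.

```c
#include <stddef.h>
#include <sys/types.h>

/* Library side (hypothetical, not part of the real libalpm API):
 * report the off_t width this translation unit was compiled with. */
size_t hypothetical_lib_sizeof_off_t(void)
{
  return sizeof(off_t);
}

/* Application side: returns 0 when the caller's off_t width matches the
 * library's, -1 otherwise. The caller passes its own sizeof(off_t), which
 * is evaluated in the application's translation unit and so reflects the
 * application's _FILE_OFFSET_BITS setting. */
int hypothetical_check_off_t(size_t caller_size)
{
  return caller_size == hypothetical_lib_sizeof_off_t() ? 0 : -1;
}
```

An application would call hypothetical_check_off_t(sizeof(off_t)) once at
startup and refuse to continue on -1. In a real build the two functions
would live in separately compiled objects; they share a file here only to
keep the sketch short.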

The curl approach is clearly overkill for the libalpm problem. Again
you seem to be criticizing my proposal for libalpm because it fails to
solve some other library's problem. And that still confuses me.

The behavior of libalpm without LFS is an interesting question, but
not at all relevant to what I've proposed. I'm fairly certain that the
committee that gave us LFS seventeen(?) years ago would say "it'll
work fine". Then again, they did give us LFS. ;)
Jeremy
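
For anyone who wants to see the layout shift Dave described without
building for i686 twice, here is a rough sketch. It assumes int32_t and
int64_t as stand-ins for the two possible widths of off_t, and the struct
and function names are invented for the example. The absolute sizes depend
on the target ABI, so the sketch only compares where 'mode' ends up.

```c
#include <stddef.h>
#include <stdint.h>
#include <sys/types.h>

/* Layout as seen by a build where off_t is 32 bits (stand-in: int32_t). */
struct file_off32 {
  char *name;
  int32_t size;
  mode_t mode;
};

/* Layout as seen by a build where off_t is 64 bits (stand-in: int64_t). */
struct file_off64 {
  char *name;
  int64_t size;
  mode_t mode;
};

/* Offset of 'mode' under each layout. If the library and the application
 * disagree on which struct they are looking at, they disagree on this. */
size_t mode_offset_off32(void) { return offsetof(struct file_off32, mode); }
size_t mode_offset_off64(void) { return offsetof(struct file_off64, mode); }
```

Both layouts agree on where 'size' starts, so the mismatch only shows up
in how many bytes are read for 'size' and in every member after it.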