[pacman-dev] Version comparison algorithm.

Richard Dodd richard.o.dodd at gmail.com
Wed Mar 20 11:19:30 UTC 2019


I'm just trying to get a mathematically sound implementation of all the
properties (traits in rust parlance) that the version exhibits. If there is
an equivalence relation and a hash function as above, then I can store
versions in a hash set/map. If there is a total ordering, then I can store
versions in a binary tree set/map. I'm making a library, so I want to allow
as much as possible without allowing anything unsound. Rust's type system
allows me to make these promises in code, rather than just documenting them.

You can see my implementation at
https://github.com/derekdreery/alpm-experimental/blob/master/src/version.rs,
if you're interested.


On Sat, Mar 16, 2019 at 2:02 PM Richard Dodd <richard.o.dodd at gmail.com>
wrote:

> I'm writing to ask about the version comparison function `rpmvercmp` in
> `libalpm/version.cL83`. The algorithm is complex and I'm trying to
> understand its behavior. I want to hash using the package name and version
> as a key, and so I need a hash function where `i1 == i2 => hash(i1) ==
> hash(i2)` according to the version comparison operation. I'll describe how
> the algorithm behaves and then ask my questions.
>
> The algorithm works on a byte string and uses ascii comparison rules (no
> unicode).
>
>  - First, split the input up into blocks of *alpha*, *digit* or
> *non-alphanum*.
>  - For each pair of blocks
>     - If the types are different, then *non-alphanum* is newer than
> *numeric*, which is newer than *alpha*
>     - If the types are the same, then the rules are
>       - For *non-alphanum*, compare lengths, longer is newer, equal
> lengths are equal segments (so *--* and *::* are the same)
>       - For *alpha* just do a lexicographic comparison (so *b* is newer
> than *a* etc.)
>       - For *numeric*, do a numeric comparison. (this can be done by
> skipping leading zeros, then comparing length, then comparing
> lexicographically, to avoid overflow on integer conversion)
>   - If one input is longer than the other, and all sections so far have
> been equal, then if the next section of the longer is *alpha*, it is older,
> and if it is *numeric* it is newer. (so "1a" is older than "1", "a1" is
> newer than "a").
>   - If the inputs have the same number of sections that are all equal,
> then they are equal.
>
> Questions:
>
>  1. Is the algorithm correctly described here.
>  2. This should mean that if I hash length for *non-numeric*, the string
> with stripped zeros for *numeric*, and just the string for *alpha*, the has
> law should be upheld. Is this correct?
>
> Thanks
> Rich
>
>
>


More information about the pacman-dev mailing list