[aur-general] Standardized CPAN Package Versions
Hi,

I have CC'd this to aur-general as this concerns all CPAN packagers on Arch.

CPAN package versions are a mess on Arch Linux. It seems that many if not most CPAN packagers on Arch are unaware of how CPAN deals with versions and thus they do not correctly translate the CPAN version. Most of the time a naive translation is used instead, and this prevents Pacman from working correctly.

To give an example, CPAN considers "1.15" to be a later version than "1.23.0", whereas Pacman will consider the latter version the newer version. This is because "1.15" is short for "1.150" on CPAN, whereas "1.23.0" is short for "1.023.0". This can be confirmed using the "version" module[1].

I have written a simple Perl script[2] that addresses this issue.* It accepts CPAN versions as command-line arguments and prints out standardized Pacman package versions. The package versions enable Pacman to correctly compare CPAN packages, e.g. when resolving minimal dependencies.

I propose that the script be packaged and made an official tool for Perl package developers.** It should also be included on the Perl Package Guidelines wiki page[3].

I hope that this will be seriously considered as it addresses a real issue. Tools that generate CPAN packages for Pacman (e.g. Bauerbill) cannot work properly when there is no consistency in how Arch packages are versioned.

Please note as well that this is completely separate from my previous (and continued) contention regarding the "provides" array of CPAN packages.

Regards,
Xyne

[1]: http://search.cpan.org/~jpeacock/version-0.88/
[2]: http://xyne.archlinux.ca/scripts/pacman/ver_cpan2pacman
[3]: https://wiki.archlinux.org/index.php/Perl_Package_Guidelines

* The script only depends on Perl. The name and license are preliminary. I will gladly change both to something more suitable.
** I would be willing to maintain it, but an official tool should probably be included in [core] or [extra], not [community].
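The ordering described in the message above can be illustrated with a short Python sketch. It is not the Perl script referenced in the thread and it does not call the "version" module. The function name cpan_tuple is hypothetical, and the rule that a decimal fraction is read in groups of three digits is an assumption chosen to match the stated example ("1.15" read as "1.150", "1.23.0" read as "1.023.0"). Underscore (alpha) versions are left out.

```python
def cpan_tuple(ver: str) -> tuple:
    """Map a CPAN-style version string to a tuple of integers for comparison.

    Assumption of this sketch: a leading "v" or two or more dots means
    dotted-decimal; anything else is a plain decimal version.
    """
    if ver.startswith("v") or ver.count(".") >= 2:
        # Dotted-decimal: every dot-separated field is an integer.
        body = ver[1:] if ver.startswith("v") else ver
        parts = [int(p) for p in body.split(".")]
    else:
        # Decimal: right-pad the fraction with zeros to a multiple of
        # three digits, then read it in groups of three.
        whole, _, frac = ver.partition(".")
        frac = frac.ljust(-(-len(frac) // 3) * 3, "0")
        parts = [int(whole)] + [int(frac[i:i + 3]) for i in range(0, len(frac), 3)]
    # Drop trailing zero fields so that equal versions give equal tuples.
    while len(parts) > 1 and parts[-1] == 0:
        parts.pop()
    return tuple(parts)


print(cpan_tuple("1.15"))                           # (1, 150)
print(cpan_tuple("1.23.0"))                         # (1, 23)
print(cpan_tuple("1.15") > cpan_tuple("1.23.0"))    # True
```

Under these assumptions "1.15" sorts after "1.23.0", which is the opposite of what a field-by-field reading of the raw strings would give.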
On Sun, Feb 6, 2011 at 5:16 PM, Xyne <xyne@archlinux.ca> wrote:
I have CC'd this to aur-general as this concerns all CPAN packagers on Arch.
CPAN package versions are a mess on Arch Linux. It seems that many if not most CPAN packagers on Arch are unaware of how CPAN deals with versions and thus they do not correctly translate the CPAN version. Most of the time a naive translation is used instead, and this prevents Pacman from working correctly.
To give an example, CPAN considers "1.15" to be a later version than "1.23.0", whereas Pacman will consider the latter version the newer version. This is because "1.15" is short for "1.150" on CPAN, whereas "1.23.0" is short for "1.023.0". This can be confirmed using the "version" module[1].
I have written a simple Perl script[2] that addresses this issue.* It accepts CPAN versions as command-line arguments and prints out standardized Pacman package versions. The package versions enable Pacman to correctly compare CPAN packages, e.g. when resolving minimal dependencies.
I propose that the script be packaged and made an official tool for Perl package developers.** It should also be included on the Perl Package Guidelines wiki page[3].
I hope that this will be seriously considered as it addresses a real issue. Tools that generate CPAN packages for Pacman (e.g. Bauerbill) cannot work properly when there is no consistency in how Arch packages are versioned.
Please note as well that this is completely separate from my previous (and continued) contention regarding the "provides" array of CPAN packages.
I hate to ask but if it's to be official, do you have any unit tests? --Kaiting. -- Kiwis and Limes: http://kaitocracy.blogspot.com/
On Sun, Feb 6, 2011 at 7:52 PM, Kaiting Chen <kaitocracy@gmail.com> wrote:
I hate to ask but if it's to be official, do you have any unit tests?
I realize that all it does is hook into the 'version' module and resolve some underscore business. But tests make everyone feel better, even though no one likes writing them. --Kaiting. -- Kiwis and Limes: http://kaitocracy.blogspot.com/
Kaiting Chen wrote:
On Sun, Feb 6, 2011 at 7:52 PM, Kaiting Chen <kaitocracy@gmail.com> wrote:
I hate to ask but if it's to be official, do you have any unit tests?
I realize that all it does is hook into the 'version' module and resolve some underscore business. But tests make everyone feel better, even though no one likes writing them. --Kaiting.
I'm sorry, but I consider this a nonsensical pseudo-technical question.

As mentioned in my other reply, the "version" module is part of the official Perl distribution (and thus included in the "perl" package). It can therefore be considered official and it should be stable, and therefore so should its output when parsing versions.

The function in my script is extremely simple. It does the following:

* checks if the passed version is defined
* checks if the passed version can be parsed by "version"
* converts the version to a pure decimal form with "numify" from version
* replaces the underscore (representing an alpha version) with "a" (see Pacman's documentation if you do not know how it treats letters in versions)
* inserts a decimal point between the major and minor version numbers, which formats all versions to x.xxx or x.xxx.xxx

The last two steps are the only extras. Everything else is within the "version" module. The replacement of "_" with "a" follows directly from the meaning of "_" on CPAN and the way Pacman handles letters in versions as alpha releases. The final step provides a standardized package version that Pacman can understand and which is the most human-readable.

Both of the final steps are simple linear transformations. Where exactly would you like to see unit tests for this script?

Regards,
Xyne
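The steps listed in the message above can be approximated in Python as follows. This is only a sketch under stated assumptions, not the script from the thread: the name to_pkgver and the validation pattern are hypothetical, the "pure decimal form" is computed by hand rather than by the Perl "version" module, and where the "a" marker ends up for alpha versions is a guess.

```python
import re

# Loose stand-in for what the Perl "version" module accepts (an assumption).
_CPAN_VERSION = re.compile(r"v?\d+(?:\.\d+)*(?:_\d+)?")


def to_pkgver(cpan_version):
    """Approximate the listed steps for a single CPAN version string."""
    # Step 1: the version must be defined.
    if cpan_version is None:
        raise ValueError("version is undefined")
    # Step 2: the version must be parseable.
    if not _CPAN_VERSION.fullmatch(cpan_version):
        raise ValueError(f"cannot parse version: {cpan_version!r}")

    # Set any alpha part aside; re-attaching it at the end is a guess.
    base, _, alpha = cpan_version.partition("_")
    dotted = base.startswith("v") or base.count(".") >= 2
    if base.startswith("v"):
        base = base[1:]

    # Step 3: reduce to fraction digits of a pure decimal form.
    whole, _, rest = base.partition(".")
    if dotted:
        # 1.23.0 -> "023000" (fields above 999 are not handled here)
        frac = "".join(f"{int(p):03d}" for p in rest.split(".")) if rest else ""
    else:
        # 1.15 -> "150" (right-pad to a multiple of three digits)
        frac = rest.ljust(-(-len(rest) // 3) * 3, "0")
    frac = frac or "000"

    # Step 5: separate each group of three fraction digits with a dot.
    groups = [frac[i:i + 3] for i in range(0, len(frac), 3)]
    pkgver = ".".join([str(int(whole))] + groups)

    # Step 4: mark an alpha release with "a" instead of "_".
    if alpha:
        pkgver += "a" + alpha
    return pkgver


print(to_pkgver("1.15"))     # 1.150
print(to_pkgver("1.23.0"))   # 1.023.000
print(to_pkgver("0.2"))      # 0.200
```

The outputs follow the x.xxx and x.xxx.xxx shapes described in the message, but they have not been checked against the real script.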
On Sun, Feb 6, 2011 at 8:57 PM, Xyne <xyne@archlinux.ca> wrote:
Both of the final steps are simple linear transformations. Where exactly would you like to see unit tests for this script?
Ahh, never mind, Xyne. I didn't really understand the (extended) regex at the bottom, which is why I asked. Taking another look, it seems pretty simple, and if you feel confident that it will produce vercmp-correct behavior then I'll take your word for it. --Kaiting. -- Kiwis and Limes: http://kaitocracy.blogspot.com/
On Sun, Feb 6, 2011 at 5:16 PM, Xyne <xyne@archlinux.ca> wrote:
Hi,
I have CC'd this to aur-general as this concerns all CPAN packagers on Arch.
CPAN package versions are a mess on Arch Linux. It seems that many if not most CPAN packagers on Arch are unaware of how CPAN deals with versions and thus they do not correctly translate the CPAN version. Most of the time a naive translation is used instead, and this prevents Pacman from working correctly.
To give an example, CPAN considers "1.15" to be a later version than "1.23.0", whereas Pacman will consider the latter version the newer version. This is because "1.15" is short for "1.150" on CPAN, whereas "1.23.0" is short for "1.023.0". This can be confirmed using the "version" module[1].
Could you give real-life examples? I have not seen a case where CPAN is confused by that. You seem to be saying that the packagers are at fault here but I always blamed the CPAN module authors.

From what I remember, the biggest problem is when changing the number of digits. For example, if you go 0.8, 0.9, to 0.10. Ding. Bad. Switching from decimal to dotted decimal (multiple decimal points) also has problems.

Regarding the "version" module, you will get different results using the older "qv" function. I assume you are using the "parse" class-method. Some old code still uses "qv" I imagine. I need to convert some of mine, for example. I will have to look carefully at this behavior before I do.

How do you know that CPAN uses the version module for comparing versions? Again, do you have an example of a bad version comparison CPAN is performing?

Self-indulgent side-story: I once suspected CPAN used the date a package was uploaded for comparing versions. This is half true. Look at the Taint module: http://search.cpan.org/dist/Taint/ the only example I know of. The latest version is older in age than the other two versions on CPAN. Which one is downloaded? 0.9, the highest version which is also the one with the oldest upload date. Which one is displayed first? 0.7, the version uploaded most recently.
I have written a simple Perl script[2] that addresses this issue.* It accepts CPAN versions as command-line arguments and prints out standardized Pacman package versions. The package versions enable Pacman to correctly compare CPAN packages, e.g. when resolving minimal dependencies.
Because I do not understand the problem very well I have a hard time deciding if your script fixes it. -- -Justin
Justin Davis wrote:
Could you give real-life examples? I have not seen a case where CPAN is confused by that. You seem to be saying that the packagers are at fault here but I always blamed the CPAN module authors. From what I remember, the biggest problem is when changing the number of digits. For example, if you go 0.8, 0.9, to 0.10. Ding. Bad. Switching from decimal to dotted decimal (multiple decimal points) also has problems.
You seem to think that I am saying that the versions on CPAN are wrong. I am not. I am saying that Arch packagers do not understand the CPAN version schemes and thus fail to correctly convert CPAN versions to Pacman package versions ($pkgver).

For example, a CPAN package could be updated from 0.199 to 0.2. Pacman will consider 0.199 to be the newer version, e.g.:

$ vercmp 0.199 0.2
1

A "force" flag would thus be required to update the package. The standardized versions would be 0.199 and 0.200, which pacman can correctly compare.

The whole point of this is that CPAN has a very specific versioning scheme that does not directly translate to Arch. It has syntax for major versions, minor versions, alpha versions, etc. There is also a mixture of different syntax due to legacy version strings that have not been updated. The provided script can generate standardized versions using the "version" module which was designed for this, and which is distributed as part of the Perl distribution and can thus be considered official itself.

As the developer of tools to package CPAN packages for Pacman automatically (Pacpan and Bauerbill), I can assure you that the lack of standardized versions in Arch poses a real-world problem. There is no way to reliably generate PKGBUILDs with versioned dependencies as long as there is no standard conversion.

The versions on CPAN can be directly compared using the version module. We must format versions in a way that Pacman can compare, and that is what this script does.

Regards,
Xyne
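The vercmp result quoted in the message above follows from comparing each dot-separated field as an integer. A minimal Python sketch of that idea is given here; simple_vercmp is a hypothetical name, and the function is a deliberate simplification of Pacman's real comparison, handling only purely numeric fields with no letters, epochs or pkgrel.

```python
def simple_vercmp(a: str, b: str) -> int:
    """Return 1, 0 or -1 for purely numeric dotted versions.

    Simplified stand-in for Pacman's vercmp: fields are compared as
    integers from left to right, so "199" beats "2".
    """
    xs = [int(p) for p in a.split(".")]
    ys = [int(p) for p in b.split(".")]
    for x, y in zip(xs, ys):
        if x != y:
            return 1 if x > y else -1
    # All shared fields are equal: treat the version with more fields as newer.
    if len(xs) != len(ys):
        return 1 if len(xs) > len(ys) else -1
    return 0


print(simple_vercmp("0.199", "0.2"))        # 1
print(simple_vercmp("0.199", "0.200"))      # -1
print(simple_vercmp("1.150", "1.023.000"))  # 1
```

With the raw strings, 0.199 is treated as newer than 0.2. Once both sides are padded to three digits per field, the integer comparison agrees with the intended order.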
On Sun, Feb 6, 2011 at 8:36 PM, Xyne <xyne@archlinux.ca> wrote:
You seem to think that I am saying that the versions on CPAN are wrong. I am not. I am saying that Arch packagers do not understand the CPAN version schemes and thus fail to correctly convert CPAN versions to Pacman package versions ($pkgver).
You said a version comparison done by "CPAN" is wrong here:
To give an example, CPAN considers "1.15" to be a later version than "1.23.0", ...
You seem to be lumping the version module and CPAN together. This is what is confusing in your message. You mentioned a CPAN version scheme but there is no such thing. CPAN authors are free to version things like crazy, however they like. I can upload a distribution file to the CPAN with a version that is less than my last version. CPAN will politely notify me of my mistake and release the distribution anyways.
For example, a CPAN package could be updated from 0.199 to 0.2. Pacman will consider 0.199 to be the newer version, e.g.:
$ vercmp 0.199 0.2
1
A "force" flag would thus be required to update the package. The standardized versions would be 0.199 and 0.200, which pacman can correctly compare.
The whole point of this is that CPAN has a very specific versioning scheme that does not directly translate to Arch. It has syntax for major versions, minor versions, alpha versions, etc. There is also a mixture of different syntax due to legacy version strings that have not been updated. The provided script can generate standardized versions using the "version" module which was designed for this, and which is distributed as part of the Perl distribution and can thus be considered official itself.
CPAN has no specific versioning scheme at all. Many versions of distributions on CPAN follow the simple decimal format like 1.23. Some have the dotted-decimal format of 1.23.45 etc. Others have dates for versions like 20101234. Sometimes new releases of a distribution change from one scheme to the next. It's chaos. The version module has in the past been unreliable. It is bloated and even changes behavior. This is mentioned in my previous email. The old behavior used the 'qv' function while the new behavior uses the 'parse' class method. Yes, they give different results. I even tried using the version module for my module's $VERSION, which ended up prefixing the version in my distribution file with a 'v'. Annoying.
As the developer of tools to package CPAN packages for Pacman automatically (Pacpan and Bauerbill), I can assure you that the lack of standardized versions in Arch poses a real-world problem. There is no way to reliably generate PKGBUILDs with versioned dependencies as long as there is no standard conversion.
Certainly you know by now, I created a similar tool with my CPANPLUS::Dist::Arch module. You know I have seen the same problems. How about we slow down a little and work together to try to fix this? A good first step would be to clearly define the problem. There is also plenty of test data available: the entire CPAN and Backpan, with thousands of versions to play with.

What Kaiting says is absolutely correct. Why not test this out and gather some data before asking everyone to start using it? I would have to see for myself whether this worked for a majority of cases before adopting it.

Before that, clearly defining the problem in a document with some real data would be helpful. Then some scripts could be made to gather versions and comparisons. We could even work with the perl community to try to clean up their versions or flag the offending versions.

There is also the problem which stopped me from probing further on this subject. If some packages do not use the same method as I do in normalizing versions then it is all for naught. There could be two packages with different version strings, representative of the same original CPAN distribution, which pacman evaluates differently.

--
-Justin
On 02/07/2011 03:36 AM, Xyne wrote:
Justin Davis wrote:
Could you give real-life examples? I have not seen a case where CPAN is confused by that. You seem to be saying that the packagers are at fault here but I always blamed the CPAN module authors. From what I remember, the biggest problem is when changing the number of digits. For example, if you go 0.8, 0.9, to 0.10. Ding. Bad. Switching from decimal to dotted decimal (multiple decimal points) also has problems.
You seem to think that I am saying that the versions on CPAN are wrong. I am not. I am saying that Arch packagers do not understand the CPAN version schemes and thus fail to correctly convert CPAN versions to Pacman package versions ($pkgver).
For example, a CPAN package could be updated from 0.199 to 0.2. Pacman will consider 0.199 to be the newer version, e.g.:
$ vercmp 0.199 0.2
1
Doesn't the epoch from pacman 3.5 solve these crazy versions in the future? -- Ionuț
participants (4)
- Ionuț Bîru
- Justin Davis
- Kaiting Chen
- Xyne