[aur-general] pkgstats and unused [community] packages
I've been polling <https://archlinux.de/?page=PackageStatistics> regularly for several weeks, and I've noticed that roughly 50% of the packages in [community] are not installed by anyone. Can we get a list of these unused packages? Would it be a good idea to start moving them to [unsupported]? -- Chris
On 10/25/2010 01:35 PM, Christopher Brannon wrote:
I've been polling <https://archlinux.de/?page=PackageStatistics> regularly for several weeks, and I've noticed that roughly 50% of the packages in [community] are not installed by anyone. Can we get a list of these unused packages? Would it be a good idea to start moving them to [unsupported]?
-- Chris
It is indeed a good idea. Let's clean it up! :D -- Ionuț
On 25.10.2010 12:35, Christopher Brannon wrote:
I've been polling <https://archlinux.de/?page=PackageStatistics> regularly for several weeks, and I've noticed that roughly 50% of the packages in [community] are not installed by anyone.
The page doesn't show the whole database. -- Florian Pritz -- {flo,bluewind}@server-speed.net
On 25 October 2010 15:06, Florian Pritz <bluewind@server-speed.net> wrote:
On 25.10.2010 12:35, Christopher Brannon wrote:
I've been polling <https://archlinux.de/?page=PackageStatistics> regularly for several weeks, and I've noticed that roughly 50% of the packages in [community] are not installed by anyone.
The page doesn't show the whole database.
-- Florian Pritz -- {flo,bluewind}@server-speed.net
Is there a way to get the whole database ? Would be cool if we could play with the raw statistics from pkgstat ;)
On Mon, 25 Oct 2010 15:13:18 +0100, Brieuc ROBLIN <brieuc.roblin@gmail.com> wrote:
On 25 October 2010 15:06, Florian Pritz <bluewind@server-speed.net> wrote:
On 25.10.2010 12:35, Christopher Brannon wrote:
I've been polling <https://archlinux.de/?page=PackageStatistics> regularly for several weeks, and I've noticed that roughly 50% of the packages in [community] are not installed by anyone.
The page doesn't show the whole database.
-- Florian Pritz -- {flo,bluewind}@server-speed.net
Is there a way to get the whole database ? Would be cool if we could play with the raw statistics from pkgstat ;)
I can't tell when it'll be done but I am working on a new implementation. I plan to provide the raw data in different formats (plain text, csv and json). The current page only displays packages with >= 1% usage. So 50% of community is used by less than 1%. -- Pierre Schmitz, https://users.archlinux.de/~pierre
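As a rough illustration of the 1% display threshold described above, the sketch below computes each package's usage share from raw counts and separates the packages the page would show from those it would hide. The data layout (a mapping of package name to install count, plus a total number of submissions) and the package names are assumptions made for the example; the actual pkgstats export format is not described in this thread.

```python
def split_by_usage(install_counts, total_submissions, threshold=0.01):
    """Partition packages by usage share.

    install_counts is a hypothetical mapping of package name to the number
    of submitting systems that have the package installed.
    """
    shown, hidden = [], []
    for name, count in sorted(install_counts.items()):
        share = count / total_submissions
        # Packages below the threshold are hidden, not necessarily unused.
        (shown if share >= threshold else hidden).append(name)
    return shown, hidden


# Made-up numbers, for illustration only.
counts = {"pkg-a": 500, "pkg-b": 20, "pkg-c": 3}
shown, hidden = split_by_usage(counts, total_submissions=1000)
print("shown:", shown)
print("hidden:", hidden)
```

A package that is missing from the page therefore falls somewhere between 0% and 1% usage, which is a different claim from "installed by no one".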
On 25 October 2010 16:13, Pierre Schmitz <pierre@archlinux.de> wrote:
On Mon, 25 Oct 2010 15:13:18 +0100, Brieuc ROBLIN <brieuc.roblin@gmail.com> wrote:
On 25 October 2010 15:06, Florian Pritz <bluewind@server-speed.net> wrote:
On 25.10.2010 12:35, Christopher Brannon wrote:
I've been polling <https://archlinux.de/?page=PackageStatistics> regularly for several weeks, and I've noticed that roughly 50% of the packages in [community] are not installed by anyone.
The page doesn't show the whole database.
-- Florian Pritz -- {flo,bluewind}@server-speed.net
Is there a way to get the whole database ? Would be cool if we could play with the raw statistics from pkgstat ;)
I can't tell when it'll be done but I am working on a new implementation. I plan to provide the raw data in different formats (plain text, csv and json).
The current page only displays packages with >= 1% usage. So 50% of community is used by less than 1%.
-- Pierre Schmitz, https://users.archlinux.de/~pierre
That would be great. if you need any help to get things done, feel free to ask ;)
Brieuc ROBLIN wrote:
On 25 October 2010 15:06, Florian Pritz <bluewind@server-speed.net> wrote:
On 25.10.2010 12:35, Christopher Brannon wrote:
I've been polling <https://archlinux.de/?page=PackageStatistics> regularly for several weeks, and I've noticed that roughly 50% of the packages in [community] are not installed by anyone.
The page doesn't show the whole database.
-- Florian Pritz -- {flo,bluewind}@server-speed.net
Is there a way to get the whole database ? Would be cool if we could play with the raw statistics from pkgstat ;)
Not all users submit stats so unless Arch installs spyware on everyone's system or pools download stats from the mirrors, those stats are not sufficient to motivate removals.
Well, we can discuss the generated list of unused packages. I'd agree to a package that I maintain being removed if it's unused (in fact it isn't removed at all, it is moved to the AUR). Cheers -- Angel Velásquez angvp @ irc.freenode.net Arch Linux Developer / Trusted User Linux Counter: #359909 http://www.angvp.com
On 10/25/2010 08:49 PM, Ángel Velásquez wrote:
Well, we can discuss the generated list of unused packages. I'd agree to a package that I maintain being removed if it's unused (in fact it isn't removed at all, it is moved to the AUR).
Cheers
here is the list http://wiki.archlinux.org/index.php/Community_cleanup -- Ionuț
On Monday 25 October 2010 19:50:16 Ionuț Bîru wrote:
here is the list http://wiki.archlinux.org/index.php/Community_cleanup I agree to remove them too. At least packages with 0% of usage can be removed.
-- Andrea Scarpino Arch Linux Developer
On Mon, Oct 25, 2010 at 8:22 PM, Andrea Scarpino <andrea@archlinux.org> wrote:
On Monday 25 October 2010 19:50:16 Ionuț Bîru wrote:
here is the list http://wiki.archlinux.org/index.php/Community_cleanup I agree to remove them too. At least packages with 0% of usage can be removed.
-- Andrea Scarpino Arch Linux Developer
And what if we are actually using some of them ? I can think of "cacti" for instance. -- Cédric Girard
On 10/25/2010 09:22 PM, Andrea Scarpino wrote:
On Monday 25 October 2010 19:50:16 Ionuț Bîru wrote:
here is the list http://wiki.archlinux.org/index.php/Community_cleanup I agree to remove them too. At least packages with 0% of usage can be removed.
Let's not rush. First let's discuss what can be kept. For example, I want to keep thunberbird-spell and gambas2, python2-docs, python2-pysfml -- Ionuț
On 10/25/2010 09:22 PM, Andrea Scarpino wrote:
On Monday 25 October 2010 19:50:16 Ionuț Bîru wrote:
here is the list http://wiki.archlinux.org/index.php/Community_cleanup
I agree to remove them too. At least packages with 0% of usage can be removed.
Let's not rush. First let's discuss what can be kept.
For example, I want to keep thunberbird-spell and gambas2, python2-docs, python2-pysfml
Hi, I think there are some server packages, for example ejabberd and roundcubemail. These packages are only interesting for root servers, not for desktop computers. But I think they should stay in community, because there are not many root servers with pkgstats installed. oidentd and pam_mysql are only needed in a large network infrastructure, but without those packages my university would not have switched its installations from Ubuntu to Arch Linux. I think such packages should stay too. Greez Michael
On 25.10.2010 23:41, archlinux@michael.trunner.de wrote:
Hi, I think there are some server packages, for example ejabberd and roundcubemail. These packages are only interesting for root servers, but not for desktop computers. But I think they should stay in community, because there are not so many root servers with pkgstats installed.
oidentd and pam_mysql are only needed in a large network infrastructure, but without those packages my university would not have switched its installations from Ubuntu to Arch Linux. I think such packages should stay too.
From my packages I would like to leave in community at least:
- ejabberd* packages + jabber transports (pyicqt, yahoo-t, etc)
- perl-* modules
- synce packages
- emacs related packages (emacs-muse, wanderlust, etc)
- haskell modules
- documentation packages
- xl2tpd should be on install isos I think (https://bugs.archlinux.org/task/13357)
- sisctrl
- sshguard
- simh
and probably some others.
On 25.10.2010 19:46, Xyne wrote:
Brieuc ROBLIN wrote:
On 25 October 2010 15:06, Florian Pritz <bluewind@server-speed.net> wrote:
On 25.10.2010 12:35, Christopher Brannon wrote:
I've been polling <https://archlinux.de/?page=PackageStatistics> regularly for several weeks, and I've noticed that roughly 50% of the packages in [community] are not installed by anyone.
The page doesn't show the whole database.
-- Florian Pritz -- {flo,bluewind}@server-speed.net
Is there a way to get the whole database ? Would be cool if we could play with the raw statistics from pkgstat ;)
Not all users submit stats so unless Arch installs spyware on everyone's system or pools download stats from the mirrors, those stats are not sufficient to motivate removals.
I agree. We have a rule for not letting packages enter community which have less than 10 votes _or_ less than 1 percent usage in pkgstats. There were good reasons to have this decision be based on two sources. If we want to enforce removal of packages that do fulfil one of these conditions, we need at least another proposal with a discussion and a voting period afterwards. And I would be against it. Regards Stefan
It looks like someone cleaned up extra :) xf86-input-mutouch was removed and I can not find it in aur.
On 10/25/2010 10:34 PM, Sergej Pupykin wrote:
It looks like someone cleaned up extra :)
xf86-input-mutouch was removed and I can not find it in aur.
yes
Date: Monday, October 25, 2010 @ 13:19:02
Author: jgc
Revision: 96932
Remove, no longer supported or developed upstream
Deleted: xf86-input-elographics/ xf86-input-fpit/ xf86-input-hyperpen/ xf86-input-mutouch/ xf86-input-penmount/ xf86-video-radeonhd/
-- Ionuț
Xyne <xyne@archlinux.ca> writes:
Not all users submit stats so unless Arch installs spyware on everyone's system or pools download stats from the mirrors, those stats are not sufficient to motivate removals.
That's an excellent point. It is also fairly trivial to submit bogus data, though I doubt that anyone is doing so. Yes, one metric isn't good enough for determining what should stay and what should go. The more I think about it, the less I like the concept. -- Chris
On Mon 25 Oct 2010 19:46 +0200, Xyne wrote:
Brieuc ROBLIN wrote:
On 25 October 2010 15:06, Florian Pritz <bluewind@server-speed.net> wrote:
On 25.10.2010 12:35, Christopher Brannon wrote:
I've been polling <https://archlinux.de/?page=PackageStatistics> regularly for several weeks, and I've noticed that roughly 50% of the packages in [community] are not installed by anyone.
The page doesn't show the whole database.
-- Florian Pritz -- {flo,bluewind}@server-speed.net
Is there a way to get the whole database ? Would be cool if we could play with the raw statistics from pkgstat ;)
Not all users submit stats so unless Arch installs spyware on everyone's system or pools download stats from the mirrors, those stats are not sufficient to motivate removals.
Well, I've argued in the past that if not enough users care enough to give feedback to the developers about what they care about in the distro, either through votes, or pkgstats, or some other way then it's not something that the devs should have to worry about.
Can you vote on packages in community? Also wouldn't it make more sense to pull the usage data from the download servers? Kaiting. On Mon, Oct 25, 2010 at 10:38 PM, Loui Chang <louipc.ist@gmail.com> wrote:
On Mon 25 Oct 2010 19:46 +0200, Xyne wrote:
Brieuc ROBLIN wrote:
On 25 October 2010 15:06, Florian Pritz <bluewind@server-speed.net> wrote:
On 25.10.2010 12:35, Christopher Brannon wrote:
I've been polling <https://archlinux.de/?page=PackageStatistics> regularly for several weeks, and I've noticed that roughly 50% of the packages in [community] are not installed by anyone.
The page doesn't show the whole database.
-- Florian Pritz -- {flo,bluewind}@server-speed.net
Is there a way to get the whole database ? Would be cool if we could play with the raw statistics from pkgstat ;)
Not all users submit stats so unless Arch installs spyware on everyone's system or pools download stats from the mirrors, those stats are not sufficient to motivate removals.
Well, I've argued in the past that if not enough users care enough to give feedback to the developers about what they care about in the distro, either through votes, or pkgstats, or some other way then it's not something that the devs should have to worry about.
-- Kiwis and Limes: http://kaitocracy.blogspot.com/
On Mon, Oct 25, 2010 at 10:38 PM, Loui Chang <louipc.ist@gmail.com> wrote:
On Mon 25 Oct 2010 19:46 +0200, Xyne wrote:
Not all users submit stats so unless Arch installs spyware on everyone's system or pools download stats from the mirrors, those stats are not sufficient to motivate removals.
Well, I've argued in the past that if not enough users care enough to give feedback to the developers about what they care about in the distro, either through votes, or pkgstats, or some other way then it's not something that the devs should have to worry about.
On Mon 25 Oct 2010 22:35 -0400, Kaiting Chen wrote:
Can you vote on packages in community? Also wouldn't it make more sense to pull the usage data from the download servers?
You can no longer vote on community packages. Pulling usage data from the mirrors would be pretty tricky. I remember it being discussed either on the bug tracker or the mailing list. The only option in this case is pkgstats. I think it would be great to have voting for official packages though.
Pulling usage data from the mirrors would be pretty tricky. I remember it being discussed either on the bug tracker or the mailing list.
Unrelated but thinking ahead, would it be possible to go ahead and get rid of /etc/pacman.d/mirrorlist and pull from a main http://www.archlinux.org/repository? Then have that repository instead be a proxy to the actual mirrors that round robin's them, possibly with some kind of IP geolocated weighting? Then the package downloads can be easily tracked through this main proxy. -- Kiwis and Limes: http://kaitocracy.blogspot.com/
On 10/26/2010 05:40 AM, Kaiting Chen wrote:
Unrelated but thinking ahead, would it be possible to go ahead and get rid of /etc/pacman.d/mirrorlist and pull from a main http://www.archlinux.org/repository? Then have that repository instead be a proxy to the actual mirrors that round robin's them, possibly with some kind of IP geolocated weighting? Then the package downloads can be easily tracked through this main proxy.
To actually track the tcp-traffic (indirectly containing the name of the requested package) archlinux.org would have to _proxy_ the traffic (_all_ data would go _twice_ through their network infrastructure). This would make the concept of mirrors useless.
The other possibility would be a round-robin domain name (like e.g. irc.freenode.net). This way archlinux.org could only log that a connection was made, but not which packages were requested. (Additionally all mirrors would have to use the same folder hierarchy)
TL,DR: There is no technical way to monitor all package downloads.
Regards, PyroPeter -- freenode/pyropeter "12:50 - Ich drücke Return."
On Tuesday 26 October 2010, 16:55:27, PyroPeter wrote:
On 10/26/2010 05:40 AM, Kaiting Chen wrote:
Unrelated but thinking ahead, would it be possible to go ahead and get rid of /etc/pacman.d/mirrorlist and pull from a main http://www.archlinux.org/repository? Then have that repository instead be a proxy to the actual mirrors that round robin's them, possibly with some kind of IP geolocated weighting? Then the package downloads can be easily tracked through this main proxy.
To actually track the tcp-traffic (indirectly containing the name of the requested package) archlinux.org would have to _proxy_ the traffic (_all_ data would go _twice_ through their network infrastructure). This would make the concept of mirrors useless.
The other possibility would be a round-robin domain name (like e.g. irc.freenode.net). This way archlinux.org could only log that a connection was made, but not which packages were requested. (Additionally all mirrors would have to use the same folder hierarchy)
TL,DR: There is no technical way to monitor all package downloads.
Regards, PyroPeter
Why not let pacman do the job (similar to how yaourt uses aurvote)? Let pacman send a "ping" to some server like aurvote does. Michael -- PGP-Key: 51C1D000 Jabber: akurei@furdev.org http://akurei.de
On 26.10.2010 20:05, Michael Düll wrote:
Why not let pacman do the job (similar to how yaourt uses aurvote)? Let pacman send a "ping" to some server like aurvote does.
Michael
Because pacman is a distro agnostic tool. Other distros do not have something like the AUR.
Not true, Arch could set up a round robin proxy to other mirrors such that when a package is requested it returns an HTTP 302 or HTTP 303 redirect. Then the only network traffic routed through Arch servers would be the request HTTP headers, which is quite insubstantial but would still allow real package statistics to be retrieved. Kaiting.
On Tue, Oct 26, 2010 at 10:55 AM, PyroPeter <abi1789@googlemail.com> wrote:
On 10/26/2010 05:40 AM, Kaiting Chen wrote:
Unrelated but thinking ahead, would it be possible to go ahead and get rid of /etc/pacman.d/mirrorlist and pull from a main http://www.archlinux.org/repository? Then have that repository instead be a proxy to the actual mirrors that round robin's them, possibly with some kind of IP geolocated weighting? Then the package downloads can be easily tracked through this main proxy.
To actually track the tcp-traffic (indirectly containing the name of the requested package) archlinux.org would have to _proxy_ the traffic (_all_ data would go _twice_ through their network infrastructure). This would make the concept of mirrors useless.
The other possibility would be a round-robin domain name (like e.g. irc.freenode.net). This way archlinux.org could only log that a connection was made, but not which packages were requested. (Additionally all mirrors would have to use the same folder hierarchy)
TL,DR: There is no technical way to monitor all package downloads.
Regards, PyroPeter -- freenode/pyropeter "12:50 - Ich drücke Return."
-- Kiwis and Limes: http://kaitocracy.blogspot.com/
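The redirect idea can be sketched in a few lines. This is a minimal illustration and not how archlinux.org or any existing mirror service works: the mirror URLs are placeholders, the in-memory `REQUEST_LOG` stands in for real logging, and plain round-robin is used instead of any geolocation weighting. The names `pick_redirect`, `RedirectHandler` and `serve` are hypothetical.

```python
import itertools
from http.server import BaseHTTPRequestHandler, HTTPServer

# Placeholder mirror base URLs (assumption); a real setup would load the
# mirror list from somewhere authoritative.
MIRRORS = itertools.cycle([
    "http://mirror-a.example.org/archlinux",
    "http://mirror-b.example.org/archlinux",
])

REQUEST_LOG = []  # in-memory stand-in for real request logging


def pick_redirect(path):
    """Record the requested path and return the next mirror URL for it."""
    REQUEST_LOG.append(path)
    return next(MIRRORS) + path


class RedirectHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Only the request and response headers pass through this server;
        # the package body is served by the mirror the client is sent to.
        self.send_response(302)
        self.send_header("Location", pick_redirect(self.path))
        self.end_headers()


def serve(host="127.0.0.1", port=8080):
    """Start the redirector; this call blocks until interrupted."""
    HTTPServer((host, port), RedirectHandler).serve_forever()
```

Calling `serve()` and requesting a package path with any HTTP client that follows redirects should fetch the file from one of the listed mirrors, while the requested path ends up in `REQUEST_LOG`. The sketch counts requests, not installations: a package downloaded once and copied to many machines is still counted once.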
On 10/27/2010 02:03 AM, Kaiting Chen wrote:
On Tue, Oct 26, 2010 at 10:55 AM, PyroPeter <abi1789@googlemail.com> wrote:
To actually track the tcp-traffic (indirectly containing the name of the requested package) archlinux.org would have to _proxy_ the traffic (_all_ data would go _twice_ through their network infrastructure). This would make the concept of mirrors useless.
The other possibility would be a round-robin domain name (like e.g. irc.freenode.net). This way archlinux.org could only log that a connection was made, but not which packages were requested. (Additionally all mirrors would have to use the same folder hierarchy)
TL,DR: There is no technical way to monitor all package downloads.
Regards, PyroPeter
Not true, Arch could set up a round robin proxy to other mirrors such that when a package is requested it returns an HTTP 302 or HTTP 303 redirect. Then the only network traffic routed through Arch servers would be the request HTTP headers, which is quite insubstantial but would still allow real package statistics to be retrieved.
Kaiting.
Yes, you are right. This would even allow to host the package lists at archlinux.org (I assume they include checksums of the archives) which would help with the security concerns (non-signed packages, etc...) as you would not be forced to trust the mirrors any longer. (as long as you did not use MD5 for the hashes ;-) ) Regards, PyroPeter -- freenode/pyropeter "12:50 - Ich drücke Return."
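The point about checksums can be illustrated as follows. This is only a sketch: SHA-256 is chosen here as an example of a hash that is not MD5, the byte strings stand in for a downloaded archive, and `verify_package` is a hypothetical helper, not something pacman provides.

```python
import hashlib


def sha256_of(data: bytes) -> str:
    """Hex digest of the given bytes."""
    return hashlib.sha256(data).hexdigest()


def verify_package(data: bytes, expected_sha256: str) -> bool:
    """Compare a downloaded archive with a checksum from a trusted list."""
    return sha256_of(data) == expected_sha256


# The "trusted" value stands in for an entry in a package list fetched
# from a host the user trusts; the archive itself may come from any mirror.
trusted = sha256_of(b"package contents")
print(verify_package(b"package contents", trusted))   # True
print(verify_package(b"tampered contents", trusted))  # False
```

The mirror then only needs to be trusted to deliver bytes; whether those bytes are the right ones is decided by the checksum obtained elsewhere.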
On Tue, 26 Oct 2010 20:03:30 -0400, Kaiting Chen <kaitocracy@gmail.com> wrote:
Not true, Arch could set up a round robin proxy to other mirrors such that when a package is requested it returns an HTTP 302 or HTTP 303 redirect. Then the only network traffic routed through Arch servers would be the request HTTP headers, which is quite insubstantial but would still allow real package statistics to be retrieved.
Kaiting.
This is called mirrorbrain (ok, it is a little more advanced). We just lack a server and someone to implement this. To make it more effective we'd also need some pacman modifications. -- Pierre Schmitz, https://users.archlinux.de/~pierre
This is called mirrorbrain (ok, it is a little more advanced). We just lack a server and someone to implement this. To make it more effective we'd also need some pacman modifications.
-- Pierre Schmitz, https://users.archlinux.de/~pierre
Holy shit I just checked out http://www.mirrorbrain.org/; I did not know that something like that existed. I think this weekend I'll go ahead and install it on my server, load the list of Arch mirrors, do a small scale trial. I'll post the results probably next week. Kaiting. -- Kiwis and Limes: http://kaitocracy.blogspot.com/
2010/10/27 Kaiting Chen <kaitocracy@gmail.com>:
This is called mirrorbrain (ok, it is a little more advanced). We just lack a server and someone to implement this. To make it more effective we'd also need some pacman modifications.
-- Pierre Schmitz, https://users.archlinux.de/~pierre
Holy shit I just checked out http://www.mirrorbrain.org/; I did not know that something like that existed. I think this weekend I'll go ahead and install it on my server, load the list of Arch mirrors, do a small scale trial. I'll post the results probably next week.
If I understand how mirrorbrain works, using it could also be the solution for half-updated mirrors breaking things (I recall people had problems when the libpng/libjpeg rebuilds were moved out of testing a while ago, so much so that someone pushed the old versions of those libraries to the AUR). Not half bad.
On 26 October 2010 10:38, Loui Chang <louipc.ist@gmail.com> wrote:
On Mon 25 Oct 2010 19:46 +0200, Xyne wrote:
Brieuc ROBLIN wrote:
On 25 October 2010 15:06, Florian Pritz <bluewind@server-speed.net> wrote:
On 25.10.2010 12:35, Christopher Brannon wrote:
I've been polling <https://archlinux.de/?page=PackageStatistics> regularly for several weeks, and I've noticed that roughly 50% of the packages in [community] are not installed by anyone.
The page doesn't show the whole database.
-- Florian Pritz -- {flo,bluewind}@server-speed.net
Is there a way to get the whole database ? Would be cool if we could play with the raw statistics from pkgstat ;)
Not all users submit stats so unless Arch installs spyware on everyone's system or pools download stats from the mirrors, those stats are not sufficient to motivate removals.
Well, I've argued in the past that if not enough users care enough to give feedback to the developers about what they care about in the distro, either through votes, or pkgstats, or some other way then it's not something that the devs should have to worry about.
Yes, Allan has also mentioned this in the pkgstats thread [1]. The factor here is the criteria for packages entering community. We say more than 10 votes OR 1% usage, but these cannot correlate. Users contributing to votes may not necessarily be contributing to usage statistics. Even if they did, most often they'd account for less than 1% of the usage pool, even for a package with 50 votes. [1] https://bbs.archlinux.org/viewtopic.php?pid=831180#p831180
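To make the mismatch between the two criteria concrete, here is a small sketch. The thresholds (10 votes, 1% usage) come from the rule quoted above; treating exactly 10 votes as sufficient and the figure of 10,000 submitting systems are assumptions chosen only for illustration.

```python
def meets_entry_criteria(votes, usage_share, min_votes=10, min_usage=0.01):
    """True if either the vote condition or the usage condition holds."""
    return votes >= min_votes or usage_share >= min_usage


def voters_as_usage_share(votes, submitting_systems):
    """Usage share if every voter, and nobody else, also submitted stats."""
    return votes / submitting_systems


# Hypothetical: 50 AUR votes, 10,000 systems submitting to pkgstats.
share = voters_as_usage_share(50, 10_000)
print(share)                              # 0.005, i.e. 0.5%
print(meets_entry_criteria(50, share))    # True, via votes alone
print(meets_entry_criteria(0, share))     # False, usage alone is too low
```

Under these assumed numbers, a package with five times the required votes would still sit below the 1% usage line, so it could enter on votes and later be flagged for removal on usage.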
On 25 October 2010 18:35, Christopher Brannon <chris@the-brannons.com> wrote:
I've been polling <https://archlinux.de/?page=PackageStatistics> regularly for several weeks, and I've noticed that roughly 50% of the packages in [community] are not installed by anyone. Can we get a list of these unused packages? Would it be a good idea to start moving them to [unsupported]?
I say it would _not_ be a good idea. This is too fast. Some packages that I remember to have had more than 5% usage in the previous pkgstats implementation are nowhere to be seen. Others are recent additions by merit of AUR votes, but are not accounted for in pkgstats. For those, the situation is that you bring them in because they have good votes, only to later move them back because they don't have good usage statistics. That's fine if the "later" is after, say, a year.
I'd say only remove the packages that are orphans. There is no point removing packages that currently have a maintainer, but TUs should look at their list of packages and consider whether some of them are really needed... Allan
Allan McRae <allan@archlinux.org> writes:
I'd say only remove the packages that are orphans.
Here's the list of [community] orphans with less than 1% usage, according to pkgstats: http://paste.xinu.at/98f5 -- Chris
On 26 October 2010 07:48, Christopher Brannon <chris@the-brannons.com> wrote:
Allan McRae <allan@archlinux.org> writes:
I'd say only remove the packages that are orphans.
Here's the list of [community] orphans with less than 1% usage, according to pkgstats: http://paste.xinu.at/98f5
OK that's a good list, and all of those can be moved IMO.
I take back part of what I mentioned earlier. There are indeed some packages that I believe no one uses. The best way to handle this is to selectively remove each package that we still want to keep from the wiki list. I've added a filter list, so remove from there (and not the original). Wiki diffs would tell us what has been removed (and by whom).
Set up a timeframe along with an official discussion period for this, i.e. how long we have until the filter list is final. And then the voting, if needed.
Wow there are some really big name packages on that list. Cacti, freeradius, ajaxterm, etc. I'm using at least 10 packages on the filtered list. I would hate to see them removed. Kaiting. On Mon, Oct 25, 2010 at 8:20 PM, Ray Rashif <schiv@archlinux.org> wrote:
On 26 October 2010 07:48, Christopher Brannon <chris@the-brannons.com> wrote:
Allan McRae <allan@archlinux.org> writes:
I'd say only remove the packages that are orphans.
Here's the list of [community] orphans with less than 1% usage, according to pkgstats: http://paste.xinu.at/98f5
OK that's a good list, and all of those can be moved IMO.
I take back part of what I mentioned earlier. There are indeed some packages that I believe no one uses. The best way to handle this is to selectively remove each package that we still want to keep from the wiki list. I've added a filter list, so remove from there (and not the original). Wiki diffs would tell us what has been removed (and by whom).
Set up a timeframe along with an official discussion period for this, i.e. how long we have until the filter list is final. And then the voting, if needed.
-- Kiwis and Limes: http://kaitocracy.blogspot.com/
On Mon, Oct 25, 2010 at 8:20 PM, Ray Rashif <schiv@archlinux.org> wrote:
On 26 October 2010 07:48, Christopher Brannon <chris@the-brannons.com> wrote:
Allan McRae <allan@archlinux.org> writes:
I'd say only remove the packages that are orphans.
Here's the list of [community] orphans with less than 1% usage, according to pkgstats: http://paste.xinu.at/98f5
OK that's a good list, and all of those can be moved IMO.
If it hasn't been done, someone needs to check to make sure that they are not {make,opt}depends of other non-orphaned packages.
I take back part of what I mentioned earlier. There are indeed some packages that I believe no one uses. The best way to handle this is to selectively remove each package that we still want to keep from the wiki list. I've added a filter list, so remove from there (and not the original). Wiki diffs would tell us what has been removed (and by whom).
Set up a timeframe along with an official discussion period for this, i.e. how long we have until the filter list is final. And then the voting, if needed.
On Tue, Oct 26, 2010 at 03:46:56AM -0400, Eric Bélanger wrote:
If it hasn't been done, someone needs to check to make sure that they are not {make,opt}depends of other non-orphaned packages.
Done. ucl is a makedepend of upx, gpsmanshp an optdep of gpsman. That's it :)
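A check of this kind could look roughly like the sketch below. The metadata layout (a mapping of package name to lists of `depends`, `makedepends` and `optdepends`) is an assumption, as is the package `unused-example`; the upx/ucl and gpsman/gpsmanshp relationships are the ones reported above. The sketch checks against every package that is not itself a removal candidate, which is slightly broader than "non-orphaned".

```python
def blocked_removals(candidates, packages):
    """Map each removal candidate to the kept packages that still need it.

    packages is a hypothetical mapping of package name to a dict with
    optional "depends", "makedepends" and "optdepends" lists.
    """
    candidates = set(candidates)
    blocked = {}
    for name, meta in packages.items():
        if name in candidates:
            continue  # a candidate needing another candidate is no obstacle
        for field in ("depends", "makedepends", "optdepends"):
            for dep in meta.get(field, []):
                if dep in candidates:
                    blocked.setdefault(dep, []).append((name, field))
    return blocked


packages = {
    "upx": {"makedepends": ["ucl"]},
    "gpsman": {"optdepends": ["gpsmanshp"]},
    "ucl": {},
    "gpsmanshp": {},
    "unused-example": {},
}
result = blocked_removals(["ucl", "gpsmanshp", "unused-example"], packages)
print(result)
```

Candidates that do not appear in the result have no remaining reverse dependencies among the kept packages and can be moved without breaking a build.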
On 2010-10-26 08:20 +0800 (43:2) Ray Rashif wrote:
On 26 October 2010 07:48, Christopher Brannon <chris@the-brannons.com> wrote:
Allan McRae <allan@archlinux.org> writes:
I'd say only remove the packages that are orphans.
Here's the list of [community] orphans with less than 1% usage, according to pkgstats: http://paste.xinu.at/98f5
OK that's a good list, and all of those can be moved IMO.
I take back part of what I mentioned earlier. There are indeed some packages that I believe no one uses. The best way to handle this is to selectively remove each package that we still want to keep from the wiki list. I've added a filter list, so remove from there (and not the original). Wiki diffs would tell us what has been removed (and by whom).
Set up a timeframe along with an official discussion period for this, i.e. how long we have until the filter list is final. And then the voting, if needed.
I can see the point of removing orphans but I still think that using pkgstats as a metric is a bad idea for everything else. Casual users, i.e. those who are not actively involved on the forum or IRC, won't even be aware of pkgstats. Really, who installs a distro and actively looks for a way to submit user data? And please don't try to tell me that the only users who matter are the ones who form the core community.
Then you have the paranoid who won't submit anything, even if they're a small group. Ultimately pkgstats only reflect the usage of a small group of people with possibly skewed interests. (There should be a few statisticians around so it would be interesting to hear their analysis of this... let's face it, most people fail at interpreting statistical data and ultimately do so with a bias that supports their own agenda... *cough*politicians*cough*.)*
Several of those packages are niche packages too (e.g. python-sympy, vtk, avogadro), but ones that are important within their niche. If they are actively maintained then I see no reason to remove them even if they are not commonly used by the subset of users who submit stats.
As it stands, I would support removal of the orphaned packages listed above but not the list based on pkgstats alone. We need a better usage metric for repo packages. Personally I think it would be better to implement a simple online vote and inform users that a package is a candidate for removal in a post_upgrade or post_install message. Users could then vote to keep the package and if it passes a threshold (e.g. 10, as required by AUR), then it does not get removed.
Also, consider that a package can be moved to [community] if it gets 10 votes on the AUR. 10 votes out of thousands of users is less than 1%, maybe even less than 0.1% depending on how many AUR users there actually are.
Regards, Xyne
* pkgstats also uses hashed IPs to form unique IDs. Multiple users behind a single IP would only count as 1 in that case. What if that single IP represents an entire institution with hundreds of installations?
p.s. Removing these packages indiscriminately will herald the apocalypse and the end of tacos as we know them.
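The footnote about hashed IPs can be illustrated with a short sketch. The hashing scheme here (SHA-256 of the address string) is an assumption, since the thread does not say how pkgstats derives its IDs, and the addresses are documentation-range placeholders.

```python
import hashlib


def count_unique_submitters(submissions):
    """Count submitters the way a counter keyed on hashed IPs would.

    submissions is an iterable of (ip_address, package_list) pairs.
    """
    seen = {hashlib.sha256(ip.encode()).hexdigest() for ip, _ in submissions}
    return len(seen)


# Three machines behind one shared address plus one standalone machine.
submissions = [
    ("203.0.113.7", ["cacti"]),
    ("203.0.113.7", ["cacti"]),
    ("203.0.113.7", ["freeradius"]),
    ("198.51.100.2", ["vtk"]),
]
print(count_unique_submitters(submissions))  # 2, although 4 systems submitted
```

With this kind of keying, an institution with hundreds of installations behind one address would weigh the same as a single home user.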
On Tue, Oct 26, 2010 at 7:36 AM, Xyne <xyne@archlinux.ca> wrote:
I can see the point of removing orphans but I still think that using pkgstats as a metric is a bad idea for everything else. Casual users, i.e. those who are not actively involved on the forum or IRC won't even be aware of pkgstats. Really, who installs a distro and actively looks for a way to submit user data? And please don't try to tell me that the only users who matter are the ones who form the core community.
Then you have the paranoid who won't submit anything, even if they're a small group. Ultimately pkgstats only reflect the usage of a small group of people with possibly skewed interests. (There should be a few statisticians around so it would be interesting to hear their analysis of this... let's face it, most people fail at interpreting statistical data and ultimately do so with a bias that supports their own agenda... *cough*politicians*cough*.)*
+57, these are all topics that were brought up during the original discussion of using pkgstats as a means to promote packages from unsupported to community, and they were never really addressed. Our system of 10 votes or 1% usage in pkgstats is completely arbitrary. We don't have any statistical means of backing up what those numbers actually mean; they were picked pretty much just because they sounded good. There was even a long-time Trusted User who resigned due to the frustration of arguing over these issues.
Anyway, my take on it is that as long as the packages aren't orphans that have been out of date for a *long* time, then what's the harm in keeping them in the repo? If the packages are being maintained anyway, it benefits everyone by having them in there, and unless we're running dangerously low on resources, the cleanup process isn't that necessary. If we _are_ running dangerously low on resources, is it better to drop software that may be used by a lot of people, or would it be better to campaign to raise some money for additional resources? I'm not saying that we never need to prune things up, but at this point in time, we don't have any good means of determining what needs to go aside from the personal judgement of our TUs, which luckily, is pretty reliable.
-- Aaron "ElasticDog" Schaefer
On Tue, 26 Oct 2010 08:29:38 -0700 Aaron Bull Schaefer <aaron@elasticdog.com> wrote:
On Tue, Oct 26, 2010 at 7:36 AM, Xyne <xyne@archlinux.ca> wrote:
I can see the point of removing orphans but I still think that using pkgstats as a metric is a bad idea for everything else. Casual users, i.e. those who are not actively involved on the forum or IRC won't even be aware of pkgstats. Really, who installs a distro and actively looks for a way to submit user data? And please don't try to tell me that the only users who matter are the ones who form the core community.
Then you have the paranoid who won't submit anything, even if they're a small group. Ultimately pkgstats only reflect the usage of a small group of people with possibly skewed interests. (There should be a few statisticians around so it would be interesting to hear their analysis of this... let's face it, most people fail at interpreting statistical data and ultimately do so with a bias that supports their own agenda... *cough*politicians*cough*.)*
+57, these are all topics that were brought up during the original discussion of using pkgstats as a means to promote packages from unsupported to community, and they were never really addressed. Our system of 10 votes or 1% usage in pkgstats is completely arbitrary. We don't have any statistical means of backing up what those numbers actually mean; they were picked pretty much just because they sounded good. There was even a long-time Trusted User who resigned due to the frustration of arguing over these issues.
Anyway, my take on it is that as long as the packages aren't orphans that have been out of date for a *long* time, then what's the harm in keeping them in the repo? If the packages are being maintained anyway, it benefits everyone by having them in there, and unless we're running dangerously low on resources, the cleanup process isn't that necessary. If we _are_ running dangerously low on resources, is it better to drop software that may be used by a lot of people, or would it be better to campaign to raise some money for additional resources? I'm not saying that we never need to prune things up, but at this point in time, we don't have any good means of determining what needs to go aside from the personal judgement of our TUs, which luckily, is pretty reliable.
I did not have the time to actively participate in this discussion so far, but Xyne's and Aaron's opinions are pretty much the same as mine. Moving orphans after some time is good, as people using those can take care of them when they're in the AUR, but that's the only good reason I see for this action. I agree with how the AUR cleanup was proposed, but except for the reason mentioned, I don't see any really good reason for doing this with the repository.
-- Jabber: atsutane@freethoughts.de Blog: http://atsutane.freethoughts.de/ Key: 295AFBF4 FP: 39F8 80E5 0E49 A4D1 1341 E8F9 39E4 F17F 295A FBF4
On Tue 26 Oct 2010 16:36 +0200, Xyne wrote:
On 2010-10-26 08:20 +0800 (43:2) Ray Rashif wrote:
I take back part of what I mentioned earlier. There are indeed some packages that I believe no one uses. The best way to handle this is to selectively remove each package that we still want to keep from the wiki list. I've added a filter list, so remove from there (and not the original). Wiki diffs would tell us what has been removed (and by whom).
Set up a timeframe along with an official discussion period for this, i.e. how long we have until the filter list is final. And then the voting, if needed.
I can see the point of removing orphans but I still think that using pkgstats as a metric is a bad idea for everything else. Casual users, i.e. those who are not actively involved on the forum or IRC won't even be aware of pkgstats. Really, who installs a distro and actively looks for a way to submit user data?
And please don't try to tell me that the only users who matter are the ones who form the core community.
I wouldn't say that. I would say that the only users who matter are the ones that participate. For example you can't justly complain about the results of an election if you haven't educated yourself about it and voted. I believe that before any action is taken to move packages back to unsupported there should be a public notice, and users should be able to give feedback.
Several of those packages are niche packages too (e.g. python-sympy, vtk, avogadro), but ones that are important within their niche. If they are actively maintained then I see no reason to remove them even if they are not commonly used by the subset of users who submit stats.
As it stands, I would support removal of the orphaned packages listed above but not the list based on pkgstats alone. We need a better usage metric for repo packages.
Let's be clear here. This isn't about removal of packages. It's about moving packages from one repo to another. Community to aur/unsupported.
Personally I think it would be better to implement a simple online vote and inform users that a package is a candidate for removal in a post_upgrade or post_install message. Users could then vote to keep the package and if it passes a threshold (e.g. 10, as required by AUR), then it does not get removed.
Hmm, now that's an interesting idea. I like the idea of people giving feedback, and voting. I'm not too keen on putting it in a package's install scripts though.
On Tue, Oct 26, 2010 at 6:53 PM, Loui Chang <louipc.ist@gmail.com> wrote:
I wouldn't say that. I would say that the only users who matter are the ones that participate. For example you can't justly complain about the results of an election if you haven't educated yourself about it and voted.
The thing is, we're not voting on a single package that we feel is better than another package, so we're not looking for informed opinions...we're trying to establish objective/accurate usage numbers for every single package across all Arch Linux users (or at least a statistically appropriate sample of Arch Linux users), which is unrelated to people's activity in the community.
Let's be clear here. This isn't about removal of packages. It's about moving packages from one repo to another. Community to aur/unsupported.
I don't think there's any confusion over these semantics, and I'd point out that there's a large difference between moving a package from Community to Unsupported compared with moving a package from Extra to Community. The fact that Community packages are available by default to all Arch users in binary form is a huge plus...the AUR is a fantastic resource, but there's no built-in way for users to track changes in Unsupported automatically. Also, moving packages from Community to Unsupported can be confusing for users who are expecting binary updates and don't use a wrapper like yaourt that will tell them about the updates after we remove packages from Community. -- Aaron "ElasticDog" Schaefer
On Wed 27 Oct 2010 08:35 -0700, Aaron Bull Schaefer wrote:
On Tue, Oct 26, 2010 at 6:53 PM, Loui Chang <louipc.ist@gmail.com> wrote:
I wouldn't say that. I would say that the only users who matter are the ones that participate. For example you can't justly complain about the results of an election if you haven't educated yourself about it and voted.
The thing is, we're not voting on a single package that we feel is better than another package, so we're not looking for informed opinions...we're trying to establish objective/accurate usage numbers for every single package across all Arch Linux users (or at least a statistically appropriate sample of Arch Linux users), which is unrelated to people's activity in the community.
Sorry that wasn't meant as a direct example. I'm just saying if people don't voice what packages are important, then Devs and TUs shouldn't have to worry about maintaining them so much. The larger community can maintain them in the AUR. Arch is based on community involvement and participation. If something isn't done by a TU or dev, then a user can take the initiative to implement it him/herself. Otherwise don't complain.
Let's be clear here. This isn't about removal of packages. It's about moving packages from one repo to another. Community to aur/unsupported.
I don't think there's any confusion over these semantics, and I'd point out that there's a large difference between moving a package from Community to Unsupported compared with moving a package from Extra to Community. The fact that Community packages are available by default to all Arch users in binary form is a huge plus...the AUR is a fantastic resource, but there's no built-in way for users to track changes in Unsupported automatically. Also, moving packages from Community to Unsupported can be confusing for users who are expecting binary updates and don't use a wrapper like yaourt that will tell them about the updates after we remove packages from Community.
Indeed, binary packages are quite convenient but I believe that should be a privilege reserved for the more commonly used packages. It is unfortunate that we lack a good universal system to track and update source based packages, but I don't think it necessarily means unused packages should remain in a binary repo. I expect our savvy users to be able to figure out what's happening, especially if we ask for their input and announce any moves beforehand.
participants (24)
- Aaron Bull Schaefer
- Allan McRae
- Andrea Scarpino
- archlinux@michael.trunner.de
- Brieuc ROBLIN
- Brieuc Roblin
- Christopher Brannon
- Cédric Girard
- Eric Bélanger
- Florian Pritz
- Gianni Vialetto
- Ionuț Bîru
- Kaiting Chen
- Loui Chang
- Lukas Fleischer
- Michael Düll
- Pierre Schmitz
- PyroPeter
- Ray Rashif
- Sergej Pupykin
- Stefan Husmann
- Thorsten Töpper
- Xyne
- Ángel Velásquez