[pacman-dev] [PATCH] Allow package to display a brief message before sync install
The aim of this is to alert a user to system/package breaking updates before they happen and before they approve the sync install. This is intended primarily for kernel/initscripts/pacman, etc updates when things could go really wrong and need to be known beforehand. Example output below.
This adds an alert="" option to the PKGBUILD. This entry is then stored in the package and then the db with repo-add. On a pacman sync operation if any has an alert message it will be displayed before "Proceed with installation"
This is a really basic implementation that I'm sure could be brushed up, as I've not used C for a while. However the pacman code is very clean and easy to read, so that made it pretty trivial to add.
There's an example repo with one package "alert" at http://mess.iphitus.org/alert-test/
The PKGBUILD for the aforementioned package: http://mess.iphitus.org/alert-test/PKGBUILD
Attached patch is against latest git.
James
# Example output (note: the format is pkgname: message)
iphitus(~/projects/alert/pacman)$ sudo pacman -S alert
resolving dependencies...
looking for inter-conflicts...
Targets (1): alert-1-1
Total Download Size: 0.00 MB
Total Installed Size: 0.00 MB
alert: This package has an alert message. It may contain tacos.
Proceed with installation? [Y/n]
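For readers who want the proposed behaviour in a compact form, here is a minimal sketch in Python. It is not the attached patch (which is C against pacman's git) and it does not use any pacman API. The dictionary keys, the function names `format_alerts` and `confirm_install`, and the way the answer is read are all assumptions made for illustration. It only shows the idea of printing one "pkgname: message" line per target that carries an alert before asking for confirmation.

```python
def format_alerts(targets):
    """Return one 'pkgname: message' line per target that carries an alert."""
    lines = []
    for pkg in targets:
        # Hypothetical field mirroring the alert="" entry from the PKGBUILD.
        alert = pkg.get("alert")
        if alert:  # packages without an alert (or with an empty one) stay silent
            lines.append(f"{pkg['name']}: {alert}")
    return lines


def confirm_install(targets, ask=input):
    """Print any alerts, then ask the usual question; True means proceed."""
    for line in format_alerts(targets):
        print(line)
    answer = ask("Proceed with installation? [Y/n] ").strip().lower()
    # An empty answer counts as yes, matching the capital Y in the prompt.
    return answer in ("", "y", "yes")
```

Passing `ask` as a parameter keeps the sketch testable without a terminal; a real front end would read from the user directly.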
On Mon, Sep 14, 2009 at 3:03 PM, James Rayner <iphitus@iphitus.org> wrote:
The aim of this is to alert a user to system/package breaking updates before they happen and before they approve the sync install. This is intended primarily for kernel/initscripts/pacman, etc updates when things could go really wrong and need to be known beforehand. Example output below.
This adds an alert="" option to the PKGBUILD. This entry is then stored in the package and then the db with repo-add. On a pacman sync operation if any has an alert message it will be displayed before "Proceed with installation"
This is a really basic implementation that I'm sure could be brushed up, as I've not used C for a while. However the pacman code is very clean and easy to read, so that made it pretty trivial to add.
There's an example repo with one package "alert" at http://mess.iphitus.org/alert-test/
The PKGBUILD for the aforementioned package: http://mess.iphitus.org/alert-test/PKGBUILD
Attached patch is against latest git.
James
We also had a request to display messages at the end of a transaction: http://bugs.archlinux.org/task/12861
and a request to move the display of messages from scriptlets to the PKGBUILD: http://bugs.archlinux.org/task/1571
Should we think about a more general approach here?
On Mon 14 Sep 2009 17:26 +0200, Xavier wrote:
We also had request of displaying messages at the end of a transaction : http://bugs.archlinux.org/task/12861
and request to move displaying of messages from scriptlets to pkgbuild : http://bugs.archlinux.org/task/1571
Should we think about a more general approach here ?
I think these messages should be in the ChangeLog, as any alerts or release notes should be, and they shouldn't be printed during installation unless the user specifically asks for it.
2009/9/14 Loui Chang <louipc.ist@gmail.com>
I think these messages should be in the ChangeLog, as any alerts or release notes should be, and they shouldn't be printed during installation unless the user specifically asks for it.
Also, these messages could be echo "message here" lines in the post_install/post_upgrade functions of the package.install file. Yes it prints after install, but like it being in the ChangeLog as Loui suggested, the user can always open up the .install file and see it before they do the upgrade and the ChangeLog doesn't print anything anywhere....
On Mon, Sep 14, 2009 at 10:26 AM, Xavier <shiningxc@gmail.com> wrote:
We also had request of displaying messages at the end of a transaction : http://bugs.archlinux.org/task/12861
and request to move displaying of messages from scriptlets to pkgbuild : http://bugs.archlinux.org/task/1571
Should we think about a more general approach here ?
I don't really know what to think here. I had looked at that messages one for a long time and thought it was a decent idea, but never went far enough to take it and run with it.
@Loui- sure, but this is for extraordinary messages- a lot more exclusive than ChangeLog-worthy stuff, and you have to explicitly request to see that anyway.
@Jeff- it isn't exactly straightforward to view an install script beforehand, and the post_install business is a rather hacky reason for needing an install script.
-Dan
On Tue, Sep 15, 2009 at 11:20 AM, Dan McGee <dpmcgee@gmail.com> wrote:
I don't really know what to think here. I had looked at that messages one for a long time and thought it was a decent idea, but never went far enough to take it and run with it.
@Loui- sure, but this is for extraordinary messages- a lot more exclusive than ChangeLog-worthy stuff, and you have to explicitly request to see that anyway.
@Jeff- it isn't exactly straightforward to view an install script beforehand, and the post_install business is a rather hacky reason for needing an install script.
-Dan
Dan's got the idea...
pacman should not break someone's system without at least telling them first. So yes - this is intended for more extraordinary messages.
The current ways of informing the user (homepage/forum news and post-install) are broken and non-simple:
- both polling based
- both have too much noise
In the last year there's been one or maybe two news items where I've actually had to take action. 1/365 chance of breaking something -- I'll take those odds and deal with it afterwards. Even if I only Syu weekly, that's 1/52. Would you bet on odds 98% in your favour? So is it really surprising that people _don't_ read the news?
My patch might not be the ideal way of implementing some sort of pre-install alert and definitely has a few pitfalls, however it works and is an example of a very simple implementation that could be extended.
James
On Tue, Sep 15, 2009 at 9:14 PM, James Rayner <iphitus@iphitus.org> wrote:
Dan's got the idea...
pacman should not break someone's system without at least telling them first. So yes - this is intended for more extraordinary messages.
The current ways of informing the user (homepage/forum news and post-install) are broken and non-simple: - both polling based
oh, and post-install is after the fact - when the system is broken, so it's not a very good way of informing the user that their system "will break" because it's already broken.
Anyway, I'm all for a more generalised/ideal setup, but that's been wanted for a while with no patches coming forward.
On Tue 15 Sep 2009 21:18 +1000, James Rayner wrote:
Dan's got the idea...
pacman should not break someone's system without at least telling them first. So yes - this is intended for more extraordinary messages.
The current ways of informing the user (homepage/forum news and post-install) are broken and non-simple: - both polling based
oh, and post-install is after the fact - when the system is broken, so it's not a very good way of informing the user that their system "will break" because it's already broken.
Anyway, I'm all for a more generalised/ideal setup, but that's been wanted for a while with no patches coming forward.
The user should be made aware that there is a ChangeLog, and they have a means of easily reading that before installation or upgrade. There's no need to bloat makepkg, PKGBUILDs, and pacman. You only need to add a little to pacman like this.
On Tue, Sep 15, 2009 at 2:17 PM, Loui Chang <louipc.ist@gmail.com> wrote:
The user should be made aware that there is a ChangeLog, and they have a means of easily reading that before installation or upgrade.
There's no need to bloat makepkg, PKGBUILDs, and pacman. You only need to add a little to pacman like this.
Changelogs cannot be read before a transaction because they are not in the sync databases, only in the local one. IIRC, Nagy was complaining about that some time ago.
On Tue 15 Sep 2009 14:27 +0200, Xavier wrote:
On Tue, Sep 15, 2009 at 2:17 PM, Loui Chang <louipc.ist@gmail.com> wrote:
The user should be made aware that there is a ChangeLog, and they have a means of easily reading that before installation or upgrade.
There's no need to bloat makepkg, PKGBUILDs, and pacman. You only need to add a little to pacman like this.
changelog can not be read before a transaction because they are not in the sync databases, only in local one. IIRC, Nagy was complaining about that some times ago.
It wouldn't be too difficult to implement would it? Just trying to brainstorm a better method for this alert thing. ChangeLogs don't seem to get much use right now, but this is the perfect application for it. Maybe we can see more people using it instead of echoing messages via the install scriptlets as well. That annoys the hell out of me.
On Tue, 15 Sep 2009 at 21:18, James Rayner wrote:
Dan's got the idea...
pacman should not break someone's system without at least telling them first. So yes - this is intended for more extraordinary messages.
The current ways of informing the user (homepage/forum news and post-install) are broken and non-simple: - both polling based
oh, and post-install is after the fact - when the system is broken, so it's not a very good way of informing the user that their system "will break" because it's already broken.
Anyway, I'm all for a more generalised/ideal setup, but that's been wanted for a while with no patches coming forward.
OK. Here is my standpoint (not closely related to iphitus's patch, but some thoughts about the "problem"):
1. echo lines in install scriptlets are stupid. I bet that you also looked into install scriptlets in /var/lib/pacman/... many times manually to read that information on an installed package (when something went wrong). I think this requires a new %INFO% field in (local) database, which could be accessed by -Q. Drawback: pre_install, post_install, pre_upgrade etc. is more sophisticated. (It is possible to only print info if we upgrade version older than...)
2. I am not sure about the pre-transaction messages. We ask for user confirmation before downloading packages, so in order to print info/alarm etc. messages then, we _must_ store this info in sync database, or interrupt the transaction once more before actual install. post-transaction messages are easier to implement, see 1. Iphitus chooses putting %ALERT% to syncdb.
Overall, I think iphitus's patch is a good compromise, if we want to distinguish important and non-important messages.
My problem is that I don't see when the packager should remove %ALERT% from package, in 1.0-2, 1.1-2, 2.0-1? When I've read (and understood) the alert message, printing it again is just spam.
Bye
Nagy Gabor wrote:
OK. Here is my standpoint (not closely related to iphitus's patch, but some thoughts about the "problem"):
1. echo lines in install scriptlets are stupid. I bet that you also looked into install scriptlets in /var/lib/pacman/... many times manually to read that information on an installed package (when something went wrong). I think this requires a new %INFO% field in (local) database, which could be accessed by -Q. Drawback: pre_install, post_install, pre_upgrade etc. is more sophisticated. (It is possible to only print info if we upgrade version older than...)
2. I am not sure about the pre-transaction messages. We ask for user confirmation before downloading packages, so in order to print info/alarm etc. messages then, we _must_ store this info in sync database, or interrupt the transaction once more before actual install. post-transaction messages are easier to implement, see 1. Iphitus chooses putting %ALERT% to syncdb.
Overall, I think iphitus's patch is a good compromise, if we want to distinguish important and non-important messages.
My problem is that I don't see when the packager should remove %ALERT% from package, in 1.0-2, 1.1-2, 2.0-1? When I've read (and understood) the alert message, printing it again is just a spam.
That point is what I have been thinking about all day but there is no easy solution there as far as I can tell. I have the same issue with deciding when to remove provides lines... Allan
On Tuesday, 15.09.2009, at 22:53 +1000, Allan McRae wrote:
Nagy Gabor wrote:
OK. Here is my standpoint (not closely related to iphitus's patch, but some thoughts about the "problem"):
1. echo lines in install scriptlets are stupid. I bet that you also looked into install scriptlets in /var/lib/pacman/... many times manually to read that information on an installed package (when something went wrong). I think this requires a new %INFO% field in (local) database, which could be accessed by -Q. Drawback: pre_install, post_install, pre_upgrade etc. is more sophisticated. (It is possible to only print info if we upgrade version older than...)
2. I am not sure about the pre-transaction messages. We ask for user confirmation before downloading packages, so in order to print info/alarm etc. messages then, we _must_ store this info in sync database, or interrupt the transaction once more before actual install. post-transaction messages are easier to implement, see 1. Iphitus chooses putting %ALERT% to syncdb.
Overall, I think iphitus's patch is a good compromise, if we want to distinguish important and non-important messages.
My problem is that I don't see when the packager should remove %ALERT% from package, in 1.0-2, 1.1-2, 2.0-1? When I've read (and understood) the alert message, printing it again is just a spam.
That point is what I have been thinking about all day but there is no easy solution there as far as I can tell. I have the same issue with deciding when to remove provides lines...
Allan
How about detaching the alerts from the actual package itself and add it only to the repo like we do for deltas? We then could add a logic to it to just print it for versions older than the flagged version by default...
This of course needs some new tools for the repo to add/modify/remove alerts.
I think this could be a way around this problem...
Marc
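The "only print it for versions older than the flagged version" idea can be sketched as below. This is an illustration under stated assumptions, not an existing pacman or repo tool: `version_key` and `should_show_alert` are hypothetical names, the comparison only looks at the numeric parts of `pkgver-pkgrel` and is much simpler than pacman's real version comparison, and staying silent on a fresh install is a guess at the desired behaviour.

```python
import re


def version_key(version):
    """Split 'pkgver-pkgrel' into two tuples of integers, e.g. '1.0-2' -> ((1, 0), (2,)).

    Simplification: only digits are considered, so this is NOT pacman's vercmp.
    """
    pkgver, _, pkgrel = version.partition("-")

    def numbers(text):
        return tuple(int(n) for n in re.findall(r"\d+", text))

    return numbers(pkgver), numbers(pkgrel)


def should_show_alert(installed_version, flagged_version):
    """Show the alert only when upgrading from something older than the flagged version."""
    if installed_version is None:
        # Assumption: a fresh install has no old state that could break.
        return False
    return version_key(installed_version) < version_key(flagged_version)
```

With the alert flagged at 1.1-1, a user on 1.0-2 would see it once when upgrading, and a user already on 1.1-1 or 2.0-1 would not, which addresses the concern that a repeated alert is just spam.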
On Tuesday, 15.09.2009, at 15:09 +0200, Marc - A. Dahlhaus [ Administration | Westermann GmbH ] wrote:
How about detaching the alerts from the actual package itself and add it only to the repo like we do for deltas? We then could add a logic to it to just print it for versions older than the flagged version by default...
This of course needs some new tools for the repo to add/modify/remove alerts.
I think this could be a way around this problem...
Marc
OT: is there a tool for removing deltas from repo archives?
Marc - A. Dahlhaus [ Administration | Westermann GmbH ] wrote:
OT: is there a tool for removing deltas from repo archives?
Please change subject lines when going off topic.
repo-remove... but I am assuming you mean some automated tool to remove "useless" deltas? Not yet, and I doubt there will be until someone starts using deltas in a repo and finds the need to code one.
Allan
On Tuesday, 15.09.2009, at 23:14 +1000, Allan McRae wrote:
Marc - A. Dahlhaus [ Administration | Westermann GmbH ] wrote:
OT: is there a tool for removing deltas from repo archives?
Please change subject lines when going off topic.
Will do next time.
repo-remove... but I am assuming you mean some automated tool to remove "useless" deltas? Not yet, and I doubt there will be until someone starts using deltas in a repo and finds the need to code one.
Well, I'll raise my hand then and look into it, as we have been using deltas quite heavily here lately.
Thanks,
Marc
Marc - A. Dahlhaus [ Administration | Westermann GmbH ] wrote:
On Tuesday, 15.09.2009, at 23:14 +1000, Allan McRae wrote:
repo-remove... but I am assuming you mean some automated tool to remove "useless" deltas? Not yet, and I doubt there will be until someone starts using deltas in a repo and finds the need to code one.
Well, i rise my hand then and look into it as we use deltas quite heavily lately here.
That would be great. From memory, pacman does not use a chain of deltas if the total download is greater than 90% of just downloading the full package. That is probably a good criterion to use in order to decide which deltas to remove.
Allan
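As a rough sketch of that criterion: a delta chain is worth keeping only if its total size stays at or below a fixed fraction of the full package size. The 90% figure is the one quoted above from memory, so it is kept as an adjustable assumption here. The function names, the `chains` mapping and the chain labels are hypothetical and are not part of repo-add or repo-remove.

```python
# The 90% figure is quoted "from memory" in the thread; treat it as an assumption.
MAX_DELTA_RATIO = 0.9


def delta_chain_is_useful(delta_sizes, full_package_size, max_ratio=MAX_DELTA_RATIO):
    """True when downloading the whole chain is small enough compared to the full package."""
    return sum(delta_sizes) <= max_ratio * full_package_size


def removable_deltas(chains, full_package_size, max_ratio=MAX_DELTA_RATIO):
    """Return the (sorted) names of chains that fail the size criterion.

    `chains` maps a hypothetical chain label to the list of delta sizes in bytes.
    """
    return sorted(
        name
        for name, sizes in chains.items()
        if not delta_chain_is_useful(sizes, full_package_size, max_ratio)
    )
```

For a 1000-byte package, a single 300-byte delta would be kept, while a two-step chain of 500 and 450 bytes (950 in total) would be listed for removal.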
On Wednesday, 16.09.2009, at 16:18 +1000, Allan McRae wrote:
That would be great. From memory, pacman does not use a chain of deltas if the total download is greater than 90% of just downloading the full package. That is probably a good criteria to use to in order to decide which deltas to remove.
Actually, at first I got confused, as repo-remove is documented to take a pkgname as a parameter and remove the whole package (including deltas) from the repo. Passing the delta package filename to remove only that delta doesn't fit well with that documentation. I'll try to add this to the man page and usage output during my work on this topic...
I'd like to add a -d|--delta option to repo-add to create the delta between the current package and the one to be added.
For option parsing I plan to make use of getopt (it is already used by makepkg so it shouldn't be a problem).
I've also thought about adding a -c|--cleanup option to repo-add and repo-remove that would delete unused files from the repo directory. If both -d and -c are given, then it should also remove the "unused" deltas from the repo and remove the corresponding files.
But as I'm unsure whether adding an option to do automatic removal of package files is really a good idea for upstream in the first place, I think I'll first do a combined creation, adding and cleanup of "unused" deltas with the -d param for repo-add.
What do you think about it?
Marc
Marc - A. Dahlhaus [ Administration | Westermann GmbH ] wrote:
Actually first i got confused as repo-remove is documented to take a pkgname as param to remove the whole package (including deltas) from the repo. To use the deltapackage-filename as option to remove only that delta doesn't fitted well into that documentation. I'll try to add this to the man page and usage outputs during my work on this topic...
I'd like to add a -d|--delta option to repo-add to create the delta between the current package and the one to be added.
For option parsing i plan to make usage of getopt (it is already used by makepkg so it shouldn't be a problem).
I will fully read your proposal later, but I want to flag that makepkg does not use getopt anymore but its own bash parser. This was because of portability issues. I also notice that the reference to getopt at the top of makepkg has not been removed... Allan
On Wed, Sep 16, 2009 at 5:45 PM, Allan McRae <allan@archlinux.org> wrote:
Marc - A. Dahlhaus [ Administration | Westermann GmbH ] wrote:
I will fully read your proposal later, but I want to flag that makepkg does not use getopt anymore but its own bash parser. This was because of portability issues. I also notice that the reference to getopt at the top of makepkg has not been removed...
damn, you beat me to it, this is exactly the mail I was going to write :) (I also just noticed the reference :D)
On Thursday, 17.09.2009, at 01:45 +1000, Allan McRae wrote:
Marc - A. Dahlhaus [ Administration | Westermann GmbH ] wrote:
Actually first i got confused as repo-remove is documented to take a pkgname as param to remove the whole package (including deltas) from the repo. To use the deltapackage-filename as option to remove only that delta doesn't fitted well into that documentation. I'll try to add this to the man page and usage outputs during my work on this topic...
I'd like to add a -d|--delta option to repo-add to create the delta between the current package and the one to be added.
For option parsing i plan to make usage of getopt (it is already used by makepkg so it shouldn't be a problem).
I will fully read your proposal later, but I want to flag that makepkg does not use getopt anymore but its own bash parser. This was because of portability issues. I also notice that the reference to getopt at the top of makepkg has not been removed...
Thanks for pointing that out. I only did a quick look at the outputs of a recursive grep for getopt but missed that it only found it in some comments... Marc
On Wednesday, 16.09.2009, 18:05 +0200, Marc - A. Dahlhaus [ Administration | Westermann GmbH ] wrote:
Thanks for pointing that out. I only did a quick look at the outputs of a recursive grep for getopt but missed that it only found it in some comments...
As makepkg's shebang is /bin/bash, why don't we use bash's getopts builtin in the first place? Was there a reason not to use it? Using it could reduce code size (I will look into it if that is desired) and would also not be a portability issue IMO. Marc
On Wed, Sep 16, 2009 at 11:14 AM, Marc - A. Dahlhaus [ Administration | Westermann GmbH ] <mad@wol.de> wrote:
As makepkg's shebang is /bin/bash, why don't we use bash's getopts builtin in the first place? Was there a reason not to use it?
Using it could reduce code size (I will look into it if that is desired) and would also not be a portability issue IMO.
It's not portable
On Wed, Sep 16, 2009 at 6:39 PM, Aaron Griffin <aaronmgriffin@gmail.com> wrote:
It's not portable
A bash builtin should be the most portable thing :)
We used getopts in the beginning, but it was changed to GNU getopt (probably because supporting both long and short options is much easier), and then we had to move to our own implementation because of portability problems.
This should give you all the information about the history: http://projects.archlinux.org/?p=pacman.git&a=search&h=HEAD&st=commit&s=getopt
When the portability problem was raised, I remember a long discussion about the different alternatives, like moving back to getopts (even if handling short/long is ugly), removing long options (my favorite solution), and using our own implementation (the way that was chosen). Search the mailing list archives if you are interested.
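For illustration only, a hand-rolled parser of the kind described above can be sketched in a few lines of shell. This is not makepkg's actual parser: the function name parse_args, the variables CLEAN, DELTA and REST, and the option names (borrowed from the -d|--delta and -c|--cleanup ideas in this thread) are assumptions made for the example. It also assumes that arguments contain no whitespace and does not handle bundled short options such as -cd.

```bash
#!/bin/bash
# Hypothetical sketch, not makepkg's real parser.
# Handles short and long spellings of each flag without getopt(1) or getopts.
parse_args() {
    CLEAN=0
    DELTA=0
    REST=""
    while [ $# -gt 0 ]; do
        case "$1" in
            -c|--cleanup) CLEAN=1 ;;
            -d|--delta)   DELTA=1 ;;
            --)           # everything after "--" is a plain argument
                          shift
                          [ $# -gt 0 ] && REST="${REST}${REST:+ }$*"
                          break ;;
            -*)           echo "unknown option: $1" >&2; return 1 ;;
            *)            REST="${REST}${REST:+ }$1" ;;
        esac
        shift
    done
    return 0
}
```

Calling parse_args -d --cleanup repo.db.tar.gz pkg.tar.gz sets DELTA and CLEAN to 1 and leaves the two file names in REST. Each long option costs one extra pattern in the case statement, which is the kind of short/long pairing that getopts alone does not give you.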
On Wed, Sep 16, 2009 at 12:12 PM, Xavier <shiningxc@gmail.com> wrote:
a bash builtin should be the most portable thing :)
Doh, I skipped over the "builtin" part :S
We used getopts in the beginning, but it was changed to GNU getopt (probably because supporting both long and short options is much easier), and then we had to move to our own implementation because of portability problems.
This is what I meant
On Wed, 16 Sep 2009 13:40:37 -0500 Aaron Griffin <aaronmgriffin@gmail.com> wrote:
We used getopts in the beginning, but it was changed to GNU getopt (probably because supporting both long and short options is much easier), and then we had to move to our own implementation because of portability problems.
This is what I meant
My 2 cents: portability is important, and code conciseness is more important than having a fancy interface with many possibilities. Isn't it just redundant/bloat to support both long and short ones? Dieter
Dieter Plaetinck wrote:
My 2 cents: portability is important, and code conciseness is more important than having a fancy interface with many possibilities. Isn't it just redundant/bloat to support both long and short ones?
No.... there are only so many letters of the alphabet and we still have many long options without letters assigned to them that have no obvious shortening. Allan
On Wed, Sep 16, 2009 at 6:08 PM, Allan McRae <allan@archlinux.org> wrote:
Dieter Plaetinck wrote:
My 2 cents: portability is important, and code conciseness is more important than having a fancy interface with many possibilities. Isn't it just redundant/bloat to support both long and short ones?
No.... there are only so many letters of the alphabet and we still have many long options without letters assigned to them that have no obvious shortening.
No, if anything I'd drop short options and keep long, but there would be a lot of pushback. Does it matter if it works, and several people here can understand it well enough? Most people didn't even know we did this until today and we surely didn't hear objections when it went in, so it clearly isn't that bad... -Dan
On Wed, 16 Sep 2009 18:10:24 -0500 Dan McGee <dpmcgee@gmail.com> wrote:
On Wed, Sep 16, 2009 at 6:08 PM, Allan McRae <allan@archlinux.org> wrote:
Dieter Plaetinck wrote:
My 2 cents: portability is important, and code conciseness is more important than having a fancy interface with many possibilities. Isn't it just redundant/bloat to support both long and short ones?
No.... there are only so many letters of the alphabet and we still have many long options without letters assigned to them that have no obvious shortening.
No, if anything I'd drop short options and keep long, but there would be a lot of pushback.
Does it matter if it works, and several people here can understand it well enough? Most people didn't even know we did this until today and we surely didn't hear objections when it went in, so it clearly isn't that bad...
-Dan
Hmm, this was about makepkg, right?
makepkg --help | grep '-' | wc -l
24
Lowercase + uppercase gives you 52 options.
Dieter
Dieter Plaetinck wrote:
Hmm this was about makepkg right? makepkg --help | grep '-' | wc -l 24
lowercase + uppercase gives you 52 options.
Right... We have these options without a short version (e.g.) --source --allsource --asroot. Now -r -R -A -s are already used. What would you choose? There are 52 options, but I doubt z, y, x... will ever make sense.
Anyway, this is not being changed. If you can provide a single case where the old implementation parses options differently than the current bash implementation, I will be surprised and reward you with a cookie before fixing it.
This was discussed to death ages ago. Go back and read about it.
Allan
On Thu, 17 Sep 2009 18:58:07 +1000 Allan McRae <allan@archlinux.org> wrote:
Dieter Plaetinck wrote:
Hmm this was about makepkg right? makepkg --help | grep '-' | wc -l 24
lowercase + uppercase gives you 52 options.
Right... We have these options without a short version (e.g.) --source --allsource --asroot. Now -r -R -A -s are already used. What would you choose? There are 52 options, but I doubt z, y, x... will ever make sense.
Anyway, this is not being changed. If you can provide a single case where the old implementation parses options differently than the current bash implementation, I will be surprised and reward you with a cookie before fixing it.
This was discussed to death ages ago. Go back and read about it.
Allan
This has nothing to do with the new implementation behaving differently from the old one or not. I just raised my concerns (without even being familiar with makepkg source code) and you gave some good counterarguments. I'm fine with that. Sometimes I raise some points, not because I'm convinced they are the way to go, but because it can be interesting for everyone. However, I would like one of those cookies. Dieter
Xavier wrote:
a bash builtin should be the most portable thing :)
We used getopts in the beginning, but it was changed to GNU getopt (probably because supporting both long and short options is much easier), and then we had to move to our own implementation because of portability problems.
This should give you all the information about the history : http://projects.archlinux.org/?p=pacman.git&a=search&h=HEAD&st=commit&s=getopt
When the portability problem was raised, I remember a long discussion about the different alternatives. Like moving back to getopts (even if handling short/long is ugly), removing long options (my favorite solution) and using our own implementation (the way which was chosen). Search the mailing list archives if you are interested.
Thanks. :) Marc
On Wed, Sep 16, 2009 at 5:40 PM, Marc - A. Dahlhaus [ Administration | Westermann GmbH ] <mad@wol.de> wrote:
Actually, at first I got confused, as repo-remove is documented to take a pkgname as a parameter to remove the whole package (including deltas) from the repo. Using the delta package filename as an option to remove only that delta didn't fit well with that documentation. I'll try to add this to the man page and usage output during my work on this topic...
Right, a separate patch for improving the doc would be nice.
I'd like to add a -d|--delta option to repo-add to create the delta between the current package and the one to be added.
see attached patch (6 months old :P) I don't think I ever proposed it. I am not sure why. Probably because I don't know how repo maintainers want to use delta ...
I've also thought about adding a -c|--cleanup option to repo-add and repo-remove that would delete unused files from the repo directory. If both -d and -c are given, then it should also remove the "unused" deltas from the repo and remove the corresponding files.
But as I'm unsure whether adding an option to do automatic removal of package files is really a good idea for upstream in the first place, I think I'll first do a combined creation, adding and cleanup of "unused" deltas with the -d param for repo-add.
How do you plan to clean up deltas?
It's pacman that would know best which deltas it will never use. I was thinking about refactoring/extending the delta code to expose a public function which would return a list of unused deltas for a given repo or something. Otherwise we need to re-implement/duplicate the delta code in pacman. Or did you have a simpler alternative in mind?
Xavier wrote:
see attached patch (6 months old :P)
I don't think I ever proposed it. I am not sure why. Probably because I don't know how repo maintainers want to use delta ...
Nice, I would use it ;o)
How do you plan to clean up deltas?
It's pacman that would know best which deltas it will never use. I was thinking about refactoring/extending the delta code to expose a public function which would return a list of unused deltas for a given repo or something. Otherwise we need to re-implement/duplicate the delta code in pacman. Or did you have a simpler alternative in mind?
I plan to do it based on the accumulated file size of all deltas that can be chained together from the repo package version without gaps. As long as this accumulated size is smaller than e.g. 90% of the repo package's size (as Allan mentioned in the other mail that pacman's logic is based on that), the delta will be kept.
It would make sense to use a recursion-based implementation.
A function with two params:
$1: version we want to chain down from
$2: accumulated size in bytes of deltas above us in the chain
The first call would look like:
function currents-package-version 0
Inside the function we read all source versions and file sizes of deltas containing "_to_${1}" in the filename and call the function for each match with the source version and the filesize+$2. We check every recursion's retval. If we get a 0 retval for a delta, we add it to the next repo archive. We return 1 if $2 is larger than the repo package's filesize*0.9. The last thing in this function would be a "return 0".
This would walk down the chain as far as it can, and then on its way back up to the first caller it will add the appropriate deltas to the repo. Deltas in the old repo that can't be chained to other deltas and don't have the repo package version as target get removed, as they will not be found by the recursion.
But I haven't read through the current code in repo-add or pacman's code for the incremental delta handling so far, so this is a very early draft of how I would do it and I could be very far out in the woods with it... :-D
OT: we could also add a configure option to select the maximum percentage at which we want to use incremental deltas...
Marc
On Wed, Sep 16, 2009 at 10:35 PM, Marc - A. Dahlhaus <mad@wol.de> wrote:
I plan to do it based on the accumulated file size of all deltas that can be chained together from the repo package version without gaps. As long as this accumulated size is smaller than e.g. 90% of the repo package's size (as Allan mentioned in the other mail that pacman's logic is based on that), the delta will be kept.
btw it's 70%, but that does not change anything.
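As a rough sketch of that threshold, assuming the 70% figure above and integer byte counts: the helper name chain_within_quota is made up for this example, and whether pacman treats a chain of exactly 70% as acceptable is not something this thread states.

```bash
#!/bin/bash
# Hypothetical helper: succeed if a delta chain is still worth keeping.
# $1: compressed size of the full package in bytes
# $2: accumulated size of the delta chain in bytes
chain_within_quota() {
    quota=$(( $1 * 7 / 10 ))   # 70% of the full download, integer arithmetic
    [ "$2" -le "$quota" ]
}

if chain_within_quota 1000 650; then
    echo "keep the chain"
else
    echo "drop the chain"
fi
```

Multiplying by 7 and dividing by 10 keeps everything in shell integer arithmetic, so no external tool such as bc is needed.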
It took me some time to understand how this would work when just reading that mail, but when I tried to implement it, it turned out to be very simple. Attached is a quick hack in bash I just wrote.
However, my current implementation is wildly inefficient. And currently I just print the deltas which are reachable and too big. So we don't get the unreachable ones.
It would indeed be better to just print the deltas that are reachable and have a good size, but then we would have to invert that list somehow.
I am still not sure what to do. I am not sure bash is an appropriate language for this task. And I am not sure this fits well in repo-add. Currently, repo-add does not do any parsing of the entries.
But half the code for cleaning up deltas is actually database parsing (we need filename and csize from desc, and every line from deltas).
I think I still prefer to hack the libalpm delta code to do what I want, just returning a list of bad deltas which can then be fed to repo-remove (and possibly another script taking care of removing the actual delta files from the FS).
On Thursday, 17.09.2009, 14:49 +0200, Xavier wrote:
On Wed, Sep 16, 2009 at 10:35 PM, Marc - A. Dahlhaus <mad@wol.de> wrote:
--8<--
It would make sense to use a recursion based implementation.
A function with two params: $1: version we want to chain down from $2: accumulated size in bytes of deltas above us in the chain
first call would look like:
function currents-package-version 0
inside of function we read all source-versions and filesizes of deltas containing "_to_${1}" in the filename and call function for each match with the source-version and the filesize+$2. We check every recursions retval. If we get a 0 retval for a delta, we would add it to the next repo archive. We return 1 if $2 is larger than repo packages filesize*0.9. Last thing in this function would be a "return 0".
--8<--
It took me some time to understand how this would work when just reading that mail, but when trying to implement it it revealed to be very simple. Attached a quick hack in bash I just wrote.
However, my current implementation is widely inefficient. And currently I just print the delta which are reachable and too big. So we don't get the unreachable ones.
This is because you check the size and echo the delta on the same level of the recursion, and the condition checking is wrong. We need to check the retval of the recursion and actually throw the error condition up one level when we exceed the size limit. The checking can only work one recursion level down from the current level because you push the actual delta's size one level down. Recursions are fun, aren't they? ;-D
It would indeed be better to just print delta reachable and with a good size, but then we would have to inverse that list somehow.
Please take a look at the attached version. It might be able to clear up what i wrote in the other mail a bit. Run it with the repositories root as PWD. I added some echos to it so that you can read the actual decisions the recursion would make from the output.
I am still not sure what to do.. I am not sure bash is an appropriate language for this task. And I am not sure this fits well in repo-add. Currently, repo-add does not do any parsing of the entries.
The cleanup will only run if the user asked for it, so I think it is OK if it needs some more time than it would without parsing the information.
But half the code for cleaning up delta is actually database parsing (need filename and csize from desc, and every line from deltas)
I think the fixed up demo implementation is clean enough to be useful inside of repo-add as read is able to fill more than just one variable.
I think I still prefer to hack libalpm delta code to do what I want, just returning a list of bad delta which can then be fed to repo-remove (and possibly another script taking care of removing the actual delta files from the FS)
If you think it is worth the effort given that the bash version isn't more than maybe 25 lines of code, go for it... ;-P Marc
On Thu, Sep 17, 2009 at 6:42 PM, Marc - A. Dahlhaus [ Administration | Westermann GmbH ] <mad@wol.de> wrote:
This is because you check the size and echo the delta on the same level of the recursion, and the condition checking is wrong. We need to check the retval of the recursion and actually throw the error condition up one level when we exceed the size limit. The checking can only work one recursion level down from the current level because you push the actual delta's size one level down.
Recursions are fun, aren't they? ;-D
Both scripts have the same output. For example:
38866 < 139532 : Ignoring dhcpcd-4.0.2-1-i686.pkg.tar.gz
38866 < 88713 : Ignoring dhcpcd-4.0.2-2-i686.pkg.tar.gz
38866 < 87880 : Ignoring dhcpcd-4.0.3-1-i686.pkg.tar.gz
38866 < 67312 : Ignoring dhcpcd-4.0.4-1-i686.pkg.tar.gz
38866 < 57650 : Ignoring dhcpcd-5.0.2-1-i686.pkg.tar.gz
vs
dhcpcd-3.2.1-1_to_4.0.2-1-i686.delta 139532
dhcpcd-4.0.2-1_to_4.0.2-2-i686.delta 88713
dhcpcd-4.0.2-2_to_4.0.3-1-i686.delta 87880
dhcpcd-4.0.3-1_to_4.0.4-1-i686.delta 67312
dhcpcd-4.0.4-1_to_5.0.2-1-i686.delta 57650
but I checked several other packages as well.
So I am not sure what you are trying to explain.
Actually I was using the retval first, I just changed it at the end, and kept the same result. But now I just realize there is something wrong about it, and I actually don't know how it works at all. I set deltaname and newsize inside a subshell so normally we cannot access these outside.
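The subshell behaviour mentioned here can be shown in isolation. In bash with default settings, every part of a pipeline runs in a subshell, so variables set inside a "while read" loop that is fed by a pipe are lost once the loop ends, whereas a loop fed by a redirection runs in the current shell. The variable names below are illustrative and are not taken from either script.

```bash
#!/bin/bash
# Loop fed by a pipe: the loop body runs in a subshell.
piped_count=0
printf 'a\nb\nc\n' | while read -r line; do
    piped_count=$((piped_count + 1))    # changes only the subshell's copy
done

# Loop fed by a here-document: the loop body runs in the current shell.
direct_count=0
while read -r line; do
    direct_count=$((direct_count + 1))  # visible after the loop
done <<EOF
a
b
c
EOF

echo "piped: ${piped_count}, direct: ${direct_count}"
```

With bash's default settings this prints "piped: 0, direct: 3". A recursive function that pipes grep into "while read" therefore gets a fresh copy of its variables at each level, which is convenient for keeping per-level state but means nothing set inside the loop survives it.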
Xavier wrote:
So I am not sure what you are trying to explain.
Scratch the point about the check being one level too high. I didn't realize that you reused the var that you passed down one level in the check.
Actually I was using the retval first, I just changed it at the end, and kept the same result. But now I just realize there is something wrong about it, and I actually don't know how it works at all. I set deltaname and newsize inside a subshell so normally we cannot access these outside.
I think we are looking at the same problem but from different angles.
What I think needs to be done in a cleanup run is to only care about the deltas that need to be added to the new repo. Your plan was to use the unneeded ones as an argument for repo-remove, right? We don't need to. We only need to recreate the new repo's "deltas" file content with the delta lines of the deltas we want to keep in the repo.
The recursion algorithm as proposed in my first mail takes care that it only adds the deltas that are directly chained to the package version in the repo, walking down as long as the "delta quota" is not reached. It will add the lowest delta version first on the return path, up to the newest. The rest of the deltas are unneeded.
The function's arguments and the vars created in its body are local. This is ok as we
Take a look at the attached version that actually implements the whole cleanup this way. It's fast, short and self-contained IMO.
What do you think? I'll integrate it into repo-add and post a patch tomorrow.
Marc

#!/bin/bash

# $1 : package filename
# $2 : sum of delta size to go from this package to the database one
cleanup() {
    grep "${1}$" "${dir}/deltas" | while read delta md5sum dsize spkg tpkg
    do
        if cleanup ${spkg} $(( ${2} + ${dsize} ))
        then
            echo "${delta} ${md5sum} ${dsize} ${spkg} ${tpkg}" \
                >> "${dir}/cdeltas"
        fi
    done
    [ ${2} -gt ${quota} ] && return 1
    if [ ${2} -eq 0 ]
    then
        rm -f "${dir}/deltas"
        [ -f "${dir}/cdeltas" ] && mv -f "${dir}/cdeltas" "${dir}/deltas"
    fi
    return 0
}

# list contents of repo from the root of it
ls -1 | while read dir
do
    if [ ! -f "${dir}/desc" -o ! -f "${dir}/deltas" ]
    then
        continue
    fi
    filename=$(grep -A1 FILENAME "${dir}/desc" | tail -n1)
    filesize=$(grep -A1 CSIZE "${dir}/desc" | tail -n1)
    quota=$(( ${filesize} * 7 / 10 ))
    cleanup ${filename} 0
done
Marc - A. Dahlhaus wrote:
The function's arguments and the vars created in its body are local. This is ok as we
only need to use them inside of the current level of the recursion. (I removed that part by accident.) Marc
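As a side note on scoping: in bash, positional parameters belong to each call, while other variables are private to a recursion level only if they are declared with `local` or assigned inside a subshell (such as a piped `while read` loop). A minimal sketch of the difference follows; the function and variable names are hypothetical and not taken from the script.

```shell
#!/bin/bash
# Without `local`, every recursion level writes to the same variable.
global_trace=""
walk_global() {
    level=$1
    if [ "$1" -lt 3 ]; then walk_global $(( $1 + 1 )); fi
    # After the nested call returns, `level` holds the deepest value.
    global_trace="${global_trace}${level} "
}

# With `local`, each recursion level keeps its own copy.
local_trace=""
walk_local() {
    local level=$1
    if [ "$1" -lt 3 ]; then walk_local $(( $1 + 1 )); fi
    local_trace="${local_trace}${level} "
}

walk_global 1
walk_local 1
echo "without local: ${global_trace}"
echo "with local:    ${local_trace}"
```

The first trace shows the deepest level's value at every level, while the second unwinds from the deepest level back to the first, each with its own value.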
On Thursday, 17.09.2009, at 22:13 +0200, Marc - A. Dahlhaus wrote:
Xavier wrote:
On Thu, Sep 17, 2009 at 6:42 PM, Marc - A. Dahlhaus [ Administration | Westermann GmbH ] <mad@wol.de> wrote:
--8<--
Both scripts have the same output. For example:
38866 < 139532 : Ignoring dhcpcd-4.0.2-1-i686.pkg.tar.gz
38866 < 88713 : Ignoring dhcpcd-4.0.2-2-i686.pkg.tar.gz
38866 < 87880 : Ignoring dhcpcd-4.0.3-1-i686.pkg.tar.gz
38866 < 67312 : Ignoring dhcpcd-4.0.4-1-i686.pkg.tar.gz
38866 < 57650 : Ignoring dhcpcd-5.0.2-1-i686.pkg.tar.gz
vs
dhcpcd-3.2.1-1_to_4.0.2-1-i686.delta 139532
dhcpcd-4.0.2-1_to_4.0.2-2-i686.delta 88713
dhcpcd-4.0.2-2_to_4.0.3-1-i686.delta 87880
dhcpcd-4.0.3-1_to_4.0.4-1-i686.delta 67312
dhcpcd-4.0.4-1_to_5.0.2-1-i686.delta 57650
but I checked several other packages as well.
Yes, I misplaced the returns inside of the loop, and I fixed this together with some other inefficiencies.

I can't work out an efficient implementation to print out the removed deltas, because to create such a list we need to finish the whole tree walk first. This is because there is always the possibility that you flag a delta for removal that is unused in the current branch of the tree but is useful on another branch of the tree:

  |  = quota reached
  -> = delta file

  V7 -> V6 -> V5 -> V4 -> V3 | -> V2 -> V1
  V7 -> V4 -> V3 -> V2 | -> V1

As the size of the V7_to_V4 delta could be smaller than the sum of the single-version deltas between V7 and V4, we would remove a useful V3_to_V2 delta if we settled on a "remove what is detected to be useless" algorithm...

Do we even want to print out the "useless" deltas in the first place?

I still need to look at how this fits into repo-add...

What I have so far:

delta_cleanup()
{
    if [ ${2} -gt ${quota} ]; then
        # we exceeded the quota, no need to walk further
        return 1
    else
        # find deltas that match our target on current level of the tree
        grep "${1}$" deltas | while read delta md5sum dsize spkg tpkg; do
            # walk down a level on the branch of each matching source
            if delta_cleanup ${spkg} $(( ${2} + ${dsize} )); then
                # don't add the same delta multiple times
                if ! grep -q "^${delta}" cleandeltas &> /dev/null; then
                    # add a useful delta to the list of cleandeltas
                    echo "${delta} ${md5sum} ${dsize} ${spkg} ${tpkg}" >> \
                        cleandeltas
                fi
            fi
        done
        return 0
    fi
}

# this must be done with the new repos tree as PWD
if [ $CLEAN -eq 1 ]; then
    ls -1 | while read dir; do
        if [ ! -f ${dir}/desc ]; then
            continue
        fi
        cd ${dir}
        filename=$(grep -A1 FILENAME desc | tail -n1)
        if [ $DELTA -eq 1 -a -f deltas ]; then
            filesize=$(grep -A1 CSIZE desc | tail -n1)
            quota=$(( ${filesize} * 7 / 10 ))
            # start walking the tree for the current package
            delta_cleanup ${filename} 0
            # remove the unclean deltas file
            rm -f deltas
            # move the cleandeltas file into its final position
            [ -f cleandeltas ] && mv -f cleandeltas deltas
        fi
        cd ..
    done
fi

Marc
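The two-branch situation from the diagram can be tried out with a small self-contained sketch. It follows the same recursive idea as `delta_cleanup`, but it is not the repo-add code: the package names (`foo-1` to `foo-7`), the sizes, the quota of 100, the temporary directory and the function name `walk` are all invented for illustration.

```shell
#!/bin/bash
# Sketch only: package names, sizes and the quota are invented test data.
workdir=$(mktemp -d)
deltas="${workdir}/deltas"
kept="${workdir}/cleandeltas"
quota=100

# Fields: delta md5sum dsize source-package target-package
cat > "${deltas}" <<EOF
foo-6_to_7.delta - 25 foo-6 foo-7
foo-5_to_6.delta - 25 foo-5 foo-6
foo-4_to_5.delta - 25 foo-4 foo-5
foo-3_to_4.delta - 25 foo-3 foo-4
foo-2_to_3.delta - 25 foo-2 foo-3
foo-1_to_2.delta - 25 foo-1 foo-2
foo-4_to_7.delta - 40 foo-4 foo-7
EOF

# $1: package to reach, $2: summed delta size from $1 up to the repo version
walk() {
    # stop as soon as the accumulated size exceeds the quota
    [ "$2" -gt "${quota}" ] && return 1
    # each delta whose target is $1 opens one branch of the tree
    grep " ${1}\$" "${deltas}" | while read -r delta md5sum dsize spkg tpkg; do
        if walk "${spkg}" $(( $2 + dsize )); then
            # keep the delta once, even if several branches reach it
            grep -q "^${delta} " "${kept}" 2>/dev/null ||
                echo "${delta} ${md5sum} ${dsize} ${spkg} ${tpkg}" >> "${kept}"
        fi
    done
    return 0
}

walk foo-7 0
kept_names=$(cut -d' ' -f1 "${kept}" | sort | xargs)
echo "kept: ${kept_names}"
rm -rf "${workdir}"
```

With these numbers the single-step branch reaches the quota before `foo-2_to_3.delta`, but the branch through the direct `foo-4_to_7.delta` still has room for it, so it is kept; only `foo-1_to_2.delta` is left out on both branches.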
On Fri, Sep 18, 2009 at 3:32 PM, Marc - A. Dahlhaus [ Administration | Westermann GmbH ] <mad@wol.de> wrote:
Do we even want to print out the "useless" deltas in the first place?
I guess only if we want to keep this outside of repo-add. If this cleanup code is inside repo-add, we can indeed just rewrite the deltas file.

Anyway, I wrote some C code just for fun. It is on my working branch: http://code.toofishes.net/cgit/xavier/pacman.git/log/?h=working

The core code I had to add to libalpm is minimal, as I expected: just 40 lines. All the rest can be re-used. But I also had to write a standalone script using this. It only does simple things, but the simplest things in C require a lot of code :)

The benefit is that it's much faster (10x), and it finds all useless deltas correctly. But currently, the way to load a sync db with libalpm really sucks, because it's not flexible at all.
participants (11)
- Aaron Griffin
- Allan McRae
- Dan McGee
- Dieter Plaetinck
- James Rayner
- Jeff Horelick
- Loui Chang
- Marc - A. Dahlhaus
- Marc - A. Dahlhaus [ Administration | Westermann GmbH ]
- Nagy Gabor
- Xavier