On Thursday, 17.09.2009, at 14:49 +0200, Xavier wrote:
On Wed, Sep 16, 2009 at 10:35 PM, Marc - A. Dahlhaus <mad@wol.de> wrote: --8<--
It would make sense to use a recursion based implementation.
A function with two params:
$1: version we want to chain down from
$2: accumulated size in bytes of deltas above us in the chain
first call would look like:
function currents-package-version 0
Inside the function we read the source versions and file sizes of all deltas containing "_to_${1}" in the filename, and call the function for each match with the source version and the filesize+$2. We check the retval of every recursion. If we get a 0 retval for a delta, we add it to the next repo archive. We return 1 if $2 is larger than the repo package's filesize*0.9. The last thing in this function would be a "return 0".
--8<--
It took me some time to understand how this would work from just reading that mail, but when I tried to implement it, it turned out to be very simple. Attached is a quick hack in bash I just wrote. However, my current implementation is wildly inefficient. And currently I just print the deltas which are reachable and too big, so we don't get the unreachable ones.
This is because you check for size and echo the delta on the same level of the recursion, and the condition checking is wrong. We need to check the retval of the recursion and actually throw the error condition up one level when we exceed the size limit. The check can only work one recursion level down from the current level because you push the actual delta's size one level down. Recursions are fun, aren't they? ;-D
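To illustrate the point about where the check has to happen, here is a minimal sketch of the recursion. It is not the attached script. The names walk_chain, deltas, pkg_size, keep and drop are made up for this example, the delta list is invented sample data held in a variable instead of being read from filenames, and the 90% limit is the filesize*0.9 rule from the earlier mail.

```bash
#!/bin/bash
# Hypothetical input: one delta per line as "source target size_in_bytes".
deltas="1.0 1.1 250
1.1 1.2 250
1.2 1.3 250
0.9 1.0 200"
pkg_size=1000   # assumed size of the full current package
keep=""         # deltas whose whole chain stays within the limit
drop=""         # deltas that push their chain past the limit

walk_chain() {
    local version="$1" accumulated="$2"
    local from to size

    # The chain leading here already exceeds 90% of the package size:
    # report that upward, the caller knows which delta caused it.
    if [ $(( accumulated * 10 )) -gt $(( pkg_size * 9 )) ]; then
        return 1
    fi

    while read -r from to size; do
        [ "$to" = "$version" ] || continue
        # Push this delta's size one level down and judge it by the retval.
        if walk_chain "$from" $(( accumulated + size )); then
            keep="$keep ${from}_to_${to}"
        else
            drop="$drop ${from}_to_${to}"
        fi
    done <<EOF
$deltas
EOF
    return 0
}

walk_chain 1.3 0
echo "keep:$keep"
echo "drop:$drop"
```

With this sample data the three 250-byte deltas are kept and 0.9_to_1.0 is dropped, because it would bring the chain to 950 bytes. Deltas that sit below a dropped one are never visited in this sketch, so they end up in neither list.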
It would indeed be better to just print the deltas that are reachable and have a good size, but then we would have to invert that list somehow.
Please take a look at the attached version. It might be able to clear up what I wrote in the other mail a bit. Run it with the repository's root as PWD. I added some echos to it so that you can read the actual decisions the recursion would make from the output.
I am still not sure what to do.. I am not sure bash is an appropriate language for this task. And I am not sure this fits well in repo-add. Currently, repo-add does not do any parsing of the entries.
The cleanup will only run if the user asked for it, so I think it is OK if it needs some more time than without parsing the information.
But half the code for cleaning up deltas is actually database parsing (we need filename and csize from desc, and every line from deltas).
I think the fixed up demo implementation is clean enough to be useful inside of repo-add as read is able to fill more than just one variable.
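As a small illustration of read filling several variables at once: the record layout below ("filename checksum size") is only an assumption for this example, and the real field order in the database may differ.

```bash
#!/bin/bash
# Hypothetical record; the field order is assumed for illustration only.
line="foo-1.0_to_1.1.delta 0123abcd 250"

# read splits the line on whitespace and assigns one field per variable.
read -r filename checksum size <<EOF
$line
EOF

echo "$filename is $size bytes"
```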
I think I still prefer to hack the libalpm delta code to do what I want, just returning a list of bad deltas which can then be fed to repo-remove (and possibly another script taking care of removing the actual delta files from the FS).
If you think it is worth the effort given that the bash version isn't more than maybe 25 lines of code, go for it... ;-P
Marc