Re: [pacman-dev] [PATCH] Order downloads by descending max_size
This sort is unlikely to optimal, but optimality would be much harder to achieve.
Do you have anything in mind that would be more optimal? Should I take a different strategy?
The ideal situation is to have 1 large download running to soak up bandwidth
Would it be better to search for any really large packages (much larger than the rest), put those at the top of the list, and then leave everything else unsorted? Would that be too slow as well? Cheers
On 13/8/21 12:18 am, Charlie Sale wrote:
This sort is unlikely to optimal, but optimality would be much harder to achieve.
Do you have anything in mind that would be more optimal? Should I take a different strategy?
The ideal situation is to have 1 large download running to soak up bandwidth
Would it be better to search for any really large packages (much larger than the rest), put those at the top of the
list, and then leave everything else unsorted? Would that be too slow as well?
I think aiming for optimal would require too much complexity in the ordering. I'm happy with a plain sort if it is an improvement. I'll run some with and without this patch and report back. I suggest others do too. Allan
Here is my testing of this patch. I picked 10 sets of 50 random packages from the Arch Linux repos: pacman -Sql | shuf -n50 Then I cleared all caches, and ran: pacman -Sddw --noconfirm I ran each list 5 times, alternating between patched and unpatched versions of pacman. The medians of these five runs are: Unpatched With Patch 17 21 12 10 33 33 29 26 34 29 47 36 30 28 31 25 32 29 50 40 Expressing these as a percentage of the unpatched time (so less than 100 is an improvement): 124 83 100 90 85 77 93 81 91 80 For an average of 90% - meaning on average the downloads are 10% faster when we order the packages. This is obviously based on my connection, my mirror, and my choice of parallel downloads (5), and with limited numbers of trials. But as it stands, I think accepting this patch will improve total download speed. I will properly review the patch soon, with the aim to accept unless contrasting results are provided. Allan
Using the same methodology (although with parallel=7): 16 17 1.06 8 8 1.00 78 78 1.00 6 5 0.83 23 23 1.00 4 4 1.00 17 16 0.94 30 30 1.00 8 8 1.00 8 8 1.00 1.00 avg So with my connection there's no significant gain or loss. Since this data was uninteresting I tried again with a large packet delay. # tc qdisc add dev wlan0 root netem delay 500ms 32 27 0.84 194 184 0.95 37 39 1.05 184 178 0.97 42 37 0.88 257 220 0.86 92 62 0.67 65 44 0.68 77 60 0.78 54 51 0.94 0.87 avg On Thu, Aug 12, 2021, at 06:38, Morgan Adamiec wrote:
In my totally untested theory this would slow things down. The ideal situation is to have 1 large download running to soak up bandwidth and then many small downloads running so they can all be set up and handshake.
I expected the same, but when selecting packages randomly this way I see that there are typically a few very large packages that are both first to start and last to finish. Presumably it does a better job of soaking up bandwidth from the very beginning. If the upgrade list is only small and very-small packages then this might be more likely to run dry towards the end, but I wouldn't be too concerned over finding the perfect sort.
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA256 On Sun, 22 Aug 2021, Gavin Troy wrote:
On Thu, Aug 12, 2021, at 06:38, Morgan Adamiec wrote:
In my totally untested theory this would slow things down. The ideal situation is to have 1 large download running to soak up bandwidth and then many small downloads running so they can all be set up and handshake.
I expected the same, but when selecting packages randomly this way I see that there are typically a few very large packages that are both first to start and last to finish. Presumably it does a better job of soaking up bandwidth from the very beginning. If the upgrade list is only small and very-small packages then this might be more likely to run dry towards the end, but I wouldn't be too concerned over finding the perfect sort.
I think, Morgan's theory is wrong. When each package $i needs $s[$i] time to download due to its size plus some fixed time $x due to the time needed to establish a connection/latency/..., then the duration due to the $x parts will not depend on the ordering at all. (well ok, in some edge case, you may get unlucky, that all but one package finish at the same time and then you'll need $x to establish the last connection - but this is pure luck and cannot systematically be tracked by reordering the packages). just my two cents. regards, Erich -----BEGIN PGP SIGNATURE----- iQIzBAEBCAAdFiEE3p92iMrPBP64GmxZCu7JB1Xae1oFAmEirBwACgkQCu7JB1Xa e1oI/A/8CvbPgh57BwsNk/oMdb4TWUbEmcEAweMelyODyUCqXzoIyG1xqIBxCBWC Vfqz4hE9IMyAKh069A6+LxAHIl0AH/PIOR6axdAhLLGIfIyLLKc7KwS985kuGZEy TFTQ4aQUgsWQysLZpmYUj1u9afuJUX15endOBF0bih9dl+W2PGlCbHIDBTOzloHt ubXN7ohbv0D39+6s2QrQuOE0AJK+i7Abh0Cj9WUUYqXxZgE7BsM9Du4x6flEN/ik K5Rm1a5HmUN+MQSoKtGmpexfS+ensxtQsCAqu4Vt6xMYNFH/AcVLFNzmaw2vrsAU dMMvaQQZULESkp2WOA/H1Q4gTD3zxtmV9xwOmI1zwVac2ymfJ6AiUWONmAzIZ2Ec KiiQ8brLoekF4RdxuFz2BoFUmSfI2nR4YWFMKER2VrlDjMA3DuctcUzFtcoq+24W sWAf77xErEexjvO3Ei6Ppzy2JUpNnKF+W5FuHzxEYuQp4LIVZ6rfajNSOorZv8DS +yVhFhs01izMVckYtuIwRjLo3J6GZnVvpQooSl50nGUbOrD0euugrPvnqLVkg3Wj nhQQlyG6cR17Px2MbfINvrJtrAmfUUF8Gnf+l+/XvEvCTZJ/Mwo5c/OmJ8zG6JS1 cgDuwPrTvo8LaXtknxjTXwn8lnYw8j7wd74kEhs5j02H3/XlLVI= =JYmj -----END PGP SIGNATURE-----
On 23/8/21 4:32 am, Gavin Troy wrote:
Using the same methodology (although with parallel=7):
16 17 1.06 8 8 1.00 78 78 1.00 6 5 0.83 23 23 1.00 4 4 1.00 17 16 0.94 30 30 1.00 8 8 1.00 8 8 1.00 1.00 avg
So with my connection there's no significant gain or loss. Since this data was uninteresting I tried again with a large packet delay. # tc qdisc add dev wlan0 root netem delay 500ms
32 27 0.84 194 184 0.95 37 39 1.05 184 178 0.97 42 37 0.88 257 220 0.86 92 62 0.67 65 44 0.68 77 60 0.78 54 51 0.94 0.87 avg
Thanks. My takeaway here is there is no drawback to adding this patch, only a potential (average) gain for some people. So I am accepting this patch. Allan
participants (4)
-
Allan McRae
-
Charlie Sale
-
Erich Eckner
-
Gavin Troy