From jelle at vdwaa.nl Mon Jan 14 21:15:27 2019 From: jelle at vdwaa.nl (Jelle van der Waa) Date: Mon, 14 Jan 2019 22:15:27 +0100 Subject: [arch-devops] Reproducible Build nodes In-Reply-To: <20181123102648.GI3508@xinu.at> References: <20181122191934.gvqx47bphloc2yrl@gmail.com> <20181123102648.GI3508@xinu.at> Message-ID: <20190114211525.35rccs3ga2k4eu6f@gmail.com> On 11/23/18 at 11:26am, Florian Pritz via arch-devops wrote: > > On Thu, Nov 22, 2018 at 08:19:35PM +0100, Jelle van der Waa wrote: > > 1) Update the boxes and hope they come back up (~ 200 updates). > > 2) Add DNS entries for the two boxes {repro1,2}.pkgbuild.com. > > Sounds good. Sad news: the boxes were not super useful for reproducible-builds.org, since they have more powerful boxes now and Debian is a lot easier for their Debian-centric deployment. I still want to dedicate these boxes to our own reproducible builds effort, though, to reproduce .pkg.tar.xz packages. But we haven't made a CI solution with repro yet. [1] My hope is that sometime this year, we will be able to continuously test whether our repo packages are reproducible! [1] https://github.com/archlinux/archlinux-repro Greetings, Jelle van der Waa -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 488 bytes Desc: not available URL: From svenstaro at gmail.com Tue Jan 22 16:02:32 2019 From: svenstaro at gmail.com (Sven-Hendrik Haase) Date: Tue, 22 Jan 2019 17:02:32 +0100 Subject: [arch-devops] Let's get a big build box Message-ID: Hi all, so this has been a long time coming as you know from IRC, but now I'm actually taking the time to write an email. :P ## Suggested new server and finances So I'd like us to get a big build box. Specifically this one: https://www.hetzner.de/dedicated-rootserver/dell/dx292 This would be an upgrade to soyuz (and the current soyuz would go away). Total cost with 2x1.92TiB NVMe disks and 256GiB of RAM is € 461.00/month + €
219.00 setup. soyuz currently costs us € 54.00, so we'd be paying € 407.00/month extra. This is a big step up in cost, but 1) our infra costs are otherwise very low all in all and 2) frankly we just have a ton of money lying around doing nothing, and while that doesn't mean we have to spend it needlessly, I believe that this is a useful thing to do with the money. ## Performance ### Processors The suggested DX292 has two Intel Xeon Gold 6130 16-core processors while the current one has a single Intel Xeon CPU E3-1275 v5. From benchmarks, I'm estimating the compute power of the suggested server to be almost exactly 4 times as high [0][1] for our workloads. ### Disks We currently have spinning disks in soyuz and that isn't great for building. While I believe soyuz instead puts chroots onto a tmpfs to mitigate this, it takes away from the usable RAM that we have. This is actually a problem, as the server has run out of memory a few times before. Using RAID1 NVMes (as in the suggested new server) for building would make that workaround unnecessary, as these should just generally be fast enough for building. ## Reasoning I believe that the current soyuz is too small for bigger rebuilds and big packages to get done quickly. I've heard some members of the team complain about rebuild times of C++-based rebuilds in the past as well. I know that soyuz sits mostly idle currently, but I suppose the reason for that is that some people build big packages on their own, faster machines (I know that I do this and some TUs as well). On my machine (12 threads), tensorflow takes ~10h to compile while pytorch and arrayfire are at 2-3h. Yes, these are certainly outliers, but imagine we have quite a few more of these packages that I don't know about. Also, big rebuilds like KDE and boost would benefit. Ultimately, we all want Arch CI and then we could theoretically dynamically spin up/down big build slaves automatically as we need.
However, this is currently blocked by reproducible builds AND the svn-git migration. Therefore, I don't see that happening any time soon. This proposal is for getting a practical solution now and not in a few months/years. Additionally, this big server could also serve as a testbed for the CI. ### Alternatives People have suggested this [2] alternative in the past and while it's quite a bit cheaper, it's also only about half as powerful. While the CPU is about the same speed [3], it only has one of them. ## Closing I know that some people have been skeptical about getting a big, expensive server but I hope I made a good case for why I think we should get one. If not, well, at least we'll have it in the archive. Sven [0] cpubenchmark.net shows only the single processor version but we can roughly double the performance given our workload to estimate dual processor performance: https://www.cpubenchmark.net/compare/Intel-Xeon-Gold-6130-vs-Intel-Xeon-E3-1275-v5/3126vs2672 [1] geekbench.com has whole systems and I actually found a DELL R740 which has the exact same processor configuration as the R640 DX292 from Hetzner that I'm suggesting. From those numbers, 4x the compute power seems about right: https://browser.geekbench.com/v4/cpu/11406589 vs https://browser.geekbench.com/v4/cpu/11568488 [2] https://www.hetzner.de/dedicated-rootserver/ax160 [3] https://www.cpubenchmark.net/compare/AMD-EPYC-7401P-vs-Intel-Xeon-Gold-6130/3118vs3126 -------------- next part -------------- An HTML attachment was scrubbed... URL: From jelle at vdwaa.nl Tue Jan 22 21:22:50 2019 From: jelle at vdwaa.nl (Jelle van der Waa) Date: Tue, 22 Jan 2019 22:22:50 +0100 Subject: [arch-devops] Let's get a big build box In-Reply-To: References: Message-ID: <20190122212248.6p5unln7lad3tv77@gmail.com> On 01/22/19 at 05:02pm, Sven-Hendrik Haase via arch-devops wrote: > Hi all, > > so this has been a long time coming as you know from IRC but now I'm > actually taking the time to write an email. 
:P :) > ## Suggested new server and finances > > So I'd like us to get a big build box. Specifically this one: > https://www.hetzner.de/dedicated-rootserver/dell/dx292 > This would be an upgrade to soyuz (and the current soyuz would go away). > > Total cost with 2x1.92TiB NVMe disks and 256GiB of RAM is € 461.00/month + > € 219.00 setup. > > soyuz currently costs us € 54.00, so we'd be paying € 407.00/month extra. > This is a big step up in cost, but > 1) our infra costs are otherwise very low all in all and > 2) frankly we just have a ton of money lying around doing nothing, and > while that doesn't mean we have to spend it needlessly, I believe that this > is a useful thing to do with the money. > ### Disks > > We currently have spinning disks in soyuz and that isn't great for > building. While I believe soyuz instead puts chroots onto a tmpfs to > mitigate this, it takes away from the usable RAM that we have. This is > actually a problem, as the server has run out of memory a few times before. > Using RAID1 NVMes (as in the suggested new server) for building would make > that workaround unnecessary, as these should just generally be fast enough > for building. I agree: if we get something new, use NVMes and use RAM for building only. That saves the devops team from resetting a locked box due to these issues. > > ## Reasoning > > I believe that the current soyuz is too small for bigger rebuilds and big > packages to get done quickly. I've heard some members of the team > complain about rebuild times of C++-based rebuilds in the past as well. I > know that soyuz sits mostly idle currently, but I suppose the reason for > that is that some people build big packages on their own, faster machines > (I know that I do this and some TUs as well). On my machine (12 threads), > tensorflow takes ~10h to compile while pytorch and arrayfire are at 2-3h. > Yes, these are certainly outliers, but imagine we have quite a few more of > these packages that I don't know about.
Also, big rebuilds like KDE and boost > would benefit. I can't really complain here, soyuz is fast enough for me, but I don't package heavy stuff. I do however like this proposal, with the following reasoning. We now have a build server where TUs/Devs build *official* packages, and we run services which can be pwnd, such as quassel/synapse and our IRC bot. I want to have a nice separation of services and keep the build server "clean". If this means getting a new (smaller) box for < ~54 euro/month, that's fine from my side as long as things are separated. > Ultimately, we all want Arch CI and then we could theoretically dynamically > spin up/down big build slaves automatically as we need. However, this is > currently blocked by reproducible builds AND the svn-git migration. > Therefore, I don't see that happening any time soon. This proposal is for > getting a practical solution now and not in a few months/years. > > Additionally, this big server could also serve as a testbed for the CI. For CI, we can (ab)use the four leftover PIA boxes, of which two I want to use for setting up a reproducing CI for our packages. The other two can be used to test a CI, since it can just test [core] first, for example. > ### Alternatives > > People have suggested this [2] alternative in the past and while it's quite > a bit cheaper, it's also only about half as powerful. While the CPU is > about the same speed [3], it only has one of them. Having glanced over it, the difference is that we'd then have two * 16 cores (32) instead of 24. It is, however, 164 euro versus ~2.5 times as much. It is also ~45% faster than our current setup, has more threads, and has double the amount of RAM, which would resolve most C++ issues (if not using -j24 I guess???). > ## Closing > > I know that some people have been skeptical about getting a big, expensive > server but I hope I made a good case for why I think we should get one. If > not, well, at least we'll have it in the archive.
I still think it's a very steep increase in spending, i.e. a 400€/month increase. > > Sven > > [0] cpubenchmark.net shows only the single processor version but we can > roughly double the performance given our workload to estimate dual > processor performance: > https://www.cpubenchmark.net/compare/Intel-Xeon-Gold-6130-vs-Intel-Xeon-E3-1275-v5/3126vs2672 > [1] geekbench.com has whole systems and I actually found a DELL R740 which > has the exact same processor configuration as the R640 DX292 from Hetzner > that I'm suggesting. From those numbers, 4x the compute power seems about > right: https://browser.geekbench.com/v4/cpu/11406589 vs > https://browser.geekbench.com/v4/cpu/11568488 > [2] https://www.hetzner.de/dedicated-rootserver/ax160 > [3] > https://www.cpubenchmark.net/compare/AMD-EPYC-7401P-vs-Intel-Xeon-Gold-6130/3118vs3126 -- Jelle van der Waa From bpiotrowski at archlinux.org Wed Jan 23 08:30:04 2019 From: bpiotrowski at archlinux.org (=?UTF-8?Q?Bart=c5=82omiej_Piotrowski?=) Date: Wed, 23 Jan 2019 09:30:04 +0100 Subject: [arch-devops] Let's get a big build box In-Reply-To: References: Message-ID: <845e7ebf-54e0-a6d3-771e-6a6a31d8e355@archlinux.org> On 22/01/2019 17.02, Sven-Hendrik Haase via arch-devops wrote: > I believe that the current soyuz is too small for bigger rebuilds and > big packages to get done quickly. I've heard some members of > the team complain about rebuild times of C++-based rebuilds in the past > as well. I know that soyuz sits mostly idle currently but I suppose the > reason for that is that some people build big packages on their own, > faster machines (I know that I do this and some TUs as well). On my > machine (12 threads), tensorflow takes ~10h to compile while pytorch and > arrayfire are at 2-3h. Yes, these are certainly outliers but imagine we > have quite a few more of these packages that I don't know about. Also > big rebuilds like KDE and boost would benefit. Almost 500€
a month is complete overkill for what we do and what we actually need. This machine is going to stay mostly idle, and the fact that we received a huge donation does not justify burning money. I'm also pretty sure we don't know about more packages like yours because no one else adds them. Both KDE and boost rebuilds were doing fine so far. > Ultimately, we all want Arch CI and then we could theoretically > dynamically spin up/down big build slaves automatically as we need. > However, this is currently blocked by reproducible builds AND the > svn-git migration. Therefore, I don't see that happening any time soon. > This proposal is for getting a practical solution now and not in a few > months/years. I don't think it's blocked by anything but time. Neither the git migration nor reprobuilds affect development of a service that would take a source tarball or svn directory as input and return ready packages. We have access to packet.net thanks to CNCF, I just haven't heard from anyone actually interested in picking up the slack. Bartłomiej From bluewind at xinu.at Wed Jan 23 10:55:45 2019 From: bluewind at xinu.at (Florian Pritz) Date: Wed, 23 Jan 2019 11:55:45 +0100 Subject: [arch-devops] Let's get a big build box In-Reply-To: <845e7ebf-54e0-a6d3-771e-6a6a31d8e355@archlinux.org> References: <845e7ebf-54e0-a6d3-771e-6a6a31d8e355@archlinux.org> Message-ID: <20190123105545.GI7545@xinu.at> On Wed, Jan 23, 2019 at 09:30:04AM +0100, Bartłomiej Piotrowski via arch-devops wrote: > Almost 500€ a month is complete overkill for what we do and what we > actually need. 500€ per month does indeed sound like too much. I can see why we'd maybe want a box with slightly more memory or with SSDs, but then again I don't know if building on HDDs really slows things down all that much compared to our current tmpfs builds. Maybe someone is interested in building some 5-10 minute package to get us some numbers to compare?
If HDDs are really much slower, I can see why we might want a new machine with SSDs, but I don't see us needing as big a machine as what you suggested. Florian -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: not available URL: From andrew at crerar.io Wed Jan 23 13:33:30 2019 From: andrew at crerar.io (Andrew Crerar) Date: Wed, 23 Jan 2019 14:33:30 +0100 Subject: [arch-devops] Let's get a big build box In-Reply-To: <20190123105545.GI7545@xinu.at> References: <845e7ebf-54e0-a6d3-771e-6a6a31d8e355@archlinux.org> <20190123105545.GI7545@xinu.at> Message-ID: <7dcc6807-9086-10a4-3945-ab05776a8b33@crerar.io> On 1/23/19 11:55 AM, Florian Pritz via arch-devops wrote: > On Wed, Jan 23, 2019 at 09:30:04AM +0100, Bartłomiej Piotrowski via arch-devops wrote: >> Almost 500€ a month is complete overkill for what we do and what we >> actually need. > > 500€ per month does indeed sound like too much. I'd have to agree here... Another thing to consider is what the average monthly donation totals are for the distro. Even if we went with something beefy because "why not", my concern would be around sustaining it long term. That is, once the large sum we have burns down, do we have enough in monthly donations to keep it going... I don't have insight into the accounting side of things, but I would be surprised if we total more than 500€/month on average. Just some commentary from the peanut gallery ;) Regards, Andrew -------------- next part -------------- A non-text attachment was scrubbed...
Name: signature.asc Type: application/pgp-signature Size: 488 bytes Desc: OpenPGP digital signature URL: From bluewind at xinu.at Wed Jan 23 16:49:14 2019 From: bluewind at xinu.at (Florian Pritz) Date: Wed, 23 Jan 2019 17:49:14 +0100 Subject: [arch-devops] [arch-dev-public] Uploading old packages on archive.org (Was: Archive cleanup) In-Reply-To: References: <0b8c1b38-0e18-a97f-feac-f1cfbf98fed5@xinu.at> <20180601172445.GA31354@lud.localdomain> <087a44ca-6b7d-55a8-0c81-d95d961c2be6@xinu.at> <20180609233554.GB12475@tuxmachine.localdomain> Message-ID: <20190123164914.GJ7545@xinu.at> Hi Baptiste, On Thu, Jun 14, 2018 at 10:28:17AM +0200, Florian Pritz via arch-dev-public wrote: > On 10.06.2018 01:35, Baptiste Jonglez wrote: > > Archival of all packages between September 2013 and December 2016 is finished: > > > > https://archive.org/details/archlinuxarchive > > > > Here is some documentation on this "Historical Archive" hosted on archive.org: > > > > https://wiki.archlinux.org/index.php/Arch_Linux_Archive#Historical_Archive So the archive is still growing and we'll need to clean it up again soonish. Where do you keep the archive.org uploader, and is it in a state where we (devops team) can also easily use it? If not, could you help us get it to that point? We also have an archive cleanup script here [1]. Maybe the uploader can be integrated there? I don't know how complicated it is. [1] https://github.com/archlinux/archivetools/blob/master/archive-cleaner Florian -------------- next part -------------- A non-text attachment was scrubbed...
Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: not available URL: From baptiste at bitsofnetworks.org Thu Jan 24 08:27:23 2019 From: baptiste at bitsofnetworks.org (Baptiste Jonglez) Date: Thu, 24 Jan 2019 09:27:23 +0100 Subject: [arch-devops] [arch-dev-public] Uploading old packages on archive.org (Was: Archive cleanup) In-Reply-To: <20190123164914.GJ7545@xinu.at> References: <0b8c1b38-0e18-a97f-feac-f1cfbf98fed5@xinu.at> <20180601172445.GA31354@lud.localdomain> <087a44ca-6b7d-55a8-0c81-d95d961c2be6@xinu.at> <20180609233554.GB12475@tuxmachine.localdomain> <20190123164914.GJ7545@xinu.at> Message-ID: <20190124082723.GA2374@lud.localdomain> Hi, On 23-01-19, Florian Pritz wrote: > Hi Baptiste, > > On Thu, Jun 14, 2018 at 10:28:17AM +0200, Florian Pritz via arch-dev-public wrote: > > On 10.06.2018 01:35, Baptiste Jonglez wrote: > > > Archival of all packages between September 2013 and December 2016 is finished: > > > > > > https://archive.org/details/archlinuxarchive > > > > > > Here is some documentation on this "Historical Archive" hosted on archive.org: > > > > > > https://wiki.archlinux.org/index.php/Arch_Linux_Archive#Historical_Archive > > So the archive is still growing and we'll need to clean it up again > soonish. Where do you have the archive.org uploader and is it in a state > where we (devops team) can also easily use it? If not, could you help us > getting it to that point? I have just pushed the script I wrote last time: https://github.com/zorun/arch-historical-archive It's a bit hackish and requires some manual work to correctly upload all packages for a given year, because archive.org rate-limits quite aggressively when they are overloaded. > We also have an archive cleanup script here[1]. Maybe the uploader can > be integrated there? I don't know how complicated it is. > > [1] https://github.com/archlinux/archivetools/blob/master/archive-cleaner What about uploading to archive.org as soon as we archive packages on orion? 
https://github.com/archlinux/archivetools/blob/master/archive.sh It would avoid hammering the archive.org server, because we would only send one package at a time. In any case, we need a retry mechanism to cope with the case where the upload fails. Baptiste -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: not available URL: From bluewind at xinu.at Thu Jan 24 09:48:27 2019 From: bluewind at xinu.at (Florian Pritz) Date: Thu, 24 Jan 2019 10:48:27 +0100 Subject: [arch-devops] [arch-dev-public] Uploading old packages on archive.org (Was: Archive cleanup) In-Reply-To: <20190124082723.GA2374@lud.localdomain> References: <0b8c1b38-0e18-a97f-feac-f1cfbf98fed5@xinu.at> <20180601172445.GA31354@lud.localdomain> <087a44ca-6b7d-55a8-0c81-d95d961c2be6@xinu.at> <20180609233554.GB12475@tuxmachine.localdomain> <20190123164914.GJ7545@xinu.at> <20190124082723.GA2374@lud.localdomain> Message-ID: <20190124094827.GK7545@xinu.at> On Thu, Jan 24, 2019 at 09:27:23AM +0100, Baptiste Jonglez wrote: > I have just pushed the script I wrote last time: > > https://github.com/zorun/arch-historical-archive > > It's a bit hackish and requires some manual work to correctly upload all > packages for a given year, because archive.org rate-limits quite > aggressively when they are overloaded. Thanks! > > We also have an archive cleanup script here[1]. Maybe the uploader can > > be integrated there? I don't know how complicated it is. > > > > [1] https://github.com/archlinux/archivetools/blob/master/archive-cleaner > > What about uploading to archive.org as soon as we archive packages on orion? > > https://github.com/archlinux/archivetools/blob/master/archive.sh While we still use this archive.sh script, dbscripts has recently also been extended to populate the archive continuously. So uploading could be integrated there with a queue file and a background job that performs the upload.
Alternatively the uploader could be kept standalone and just adapted to run more often and to maintain its own database/list to know which packages have already been successfully uploaded and which haven't. I'll call this "state database". Then we could run it every hour or so via a systemd timer and it could upload all new and all failed packages. One thing I'd want to have in this context is that the uploader should exit with an error to let the systemd service fail if a package fails to upload multiple times. I think I'd actually prefer this to be standalone for simplicity. > It would avoid hammering the archive.org server, because we would only > send one package at a time. Avoiding load spikes for archive.org certainly sounds like a good idea, and for us it's easier to monitor and maintain services that run more often too. > In any case, we need a retry mechanism to cope with the case where the > upload fails. This could use the state database I mentioned above. As for the implementation of such a database, I'd suggest sqlite instead of rolling your own text-based list or whatever. It's fast and simple, but you get all the fancy stuff, like transactions, for free. You also don't have to deal with recovering the database if the script crashes. sqlite just rolls back uncommitted transactions for you. Would you be interested in adapting the uploader like this and making it an automated service? If you're interested I can help with the deployment part and provide feedback on the scripting side. If you want, we can also discuss this on IRC. PS: I've whitelisted you on the arch-devops ML so that your replies also get archived. Florian -------------- next part -------------- A non-text attachment was scrubbed...
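[Editor's sketch of the "state database" idea discussed above, using Python's stdlib sqlite3. The table layout, retry limit, and function names are illustrative assumptions, not part of any existing Arch tooling; the real uploader would call the archive.org API where `ok` is determined here.]

```python
import sqlite3

MAX_ATTEMPTS = 3  # assumed retry limit before the systemd service should fail


def open_state_db(path=":memory:"):
    """Open (or create) the upload state database."""
    db = sqlite3.connect(path)
    db.execute("""CREATE TABLE IF NOT EXISTS uploads (
        pkgfile  TEXT PRIMARY KEY,
        attempts INTEGER NOT NULL DEFAULT 0,
        done     INTEGER NOT NULL DEFAULT 0)""")
    return db


def record_attempt(db, pkgfile, ok):
    # One transaction per package: if the script crashes mid-run,
    # sqlite rolls back the uncommitted attempt automatically.
    with db:
        db.execute("INSERT OR IGNORE INTO uploads (pkgfile) VALUES (?)",
                   (pkgfile,))
        db.execute("UPDATE uploads SET attempts = attempts + 1, done = ? "
                   "WHERE pkgfile = ?", (1 if ok else 0, pkgfile))


def pending(db):
    # Split the not-yet-done packages into ones worth retrying and ones
    # that exceeded the limit; a non-empty second list would make the
    # script exit non-zero so systemd marks the unit as failed.
    rows = db.execute(
        "SELECT pkgfile, attempts FROM uploads WHERE done = 0").fetchall()
    retryable = [p for p, a in rows if a < MAX_ATTEMPTS]
    failed = [p for p, a in rows if a >= MAX_ATTEMPTS]
    return retryable, failed
```

A systemd timer would then simply re-run the script every hour; anything in the `retryable` list gets another upload attempt, and the service fails loudly once `failed` is non-empty.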
Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: not available URL: From svenstaro at gmail.com Fri Jan 25 13:30:43 2019 From: svenstaro at gmail.com (Sven-Hendrik Haase) Date: Fri, 25 Jan 2019 14:30:43 +0100 Subject: [arch-devops] Let's get a big build box In-Reply-To: <7dcc6807-9086-10a4-3945-ab05776a8b33@crerar.io> References: <845e7ebf-54e0-a6d3-771e-6a6a31d8e355@archlinux.org> <20190123105545.GI7545@xinu.at> <7dcc6807-9086-10a4-3945-ab05776a8b33@crerar.io> Message-ID: On Wed, 23 Jan 2019 at 14:33, Andrew Crerar wrote: > On 1/23/19 11:55 AM, Florian Pritz via arch-devops wrote: > > On Wed, Jan 23, 2019 at 09:30:04AM +0100, Bartłomiej Piotrowski via > arch-devops wrote: > >> Almost 500€ a month is complete overkill for what we do and what we > >> actually need. > > > > 500€ per month does indeed sound like too much. > > I'd have to agree here... Another thing to consider is what the average > monthly > donation totals are for the distro. Even if we went with something beefy > because > "why not", my concern would be around sustaining it long term. That is, > once the > large sum we have burns down, do we have enough in monthly donations to > keep it > going... > > I don't have insight into the accounting side of things but I would be > surprised > if we total more than 500€/month on average. > > Just some commentary from the peanut gallery ;) > > Regards, > > Andrew > > Well, people seem to be overwhelmingly of the opinion that 500€/month is too much. In that case, I put forth the next best contender, the Hetzner AX160-NVMe at 164€/month base price. At its base configuration, it has half the memory, half the disk space and roughly half the compute power of the server I originally put forth, but it's also 1/3 the price at this configuration. Given that we'd trade it for the current soyuz at 54€/month, it means we'd pay 110€/month extra. What do you guys think about that? -------------- next part -------------- An HTML attachment was scrubbed...
URL: From jelle at vdwaa.nl Fri Jan 25 15:13:39 2019 From: jelle at vdwaa.nl (Jelle van der Waa) Date: Fri, 25 Jan 2019 16:13:39 +0100 Subject: [arch-devops] Archweb Python 3 update Message-ID: <20190125151338.d4gdthm3roxjwfax@gmail.com> Hi All, After nearly a year? I've finally deployed the Python 3 version of Archweb with Django 2.1.5 on nymeria (staging server). So far I haven't found any issues but I will do more testing over the weekend. I want to merge the Python 3 branch to master soon and switch archlinux.org. Since next weekend is FOSDEM I think I'll do it the week after. So ~4 February maybe. Feedback is welcome, I'm kind of nervous as this is a big release with many changes. But this will open the ability to add CSP and other security headers / 2FA for login. After those changes the biggest feature wanted is probably automatic out of date flagging of packages, which could be achieved with repology or another API. But repology would be preferable since Archweb does not know about source urls :-) (input welcome) -- Jelle van der Waa -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 488 bytes Desc: not available URL: From polyzen at archlinux.org Fri Jan 25 15:19:53 2019 From: polyzen at archlinux.org (Daniel M. Capella) Date: Fri, 25 Jan 2019 10:19:53 -0500 Subject: [arch-devops] Archweb Python 3 update In-Reply-To: <20190125151338.d4gdthm3roxjwfax@gmail.com> References: <20190125151338.d4gdthm3roxjwfax@gmail.com> Message-ID: On January 25, 2019 10:13:39 AM EST, Jelle van der Waa wrote: >Hi All, > >After nearly a year? I've finally deployed the Python 3 version of >Archweb with Django 2.1.5 on nymeria (staging server). So far I haven't >found any issues but I will do more testing over the weekend. > >I want to merge the Python 3 branch to master soon and switch >archlinux.org. Since next weekend is FOSDEM I think I'll do it the week >after. So ~4 February maybe. 
> >Feedback is welcome, I'm kind of nervous as this is a big release with >many changes. But this will open the ability to add CSP and other >security headers / 2FA for login. > >After those changes the biggest feature wanted is probably automatic >out >of date flagging of packages, which could be achieved with repology or >another API. But repology would be preferable since Archweb does not >know about source urls :-) (input welcome) Is there a URL for the staging server? -- Best, polyzen From jelle at vdwaa.nl Fri Jan 25 15:22:51 2019 From: jelle at vdwaa.nl (Jelle van der Waa) Date: Fri, 25 Jan 2019 16:22:51 +0100 Subject: [arch-devops] Archweb Python 3 update In-Reply-To: References: <20190125151338.d4gdthm3roxjwfax@gmail.com> Message-ID: <20190125152250.mez6fq3ywarcaeo4@gmail.com> On 01/25/19 at 10:19am, Daniel M. Capella via arch-devops wrote: > On January 25, 2019 10:13:39 AM EST, Jelle van der Waa wrote: > >Hi All, > > > >After nearly a year? I've finally deployed the Python 3 version of > >Archweb with Django 2.1.5 on nymeria (staging server). So far I haven't > >found any issues but I will do more testing over the weekend. > > > >I want to merge the Python 3 branch to master soon and switch > >archlinux.org. Since next weekend is FOSDEM I think I'll do it the week > >after. So ~4 February maybe. > > > >Feedback is welcome, I'm kind of nervous as this is a big release with > >many changes. But this will open the ability to add CSP and other > >security headers / 2FA for login. > > > >After those changes the biggest feature wanted is probably automatic > >out > >of date flagging of packages, which could be achieved with repology or > >another API. But repology would be preferable since Archweb does not > >know about source urls :-) (input welcome) > > Is there a URL for the staging server? 
https://archweb-dev.archlinux.org -- Jelle van der Waa From grazzolini at archlinux.org Fri Jan 25 15:36:44 2019 From: grazzolini at archlinux.org (Giancarlo Razzolini) Date: Fri, 25 Jan 2019 13:36:44 -0200 Subject: [arch-devops] Archweb Python 3 update In-Reply-To: <20190125151338.d4gdthm3roxjwfax@gmail.com> References: <20190125151338.d4gdthm3roxjwfax@gmail.com> Message-ID: <1548430499.3cftfr8ihh.astroid@arch.razzolini.com.br.none> On January 25, 2019 13:13, Jelle van der Waa wrote: > Hi All, > > After nearly a year? I've finally deployed the Python 3 version of > Archweb with Django 2.1.5 on nymeria (staging server). So far I haven't > found any issues but I will do more testing over the weekend. > > I want to merge the Python 3 branch to master soon and switch > archlinux.org. Since next weekend is FOSDEM I think I'll do it the week > after. So ~4 February maybe. > > Feedback is welcome, I'm kind of nervous as this is a big release with > many changes. But this will open the ability to add CSP and other > security headers / 2FA for login. > > After those changes the biggest feature wanted is probably automatic out > of date flagging of packages, which could be achieved with repology or > another API. But repology would be preferable since Archweb does not > know about source urls :-) (input welcome) > Great news! I'll begin testing immediately on nymeria. From what I saw on the branch, it should be fine though. Regards, Giancarlo Razzolini -------------- next part -------------- A non-text attachment was scrubbed...
Name: not available Type: application/pgp-signature Size: 870 bytes Desc: not available URL: From bluewind at xinu.at Fri Jan 25 16:52:09 2019 From: bluewind at xinu.at (Florian Pritz) Date: Fri, 25 Jan 2019 17:52:09 +0100 Subject: [arch-devops] Let's get a big build box In-Reply-To: References: <845e7ebf-54e0-a6d3-771e-6a6a31d8e355@archlinux.org> <20190123105545.GI7545@xinu.at> <7dcc6807-9086-10a4-3945-ab05776a8b33@crerar.io> Message-ID: <20190125165209.GN7545@xinu.at> On Fri, Jan 25, 2019 at 02:30:43PM +0100, Sven-Hendrik Haase via arch-devops wrote: > In that case, I put forth the next best contender, the Hetzner > AX160-NVMe at 164€/month base price. That's certainly a much more realistic option, but I'm still not sure if we really need it. If I look at the CPU graph of soyuz for the last month, I see a lot of idle time. There's a base load from quassel/matrix, which should really be moved elsewhere (a Hetzner cloud VM maybe?), and the occasional peak, but I don't really see us needing a bigger machine just yet. I see the build server more as a support machine in case a packager doesn't have a suitable build machine themselves or if their network connection is too slow to upload the packages. For that purpose I'd say the load that soyuz has is perfectly fine and no upgrade is required. That said, I know that you want a faster machine for your big packages. Since I don't have any packages like that personally, I don't have a strong opinion here. Also, I fear that if we have a really beefy machine, it might attract more attention from packagers with slower machines and therefore it might be more loaded than what we have now. I mean, who in their right mind wouldn't want to build on the fancy, new, super-fast build server where the same build takes only 1/4 of the time? I'd rather have a second machine similar to soyuz so that we can allow more people to build at the same time without stepping on each other's toes.
Then again, we do have sgp.pkgbuild.com and we could probably convert 1-3 more machines if needed. I agree that these machines are "slow", but, to some degree, I see that as a good thing.

I hope this explanation makes sense. If not, feel free to tell me.

Florian

From baptiste at bitsofnetworks.org  Sun Jan 27 21:37:23 2019
From: baptiste at bitsofnetworks.org (Baptiste Jonglez)
Date: Sun, 27 Jan 2019 22:37:23 +0100
Subject: [arch-devops] [arch-dev-public] Uploading old packages on archive.org (Was: Archive cleanup)
In-Reply-To: <20190124094827.GK7545@xinu.at>
References: <0b8c1b38-0e18-a97f-feac-f1cfbf98fed5@xinu.at> <20180601172445.GA31354@lud.localdomain> <087a44ca-6b7d-55a8-0c81-d95d961c2be6@xinu.at> <20180609233554.GB12475@tuxmachine.localdomain> <20190123164914.GJ7545@xinu.at> <20190124082723.GA2374@lud.localdomain> <20190124094827.GK7545@xinu.at>
Message-ID: <20190127213723.GA24749@lud.localdomain>

On 24-01-19, Florian Pritz wrote:
> > What about uploading to archive.org as soon as we archive packages on orion?
> >
> > https://github.com/archlinux/archivetools/blob/master/archive.sh
>
> While we still use this archive.sh script, dbscripts has recently also been extended to populate the archive continuously. So uploading could be integrated there with a queue file and a background job that performs the upload.
>
> Alternatively, the uploader could be kept standalone and just adapted to run more often and to maintain its own database/list to know which packages have already been successfully uploaded and which haven't. I'll call this the "state database". Then we could run it every hour or so via a systemd timer and it could upload all new and all failed packages.
One > thing I'd want to have in this context is that the uploader should exit > with an error to let the systemd service fail if a package fails to > upload multiple times. I think I'd actually prefer this to be standalone > for simplicity. There is one argument against a standalone tool: each time it runs, it will need to scan the whole filesystem hierarchy to detect new packages, which can be quite slow. One solution is to have dbscripts build a queue of new packages to upload, but then the upload tool would not be completely standalone (it's basically your first solution above). A simpler but less robust way would be to scan only the current year (along with the previous year for a while). Other than this issue, it indeed looks like a good idea to clearly separate this tool from the dbscripts. > > In any case, we need a retry mechanism to cope with the case where the > > upload fails. > > This could use the state database I mentioned above. As for the > implementation of such a database, I'd suggest sqlite instead of rolling > your own text based list or whatever. It's fast and simple, but you get > all the fancy stuff, like transactions, for free. You also don't have to > deal with recovering the database if the script crashes. sqlite just > rolls back uncommited transactions for you. > > Would you be interested in adapting the uploader like this and making it > an automated service? If you're interested I can help with the > deployment part and provide feedback on the scripting side. If you want, > we can also discuss this on IRC. I don't have a lot of time to work on this at the moment, but I'll see what I can do. How urgent is the cleanup on orion? Is it ok to do it in a few weeks/months? > PS: I've whitelisted you on the arch-devops ML so that your replies also > get archived. Ok, thanks! Baptiste -------------- next part -------------- A non-text attachment was scrubbed... 
From bluewind at xinu.at  Sun Jan 27 22:18:47 2019
From: bluewind at xinu.at (Florian Pritz)
Date: Sun, 27 Jan 2019 23:18:47 +0100
Subject: [arch-devops] [arch-dev-public] Uploading old packages on archive.org (Was: Archive cleanup)
In-Reply-To: <20190127213723.GA24749@lud.localdomain>
References: <0b8c1b38-0e18-a97f-feac-f1cfbf98fed5@xinu.at> <20180601172445.GA31354@lud.localdomain> <087a44ca-6b7d-55a8-0c81-d95d961c2be6@xinu.at> <20180609233554.GB12475@tuxmachine.localdomain> <20190123164914.GJ7545@xinu.at> <20190124082723.GA2374@lud.localdomain> <20190124094827.GK7545@xinu.at> <20190127213723.GA24749@lud.localdomain>
Message-ID: <20190127221847.GA14187@xinu.at>

On Sun, Jan 27, 2019 at 10:37:23PM +0100, Baptiste Jonglez wrote:
> There is one argument against a standalone tool: each time it runs, it will need to scan the whole filesystem hierarchy to detect new packages, which can be quite slow.

You can focus on the /srv/archive/packages/* directories. I've just run find on those and, once cached, it takes about 0.33 seconds. Uncached, it's slightly slower, but still below 5 seconds. (I forgot to redirect the output the first time; printing the list makes it quite a bit slower.)

> A simpler but less robust way would be to scan only the current year (along with the previous year for a while).

The ./packages/ subtree contains all unique packages, no matter when they were released. If you just record the filenames of all packages that have already been uploaded, you can easily detect new ones; I don't see a need to scan each of the year/month/day trees. Also, the README in your repo already uses the packages/ tree and does not scan the other directories.

Right now, there are a little under 500k packages and signatures in the packages tree, so that's 250k package filenames you'd need to check against the database.
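The scan described here (collect package file names from the packages/ pool, ignore detached signatures) could be sketched as follows; the directory layout is taken from the discussion, and the helper itself is only an illustration:

```python
import os

# Rough sketch of the filesystem side of the uploader: walk the
# pool-style packages/ subtree and collect package file names,
# skipping the detached .sig files. Purely illustrative.

def scan_packages(root):
    """Return the set of package file names found under root."""
    names = set()
    for _dirpath, _dirnames, filenames in os.walk(root):
        names.update(f for f in filenames if not f.endswith(".sig"))
    return names
```

The resulting set is what would be diffed against the state database to find packages that still need uploading.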
I'll ignore signatures and assume that we only add the package file name once the signature has also been uploaded.

I've done some very quick testing against an sqlite db with 2.5M rows. Running a select statement that searches for matches with a set of 10 strings (some of which never match) completed in ~0.2ms. Multiplied by 25k (250k / 10, since we have batches of 10 strings), that's roughly 5 seconds. You will probably get better performance with a smaller database and with larger batches of, say, 100 file names, so I'd say that's perfectly fine. I've also tried matching only a single path, which took slightly under 0.2ms. With a batch of 100 strings it took 0.6ms, which puts the total around 1-2 seconds.

If you need to further reduce the number of db queries, you could also check the modification time of the files and skip the database check for files that are older than some cutoff (say, 1 month). I'd advise against this though, unless it's really necessary.

> How urgent is the cleanup on orion? Is it ok to do it in a few weeks/months?

Looking at your script, I see that you already seem to have uploaded 2016, is that correct? In that case we could go ahead and remove those packages to buy us some more time (probably another year). Last time we only removed up to 2015.

Florian
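The batched lookup benchmarked here (matching file names against the database in groups rather than one query each) might look like the following sketch; the `uploaded` table and all names are invented for illustration:

```python
import sqlite3

# Sketch of the batched membership check: look up file names against
# the state database in batches (the discussion measures batches of
# 10 and 100 strings). Table and function names are hypothetical.

def unseen(db, filenames, batch=100):
    """Return the file names that have no row in the uploaded table."""
    missing = []
    for i in range(0, len(filenames), batch):
        chunk = filenames[i:i + batch]
        placeholders = ",".join("?" * len(chunk))
        rows = db.execute(
            "SELECT filename FROM uploaded WHERE filename IN (%s)" % placeholders,
            chunk)
        seen = {row[0] for row in rows}
        missing.extend(f for f in chunk if f not in seen)
    return missing

# Tiny in-memory example database.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE uploaded (filename TEXT PRIMARY KEY)")
db.executemany("INSERT INTO uploaded VALUES (?)",
               [("a.pkg.tar.xz",), ("b.pkg.tar.xz",)])
```

With a batch size of 100, the 250k file names discussed above would need about 2500 queries, consistent with the 1-2 second estimate in the message.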