[arch-dev-public] [RFC] Moving repos to nymeria
Hi,

So we got a new box (nymeria) and I'd like to move core/extra/community/multilib/testing/.. repos + svn over there.

Setup overview / changes:

- create shell accounts for every user, but only allow certain commands to be executed (dbscripts, rsync)
- move the svn2git conversion script to nymeria and let gudrun sync the repo periodically for cgit
- let archweb sync the needed database files periodically
- integrity check will run on nymeria
- postfix for @archlinux.org and @aur.archlinux.org: see below [postfix]
- did I miss something?

Benefits:

- more trustful/locked-down system (could be useful for db signing)
- 1TB of disk space (~900GiB for packages)
- 100Mbit/s uplink
- all packages on one box, so if we do a big move, extra and community can be synced without admin intervention, if dbscripts supports that
- gerolde won't run much (anything?) any more, so it could potentially be merged back into gudrun/host system

Drawbacks (kind of):

- no more shell accounts for browsing the repo (brynhild can be used for that)
- different network latency (gudrun is located in the US, nymeria in Germany)
- users can no longer <s>mess up</s> change the repo db manually (no idea if that's still valid, but it happened a few years ago)

[postfix]: We can move both domains to nymeria and let users change the forward destination themselves (we need to make sure users can't run arbitrary commands), or just appoint an admin who takes care of changing the destination, since that shouldn't happen too often. In the second case we can keep them on gudrun/sigurd or move them wherever we want.

Comments welcome.

--
Florian Pritz
On 2012-09-06 11:05, Florian Pritz wrote:
Hi,
So we got a new box (nymeria) and I'd like to move core/extra/community/multilib/testing/.. repos + svn over there.
Setup overview / changes:

- create shell accounts for every user, but only allow certain commands to be executed (dbscripts, rsync)
- move the svn2git conversion script to nymeria and let gudrun sync the repo periodically for cgit
- let archweb sync the needed database files periodically
- integrity check will run on nymeria
- postfix for @archlinux.org and @aur.archlinux.org: see below [postfix]
- did I miss something?
Benefits:

- more trustful/locked-down system (could be useful for db signing)
- 1TB of disk space (~900GiB for packages)
- 100Mbit/s uplink
- all packages on one box, so if we do a big move, extra and community can be synced without admin intervention, if dbscripts supports that
- gerolde won't run much (anything?) any more, so it could potentially be merged back into gudrun/host system
Drawbacks (kind of):

- no more shell accounts for browsing the repo (brynhild can be used for that)
- different network latency (gudrun is located in the US, nymeria in Germany)
- users can no longer <s>mess up</s> change the repo db manually (no idea if that's still valid, but it happened a few years ago)
[postfix]: We can move both domains to nymeria and let users change the forward destination themselves (we need to make sure users can't run arbitrary commands), or just appoint an admin who takes care of changing the destination, since that shouldn't happen too often. In the second case we can keep them on gudrun/sigurd or move them wherever we want.
Comments welcome.
Could we run sogrep on nymeria? Also, could you please explain why browsing the repo in a shell account will be disabled? I found this very useful when moving a large number of packages from staging/testing to extra/core.

Regards,
Stéphane
On 06.09.2012 17:23, Stéphane Gaudreault wrote:
Could we run sogrep on nymeria ?
I don't really see a benefit there. You can already run it on brynhild, and sogrep needs a database which is updated via a cron job, so you probably won't even see a difference in update latency between the two.
Also, could you please explain why browsing the repo in a shell account will be disabled ? I found this very useful when moving a large number of packages from staging/testing to extra/core.
The idea is to reduce the possible damage an attacker can cause if he happens to obtain a dev's/TU's ssh key. Without a shell and with only a few whitelisted commands, the box should be very safe. That allows us to use a server-stored signing key for the database without having to worry about someone using a kernel exploit and gaining access to the key.

sftp will still be available, so if all you want is a file list you can use that. You can also run "sudo syncrepo" on brynhild to force a sync at any time and then browse there.

--
Florian Pritz
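To make the whitelisting concrete, here is a minimal sketch of how it could be done with an SSH forced command. The wrapper path and the exact allowed command list are assumptions for illustration, not the actual nymeria setup:

#!/bin/bash
# Hypothetical wrapper, set via command="/usr/local/bin/repo-shell"
# in each packager's authorized_keys entry, so every login runs it.
set -eu
cmd=${SSH_ORIGINAL_COMMAND:-}
case $cmd in
    db-update|"db-move "*|"db-remove "*)
        exec $cmd ;;                  # dbscripts entry points
    "rsync --server "*)
        exec $cmd ;;                  # server side of an rsync transfer
    /usr/lib/ssh/sftp-server)
        exec "$cmd" ;;                # sftp (exact string depends on sshd's Subsystem config)
    *)
        echo "command not allowed: $cmd" >&2
        exit 1 ;;
esac

A real wrapper would also have to validate the arguments; the word splitting on $cmd is intentional here, but unchecked.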
[2012-09-06 17:39:03 +0200] Florian Pritz:
The idea is to reduce the possible damage an attacker can cause if he happens to obtain a dev's/TU's ssh key. Without a shell and with only a few whitelisted commands, the box should be very safe. That allows us to use a server-stored signing key for the database without having to worry about someone using a kernel exploit and gaining access to the key.
Did we abandon the idea of having packagers download the old DB, check its signature, make changes to it, sign the new DB, and upload it back? Because I would certainly find this much safer and more trustworthy than having a black-box server blindly sign anything it is given.

And I would also find it too bad to lose the flexibility that actual non-root Linux accounts give, such as being able to fix things ourselves when they go wrong (like when pushing to the wrong repo).

Cheers.

--
Gaetan
On Thu, Sep 6, 2012 at 12:46 PM, Gaetan Bisson <bisson@archlinux.org> wrote:
[2012-09-06 17:39:03 +0200] Florian Pritz:
The idea is to reduce the possible damage an attacker can cause if he happens to obtain a dev's/TU's ssh key. Without a shell and with only a few whitelisted commands, the box should be very safe. That allows us to use a server-stored signing key for the database without having to worry about someone using a kernel exploit and gaining access to the key.
Did we abandon the idea of having packagers download the old DB, check its signature, make changes to it, sign the new DB, and upload it back? Because I would certainly find this much safer and more trustworthy than having a black-box server blindly sign anything it is given.
Agree.
And I would also find it too bad to lose the flexibility actual non-root Linux accounts give, such as being able to fix things ourselves when they go wrong (like when pushing to the wrong repo).
What will happen to our personal web space? And what about /srv/ftp/other/? Will they move to the new server? If so, we'll need to whitelist enough commands so we can use them without it being a PITA. Could you give us a more detailed list of the commands that will be allowed? I'm concerned that the shell would become so crippled that it would be practically unusable.

Eric
Cheers.
-- Gaetan
On 06.09.2012 19:18, Eric Bélanger wrote:
On Thu, Sep 6, 2012 at 12:46 PM, Gaetan Bisson <bisson@archlinux.org> wrote:
[2012-09-06 17:39:03 +0200] Florian Pritz:
The idea is to reduce the possible damage an attacker can cause if he happens to obtain a dev's/TU's ssh key. Without a shell and with only a few whitelisted commands, the box should be very safe. That allows us to use a server-stored signing key for the database without having to worry about someone using a kernel exploit and gaining access to the key.
Did we abandon the idea of having packagers download the old DB, check its signature, make changes to it, sign the new DB, and upload it back? Because I would certainly find this much safer and more trustworthy than having a black-box server blindly sign anything it is given.
Agree.
And I would also find it too bad to lose the flexibility actual non-root Linux accounts give, such as being able to fix things ourselves when they go wrong (like when pushing to the wrong repo).
Pierre said that we should support using devtools inside screen (db-move can take quite long), and screen allows running other commands, so limiting the shells doesn't seem possible right now.

Limiting the shell creates a trusted server, which makes signing the databases much more secure because, even if we use remote signing, the hash is calculated on the server. I can understand it either way and I don't care whether we limit them or not, so I'm not going to argue about that.

--
Florian Pritz
On 15.09.2012 23:24, Florian Pritz wrote:
Pierre said that we should support using devtools inside screen (db-move can take quite long), and screen allows running other commands, so limiting the shells doesn't seem possible right now.
It's dbscripts actually. As packages are signed, an attacker cannot inject any code. We should isolate svn, though. A shell account with limited permissions (no direct write access to the repos or svn) should be secure enough then. Maybe one day we will reimplement the whole process, but this won't be done anytime soon.
Limiting the shell creates a trusted server, which makes signing the databases much more secure because, even if we use remote signing, the hash is calculated on the server.
We do not sign databases anyway atm. And imho we shouldn't do it until it's possible to tell pacman to trust certain keys only for the database. Then the worst case would be a replay attack, which we would detect. Using our packager keys to sign something that is calculated on the server is a bad idea. The server cannot be trusted and our setup should be based on that fact. But this might go off-topic. Right now we don't sign databases and we don't have a finished concept for this. So I'd say keep this in mind, but let us not be limited by it.

Back to the actual topic: the community repo should be moved from sigurd as we are running out of disk space. It is also beneficial to have the dev and TU repos on the same server. Therefore an easy solution would be:

* have shell accounts for every dev and TU
* maybe review our group setup
* package files and svn files cannot be accessed by these accounts. Use some sudo and dedicated-user magic here so that only dbscripts can write packages and the svn repo can only be accessed via an svn client.

We can have a more advanced setup later.

Greetings,

Pierre

--
Pierre Schmitz, https://pierre-schmitz.com
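A rough sketch of that sudo and dedicated-user separation, with made-up user, group, and path names (assume a 'repo' user is the only account allowed to write to the package pool):

# /etc/sudoers.d/dbscripts -- hypothetical
# devs and TUs may run the dbscripts entry points only as 'repo'
%dev, %tu ALL = (repo) NOPASSWD: /usr/local/bin/db-update, \
                                 /usr/local/bin/db-move *, \
                                 /usr/local/bin/db-remove *

# filesystem side: only the dedicated owners can write
# chown -R repo:repo /srv/ftp
# chown -R svn:svn /srv/svn

The packager accounts would then call e.g. "sudo -u repo db-update" and never touch the files directly.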
On 16.09.2012 00:29, Pierre Schmitz wrote:
* maybe review our group setup
One group per repo or what do you mean?
* package files and svn files cannot be accessed by these accounts. Use some sudo and dedicated-user magic here so that only dbscripts can write packages and the svn repo can only be accessed via an svn client.
I've looked into that, and all I found was that you "should" use ssh forced commands together with separate keys. AFAIK it is not possible to tell svn to run a different command than "svnserve -t" when connected via ssh. It might be possible to use a simple forced-command wrapper that just traps svnserve and executes it with sudo. I haven't checked if that works with interactive shells.
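Something along these lines might work (an untested sketch; the dedicated 'svn' user and repository root are assumptions):

#!/bin/bash
# Forced-command wrapper that only lets the svn client through.
# An svn+ssh:// client always asks the remote side for "svnserve -t".
case ${SSH_ORIGINAL_COMMAND:-} in
    "svnserve -t")
        # run it as a dedicated user so packager accounts have no
        # direct write access to the repository files
        exec sudo -u svn /usr/bin/svnserve -t -r /srv/svn ;;
    *)
        echo "only svn access is allowed" >&2
        exit 1 ;;
esac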
We can have a more advanced setup later.
Good idea.

--
Florian Pritz
[2012-09-15 23:24:57 +0200] Florian Pritz:
Did we abandon the idea of having packagers download the old DB, check its signature, make changes to it, sign the new DB, and upload it back? Because I would certainly find this much safer and more trustworthy than having a black-box server blindly sign anything it is given.
Limiting the shell creates a trusted server, which makes signing the databases much more secure because, even if we use remote signing, the hash is calculated on the server.
Do we really need remote signing for the DB, given that each of us already downloads the DB when upgrading, most likely several times a day? I do not think downloading it a couple more times when pushing packages will change much. Then I see no need to trust the server: I download the current DB and its signature, check it (it's by Florian P, and of course I trust him), apply my changes, sign, and upload it back.

--
Gaetan
On 16/09/12 15:59, Gaetan Bisson wrote:
(it's by Florian P, and of course I trust him)
Well, there goes the point of failure right there!

Remember there is also the files.db, which is ~10x the size. I guess we should sign that too?

Allan
On Sun, Sep 16, 2012 at 7:59 AM, Gaetan Bisson <bisson@archlinux.org> wrote:
Do we really need remote signing for the DB, given that each of us already downloads the DB when upgrading, most likely several times a day? I do not think downloading it a couple more times when pushing packages will change much. Then I see no need to trust the server: I download the current DB and its signature, check it (it's by Florian P, and of course I trust him), apply my changes, sign and upload back.
I want to avoid anything that requires me to upload the DB from my computer. Reason: http://www.speedtest.net/result/2173792066.png

That would be over 7MB that I would have to download and upload for every operation on the [extra] repo.
On 16.09.2012 08:34, Jan Steffens wrote:
On Sun, Sep 16, 2012 at 7:59 AM, Gaetan Bisson <bisson@archlinux.org> wrote:
Do we really need remote signing for the DB, given that each of us already downloads the DB when upgrading, most likely several times a day? I do not think downloading it a couple more times when pushing packages will change much. Then I see no need to trust the server: I download the current DB and its signature, check it (it's by Florian P, and of course I trust him), apply my changes, sign and upload back.
I want to avoid anything that requires me to upload the DB from my computer. Reason: http://www.speedtest.net/result/2173792066.png

That would be over 7MB that I would have to download and upload for every operation on the [extra] repo.
Exactly, this is not an option. Also remember that we need to lock the db during that time so nobody else can modify it. Transactions are also way harder to handle; what if the upload fails, etc.

So imho both remote signing and re-uploading the db files are a no-go.

Greetings,

Pierre

--
Pierre Schmitz, https://pierre-schmitz.com
On 16.09.2012 08:34, Jan Steffens wrote:
I want to avoid anything that requires me to upload the DB from my computer.
[...]
That would be over 7MB that I would have to download and upload
Would we really need to sign the full 7MB database? Could we not come up with something more minimal to sign that would still be sufficient? Alternatively, we need to get Jan a better connection :)

On Sun, Sep 16, 2012 at 9:47 AM, Pierre Schmitz <pierre@archlinux.de> wrote:
Exactly, this is not an option. Also remember that we need to lock the db during that time so nobody else can modify it. Transactions are also way harder to handle; what if the upload fails, etc.
We don't need to lock the database for the duration of the download/sign/upload. We could simply:

* check the timestamp of the old database
* download the database
* check the old signature
* update the database and sign the new version
* upload the database
* lock the database on the server
* check if the timestamp has changed
* if yes, release the lock and start from scratch
* if no, overwrite it with your new version and release the lock

This means that you might need to retry once or twice if more than one person is updating the database, so it does not scale that well. However, we are not that many people and we don't update the database that often, so the chance of actually getting a conflict is low (and the additional cost is not that high either).

-t
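A sketch of that loop from the packager's side, with made-up host, path, and file names (the server-side lock is taken with flock purely for illustration):

#!/bin/bash
set -e
host=repos.example.org          # hypothetical
db=extra.db.tar.gz
while :; do
    stamp=$(ssh "$host" stat -c %Y "$db")       # timestamp of old db
    scp "$host:$db" "$host:$db.sig" .           # fetch db and old sig
    gpg --verify "$db.sig" "$db"                # check the old signature
    repo-add "$db" foo-1.0-1-x86_64.pkg.tar.xz  # update ...
    gpg --yes --detach-sign "$db"               # ... and sign new version
    scp "$db" "$db.sig" "$host:incoming/"       # upload out of the way
    # take the lock, then swap our version in only if nobody was faster
    if ssh "$host" "flock /srv/db.lock -c \
        'test \$(stat -c %Y $db) = $stamp && \
         mv incoming/$db incoming/$db.sig .'"; then
        break
    fi
    # timestamp changed under us: someone else won the race, start over
done

Comparing mtimes is just the simplest stand-in for "has the db changed"; a content hash would be more robust.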
Tom Gundersen wrote:
On 16.09.2012 08:34, Jan Steffens wrote:
I want to avoid anything that requires me to upload the DB from my computer.
[...]
That would be over 7MB that I would have to download and upload
Why can't the following procedure be used?

1) update the database on the server
2) download it
3) check it and sign it
4) upload the signature
5) check that the signature matches on the server

The database would only need to be locked during step 1. If user B updates it while user A is in the process of signing it, step 5 will ensure that the uploaded signature from user A is rejected and that user B's signature is kept, even if user B manages to upload a signature before user A.

Advantages:
* no complicated locking
* local signing (i.e. no keys on server)
* minimal upload
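A sketch of steps 2) to 5) with made-up names; step 5 runs server-side and assumes the server has the packagers' public keys, which it needs anyway to reject bogus signatures:

#!/bin/bash
set -e
host=repos.example.org          # hypothetical
db=extra.db.tar.gz
scp "$host:$db" .                         # 2) download the updated db
#                                           3) "check it", then sign it
gpg --yes --detach-sign "$db"
scp "$db.sig" "$host:incoming/"           # 4) upload only the signature
# 5) install the sig only if it matches the *current* db, so a
#    signature for an already-superseded db is simply discarded
ssh "$host" "gpg --verify incoming/$db.sig $db \
             && mv incoming/$db.sig $db.sig"

The nice property is the one described above: a stale signature fails the verify in step 5 and is dropped, so no ordering of uploads can clobber a newer signature.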
Would we really need to sign the full 7MB database? Could we not come up with something more minimal to sign that would still be sufficient? Alternatively, we need to get Jan a better connection :)
This is something that came up before when discussing signing of the [haskell] repo. The problem there is that Magnus builds the packages remotely and just doesn't have the bandwidth to download the entire repo and sign it. We found no solution, because the only way to verify the integrity of the file is to check the entire file. Anything generated on the server (e.g. a list of checksums) could be compromised if an attacker managed to gain access. Security costs bandwidth. There does not seem to be any way around it.
We don't need to lock the database for the duration of the download/sign/upload. We could simply:
* check the timestamp of the old database
* download the database
* check the old signature
* update the database and sign the new version
* upload the database
* lock the database on the server
* check if the timestamp has changed
* if yes, release the lock and start from scratch
* if no, overwrite it with your new version and release the lock
This means that you might need to retry once or twice if more than one person is updating the database, so it does not scale that well. However, we are not that many people and we don't update the database that often, so the chance of actually getting a conflict is low (and the additional cost is not that high either).
The procedure that I outlined above should avoid these conflicts altogether. The signature that matches the most recent version of the database wins, regardless of the order of upload. It should work as long as the database is locked while it is updated on the server.

Regards,
Xyne
On 16/09/12 23:56, Xyne wrote:
Tom Gundersen wrote:
On 16.09.2012 08:34, Jan Steffens wrote:
> I want to avoid anything that requires me to upload the DB from my computer.
[...]
> That would be over 7MB that I would have to download and upload

Why can't the following procedure be used?
1) update the database on the server
2) download it
3) check it and sign it
4) upload the signature
5) check that the signature matches on the server
The database would only need to be locked during step 1. If user B updates it while user A is in the process of signing it, step 5 will ensure that the uploaded signature from user A is rejected and that user B's signature is kept, even if user B manages to upload a signature before user A.
Advantages:
* no complicated locking
* local signing (i.e. no keys on server)
* minimal upload
What does "check it and sign it" mean? Diff it to the old and signed database? Anyway, I think it would need locked throughout. If B updates the database while A is uploading, that is not different to bad guy C adjusting the database and leaving it for someone to sign on the next addition. The only way to maintain what would be a chain of trust - where we can link each database update to the previous database - is to have the current db signature checked before adding the new packages and resigning. Worst case scenario is that you move stuff from [testing] to [core] and [extra] so you need to download three databases - probably less that 2MB in total and then upload three signatures. I am ignoring signing the .files databases...
Allan McRae wrote:
What does "check it and sign it" mean? Diff it to the old and signed database?
By "check it" I mean check that each signature in the database is authentic and trusted, and that every package in the database is signed. I thought there was an easy way to verify each signature's authenticity without also verifying the file's integrity, i.e. confirm that foo.sig was indeed created by user x without caring if it matches foo (pacman handles that). Looking at the command-line options for gpg I do not see any way to do this directly, but that information is contained in the file, e.g. $ wget foo.sig $ touch foo $ gpg --verify foo.sig gpg: Signature made ... using RSA key ID ... gpg: BAD signature from ... The ID and other data can also be dumped using pgpdump (pgpdump-git in AUR). It should be possible to write a simple tool to extract the key ID from each signature (e.g. using gpgme or a wrapper shell script). As long as each file in the database is or appears to be signed by a trusted key, it should be secure. Pacman will check each signature during installation. Even if the signature ID was somehow forged, the integrity check should fail. (If valid signatures can be forged then the whole system is useless anyway.) This approach will obviously involve some overhead as the ID of each signature will need to be extracted and checked, but that should not be significant compared to the overhead of package building. The advantage that I see in this approach versus the one below is that you do not need to maintain a chain of trust. Each database version is verified independently. As mentioned, there is no locking either.
Anyway, I think it would need to be locked throughout. If B updates the database while A is uploading, that is no different from bad guy C adjusting the database and leaving it for someone to sign on the next addition. The only way to maintain what would be a chain of trust - where we can link each database update to the previous database - is to have the current db signature checked before adding the new packages and re-signing.
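For what it's worth, the key ID can indeed be pulled out of a detached signature without the signed data; a rough sketch of such a wrapper script (trusted-keyids.txt is a made-up whitelist file, and the --list-packets output format varies between gpg versions):

#!/bin/bash
# Flag every signature in the unpacked db whose key ID is not whitelisted.
for sig in */*.sig; do
    keyid=$(gpg --list-packets "$sig" \
            | sed -n 's/.*keyid \([0-9A-Fa-f]*\).*/\1/p' | head -n1)
    [ -n "$keyid" ] && grep -qix "$keyid" trusted-keyids.txt ||
        echo "untrusted signature: $sig (keyid $keyid)" >&2
done

As noted above, this only proves who made each signature, not that the signatures match the files; pacman does the integrity check later.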
[2012-09-16 16:03:19 +0000] Xyne:
By "check it" I mean check that each signature in the database is authentic and trusted, and that every package in the database is signed.
Signing the DB serves a completely different purpose from all the signatures on its packages.

--
Gaetan
Gaetan Bisson wrote:
[2012-09-16 16:03:19 +0000] Xyne:
By "check it" I mean check that each signature in the database is authentic and trusted, and that every package in the database is signed.
Signing the DB serves a completely different purpose to all the signatures on its packages.
I see now that what I proposed would not ensure the integrity of package metadata such as dependencies.

What about individually signing the metadata of each package in the database when a package is added? The packaging procedure would then be:

1) build and sign package locally
2) generate and sign "depends", "desc", etc. files locally
3) upload package and signatures to server
4) add package and signatures to (locked) database on server
5) download database
6) check metadata signatures
7) sign database and upload signature

Cons:
* redundant generation of metadata files
* more data in database

Pros:
* database integrity can be checked without having to rebuild it locally

To clarify, with a chain of trust you need a trusted starting point. That means that someone has to verify all of the package signatures and then locally rebuild the database from scratch. If there is ever a doubt that the chain has been broken (due to malice, carelessness in updates, whatever) then that needs to be repeated. Signing per-package metadata should avoid that.

The metadata signatures could be kept out of the database if space is an issue, but each packager would need to download them to check the database in that case. If they are kept in the database then signing the database file itself may be unnecessary. Pacman could verify the integrity of the metadata for each package when it downloads the database.
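A sketch of step 2), using repo-add locally to produce the same entry files the server would generate (the package name and db entry layout are illustrative):

#!/bin/bash
# Build the db entry for one package locally and sign its metadata,
# so the server only ever receives pre-signed desc/depends files.
set -e
pkg=foo-1.0-1-x86_64.pkg.tar.xz
repo-add staging.db.tar.gz "$pkg"          # generates the entry files
for f in desc depends; do
    bsdtar -xOf staging.db.tar.gz "foo-1.0-1/$f" > "$f"
    gpg --yes --detach-sign "$f"           # ship $f.sig with the package
done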
Xyne wrote:
If they are kept in the database then signing the database file itself may be unnecessary. Pacman could verify the integrity of the metadata for each package when it downloads the database.
Adding to that idea, pacman currently verifies database signatures each time it is run. If the metadata sigs were included in the database then pacman could do the following:

1) check for a matching valid sig for each database
2) if no valid sig, check the metadata sigs in the db
3) if all metadata sigs are valid, sign the database with the local key, else die
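Sketched in shell for clarity; verify_metadata_sigs is a hypothetical stand-in for the per-entry checking described above, which pacman would do internally:

#!/bin/bash
db=extra.db
if gpg --verify "$db.sig" "$db" 2>/dev/null; then
    :                                   # 1) existing signature is valid
elif verify_metadata_sigs "$db"; then   # 2) hypothetical per-entry check
    gpg --yes --detach-sign "$db"       # 3) vouch for it with our own key
else
    echo "no trusted signature for $db" >&2
    exit 1
fi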
[2012-09-16 23:33:39 +0000] Xyne:
I see now that what I proposed would not ensure the integrity of package metadata such as dependencies.
As the metadata is found within packages (.pkg.tar.xz), package signatures (.pkg.tar.xz.sig) ensure their integrity and, more importantly, authenticity.

The point of signing the DB is to prevent an attacker from distributing an outdated Arch package (properly signed by one of our packagers) which has a known vulnerability. For this, all we really need to sign is a list of unique identifiers for the most recent version of all packages in each repo. These identifiers could be the hash of each package, tuples ($pkgname, $pkgver, $pkgrel), etc. But of course it is more elegant to simply sign the DB. What matters is that an attacker cannot withhold one package without withholding all packages (by withholding the DB and its sig).

So, when an official packager updates the DB, to prevent an attacker with access to our servers from sneaking in an old version of some package, they really need to check that the DB was properly signed by another official packager before making changes and signing it themselves. That is the cryptographically secure way.

The other way which has been proposed is based on the assumption that some "hardened" server cannot be breached; then we push our changes to this server and rely on it for automatically signing the DB.

--
Gaetan
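To make the "list of unique identifiers" variant concrete, a sketch with the existing tools (the manifest format is made up):

#!/bin/bash
# One signed line per package: an attacker can then not withhold a
# single package update without also withholding the signed manifest.
set -e
bsdtar -xOf extra.db.tar.gz '*/desc' |
    awk '/^%FILENAME%$/ { getline; print }' |
    sort > extra.manifest
gpg --yes --detach-sign extra.manifest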