Working on SandboxUser patch
Just to avoid any potential overlap - I am working on the SandboxUser patch that Remi submitted a long while ago... https://lists.archlinux.org/archives/list/pacman-dev@lists.archlinux.org/thr... The patch is far too big for me to confidently review in one go, so I am splitting it into smaller sections as I go. Progress can be seen here: https://gitlab.archlinux.org/pacman/pacman/-/commits/allan/privsep Allan
Glad to see this one is doing the rounds again, one day we're going to have a bug in curl and this will help a lot. If you want any review from kernel side, please feel free to let me know. One thing that immediately strikes me is that it would be better to list the allowed syscalls rather than the denied ones. We're adding new syscalls all the time, after all, and that would make the list somewhat kernel version agnostic. It can always be turned off with a command line option in pacman, after all. Thanks, Chris
On Tue, Nov 08, 2022 at 01:41:46PM +0000, Chris Down wrote:
Glad to see this one is doing the rounds again, one day we're going to have a bug in curl and this will help a lot.
If you want any review from kernel side, please feel free to let me know.
One thing that immediately strikes me is that it would be better to list the allowed syscalls rather than the denied ones. We're adding new syscalls all the time, after all, and that would make the list somewhat kernel version agnostic. It can always be turned off with a command line option in pacman, after all.
I think you are looking at the wrong version. The first iteration has the syscall filtering, but this was dropped in v2 of the series :) I think the goal was to split it up a little and do the syscall filtering later. -- Morten Linderud PGP: 9C02FF419FECBE16
Morten Linderud writes:
On Tue, Nov 08, 2022 at 01:41:46PM +0000, Chris Down wrote:
Glad to see this one is doing the rounds again, one day we're going to have a bug in curl and this will help a lot.
If you want any review from kernel side, please feel free to let me know.
One thing that immediately strikes me is that it would be better to list the allowed syscalls rather than the denied ones. We're adding new syscalls all the time, after all, and that would make the list somewhat kernel version agnostic. It can always be turned off with a command line option in pacman, after all.
I think you are looking at the wrong version. The first iteration has the syscall filtering, but this was dropped in v2 of the series :)
I'm looking at the version linked by Allan :-) Is that not the version being worked on?
On Tue, Nov 08, 2022 at 01:53:43PM +0000, Chris Down wrote:
Morten Linderud writes:
On Tue, Nov 08, 2022 at 01:41:46PM +0000, Chris Down wrote:
Glad to see this one is doing the rounds again, one day we're going to have a bug in curl and this will help a lot.
If you want any review from kernel side, please feel free to let me know.
One thing that immediately strikes me is that it would be better to list the allowed syscalls rather than the denied ones. We're adding new syscalls all the time, after all, and that would make the list somewhat kernel version agnostic. It can always be turned off with a command line option in pacman, after all.
I think you are looking at the wrong version. The first iteration has the syscall filtering, but this was dropped in v2 of the series :)
I'm looking at the version linked by Allan :-) Is that not the version being worked on?
Allan sent a link to the thread itself with the discussion. The patch itself is this I believe https://lists.archlinux.org/archives/list/pacman-dev@lists.archlinux.org/mes... -- Morten Linderud PGP: 9C02FF419FECBE16
I have now split this patch into much smaller segments - it is much easier for me to review like this. I have also done some shuffling of where the code is and some variable/function renaming. For example, I moved a lot of the base sandbox functions/callbacks/etc into libalpm/sandbox.{h,c} and named them appropriately, as these should be used in the future to (e.g.) sandbox GPG operations. https://gitlab.archlinux.org/pacman/pacman/-/commits/allan/privsep There were some random changes to progress bar code in the original patch. These probably are needed to prevent the assertions I am getting in testing! Buy if you run with --debug or --noprogressbar, things appear to work! allan@mando /var/lib/pacman $ ls -l /var/lib/pacman/sync/ total 49476 -rw-r--r-- 1 root root 7544407 Nov 8 18:15 community.db -rw-r--r-- 1 alpm alpm 29240777 Nov 9 10:49 community.files -rw-r--r-- 1 root root 162479 Nov 8 08:20 core.db -rw-r--r-- 1 alpm alpm 1042583 Nov 9 09:39 core.files -rw-r--r-- 1 root root 1804717 Nov 8 17:56 extra.db -rw-r--r-- 1 alpm alpm 10861206 Nov 9 09:32 extra.files @Remi: can you check these and make sure you agree with the changes I made. You are listed as the author so all blame will head back to you :D @Andrew (or anyone): the main patch I'd like another set of eyes on is: "Add sandboxed download for the internal downloader". Cheers, Allan
On 9/11/22 12:13, Allan McRae wrote:
There were some random changes to progress bar code in the original patch. These probably are needed to prevent the assertions I am getting in testing! Buy if you run with --debug or --noprogressbar, things appear to work!
And they were! Everything appears to be working now. Allan
Am 09.11.22 um 03:13 schrieb Allan McRae:
@Andrew (or anyone): the main patch I'd like another set of eyes on is: "Add sandboxed download for the internal downloader".
I've read it over and it looks mostly good to me. I did not build or test, just read the code. Two things I noticed are: - The "sandboxing failed" log message is a lot nicer in [1] than in [2] (line 995 in dload.c), maybe they should match? - Also, the `done = true` before the break in the first loop of the parent might be redundant (line 1026 in dload.c [3]), since `done` isn't used after the loop. Maybe even all `done=true` could be breaks, as no loop code runs after them anyway? [1]: https://gitlab.archlinux.org/pacman/pacman/-/commit/edb1d8e629a9b7a928bbb00c... [2]: https://gitlab.archlinux.org/pacman/pacman/-/commit/03e2e487e341aed137f3f124... [3]: https://gitlab.archlinux.org/pacman/pacman/-/commit/03e2e487e341aed137f3f124... -- regards, brainpower
On 9/11/22 19:15, brainpower wrote:
Am 09.11.22 um 03:13 schrieb Allan McRae:
@Andrew (or anyone): the main patch I'd like another set of eyes on is: "Add sandboxed download for the internal downloader".
I've read it over and it looks mostly good to me. I did not build or test, just read the code.
Two things I noticed are:
- The "sandboxing failed" log message is a lot nicer in [1] than in [2] (line 995 in dload.c), maybe they should match?
Thanks - I have unified these messages.
- Also, the `done = true` before the break in the first loop of the parent might be redundant (line 1026 in dload.c [3]), since `done` isn't used after the loop. Maybe even all `done=true` could be breaks, as no loop code runs after them anyway?
In fact, that whole while loop looks weird to me. Do we need one here? It looks like if the read() call fails, we bail. Then only bail if we processed that call correctly? Weird... Allan
On 10/11/2022 03:58, Allan McRae wrote:
In fact, that whole while loop looks weird to me. Do we need one here? It looks like if the read() call fails, we bail. Then only bail if we processed that call correctly? Weird...
I believe we need the loop because we might have to process more than one callback event. We want to exit the loop as soon as either the read() call failed, or processing one of the even failed (_alpm_sandbox_process_cb_log or _alpm_sandbox_process_cb_download returning false), so we could get rid of the "done" variable by always breaking indeed, since when we do break it is useless to set "done = true". Remi
Am 10.11.22 um 09:33 schrieb Remi Gacogne:
On 10/11/2022 03:58, Allan McRae wrote:
In fact, that whole while loop looks weird to me. Do we need one here? It looks like if the read() call fails, we bail. Then only bail if we processed that call correctly? Weird...
I believe we need the loop because we might have to process more than one callback event. We want to exit the loop as soon as either the read() call failed, or processing one of the even failed (_alpm_sandbox_process_cb_log or _alpm_sandbox_process_cb_download returning false), so we could get rid of the "done" variable by always breaking indeed, since when we do break it is useless to set "done = true".
Remi
yeah, the loop should be needed for processing multiple events, that's my understanding, too. Progress events would turn up every time there's a progress update, for example, wouldn't they?
On 10/11/2022 03:58, Allan McRae wrote: Then only bail if we processed that call correctly?
I think you (Allan) overlooked that those processing functions return boolean true for success, false for error, instead of the (usual?) int == 0 for success and int != 0 for error. -- regards, bp
Am 10.11.22 um 09:57 schrieb bp:
Am 10.11.22 um 09:33 schrieb Remi Gacogne:
On 10/11/2022 03:58, Allan McRae wrote:
In fact, that whole while loop looks weird to me. Do we need one here? It looks like if the read() call fails, we bail. Then only bail if we processed that call correctly? Weird...
I believe we need the loop because we might have to process more than one callback event. We want to exit the loop as soon as either the read() call failed, or processing one of the even failed (_alpm_sandbox_process_cb_log or _alpm_sandbox_process_cb_download returning false), so we could get rid of the "done" variable by always breaking indeed, since when we do break it is useless to set "done = true".
Remi
yeah, the loop should be needed for processing multiple events, that's my understanding, too. Progress events would turn up every time there's a progress update, for example, wouldn't they?
On 10/11/2022 03:58, Allan McRae wrote: Then only bail if we processed that call correctly?
I think you (Allan) overlooked that those processing functions return boolean true for success, false for error, instead of the (usual?) int == 0 for success and int != 0 for error.
Ah, sorry answered using the wrong mail account, this was me. I dont want to cause confusion... -- regards, brainpower
On 10/11/22 18:57, bp wrote:
I think you (Allan) overlooked that those processing functions return boolean true for success, false for error, instead of the (usual?) int == 0 for success and int != 0 for error.
I had not missed it... we started allowing bool usage in the lead up to 6.0 (in the parallel download patches). I guess not using it in the codebase is a hangover of more limited C99 support when pacman was starting to be written. I think we now require C11(? - it appears this check got lost in the transition to meson...), so I see no reason to continue requiring 0/-1 returns for binary return states. Allan
Am 10.11.22 um 10:23 schrieb Allan McRae:
On 10/11/22 18:57, bp wrote:
I think you (Allan) overlooked that those processing functions return boolean true for success, false for error, instead of the (usual?) int == 0 for success and int != 0 for error.
I had not missed it... we started allowing bool usage in the lead up to 6.0 (in the parallel download patches).
I guess not using it in the codebase is a hangover of more limited C99 support when pacman was starting to be written. I think we now require C11(? - it appears this check got lost in the transition to meson...), so I see no reason to continue requiring 0/-1 returns for binary return states.
Allan
ah, I didn't mean "missed to change" in the sense of requiring C99-style or C11-style. I meant when trying to understand the loop you missed that when using booleans the condition in the if() is the other way round for the usual ints, which lead you to believe the code was bailing out on success? Sorry, if that wasn't as clear as it could have been. -- regards, bp
On 10/11/22 18:33, Remi Gacogne wrote:
On 10/11/2022 03:58, Allan McRae wrote:
In fact, that whole while loop looks weird to me. Do we need one here? It looks like if the read() call fails, we bail. Then only bail if we processed that call correctly? Weird...
I believe we need the loop because we might have to process more than one callback event. We want to exit the loop as soon as either the read() call failed, or processing one of the even failed (_alpm_sandbox_process_cb_log or _alpm_sandbox_process_cb_download returning false), so we could get rid of the "done" variable by always breaking indeed, since when we do break it is useless to set "done = true".
Great - that is the context I needed to understand the loop. I have made the changes. If I hear no other comments by the end of the week, I will push this. Allan
On 11/10/22 at 07:01pm, Allan McRae wrote:
On 10/11/22 18:33, Remi Gacogne wrote:
On 10/11/2022 03:58, Allan McRae wrote:
In fact, that whole while loop looks weird to me. Do we need one here? It looks like if the read() call fails, we bail. Then only bail if we processed that call correctly? Weird...
I believe we need the loop because we might have to process more than one callback event. We want to exit the loop as soon as either the read() call failed, or processing one of the even failed (_alpm_sandbox_process_cb_log or _alpm_sandbox_process_cb_download returning false), so we could get rid of the "done" variable by always breaking indeed, since when we do break it is useless to set "done = true".
Great - that is the context I needed to understand the loop. I have made the changes.
If I hear no other comments by the end of the week, I will push this.
My comments are on the gitlab branch.
On 11/11/22 at 12:22pm, Andrew Gregory wrote:
On 11/10/22 at 07:01pm, Allan McRae wrote:
On 10/11/22 18:33, Remi Gacogne wrote:
On 10/11/2022 03:58, Allan McRae wrote:
In fact, that whole while loop looks weird to me. Do we need one here? It looks like if the read() call fails, we bail. Then only bail if we processed that call correctly? Weird...
I believe we need the loop because we might have to process more than one callback event. We want to exit the loop as soon as either the read() call failed, or processing one of the even failed (_alpm_sandbox_process_cb_log or _alpm_sandbox_process_cb_download returning false), so we could get rid of the "done" variable by always breaking indeed, since when we do break it is useless to set "done = true".
Great - that is the context I needed to understand the loop. I have made the changes.
If I hear no other comments by the end of the week, I will push this.
My comments are on the gitlab branch.
This needs a decent amount of work, can you make a merge request or send the revised patches to the list?
On 12/11/22 11:01, Andrew Gregory wrote:
On 11/11/22 at 12:22pm, Andrew Gregory wrote:
On 11/10/22 at 07:01pm, Allan McRae wrote:
On 10/11/22 18:33, Remi Gacogne wrote:
On 10/11/2022 03:58, Allan McRae wrote:
In fact, that whole while loop looks weird to me. Do we need one here? It looks like if the read() call fails, we bail. Then only bail if we processed that call correctly? Weird...
I believe we need the loop because we might have to process more than one callback event. We want to exit the loop as soon as either the read() call failed, or processing one of the even failed (_alpm_sandbox_process_cb_log or _alpm_sandbox_process_cb_download returning false), so we could get rid of the "done" variable by always breaking indeed, since when we do break it is useless to set "done = true".
Great - that is the context I needed to understand the loop. I have made the changes.
If I hear no other comments by the end of the week, I will push this.
My comments are on the gitlab branch.
This needs a decent amount of work, can you make a merge request or send the revised patches to the list?
Thanks for the reviews. I had noticed the -U <url> issue the other night and am trying to figure that out. It falls into the "not easy" category. I can make a merge request, but don't expect to have a chance to deal with some of the comments you made in the next couple of weeks. I'm happy for anyone to provide fix-up patches. Allan
Thank you Andrew for the review, I definitely missed a few important things. On 12/11/2022 02:41, Allan McRae wrote:
Thanks for the reviews. I had noticed the -U <url> issue the other night and am trying to figure that out. It falls into the "not easy" category. I _think_ we might be able to work-around these cases by defining a new type of event passed from the child to the parent in that case.
I can make a merge request, but don't expect to have a chance to deal with some of the comments you made in the next couple of weeks. I'm happy for anyone to provide fix-up patches.
With some guidance I should be able to provide patches for most of the comments, if not all. It might take a week or so, so I'll post a message when I can start on it so we don't duplicate the work in case you beat me to it. Remi
On 12/11/22 06:22, Andrew Gregory wrote:
On 11/10/22 at 07:01pm, Allan McRae wrote:
On 10/11/22 18:33, Remi Gacogne wrote:
On 10/11/2022 03:58, Allan McRae wrote:
In fact, that whole while loop looks weird to me. Do we need one here? It looks like if the read() call fails, we bail. Then only bail if we processed that call correctly? Weird...
I believe we need the loop because we might have to process more than one callback event. We want to exit the loop as soon as either the read() call failed, or processing one of the even failed (_alpm_sandbox_process_cb_log or _alpm_sandbox_process_cb_download returning false), so we could get rid of the "done" variable by always breaking indeed, since when we do break it is useless to set "done = true".
Great - that is the context I needed to understand the loop. I have made the changes.
If I hear no other comments by the end of the week, I will push this.
My comments are on the gitlab branch.
If anyone can tell me where these comments have gone, it would be greatly appreciated... I do not see any comments on the merge request and have no idea where to find comments on a branch. I want this branch merged so I can make a release. Allan
On 12/2/23 15:13, Allan McRae wrote:
On 12/11/22 06:22, Andrew Gregory wrote:
On 11/10/22 at 07:01pm, Allan McRae wrote:
On 10/11/22 18:33, Remi Gacogne wrote:
On 10/11/2022 03:58, Allan McRae wrote:
In fact, that whole while loop looks weird to me. Do we need one here? It looks like if the read() call fails, we bail. Then only bail if we processed that call correctly? Weird...
I believe we need the loop because we might have to process more than one callback event. We want to exit the loop as soon as either the read() call failed, or processing one of the even failed (_alpm_sandbox_process_cb_log or _alpm_sandbox_process_cb_download returning false), so we could get rid of the "done" variable by always breaking indeed, since when we do break it is useless to set "done = true".
Great - that is the context I needed to understand the loop. I have made the changes.
If I hear no other comments by the end of the week, I will push this.
My comments are on the gitlab branch.
If anyone can tell me where these comments have gone, it would be greatly appreciated... I do not see any comments on the merge request and have no idea where to find comments on a branch.
Found them by going back through the project activity log. https://gitlab.archlinux.org/pacman/pacman/activity They comments were done on specific commits which get a new sha value when rebased. Allan
Hi! First of all, thank you a lot Allan for picking this up! I have reviewed the changes on your branch (up to e46c92d31b4ad3428fb3e5a76f0feaa30de31408), first as I would have done for a normal review, then by comparing against the last version I submitted, and it looks very good to me. The way the code is now organized makes sense to me, and is cleaner than my original attempt, as I hoped it would! I am therefore ready to take the blame if/when something breaks! :) Now let's see what a new set of eyes will dig up. Once this has (hopefully) been merged we can have a look at adding more sandboxing (syscall filtering, filesystem restrictions via landlock, for example). Cheers, Remi
participants (7)
-
Allan McRae
-
Andrew Gregory
-
bp
-
brainpower
-
Chris Down
-
Morten Linderud
-
Remi Gacogne