[pacman-dev] [Git][pacman/pacman][master] 7 commits: libmakepkg: reproducibility for python packages
Allan McRae pushed to branch master at Pacman / Pacman
Commits:
1c5a5688 by Allan McRae at 2021-08-08T22:49:32+10:00
libmakepkg: reproducibility for python packages
Arch Linux has been setting PYTHONHASHSEED=0 to create deterministic
.pyc files. After a thorough review by the Arch Security Team, setting
this variable was determined not to generate vulnerable .pyc files, as
when the loader loads the .pyc file and unmarshalls it, the internal
runtime will just populate the unordered data structures and use a new
runtime hash for them.
Signed-off-by: Allan McRae <allan@archlinux.org>
- - - - -
c0026caa by morganamilo at 2021-09-04T10:33:51+10:00
libalpm: Give -U downloads a random .part name if needed
archweb's download links all end in /download. This caused all the temp
files to be named download.part. With parallel downloads, multiple
downloads then go to the same temp file, which breaks the transaction.
Assign random temporary filenames to downloads from URLs that are either
missing a filename or whose filename does not contain at least three
hyphens (as a well-formed package filename does).
While this approach to determining when to use a temporary filename is
not 100% foolproof, it does keep nice looking download progress bar names
when a proper package filename is given. The only downside of not using
temporary files when provided with a filename with three or more hyphens
is that URLs created specifically to bypass temporary filename usage cannot
be downloaded in parallel. We probably do not want to download packages
from such URLs anyway.
Fixes FS#71464
Modified-by: Allan McRae (do not use temporary files for realish URLs)
Signed-off-by: Allan McRae <allan@archlinux.org>
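
For illustration only, the hyphen-count rule described in this message can be
sketched as below. The helper name wants_random_partfile is hypothetical and is
not part of libalpm; the dload.c hunk in this push tests for ".pkg" in the last
path component rather than counting hyphens. The sketch assumes the URL carries
no query string.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of libalpm: returns 1 when the last path
 * component of url is empty or holds fewer than three hyphens, i.e. when
 * a random temporary filename would be wanted, and 0 otherwise. */
static int wants_random_partfile(const char *url)
{
	const char *name;
	size_t hyphens = 0;

	assert(url != NULL);

	/* Look only at what follows the final '/', if there is one. */
	name = strrchr(url, '/');
	name = name ? name + 1 : url;

	for(; *name != '\0'; name++) {
		if(*name == '-') {
			hyphens++;
		}
	}

	return hyphens < 3;
}
```

A name such as pkgname-pkgver-pkgrel-arch.pkg.tar.zst always has at least three
hyphens, so it keeps its readable name, while a bare "download" does not.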
- - - - -
2ec6de96 by morganamilo at 2021-09-04T10:34:00+10:00
only use effective url for urls containing .db or .pkg
GitHub and other sites redirect their downloads to a CDN. So the
download http://foo.org/myrepo.db may redirect to something like
https://cdn.foo.org/83749327439.
This then causes pacman to try to download the sig as
https://cdn.foo.org/83749327439.sig, which is incorrect. In this case
pacman should append .sig to the original URL.
However, URLs like https://archlinux.org/packages/community/x86_64/0ad/download/
redirect to the mirror, so .sig has to be appended after the redirects and
not before.
So we decide whether to append .sig to the original or the effective URL
based on whether the effective URL (minus the query part) has .db or .pkg in it.
Fixes FS#71148
---
v2: move variable declaration to start of block
v3: use dbext instead of db
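
A minimal standalone sketch of the decision described above, for illustration
only. The function name sig_url_for and its signature are hypothetical; the
real change lives in curl_check_finished_download() and uses libalpm's own
allocation macros. The sketch treats everything from the first '?' onward as
the query part.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not part of libalpm: returns a newly allocated
 * "<base>.sig" string, where base is the effective URL if its filename
 * (query part removed) contains dbext or ".pkg", and the original URL
 * otherwise. Returns NULL on allocation failure. Caller frees. */
static char *sig_url_for(const char *original, const char *effective,
		const char *dbext)
{
	const char *slash, *name, *base;
	char *copy, *sig;
	size_t name_len, len;

	assert(original != NULL && effective != NULL && dbext != NULL);

	slash = strrchr(effective, '/');
	name = slash ? slash + 1 : effective;
	name_len = strcspn(name, "?"); /* length up to any query part */

	copy = malloc(name_len + 1);
	if(!copy) {
		return NULL;
	}
	memcpy(copy, name, name_len);
	copy[name_len] = '\0';

	base = (strstr(copy, dbext) || strstr(copy, ".pkg")) ? effective : original;
	free(copy);

	len = strlen(base) + 5; /* ".sig" plus the terminating NUL */
	sig = malloc(len);
	if(sig) {
		snprintf(sig, len, "%s.sig", base);
	}
	return sig;
}
```

With the examples from this message, the CDN redirect keeps the original URL
as the base, while the archweb /download/ redirect uses the mirror URL.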
- - - - -
f951282b by morganamilo at 2021-09-04T10:34:00+10:00
pactest: add tests for downloading packages from a cdn
Test for downloads that redirect to some sort of cdn where the
redirected url does not relate to the original filename.
Signed-off-by: Allan McRae <allan@archlinux.org>
- - - - -
efb714b3 by Charlie Sale at 2021-09-04T10:34:00+10:00
Order downloads by descending max_size
When downloading in parallel, sort by package size so that the larger
packages are queued first to fully leverage parallelism.
Addresses FS#70172
Signed-off-by: Charlie Sale <softwaresale01@gmail.com>
Signed-off-by: Allan McRae <allan@archlinux.org>
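
As an illustration of the ordering, here is a small sketch that sorts an array
with qsort(). The struct example_payload and both function names are
hypothetical stand-ins; the actual change sorts an alpm_list_t of struct
dload_payload with alpm_list_msort(). The comparator uses comparisons rather
than subtraction so that a size difference wider than int cannot be truncated.

```c
#include <assert.h>
#include <stdlib.h>
#include <sys/types.h>

/* Hypothetical stand-in for struct dload_payload; only the field that
 * matters for ordering is kept. */
struct example_payload {
	const char *name;
	off_t max_size;
};

/* Orders payloads by max_size, largest first. */
static int compare_sizes_desc(const void *left_ptr, const void *right_ptr)
{
	const struct example_payload *left = left_ptr;
	const struct example_payload *right = right_ptr;

	return (right->max_size > left->max_size)
		- (right->max_size < left->max_size);
}

/* Sorts count payloads in place so the largest downloads are queued first. */
static void sort_payloads_desc(struct example_payload *items, size_t count)
{
	assert(items != NULL || count == 0);
	if(count > 1) {
		qsort(items, count, sizeof(*items), compare_sizes_desc);
	}
}
```

Queueing the largest files first keeps the longest transfer from starting last
and running alone after the smaller ones have finished.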
- - - - -
cf923e73 by Hugo Osvaldo Barrera at 2021-09-04T10:34:00+10:00
Update broken links pointing to git.archlinux.org
All of these links are broken since the recent move to
gitlab.archlinux.org.
A few projects are, apparently, only available on GitHub, so I've linked
to that source (hopefully that's only temporary).
For git-clone URLs, I've opted for the https URLs since those can be
used by anyone -- whereas the ssh URLs require the user to be registered
on the gitlab instance which is not open to the public yet.
Signed-off-by: Hugo Osvaldo Barrera <hugo@barrera.io>
Signed-off-by: Allan McRae <allan@archlinux.org>
- - - - -
5da4af2b by Hugo Osvaldo Barrera at 2021-09-04T10:34:00+10:00
Delete the "Other Utilities" section
Signed-off-by: Hugo Osvaldo Barrera <hugo@barrera.io>
Signed-off-by: Allan McRae <allan@archlinux.org>
- - - - -
10 changed files:
- doc/index.asciidoc
- doc/submitting-patches.asciidoc
- doc/translation-help.asciidoc
- lib/libalpm/dload.c
- lib/libalpm/dload.h
- scripts/libmakepkg/meson.build
- + scripts/libmakepkg/reproducible.sh.in
- + scripts/libmakepkg/reproducible/meson.build
- + scripts/libmakepkg/reproducible/python.sh.in
- test/pacman/tests/upgrade-download-pkg-and-sig-with-filename.py
Changes:
=====================================
doc/index.asciidoc
=====================================
@@ -59,11 +59,11 @@ configuration files dealing with pacman.
Changelog
~~~~~~~~~
For a good idea of what is going on in pacman development, take a look at the
-link:https://git.archlinux.org/pacman.git/[Git summary page] for the
+link:https://gitlab.archlinux.org/pacman/pacman[Git summary page] for the
project.
See the most recent
-link:https://git.archlinux.org/pacman.git/tree/NEWS[NEWS]
+link:https://gitlab.archlinux.org/pacman/pacman/-/blob/master/NEWS[NEWS]
file for a not-as-frequently-updated list of changes. However, this should
contain the biggest changes in a format more concise than the commit log.
@@ -220,12 +220,11 @@ these trees).
The current development tree can be fetched with the following command:
- git clone git://git.archlinux.org/pacman.git pacman
+ git clone https://gitlab.archlinux.org/pacman/pacman.git
which will fetch the full development history into a directory named pacman.
You can browse the source as well using
-link:https://git.archlinux.org/pacman.git/[cgit]. HTTP/HTTPS URLs are also
-available for cloning purposes; these URLs are listed at the above page.
+link:https://gitlab.archlinux.org/pacman/pacman/[gitlab].
If you are interested in hacking on pacman, it is highly recommended you join
the mailing list mentioned above, as well as take a quick glance at our
@@ -237,20 +236,6 @@ you speak a foreign language, you can help by either creating or updating a
translation file for your native language. Instructions can be found in
link:translation-help.html[translation-help].
-Other Utilities
-~~~~~~~~~~~~~~~
-Although the package manager itself is quite simple, many scripts have been
-developed that help automate building and installing packages. These are used
-extensively in link:https://archlinux.org/[Arch Linux]. Most of these utilities
-are available in the Arch Linux projects
-link:https://git.archlinux.org/[code browser].
-
-Utilities available:
-
-* link:https://git.archlinux.org/dbscripts.git/[dbscripts] - scripts used by Arch Linux to manage the main package repositories
-* link:https://git.archlinux.org/devtools.git/[devtools] - tools to assist in packaging and dependency checking
-* link:https://git.archlinux.org/namcap.git/[namcap] - a package analysis utility written in python
-
Bugs
----
If you find bugs (which is quite likely), please email them to the pacman-dev
=====================================
doc/submitting-patches.asciidoc
=====================================
@@ -20,7 +20,7 @@ started with GIT if you have not worked with it before.
The pacman code can be fetched using the following command:
- git clone git://git.archlinux.org/pacman.git
+ git clone https://gitlab.archlinux.org/pacman/pacman.git
Creating your patch
=====================================
doc/translation-help.asciidoc
=====================================
@@ -78,7 +78,7 @@ Incremental Updates
If you have more advanced needs you will have to get a copy of the pacman
repository.
- git clone git://git.archlinux.org/pacman.git pacman
+ git clone https://gitlab.archlinux.org/pacman/pacman.git
Next, you will need to run `./autogen.sh` and `./configure` in the base
directory to generate the correct Makefiles. At this point, all necessary
=====================================
lib/libalpm/dload.c
=====================================
@@ -613,12 +613,33 @@ static int curl_check_finished_download(CURLM *curlm, CURLMsg *msg,
/* Let's check if client requested downloading accompanion *.sig file */
if(!payload->signature && payload->download_signature && curlerr == CURLE_OK && payload->respcode < 400) {
struct dload_payload *sig = NULL;
-
+ char *url = payload->fileurl;
+ char *_effective_filename;
+ const char *effective_filename;
+ char *query;
+ const char *dbext = alpm_option_get_dbext(handle);
const char* realname = payload->destfile_name ? payload->destfile_name : payload->tempfile_name;
- int len = strlen(effective_url) + 5;
+ int len;
+
+ STRDUP(_effective_filename, effective_url, GOTO_ERR(handle, ALPM_ERR_MEMORY, cleanup));
+ effective_filename = get_filename(_effective_filename);
+ query = strrchr(effective_filename, '?');
+
+ if(query) {
+ query[0] = '\0';
+ }
+
+ /* Only use the effective url for sig downloads if the effective_url contains .dbext or .pkg */
+ if(strstr(effective_filename, dbext) || strstr(effective_filename, ".pkg")) {
+ url = effective_url;
+ }
+
+ free(_effective_filename);
+
+ len = strlen(url) + 5;
CALLOC(sig, 1, sizeof(*sig), GOTO_ERR(handle, ALPM_ERR_MEMORY, cleanup));
MALLOC(sig->fileurl, len, FREE(sig); GOTO_ERR(handle, ALPM_ERR_MEMORY, cleanup));
- snprintf(sig->fileurl, len, "%s.sig", effective_url);
+ snprintf(sig->fileurl, len, "%s.sig", url);
if(payload->trust_remote_name) {
/* In this case server might provide a new name for the main payload.
@@ -767,7 +788,7 @@ static int curl_add_payload(alpm_handle_t *handle, CURLM *curlm,
GOTO_ERR(handle, ALPM_ERR_SERVER_BAD_URL, cleanup);
}
- if(payload->remote_name && strlen(payload->remote_name) > 0) {
+ if(!payload->random_partfile && payload->remote_name && strlen(payload->remote_name) > 0) {
if(!payload->destfile_name) {
payload->destfile_name = get_fullpath(localpath, payload->remote_name, "");
}
@@ -776,8 +797,9 @@ static int curl_add_payload(alpm_handle_t *handle, CURLM *curlm,
goto cleanup;
}
} else {
- /* URL doesn't contain a filename, so make a tempfile. We can't support
- * resuming this kind of download; partial transfers will be destroyed */
+ /* We want a random filename or the URL does not contain a filename, so download to a
+ * temporary location. We can not support resuming this kind of download; any partial
+ * transfers will be destroyed */
payload->unlink_on_fail = 1;
payload->localf = create_tempfile(payload, localpath);
@@ -825,6 +847,19 @@ cleanup:
return ret;
}
+/*
+ * Use to sort payloads by max size in descending order (largest -> smallest)
+ */
+static int compare_dload_payload_sizes(const void *left_ptr, const void *right_ptr)
+{
+ struct dload_payload *left, *right;
+
+ left = (struct dload_payload *) left_ptr;
+ right = (struct dload_payload *) right_ptr;
+
+ return right->max_size - left->max_size;
+}
+
/* Returns -1 if an error happened for a required file
* Returns 0 if a payload was actually downloaded
* Returns 1 if no files were downloaded and all errors were non-fatal
@@ -838,6 +873,10 @@ static int curl_download_internal(alpm_handle_t *handle,
int max_streams = handle->parallel_downloads;
int updated = 0; /* was a file actually updated */
CURLM *curlm = handle->curlm;
+ size_t payloads_size = alpm_list_count(payloads);
+
+ /* Sort payloads by package size */
+ payloads = alpm_list_msort(payloads, payloads_size, &compare_dload_payload_sizes);
while(active_downloads_num > 0 || payloads) {
CURLMcode mc;
@@ -986,11 +1025,20 @@ int SYMEXPORT alpm_fetch_pkgurl(alpm_handle_t *handle, const alpm_list_t *urls,
alpm_list_append(fetched, filepath);
} else {
struct dload_payload *payload = NULL;
+ char *c;
ASSERT(url, GOTO_ERR(handle, ALPM_ERR_WRONG_ARGS, err));
CALLOC(payload, 1, sizeof(*payload), GOTO_ERR(handle, ALPM_ERR_MEMORY, err));
STRDUP(payload->fileurl, url, FREE(payload); GOTO_ERR(handle, ALPM_ERR_MEMORY, err));
- payload->allow_resume = 1;
+
+ c = strrchr(url, '/');
+ if(strstr(c, ".pkg")) {
+ /* we probably have a usable package filename to download to */
+ payload->allow_resume = 1;
+ } else {
+ payload->random_partfile = 1;
+ }
+
payload->handle = handle;
payload->trust_remote_name = 1;
payload->download_signature = (handle->siglevel & ALPM_SIG_PACKAGE);
=====================================
lib/libalpm/dload.h
=====================================
@@ -44,6 +44,7 @@ struct dload_payload {
off_t prevprogress;
int force;
int allow_resume;
+ int random_partfile;
int errors_ok;
int unlink_on_fail;
int trust_remote_name;
=====================================
scripts/libmakepkg/meson.build
=====================================
@@ -5,6 +5,7 @@ libmakepkg_modules = [
{ 'name' : 'lint_config', 'has_subdir' : true },
{ 'name' : 'lint_package', 'has_subdir' : true },
{ 'name' : 'lint_pkgbuild', 'has_subdir' : true },
+ { 'name' : 'reproducible', 'has_subdir' : true },
{ 'name' : 'source', 'has_subdir' : true },
{ 'name' : 'srcinfo', },
{ 'name' : 'tidy', 'has_subdir' : true },
=====================================
scripts/libmakepkg/reproducible.sh.in
=====================================
@@ -0,0 +1,29 @@
+#!/bin/bash
+#
+# reproducible.sh - utilities for improving package reproducibility
+#
+# Copyright (c) 2021 Pacman Development Team