[pacman-dev] [PATCH] Introduce alpm_dbs_update() function for parallel db updates
Allan McRae
allan at archlinux.org
Sun Mar 8 13:04:50 UTC 2020
On 7/3/20 6:35 am, Anatol Pomozov wrote:
> This is an equivalent of alpm_db_update but for multiplexed (parallel)
> download. The difference is that this function accepts list of
> databases to update. And then ALPM internals download it in parallel if
> possible.
>
> Add a stub for _alpm_multi_download the function that will do parallel
> payloads downloads in the future.
Thanks for splitting the patches up into smaller units. It makes it a
lot easier for me to review.
Note that once review is complete, this patch (and the config one) will
sit on a branch until the rest of the work is ready to land in one go.
In fact... it probably makes sense to add _alpm_multi_download first,
then follow-up with this patch.
> Introduce dload_payload->filepath field that contains url path to the
> file we download. It is like fileurl field but does not contain
> protocol/server part. The rationale for having this field is that with
> the curl multidownload the server retry logic is going to move to a curl
> callback. And the callback needs to be able to reconstruct the 'next'
> fileurl. One will be able to do it by getting the next server url from
> 'servers' list and then concat with filepath. Once the 'parallel download'
> refactoring is over 'fileurl' field will go away.
>
> Signed-off-by: Anatol Pomozov <anatol.pomozov at gmail.com>
> ---
> lib/libalpm/alpm.h | 2 +
> lib/libalpm/be_sync.c | 132 ++++++++++++++++++++++++++++++++++++++++++
> lib/libalpm/dload.c | 12 ++++
> lib/libalpm/dload.h | 5 ++
> 4 files changed, 151 insertions(+)
>
> diff --git a/lib/libalpm/alpm.h b/lib/libalpm/alpm.h
> index 93b97f44..eb0490eb 100644
> --- a/lib/libalpm/alpm.h
> +++ b/lib/libalpm/alpm.h
> @@ -1045,6 +1045,8 @@ int alpm_db_remove_server(alpm_db_t *db, const char *url);
> */
> int alpm_db_update(int force, alpm_db_t *db);
>
> +int alpm_dbs_update(alpm_handle_t *handle, alpm_list_t *dbs, int force, int failfast);
As already mentioned in the other reply, I don't think failfast is needed.
It is a bit annoying that "force" and "dbs" are in a different order to
alpm_db_update, but I think this is the better order... so keep it.
> +
> /** Get a package entry from a package database.
> * @param db pointer to the package database to get the package from
> * @param name of the package
> diff --git a/lib/libalpm/be_sync.c b/lib/libalpm/be_sync.c
> index aafed15d..cdb46bd9 100644
> --- a/lib/libalpm/be_sync.c
> +++ b/lib/libalpm/be_sync.c
> @@ -301,6 +301,138 @@ int SYMEXPORT alpm_db_update(int force, alpm_db_t *db)
> return ret;
> }
>
> +/** Update list of databases. This function may run updates in parallel.
> + *
> + * @param dbs a list of alpm_db_t to update.
> + */
This should be in alpm.h and all params and return need documented.
> +int SYMEXPORT alpm_dbs_update(alpm_handle_t *handle, alpm_list_t *dbs, int force, UNUSED int failfast) {
> + char *syncpath;
> + const char *dbext = handle->dbext;
> + alpm_list_t *i;
> + int ret = -1;
> + mode_t oldmask;
> + alpm_list_t *payloads = NULL;
> +
> + /* Sanity checks */
> + ASSERT(dbs != NULL, return -1);
> + handle->pm_errno = ALPM_ERR_OK;
> +
> + syncpath = get_sync_dir(handle);
> + ASSERT(syncpath != NULL, return -1);
> +
> + /* make sure we have a sane umask */
> + oldmask = umask(0022);
> +
> + for(i = dbs; i; i = i->next) {
> + alpm_db_t *db = i->data;
> + int dbforce = force;
> + struct dload_payload *payload = NULL;
> + size_t len;
> + int siglevel;
> +
> + if(!(db->usage & ALPM_DB_USAGE_SYNC)) {
> + continue;
> + }
> +
> + ASSERT(db != handle->db_local, RET_ERR(handle, ALPM_ERR_WRONG_ARGS, -1));
> + ASSERT(db->servers != NULL, RET_ERR(handle, ALPM_ERR_SERVER_NONE, -1));
> +
> + /* force update of invalid databases to fix potential mismatched database/signature */
> + if(db->status & DB_STATUS_INVALID) {
> + dbforce = 1;
> + }
> +
> + CALLOC(payload, 1, sizeof(*payload), RET_ERR(handle, ALPM_ERR_MEMORY, -1));
> +
> + /* set hard upper limit of 128MiB */
> + payload->max_size = 128 * 1024 * 1024;
> + ASSERT(db->servers != NULL, RET_ERR(handle, ALPM_ERR_SERVER_NONE, -1));
> + payload->servers = db->servers;
> +
I got to here. Seems a lot of this is duplicated from the single db
path. If both are going to coexist, can we do some refactoring?
A
More information about the pacman-dev
mailing list