[pacman-dev] Improving -Qi/-Si code
I've been working out-of-tree for a replacement for -Qi / -Si, which can be found here: <https://github.com/vodik/alpm-dump> and I'm looking for some feedback before I start integrating it into pacman. For those not familiar with why I've decided to rewrite this, the existing code is rather clumsy and makes some assumptions: string_display(_("Name :"), alpm_pkg_get_name(pkg), cols); The header is "Name", but the padding is embedded into the string. This forces translation writers to also put the right paddings into their translations, something which they shouldn't be burdened with. Further, `string_display` actually needs to know how long "Name :" is so it can do proper line wrapping. The following lines need to be indented properly. This ends up being recalculated for every header. Instead, if we knew how wide the widest header would be ahead of time, we no longer need to embed padding into translations and we no longer need to calculate line lengths for each print operation. This is what the code in the alpm-dump repository attempts to do. The approach I've taken is have the printing function take an array of const char *. Each index represents a field, like Name or Version. If the index is NULL, its ignored, and if it has a value, it is printed along side corresponding data from the package. For example: static const char *sync_table[LAST_ENTRY] = { [ENTRY_REPOSITORY] = "Repository", [ENTRY_NAME] = "Name", [ENTRY_VERSION] = "Version", [ENTRY_DESCRIPTION] = "Description", [ENTRY_ARCHITECTURE] = "Architecture", ... }; Calculating the max width means iterating across this table once looking for the longest field length. Dumping a package is then a matter of iterating across the table, printing the stored string and then the value that corresponds to the array index. printf("%-*s : ", width, table[index]); switch(index) { case ENTRY_REPOSITORY: indentprint_r(alpm_db_get_name(alpm_pkg_get_db(pkg)), width, 0); break; case ENTRY_NAME: indentprint_r(alpm_pkg_get_name(pkg), width, 0); break; case ENTRY_VERSION: indentprint_r(alpm_pkg_get_version(pkg), width, 0); break; ... } And as an added bonus, alpm-dump is measurably faster than the current code. Not that this matters that much. The current code isn't exactly slow and is still much much faster than stdout is.
On 29/12/13 04:24, Simon Gomizelj wrote:
I've been working out-of-tree for a replacement for -Qi / -Si, which can be found here: <https://github.com/vodik/alpm-dump> and I'm looking for some feedback before I start integrating it into pacman.
For those not familiar with why I've decided to rewrite this, the existing code is rather clumsy and makes some assumptions:
string_display(_("Name :"), alpm_pkg_get_name(pkg), cols);
The header is "Name", but the padding is embedded into the string. This forces translation writers to also put the right paddings into their translations, something which they shouldn't be burdened with.
Further, `string_display` actually needs to know how long "Name :" is so it can do proper line wrapping. The following lines need to be indented properly. This ends up being recalculated for every header.
Instead, if we knew how wide the widest header would be ahead of time, we no longer need to embed padding into translations and we no longer need to calculate line lengths for each print operation.
This is what the code in the alpm-dump repository attempts to do.
The approach I've taken is have the printing function take an array of const char *. Each index represents a field, like Name or Version. If the index is NULL, its ignored, and if it has a value, it is printed along side corresponding data from the package.
For example:
static const char *sync_table[LAST_ENTRY] = { [ENTRY_REPOSITORY] = "Repository", [ENTRY_NAME] = "Name", [ENTRY_VERSION] = "Version", [ENTRY_DESCRIPTION] = "Description", [ENTRY_ARCHITECTURE] = "Architecture", ... };
Calculating the max width means iterating across this table once looking for the longest field length. Dumping a package is then a matter of iterating across the table, printing the stored string and then the value that corresponds to the array index.
So we have two tables, one for -Si and one for -Qi. What about -Sii that adds extra fields? Will you have to add if(config->info > 1) conditions? In the output only, or also in the calculation of field width? -Qii adds "Backup Files:" at the end in a a different format, so we can ignore that... What are the arrays gaining us? Removing a few if(from == ALPM_PKG_FROM_FILE) tests? Is see the disadvantages over passing a first column width to dump_pkg_full are: 1) the field order can easily become inconsistent for -Si and -Qi. alpm_dump shows this... 2) to add/remove a field, I now need to edit an additional two arrays and a enum on top of the output function. I really like the idea of precomupting the first column width, but I am seeing a decrease in maintainability with the approach involving various arrays compared to what we currently have. Can you clarify what we gain over adding: site_t dump_pkg_column_width(alpm_pkgfrom_t, int extra) and adjusting: void dump_pkg_full(alpm_pkg_t *pkg, int extra, size_t colwidth) Allan
We actually do not need to maintain two tables. I've merged it with pacman currently and I don't take that approach. I only did it with alpm-dump because it was quicker to just dump out statically initialized tables for testing. Was probably a poor choice for a demonstration. I have a table building function instead, along the lines of: void build_table(char **table, alpm_pkgfrom_t from) { memset(table, 0, sizeof(char *) * LAST_ENTRY); table[ENTRY_REPOSITORY] = _("Repository"); table[ENTRY_NAME] = _("Name"); table[ENTRY_VERSION] = _("Version"); table[ENTRY_DESCRIPTION] = _("Description"); table[ENTRY_ARCHITECTURE] = _("Architecture"); table[ENTRY_URL] = _("URL"); // snip if(from == ALPM_PKG_FROM_SYNCDB || config->op_s_info > 1) { table[ENTRY_REQUIRED] = _("Required By"); table[ENTRY_OPTIONAL_FOR] = _("Optional For"); } // more branches } So rather, the actual effective change, as far as design goes, is this allows us to break dump_pkg_full into two discrete steps rather than all at once; figure out what to print before you start printing. In fact, all this table building stuff can entirely stay inside package.c if we change dump_pkg_full to expect an alpm_list_t * instead of a alpm_pkg_t *. It doesn't need to be exposed to sync.c/query.c (I think, I'm not fully confident I've integrated the new system everywhere).
On 30/12/13 15:06, Simon Gomizelj wrote:
We actually do not need to maintain two tables. I've merged it with pacman currently and I don't take that approach. I only did it with alpm-dump because it was quicker to just dump out statically initialized tables for testing. Was probably a poor choice for a demonstration.
I have a table building function instead, along the lines of:
void build_table(char **table, alpm_pkgfrom_t from) { memset(table, 0, sizeof(char *) * LAST_ENTRY);
table[ENTRY_REPOSITORY] = _("Repository"); table[ENTRY_NAME] = _("Name"); table[ENTRY_VERSION] = _("Version"); table[ENTRY_DESCRIPTION] = _("Description"); table[ENTRY_ARCHITECTURE] = _("Architecture"); table[ENTRY_URL] = _("URL"); // snip
if(from == ALPM_PKG_FROM_SYNCDB || config->op_s_info > 1) { table[ENTRY_REQUIRED] = _("Required By"); table[ENTRY_OPTIONAL_FOR] = _("Optional For"); }
// more branches }
So rather, the actual effective change, as far as design goes, is this allows us to break dump_pkg_full into two discrete steps rather than all at once; figure out what to print before you start printing.
OK - that sounds good to me.
In fact, all this table building stuff can entirely stay inside package.c if we change dump_pkg_full to expect an alpm_list_t * instead of a alpm_pkg_t *. It doesn't need to be exposed to sync.c/query.c (I think, I'm not fully confident I've integrated the new system everywhere).
I'd have no objects provided it makes sense to always pass a list (I would need to look at what pacman is passed in single package case). Otherwise you could add dump_pkg_list() and call the dump_pkg_{list,full} as needed. I await the final patch submission. Allan
participants (2)
-
Allan McRae
-
Simon Gomizelj