[pacman-dev] [PATCH 1/3] meson: replace objects with link_whole
Instead extracting/listing objects from existing static libraries, simply add those to the link list. Note that link_whole is needed otherwise, unused code will be pruned. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> --- meson.build | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-) diff --git a/meson.build b/meson.build index 264c6501..fd573826 100644 --- a/meson.build +++ b/meson.build @@ -310,15 +310,14 @@ alpm_deps = [crypto_provider, libarchive, libcurl, libintl, gpgme] libalpm_a = static_library( 'alpm_objlib', libalpm_sources, - # https://github.com/mesonbuild/meson/issues/3937 - objects : libcommon.extract_all_objects(), + link_whole: libcommon, include_directories : includes, dependencies : alpm_deps) libalpm = library( 'alpm', version : libalpm_version, - objects: libalpm_a.extract_all_objects(recursive: true), + link_whole: libalpm_a, dependencies : alpm_deps, install : true) -- 2.29.2
Currently, we are erroneously exporting all the symbols via the libalpm.so. As such, the libcommon dependency is resolved. The libalpm.so exports are about to be resolved shortly, yet that exposed that pacman-conf is missing a link against libcommon. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> --- meson.build | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/meson.build b/meson.build index fd573826..380564bb 100644 --- a/meson.build +++ b/meson.build @@ -346,7 +346,7 @@ executable( 'pacman-conf', pacman_conf_sources, include_directories : includes, - link_with : [libalpm], + link_with : [libalpm, libcommon], dependencies : [libarchive], install : true, ) -- 2.29.2
All the required public API is annotated with SYMEXPORT, so we can just add the meson notation, to hide all the symbols by default. Thus we no longer spill all the internal API into the global namespace. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> --- meson.build | 2 ++ 1 file changed, 2 insertions(+) diff --git a/meson.build b/meson.build index 380564bb..6173f2b7 100644 --- a/meson.build +++ b/meson.build @@ -303,6 +303,7 @@ libcommon = static_library( 'common', libcommon_sources, include_directories : includes, + gnu_symbol_visibility : 'hidden', install : false) alpm_deps = [crypto_provider, libarchive, libcurl, libintl, gpgme] @@ -312,6 +313,7 @@ libalpm_a = static_library( libalpm_sources, link_whole: libcommon, include_directories : includes, + gnu_symbol_visibility : 'hidden', dependencies : alpm_deps) libalpm = library( -- 2.29.2
On 12/23/20 5:42 PM, Emil Velikov wrote:
All the required public API is annotated with SYMEXPORT, so we can just add the meson notation, to hide all the symbols by default.
Thus we no longer spill all the internal API into the global namespace. Thanks for noticing this... it's a regression from autotools, which contained in lib/libalpm/Makefile.am:
if ENABLE_VISIBILITY_CC if DARWIN AM_CFLAGS += -fvisibility=hidden else AM_CFLAGS += -fvisibility=internal endif endif I wonder if we had a good reason to use "internal" and if we should continue to do so? IIUC it makes it slightly more optimized at the cost of allowing pointers into private functions (e.g. callbacks) used by other programs to segfault. -- Eli Schwartz Bug Wrangler and Trusted User
On Thu, 24 Dec 2020 at 01:38, Eli Schwartz <eschwartz@archlinux.org> wrote:
On 12/23/20 5:42 PM, Emil Velikov wrote:
All the required public API is annotated with SYMEXPORT, so we can just add the meson notation, to hide all the symbols by default.
Thus we no longer spill all the internal API into the global namespace. Thanks for noticing this... it's a regression from autotools, which contained in lib/libalpm/Makefile.am:
if ENABLE_VISIBILITY_CC if DARWIN AM_CFLAGS += -fvisibility=hidden else AM_CFLAGS += -fvisibility=internal endif endif
I wonder if we had a good reason to use "internal" and if we should continue to do so? IIUC it makes it slightly more optimized at the cost of allowing pointers into private functions (e.g. callbacks) used by other programs to segfault.
If the output of size&ls is any indication - there is little-to-no optimisation happening. The former produces the same numbers, while the latter claims that binaries built with "internal" are larger by 8 bytes (always). Fwiw the above snippet is the first time I've seen anyone using "internal", after staring at various projects for 5+ years. Can bring it back, simply not sure it brings much. Thanks Emil $ ls 812336 build-internal/libalpm.so.12.0.1 812328 build-hidden/libalpm.so.12.0.1 337176 build-internal/pacman 337168 build-hidden/pacman $ size 316708 3080 592 320380 4e37c build-internal/libalpm.so.12.0.1 316708 3080 592 320380 4e37c build-hidden/libalpm.so.12.0.1 155288 5040 5808 166136 288f8 build-internal/pacman 155288 5040 5808 166136 288f8 build-hidden/pacman
On 25/12/20 2:05 am, Emil Velikov wrote:
On Thu, 24 Dec 2020 at 01:38, Eli Schwartz <eschwartz@archlinux.org> wrote:
On 12/23/20 5:42 PM, Emil Velikov wrote:
All the required public API is annotated with SYMEXPORT, so we can just add the meson notation, to hide all the symbols by default.
Thus we no longer spill all the internal API into the global namespace. Thanks for noticing this... it's a regression from autotools, which contained in lib/libalpm/Makefile.am:
if ENABLE_VISIBILITY_CC if DARWIN AM_CFLAGS += -fvisibility=hidden else AM_CFLAGS += -fvisibility=internal endif endif
I wonder if we had a good reason to use "internal" and if we should continue to do so? IIUC it makes it slightly more optimized at the cost of allowing pointers into private functions (e.g. callbacks) used by other programs to segfault.
If the output of size&ls is any indication - there is little-to-no optimisation happening. The former produces the same numbers, while the latter claims that binaries built with "internal" are larger by 8 bytes (always).
Fwiw the above snippet is the first time I've seen anyone using "internal", after staring at various projects for 5+ years. Can bring it back, simply not sure it brings much.
Thanks Emil
$ ls 812336 build-internal/libalpm.so.12.0.1 812328 build-hidden/libalpm.so.12.0.1 337176 build-internal/pacman 337168 build-hidden/pacman
$ size 316708 3080 592 320380 4e37c build-internal/libalpm.so.12.0.1 316708 3080 592 320380 4e37c build-hidden/libalpm.so.12.0.1 155288 5040 5808 166136 288f8 build-internal/pacman 155288 5040 5808 166136 288f8 build-hidden/pacman
It turns out, we have this: #define SYMHIDDEN __attribute__((visibility("internal"))) But we never flag any functions with this. That would enable a compiler to optimize for speed and not compatibility, in which case using -fvisibility=internal would make a difference. I'm not sure if we removed any usage of that from the codebase, or if it was never there... Allan
On 12/28/20 10:15 PM, Allan McRae wrote:
On 25/12/20 2:05 am, Emil Velikov wrote:
On Thu, 24 Dec 2020 at 01:38, Eli Schwartz <eschwartz@archlinux.org> wrote:
On 12/23/20 5:42 PM, Emil Velikov wrote:
All the required public API is annotated with SYMEXPORT, so we can just add the meson notation, to hide all the symbols by default.
Thus we no longer spill all the internal API into the global namespace. Thanks for noticing this... it's a regression from autotools, which contained in lib/libalpm/Makefile.am:
if ENABLE_VISIBILITY_CC if DARWIN AM_CFLAGS += -fvisibility=hidden else AM_CFLAGS += -fvisibility=internal endif endif
I wonder if we had a good reason to use "internal" and if we should continue to do so? IIUC it makes it slightly more optimized at the cost of allowing pointers into private functions (e.g. callbacks) used by other programs to segfault.
If the output of size&ls is any indication - there is little-to-no optimisation happening. The former produces the same numbers, while the latter claims that binaries built with "internal" are larger by 8 bytes (always).
Fwiw the above snippet is the first time I've seen anyone using "internal", after staring at various projects for 5+ years. Can bring it back, simply not sure it brings much.
Thanks Emil
$ ls 812336 build-internal/libalpm.so.12.0.1 812328 build-hidden/libalpm.so.12.0.1 337176 build-internal/pacman 337168 build-hidden/pacman
$ size 316708 3080 592 320380 4e37c build-internal/libalpm.so.12.0.1 316708 3080 592 320380 4e37c build-hidden/libalpm.so.12.0.1 155288 5040 5808 166136 288f8 build-internal/pacman 155288 5040 5808 166136 288f8 build-hidden/pacman
It turns out, we have this: #define SYMHIDDEN __attribute__((visibility("internal")))
But we never flag any functions with this. That would enable a compiler to optimize for speed and not compatibility, in which case using -fvisibility=internal would make a difference.
I'm not sure if we removed any usage of that from the codebase, or if it was never there...
Hmm... only ever used on alpm_add_target (then _alpm_add_loadtarget) and removed in commit 7f7da2b5fc01f46d28236384540c7ecfdac16a63 which added the AM_CFLAGS instead and marked a bunch of symbols as SYMEXPORT. But that define used to be for visibility("hidden"), then in commit 920b0d2049deb148efe89bfebda03d172b68c1f5 both the (unused) define and the AM_CFLAGS got switched to internal: "this allows for slightly better optimization". Which is about as vague as our knowledge today. "Supposedly better and probably works fine, but no one put out numbers on it". idk, "hidden" is probably just fine. -- Eli Schwartz Bug Wrangler and Trusted User
On Tue, 29 Dec 2020 at 03:47, Eli Schwartz <eschwartz@archlinux.org> wrote:
On 12/28/20 10:15 PM, Allan McRae wrote:
On 25/12/20 2:05 am, Emil Velikov wrote:
On Thu, 24 Dec 2020 at 01:38, Eli Schwartz <eschwartz@archlinux.org> wrote:
On 12/23/20 5:42 PM, Emil Velikov wrote:
All the required public API is annotated with SYMEXPORT, so we can just add the meson notation, to hide all the symbols by default.
Thus we no longer spill all the internal API into the global namespace. Thanks for noticing this... it's a regression from autotools, which contained in lib/libalpm/Makefile.am:
if ENABLE_VISIBILITY_CC if DARWIN AM_CFLAGS += -fvisibility=hidden else AM_CFLAGS += -fvisibility=internal endif endif
I wonder if we had a good reason to use "internal" and if we should continue to do so? IIUC it makes it slightly more optimized at the cost of allowing pointers into private functions (e.g. callbacks) used by other programs to segfault.
If the output of size&ls is any indication - there is little-to-no optimisation happening. The former produces the same numbers, while the latter claims that binaries built with "internal" are larger by 8 bytes (always).
Fwiw the above snippet is the first time I've seen anyone using "internal", after staring at various projects for 5+ years. Can bring it back, simply not sure it brings much.
Thanks Emil
$ ls 812336 build-internal/libalpm.so.12.0.1 812328 build-hidden/libalpm.so.12.0.1 337176 build-internal/pacman 337168 build-hidden/pacman
$ size 316708 3080 592 320380 4e37c build-internal/libalpm.so.12.0.1 316708 3080 592 320380 4e37c build-hidden/libalpm.so.12.0.1 155288 5040 5808 166136 288f8 build-internal/pacman 155288 5040 5808 166136 288f8 build-hidden/pacman
It turns out, we have this: #define SYMHIDDEN __attribute__((visibility("internal")))
But we never flag any functions with this. That would enable a compiler to optimize for speed and not compatibility, in which case using -fvisibility=internal would make a difference.
I'm not sure if we removed any usage of that from the codebase, or if it was never there...
Hmm... only ever used on alpm_add_target (then _alpm_add_loadtarget) and removed in commit 7f7da2b5fc01f46d28236384540c7ecfdac16a63 which added the AM_CFLAGS instead and marked a bunch of symbols as SYMEXPORT.
Fwiw annotating the public API tends to be easier and quicker. So most projects opt for that.
But that define used to be for visibility("hidden"), then in commit 920b0d2049deb148efe89bfebda03d172b68c1f5 both the (unused) define and the AM_CFLAGS got switched to internal: "this allows for slightly better optimization".
Which is about as vague as our knowledge today. "Supposedly better and probably works fine, but no one put out numbers on it".
idk, "hidden" is probably just fine.
Another point wrt "internal": Seems like CMake - fairly common and mature build system* - has a wrapper for the "hidden" visibility but lacks one for "internal". Thanks for the references, I'll use those in the commit message for v2 and add an extra patch on top, removing the unused SYMHIDDEN. -Emil * Not here to start a flame war about build-systems and/or their quality. Each one sucks in its unique way :-P
On 12/23/20 5:42 PM, Emil Velikov wrote:
Instead extracting/listing objects from existing static libraries, simply add those to the link list. Note that link_whole is needed otherwise, unused code will be pruned. This is intentional! link_whole is broken, just like the comment says.
It is fixed in newer versions of meson than the current minimum. The fix is that link_whole does precisely what extract_all_objects() does. In other words, link_whole is syntactic sugar for extract_all_objects(). I have intended to switch to link_whole, but *only* once we bump the minimum version of meson to something that guarantees the result is functional. I didn't want to bump the minimum version of meson just for this. -- Eli Schwartz Bug Wrangler and Trusted User
On Thu, 24 Dec 2020 at 00:57, Eli Schwartz <eschwartz@archlinux.org> wrote:
On 12/23/20 5:42 PM, Emil Velikov wrote:
Instead extracting/listing objects from existing static libraries, simply add those to the link list. Note that link_whole is needed otherwise, unused code will be pruned. This is intentional! link_whole is broken, just like the comment says.
It is fixed in newer versions of meson than the current minimum. The fix is that link_whole does precisely what extract_all_objects() does. In other words, link_whole is syntactic sugar for extract_all_objects().
I have intended to switch to link_whole, but *only* once we bump the minimum version of meson to something that guarantees the result is functional.
I didn't want to bump the minimum version of meson just for this.
AFAICT the bug link talks about link_with being broken and says "do not use link_whole" w/o much reason. There seems to be something more to in link_whole than extract_all_objects() since the latter fails with the proposed visibility patch (3/3). Regardless, I may have found a way to make it work w/o bumping the version. Thanks Emil
On Thu, 24 Dec 2020 at 15:46, Emil Velikov <emil.l.velikov@gmail.com> wrote:
On Thu, 24 Dec 2020 at 00:57, Eli Schwartz <eschwartz@archlinux.org> wrote:
On 12/23/20 5:42 PM, Emil Velikov wrote:
Instead extracting/listing objects from existing static libraries, simply add those to the link list. Note that link_whole is needed otherwise, unused code will be pruned. This is intentional! link_whole is broken, just like the comment says.
It is fixed in newer versions of meson than the current minimum. The fix is that link_whole does precisely what extract_all_objects() does. In other words, link_whole is syntactic sugar for extract_all_objects().
I have intended to switch to link_whole, but *only* once we bump the minimum version of meson to something that guarantees the result is functional.
I didn't want to bump the minimum version of meson just for this.
AFAICT the bug link talks about link_with being broken and says "do not use link_whole" w/o much reason.
There seems to be something more to in link_whole than extract_all_objects() since the latter fails with the proposed visibility patch (3/3). Regardless, I may have found a way to make it work w/o bumping the version.
For posterity-sake: Elie asked over IRC about the alleged issues w/o this patch. Upon second testing - there are none, as PATCH 2/3 is enough. Will drop this patch for v2 coming later today. -Emil
participants (3)
-
Allan McRae
-
Eli Schwartz
-
Emil Velikov