On Tue, Feb 15, 2022 at 00:08:04 +0100, Daan De Meyer via arch-general wrote:
Hi,
Now that glibc 2.35 is available, could we enable and ship the compiled form of the new C.UTF-8 locale in glibc by default in Arch Linux?
From the glibc 2.35 release notes (https://sourceware.org/pipermail/libc-alpha/2022-February/136040.html):
* Support for the C.UTF-8 locale has been added to glibc. The locale supports full code-point sorting for all valid Unicode code points. A limitation in the framework for fnmatch, regexec, and regcomp requires a compromise to save space and only ASCII-based range expressions are supported for now (see bug 28255). The full size of the locale is only ~400KiB, with 346KiB coming from LC_CTYPE information for Unicode. This locale harmonizes downstream C.UTF-8 already shipping in various downstream distributions. The locale is not built into glibc, and must be installed.
Being able to rely on the existence of a UTF-8 English locale simplifies many use cases. A good example of the problems caused by the lack of a built-in UTF-8 locale is https://github.com/systemd/systemd/pull/8340, a workaround added in 2018 that still exists today. Having the C.UTF-8 locale available by default in Arch would make it possible to remove such workarounds.
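To illustrate the kind of probing a program needs when it cannot assume C.UTF-8 is installed, here is a minimal Python sketch. It is not the systemd code; the function name first_available_locale and the candidate list are hypothetical and chosen only for illustration.

```python
import locale


def first_available_locale(candidates=("C.UTF-8", "en_US.UTF-8", "C")):
    """Return the first name in `candidates` that setlocale() accepts, else None.

    Hypothetical helper: the candidate order is an assumption, not a
    recommendation from glibc or systemd.
    """
    # Calling setlocale() with only a category queries the current setting.
    saved = locale.setlocale(locale.LC_CTYPE)
    try:
        for name in candidates:
            try:
                locale.setlocale(locale.LC_CTYPE, name)
            except locale.Error:
                # The C library rejected this locale name; try the next one.
                continue
            return name
        return None
    finally:
        # Restore the caller's LC_CTYPE so the probe has no lasting effect.
        locale.setlocale(locale.LC_CTYPE, saved)


if __name__ == "__main__":
    print(first_available_locale())
```

If C.UTF-8 were always shipped, the loop and the fallback list would be unnecessary and a program could request the locale directly.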
Any thoughts?
That would indeed be neat. But let's at least wait until the next glibc release[*] that will fix this:

  Generating locales...
    C.UTF-8...failed to set locale!
  [error] LC_MONETARY: value for field `mon_decimal_point' must not be an empty string
   done

[*] or just grab these patches:
https://patchwork.sourceware.org/project/glibc/cover/20220131053442.3995804-...
https://patchwork.sourceware.org/project/glibc/patch/20220131053442.3995804-...
https://patchwork.ozlabs.org/project/glibc/patch/20220131053442.3995804-3-ca...

Geert