Hi,

Now that glibc 2.35 is available, could we enable and ship the compiled form of the new C.UTF-8 locale in glibc by default in Arch Linux?
From the glibc 2.35 release notes (https://sourceware.org/pipermail/libc-alpha/2022-February/136040.html):
* Support for the C.UTF-8 locale has been added to glibc. The locale supports full code-point sorting for all valid Unicode code points. A limitation in the framework for fnmatch, regexec, and regcomp requires a compromise to save space and only ASCII-based range expressions are supported for now (see bug 28255). The full size of the locale is only ~400KiB, with 346KiB coming from LC_CTYPE information for Unicode. This locale harmonizes downstream C.UTF-8 already shipping in various downstream distributions. The locale is not built into glibc, and must be installed.
Being able to rely on the existence of a UTF-8 English locale simplifies many use cases. A good example of the problems caused by the lack of a built-in UTF-8 locale is https://github.com/systemd/systemd/pull/8340, a workaround added in 2018 that still exists today. Having the C.UTF-8 locale available by default in Arch would make it possible to remove such workarounds.

Any thoughts?

Cheers,
Daan De Meyer