[arch-dev-public] Issues with non-ASCII path names in packages

Pierre Schmitz pierre at archlinux.de
Fri Oct 28 12:23:51 EDT 2011


Hi all,

We found some issues with libarchive and certain file names when using the
C locale. For example, try the following with the package from testing:
    $ bsdtar tf ca-certificates-20111025-1-any.pkg.tar.xz
The result will be "bsdtar: Pathname in pax header can't be converted
to current locale." Fun fact: the package itself was created using the C
locale. Because of this, pacman will refuse to install the package when
running in the C locale, which is used e.g. by our build tools.
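
As a rough illustration of why such a conversion can fail, here is a
minimal Python sketch. It is only an analogy and not what libarchive does
internally: a path name with characters outside ASCII has no representation
in a plain ASCII charset (which, as far as I can tell, is what the C locale
implies on glibc), while UTF-8 can encode it. The file name is a made-up
example and encodable_in is a hypothetical helper.

```python
def encodable_in(name: str, charset: str) -> bool:
    """Return True if `name` can be represented in `charset`."""
    try:
        name.encode(charset)
    except UnicodeEncodeError:
        return False
    return True


# Hypothetical file name containing non-ASCII characters.
name = "Főtanúsítvány.crt"

# An ASCII-only charset cannot represent the name; UTF-8 can.
print("ascii:", encodable_in(name, "ascii"))
print("utf-8:", encodable_in(name, "utf-8"))
```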

Leaving aside the question of whether we should avoid packaging such file
names, we might have a bigger problem here: our C locale does not work well
with Unicode names. To solve such issues, our friends at Debian introduced a
new C.UTF-8 locale. This might be a better alternative to forcing e.g.
en_US.UTF-8 as the default locale, which might have other consequences,
such as different sorting behavior. Some information can be found at
http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=609306 and
http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=522776

What is your opinion about this? Of course, the easiest solution would
be to ensure that all packages contain only ASCII file names. I also
wonder if there are any real-world use cases for non-UTF-8 locales.
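
If we went the ASCII-only route, a check along the following lines might be
one way to spot offending packages. This is just a sketch using Python's
tarfile module, not an existing tool of ours: non_ascii_members and
build_demo_archive are hypothetical names, and the demo archive with its
file names is made up so that the example is self-contained.

```python
import io
import tarfile


def non_ascii_members(fileobj):
    """Return the member names in a tar archive that are not pure ASCII."""
    # "r:*" lets tarfile detect the compression transparently.
    with tarfile.open(fileobj=fileobj, mode="r:*") as tar:
        return [m.name for m in tar.getmembers() if not m.name.isascii()]


def build_demo_archive():
    """Create a small in-memory pax archive with made-up, empty files."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w", format=tarfile.PAX_FORMAT) as tar:
        for name in ("etc/ssl/plain.crt", "etc/ssl/Főtanúsítvány.crt"):
            tar.addfile(tarfile.TarInfo(name))  # zero-sized regular file
    buf.seek(0)
    return buf


found = non_ascii_members(build_demo_archive())
print(found)
```

For a real package one would pass an opened file instead, e.g.
open(path, "rb"), provided the Python build includes the lzma module for
.tar.xz archives.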

Greetings,

Pierre


-- 
Pierre Schmitz, https://users.archlinux.de/~pierre

