[arch-general] Opening a document with unicode in path

Eli Schwartz eschwartz at archlinux.org
Fri Aug 2 18:07:56 UTC 2019


On 8/2/19 1:24 PM, John Z. wrote:
>> Could you verify that the encoding of the filepath is, in fact, UTF8?
>> Filepaths in linux are free to be arbitrary bytes despite the locale
>> settings. Most tools don't care, though I would expect the filepath to
>> display incorrectly in the terminal and file browser if it were not UTF8.
>> So it is probably a long shot but perhaps worth checking.
> 
> Hi, thank you for the suggestion. I tried running your script, and all
> filenames are decoded correctly, no exception was thrown (I also tried
> without try/except just in case something else gets thrown)
> 
> However, you might be onto something here because, interestingly enough:
> while the Bash prompt and autocompletion feature both decode the character
> correctly, `ls` does not and outputs a sequence of escape codes:
> 
>     Proc'$'\303\251''dures
> 
> instead of
> 
>     Procedures (where first 'e' is the unicode char, and has french accent)

By default, the ls command escapes a character into its octal byte codes
if it thinks the character is not printable in your locale. I can get ls
to print the same thing as you (using shell-escaped $'\303\251') *iff* I
first export LC_ALL=C (which is not a UTF-8 locale and therefore cannot
print unicode characters).
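
To make the escape sequence less mysterious, here is a minimal Python
sketch showing that \303\251 is simply the two UTF-8 bytes of the accented
"e", written in octal. The helper names (octal_escape, decodes_as) are made
up for illustration, and the escaping only approximates what ls prints; it
is not how ls is implemented.

```python
# The filename from the report, with U+00E9 as the accented character.
name = "Proc\u00e9dures"
raw = name.encode("utf-8")  # the bytes actually stored on disk


def octal_escape(data: bytes) -> str:
    """Render non-ASCII bytes as backslash-octal (hypothetical helper)."""
    return "".join(chr(b) if b < 0x80 else "\\%03o" % b for b in data)


def decodes_as(data: bytes, encoding: str) -> bool:
    """Return True if the bytes are valid text in the given encoding."""
    try:
        data.decode(encoding)
        return True
    except UnicodeDecodeError:
        return False


escaped = octal_escape(raw)
print(escaped)                    # Proc\303\251dures
print(decodes_as(raw, "utf-8"))   # True: valid in a UTF-8 locale
print(decodes_as(raw, "ascii"))   # False: not printable in the C locale
```

So the filename itself is fine; the same bytes are valid UTF-8 and only
look like garbage to a program running with an ASCII-only locale.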

This indicates something is wrong with your locale, because at the very
least, your shell cannot parse the character correctly -- maybe neither
can libreoffice.
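
A rough way to see which locale setting a program would pick up is to walk
the usual precedence for character handling: LC_ALL overrides LC_CTYPE,
which overrides LANG. The sketch below only inspects environment variables;
the function names are hypothetical, and it does not check whether the
named locale has actually been generated on the system (as far as I know,
an ungenerated locale makes programs fall back to C, which `locale -a`
would reveal).

```python
import os


def effective_ctype_setting(env=None):
    """Return (variable, value) for the first set variable in precedence order."""
    if env is None:
        env = os.environ
    for var in ("LC_ALL", "LC_CTYPE", "LANG"):
        value = env.get(var)
        if value:  # unset and empty both mean "not set"
            return var, value
    return None, "C"  # nothing set: the C locale applies


def looks_utf8(locale_name: str) -> bool:
    """Heuristic: does a name like en_US.UTF-8 or fr_FR.utf8@euro declare UTF-8?"""
    base = locale_name.split("@", 1)[0]  # drop any @modifier
    if "." not in base:
        return False  # no codeset given, e.g. plain "C"
    codeset = base.split(".", 1)[1]
    return codeset.lower().replace("-", "") == "utf8"


var, value = effective_ctype_setting()
print(var, value, looks_utf8(value))
```

If this reports something like LC_ALL=C, or a value without a UTF-8
codeset, that would match the behaviour you are seeing from ls.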

-- 
Eli Schwartz
Bug Wrangler and Trusted User
