[arch-general] Opening a document with unicode in path

Chris Billington chrisjbillington at gmail.com
Fri Aug 2 17:06:45 UTC 2019


Could you verify that the encoding of the filepath is, in fact, UTF-8?
Filepaths on Linux are free to be arbitrary bytes regardless of the locale
settings. Most tools don't care, though I would expect the filepath to
display incorrectly in the terminal and file browser if it were not UTF-8.
So it is probably a long shot, but perhaps worth checking.
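
To make that concrete, here is a minimal sketch showing that the same
accented character gives valid UTF-8 bytes under one encoding and invalid
ones under another. The helper name and the example filename are made up
for illustration and are not taken from your system:

```python
def is_valid_utf8(name: bytes) -> bool:
    """Return True if the raw filename bytes decode cleanly as UTF-8."""
    try:
        name.decode('utf-8')
    except UnicodeDecodeError:
        return False
    return True

# Hypothetical filename containing 'é' (U+00E9), encoded two different ways.
utf8_name = 'caf\u00e9.odt'.encode('utf-8')      # b'caf\xc3\xa9.odt'
latin1_name = 'caf\u00e9.odt'.encode('latin-1')  # b'caf\xe9.odt'

print(is_valid_utf8(utf8_name))    # True
print(is_valid_utf8(latin1_name))  # False
```

A name like the second one can exist on disk without any problem, but
anything that assumes UTF-8 may fail to display it or to find it.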

The following Python script, run in the directory containing the file or
directory with the French character in its name, should tell you if it is
valid UTF-8:

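import os

# Passing bytes to os.listdir() makes it return the raw filename bytes
# instead of decoded strings.
for item in os.listdir(b'.'):
    try:
        item.decode('utf8')
    except UnicodeDecodeError:
        print(item, "is not valid UTF8")
        # Re-raise to show the full decode error; this stops at the first bad name.
        raise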
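
If the offending name might be further up or down the path, a recursive
variant could help. This is only a rough sketch: the function names are my
own, and it collects every bad name instead of stopping at the first one:

```python
import os

def invalid_utf8_names(names):
    """Return the subset of raw filename bytes that are not valid UTF-8."""
    bad = []
    for name in names:
        try:
            name.decode('utf-8')
        except UnicodeDecodeError:
            bad.append(name)
    return bad

def scan_tree(root=b'.'):
    """Walk root (given as bytes) and return paths whose last component is not UTF-8."""
    bad_paths = []
    # A bytes root makes os.walk() yield bytes for every directory and file name.
    for dirpath, dirnames, filenames in os.walk(root):
        for name in invalid_utf8_names(dirnames + filenames):
            bad_paths.append(os.path.join(dirpath, name))
    return bad_paths

# Example use from the top of the Dropbox folder:
#     for path in scan_tree(b'.'):
#         print(path, "is not valid UTF-8")
```

An empty result would suggest the names on disk are fine and the problem
is elsewhere.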

On Fri, Aug 2, 2019 at 12:48 PM Eli Schwartz via arch-general <
arch-general at archlinux.org> wrote:

> On 8/2/19 8:59 AM, John Z. wrote:
> > Hi everyone,
> >     there's a document on Dropbox that has a Unicode character in its
> >     path (a French character). Trying to open this document with
> >     LibreOffice (Plasma is running) fails with 'file not found', and the
> >     path shown in the error clearly presents the path with that Unicode
> >     character replaced by '??'
> >
> >     What I tried:
> >     * copy the document to a path where there's no Unicode - it opens
> >     * copy the document using shell - it works
> >     * copy the document using Dolphin (from Plasma) - it works
> >     * check $LANG - it's set to `en_CA.UTF8`
> >     * search for 'libreoffice unicode path', 'archlinux unicode path'
> >       and a plethora of similar search terms - not much came up
> >
> >     This makes me think the issue is actually with LibreOffice, but the
> >     reason I ask here, and not in their forum, is that on another
> >     computer running Ubuntu - this works without fail, so I'm fairly
> >     certain the issue is in some local configuration.
> >
> >     Could anyone shed some light on this, please, or at least point me
> >     in some direction where I could look?
>
> Can you determine some steps that exactly reproduce the problem?
> Assuming that the problem should manifest when opening the file using
> /usr/bin/loffice /path/to/file, I tried creating a test file and opening
> it, and it worked:
>
> $ mkdir -p '/tmp/unicode paths are 💩/'
> $ touch '/tmp/unicode paths are 💩/testfile.txt'
> $ loffice '/tmp/unicode paths are 💩/testfile.txt'
> $
>
> I could successfully edit this file in libreoffice, save content, or
> reopen it.
> Tested with LANG=en_US.UTF-8 and the libreoffice-fresh package.
>
> --
> Eli Schwartz
> Bug Wrangler and Trusted User
>
>

