Could you verify that the encoding of the filepath is, in fact, UTF8? Filepaths in Linux are free to be arbitrary bytes regardless of the locale settings. Most tools don't care, though I would expect the filepath to display incorrectly in the terminal and file browser if it were not UTF8. So it is probably a long shot, but perhaps worth checking.
Hi, thank you for the suggestion. I tried running your script, and all filenames are decoded correctly; no exception was thrown (I also tried without try/except, just in case something else gets thrown). However, you might be onto something here because, interestingly enough, while the Bash prompt and autocompletion feature both decode the character correctly, `ls` does not and outputs a sequence of escape codes: Proc'$'\303\251''dures instead of Procedures (where the first 'e' is the Unicode character, and has a French accent).
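For what it's worth, the two octal escapes in that `ls` output can be checked by hand. The short Python sketch below only looks at the two bytes quoted above; it suggests the name on disk is well-formed UTF-8, so the escaping is more likely down to how `ls` sees the locale than to the filename itself (that last part is a guess, not something verified here).

```python
# The two bytes that `ls` printed as the octal escapes \303\251
raw = bytes([0o303, 0o251])

# Same bytes in hex, the way most UTF-8 tables list them
print(raw.hex())                  # c3a9

# Decoding succeeds, so the pair is a valid UTF-8 sequence
decoded = raw.decode("utf-8")
print(decoded)                    # é
print(f"U+{ord(decoded):04X}")    # U+00E9
```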
The following Python script, run in the directory containing the file/directory with the French character, should tell you if it is valid UTF8:
import os

for item in os.listdir(b'.'):
    try:
        item.decode('utf8')
    except UnicodeDecodeError:
        print(item, "is not valid UTF8")
        raise
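If the offending name might sit further down the tree, a recursive variant could look roughly like this. It is only a sketch: `is_valid_utf8` and `find_non_utf8` are names made up for illustration, and it collects the bad entries instead of raising on the first one.

```python
import os


def is_valid_utf8(name: bytes) -> bool:
    """Return True if the raw filename bytes decode cleanly as UTF-8."""
    try:
        name.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


def find_non_utf8(root: bytes = b".") -> list:
    """Walk `root` and collect entries whose own name is not valid UTF-8."""
    bad = []
    # Passing bytes to os.walk makes it yield raw bytes, not decoded str.
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            if not is_valid_utf8(name):
                bad.append(os.path.join(dirpath, name))
    return bad
```

Calling `find_non_utf8(b'.')` from the Dropbox directory and printing the result should list every suspicious path; an empty list means every name under that directory decoded fine.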
On Fri, Aug 2, 2019 at 12:48 PM Eli Schwartz via arch-general < arch-general@archlinux.org> wrote:
On 8/2/19 8:59 AM, John Z. wrote:
Hi everyone, there's a document on Dropbox that has a Unicode character in its path (a French character). Trying to open this document with LibreOffice (Plasma is running) fails with 'file not found', and the path shown in the error has that Unicode character replaced by '??'.
What I tried:
* copy the document to a path where there's no Unicode - it opens
* copy the document using shell - it works
* copy the document using Dolphin (from Plasma) - it works
* check $LANG - it's set to `en_CA.UTF8`
* search for 'libreoffice unicode path', 'archlinux unicode path' and a plethora of similar search terms - not much came through
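Checking $LANG on its own may not be conclusive: under the POSIX rules, a non-empty LC_ALL takes priority over LC_CTYPE, which in turn takes priority over LANG. A minimal sketch of that lookup follows; `effective_ctype` is a hypothetical helper name, and it only reads environment variables, so it does not tell you whether the named locale is actually generated on the system.

```python
import os


def effective_ctype(env) -> str:
    """Return the locale name that governs character handling (LC_CTYPE).

    POSIX precedence: LC_ALL, then LC_CTYPE, then LANG. Empty values are
    treated as unset; the fallback when nothing is set is "C".
    """
    for var in ("LC_ALL", "LC_CTYPE", "LANG"):
        value = env.get(var)
        if value:
            return value
    return "C"


# Report what the current process environment resolves to
print(effective_ctype(os.environ))
```

Running it from the same shell and from whatever launches LibreOffice could show whether the two environments differ.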
This makes me think the issue is actually with LibreOffice, but the reason I ask here, and not in their forum, is that on another computer running Ubuntu - this works without fail, so I'm fairly certain the issue is in some local configuration.
Could anyone shed some light on this, please, or at least point me in some direction where I could look?
Can you determine some steps that exactly reproduce the problem? Assuming that the problem should manifest when opening the file using /usr/bin/loffice /path/to/file, I tried creating a test file and opening it, and it worked:
$ mkdir -p '/tmp/unicode paths are 💩/'
$ touch '/tmp/unicode paths are 💩/testfile.txt'
$ loffice '/tmp/unicode paths are 💩/testfile.txt'
$
I could successfully edit this file in LibreOffice, save content, or reopen it. Tested with LANG=en_US.UTF-8 and the libreoffice-fresh package.
-- Eli Schwartz Bug Wrangler and Trusted User
-- "That gum you like is going to come back in style."