Re: [arch-dev-public] packaging hunspell dictionaries converted for qt5-webengine
On August 13, 2019 5:17:27 AM EDT, Florian Bruhin <me@the-compiler.org> wrote:
Hey,
My $0.02 as qutebrowser maintainer (off-list because I can't send to arch-dev-public):
Forwarded back to a-d-p with inline comments. :)
On Mon, Aug 12, 2019 at 07:50:44PM -0400, Eli Schwartz via arch-dev-public wrote:
QtWebEngine supports spellchecking: https://doc.qt.io/qt-5/qtwebengine-features.html#spellchecker
However, they have helpfully decided (steered by upstream chromium) to *not* use hunspell dictionaries, and instead to use... hunspell dictionaries stored in /usr/share/qt/qtwebengine_dictionaries/ as ".bdic" files, because this is supposedly "more efficiently read by chromium".
The actual spell checking is implemented inside Chromium; all QtWebEngine does with the dictionaries is pass them to Chromium. I don't think they're happy with bdic files either, but the alternative isn't really an option (completely reimplementing spell checking support by patching their copy of Chromium, with a lot of added friction each time they want to update their Chromium snapshot).
So it pretty much boils down to "blame Google/Chromium" ;)
Yeah, but I still wanna blame them for not patching it for the purpose of integrating well. :p
(Actually QtWebEngine's spell-checking infrastructure is entirely willing to read dictionaries in /usr/bin/qtwebengine_dictionaries before looking in /usr/share because clearly they've put great thought into how this is all supposed to work on a conceptual design level especially for distro packaging.)
Agreed this doesn't make much sense for Linux distributions. It happens because it looks next to the executable, which probably *does* make a lot of sense for Windows, macOS, embedded scenarios, bundled apps, etc. It doesn't help much for distributions, but it also doesn't hurt.
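The lookup order described above can be sketched in a few lines of Python. This is only an illustration of the two locations mentioned in this thread, not QtWebEngine's actual code; the function names and the `/usr/share/qt` data directory default are assumptions made for the example.

```python
from pathlib import PurePosixPath


def candidate_dirs(executable, data_dir="/usr/share/qt"):
    """List dictionary directories in the order described in the thread.

    The directory next to the executable comes first, then the shared data
    directory. Both the helper and the data_dir default are hypothetical.
    """
    exe_dir = PurePosixPath(executable).parent
    return [
        str(exe_dir / "qtwebengine_dictionaries"),
        str(PurePosixPath(data_dir) / "qtwebengine_dictionaries"),
    ]


def first_existing(dirs, exists):
    """Return the first directory for which exists(dir) is true, else None."""
    for directory in dirs:
        if exists(directory):
            return directory
    return None
```

For a binary installed as /usr/bin/pageedit, the first candidate is /usr/bin/qtwebengine_dictionaries, which is why the executable-relative lookup is harmless but useless on a distribution.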
So I have a program -- pageedit -- which just added spellchecking support via qtwebengine in the latest release, and I would like to support that. And I don't want to see people being personally responsible for installing their own stuff in /usr/share. While I'm at it, Morten (Foxboron) pointed out to me that qutebrowser also supports spellchecking, and it currently provides a user script which downloads preconverted dictionaries from chromium's git repository into $HOME/.local/share/qutebrowser/ ... because there's apparently no guidance or precedent for actually distributing these dictionaries. (In fact, currently only Fedora seems to make these dictionaries available to users.)
Oh, I didn't know Fedora packages them! I opened a qutebrowser issue too: https://github.com/qutebrowser/qutebrowser/issues/4966
It's possible to convert them yourself, using the qwebengine_convert_dict tool shipped in the qt5-webengine package. I think it would be nice if users were able to obtain these dictionaries properly, but I'm not positive what the best way would be. Ideas:
- Ship a pacman hook to convert whatever the user has installed, implemented via the following libalpm script and hooks: https://paste.xinu.at/m-ydTjU/
- make every hunspell-* package makedepend on qt5-webengine and produce those dictionaries
- same thing but also make split packages for basically a tiny data file
- force users to install an out of date AUR package not kept in sync with hunspell-* (this one is just a joke)
The advantage of a hook is that users with webengine installed automatically get magic google-approved dictionaries corresponding to the hunspell dictionaries they have installed.
The advantage of modifying each hunspell-* package is saving about 0.38 seconds per file at installation time, plus users don't have weird untracked files in some cloistered dir in /usr/
The advantage of doing anything other than possibility #3 is "avoid adding another 34 packages to the repositories, which users need to manually install in addition to the other dictionaries they explicitly installed".
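Whichever option is chosen, the conversion step itself is the same: one converter call per hunspell dictionary. The following Python sketch only plans those calls. It is not the script from the paste link above, and it assumes that qwebengine_convert_dict takes an input .dic path followed by an output .bdic path, that the hunspell dictionaries live in /usr/share/hunspell, and that the output keeps the input's base name; the function name is hypothetical.

```python
from pathlib import PurePosixPath

# Assumed locations and tool name; adjust to the actual packaging layout.
HUNSPELL_DIR = "/usr/share/hunspell"
BDIC_DIR = "/usr/share/qt/qtwebengine_dictionaries"
CONVERTER = "qwebengine_convert_dict"  # shipped in qt5-webengine


def plan_conversions(dic_files, bdic_dir=BDIC_DIR, converter=CONVERTER):
    """Return one argv list per .dic file: [converter, input.dic, output.bdic].

    Nothing is executed here; the caller decides whether to run the commands.
    """
    commands = []
    for dic in sorted(PurePosixPath(p) for p in dic_files):
        if dic.suffix != ".dic":
            continue  # skip .aff files and anything else
        target = PurePosixPath(bdic_dir) / (dic.stem + ".bdic")
        commands.append([converter, str(dic), str(target)])
    return commands
```

A hook or a PKGBUILD could pass each returned list to subprocess.run(argv, check=True). In the hook case the outputs would be the untracked files discussed above; in the PKGBUILD case they would land in the package directory and be tracked by pacman.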
Depending on how big those dictionaries are, they all could be in a single qt5-webengine-dicts package? Though I guess they aren't much smaller than the hunspell ones, and there probably was a reason those were split.
Well, they are different source code with different versions, so I see no gain or practical way to implement a combined hunspell package. A combined webengine dicts package would need to makedepend on all hunspell dict packages, then get updated for any hunspell dict update.
On Tue, Aug 13, 2019 at 09:22:39AM +0200, Jan Alexander Steffens via arch-dev-public wrote:
On Tue, Aug 13, 2019 at 9:04 AM Bartłomiej Piotrowski via arch-dev-public <arch-dev-public@archlinux.org> wrote:
I'd go with updating all packages to ship the converted files. Cluttering /usr with untracked files doesn't sound good.
Yeah, I agree. I think we should package convert_dict from the Chromium sources as a new package to makedepend on.
I'm assuming those are compatible with each other? It does seem like it from the sources:
https://github.com/qt/qtwebengine/blob/v5.13.0/src/tools/qwebengine_convert_...
https://github.com/qt/qtwebengine-chromium/blob/75-based/chromium/chrome/too...
Assuming that WebEngine will not be the only consumer of .bdic dictionaries, how about putting them in /usr/share/bdic, and then either patching sources to use that dir or linking whatever engine-specific dictionaries there?
We could also put them with the other dictionaries into /usr/share/hunspell, assuming that won't cause problems.
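The shared-directory idea (one common location such as /usr/share/bdic, with engine-specific directories linking into it) can be sketched as follows. This is a hypothetical illustration: the helper names are invented, the demo runs in a temporary directory rather than under /usr, and a real package would ship the symlinks as tracked files instead of creating them at runtime.

```python
import tempfile
from pathlib import Path


def link_shared_dicts(shared_dir, engine_dir):
    """Symlink every .bdic in shared_dir into engine_dir.

    Existing entries in engine_dir are left alone. Returns the names of the
    links that were created.
    """
    shared_dir, engine_dir = Path(shared_dir), Path(engine_dir)
    engine_dir.mkdir(parents=True, exist_ok=True)
    created = []
    for bdic in sorted(shared_dir.glob("*.bdic")):
        link = engine_dir / bdic.name
        if link.exists() or link.is_symlink():
            continue  # never overwrite something that is already there
        link.symlink_to(bdic)
        created.append(link.name)
    return created


def demo():
    """Exercise the helper inside a throwaway directory tree."""
    with tempfile.TemporaryDirectory() as root:
        shared = Path(root) / "share" / "bdic"
        shared.mkdir(parents=True)
        (shared / "en_US.bdic").write_bytes(b"")  # placeholder file
        engine = Path(root) / "share" / "qt" / "qtwebengine_dictionaries"
        first = link_shared_dicts(shared, engine)
        second = link_shared_dicts(shared, engine)  # nothing left to link
        return first, second
```

Patching each consumer to read the shared directory directly would avoid the links altogether, at the cost of carrying a patch per engine.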
I guess Qt wouldn't be opposed to a change (for Qt 5.14 I guess) adding one of those paths.
Maybe the Chromium package could load them as well from there?
Florian
I have no idea at all what chromium does do right now!
--
Eli Schwartz
Bug Wrangler and Trusted User