[arch-dev-public] packaging hunspell dictionaries converted for qt5-webengine

Eli Schwartz eschwartz at archlinux.org
Mon Aug 12 23:50:44 UTC 2019


QtWebEngine supports spellchecking:
https://doc.qt.io/qt-5/qtwebengine-features.html#spellchecker

However, they have helpfully decided (steered by upstream chromium) to
*not* use hunspell dictionaries directly, and instead to use... hunspell
dictionaries converted to binary ".bdic" files and stored in
/usr/share/qt/qtwebengine_dictionaries/, because this is supposedly
"more efficiently read by chromium".

(Actually, QtWebEngine's spell-checking infrastructure is entirely
willing to read dictionaries in /usr/bin/qtwebengine_dictionaries before
looking in /usr/share, because clearly they've put great thought into
how this is all supposed to work on a conceptual design level,
especially for distro packaging.)

So I have a program -- pageedit -- which just added spellchecking
support via qtwebengine in the latest release, and I would like to
support that. And I don't want to see people being personally
responsible for installing their own stuff in /usr/share. While I'm at
it, Morten (Foxboron) pointed out to me that qutebrowser also supports
spellchecking, and it currently provides a user script which downloads
preconverted dictionaries from chromium's git repository into
$HOME/.local/share/qutebrowser/ ... because there's apparently no
guidance or precedent for actually distributing these dictionaries. (In
fact, currently only Fedora seems to make these dictionaries available
to users.)

It's possible to convert them yourself, using the
qwebengine_convert_dict tool shipped in the qt5-webengine package. I
think it would be nice if users were able to obtain these dictionaries
properly, but I'm not positive what the best way would be. Ideas:

- Ship a pacman hook to convert whatever the user has installed,
  implemented via the following libalpm script and hooks:
  https://paste.xinu.at/m-ydTjU/
- Make every hunspell-* package makedepend on qt5-webengine and produce
  those dictionaries
- Same thing, but also make split packages for what is basically a tiny
  data file
- Force users to install an out of date AUR package not kept in sync
  with hunspell-* (this one is just a joke)

The advantage of a hook is that users with webengine installed
automatically get magic google-approved dictionaries corresponding to
the hunspell dictionaries they have installed.

The advantage of modifying each hunspell-* package is saving about 0.38
seconds per file at installation time, plus users don't have weird
untracked files in some cloistered dir in /usr/.

The advantage of doing anything other than the third option (split
packages) is "avoid adding another 34 packages to the repositories,
which users need to manually install in addition to the other
dictionaries they explicitly installed".
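
To make the conversion step concrete, here is a rough sketch of what
both the hook and the per-package approaches would end up running. It is
not the script from the paste above. The function names are made up for
illustration, the tool is assumed to be on PATH and to be called as
"qwebengine_convert_dict <input.dic> <output.bdic>", and as far as I
know it expects the matching .aff file next to the .dic, so dictionaries
without one are skipped.

```python
import subprocess
from pathlib import Path

# Directories as mentioned in this mail; adjust if your layout differs.
HUNSPELL_DIR = Path("/usr/share/hunspell")
BDIC_DIR = Path("/usr/share/qt/qtwebengine_dictionaries")
# Tool shipped in qt5-webengine; assumed here to be on PATH.
CONVERTER = "qwebengine_convert_dict"


def plan_conversions(hunspell_dir, bdic_dir):
    """Return one [tool, input.dic, output.bdic] command per dictionary.

    Hypothetical helper: it only builds the command lines and does not
    touch the filesystem beyond listing hunspell_dir.
    """
    commands = []
    for dic in sorted(Path(hunspell_dir).glob("*.dic")):
        # Assumption: the converter needs the .aff file beside the .dic.
        if not dic.with_suffix(".aff").exists():
            continue
        out = Path(bdic_dir) / (dic.stem + ".bdic")
        commands.append([CONVERTER, str(dic), str(out)])
    return commands


def convert_all(hunspell_dir=HUNSPELL_DIR, bdic_dir=BDIC_DIR):
    """Run the converter for every planned command (needs root for /usr)."""
    Path(bdic_dir).mkdir(parents=True, exist_ok=True)
    for cmd in plan_conversions(hunspell_dir, bdic_dir):
        subprocess.run(cmd, check=True)
```

Calling plan_conversions() on its own shows what would be converted
without writing anything; convert_all() does the actual work. A hook
would run the latter on every hunspell-* install or upgrade, while the
per-package variant would run the equivalent inside each PKGBUILD.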

...

Prior art:

Fedora uses rpm post-install filetriggers:
https://src.fedoraproject.org/rpms/qt5-qtwebengine/blob/master/f/qt5-qtwebengine.spec#_489

Gentoo has a proposal for a package that runs the conversion tool on
each file the user has installed in /usr/share/hunspell/ and packages
the results.

...

Thoughts on the best way forward to make these dictionaries available on
Arch Linux?

-- 
Eli Schwartz
Bug Wrangler and Trusted User
