On Sat, Apr 6, 2024 at 10:42 PM Arvid Norlander <arvid@vorpal.se> wrote:
Hi,

After I talked to people on the Arch Linux IRC channels (mpan in
particular), they recommended I contact you directly, since this affects
many packages (so filing a bug on any specific package wasn't really
appropriate).

Arch only packages the final binary crates for Rust (as opposed to separate
packages for every single Rust dependency, which would rightfully drive
people crazy). As a result, you only get one single package that contains
many libraries.

Let's take an example: ripgrep, which links to ~53 dependencies (maybe fewer,
not all dependencies are used on all platforms, etc). Many of these
dependencies use the MIT license or similar, which requires:

> The above copyright notice and this permission notice shall be included in
> all copies or substantial portions of the Software.

Looking in /usr/share/licenses/ripgrep I only see the copyright notice
for ripgrep itself. This is a bit of a problem.

Similarly, one of its dependencies is unicode-ident, which uses
"Unicode-DFS-2016" as its license. That is missing. encoding_rs is
"(Apache-2.0 OR MIT) AND BSD-3-Clause" (which may collapse down to MIT, not
sure, I am not a lawyer).

So I can see two problems here:

1. Copyright notices that need to be included for dependencies, aren't.
2. The SPDX expression may be incorrect (missing some things from
   dependencies that should be included).

Looking at some other packages (not just ripgrep) I see similar issues
across all of those packages.

Now, for something like Rust it would be impossible to require the
maintainers to handle this by hand. So let's talk solutions.

* There is cargo-about (already packaged in Arch, also has suspect
  license/copyright info). It doesn't do quite what you want: Given a
  config and a Handlebars template it will generate an HTML page with all
  the licenses you use. It will collapse "OR" in the license info based on
  a priority list of which licenses you prefer.

  You could perhaps wrangle the Handlebars template to generate a text
  file instead of an HTML file, not sure. It can also output a JSON file
  instead though (I have used that at my day job for license compliance).

  While it is a good solution, it is not a drop-in solution without
  additional scripting on top (and you need to specify accepted licenses
  for your project).

* I believe Fedora has some automated tooling based on this comment by a
  Fedora packager: https://users.rust-lang.org/t/psa-check-if-your-cargo-crates-are-clean-and-tagged/109264/25

  That is for the SPDX expression. I have not looked into the details of
  their tooling.

* Debian has probably thought about this too; they tend to be rather
  careful (some may even say uptight) about this sort of issue. It might
  be worth checking out what they did.
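
As a starting point for such tooling, here is a rough Python sketch that
collects the raw license fields from the JSON printed by
`cargo metadata --format-version 1`. The function names `load_metadata` and
`combined_license` are made up for illustration. It also assumes that every
package in the metadata ends up in the binary, which overcounts: the output
includes dev-dependencies and crates for other platforms. It does nothing
about the copyright notices themselves.

```python
import json
import subprocess


def load_metadata(manifest_dir="."):
    """Run `cargo metadata` in manifest_dir and return the parsed JSON."""
    out = subprocess.run(
        ["cargo", "metadata", "--format-version", "1"],
        cwd=manifest_dir,
        check=True,
        capture_output=True,
        text=True,
    ).stdout
    return json.loads(out)


def combined_license(metadata):
    """AND together the distinct license fields of all packages.

    Returns (expression, missing), where `missing` lists packages that have
    no license field at all (for example crates that only set license-file).
    Assumption: every listed package is actually linked into the binary.
    """
    expressions = set()
    missing = []
    for pkg in metadata["packages"]:
        lic = pkg.get("license")
        if lic:
            expressions.add(lic)
        else:
            missing.append(f'{pkg["name"]} {pkg["version"]}')
    joined = " AND ".join(f"({e})" for e in sorted(expressions))
    return joined, missing
```

Calling `combined_license(load_metadata("path/to/ripgrep"))` in a source
checkout would give an unsimplified expression plus the list of crates that
still need a manual look.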

I'd like some automated tooling that works for the AUR too, since I maintain
some Rust packages there (some of which I'm also the upstream for). For that
reason, I'd be happy to stay "in the loop" on this issue as well as
possibly help (time and energy permitting, due to recently recovering from
burnout, energy tends to vary on a day to day basis), rather than treating
it as a one-off issue report.

Is it a high priority issue? No.
Should it be solved eventually? Yes, for legal reasons.

Best regards,
Arvid Norlander
(Arch Linux user and Rust / C++ software developer)

Hi Arvid,

Thanks for bringing this issue to my attention and for your detailed email about it. I'm CC'ing our public development mailing list in this response so our other maintainers get informed, too.

I agree that Arch needs a solution for this eventually. Unlike Fedora, we do not package Rust libraries, so I think we need some help from Cargo for this. Preferably from upstream, but a third-party tool would work as well.

Ideally, I think we would create an SPDX license expression from the entire crate tree and then simplify it, e.g. to turn `(MIT) AND (MPL-2.0 OR MIT) AND (MIT AND BSD-2-Clause) AND (MPL-2.0 OR BSD-3-Clause)` into `MIT AND BSD-2-Clause AND (MPL-2.0 OR BSD-3-Clause)`. Or perhaps something even simpler, if the tool knew which licenses are covered by others.
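
To illustrate the kind of simplification I mean, here is a minimal Python sketch. It is not an existing tool, and the names `parse` and `simplify` are made up. It assumes plain SPDX identifiers combined with uppercase `AND`, `OR` and parentheses (no `WITH` exceptions), rewrites the expression as an AND of OR-groups, and drops every group that is a superset of another one, on the assumption that `MIT AND (MPL-2.0 OR MIT)` can be satisfied by picking MIT. It has no knowledge of which licenses cover others.

```python
import re
from itertools import product

TOKEN = re.compile(r"\(|\)|[A-Za-z0-9.+-]+")


def parse(expr):
    """Parse an SPDX-like expression into a set of OR-groups (frozensets).

    The result is read as: group1 AND group2 AND ..., where each group is
    id1 OR id2 OR ... . AND binds tighter than OR, as in SPDX.
    """
    tokens = TOKEN.findall(expr)
    pos = 0

    def parse_or():
        nonlocal pos
        clauses = parse_and()
        while pos < len(tokens) and tokens[pos] == "OR":
            pos += 1
            right = parse_and()
            # Distribute OR over AND: combine every left/right group pair.
            clauses = {a | b for a, b in product(clauses, right)}
        return clauses

    def parse_and():
        nonlocal pos
        clauses = parse_atom()
        while pos < len(tokens) and tokens[pos] == "AND":
            pos += 1
            clauses = clauses | parse_atom()
        return clauses

    def parse_atom():
        nonlocal pos
        if pos >= len(tokens):
            raise ValueError("unexpected end of expression")
        tok = tokens[pos]
        pos += 1
        if tok == "(":
            inner = parse_or()
            if pos >= len(tokens) or tokens[pos] != ")":
                raise ValueError("missing closing parenthesis")
            pos += 1
            return inner
        if tok in (")", "AND", "OR"):
            raise ValueError(f"unexpected token {tok!r}")
        return {frozenset([tok])}

    result = parse_or()
    if pos != len(tokens):
        raise ValueError(f"unexpected token {tokens[pos]!r}")
    return result


def simplify(expr):
    """Drop OR-groups that are strict supersets of another group."""
    clauses = parse(expr)
    minimal = [c for c in clauses if not any(other < c for other in clauses)]
    minimal.sort(key=lambda c: (len(c), sorted(c)))
    parts = []
    for clause in minimal:
        text = " OR ".join(sorted(clause))
        needs_parens = len(clause) > 1 and len(minimal) > 1
        parts.append(f"({text})" if needs_parens else text)
    return " AND ".join(parts)
```

For the expression above this yields `BSD-2-Clause AND MIT AND (BSD-3-Clause OR MPL-2.0)`, which is the same result as my example apart from ordering. Whether dropping the redundant groups is legally sound is a separate question from whether it is logically sound.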

We could call such a tool in the `package()` function to set the `license` for the package.

I'm not sure how feasible this would be. Are crates required to use SPDX expressions?

Greetings,
Jan