[arch-dev-public] [RFC] archweb nvchecker integration

Felix Yan felixonmars at archlinux.org
Wed Feb 2 19:03:21 UTC 2022


On 2/1/22 22:54, George Rawlinson via arch-dev-public wrote:
> On 22-02-01 08:21, Morten Linderud via arch-dev-public wrote:
> At this stage, the following [community] packages that I maintain
> require massaging of HTML sources:
> 
> * html-xml-utils
> * oil
> * parallel
> * libmilter (bundled with sendmail source)
> * time
> 
> I suppose if a nvchecker plugin existed that utilised bs4 (beautiful
> soup), that would work. But I assume that would still fit your
> definition of "arbitrary script". :p

There are two plugins for this: a regex plugin and an htmlparser plugin.

The htmlparser plugin accepts an XPath expression, but if you need to
process the matched text further, the regex plugin may work better.

Examples for your packages:

[html-xml-utils]
source = "regex"
url = "https://www.w3.org/Tools/HTML-XML-utils/"
regex = 'html-xml-utils-(.*?)\.tar\.gz'

[oil]
source = "htmlparser"
url = "https://www.oilshell.org/release/latest/"
xpath = "//h1/text()"
prefix = "Oil "
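
To illustrate roughly what these two entries do, here is a small Python
sketch. It is not nvchecker's actual code: the helper names, the sample
HTML fragment and the version strings are all made up for illustration,
and the dots in the pattern are escaped so they match literally.

```python
import re

# Made-up HTML fragment standing in for the real download page;
# the version number 1.2.3 is illustrative only.
SAMPLE_HTML = '<a href="html-xml-utils-1.2.3.tar.gz">html-xml-utils-1.2.3.tar.gz</a>'


def extract_versions(pattern, text):
    """Return the unique capture-group matches of pattern in text, sorted."""
    return sorted(set(re.findall(pattern, text)))


def strip_prefix(value, prefix):
    """Drop prefix from value if present, similar in spirit to a prefix option."""
    return value[len(prefix):] if value.startswith(prefix) else value


# The capture group keeps only the version part of the tarball name.
versions = extract_versions(r"html-xml-utils-(.*?)\.tar\.gz", SAMPLE_HTML)
print(versions)

# Hypothetical heading text, as an XPath like //h1/text() might return it.
print(strip_prefix("Oil 0.0.0", "Oil "))
```

The capture group is what ends up as the version, which is why the regex
approach is handy when the page text needs trimming on both sides.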

-- 
Regards,
Felix Yan