On Fri, 29 Apr 2016 at 13:56:58, Dave Reisner wrote:
What are your reasons for not wanting to extend the RPC interface? Splitting the API in this way is cumbersome for clients. We now are supposed to download and maintain data which is essentially an index (which ought to be internal to the server to allow strong consistency). You're also asking clients of a read-only interface to become stateful, which isn't something that really interests me.
Isn't that exactly the way pacman works, though? It downloads a copy of the database locally and uses that database to answer requests and obtain package information. My vision is that, optimally, the official repositories and the AUR build upon the same basic concept. Apart from binary vs. source packages, the only real difference between the official repositories and the AUR is the number of packages, but if we figure out a good way to solve point (2) from my initial email, that should be a non-issue.

Apart from that, there are two general directions we can go:

* Do everything on the server. Keep extending the server for every feature that is needed by some client. What happens if a user only wants to know the number of packages matched by a given expression? Do we really want to force her to fetch the whole list of matched packages, just to obtain its size, or do we add another request type? And even if regular expressions were the last missing thing, adding them demands a bit more thought than one might expect (what kind of expressions do we support, do we need to care about ReDoS or is that handled by the engines themselves, etc.)

* Directly publish all the information required to answer all possible requests. Let the clients do whatever they want. Currently, we only provide package names but in the future, this could be extended to a more complete database like the one pacman uses.

I am not saying that the second option is the holy grail. For a simple web application that retrieves information on a package or for a single basic package search, downloading everything might be overkill. That is why I suggest keeping the very basic RPC interface we have right now and, additionally, providing package databases for fancier applications.

I am not set on this idea yet. It just seems like the most natural and Arch way of handling this kind of thing. I am open to discussion!
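As a rough sketch of the second option, here is how a client could answer the "how many packages match this expression?" question locally, without any new request type on the server. It assumes the client already has the published name list in memory as a list of strings; the function name `count_matches` and the sample names are made up for illustration and are not part of any existing interface.

```python
import re


def count_matches(package_names, pattern):
    """Count names matching a regular expression, entirely on the client."""
    regex = re.compile(pattern)
    return sum(1 for name in package_names if regex.search(name))


# Hypothetical sample standing in for a locally downloaded name list.
names = ["pacman-git", "cower", "cower-git", "aurutils", "pacaur"]

print(count_matches(names, r"-git$"))  # 2
print(count_matches(names, r"^pac"))   # 2
```

Since the expression is evaluated by the client's own regex engine, the questions about supported syntax and ReDoS stay on the client side and never reach the server.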
Or, change the storage for the name list such that updates can be fast. Turns out, you already have such a thing, you'd just need an index on the Packages and PackageBases tables.
Those indices are there already. Dropping the package list cache completely might be an option (I still need to investigate the performance impact).
I'm not understanding why any of this is considered a good direction for the API. What are the reasons for wanting the whole list of package names on the client side? Are there use cases other than search?
Search could be extended in many ways, especially now that we have useful metadata. One could build full dependency trees of the AUR or add proper support for package groups, just to name two examples.

Regards,
Lukas