[PATCH] Add rate limit support to API

Lukas Fleischer lfleischer at archlinux.org
Fri Jan 26 19:39:13 UTC 2018


On Thu, 07 Dec 2017 at 23:38:59, Florian Pritz wrote:
> This allows us to prevent users from hammering the API every few seconds
> to check if any of their packages were updated. Real world users check
> as often as every 5 or 10 seconds.

First of all, thanks for the patch!

Out of curiosity: is there a noticeable performance impact caused by
these requests? Our is this just a safety measure for the future? I am
also asking because below, your are setting the default limit to one
request every ~20 seconds. Do you think a reduction by a factor of two
is sufficient to mitigate the issue?

> 
> Signed-off-by: Florian Pritz <bluewind at xinu.at>
> ---
> 
> Basic idea for a rate limiting solution. Currently the cleanup of old
> entries is done on each api request. That may be a bit excessive, but I
> didn't find a cronjob to put it and given the table is indexed, the
> query is probably fast enough. At least for a PoC this should be fine.
> Tell me what you think.

The overall idea sounds good to me. More comments below.

> [...]
> --- /dev/null
> +++ b/upgrading/4.7.0.txt

You also need to create the table in our database schema (which is
located under schema/aur-schema.sql).

> @@ -0,0 +1,11 @@
> +1. Add ApiRateLimit table:
> +
> +---
> +CREATE TABLE `ApiRateLimit` (
> +  `ip` varchar(45) CHARACTER SET ascii COLLATE ascii_bin NOT NULL,
> +  `requests` int(11) NOT NULL,
> +  `window_start` bigint(20) NOT NULL,

We use for the field names in all other tables and I would like to keep
this consistent. Upper case is used for the data types.

> +  PRIMARY KEY (`ip`),
> +  KEY `window_start_idx` (`window_start`)

Same applies to the key.

> +) ENGINE=InnoDB;

Space before and after the "=".

> diff --git a/web/lib/aurjson.class.php b/web/lib/aurjson.class.php
> index 9eeaafd..8a6dd7b 100644
> --- a/web/lib/aurjson.class.php
> +++ b/web/lib/aurjson.class.php
> @@ -96,6 +96,11 @@ public function handle($http_data) {
>  
>                 $this->dbh = DB::connect();
>  
> +               if ($this->check_ratelimit($_SERVER['REMOTE_ADDR'])) {
> +                       header("HTTP/1.1 429 Too Many Requests");
> +                       return $this->json_error('Rate limit reached');
> +               }
> +
>                 $type = str_replace('-', '_', $http_data['type']);
>                 if ($type == 'info' && $this->version >= 5) {
>                         $type = 'multiinfo';
> @@ -130,6 +135,72 @@ public function handle($http_data) {
>                 }
>         }
>  
> +       /*
> +        * Check if an IP needs to be  rate limited.
> +        *
> +        * @param $ip IP of the current request
> +        *
> +        * @return true if IP needs to be rate limited, false otherwise.
> +        */
> +       private function check_ratelimit($ip) {
> +               $limit = config_get("ratelimit", "request_limit");
> +               $window_length = config_get("ratelimit", "window_length");
> +               $this->update_ratelimit($ip);
> +               $stmt = $this->dbh->prepare("
> +                       SELECT requests,window_start FROM ApiRateLimit
> +                       WHERE ip = :ip");
> +               $stmt->bindParam(":ip", $ip);

We usually do not use prepared statements but I think it does not
hurt -- unless it breaks with SQLite.

> +               $result = $stmt->execute();
> +
> +               if (!$result) {
> +                       return false;
> +               }
> +
> +               $row = $stmt->fetch(PDO::FETCH_ASSOC);
> +
> +               if ($row['window_start'] < time() - $window_length) {
> +                       $stmt = $this->dbh->prepare("
> +                               DELETE FROM ApiRateLimit
> +                               WHERE ip = :ip");
> +                       $stmt->bindParam(":ip", $ip);
> +                       $stmt->execute();
> +                       return false;
> +               }
> +
> +               if ($row['requests'] > $limit) {
> +                       return true;
> +               }
> +               return false;
> +       }
> +
> +       /*
> +        * Update a rate limit for an IP by increasing it's requests value by one.
> +        *
> +        * @param $ip IP of the current request
> +        *
> +        * @return void
> +        */
> +       private function update_ratelimit($ip) {
> +               $window_length = config_get("ratelimit", "window_length");
> +               $time = time();
> +               $deletion_time = $time - $window_length;
> +               $stmt = $this->dbh->prepare("
> +                       INSERT INTO ApiRateLimit
> +                       (ip, requests, window_start)
> +                       VALUES (:ip, 0, :window_start)
> +                       ON DUPLICATE KEY UPDATE requests=requests+1");

I do not think this works with SQLite. Is there a more standardized way
of doing this? Maybe "INSERT OR REPLACE"?

> +               $stmt->bindParam(":ip", $ip);
> +               $stmt->bindParam(":window_start", $time);
> +               $stmt->execute();
> +
> +               // TODO: this should move into some cronjob
> +               $stmt = $this->dbh->prepare("
> +                       DELETE FROM ApiRateLimit
> +                       WHERE window_start < :time");
> +               $stmt->bindParam(":time", $deletion_time);
> +               $stmt->execute();

I am quite confused. There is a similar piece of code in
check_ratelimit() already. Is either of these statements a leftover or
do they serve different purposes?

I am also curious about the general performance impact of this patch. It
reduces request flooding but at the same time adds a couple of
additional database queries to each request. Do you happen to have any
statistics?

> +       }
> +
>         /*
>          * Returns a JSON formatted error string.
>          *
> -- 
> 2.15.1


More information about the aur-dev mailing list