On Thu, 07 Dec 2017 at 23:38:59, Florian Pritz wrote:
This allows us to prevent users from hammering the API every few seconds to check if any of their packages were updated. Real world users check as often as every 5 or 10 seconds.
First of all, thanks for the patch! Out of curiosity: is there a noticeable performance impact caused by these requests? Or is this just a safety measure for the future? I am also asking because below, you are setting the default limit to one request every ~20 seconds. Do you think a reduction by a factor of two is sufficient to mitigate the issue?
Signed-off-by: Florian Pritz <bluewind@xinu.at>
---
Basic idea for a rate limiting solution. Currently, the cleanup of old entries is done on each API request. That may be a bit excessive, but I didn't find a cron job to put it in, and since the table is indexed, the query is probably fast enough. At least for a PoC this should be fine. Tell me what you think.
The overall idea sounds good to me. More comments below.
[...]

--- /dev/null
+++ b/upgrading/4.7.0.txt
You also need to create the table in our database schema (which is located under schema/aur-schema.sql).
@@ -0,0 +1,11 @@
+1. Add ApiRateLimit table:
+
+---
+CREATE TABLE `ApiRateLimit` (
+	`ip` varchar(45) CHARACTER SET ascii COLLATE ascii_bin NOT NULL,
+	`requests` int(11) NOT NULL,
+	`window_start` bigint(20) NOT NULL,
We use CamelCase for the field names in all other tables and I would like to keep this consistent. Upper case is used for the data types.
+	PRIMARY KEY (`ip`),
+	KEY `window_start_idx` (`window_start`)
Same applies to the key.
+) ENGINE=InnoDB;
Space before and after the "=".
diff --git a/web/lib/aurjson.class.php b/web/lib/aurjson.class.php
index 9eeaafd..8a6dd7b 100644
--- a/web/lib/aurjson.class.php
+++ b/web/lib/aurjson.class.php
@@ -96,6 +96,11 @@ public function handle($http_data) {
 		$this->dbh = DB::connect();
+		if ($this->check_ratelimit($_SERVER['REMOTE_ADDR'])) {
+			header("HTTP/1.1 429 Too Many Requests");
+			return $this->json_error('Rate limit reached');
+		}
+
 		$type = str_replace('-', '_', $http_data['type']);
 		if ($type == 'info' && $this->version >= 5) {
 			$type = 'multiinfo';
@@ -130,6 +135,72 @@ public function handle($http_data) {
 		}
 	}
+	/*
+	 * Check if an IP needs to be rate limited.
+	 *
+	 * @param $ip IP of the current request
+	 *
+	 * @return true if IP needs to be rate limited, false otherwise.
+	 */
+	private function check_ratelimit($ip) {
+		$limit = config_get("ratelimit", "request_limit");
+		$window_length = config_get("ratelimit", "window_length");
+		$this->update_ratelimit($ip);
+		$stmt = $this->dbh->prepare("
+			SELECT requests,window_start FROM ApiRateLimit
+			WHERE ip = :ip");
+		$stmt->bindParam(":ip", $ip);
We usually do not use prepared statements but I think it does not hurt -- unless it breaks with SQLite.
+		$result = $stmt->execute();
+
+		if (!$result) {
+			return false;
+		}
+
+		$row = $stmt->fetch(PDO::FETCH_ASSOC);
+
+		if ($row['window_start'] < time() - $window_length) {
+			$stmt = $this->dbh->prepare("
+				DELETE FROM ApiRateLimit
+				WHERE ip = :ip");
+			$stmt->bindParam(":ip", $ip);
+			$stmt->execute();
+			return false;
+		}
+
+		if ($row['requests'] > $limit) {
+			return true;
+		}
+		return false;
+	}
+
+	/*
+	 * Update a rate limit for an IP by increasing its requests value by one.
+	 *
+	 * @param $ip IP of the current request
+	 *
+	 * @return void
+	 */
+	private function update_ratelimit($ip) {
+		$window_length = config_get("ratelimit", "window_length");
+		$time = time();
+		$deletion_time = $time - $window_length;
+		$stmt = $this->dbh->prepare("
+			INSERT INTO ApiRateLimit
+			(ip, requests, window_start)
+			VALUES (:ip, 0, :window_start)
+			ON DUPLICATE KEY UPDATE requests=requests+1");
I do not think this works with SQLite. Is there a more standardized way of doing this? Maybe "INSERT OR REPLACE"?
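For what it's worth, a portable alternative is to try an UPDATE first and fall back to an INSERT when no row was touched. That avoids both MySQL's ON DUPLICATE KEY UPDATE and SQLite's INSERT OR REPLACE (the latter would also overwrite window_start, resetting the window on every request). A minimal sketch of the idea in Python/SQLite follows; the table and column names are illustrative, not the actual aurweb code, and it deliberately starts the counter at 1 (the patch inserts 0, so the first request in a window is not counted):

```python
# Portable "upsert" sketch: UPDATE first, INSERT only if no row was touched.
# Works the same on MySQL and SQLite, unlike ON DUPLICATE KEY UPDATE.
import sqlite3
import time

db = sqlite3.connect(":memory:")
db.execute("""CREATE TABLE ApiRateLimit (
        IP VARCHAR(45) PRIMARY KEY,
        Requests INTEGER NOT NULL,
        WindowStart BIGINT NOT NULL)""")

def bump(ip):
    now = int(time.time())
    # Try to increment an existing counter for this IP.
    cur = db.execute(
        "UPDATE ApiRateLimit SET Requests = Requests + 1 WHERE IP = ?", (ip,))
    if cur.rowcount == 0:
        # No row yet: create one with the current time as the window start.
        db.execute(
            "INSERT INTO ApiRateLimit (IP, Requests, WindowStart) VALUES (?, 1, ?)",
            (ip, now))

bump("10.0.0.1")
bump("10.0.0.1")
count = db.execute(
    "SELECT Requests FROM ApiRateLimit WHERE IP = ?", ("10.0.0.1",)).fetchone()[0]
print(count)  # 2
```

Note that two concurrent first requests from the same IP can race (both UPDATEs touch nothing, both try to INSERT); catching the duplicate-key error from the second INSERT and retrying the UPDATE handles that.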
+		$stmt->bindParam(":ip", $ip);
+		$stmt->bindParam(":window_start", $time);
+		$stmt->execute();
+
+		// TODO: this should move into some cronjob
+		$stmt = $this->dbh->prepare("
+			DELETE FROM ApiRateLimit
+			WHERE window_start < :time");
+		$stmt->bindParam(":time", $deletion_time);
+		$stmt->execute();
I am quite confused. There is a similar piece of code in check_ratelimit() already. Is either of these statements a leftover or do they serve different purposes? I am also curious about the general performance impact of this patch. It reduces request flooding but at the same time adds a couple of additional database queries to each request. Do you happen to have any statistics?
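For reference, the fixed-window check the patch implements boils down to something like the following sketch (Python, with illustrative names and limits rather than the aurweb defaults); notably, the expired-window reset only needs to happen in one place, which is why the two DELETE statements look redundant:

```python
# Fixed-window rate limiter sketch: per IP, store the window start time and
# a request counter; reset the window when it has expired, otherwise count
# the request and compare against the limit. Limits are illustrative.
import time

WINDOW_LENGTH = 10   # seconds per window (illustrative)
REQUEST_LIMIT = 4    # allowed requests per window (illustrative)

windows = {}  # ip -> [window_start, requests]

def is_rate_limited(ip, now=None):
    now = time.time() if now is None else now
    entry = windows.get(ip)
    if entry is None or entry[0] < now - WINDOW_LENGTH:
        # Missing or expired window: start a fresh one. The patch performs
        # this cleanup both per-IP in check_ratelimit() and as a global
        # sweep in update_ratelimit(); one of the two would suffice.
        windows[ip] = [now, 1]
        return False
    entry[1] += 1
    return entry[1] > REQUEST_LIMIT

# Six requests in quick succession from one IP: the first four pass,
# the fifth and sixth exceed the limit.
hits = [is_rate_limited("10.0.0.1", now=100.0 + i) for i in range(6)]
print(hits)  # [False, False, False, False, True, True]
```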
+	}
+
 	/*
 	 * Returns a JSON formatted error string.
 	 *
--
2.15.1