Robot - Rate Limiting
About
A page about rate limiting of HTTP requests, implemented to control bots.
A rate limiter caps how many requests a sender (user / IP address) can issue in a specific window of time (e.g. 25 requests per minute).
Characteristic
- Storage:
  - distributed or not (i.e. whether a shared storage / database is needed)
  - minimum storage footprint
- Rate limit window calculation:
  - whether it fixes the boundary problem where a burst of requests can happen at the interval limit
  - minimum time between requests
Algorithm
Token Bucket
Token_bucket 1) is used to define limits on bandwidth (implementation: PHP Token bucket)
For each new request, the rate limiter:
- fetches the following data for the user:
  - the last request's timestamp
  - the available token count left
- then increments the available tokens based on:
  - the refill rate
  - and the time elapsed since the user's last request
- decrements the available tokens, rejecting the request if none are left
This is a read/write operation that should happen atomically and in isolation if the rate limiter data is shared.
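A minimal in-memory sketch of this flow in Python (the TokenBucket class name and the plain dict instead of a shared store are assumptions for illustration):
<code python>
import time

class TokenBucket:
    """Minimal in-memory token bucket: `capacity` tokens, refilled at `refill_rate` tokens per second."""

    def __init__(self, capacity: int, refill_rate: float):
        self.capacity = capacity
        self.refill_rate = refill_rate
        # per-user state: (available tokens, timestamp of the last request)
        self.state: dict[str, tuple[float, float]] = {}

    def allow(self, user: str) -> bool:
        now = time.monotonic()
        tokens, last = self.state.get(user, (self.capacity, now))
        # refill based on the refill rate and the time since the user's last request
        tokens = min(self.capacity, tokens + (now - last) * self.refill_rate)
        if tokens < 1:
            self.state[user] = (tokens, now)
            return False                        # no token left: reject
        self.state[user] = (tokens - 1, now)    # consume one token
        return True

limiter = TokenBucket(capacity=25, refill_rate=25 / 60)   # 25 requests per minute
print(limiter.allow("alice"))                              # True while tokens remain
</code>
With shared storage (e.g. a database or Redis), the read-modify-write above would have to run atomically, for example inside a transaction or a server-side script.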
Leaky Bucket
Leaky_bucket 2) is the inverse of token_bucket
The leaky bucket algorithm is widely used to deal with burstiness when bandwidth is limited, such as:
- in telecommunications
- and in packet-switched computer networks
The analogy is with a bucket where water:
- is poured in at the top
- and leaks from the bottom
If the rate at which water is poured in exceeds the rate at which it leaks, the bucket overflows.
In terms of request processing 3):
- the water represents requests from clients,
- the bucket represents a queue where requests wait to be processed according to a first-in-first-out (FIFO) scheduling algorithm,
- the leaking water represents requests exiting the queue for processing by the server,
- the overflow represents requests that are discarded and never serviced.
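A minimal sketch of this queue behaviour in Python (the LeakyBucket name, capacity, and leak rate are illustrative assumptions):
<code python>
import time
from collections import deque

class LeakyBucket:
    """Requests queue up in a fixed-capacity bucket and leak out (FIFO) at a constant rate."""

    def __init__(self, capacity: int, leak_rate: float):
        self.capacity = capacity      # max queued requests before the bucket overflows
        self.leak_rate = leak_rate    # requests processed per second
        self.queue = deque()
        self.last_leak = time.monotonic()

    def _leak(self) -> None:
        # drain the requests the server had time to process since the last check
        now = time.monotonic()
        leaked = int((now - self.last_leak) * self.leak_rate)
        if leaked:
            for _ in range(min(leaked, len(self.queue))):
                self.queue.popleft()
            self.last_leak = now

    def offer(self, request) -> bool:
        self._leak()
        if len(self.queue) >= self.capacity:
            return False              # overflow: the request is discarded
        self.queue.append(request)    # the request waits in the FIFO queue
        return True
</code>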
Fixed window counters
Records the number of requests from a sender occurring in a rate limit’s time interval
Data:
- user
- interval id (timestamp)
- counter
Due to the fixed interval, a user could make twice the allowed number of requests in a short amount of time:
- the first batch of allowed requests just before the end of an interval,
- the second batch just after the beginning of the next interval.
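A minimal in-memory sketch (the 60-second window, the limit of 25, and the (user, interval id) key layout are illustrative assumptions):
<code python>
import time
from collections import defaultdict

WINDOW = 60    # fixed interval length in seconds
LIMIT = 25     # max requests per user and per interval

# counter keyed by (user, interval id), the interval id being the truncated timestamp
counters: dict[tuple[str, int], int] = defaultdict(int)

def allow(user: str) -> bool:
    interval_id = int(time.time()) // WINDOW
    counters[(user, interval_id)] += 1
    return counters[(user, interval_id)] <= LIMIT
</code>
The boundary problem described above remains: up to twice LIMIT requests can go through in a span shorter than WINDOW if they straddle two consecutive intervals.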
Sliding window log
A sliding window log stores all of a user's requests in a single sorted set (or deque) ordered by timestamp.
For each new request, the rate limiter would:
- remove expired requests
- add the current request
- count the requests in the set
- allow / reject (or wait)
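A minimal sketch of these four steps in Python, using one deque of timestamps per user (the window and limit values are illustrative assumptions):
<code python>
import time
from collections import defaultdict, deque

WINDOW = 60    # sliding window length in seconds
LIMIT = 25     # max requests per user within the window

logs: dict[str, deque] = defaultdict(deque)    # per-user request timestamps, oldest first

def allow(user: str) -> bool:
    now = time.time()
    log = logs[user]
    # 1. remove expired requests
    while log and log[0] <= now - WINDOW:
        log.popleft()
    # 2. add the current request
    log.append(now)
    # 3. count the requests in the set and 4. allow / reject
    return len(log) <= LIMIT
</code>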
Because every request is recorded, the memory footprint can be huge if you store the data at the finest grain (one entry per request).
On a daily basis, for an application with:
- a rate limit of 100 requests per user
- 5,000 active users
- a timestamp stored on 4 bytes
the total maximum memory could be <MATH> 100 \times 5000 \times 4 = 2\,000\,000 \text{ bytes} \approx 1.9 \text{ MB} </MATH>
A solution is to store the data at a coarser grain, for example the number of requests per minute if you define your rate limit per hour.
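For example, a sketch that keeps one counter per user and per minute and sums the last 60 minutes to enforce an hourly limit (names and values are illustrative assumptions):
<code python>
import time

LIMIT_PER_HOUR = 100

# one counter per (user, minute) instead of one timestamp per request
minute_counters: dict[tuple[str, int], int] = {}

def allow(user: str) -> bool:
    current_minute = int(time.time()) // 60
    # sum the counters of the last 60 minutes to approximate the hourly total
    total = sum(minute_counters.get((user, m), 0)
                for m in range(current_minute - 59, current_minute + 1))
    if total >= LIMIT_PER_HOUR:
        return False
    key = (user, current_minute)
    minute_counters[key] = minute_counters.get(key, 0) + 1
    return True
</code>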
Configuration
- Robots - Crawl-delay parameters set the number of seconds that the crawler should wait before another crawl.
- The og:ttl object property will limit crawler access (e.g. Facebook)
Response
Applications should respond to requests from users who exceed the rate limit with an HTTP 429 Too Many Requests status code, i.e. the user has sent too many requests in a given amount of time ("rate limiting").
However, when attackers see a 429 response, they may simply create new accounts to circumvent the rate limiting. Another response is to implement a shadow ban (shadow-banned accounts don't see that their contributions are banned).
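As a sketch, a handler can answer with 429 and a Retry-After header once the limiter rejects a request (the DenyAllLimiter placeholder and the 60-second retry hint are assumptions, not a prescribed API):
<code python>
from http.server import BaseHTTPRequestHandler, HTTPServer

class DenyAllLimiter:
    """Placeholder: swap in any of the limiters sketched above exposing allow(key) -> bool."""
    def allow(self, key: str) -> bool:
        return False

limiter = DenyAllLimiter()

class RateLimitedHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        client = self.client_address[0]             # rate-limit by client IP
        if not limiter.allow(client):
            self.send_response(429)                 # Too Many Requests
            self.send_header("Retry-After", "60")   # hint: retry after 60 seconds
            self.end_headers()
            return
        body = b"ok"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

# HTTPServer(("", 8080), RateLimitedHandler).serve_forever()
</code>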
Tool
Nginx
The ngx_http_limit_req_module 5) uses the leaky_bucket method.
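A minimal configuration sketch for that module (the zone name, rate, and burst values are illustrative assumptions):
<code nginx>
# one shared memory zone keyed by client IP, limited to 25 requests per minute
limit_req_zone $binary_remote_addr zone=per_ip:10m rate=25r/m;

server {
    listen 80;
    location / {
        # up to 10 excess requests are queued (the bucket); the rest are rejected
        limit_req zone=per_ip burst=10;
        # reply with 429 instead of the default 503 for rejected requests
        limit_req_status 429;
    }
}
</code>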