Hello readers, and welcome to a new article series in which we decode the system design of a conventional rate limiter. In a later part we will build our own rate limiter using Node.js. So let's get started.

What is a Rate Limiter and what is the Purpose of using it?

A rate limiter is a technique for limiting network traffic. It acts as a kind of filter that puts a cap on how often someone can repeat an action within a given timeframe, e.g. how many times you can post on social media from your account in a given period.

In the HTTP world, a rate limiter limits the number of client requests allowed over a specified period of time. If the API request count exceeds the threshold defined by the rate limiter, all the excess calls are blocked.

A rate limiter helps us maintain system performance by limiting the traffic directed at an API or at our system. It also helps protect our system from malicious activity such as bot attacks, DDoS attacks and web scraping.

Rate limiting also protects against API overuse, which is not necessarily malicious or due to bot activity, but is important to prevent nonetheless.

Where to put the Rate Limiter?

Let us think about a simple client and server application. The rate limiter can be implemented on either side: on the client or on the server.

Client-side Implementation: A client-side rate limiter is unreliable because client requests can easily be forged by malicious actors, and we usually have no control over how the client behaves.

Server-side Implementation: A server-side rate limiter is added to our backend server, as shown in the diagram below.

However, there is another option besides client-side and server-side implementation. Instead of implementing the limiter inside the API server, we can implement it as a middleware that sits in front of the API, a pattern that Express in Node.js supports directly. Express does not ship a rate limiter of its own, but a rate-limiting middleware (for example the express-rate-limit package) can be plugged into it. Below is the representation of it.

Let us learn how this type of implementation works. Suppose your backend API supports a maximum of 3 requests per second but the client sends 4 requests within one second. In this situation the first 3 requests pass through the middleware to the server, and the last request is throttled by the rate limiter middleware, which returns HTTP status 429, meaning that the client has sent too many requests in a given period of time. Below is the illustration for the same.
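To make the 3-requests-per-second scenario concrete, here is a minimal sketch of such a middleware. It is not the article's final implementation: the names `createRateLimiter`, `limit`, `windowMs` and `now` are made up for illustration, and the limiter simply counts requests per one-second window. The returned function uses the `(req, res, next)` signature that Express middleware expects, and the clock is injectable so the behaviour can be checked without waiting.

```javascript
// Minimal rate-limiting middleware sketch (all names here are hypothetical).
// Assumption: a single shared counter that resets once per window.
function createRateLimiter({ limit = 3, windowMs = 1000, now = Date.now } = {}) {
  let windowStart = now();
  let count = 0;

  return function rateLimiter(req, res, next) {
    const current = now();
    // Start a fresh window once the previous one has elapsed.
    if (current - windowStart >= windowMs) {
      windowStart = current;
      count = 0;
    }
    if (count < limit) {
      count += 1;
      next(); // under the limit: forward the request to the API handler
      return;
    }
    // Over the limit: reject with 429 Too Many Requests.
    res.statusCode = 429;
    res.end('Too Many Requests');
  };
}
```

In an Express app, a function like this would typically be registered with `app.use(createRateLimiter({ limit: 3 }))` before the route handlers.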

API Gateway:

Nowadays, cloud microservices have gained significant popularity, partly because of fully managed services that handle rate limiting, SSL termination, IP whitelisting, authentication and serving static content. These services are bundled in a single component called an API gateway.

Designing a rate limiter can be a tedious task, so if you do not have enough resources to build one, you can use a fully managed cloud API gateway that includes a built-in rate limiter. A popular example is Amazon API Gateway.

Algorithms for Rate Limiter:

There are several well-known algorithms for building a rate limiter, each with its own pros and cons. They are listed below:

Token Bucket
Leaking Bucket
Fixed Window Counter
Sliding Window Log
Sliding Window Counter

Token Bucket:

Token bucket is a simple, well-understood algorithm for building a rate limiter, and it is used by companies such as Amazon and Stripe to throttle their API requests.

The approach is simple: each token bucket has a fixed capacity, and tokens are added to it at a preset rate. Once the bucket is full, no more tokens are added and any extra tokens overflow.

When a request arrives, we check whether a token is available in the bucket. If one is, the request takes that token, the token count in the bucket is decremented, and the request is passed on to the API server. If no token is available, the request is throttled with status code 429. Below is the basic implementation of the same.

The token bucket algorithm takes two parameters: the bucket size and the refill rate. The bucket size is the maximum number of tokens the bucket can hold, and the refill rate is the number of tokens added to the bucket per second.
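The two parameters and the check-then-consume flow can be sketched in a few lines of JavaScript. This is only an illustration under some assumptions of mine: the class and method names (`TokenBucket`, `tryConsume`) are hypothetical, the bucket starts full, and instead of a timer that adds tokens at fixed intervals it refills lazily, computing how many tokens have accrued since the last call.

```javascript
// Token bucket sketch (hypothetical names; not a production implementation).
class TokenBucket {
  constructor(bucketSize, refillRatePerSec, now = Date.now) {
    this.bucketSize = bucketSize;             // maximum tokens the bucket can hold
    this.refillRatePerSec = refillRatePerSec; // tokens added per second
    this.now = now;                           // injectable clock, for testing
    this.tokens = bucketSize;                 // assumption: the bucket starts full
    this.lastRefill = now();
  }

  // Add the tokens accrued since the last refill; anything above
  // the bucket size overflows and is discarded.
  refill() {
    const current = this.now();
    const elapsedSec = (current - this.lastRefill) / 1000;
    this.tokens = Math.min(
      this.bucketSize,
      this.tokens + elapsedSec * this.refillRatePerSec
    );
    this.lastRefill = current;
  }

  // Returns true if a token was taken (request may proceed),
  // false if the bucket is empty (caller would respond with 429).
  tryConsume() {
    this.refill();
    if (this.tokens >= 1) {
      this.tokens -= 1;
      return true;
    }
    return false;
  }
}
```

Lazy refill avoids running a timer per bucket, which matters once there are many buckets; a timer-based refill would behave the same from the caller's point of view.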

While working with token buckets, we need to consider how many buckets to use. This depends on the requirements of the system you are building. Suppose you want to throttle requests based on IP address; then you need a separate bucket for each IP address. Likewise, if different API endpoints have different limits, each endpoint needs its own bucket.

But if you want to cap the system as a whole, say at 10000 requests, then a single global bucket shared by all requests is the best option. Below is the high-level implementation.
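The choice between a bucket per client and one global bucket could look roughly like this. All names (`createBucket`, `createKeyedLimiter`, `createGlobalLimiter`) are hypothetical, and the compact bucket is only there to keep the sketch self-contained.

```javascript
// Compact token bucket with lazy refill (hypothetical helper).
function createBucket(size, refillPerSec, now) {
  let tokens = size; // assumption: the bucket starts full
  let last = now();
  return {
    tryConsume() {
      const t = now();
      tokens = Math.min(size, tokens + ((t - last) / 1000) * refillPerSec);
      last = t;
      if (tokens < 1) return false;
      tokens -= 1;
      return true;
    },
  };
}

// One bucket per key (for example a client IP address), created on first use.
function createKeyedLimiter(size, refillPerSec, now = Date.now) {
  const buckets = new Map();
  return function allow(key) {
    if (!buckets.has(key)) {
      buckets.set(key, createBucket(size, refillPerSec, now));
    }
    return buckets.get(key).tryConsume();
  };
}

// A global limit is the same idea with a single bucket shared by every request.
function createGlobalLimiter(size, refillPerSec, now = Date.now) {
  const allow = createKeyedLimiter(size, refillPerSec, now);
  return () => allow('global');
}
```

With the keyed limiter, one noisy client only exhausts its own bucket. Note that this sketch never removes idle buckets from the `Map`, so a long-running service would need some form of eviction.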

Leaking Bucket:

The leaking bucket algorithm is similar to the token bucket, except that instead of handing out tokens it places incoming requests in a FIFO queue. When a request arrives, it is added to the queue if the queue is not full; if the queue is full, the request is dropped with status code 429. Queued requests are then pulled from the queue and processed at a regular interval.

The leaking bucket algorithm takes two parameters: the bucket size, which is equal to the queue size, and the outflow rate, which defines how many requests are processed at a fixed interval.

Shopify, an e-commerce platform, uses this type of rate limiter on its system. Below is the representation of the leaking bucket algorithm.
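As a rough sketch of those two parameters in action, the queue and the fixed-rate outflow might be written as follows. The names `LeakingBucket`, `enqueue`, `leak` and `outflowPerTick` are my own, and the caller is assumed to invoke `leak` on a timer.

```javascript
// Leaking bucket sketch (hypothetical names; not a production implementation).
class LeakingBucket {
  constructor(bucketSize, outflowPerTick) {
    this.bucketSize = bucketSize;         // maximum number of queued requests
    this.outflowPerTick = outflowPerTick; // requests processed per interval
    this.queue = [];                      // FIFO queue of pending requests
  }

  // Returns true if the request was queued, false if the bucket is full
  // (in which case the caller would respond with 429).
  enqueue(request) {
    if (this.queue.length >= this.bucketSize) return false;
    this.queue.push(request);
    return true;
  }

  // Meant to be called at a regular interval: processes up to
  // outflowPerTick requests, oldest first, and returns how many ran.
  leak(handler) {
    const batch = this.queue.splice(0, this.outflowPerTick);
    batch.forEach((request) => handler(request));
    return batch.length;
  }
}
```

A caller could drive the outflow with something like `setInterval(() => bucket.leak(handleRequest), 1000)`, where `handleRequest` is whatever function actually serves the request.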

Summary:

In this part of decoding the rate limiter, we got an insight into how rate limiters are used in the industry and how they reduce the negative impact that DDoS attacks and heavy network traffic have on our APIs. We also learned where a rate limiter can be placed in our system, and we looked at two of the algorithms that are widely used to implement one.

Covering all the algorithms is out of scope for a single article, so I have covered two algorithms for now. In a later part we will cover the remaining algorithms and also implement our own rate limiter using Node.js.

So, stay tuned for the next article.

I learnt this from the internet and from the references mentioned below. Do check out the links for an in-depth understanding.

https://www.cloudflare.com/

https://bytebytego.com/

Decoding System Design of Rate Limiter: 1 (Your Go-to Protector for Backend APIs)

Table of contents

What is a Rate Limiter and what is the Purpose of using it?

Where to put the Rate Limiter?

API Gateway:

Algorithms for Rate Limiter:

Token Bucket:

Leaking Bucket:

Summary:
