Exploring JWTs from first principles

#tech

You might have heard about JWTs being used as an authentication mechanism for applications on the Web. What exactly are JWTs? Let's explore this from a first principles standpoint.

JWT stands for JSON Web Token, and JWTs are a popular way to authenticate users on an application. What led to their creation? Let's explore this further.

The Problem

Websites need some way to authenticate the users who are trying to interact with them. We see this all around our daily lives - Facebook wants you to sign in when you're logging in from a new device, and the same goes for Google, Notion, Obsidian, and all your other favorite apps. Why? It lets the website know that this is a registered user who has some data stored with it, and that it is the same user trying to access that data and not an impostor. It is a way of securing access to data. Hence, every system that holds user data needs authentication.

The Need

Why did the need for JWTs arise? Traditionally, authentication on web apps was done through "sessions". Whenever the user creates a new account or signs in to a website (basically, sends their credentials to the server and asks it to display the webpage), the server generates a session token (sessionId) for that user and sends it back. The server also stores that sessionId in its database. Now, whenever the user makes a new request, they send the sessionId along with it. The server then looks up this sessionId in its database, authenticates the user if it is found, and returns the information stored for that sessionId. This is the entire flow.

This entire system is stateful. What exactly is stateful? Stateful means the server stores state (here, the sessions) between requests and retrieves it at a later point.
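To make the session flow concrete, here is a minimal sketch in Python. It is only an illustration: the `session_store` dictionary stands in for the server's database, and the names `sign_in` and `handle_request` are made up for this example.

```python
import secrets
from typing import Dict, Optional

# Hypothetical in-memory stand-in for the server's session database.
session_store: Dict[str, dict] = {}

def sign_in(username: str) -> str:
    """Create a session for the user and return its sessionId."""
    session_id = secrets.token_hex(16)  # random, hard-to-guess identifier
    session_store[session_id] = {"username": username}  # state kept on the server
    return session_id

def handle_request(session_id: str) -> Optional[dict]:
    """Look up the sessionId; return the stored data, or None if it is unknown."""
    return session_store.get(session_id)

session_id = sign_in("alice")
print(handle_request(session_id))        # {'username': 'alice'}
print(handle_request("not-a-session"))   # None
```

Notice that every request depends on `session_store`. That dependency is the state the server has to keep around.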

The problem with this was:

  • It is quite difficult to scale.
  • It is difficult to implement for distributed or serverless systems.

Why so?

  • Distributed or serverless systems may consist of hundreds of different microservices that interact with the user independently. It is difficult to maintain one shared store of sessionIds that all of them can reach.
  • When the application scales (imagine a million users), the database would have to store a million entries (one for each) and look one up on every request. This can overwhelm the database pretty quickly, or add extra overhead costs if you decide to scale up your database.

The solution? JWTs.

Exploring from First Principles

The core idea was to have some form of authentication mechanism to validate the user. "Sessions" did not work out well when scaling and in distributed systems because they were stateful. We needed another way. Why not take a stateless route? The problem was maintaining and scaling the database that stored sessionIds. Why not eliminate that problem entirely? Let's find a way to store no session data on the server.

To do this, the client would have to let the server know who it is. Let's imagine this with the help of an ancient scenario. Imagine you're in the Egyptian era and you're a traveller from another country trying to enter Egypt. You're stopped at the front gate of the city and asked to provide some proof that you're allowed to enter. Consider it like a visa in modern times. You show the paper to the guard, he checks it (also looks at you), and lets you enter. This is a situation where the server (the guard) does not store any information about the client (you) - instead, the client gives proof of itself to the server. Let's try to come up with a solution using this analogy.

Let's assume the paper you presented to be a token. It should contain the following:

  • your identity
  • your data
  • official signature

The token should have all of this too. The client would then send it over to the server, which checks it and lets the user access their data. Here we encounter another problem: how do we send this data over? JSON. JavaScript Object Notation (JSON) is a lightweight and flexible format for sending data over the internet, so we use this existing solution to send and receive the data.
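Here is a rough sketch of what building such a token could look like, using only Python's standard library. It follows the usual JWT layout of three base64url parts joined by dots (header, payload, signature) and signs with HMAC-SHA256. The secret, the claim names, and the function names are assumptions for illustration; in a real application you would reach for a maintained JWT library instead of rolling your own.

```python
import base64
import hashlib
import hmac
import json

SECRET = b"demo-secret-change-me"  # hypothetical signing key, known only to the server

def b64url(data: bytes) -> str:
    """Base64url-encode without padding, the encoding JWTs use."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")

def make_token(payload: dict) -> str:
    """Pack identity and data as JSON, then attach the 'official signature'."""
    header = {"alg": "HS256", "typ": "JWT"}
    signing_input = (
        b64url(json.dumps(header, separators=(",", ":")).encode("utf-8"))
        + "."
        + b64url(json.dumps(payload, separators=(",", ":")).encode("utf-8"))
    )
    signature = hmac.new(SECRET, signing_input.encode("ascii"), hashlib.sha256).digest()
    return signing_input + "." + b64url(signature)

# "sub" carries the identity; "role" is example data about the user.
token = make_token({"sub": "alice", "role": "traveller"})
print(token)
print(token.count("."))  # 2
```

The payload is only encoded, not encrypted, so anyone holding the token can read it. The signature is what stops them from changing it unnoticed.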

Now that we've established this, let's go over the flow once again. The user sends their credentials to the server -> server checks them -> server grants access (returns a signed token) -> user is validated -> user wants to request something -> user sends the received token along with the request (as a way of letting the server know it's already validated) -> server verifies the token's signature and returns the requested data.

This is the entire flow of JWTs. You'll notice how this entire system is stateless. No session information is stored on the server; all it needs is its signing key to check the token. Instead, the client has to store the token received from the server, either in local storage or as a cookie. This would scale well even if there are a million users.
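The guard's side of the story can be sketched like this. Again, this is a simplified illustration under assumptions: `issue_token` and `verify_token` are made-up names, the secret is a placeholder, and a real setup would also put an expiry time in the token and use a maintained JWT library.

```python
import base64
import hashlib
import hmac
import json
from typing import Optional

SECRET = b"demo-secret-change-me"  # hypothetical signing key, known only to the server

def b64url_encode(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")

def b64url_decode(text: str) -> bytes:
    # Restore the padding that was stripped during encoding.
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))

def sign(signing_input: str) -> str:
    mac = hmac.new(SECRET, signing_input.encode("utf-8"), hashlib.sha256).digest()
    return b64url_encode(mac)

def issue_token(username: str) -> str:
    """Called once, after the server has checked the user's credentials."""
    header = b64url_encode(json.dumps({"alg": "HS256", "typ": "JWT"}).encode("utf-8"))
    payload = b64url_encode(json.dumps({"sub": username}).encode("utf-8"))
    signing_input = header + "." + payload
    return signing_input + "." + sign(signing_input)

def verify_token(token: str) -> Optional[dict]:
    """Called on every request; no database lookup, only a signature check."""
    try:
        header, payload, signature = token.split(".")
    except ValueError:
        return None  # not in header.payload.signature form
    expected = sign(header + "." + payload)
    # Constant-time comparison of the presented and recomputed signatures.
    if not hmac.compare_digest(signature.encode("utf-8"), expected.encode("utf-8")):
        return None
    return json.loads(b64url_decode(payload))

token = issue_token("alice")
print(verify_token(token))  # {'sub': 'alice'}

# Swapping in a different payload while keeping the old signature is rejected.
head, body, sig = token.split(".")
forged_body = b64url_encode(json.dumps({"sub": "admin"}).encode("utf-8"))
print(verify_token(head + "." + forged_body + "." + sig))  # None
```

Any server that holds the same key can run `verify_token` on its own, which is why this fits distributed setups better than a shared session store.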

Conclusion

I hope you had fun exploring JWTs from first principles. See you in another one! Adios!