just my thoughts
This is part 2 of my series on scaling applications, and it is dedicated to scaling the frontend layer. This post details the notes I’ve taken on a seriously awesome book called “Scalability for Startup Engineers”. I currently ride the train in NYC for more than 3 hours per day, and I take notes from this book on my way to and from work. This set of notes is about scaling the frontend layer of a web application system.
The frontend layer receives the most traffic. It consists of the web browser, your frontend servers, caching servers, and other network components. If you are trying to scale, you want to keep this layer mostly stateless.
In this day and age, we mostly deal with single-page applications (SPAs) for our frontends, built with things like React, Vue, Svelte, Qwik, etc.
Statelessness is a property of a service, server, or object where it does not hold any state between requests. Making a server, service, or object stateless makes instances of the same type interchangeable, which allows better scalability: you can add clones of the component and increase the throughput of your server or service more easily.
The opposite of being stateless is to be stateful. Stateful services keep some knowledge between requests that is not available to other services. This means requests have to stick to the particular server holding that state, because other servers wouldn’t have the state you need.
A classic example of state is an HTTP session, which a stateful service usually tracks with a cookie.
When a user sends a request to a server that keeps state, the server can decide to start a new session by adding a session cookie header to the response. I come from Django land: if I want my React frontend to know that a user is logged in, then when a user successfully logs in, Django sets a session cookie on the response, the browser sends that cookie back with every request, and the frontend gets access to the user’s information from my Django server.
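Here’s a minimal Django sketch of that flow, just my own illustration rather than anything from the book; the view and field names are placeholders.

```python
from django.contrib.auth import authenticate, login
from django.http import JsonResponse

def login_view(request):
    # Check the submitted credentials against Django's auth system
    user = authenticate(
        request,
        username=request.POST["username"],
        password=request.POST["password"],
    )
    if user is None:
        return JsonResponse({"detail": "invalid credentials"}, status=401)

    # login() creates the session; the Set-Cookie header rides out on the response
    login(request, user)

    # The React frontend includes this cookie on every subsequent request,
    # so any Django view can identify the user
    return JsonResponse({"username": user.get_username()})
```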
Approach 1: Keep session data in cookies
Pro: Use session scope as normal, then just before sending a response to the client, your framework serializes the session data, encrypts it, and includes it in the response headers as the new value of the session cookie. The session is effectively handed back to your servers with every web request. This makes your application stateless.
Con: Session storage in cookies can become expensive. Cookies are sent by the browser with every request, so adding more data to session scope can quickly grow into kilobytes, adding latency to every web request. Especially on mobile frontends.
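To make Approach 1 concrete, here’s a rough sketch using Python’s cryptography package (my choice of library, not the book’s): serialize the session, encrypt it, and ship it back to the client as the cookie value.

```python
import json
from cryptography.fernet import Fernet

# In practice the key comes from config and is shared by all web servers;
# generating it per process is just for the sketch
fernet = Fernet(Fernet.generate_key())

def session_to_cookie(session: dict) -> str:
    """Serialize and encrypt session data for the session cookie."""
    return fernet.encrypt(json.dumps(session).encode()).decode()

def cookie_to_session(cookie_value: str) -> dict:
    """Decrypt and deserialize the session the browser sent back."""
    return json.loads(fernet.decrypt(cookie_value.encode()))
```

Since every server holds the same key, any server can decrypt any request’s session, which is exactly what makes the application stateless.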
Approach 2: Store session data in a dedicated data store
Pro: Your web app takes the session identifier from the web request, loads the session data from the data store, and at the end of the web request cycle serializes the data and saves it back to the data store.
Side Note: I’ve used things like this in the past, where we’d want to load a user’s application settings, which was a very heavy payload. But it was easy to access because it was an additional request where a session ID was used to fetch the app settings from a cache alongside the frontend application. The request to load the application was very fast, and the additional request didn’t slow down the frontend.
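Here’s what Approach 2 might look like with redis-py; the host, key names, and TTL are my own placeholder choices. The cookie only carries a session ID, and the actual session data lives in a shared store every web server can reach.

```python
import json
import redis

store = redis.Redis(host="session-store.internal", port=6379)
SESSION_TTL_SECONDS = 30 * 60

def load_session(session_id: str) -> dict:
    # Runs at the start of the request cycle
    raw = store.get(f"session:{session_id}")
    return json.loads(raw) if raw else {}

def save_session(session_id: str, session: dict) -> None:
    # Runs at the end of the request cycle; setex writes the value
    # and refreshes the expiry in a single call
    store.setex(f"session:{session_id}", SESSION_TTL_SECONDS, json.dumps(session))
```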
Approach 3: Sticky sessions
Con: A load balancer is made responsible for ensuring that requests with the same session cookie always go to the server that started the session.
This is not recommended as a good solution for maintaining stateless servers. Since sessions are tied to specific servers behind the load balancer, it wouldn’t be safe to decommission a server from the load balancer pool, as you would be breaking the sessions of users tied to that server. That automatically locks you out of the benefits of autoscaling. Avoid this solution if you can.
Long ago, user-generated content wasn’t a thing; now it is a standard. Because it is a standard, you will often need to store uploaded files in the exact form they were uploaded in, and ensure that they never change.
To store user-generated content, you could use Azure Blob Storage or AWS S3. Both let your application store the content publicly or privately. However you store the data, you should try using a CDN to deliver public files to your end users.
Tip: you can set a long expiration time on static user content and basically cache the response forever.
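For example, with boto3 you might upload user content with an aggressive Cache-Control header so a CDN in front of the bucket can cache it for a year. The bucket name here is a placeholder.

```python
import boto3

s3 = boto3.client("s3")

def store_public_upload(local_path: str, key: str) -> None:
    s3.upload_file(
        local_path,
        "my-public-assets-bucket",
        key,
        ExtraArgs={
            # user content never changes once uploaded, so cache aggressively
            "CacheControl": "public, max-age=31536000, immutable",
        },
    )
```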
• Avoid using CDNs for private data. S3 is great for this, because it gives you private buckets. Private buckets on S3 come baked in with high availability and high scalability, so you don’t trade away much of the speed you’d get from caching with the CDN by storing files privately. AWS has your back :)
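For private files, one common pattern (again boto3; bucket name and expiry are placeholders of mine) is to hand the client a short-lived presigned URL instead of routing the bytes through a CDN.

```python
import boto3

s3 = boto3.client("s3")

def private_download_url(key: str) -> str:
    # The URL embeds a signature, so only holders of the link can
    # fetch the object, and only until the link expires
    return s3.generate_presigned_url(
        "get_object",
        Params={"Bucket": "my-private-uploads", "Key": key},
        ExpiresIn=3600,  # valid for one hour
    )
```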
Local server cache: Frontends are susceptible to cache inconsistencies. To avoid the drama, you can use a shared object cache, so there is only one copy of each object and it can be invalidated easily.
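A tiny sketch of the “one shared copy” idea, using redis-py with my own key naming: every web server reads the same cached object, and a single delete invalidates it for all of them at once.

```python
import json
import redis

shared_cache = redis.Redis(host="object-cache.internal")

def get_user_profile(user_id: int) -> dict | None:
    raw = shared_cache.get(f"profile:{user_id}")
    return json.loads(raw) if raw else None

def invalidate_user_profile(user_id: int) -> None:
    # One delete invalidates the object for every web server at once;
    # no server is left holding a stale local copy
    shared_cache.delete(f"profile:{user_id}")
```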
Resource locks: Locks are used to prevent race conditions and to synchronize access to shared resources. Implemented wrong, locks can prevent you from scaling out. If each clone of your service takes its own local, in-process lock, the clones won’t know about each other’s locks: two instances can each acquire “the” lock at the same time, both update the shared resource, and you end up with data inconsistency.
A common way of scaling out is to isolate a piece of functionality that requires globally available state, pull it out of the application, and create a new independent service encapsulating that functionality. This makes it easier to scale out.
Check out distributed lock services such as ZooKeeper. It is sophisticated and can alert you when a lock gets released. Locks can also be implemented with Redis, Memcached, or SQL databases like Postgres and MySQL.
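As a taste of the idea, here’s a bare-bones distributed lock on Redis (simplified: no retry loop, and a production version would release via a Lua script to make the check-and-delete atomic; host and key names are mine). Because the lock lives in a shared service, every clone of the web server sees the same lock.

```python
import uuid
import redis

r = redis.Redis(host="lock-service.internal")

def acquire(lock_name: str, ttl_seconds: int = 30) -> str | None:
    token = str(uuid.uuid4())
    # SET ... NX EX: only succeeds if nobody currently holds the lock,
    # and the TTL guarantees a dead process can't hold it forever
    if r.set(f"lock:{lock_name}", token, nx=True, ex=ttl_seconds):
        return token
    return None

def release(lock_name: str, token: str) -> None:
    # Only delete the lock if we still own it (note: get-then-delete
    # is not atomic here; that's what the Lua script would fix)
    if r.get(f"lock:{lock_name}") == token.encode():
        r.delete(f"lock:{lock_name}")
```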
Keeping web servers stateless is the long and short of all this.
DNS: this is the first component your clients interact with. Some DNS providers give you the option of latency-based routing, similar in nature to GeoDNS, where a client request is routed to a nearby data center for a lower-latency response.
Load Balancers: The entry point to your data center should be a load balancer. Do not use round-robin DNS across your web servers if you can help it. Clients cache the IP addresses that DNS resolves. This causes issues because if you take a server out of rotation, clients that still have the defunct server’s IP address cached will keep trying to reach it, and your app will not work for them.
Pros:
Hidden server maintenance: You can take a server out of a load balancer pool, wait for connections to drain and not affect any current clients.
Seamlessly increase capacity: add more web servers at any time and clients will not be affected
Efficient failure management: you can easily remove broken servers from the pool and no one will be affected by it
Automated Scaling
Hosted Load Balancer Services: Amazon Elastic Load Balancer: it’s hosted, scaled, and managed by Amazon themselves. Amazon seems to rhyme with Amazing lol jk. It’s a cheap and simple solution. It scales transparently, so you won’t need to worry. It has built-in high availability. It has SSL termination, which means the load balancer handles the HTTPS connections from clients and forwards plain HTTP to your web servers, so your servers don’t have to do the encryption work and you can save on resources. (I have to learn more about this tbh)
ELB scales up its own capacity a bit slowly, so a sudden traffic spike may mean some 503 codes for clients until it catches up.
Self-Hosted Load Balancers:
You can use NGINX or HAProxy as a reverse proxy.
NGINX can be a great INTERNAL web service load balancer simply because it caches HTTP responses. NGINX and HAProxy can each forward many thousands of requests per second from thousands of concurrent clients.
Where you would want to use round-robin DNS is to point at your load balancers. Even if a client caches a load balancer’s IP address, the load balancer itself is still responsible for routing each request to healthy servers, so cached entries don’t break anything.
Hardware Load Balancers: • high throughput • extremely low latency • consistent performance
These can be quite expensive, ranging from a few thousand dollars to $100,000. Some of them may also need a specialized engineer who knows how to set them up.
Caching: Instead of adding more servers to make responses faster, use caching to avoid having to serve requests in the first place. You can use a CDN cache to serve entire web pages. Be careful, though, not to cache personal or dynamic data.
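In Django terms, making a page CDN-cacheable can be as simple as emitting an explicit Cache-Control header, and only doing so on views with no personal or dynamic data. The view below is purely illustrative.

```python
from django.http import HttpResponse
from django.views.decorators.cache import cache_control

@cache_control(public=True, max_age=300)  # a CDN may cache this page for 5 minutes
def pricing_page(request):
    # Safe to cache: same content for every visitor, nothing personal
    return HttpResponse("<h1>Pricing</h1>")
```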
I hope this gives you some ideas of concepts and techniques you can use to scale your frontend horizontally.