#5 Load Balancer and Techniques – Round Robin, Least Connections, Consistent Hashing

The Black Friday Crash

On Black Friday, an e-commerce website faced a disaster. Millions of users rushed to grab deals, but the site slowed down and crashed.

The problem? A single server couldn’t handle the traffic. They needed load balancing to distribute requests efficiently.

Load balancing prevents overload, ensures high availability, and improves response times. Let’s dive in.

So here we are now, with Phase 2 of our Series: Handling Scale & Performance Basics

What is Load Balancing?

Load balancing is the process of distributing incoming network traffic across multiple servers to ensure no single server is overwhelmed.

It improves:

Scalability – Handles high traffic smoothly.
Reliability – Prevents failures by rerouting traffic.
Performance – Reduces response time by optimising resource usage.

How Load Balancers Work

A load balancer sits between clients and servers, directing requests based on predefined rules.

Types of load balancers:

Hardware Load Balancers – Dedicated physical devices.
Software Load Balancers – Implemented via software (Nginx, HAProxy, AWS ELB).

Load Balancing Algorithms

Different algorithms determine how requests are distributed.

1. Round Robin – Simple and Fair

Each request is sent to the next available server in a cyclic manner.

Example:

Request 1 → Server A
Request 2 → Server B
Request 3 → Server C
Request 4 → Server A (cycle repeats)

Pros:

✔ Easy to implement. ✔ Works well when all servers have equal capacity.

Cons:

✖ Doesn't consider server load. ✖ Can overload a slower server.

2. Least Connections – Intelligent Distribution

Traffic is sent to the server with the fewest active connections.

Example:

Server A (5 active connections)
Server B (2 active connections) ← Next request goes here
Server C (3 active connections)

Pros:

✔ Balances load dynamically. ✔ Prevents overloading busy servers.

Cons:

✖ Slightly higher overhead to track connections. ✖ Not ideal for short-lived connections.

3. Consistent Hashing – Sticky Routing

Requests are routed based on a hash of the client’s IP address or request data. This ensures the same client connects to the same server.

Example:

Hash(User 1) → Server A
Hash(User 2) → Server B
Hash(User 3) → Server C

If a server is removed, traffic is rebalanced with minimal disruption.

Pros:

✔ Ensures session persistence. ✔ Efficient when servers are added or removed.

Cons:

✖ Requires hashing computation. ✖ Needs proper fallback mechanisms.

Real-World Use Cases

1. E-Commerce Websites

Handles sudden traffic spikes, ensuring smooth checkout processes.

2. Streaming Services (Netflix, YouTube)

Distributes video streams efficiently across multiple servers.

3. Online Gaming Platforms

Ensures real-time responsiveness and prevents lag.

Conclusion

Load balancing ensures high availability and optimal performance by distributing traffic efficiently.

Round Robin is simple but doesn’t account for load.
Least Connections dynamically balances requests.
Consistent Hashing provides session persistence.

Next, we’ll explore Caching Basics & Strategies – Why cache? Cache Invalidation, Write-Through, Write-Back.

#system-design #code

3/2/2025

#5 Load Balancer and Techniques – Round Robin, Least Connections, Consistent Hashing

The Black Friday Crash

On Black Friday, an e-commerce website faced a disaster. Millions of users rushed to grab deals, but the site slowed down and crashed.

The problem? A single server couldn’t handle the traffic. They needed load balancing to distribute requests efficiently.

Load balancing prevents overload, ensures high availability, and improves response times. Let’s dive in.

So here we are now, with Phase 2 of our Series: Handling Scale & Performance Basics

What is Load Balancing?

Load balancing is the process of distributing incoming network traffic across multiple servers to ensure no single server is overwhelmed.

It improves:

Scalability – Handles high traffic smoothly.
Reliability – Prevents failures by rerouting traffic.
Performance – Reduces response time by optimising resource usage.

How Load Balancers Work

A load balancer sits between clients and servers, directing requests based on predefined rules.

Types of load balancers:

Hardware Load Balancers – Dedicated physical devices.
Software Load Balancers – Implemented via software (Nginx, HAProxy, AWS ELB).

Load Balancing Algorithms

Different algorithms determine how requests are distributed.

1. Round Robin – Simple and Fair

Each request is sent to the next available server in a cyclic manner.

Example:

Request 1 → Server A
Request 2 → Server B
Request 3 → Server C
Request 4 → Server A (cycle repeats)

Pros:

✔ Easy to implement. ✔ Works well when all servers have equal capacity.

Cons:

✖ Doesn't consider server load. ✖ Can overload a slower server.

2. Least Connections – Intelligent Distribution

Traffic is sent to the server with the fewest active connections.

Example:

Server A (5 active connections)
Server B (2 active connections) ← Next request goes here
Server C (3 active connections)

Pros:

✔ Balances load dynamically. ✔ Prevents overloading busy servers.

Cons:

✖ Slightly higher overhead to track connections. ✖ Not ideal for short-lived connections.

3. Consistent Hashing – Sticky Routing

Requests are routed based on a hash of the client’s IP address or request data. This ensures the same client connects to the same server.

Example:

Hash(User 1) → Server A
Hash(User 2) → Server B
Hash(User 3) → Server C

If a server is removed, traffic is rebalanced with minimal disruption.

Pros:

✔ Ensures session persistence. ✔ Efficient when servers are added or removed.

Cons:

✖ Requires hashing computation. ✖ Needs proper fallback mechanisms.

Real-World Use Cases

1. E-Commerce Websites

Handles sudden traffic spikes, ensuring smooth checkout processes.

2. Streaming Services (Netflix, YouTube)

Distributes video streams efficiently across multiple servers.

3. Online Gaming Platforms

Ensures real-time responsiveness and prevents lag.

Conclusion

Load balancing ensures high availability and optimal performance by distributing traffic efficiently.

Round Robin is simple but doesn’t account for load.
Least Connections dynamically balances requests.
Consistent Hashing provides session persistence.

Next, we’ll explore Caching Basics & Strategies – Why cache? Cache Invalidation, Write-Through, Write-Back.

#system-design #code

3/2/2025

#11 Auto Scaling & Elasticity – How Systems Dynamically Handle Traffic Surges

Traffic spikes got you worried? Let's talk auto scaling and elasticity. We'll see how systems adapt in real-time to keep your app alive, and why it's a game-changer for online services.

Read Full Story

#20 Content Delivery Networks (CDNs) – Akamai, Cloudflare, AWS CloudFront

Want to make your website load instantly? Let's talk CDNs. We'll cover how they work and when to use them. Think of it as putting your website on a fast track around the world.

Read Full Story

#7 Distributed Caching – Redis, Memcached, CDN-Based Caching

Overwhelmed by traffic? Learn how distributed caching with Redis, Memcached, and CDNs can save the day. Deliver content faster, reduce server load, and keep your users happy.

Read Full Story