#13 Latency, Throughput & Performance Optimization – Network Latency, Processing Latency

The Frustrating Delay in Online Orders

Aditi ordered a laptop online. She clicked “Buy Now,” but the page kept loading.

The transaction finally went through, but the delay was annoying. The culprit? High latency.

For systems to perform efficiently, they must optimize latency, throughput, and processing time.

What is Latency?

Latency is the time taken for a request to travel from the client to the server and back.

A low-latency system is fast and responsive, while a high-latency system feels sluggish.

Types of Latency

1. Network Latency – The Travel Time

The time taken for data to move between servers and users.
Affected by distance, network congestion, and routing inefficiencies.

Example: Slow webpage loading due to a high round-trip time (RTT).

2. Processing Latency – The Server’s Speed

The time taken for a server to process a request and generate a response.
Affected by CPU performance, database queries, and application logic.

Example: A slow SQL query delaying an API response.

diagram showing a request being processed at different layers (web server, app server, database)

What is Throughput?

Throughput is the number of requests a system can handle per second.

A high-throughput system can serve more users efficiently, while a low-throughput system struggles under load.

Example: An API that processes 1,000 requests per second has higher throughput than one handling 100 requests per second.

Optimizing Performance – Reducing Latency & Increasing Throughput

1. Reducing Network Latency

✔ Use Content Delivery Networks (CDNs) to cache data closer to users. ✔ Optimize DNS resolution to reduce lookup delays. ✔ Implement compression (Gzip, Brotli) to shrink payload sizes.

Example: Netflix uses CDNs to serve videos faster worldwide.

diagram showing how CDNs cache and reduce data travel distance

2. Optimizing Processing Latency

✔ Use efficient indexing to speed up database queries. ✔ Implement caching (Redis, Memcached) to reduce database hits. ✔ Optimize server-side code and eliminate bottlenecks.

Example: E-commerce sites cache product details to reduce database queries.

3. Improving Throughput

✔ Implement load balancing to distribute requests across multiple servers. ✔ Use asynchronous processing to handle background tasks efficiently. ✔ Scale horizontally with auto-scaling clusters.

Example: Twitter handles millions of requests per second using load-balanced microservices.

Real-World Use Cases

1. Streaming Services (Netflix, YouTube)

Reduce network latency using CDNs.
Optimize throughput with high-bandwidth servers.

2. Financial Transactions (Banking, Trading Apps)

Use low-latency networks to execute transactions in milliseconds.
Process requests efficiently using optimized database queries.

3. Online Gaming (Multiplayer Games)

Minimize lag by reducing network latency.
Ensure high throughput for real-time gameplay.

Conclusion

For a fast, scalable system:

Reduce network latency using CDNs and compression.
Minimize processing latency with caching and optimized queries.
Increase throughput by load balancing and scaling.

Next, we’ll explore Message Queues & Event-Driven Architectures – Kafka, RabbitMQ, SQS, Web hooks. Be prepared for that.

#code #system-design

3/3/2025

#13 Latency, Throughput & Performance Optimization – Network Latency, Processing Latency

The Frustrating Delay in Online Orders

Aditi ordered a laptop online. She clicked “Buy Now,” but the page kept loading.

The transaction finally went through, but the delay was annoying. The culprit? High latency.

For systems to perform efficiently, they must optimize latency, throughput, and processing time.

What is Latency?

Latency is the time taken for a request to travel from the client to the server and back.

A low-latency system is fast and responsive, while a high-latency system feels sluggish.

Types of Latency

1. Network Latency – The Travel Time

The time taken for data to move between servers and users.
Affected by distance, network congestion, and routing inefficiencies.

Example: Slow webpage loading due to a high round-trip time (RTT).

2. Processing Latency – The Server’s Speed

The time taken for a server to process a request and generate a response.
Affected by CPU performance, database queries, and application logic.

Example: A slow SQL query delaying an API response.

What is Throughput?

Throughput is the number of requests a system can handle per second.

A high-throughput system can serve more users efficiently, while a low-throughput system struggles under load.

Example: An API that processes 1,000 requests per second has higher throughput than one handling 100 requests per second.

Optimizing Performance – Reducing Latency & Increasing Throughput

1. Reducing Network Latency

Example: Netflix uses CDNs to serve videos faster worldwide.

2. Optimizing Processing Latency

✔ Use efficient indexing to speed up database queries. ✔ Implement caching (Redis, Memcached) to reduce database hits. ✔ Optimize server-side code and eliminate bottlenecks.

Example: E-commerce sites cache product details to reduce database queries.

3. Improving Throughput

Example: Twitter handles millions of requests per second using load-balanced microservices.

Real-World Use Cases

1. Streaming Services (Netflix, YouTube)

Reduce network latency using CDNs.
Optimize throughput with high-bandwidth servers.

2. Financial Transactions (Banking, Trading Apps)

Use low-latency networks to execute transactions in milliseconds.
Process requests efficiently using optimized database queries.

3. Online Gaming (Multiplayer Games)

Minimize lag by reducing network latency.
Ensure high throughput for real-time gameplay.

Conclusion

For a fast, scalable system:

Reduce network latency using CDNs and compression.
Minimize processing latency with caching and optimized queries.
Increase throughput by load balancing and scaling.

Next, we’ll explore Message Queues & Event-Driven Architectures – Kafka, RabbitMQ, SQS, Web hooks. Be prepared for that.

#code #system-design

3/3/2025

#14 Message Queues & Event-Driven Architectures – Kafka, RabbitMQ, SQS, Webhooks

How do apps talk to each other without slowing down? Message queues and events! We'll break down Kafka, RabbitMQ, SQS, and webhooks – basically, how to build apps that can handle anything.

Read Full Story

#20 Content Delivery Networks (CDNs) – Akamai, Cloudflare, AWS CloudFront

Want to make your website load instantly? Let's talk CDNs. We'll cover how they work and when to use them. Think of it as putting your website on a fast track around the world.

Read Full Story

#18 WebSockets & Real-time Communication – Socket.io, SSE

Tired of refreshing? Learn about WebSockets and SSE! We'll show you how to build real-time apps like chat and live dashboards. Get instant updates and make your users happy.

Read Full Story