Back-of-the-Envelope Estimation - System Design Interview Complete Guide

πŸ”· Back-of-the-Envelope Estimation (BOE)

🧠 What it REALLY is

πŸ‘‰ Quick math to understand scale

"Not exact β†’ just directionally correct"

πŸ’¬ Say this in interview:

"I'm not trying to be precise here β€” I just need to understand the order of magnitude so my design choices are justified."


🎯 Why it matters (REAL reason)

❗This is what interviewers check:

  • Can you think in scale?
  • Can you justify your design?

πŸ”₯ Key Insight

"Without estimation β†’ your design is just guessing"

πŸ’¬ Say this in interview:

"I want to do a quick estimation before jumping into design β€” this will help me justify decisions like whether we need caching, sharding, or a CDN."


⚑ What BOE helps you decide

  • How many servers?
  • How much storage?
  • Can the system handle the traffic?
  • Where are the bottlenecks?

πŸ’¬ Say this in interview:

"Based on these numbers, I'll determine what kind of infrastructure we need and where the scaling pressure will be."


🧩 Core Types (Must Know)

1. Load (Traffic)

πŸ‘‰ Requests per second (RPS)

Example:

  • 10⁢ (1 Million) users Γ— 10 actions/day = 10⁷ (10 Million) requests/day
  • 10⁷ Γ· 10⁡ seconds (β‰ˆ 86,400 β‰ˆ 10⁡) = 10Β² = ~100 RPS βœ…

Shortcut: 10⁷ (10M) requests ÷ 10⁡ (100K) seconds = 10² = 100 RPS
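The same arithmetic can be sketched as a tiny helper for quick sanity checks (the function name and the 3x peak factor are illustrative choices, not from the original):

```python
def estimate_rps(daily_users: int, actions_per_user: int,
                 seconds_per_day: int = 100_000) -> float:
    """Average requests per second, using the 1 day ~ 10^5 seconds shortcut."""
    return daily_users * actions_per_user / seconds_per_day

avg_rps = estimate_rps(1_000_000, 10)   # 10^7 requests/day -> 100 RPS
peak_rps = avg_rps * 3                  # assume peak is 2-3x the average
```

Keeping `seconds_per_day` at the rounded 10⁵ keeps every result in clean powers of ten.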

πŸ’¬ Say this in interview:

"With [X] million DAU and roughly [Y] actions per user per day, that's [XΓ—Y] million requests per day β€” divided by 86,400 seconds, that's about [Z] RPS. At peak, I'd assume 2–3x that, so roughly [2Z–3Z] RPS."


2. Storage

πŸ‘‰ Data growth

Example:

  • 1 photo = 2 MB = 2 Γ— 10⁢ bytes
  • 1 Million uploads/day = 10⁢ photos/day
  • 10⁢ (1M) Γ— 2 Γ— 10⁢ (2MB) bytes = 2 Γ— 10ΒΉΒ² bytes/day

Now convert bytes β†’ TB:

  • 2 Γ— 10ΒΉΒ² bytes Γ· 10Β³ = 2 Γ— 10⁹ (2 Billion) KB
  • 2 Γ— 10⁹ KB Γ· 10Β³ = 2 Γ— 10⁢ (2 Million) MB
  • 2 Γ— 10⁢ MB Γ· 10Β³ = 2 Γ— 10Β³ (2 Thousand) GB
  • 2 Γ— 10Β³ GB Γ· 10Β³ = 2 TB/day βœ…

Shortcut: 1M items Γ— 2MB = 2 Γ— 10ΒΉΒ² bytes = 2 TB (10⁢ Γ— 10⁢ = 10ΒΉΒ² = 1 TB β†’ Γ— 2 = 2 TB)
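The photo example above can be scripted the same way (a sketch; the 365-day extrapolation is an added illustration):

```python
MB, TB = 10**6, 10**12  # decimal powers of ten, matching the shortcuts

def daily_storage_bytes(items_per_day: int, bytes_per_item: int) -> int:
    """Raw bytes written per day."""
    return items_per_day * bytes_per_item

photos = daily_storage_bytes(1_000_000, 2 * MB)  # 2 x 10^12 bytes
photos_tb = photos / TB                          # 2.0 TB/day
yearly_pb = photos_tb * 365 / 1000               # ~0.73 PB/year
```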

πŸ’¬ Say this in interview:

"Each [item] is roughly [size]. With [N] [items] per day, that's [N Γ— size] per day, or about [Y] TB per year. We'll need scalable object storage β€” something like S3."


3. Bandwidth

πŸ‘‰ Data transfer per second

Example:

  • 1 video stream = 5 MB/s = 5 Γ— 10⁢ bytes/sec
  • 1 Thousand concurrent users = 10Β³ users
  • 10Β³ (1K) Γ— 5 Γ— 10⁢ (5MB) bytes/sec = 5 Γ— 10⁹ bytes/sec

Convert bytes/sec β†’ GB/sec:

  • 5 Γ— 10⁹ bytes Γ· 10Β³ = 5 Γ— 10⁢ (5 Million) KB
  • 5 Γ— 10⁢ KB Γ· 10Β³ = 5 Γ— 10Β³ (5 Thousand) MB
  • 5 Γ— 10Β³ MB Γ· 10Β³ = 5 GB/sec βœ…

Shortcut: 1K users Γ— 5MB = 5 Γ— 10⁹ bytes = 5 GB (10Β³ Γ— 10⁢ = 10⁹ = 1 GB β†’ Γ— 5 = 5 GB)
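The same multiply-and-convert pattern works for bandwidth (a sketch with illustrative names):

```python
MB, GB = 10**6, 10**9

def egress_bytes_per_sec(concurrent_streams: int, bytes_per_sec_each: int) -> int:
    """Total outbound bandwidth for simultaneous streams."""
    return concurrent_streams * bytes_per_sec_each

total = egress_bytes_per_sec(1_000, 5 * MB)  # 5 x 10^9 bytes/sec
total_gb = total / GB                        # 5.0 GB/sec
```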

πŸ’¬ Say this in interview:

"If [N] users are streaming simultaneously and each stream is [X] Mbps, total egress bandwidth is [N Γ— X] Gbps. That tells me we definitely need a CDN to handle this at the edge."


4. Latency

πŸ‘‰ Time taken

Sequential example (calls happen one after another β†’ add them up):

User Feed Request:
  Auth Service      β†’ 20 ms
  User Service      β†’ 30 ms
  Post Service      β†’ 50 ms
  Ranking Service   β†’ 40 ms
  ─────────────────────────
  Total             = 140 ms  βœ… (within 200ms p99)

Parallel example (calls happen at the same time β†’ take the max):

User Feed Request:
  Auth Service      β†’ 20 ms ─┐
  Post Service      β†’ 50 ms ── (all fire at once)
  Ranking Service   β†’ 40 ms β”€β”˜
  ─────────────────────────
  Total             = max(20, 50, 40) = 50 ms  βœ… (much faster!)

Rule: Sequential = sum all. Parallel = take the slowest one. Always ask: "Can these calls be parallelized?" β€” it can cut latency dramatically.
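The rule reduces to `sum` versus `max`, which a short sketch makes concrete (the service names mirror the feed example above):

```python
def sequential_ms(calls: dict) -> int:
    """One-after-another calls: latencies add up."""
    return sum(calls.values())

def parallel_ms(calls: dict) -> int:
    """Concurrent calls: the slowest one dominates."""
    return max(calls.values())

feed = {"auth": 20, "user": 30, "post": 50, "ranking": 40}
seq = sequential_ms(feed)  # 140 ms total
par = parallel_ms(feed)    # 50 ms if all four fire at once
```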

πŸ’¬ Say this in interview:

"This request involves 4 sequential service calls — auth (20ms), user lookup (30ms), post fetch (50ms), and ranking (40ms) — totalling 140ms, which is within our 200ms p99 target. If we're ever close to the limit, I'd fire the independent calls in parallel; if all four can run concurrently, latency drops to the slowest single call, about 50ms."


5. Compute (Servers/CPU)

πŸ‘‰ How many CPU cores and servers do we need?

Formula:

CPU cores needed = RPS Γ— latency (in seconds)
Servers needed   = CPU cores Γ· cores per server

Example (using the same Instagram numbers from above):

RPS          = 10K = 10⁴ (10 Thousand) req/sec
Latency      = 50 ms (parallel path from latency example) = 5 Γ— 10⁻² sec

CPU cores    = 10⁴ Γ— 5 Γ— 10⁻² = 500 cores
             = 10⁴ Γ— 10⁻² = 10Β² = 100 β†’ Γ— 5 = 500 cores  βœ…

Each server  = 16 cores (standard)
Servers      = 500 Γ· 16 β‰ˆ 32 servers

Add 2Γ— safety buffer for peak traffic and redundancy:

32 Γ— 2 = ~64 servers  βœ…

Summary table:

What               Value          Power of 10
─────────────────────────────────────────────
RPS                10K            10⁴
Latency            50 ms          5 × 10⁻²
CPU cores          ~500           5 × 10²
Cores/server       16             —
Base servers       ~32            ~3.2 × 10¹
With 2× buffer     ~64 servers    ~6.4 × 10¹

Rule of thumb: Always add a 2Γ— safety multiplier for peak traffic and node failures.
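The whole compute estimate fits in one function (a sketch; the formula is the RPS × latency concurrency approximation, i.e. Little's law, and the defaults are the 16-core and 2× assumptions from above):

```python
import math

def servers_needed(rps: int, latency_sec: float,
                   cores_per_server: int = 16, safety: float = 2.0) -> int:
    """Cores in flight = RPS x latency (Little's law), spread across
    servers, then multiplied by the peak/redundancy safety buffer."""
    cores = rps * latency_sec                  # 10K x 0.05 = 500 cores
    base = math.ceil(cores / cores_per_server) # 500 / 16 -> 32 servers
    return math.ceil(base * safety)            # x2 buffer -> 64 servers

n = servers_needed(10_000, 0.05)
```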

πŸ’¬ Say this in interview:

"We established 10K RPS from our traffic estimation, and from our latency analysis each request takes about 50ms end-to-end. Using the formula β€” RPS Γ— latency(sec) = CPU cores β€” that's 10,000 Γ— 0.05 = 500 CPU cores. With 16-core servers, that's about 32 servers. Adding a 2x buffer for peak traffic and redundancy, I'd provision around 64 servers to start, with auto-scaling enabled."


🧠 Golden Technique (How to Think)

Step 1: Break it down

"Users β†’ actions β†’ data"

πŸ’¬ Say this in interview:

"Let me break this down: how many users, how often they act, and how much data each action generates."


Step 2: Assume smartly

Use round numbers:

  • 1K, 1M, 1B
  • 1 KB, 1 MB, 1 GB

πŸ’¬ Say this in interview:

"I'll use round numbers β€” these are estimates, not exact figures. If you disagree with any assumption, let me know and I'll adjust."


Step 3: Convert to per second

πŸ‘‰ Always go to RPS

"per day Γ· 86400"

πŸ’¬ Say this in interview:

"I always convert to per-second numbers — that's what matters for infrastructure sizing. One day is 86,400 seconds, which I'll round up to 100K for simplicity."


Step 4: Sanity check

Ask:

"Does this feel realistic?"

πŸ’¬ Say this in interview:

"Let me sanity check β€” [X] TB/day feels right for a system of this scale. Instagram reportedly stores petabytes, so we're in the right ballpark."


⚑ Powerful Shortcuts (Rules of Thumb)

Unit       Value          Power
───────────────────────────────
1 day      ~86K s         ~10⁵ s
Thousand   1K             10³
Million    1M             10⁶
Billion    1B             10⁹
Trillion   1T             10¹²
1 KB       10³ bytes      10³
1 MB       10⁶ bytes      10⁶
1 GB       10⁹ bytes      10⁹
1 TB       10¹² bytes     10¹²
1 PB       10¹⁵ bytes     10¹⁵

πŸ’¬ Say this in interview:

"I'll use the standard shortcut β€” 1 day β‰ˆ 100K seconds. It keeps the math clean and the interviewer can follow along easily."


πŸš€ Real Example (Interview Style)

Design Instagram

Assume:

  • 10⁸ (100 Million) users
  • 10 posts/day

Load:

  • 10⁸ (100M) Γ— 10 = 10⁹ (1 Billion) posts/day
  • 10⁹ Γ· 10⁡ (100K sec/day) = 10⁴ = 10K RPS βœ…

Storage (with full 1MB posts):

  • 1 post = 1 MB = 10⁢ bytes
  • 10⁹ (1 Billion) posts Γ— 10⁢ (1MB) bytes = 10¹⁡ bytes
  • 10¹⁡ Γ· 10ΒΉΒ² = 1,000 TB = 1 PB/day βœ…

Storage (metadata only, 1KB/post):

  • 10⁹ (1 Billion) posts Γ— 10Β³ (1KB) bytes = 10ΒΉΒ² bytes = 1 TB/day βœ…

Rule: 1 Billion Γ— 1KB = 1TB β†’ 10⁹ Γ— 10Β³ = 10ΒΉΒ²
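Putting the Instagram numbers together in one place (a sketch using the rounded powers-of-ten shortcuts; variable names are illustrative):

```python
MILLION = 10**6
KB, MB, TB, PB = 10**3, 10**6, 10**12, 10**15
SECONDS_PER_DAY = 10**5

users, posts_per_user = 100 * MILLION, 10

posts_per_day = users * posts_per_user        # 10^9: 1B posts/day
write_rps = posts_per_day / SECONDS_PER_DAY   # 10K writes/sec
media_pb_day = posts_per_day * MB / PB        # 1 PB/day at 1 MB/post
meta_tb_day = posts_per_day * KB / TB         # 1 TB/day at 1 KB/post
```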

πŸ‘‰ Now you KNOW:

  • Need distributed storage
  • Need CDN
  • Need sharding

πŸ’¬ Say this in interview:

"100 million DAU, each posting 10 times a day β€” that's 1 billion write events per day, or about 10,000 writes per second. Each post with metadata and a compressed image is roughly 1 MB β€” so 1 billion Γ— 1 MB = 1 petabyte per day for new writes. That clearly requires distributed object storage like S3, a CDN for reads, and database sharding for write throughput."


βš–οΈ Most Important Insight

"Estimation β†’ drives architecture decisions"

Example:

  • High RPS β†’ load balancer + scaling
  • Huge storage β†’ S3 + sharding
  • High bandwidth β†’ CDN

πŸ’¬ Say this in interview:

"These estimates aren't just numbers β€” they're what tells me which architectural components I actually need. High RPS means I need horizontal scaling and a load balancer. Petabyte-scale storage means I can't use a single relational DB β€” I need object storage and sharding. High bandwidth means I need a CDN to avoid hammering the origin servers."


❌ Common Mistakes

  • Trying to be exact ❌
  • No assumptions ❌
  • Not converting to per second ❌
  • Ignoring peak traffic ❌

πŸ’¬ Say this in interview:

"I'll make sure to state my assumptions explicitly, work in round numbers, convert everything to per-second figures, and account for peak traffic β€” usually 2–3x the average."


🎀 Interview Script (MEMORIZE THIS)

Start:

"Before designing, let me do a quick back-of-the-envelope estimation to understand system scale and justify my architectural choices."


Assumptions:

"I'll assume 100 million DAU, with each user performing roughly 10 actions per day. That's 1 billion requests per day β€” about 10,000 to 12,000 RPS on average, and around 30,000 at peak."


Calculate:

"For storage: if each action generates 1 KB of metadata, that's 1 TB of metadata per day. If we also store media at roughly 1 MB each, and 10% of actions include media, that's 100 TB/day of media β€” about 36 PB per year."


Expand:

"For bandwidth: at 30K RPS with a 1 KB average response, that's 30 MB/sec of read bandwidth. For media serving, if 1% of users stream 1 MB/sec simultaneously, that's 1 TB/sec β€” we absolutely need a CDN."
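The bandwidth figures in that script check out numerically (a sketch; the 1% concurrent-streaming figure is the script's own assumption):

```python
KB, MB, TB = 10**3, 10**6, 10**12

peak_rps = 30_000
api_read_mb = peak_rps * KB / MB     # 30 MB/sec of 1 KB API responses

users = 100 * 10**6
concurrent = int(users * 0.01)       # 1% of users streaming at once
media_tb = concurrent * MB / TB      # 1 TB/sec of media egress -> CDN
```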


Insight:

"Based on these numbers, the system needs: horizontal scaling behind a load balancer, a distributed database with sharding for write throughput, Redis or Memcached for hot reads, object storage like S3 for media, and a CDN for global low-latency delivery."


🧠 One-Line Summary

"Back-of-the-envelope estimation is used to quickly approximate system scale and guide architecture decisions."

πŸ’¬ Use this to open or close the estimation section in any interview:

"The goal of estimation isn't precision β€” it's to make sure my design is built for the right scale, not over-engineered or under-powered."