- Published on
Database Indexing - FAANG System Design Interview Guide
Table of Contents
- 🔷 Database Indexing (FAANG Level)
- 🧠 1. Core Idea (Strong Opening)
- ⚡ 2. Why Indexing (REAL Insight)
- Without index:
- With index:
- Analogy
- 🎤 FAANG Question
- ⚙️ 3. How it Works (Simple but Important)
- 🔥 Key Insight
- 🧩 4. Types of Indexes (Must Know)
- 🟢 1. Primary Index
- 🔵 2. Secondary Index
- 🎤 FAANG Question
- 🟣 3. Composite Index (IMPORTANT 🔥)
- Rule:
- 🎤 FAANG Question
- ⚡ 4. Covering Index (ADVANCED 🔥)
- Benefit:
- ⚙️ 5. Underlying Data Structures (FAANG MUST)
- 🌳 B-Tree (Most Common)
- ⚡ Hash Index
- 🎤 FAANG Question
- ⚠️ 6. Trade-offs (VERY IMPORTANT)
- Reads vs Writes
- Why writes slow?
- 🎤 FAANG Question
- 🚨 7. When NOT to Use Index (Often Asked)
- 🎤 FAANG Question
- 🔥 8. Missing but CRITICAL Concepts
- 🧠 A. Cardinality (VERY IMPORTANT)
- 🧠 B. Index Selectivity
- 🧠 C. Index Maintenance Cost
- 🧠 D. Clustered vs Non-Clustered (IMPORTANT)
- Clustered
- Non-clustered
- 🎤 FAANG Question
- ⚡ 9. Real System Thinking
- Example: Instagram
- ⚖️ 10. Trade-off Summary
- 🎤 11. FAANG Interview Script
- Start
- Explain
- Smart Choice
- Trade-off
- Close Strong
- 🧠 Final One-Line (Must Memorize)
- 💡 FAANG-Level Insight (DIFFERENTIATOR)
🔷 Database Indexing (FAANG Level)
🧠 1. Core Idea (Strong Opening)
"An index is a data structure that allows fast lookup of rows without scanning the entire table."
⚡ 2. Why Indexing (REAL Insight)
Without index:
- Full table scan ❌ (O(n))
With index:
- Direct lookup ✅ (O(log n))
Analogy
📚 Book:
- No index → read whole book
- With index → jump to page
🎤 FAANG Question
Q: Why do we need indexing? A:
"To reduce query latency by avoiding full table scans and enabling efficient lookups."
⚙️ 3. How it Works (Simple but Important)
👉 Index stores:
- Column value
- Pointer to actual row
👉 DB uses index → finds location → fetches row
🔥 Key Insight
"Index = extra data structure optimized for reads"
🧩 4. Types of Indexes (Must Know)
🟢 1. Primary Index
👉 On primary key
- Unique
- One per table
🔵 2. Secondary Index
👉 On non-primary columns
Example:
- search by email, name
🎤 FAANG Question
Q: When do you add secondary index? A:
"When a column is frequently used in WHERE, JOIN, or ORDER BY clauses."
🟣 3. Composite Index (IMPORTANT 🔥)
👉 Multiple columns
Example:
(index: userId, createdAt)
Rule:
"Order matters"
🎤 FAANG Question
Q: Why column order matters? A:
"Because indexes are sorted, so prefix columns are used efficiently."
⚡ 4. Covering Index (ADVANCED 🔥)
👉 Index contains all needed columns
Benefit:
- No need to hit DB table
⚙️ 5. Underlying Data Structures (FAANG MUST)
🌳 B-Tree (Most Common)
👉 Balanced tree
- Search: O(log n)
- Supports range queries
⚡ Hash Index
👉 Key → value mapping
- Fast lookup
- ❌ No range queries
🎤 FAANG Question
Q: Why B-Tree over hash? A:
"Because B-Trees support both exact match and range queries."
⚠️ 6. Trade-offs (VERY IMPORTANT)
Reads vs Writes
| Operation | Impact |
|---|---|
| Read | Faster ✅ |
| Write | Slower ❌ |
Why writes slow?
👉 Need to:
- Update data
- Update index
🎤 FAANG Question
Q: Why not index everything? A:
"Because indexes increase write latency and storage overhead."
🚨 7. When NOT to Use Index (Often Asked)
- Small tables (full scan is fine)
- High write systems (logs, metrics)
- Low cardinality columns (e.g., gender)
🎤 FAANG Question
Q: Why avoid indexing low-cardinality columns? A:
"Because they don't filter much data, so index is not useful."
🔥 8. Missing but CRITICAL Concepts
🧠 A. Cardinality (VERY IMPORTANT)
👉 Uniqueness of data
- High cardinality → good index
- Low cardinality → bad index
🧠 B. Index Selectivity
👉 How much data it filters
"Higher selectivity = better index"
🧠 C. Index Maintenance Cost
- Storage overhead
- Rebalancing (B-Tree updates)
🧠 D. Clustered vs Non-Clustered (IMPORTANT)
Clustered
- Data stored in index order
Non-clustered
- Separate structure
🎤 FAANG Question
Q: Difference between clustered & non-clustered? A:
"Clustered defines physical order of data, non-clustered stores pointers."
⚡ 9. Real System Thinking
Example: Instagram
- Index on
userId→ fetch posts - Index on
createdAt→ sort feed
👉 Without index:
- Feed = slow ❌
⚖️ 10. Trade-off Summary
| Benefit | Cost |
|---|---|
| Fast reads | Slow writes |
| Efficient queries | Extra storage |
| Scalable queries | Maintenance overhead |
🎤 11. FAANG Interview Script
Start
"To optimize query performance, I'll add indexes on frequently queried columns."
Explain
"Indexes allow us to avoid full table scans and perform O(log n) lookups."
Smart Choice
"I'll use composite indexes based on query patterns."
Trade-off
"However, indexes increase write latency and storage overhead."
Close Strong
"So I'll only index high-cardinality, frequently queried fields."
🧠 Final One-Line (Must Memorize)
"Indexes improve read performance by trading off write latency and storage."
💡 FAANG-Level Insight (DIFFERENTIATOR)
"Good engineers add indexes. Great engineers design indexes based on query patterns and access frequency."