
Designing a URL Shortening Service like TinyURL

PHASE 1: UNDERSTAND, SCOPE & CONSTRAINTS

1. Clarify Ambiguity + Define Scope

  • What is the primary goal? (Create short aliases for long URLs)
  • What happens when a short link is clicked? (Redirect to original URL)
  • Are there specific constraints on the short link length?

Before jumping into the design, I want to clarify the scope. We are building a URL shortening service similar to TinyURL. The core goal is to convert long URLs into short, unique aliases and redirect users back to the original destination. I'll focus on the shortening and redirection logic, and exclude complex user account management for now to keep the scope manageable.

2. Functional Requirements

Let me list the core functional requirements - the key actions users will perform.

  • Generate a unique short alias for a given long URL.
  • Redirect users from the short link to the original long URL.
  • Support expiration times for links.
  • Provide basic analytics (click counts).
  • Support custom short aliases (optional).

3. Non-Functional Requirements

Now I'll define the non-functional requirements, as these will dictate our database and caching choices.

  • High Availability: The service must always be available; if it goes down, all redirections fail.
  • Low Latency: Redirection should happen in real-time.
  • Read-Heavy: Redirections will far outnumber URL creations → 100:1 ratio.
  • Unpredictability: Short links should not be easily guessable (security/spam prevention).

The system must be highly available with minimal redirection latency. Given a projected 100:1 read-to-write ratio, the system is heavily read-oriented. Short links must be non-predictable to prevent malicious scraping.

4. Constraints & Assumptions

I'll make a few assumptions regarding traffic and data retention to guide the capacity estimation.

  • Read/Write ratio is 100:1.
  • Average storage duration is 5 years.
  • Default expiration is 2 years.

PHASE 2: ESTIMATION & INTERFACE

5. Capacity Estimation

I'll perform some back-of-the-envelope calculations to determine the scale of our infrastructure.

Traffic:

  • New URLs: 500M / month ≈ 200 URLs/sec (≈2.6 × 10^6 seconds in a month, ≈10^5 seconds in a day).
  • Redirections: 50B / month ≈ 20K requests/sec.

Storage:

  • Total Objects (5 years): 500M × 12 months × 5 years = 30B records.
  • Size per record: ≈ 500 bytes.
  • Total Storage: 30B × 500 bytes ≈ 15 TB.

Bandwidth:

  • Ingress (Writes): 200 URLs/s × 500 bytes ≈ 100 KB/s.
  • Egress (Reads): 20K URLs/s × 500 bytes ≈ 10 MB/s.

Memory (Cache):

  • Following the 80/20 rule, we cache 20% of daily traffic.
  • Daily requests: 20K/s × 86,400 s ≈ 1.7B requests.
  • Cache Size: 0.2 × 1.7B × 500 bytes ≈ 170 GB.
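These back-of-the-envelope numbers can be sanity-checked with a few lines of arithmetic; the inputs are exactly the assumptions stated above.

```python
# Capacity estimation from the assumptions above: 500M writes/month,
# 100:1 read/write ratio, 500 bytes/record, 5-year retention.
WRITES_PER_MONTH = 500_000_000
READ_WRITE_RATIO = 100
BYTES_PER_RECORD = 500
SECONDS_PER_MONTH = 2_600_000  # ~30 days

writes_per_sec = WRITES_PER_MONTH / SECONDS_PER_MONTH        # ~200 URLs/s
reads_per_sec = writes_per_sec * READ_WRITE_RATIO            # ~20K req/s
total_records = WRITES_PER_MONTH * 12 * 5                    # 30B over 5 years
total_storage_tb = total_records * BYTES_PER_RECORD / 1e12   # ~15 TB
daily_reads = reads_per_sec * 86_400                         # ~1.7B/day
cache_gb = 0.2 * daily_reads * BYTES_PER_RECORD / 1e9        # ~170 GB (80/20 rule)
```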

6. Interface Definition

Now I'll define the REST APIs. I'm using a POST method for creation to handle the payload and a simple GET for the redirection to ensure maximum speed.

Create Short URL

POST /shorten

  • Request: { original_url, custom_alias?, expiration_date?, user_id? }
  • Response: { shortened_url, creation_date, expiration_date }

Redirect API

GET /{shortened_url}

  • Response: HTTP 302 Redirect → original_url

Analytics API

GET /analytics/{shortened_url}

  • Response: { click_count, unique_clicks, location_data, device_data }

URL Management API

Retrieves a list of all URLs shortened by a specific user, with metadata.

GET /user/urls

  • user_id (string, required): The ID of the user whose URLs are being requested.
  • page (integer, optional): The page number for paginated results.
  • page_size (integer, optional): The number of results per page.

  • Response: { urls } — a list of the user's shortened URLs, each including metadata such as creation date and expiration date.
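As a sketch, assuming the field names defined above, a create request/response pair might look like this (all values are illustrative):

```python
import json

# Illustrative payloads for POST /shorten, matching the fields above.
request = {
    "original_url": "https://example.com/some/very/long/path?with=query",
    "custom_alias": None,                        # optional
    "expiration_date": "2027-01-01T00:00:00Z",   # optional
}
response = {
    "shortened_url": "https://tiny.url/8m0Kx",   # hypothetical short domain
    "creation_date": "2025-01-01T00:00:00Z",
    "expiration_date": "2027-01-01T00:00:00Z",
}
body = json.dumps(request)  # what the client would actually send
```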


PHASE 3: HIGH-LEVEL DESIGN

7. High Level Architecture

  • The system follows a layered architecture: Client → CDN → Load Balancer → API servers → Cache → Database.
  • A dedicated Key Generation Service (KGS) provides unique short keys to API servers, decoupling key creation from request handling and ensuring constant-time allocation.
  • Frequently accessed URLs are served from CDN (edge caching) or Redis (in-memory cache) to minimize latency and reduce database load.
  • The database acts as the source of truth, with cache used as a performance layer.
  • Expiration is handled via cache TTL, read-time validation, and background cleanup jobs.

8. Database Selection

  • Choice: NoSQL (DynamoDB, Cassandra).

  • Reasoning:

    • Access pattern is simple: shortURL → originalURL (key-value lookup)
    • No joins required
    • Handles billions of records with horizontal scaling
    • Optimized for high write throughput and low-latency reads

9. Data Model & Access Patterns

  • URL Table: ShortURL (PK), OriginalURL, UserID, CreationDate, ExpirationDate

  • User Table: UserID (PK), UserDetails

  • Design Notes:

    • ShortURL as primary key enables fast lookups for redirects
    • Denormalized schema avoids joins
    • ExpirationDate supports TTL and validation logic

PHASE 4: DEEP DIVE & DETAILED DESIGN

10. Core Component Deep Dive

To generate the short key, I'll propose a Key Generation Service (KGS). This avoids the collision problems associated with hashing and allows us to provide keys in constant time.

Approach 1: Counter + Base62 (recommended in real-world systems)

  • Use a distributed counter / unique ID generator (e.g., Twitter Snowflake)
  • Convert the numeric ID → Base62

Example:

123456789 → "8m0Kx" (using the alphabet 0–9, a–z, A–Z)
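A minimal Base62 encoder sketch; the exact output string depends on the alphabet ordering, which is an assumption here (digits, then lowercase, then uppercase):

```python
# Base62 alphabet: index 0-9 → '0'-'9', 10-35 → 'a'-'z', 36-61 → 'A'-'Z'.
ALPHABET = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"

def base62_encode(n: int) -> str:
    """Convert a non-negative counter value to its Base62 string."""
    if n == 0:
        return ALPHABET[0]
    digits = []
    while n > 0:
        n, rem = divmod(n, 62)
        digits.append(ALPHABET[rem])
    return "".join(reversed(digits))  # most significant digit first
```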

✅ Pros:

  • No collisions
  • Ordered IDs (debugging easier)
  • No storage needed for unused keys

Approach 2: Random pre-generated keys

  • Generate random 6-char strings
  • Store in DB

❗ Trade-off:

  • Requires storage (~hundreds of GB for billions of keys)
  • Slight overhead vs counter-based

🚀 Why KGS is superior (core reasoning)

  1. Guaranteed uniqueness

    • No collision checks
    • No retry logic

  2. Constant-time allocation

    • No hashing + DB lookup loop

  3. Predictable latency

    • Critical for high-scale systems

  4. Better scalability

    • Independent scaling of key generation

❓ What if KGS becomes a bottleneck?
  • Mitigation:

    • Batch key fetching (app servers rarely call KGS)
    • Horizontal scaling of KGS
    • Partition key space across multiple KGS nodes
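Batch key fetching could be sketched as a small client-side buffer; `fetch_batch` stands in for the real KGS call and is an assumption of this sketch:

```python
import threading

# App-server-side key buffer: fetch unique keys from the KGS in batches
# so that most shorten requests never touch the KGS at all.
class KeyBuffer:
    def __init__(self, fetch_batch, batch_size=1000):
        self._fetch = fetch_batch        # stand-in for the KGS RPC
        self._size = batch_size
        self._keys: list[str] = []
        self._lock = threading.Lock()

    def next_key(self) -> str:
        with self._lock:
            if not self._keys:           # buffer empty → one KGS round trip
                self._keys = list(self._fetch(self._size))
            return self._keys.pop()      # O(1) for every other request
```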

❓ What if KGS crashes?
  • App servers continue using cached keys

  • If exhausted:

    • Failover to replica KGS
    • Or temporary write degradation

❓ What if an app server crashes after fetching keys?
  • Some keys are lost (never used)

✔ Acceptable because:

  • Key space is huge (billions/trillions)
  • Avoids expensive coordination

❓ How do we ensure no duplicate keys across servers?
  • KGS uses:

    • Atomic operations (DB transactions / Redis INCR)
    • Or partitioned key ranges per node

❓ What if the key space is exhausted?
  • Increase key length:

    • 6 chars → ~56B combinations
    • 7 chars → ~3.5T combinations

✔ Can be done without breaking existing URLs
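The quoted key-space sizes check out directly:

```python
# Key-space sizes for 6- and 7-character Base62 keys.
six_chars = 62 ** 6    # 56,800,235,584 → the "~56B combinations" above
seven_chars = 62 ** 7  # 3,521,614,606,208 → the "~3.5T combinations" above
```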

❓ Why not generate keys directly in app server?
  • Leads to:

    • Race conditions
    • Duplicate keys
    • Need for distributed locking

✔ KGS centralizes this responsibility safely

I prefer KGS because it guarantees uniqueness and provides constant-time key allocation without collision handling. By prefetching keys and distributing key ranges, we eliminate it from the critical path while maintaining scalability and fault tolerance.

11. Caching Strategy (Deep Dive)

  • Use Redis as primary cache layer (in-memory, low latency).

  • Apply Cache-Aside pattern:

    • Read → check cache → fallback to DB → update cache
  • Cache key: shortURL → originalURL

  • Use TTL aligned with expiration time to auto-evict expired links

Why this works:

  • Reduces DB load significantly (~80–90% hits)
  • Ensures low-latency redirects
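The cache-aside read path above can be sketched as follows; plain dicts stand in for Redis and the URL table, and TTL handling is omitted in this toy:

```python
# Cache-aside: check cache → fall back to DB → populate cache.
def resolve(short_url: str, cache: dict, db: dict):
    url = cache.get(short_url)
    if url is not None:
        return url                      # cache hit: no DB touch
    url = db.get(short_url)             # miss: fall back to source of truth
    if url is not None:
        cache[short_url] = url          # populate cache for later reads
    return url
```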

12. Partitioning & Sharding

  • Use hash-based partitioning on shortURL
  • Distributes data evenly across DB nodes
  • Apply consistent hashing for scalability and minimal rebalancing

Why:

  • Avoids hotspot partitions
  • Enables horizontal scaling to billions of records
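A toy consistent-hash ring illustrating the idea; node names, the virtual-node count, and the MD5 hash are illustrative assumptions, not a prescribed design:

```python
import bisect
import hashlib

def _hash(key: str) -> int:
    return int(hashlib.md5(key.encode()).hexdigest(), 16)

# Consistent hashing: each shortURL maps to the first virtual node
# clockwise on the ring, so adding/removing a DB node only remaps the
# keys nearest to it rather than rehashing everything.
class HashRing:
    def __init__(self, nodes, vnodes=100):
        self._ring = sorted(
            (_hash(f"{node}#{i}"), node)
            for node in nodes for i in range(vnodes)
        )
        self._hashes = [h for h, _ in self._ring]

    def node_for(self, short_url: str) -> str:
        # First vnode clockwise from the key's hash (wrapping at the end).
        idx = bisect.bisect(self._hashes, _hash(short_url)) % len(self._ring)
        return self._ring[idx][1]
```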

13. Async Processing & Messaging

  • Use Kafka / Queue system for:

    • Analytics (click tracking)
    • Logging
    • Cleanup jobs
  • Redirect flow remains synchronous; analytics is asynchronous

Why:

  • Keeps user-facing latency low
  • Decouples heavy processing from critical path
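The synchronous-redirect / asynchronous-analytics split can be sketched with an in-process queue; a real deployment would use Kafka, but the decoupling idea is the same:

```python
import queue
import threading

events: "queue.Queue[str]" = queue.Queue()
click_counts: dict[str, int] = {}

def record_click(short_url: str) -> None:
    events.put(short_url)   # O(1) enqueue; redirect returns immediately

def analytics_worker() -> None:
    # Background consumer: aggregates click counts off the critical path.
    while True:
        url = events.get()
        click_counts[url] = click_counts.get(url, 0) + 1
        events.task_done()

threading.Thread(target=analytics_worker, daemon=True).start()
```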

14. Load Balancing & Service Scaling

  • Use Load Balancer (L4/L7) between:

    • Client → API servers
    • API → Cache/DB
  • API servers are stateless

  • Enable horizontal auto-scaling

Why:

  • Eliminates single point of failure
  • Handles traffic spikes efficiently

15. Fault Tolerance and Failure Handling

  • Cache failure → fallback to DB
  • DB failure → use replication / failover
  • KGS failure → continue with pre-fetched keys
  • Retries with backoff for transient failures

Why:

  • Ensures high availability
  • Graceful degradation instead of full system failure

16. Multi-Region & Geo Distribution

  • Deploy system across multiple regions
  • Use Geo-DNS / routing to direct users to nearest region
  • Replicate data asynchronously across regions

Why:

  • Reduces latency for global users
  • Improves availability and disaster recovery

17. Security & Privacy

  • Use HTTPS for secure communication
  • Authentication via JWT/OAuth (optional for users)
  • Validate URLs to prevent malicious links
  • Rate limiting to prevent abuse

Why:

  • Protects user data
  • Prevents system misuse (spam/abuse)
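Rate limiting could be a token bucket per client; the capacity and refill rate here are illustrative, not tuned values:

```python
import time

# Token bucket: each request spends a token; tokens refill continuously
# up to a fixed capacity, allowing short bursts but capping sustained rate.
class TokenBucket:
    def __init__(self, capacity=10, refill_per_sec=5):
        self.capacity = capacity
        self.tokens = float(capacity)
        self.refill = refill_per_sec
        self.last = time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.refill)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False   # caller should return HTTP 429
```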

18. Monitoring & Observability

  • Track:

    • Latency (p95/p99)
    • QPS
    • Error rates
  • Use:

    • Prometheus + Grafana (metrics)
    • ELK stack (logs)

Why:

  • Enables quick detection of issues
  • Helps in scaling and optimization decisions


PHASE 5: EVALUATION

19. Trade-offs Analysis

  • KGS vs Hashing

    • KGS → no collisions, predictable latency, but adds system complexity
    • Hashing → simpler, but requires collision handling and retries
  • Cache vs Freshness

    • Caching improves latency and reduces DB load
    • May serve slightly stale data (acceptable for this system)
  • NoSQL vs SQL

    • NoSQL → better horizontal scalability and high throughput
    • SQL → stronger consistency but harder to scale at large volume
  • Pre-fetching keys (KGS)

    • Improves performance (O(1) allocation)
    • May waste some keys if servers crash

20. Bottlenecks & Future Improvements

  • Hot URLs (viral links)

    • Bottleneck at cache/DB
    • Improve via CDN caching and cache replication
  • Cache limitations

    • Increase cache size or use distributed cache cluster
  • Database scaling

    • Add more shards / partitions as data grows
  • Analytics overhead

    • Move fully to async pipelines (Kafka + data warehouse)
  • Future Improvements

    • Add ranking / smart redirection (geo/device-based)
    • Introduce ML-based link analytics
    • Improve abuse detection and rate limiting