System Design Mastery: The Complete Guide for Software Engineers

In software engineering, writing code is the baseline; architecting systems is what sets senior engineers apart. This guide covers the core principles of high-level system design, from load balancing and caching strategies to microservices and database sharding. Whether you are preparing for a technical interview or scaling your current application to support millions of users, this guide provides a roadmap.

April 20, 2026 · 20 min read

System design interviews are often the biggest obstacle between mid-level developers and senior engineering roles at top tech companies. More importantly, system design skills separate engineers who can build features from those who can architect entire platforms.

Whether you're preparing for FAANG interviews or want to design better systems at your current job, this guide will teach you the frameworks, patterns, and thinking processes that senior engineers use to design scalable systems.

By the end, you'll have a repeatable approach for designing systems like Netflix, Uber, Twitter, and YouTube from scratch, and for applying these patterns to real-world engineering challenges.

Prerequisites

To get the most from this guide, you should have:

  • 2+ years of software development experience
  • Understanding of basic data structures (arrays, trees, graphs, hash tables)
  • Familiarity with databases and APIs
  • Basic knowledge of how the web works (HTTP, DNS, servers)
  • No distributed systems experience required

What Is System Design?

System design is the process of defining the architecture, components, modules, interfaces, and data flow for a system to satisfy specific requirements.

It answers questions like:

  • How does Netflix serve videos to 200+ million subscribers?
  • How does Google process billions of searches per day?
  • How does Uber match riders with drivers in real-time?
  • How does Instagram store and serve billions of photos?

Why System Design Matters

For Interviews:

  • Most senior+ engineering roles include a system design round
  • Often the deciding factor between candidates with similar coding skills
  • Tests real-world engineering judgment, not just algorithms

For Your Career:

  • Ability to design systems is what makes you a senior engineer
  • Critical for technical leadership and architecture roles
  • Helps you make better engineering decisions daily
  • Essential for building scalable products

For Your Team:

  • Poor design decisions cost companies millions in rewrites
  • Good architecture scales smoothly as users grow
  • Prevents technical debt and system failures

The System Design Framework

Use this framework for every system design question:

1. REQUIREMENTS (5 minutes)
   ↓
   Clarify functional and non-functional requirements
   Define scope and constraints
   
2. CAPACITY ESTIMATION (5 minutes)
   ↓
   Calculate traffic, storage, bandwidth needs
   Identify bottlenecks
   
3. HIGH-LEVEL DESIGN (10 minutes)
   ↓
   Draw basic architecture diagram
   Identify major components
   
4. DETAILED DESIGN (15 minutes)
   ↓
   Deep dive into critical components
   Discuss trade-offs
   
5. SCALE AND OPTIMIZE (10 minutes)
   ↓
   Add caching, load balancing, replication
   Handle failures
   
6. ADDITIONAL CONSIDERATIONS (5 minutes)
   ↓
   Security, monitoring, deployment

Let's apply this framework to real examples.

Example 1: Design URL Shortener (Like Bit.ly)

Step 1: Requirements Clarification

Functional Requirements:

  • Given a long URL, generate a short URL
  • Short URL redirects to original long URL
  • Optional: Custom short URLs
  • Optional: Analytics (click count, locations)

Non-Functional Requirements:

  • High availability (99.9% uptime)
  • Low latency (< 100ms redirect time)
  • Short URLs should be unpredictable
  • System should scale to 100M new URLs per month
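An availability target is easier to reason about as a downtime budget. The sketch below converts the 99.9% figure above (and 99.99%, for comparison) into allowed downtime per year. The function name is illustrative, and a 365-day year is assumed.

```python
# Sketch: convert an availability target into a yearly downtime budget.
SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # assumes a 365-day year


def downtime_budget_seconds(availability_percent: float) -> float:
    """Seconds of downtime per year permitted by an availability target."""
    return SECONDS_PER_YEAR * (1 - availability_percent / 100)


for target in (99.9, 99.99):
    hours = downtime_budget_seconds(target) / 3600
    print(f"{target}% uptime allows about {hours:.2f} hours of downtime per year")
```

Each extra "nine" cuts the budget by a factor of ten, which is why the jump from 99.9% to 99.99% usually needs redundancy rather than just careful operations.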

Constraints:

  • Short URL length: 6-7 characters
  • Read-heavy system (100:1 read-write ratio)
  • URLs don't expire
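A quick way to check the length constraint is to compare the number of possible codes with the number of URLs the system must hold. This sketch assumes a 62-character alphabet (0-9, a-z, A-Z) and the 100M new URLs per month over five years used in the capacity estimate.

```python
# Sketch: does a 6- or 7-character base62 code cover five years of URLs?
ALPHABET_SIZE = 62  # 0-9, a-z, A-Z (assumed alphabet)


def keyspace(length: int) -> int:
    """Number of distinct codes of exactly this length."""
    return ALPHABET_SIZE ** length


urls_per_month = 100_000_000
total_urls = urls_per_month * 5 * 12  # 6 billion URLs over five years

for length in (6, 7):
    space = keyspace(length)
    print(f"{length} chars: {space:,} codes, covers 5 years: {space >= total_urls}")
```

Six characters already give roughly 56.8 billion codes, so seven characters are headroom rather than a necessity at this scale.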

Step 2: Capacity Estimation

Traffic:

  • 100M new URLs per month
  • ~40 new URLs per second
  • With 100:1 ratio: 4000 reads/second

Storage:

  • Each URL entry: ~500 bytes (long URL + metadata)
  • 100M URLs = 50 GB per month
  • For 5 years: 50 GB × 60 = 3 TB

Bandwidth:

  • Write: 40 requests/sec × 500 bytes = 20 KB/sec
  • Read: 4000 requests/sec × 500 bytes = 2 MB/sec

Cache:

  • 80-20 rule: 20% URLs generate 80% traffic
  • Cache 20% of each day's new URLs: ~3.3M × 20% × 500 bytes ≈ 330 MB, so a ~500 MB cache leaves headroom
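The estimates above are simple enough to script, which makes it easy to rerun them when an assumption changes. This is a minimal sketch using the figures from this section; a 30-day month and decimal units (1 GB = 10^9 bytes) are assumed.

```python
# Sketch: back-of-envelope capacity numbers for the URL shortener.
SECONDS_PER_MONTH = 30 * 24 * 60 * 60  # assumes a 30-day month

new_urls_per_month = 100_000_000
read_write_ratio = 100
bytes_per_entry = 500
retention_months = 5 * 12

writes_per_sec = new_urls_per_month / SECONDS_PER_MONTH
reads_per_sec = writes_per_sec * read_write_ratio
storage_per_month_gb = new_urls_per_month * bytes_per_entry / 1e9
total_storage_tb = storage_per_month_gb * retention_months / 1000
read_bandwidth_mb = reads_per_sec * bytes_per_entry / 1e6

print(f"writes/sec: {writes_per_sec:.0f}")
print(f"reads/sec:  {reads_per_sec:.0f}")
print(f"storage:    {storage_per_month_gb:.0f} GB/month, {total_storage_tb:.0f} TB in 5 years")
print(f"read bandwidth: {read_bandwidth_mb:.1f} MB/sec")
```

The exact results land slightly under the rounded figures in the text (about 39 writes/sec instead of 40), which is fine: in an interview the order of magnitude is what matters.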

Step 3: High-Level Design

┌─────────┐
│ Client  │
└────┬────┘
     │
     ▼
┌─────────────────┐
│  Load Balancer  │
└────┬────────────┘
     │
     ▼
┌─────────────────┐
│  API Servers    │  (Create/Redirect)
└────┬────────────┘
     │
     ├──────────┬─────────┐
     ▼          ▼         ▼
┌────────┐  ┌──────┐  ┌──────────┐
│ Cache  │  │  DB  │  │ Analytics│
└────────┘  └──────┘  └──────────┘

API Design:

POST /api/shorten
Request: { "longUrl": "https://example.com/very/long/url" }
Response: { "shortUrl": "https://short.ly/abc123" }

GET /abc123
Response: 302 Redirect to original URL

Step 4: Detailed Design

URL Encoding Strategy:

Option 1: Hash-based

import hashlib

def encode_url_hash(long_url):
    # MD5 hash of URL
    hash_value = hashlib.md5(long_url.encode()).hexdigest()
    
    # Take first 6 characters
    short_code = hash_value[:6]
    
    # Problem: 6 hex characters allow only 16^6 (~16.7M) codes,
    # so collisions are likely and must be detected and retried
    return short_code
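To get a feel for how quickly the hash-based option runs into collisions, the birthday-bound approximation is enough. The sketch below assumes hash outputs are uniformly distributed over the 16^6 possible six-character hex codes; the function name is illustrative.

```python
# Sketch: probability of at least one collision among n truncated hashes.
import math


def collision_probability(num_urls: int, keyspace: int) -> float:
    """Birthday-bound approximation: 1 - exp(-n(n-1) / (2N))."""
    return 1 - math.exp(-num_urls * (num_urls - 1) / (2 * keyspace))


HEX_KEYSPACE = 16 ** 6  # six hex characters, as produced by encode_url_hash

for n in (1_000, 5_000, 10_000):
    p = collision_probability(n, HEX_KEYSPACE)
    print(f"{n:>6} URLs -> collision probability {p:.1%}")
```

Under these assumptions a collision becomes more likely than not after roughly five thousand URLs, far below the 100M per month this system targets, so a hash-based design needs a collision check on every insert.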

Option 2: Counter-based (Better: no collisions, though sequential codes are guessable)

import string

def encode_url_counter(counter):
    """
    Convert counter to base62 (0-9, a-z, A-Z)
    """
    chars = string.digits + string.ascii_lowercase + string.ascii_uppercase
    base = len(chars)  # 62
    
    # Without this check, 0 would encode to an empty string
    if counter == 0:
        return chars[0]
    
    result = []
    while counter > 0:
        result.append(chars[counter % base])  # Least significant digit first
        counter //= base
    
    return ''.join(reversed(result))

# Examples:
# 1 -> "1"
# 62 -> "10"
# 916132832 -> "100000"  (62^5)
# 9326731923 -> "abc123"
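The counter encoding is reversible, which can be handy: the redirect path could decode the short code back to the numeric ID and look the row up by primary key. This is a sketch of the inverse function, assuming the same alphabet order (digits, lowercase, uppercase); `base62_decode` is a name introduced here, not part of the original design.

```python
# Sketch: inverse of the counter-based base62 encoding.
import string

CHARS = string.digits + string.ascii_lowercase + string.ascii_uppercase


def base62_decode(short_code: str) -> int:
    """Convert a base62 string back to the integer it encodes."""
    value = 0
    for ch in short_code:
        # str.index raises ValueError for characters outside the alphabet
        value = value * 62 + CHARS.index(ch)
    return value


print(base62_decode("10"))
print(base62_decode("abc123"))
```

Because the mapping is a plain positional encoding, anyone can decode a code and guess its neighbours, which is the trade-off against the "unpredictable" requirement listed earlier.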

Database Schema:

-- URLs table
CREATE TABLE urls (
    id BIGINT PRIMARY KEY AUTO_INCREMENT,
    -- NULL until the code is derived from id; UNIQUE already creates an index
    short_code VARCHAR(7) UNIQUE,
    long_url TEXT NOT NULL,
    user_id BIGINT,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);

-- Analytics table
CREATE TABLE analytics (
    id BIGINT PRIMARY KEY AUTO_INCREMENT,
    short_code VARCHAR(7),
    clicked_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    ip_address VARCHAR(45),
    user_agent TEXT,
    country VARCHAR(2),
    INDEX idx_short_code (short_code),
    INDEX idx_clicked_at (clicked_at)
);

Implementation:

from flask import Flask, request, redirect
import redis
import mysql.connector

app = Flask(__name__)

# Redis for caching
cache = redis.Redis(host='localhost', port=6379)

# MySQL connection
db = mysql.connector.connect(
    host="localhost",
    user="user",
    password="password",
    database="urlshortener"
)

def base62_encode(num):
    chars = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"
    if num == 0:
        return chars[0]
    
    result = []
    while num:
        result.append(chars[num % 62])
        num //= 62
    
    return ''.join(reversed(result))

@app.route('/api/shorten', methods=['POST'])
def shorten_url():
    long_url = request.json['longUrl']
    
    # Get next ID from database
    cursor = db.cursor()
    cursor.execute(
        "INSERT INTO urls (long_url) VALUES (%s)",
        (long_url,)
    )
    db.commit()
    
    url_id = cursor.lastrowid
    short_code = base62_encode(url_id)
    
    # Update short_code
    cursor.execute(
        "UPDATE urls SET short_code = %s WHERE id = %s",
        (short_code, url_id)
    )
    db.commit()
    
    # Cache it
    cache.setex(short_code, 86400, long_url)  # 24h TTL
    
    return {
        'shortUrl': f'https://short.ly/{short_code}',
        'longUrl': long_url
    }

@app.route('/<short_code>')
def redirect_url(short_code):
    # Check cache first
    long_url = cache.get(short_code)
    
    if not long_url:
        # Cache miss - query database
        cursor = db.cursor()
        cursor.execute(
            "SELECT long_url FROM urls WHERE short_code = %s",
            (short_code,)
        )
        result = cursor.fetchone()
        
        if not result:
            return "URL not found", 404
        
        long_url = result[0]
        
        # Update cache
        cache.setex(short_code, 86400, long_url)
    
    # Log analytics (async would be better)
    log_click(short_code, request)
    
    return redirect(long_url.decode() if isinstance(long_url, bytes) else long_url)

def log_click(short_code, request):
    cursor = db.cursor()
    cursor.execute(
        """INSERT INTO analytics 
           (short_code, ip_address, user_agent) 
           VALUES (%s, %s, %s)""",
        (short_code, request.remote_addr, request.user_agent.string)
    )
    db.commit()

Step 5: Scale and Optimize

Caching Strategy:

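# Multi-level caching
from collections import OrderedDict

import redis

class URLCache:
    def __init__(self, db_get, local_capacity=10000):
        self.local_cache = OrderedDict()  # In-memory LRU
        self.local_capacity = local_capacity
        self.redis = redis.Redis(decode_responses=True)
        self.db_get = db_get  # Callable: short_code -> long_url or None
    
    def _store_local(self, short_code, long_url):
        self.local_cache[short_code] = long_url
        self.local_cache.move_to_end(short_code)
        if len(self.local_cache) > self.local_capacity:
            self.local_cache.popitem(last=False)  # Evict least recently used
    
    def get(self, short_code):
        # Level 1: Local memory (fastest)
        if short_code in self.local_cache:
            self.local_cache.move_to_end(short_code)  # Mark as recently used
            return self.local_cache[short_code]
        
        # Level 2: Redis (fast)
        long_url = self.redis.get(short_code)
        if long_url:
            self._store_local(short_code, long_url)
            return long_url
        
        # Level 3: Database (slow)
        long_url = self.db_get(short_code)
        if long_url:
            self.redis.setex(short_code, 86400, long_url)
            self._store_local(short_code, long_url)
        
        return long_url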

Database Sharding:

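import zlib

SHARD_COUNT = 10

# Shard by short_code
def get_shard(short_code):
    # Use a stable hash. Python's built-in hash() is randomized per process
    # for strings, so different servers would disagree on the shard.
    return zlib.crc32(short_code.encode()) % SHARD_COUNT

def get_db_connection(short_code):
    shard_id = get_shard(short_code)
    return db_connections[shard_id]  # db_connections: one connection per shard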

Rate Limiting:

from flask_limiter import Limiter
from flask_limiter.util import get_remote_address

# Flask-Limiter 3.x: key_func comes first, app is a keyword argument
limiter = Limiter(
    get_remote_address,
    app=app,
    default_limits=["100 per hour"]
)

@app.route('/api/shorten', methods=['POST'])
@limiter.limit("10 per minute")
def shorten_url():
    ...  # implementation as in Step 4

Step 6: Additional Considerations

Security:

  • Validate URLs to prevent malicious links
  • Rate limiting to prevent abuse
  • HTTPS for all traffic
  • Check for spam/phishing URLs
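The first item, URL validation, can start with a few cheap structural checks before any reputation lookup. The sketch below is one possible baseline, not a complete defence: the length limit, the blocked-host set, and the function name are assumptions made for illustration, and spam or phishing detection would still need an external service.

```python
# Sketch: structural validation of a submitted long URL.
from urllib.parse import urlparse

ALLOWED_SCHEMES = {"http", "https"}
MAX_URL_LENGTH = 2048          # assumed limit, tune to your needs
BLOCKED_HOSTS = {"short.ly"}   # assumed: refuse links back to the shortener itself


def is_valid_long_url(url: str) -> bool:
    """Return True if the URL passes basic structural checks."""
    if not url or len(url) > MAX_URL_LENGTH:
        return False
    try:
        parsed = urlparse(url)
        host = parsed.hostname
    except ValueError:  # raised for malformed input such as bad IPv6 literals
        return False
    if parsed.scheme not in ALLOWED_SCHEMES:
        return False  # rejects javascript:, data:, file:, and similar schemes
    if not host or host in BLOCKED_HOSTS:
        return False
    return True


print(is_valid_long_url("https://example.com/very/long/url"))
print(is_valid_long_url("javascript:alert(1)"))
```

Rejecting the shortener's own domain prevents redirect loops, and restricting schemes keeps the redirect endpoint from being used to deliver script or local-file URLs.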

Monitoring:

  • Track cache hit rate
  • Monitor database query time
  • Alert on high error rates
  • Track redirect latency

Deployment:

  • Use CDN for static assets
  • Deploy in multiple regions
  • Blue-green deployment for zero downtime
  • Auto-scaling based on traffic

Example 2: Design Instagram

Step 1: Requirements

Functional:

  • Upload photos/videos
  • Follow other users
  • View feed of followed users' posts
  • Like and comment on posts
  • Search users

Non-Functional:

  • 500M daily active users
  • Low latency (feed loads in < 1s)
  • High availability (99.99%)
  • Eventually consistent (likes can have slight delay)

Step 2: Capacity Estimation

Traffic:

  • 500M DAU
  • Each user uploads 2 photos/day = 1B photos/day
  • Each user views 50 posts/day = 25B views/day
  • Write: 1B / 86400 = ~12K writes/sec
  • Read: 25B / 86400 = ~290K reads/sec

Storage:

  • Average photo size: 2 MB
  • 1B photos/day × 2 MB = 2 PB/day
  • For 5 years: 2 PB × 365 × 5 = 3.6 EB

Bandwidth:

  • Upload: 12K/sec × 2 MB = 24 GB/sec
  • Download: 290K/sec × 2 MB = 580 GB/sec
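The same estimation can be scripted for the Instagram numbers. The sketch uses the figures stated above and decimal units (1 PB = 10^9 MB); note that it treats every view as a full 2 MB download, which is the pessimistic assumption the bandwidth line also makes.

```python
# Sketch: back-of-envelope numbers for the Instagram example.
SECONDS_PER_DAY = 86_400

dau = 500_000_000
uploads_per_user = 2
views_per_user = 50
photo_mb = 2

photos_per_day = dau * uploads_per_user
views_per_day = dau * views_per_user

writes_per_sec = photos_per_day / SECONDS_PER_DAY
reads_per_sec = views_per_day / SECONDS_PER_DAY
storage_per_day_pb = photos_per_day * photo_mb / 1e9   # MB -> PB
five_year_storage_eb = storage_per_day_pb * 365 * 5 / 1000
download_gb_per_sec = reads_per_sec * photo_mb / 1000  # MB -> GB

print(f"writes/sec: {writes_per_sec:,.0f}, reads/sec: {reads_per_sec:,.0f}")
print(f"storage: {storage_per_day_pb:.0f} PB/day, {five_year_storage_eb:.2f} EB in 5 years")
print(f"download bandwidth: {download_gb_per_sec:,.0f} GB/sec")
```

Numbers this large are the argument for the CDN and the resized image variants in the design that follows: most views should be served from edge caches as thumbnails or medium-sized images, not as 2 MB originals.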

Step 3: High-Level Design

                  ┌──────────────┐
                  │  CDN Network │
                  └──────┬───────┘
                         │
┌─────────┐       ┌──────▼───────┐
│ Client  │──────▶│Load Balancer │
└─────────┘       └──────┬───────┘
                         │
        ┌────────────────┼────────────────┐
        ▼                ▼                ▼
┌──────────────┐  ┌──────────────┐  ┌──────────────┐
│  API Servers │  │ Image Server │  │ Feed Service │
└──────┬───────┘  └──────┬───────┘  └──────┬───────┘
       │                 │                 │
       ▼                 ▼                 ▼
┌──────────────┐  ┌──────────────┐  ┌──────────────┐
│  PostgreSQL  │  │ Object Store │  │    Redis     │
│   (Metadata) │  │   (S3/GCS)   │  │   (Cache)    │
└──────────────┘  └──────────────┘  └──────────────┘

Step 4: Detailed Design

Database Schema:

-- PostgreSQL syntax: indexes are created separately from tables

-- Users table
CREATE TABLE users (
    user_id BIGSERIAL PRIMARY KEY,
    username VARCHAR(50) UNIQUE,
    email VARCHAR(255) UNIQUE,
    profile_pic_url TEXT,
    created_at TIMESTAMP DEFAULT NOW()
);

-- Posts table
CREATE TABLE posts (
    post_id BIGSERIAL PRIMARY KEY,
    user_id BIGINT REFERENCES users(user_id),
    image_url TEXT NOT NULL,
    caption TEXT,
    created_at TIMESTAMP DEFAULT NOW()
);
CREATE INDEX idx_user_created ON posts (user_id, created_at);

-- Follows table
CREATE TABLE follows (
    follower_id BIGINT REFERENCES users(user_id),
    followee_id BIGINT REFERENCES users(user_id),
    created_at TIMESTAMP DEFAULT NOW(),
    PRIMARY KEY (follower_id, followee_id)  -- also serves lookups by follower_id
);
CREATE INDEX idx_followee ON follows (followee_id);

-- Likes table
CREATE TABLE likes (
    post_id BIGINT REFERENCES posts(post_id),
    user_id BIGINT REFERENCES users(user_id),
    created_at TIMESTAMP DEFAULT NOW(),
    PRIMARY KEY (post_id, user_id)  -- also serves lookups by post_id
);

Image Upload Flow:

from flask import Flask, request
import boto3
import uuid
from PIL import Image
import io

app = Flask(__name__)
s3_client = boto3.client('s3')

@app.route('/api/upload', methods=['POST'])
def upload_photo():
    file = request.files['photo']
    user_id = request.form['user_id']
    caption = request.form.get('caption', '')
    
    # Generate unique ID
    photo_id = str(uuid.uuid4())
    
    # Process image (JPEG cannot store an alpha channel, so convert to RGB)
    image = Image.open(file).convert('RGB')
    
    # Create multiple sizes
    sizes = {
        'original': (1080, 1080),
        'medium': (640, 640),
        'thumbnail': (150, 150)
    }
    
    urls = {}
    for size_name, dimensions in sizes.items():
        # Resize image
        resized = image.copy()
        resized.thumbnail(dimensions, Image.LANCZOS)
        
        # Convert to bytes
        buffer = io.BytesIO()
        resized.save(buffer, format='JPEG', quality=85)
        buffer.seek(0)
        
        # Upload to S3
        key = f'photos/{user_id}/{photo_id}_{size_name}.jpg'
        s3_client.upload_fileobj(
            buffer,
            'instagram-photos',
            key,
            ExtraArgs={'ContentType': 'image/jpeg'}
        )
        
        urls[size_name] = f'https://cdn.instagram.com/{key}'
    
    # Save metadata to database (db: an open PostgreSQL connection)
    cursor = db.cursor()
    cursor.execute(
        """INSERT INTO posts (user_id, image_url, caption, created_at) 
           VALUES (%s, %s, %s, NOW())
           RETURNING post_id""",
        (user_id, urls['original'], caption)
    )
    post_id = cursor.fetchone()[0]  # lastrowid is not reliable on PostgreSQL
    db.commit()
    
    # Fan-out: Push to followers' feeds (async)
    fanout_to_followers(user_id, post_id)
    
    return {
        'post_id': post_id,
        'urls': urls
    }

Feed Generation:

Two approaches:

1. Pull Model (Fetch on demand):

def get_feed_pull(user_id, page=1, page_size=20):
    """
    Fetch posts from followed users on demand
    Good for: Feeds that include accounts with many followers,
    since nothing is written per follower at post time
    """
    offset = (page - 1) * page_size
    
    cursor = db.cursor()
    cursor.execute("""
        SELECT p.post_id, p.user_id, p.image_url, p.caption, p.created_at
        FROM posts p
        JOIN follows f ON p.user_id = f.followee_id
        WHERE f.follower_id = %s
        ORDER BY p.created_at DESC
        LIMIT %s OFFSET %s
    """, (user_id, page_size, offset))
    
    return cursor.fetchall()

2. Push Model (Pre-compute feeds):

import time

import redis

redis_client = redis.Redis()

def fanout_to_followers(poster_id, post_id):
    """
    When user posts, push to all followers' feeds
    Good for: Users with few followers
    """
    # Get all followers
    cursor = db.cursor()
    cursor.execute(
        "SELECT follower_id FROM follows WHERE followee_id = %s",
        (poster_id,)
    )
    followers = cursor.fetchall()
    
    # Add post to each follower's feed (Redis sorted set)
    for (follower_id,) in followers:
        redis_client.zadd(
            f'feed:{follower_id}',
            {post_id: time.time()}
        )
        # Keep only latest 500 posts
        redis_client.zremrangebyrank(f'feed:{follower_id}', 0, -501)

def get_feed_push(user_id, page=1, page_size=20):
    """
    Fetch pre-computed feed from Redis
    """
    start = (page - 1) * page_size
    end = start + page_size - 1
    
    # Get post IDs from Redis sorted set (returned as bytes, newest first)
    post_ids = [
        int(post_id)
        for post_id in redis_client.zrevrange(f'feed:{user_id}', start, end)
    ]
    
    # An empty IN () clause is a SQL syntax error
    if not post_ids:
        return []
    
    # Fetch post details from database
    cursor = db.cursor()
    placeholders = ','.join(['%s'] * len(post_ids))
    cursor.execute(f"""
        SELECT post_id, user_id, image_url, caption, created_at
        FROM posts
        WHERE post_id IN ({placeholders})
        ORDER BY created_at DESC
    """, post_ids)
    
    return cursor.fetchall()

Hybrid Approach (Best):

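CELEBRITY_THRESHOLD = 10000  # followers

def get_feed_hybrid(user_id, page_size=20):
    """
    Push for posters with few followers (already in the Redis feed)
    Pull for celebrities with millions of followers (fetched at read time)
    Assumes fanout_to_followers skips posters above the threshold
    """
    # Posts pushed at write time by regular accounts
    pushed = get_feed_push(user_id, page_size=page_size)
    
    # Posts pulled on demand from followed celebrity accounts
    # (in practice, keep a follower_count column instead of counting per row)
    cursor = db.cursor()
    cursor.execute("""
        SELECT p.post_id, p.user_id, p.image_url, p.caption, p.created_at
        FROM posts p
        JOIN follows f ON p.user_id = f.followee_id
        WHERE f.follower_id = %s
          AND (SELECT COUNT(*) FROM follows c
               WHERE c.followee_id = p.user_id) >= %s
        ORDER BY p.created_at DESC
        LIMIT %s
    """, (user_id, CELEBRITY_THRESHOLD, page_size))
    pulled = cursor.fetchall()
    
    # Merge both sources, newest first (created_at is column index 4)
    merged = sorted(pushed + pulled, key=lambda row: row[4], reverse=True)
    return merged[:page_size]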

Step 5: Scale and Optimize

Database Sharding:

# Shard by user_id
def get_shard_id(user_id):
    return user_id % 100  # 100 shards

# When querying
shard_id = get_shard_id(user_id)
db = get_db_connection(shard_id)

Caching Strategy:

import json

import redis

class FeedCache:
    def __init__(self):
        self.redis = redis.Redis()
    
    def get_cached_feed(self, user_id, page):
        key = f'feed:{user_id}:page:{page}'
        cached = self.redis.get(key)
        
        if cached:
            return json.loads(cached)
        
        # Cache miss - generate feed
        # (generate_feed is assumed to wrap one of the feed functions above)
        feed = generate_feed(user_id, page)
        
        # Cache for 5 minutes (default=str serializes timestamps)
        self.redis.setex(key, 300, json.dumps(feed, default=str))
        
        return feed
    
    def invalidate_feed(self, user_id):
        """Invalidate when new post created"""
        pattern = f'feed:{user_id}:page:*'
        # SCAN iterates incrementally; KEYS would block Redis on large keyspaces
        keys = list(self.redis.scan_iter(match=pattern))
        if keys:
            self.redis.delete(*keys)

CDN for Images:

# CloudFront/Cloudflare configuration
CDN_DOMAINS = {
    'us-east': 'https://us-east.cdn.instagram.com',
    'us-west': 'https://us-west.cdn.instagram.com',
    'europe': 'https://eu.cdn.instagram.com',
    'asia': 'https://asia.cdn.instagram.com'
}

def get_image_url(image_key, user_location):
    """Serve from nearest CDN"""
    cdn_domain = CDN_DOMAINS.get(user_location, CDN_DOMAINS['us-east'])
    return f'{cdn_domain}/{image_key}'

Async Processing:

from celery import Celery

celery = Celery('tasks', broker='redis://localhost:6379')

@celery.task
def process_upload(file_path, user_id, caption):
    """Process upload asynchronously"""
    # Resize images
    # Upload to S3
    # Generate thumbnails
    # Update database
    # Fanout to followers
    pass

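@app.route('/api/upload', methods=['POST'])
def upload_photo():
    user_id = request.form['user_id']
    caption = request.form.get('caption', '')
    
    # Save file temporarily (workers must be able to read this path,
    # so use shared storage if they run on other machines)
    temp_path = save_temp_file(request.files['photo'])
    
    # Queue for processing
    task = process_upload.delay(temp_path, user_id, caption)
    
    return {'task_id': task.id, 'status': 'processing'}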

Common Design Patterns

Pattern 1: Load Balancing

Round Robin:

class RoundRobinBalancer:
    def __init__(self, servers):
        self.servers = servers
        self.index = 0
    
    def get_server(self):
        server = self.servers[self.index]
        self.index = (self.index + 1) % len(self.servers)
        return server

Consistent Hashing:

import bisect
import hashlib

class ConsistentHash:
    def __init__(self, nodes, virtual_nodes=150):
        self.virtual_nodes = virtual_nodes
        self.ring = {}
        self.sorted_keys = []  # Ring positions, kept sorted for binary search
        
        for node in nodes:
            self.add_node(node)
    
    def _hash(self, key):
        return int(hashlib.md5(key.encode()).hexdigest(), 16)
    
    def add_node(self, node):
        for i in range(self.virtual_nodes):
            hash_value = self._hash(f'{node}:{i}')
            if hash_value not in self.ring:
                bisect.insort(self.sorted_keys, hash_value)
            self.ring[hash_value] = node
    
    def get_node(self, key):
        if not self.ring:
            return None
        
        # Find closest node clockwise in O(log n) instead of sorting per lookup
        index = bisect.bisect_left(self.sorted_keys, self._hash(key))
        
        # Wrap around to first node
        if index == len(self.sorted_keys):
            index = 0
        return self.ring[self.sorted_keys[index]]
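The reason to prefer consistent hashing over a plain `hash(key) % N` is what happens when the server count changes. The sketch below measures it with a compact, function-based ring that is separate from the class above; the server names and key format are made up for the experiment.

```python
# Sketch: how many keys move when a fifth server is added?
import bisect
import hashlib


def stable_hash(value: str) -> int:
    return int(hashlib.md5(value.encode()).hexdigest(), 16)


def build_ring(nodes, virtual_nodes=150):
    """Return parallel sorted lists: ring positions and their owning nodes."""
    points = sorted(
        (stable_hash(f"{node}:{i}"), node)
        for node in nodes
        for i in range(virtual_nodes)
    )
    return [h for h, _ in points], [n for _, n in points]


def ring_lookup(ring, key):
    hashes, owners = ring
    # First position clockwise from the key's hash, wrapping at the end
    index = bisect.bisect_left(hashes, stable_hash(key)) % len(hashes)
    return owners[index]


def modulo_lookup(nodes, key):
    return nodes[stable_hash(key) % len(nodes)]


keys = [f"user:{i}" for i in range(10_000)]
old_nodes = ["server-a", "server-b", "server-c", "server-d"]
new_nodes = old_nodes + ["server-e"]

old_ring, new_ring = build_ring(old_nodes), build_ring(new_nodes)
moved_ring = sum(ring_lookup(old_ring, k) != ring_lookup(new_ring, k) for k in keys)
moved_modulo = sum(modulo_lookup(old_nodes, k) != modulo_lookup(new_nodes, k) for k in keys)

print(f"consistent hashing moved {moved_ring / len(keys):.0%} of keys")
print(f"modulo hashing moved     {moved_modulo / len(keys):.0%} of keys")
```

With consistent hashing only the keys claimed by the new server move, roughly one fifth here, while modulo hashing reshuffles most of the keyspace. For a cache tier, that difference decides whether adding a node causes a small dip in hit rate or a near-total cache flush.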

Pattern 2: Caching

Cache-Aside Pattern:

def get_user(user_id):
    # Try cache first
    user = cache.get(f'user:{user_id}')
    
    if user:
        return user
    
    # Cache miss - fetch from database
    # Parameterized query: never interpolate values into SQL strings
    user = db.query('SELECT * FROM users WHERE id = %s', (user_id,))
    
    # Update cache
    cache.set(f'user:{user_id}', user, ttl=3600)
    
    return user

Write-Through Cache:

def update_user(user_id, data):
    # Update database
    db.update('UPDATE users SET ... WHERE id = %s', (user_id,))
    
    # Update cache simultaneously
    cache.set(f'user:{user_id}', data, ttl=3600)

Write-Behind Cache:

def update_user(user_id, data):
    # Update cache immediately
    cache.set(f'user:{user_id}', data, ttl=3600)
    
    # Queue database update asynchronously
    queue.push('db_updates', {'user_id': user_id, 'data': data})

Pattern 3: Rate Limiting

Token Bucket Algorithm:

import time

class TokenBucket:
    def __init__(self, capacity, refill_rate):
        self.capacity = capacity
        self.tokens = capacity
        self.refill_rate = refill_rate  # tokens per second
        self.last_refill = time.time()
    
    def allow_request(self):
        # Refill tokens
        now = time.time()
        elapsed = now - self.last_refill
        self.tokens = min(
            self.capacity,
            self.tokens + elapsed * self.refill_rate
        )
        self.last_refill = now
        
        # Check if request allowed
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

# Usage
bucket = TokenBucket(capacity=100, refill_rate=10)  # 10 req/sec

@app.route('/api/endpoint')
def endpoint():
    if not bucket.allow_request():
        return {'error': 'Rate limit exceeded'}, 429
    
    # Process request
    return {'success': True}
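It helps to see how `capacity` and `refill_rate` interact: capacity bounds the burst, the refill rate bounds the sustained throughput. The sketch below reuses the same algorithm but takes the clock as a parameter so the behaviour can be stepped through deterministically. `FakeClock` and `ClockedTokenBucket` are names introduced for this illustration.

```python
# Sketch: token bucket with an injectable clock for deterministic testing.
class FakeClock:
    def __init__(self):
        self.now = 0.0

    def __call__(self):
        return self.now

    def advance(self, seconds):
        self.now += seconds


class ClockedTokenBucket:
    def __init__(self, capacity, refill_rate, clock):
        self.capacity = capacity
        self.tokens = float(capacity)
        self.refill_rate = refill_rate  # tokens per second
        self.clock = clock
        self.last_refill = clock()

    def allow_request(self):
        now = self.clock()
        elapsed = now - self.last_refill
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_rate)
        self.last_refill = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


clock = FakeClock()
bucket = ClockedTokenBucket(capacity=5, refill_rate=2, clock=clock)

burst = [bucket.allow_request() for _ in range(7)]       # 7 requests at t=0
clock.advance(1.0)                                       # one second passes
after_wait = [bucket.allow_request() for _ in range(3)]  # 3 more requests

print(burst)
print(after_wait)
```

The first five requests drain the bucket and the next two are rejected; after one second at two tokens per second, exactly two more requests get through. In a real service a monotonic clock such as `time.monotonic` is a safer default than wall-clock time, which can jump backwards.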

Sliding Window:

import redis
import time
import uuid

class SlidingWindowRateLimiter:
    def __init__(self, redis_client, max_requests, window_seconds):
        self.redis = redis_client
        self.max_requests = max_requests
        self.window = window_seconds
    
    def is_allowed(self, user_id):
        key = f'rate_limit:{user_id}'
        now = time.time()
        window_start = now - self.window
        
        # Remove old entries
        self.redis.zremrangebyscore(key, 0, window_start)
        
        # Count requests in window
        # (check-then-add is not atomic; use a Lua script for strict limits)
        request_count = self.redis.zcard(key)
        
        if request_count < self.max_requests:
            # Add current request. The member must be unique: using the
            # timestamp alone would merge requests that share a timestamp
            self.redis.zadd(key, {f'{now}:{uuid.uuid4().hex}': now})
            self.redis.expire(key, self.window)
            return True
        
        return False

Pattern 4: Database Scaling

Read Replicas:

class DatabasePool:
    def __init__(self):
        self.master = connect_to_db('master')
        self.replicas = [
            connect_to_db('replica1'),
            connect_to_db('replica2'),
            connect_to_db('replica3')
        ]
        self.replica_index = 0
    
    def write(self, query):
        """All writes go to master"""
        return self.master.execute(query)
    
    def read(self, query):
        """Reads distributed across replicas"""
        replica = self.replicas[self.replica_index]
        self.replica_index = (self.replica_index + 1) % len(self.replicas)
        return replica.execute(query)

Database Sharding:

import zlib

class ShardedDatabase:
    def __init__(self, shards):
        self.shards = shards
        self.shard_count = len(shards)
    
    def get_shard(self, key):
        """Determine which shard contains the key"""
        # Stable across processes, unlike the built-in hash() for strings
        shard_id = zlib.crc32(str(key).encode()) % self.shard_count
        return self.shards[shard_id]
    
    def insert(self, key, value):
        shard = self.get_shard(key)
        # Parameterized queries prevent SQL injection
        shard.execute("INSERT INTO kv_store (k, v) VALUES (%s, %s)", (key, value))
    
    def get(self, key):
        shard = self.get_shard(key)
        return shard.execute("SELECT * FROM kv_store WHERE k = %s", (key,))

Trade-offs in System Design

Every design decision involves trade-offs. Here are the most important ones:

1. Consistency vs Availability (CAP Theorem)

Strong Consistency        Eventual Consistency
      ↓                          ↓
All nodes see same       Nodes may temporarily
data at same time        have different data
      ↓                          ↓
Lower availability       Higher availability
Slower writes           Faster writes
      ↓                          ↓
Use for: Banking,        Use for: Social media,
inventory, bookings      likes, views, comments

Example:

# Strong consistency (single-database ACID transaction)
def transfer_money(from_account, to_account, amount):
    # Transaction ensures both succeed or both fail
    with db.transaction():
        db.deduct(from_account, amount)
        db.add(to_account, amount)
    # If any fails, both rollback

# Eventual consistency
def like_post(post_id, user_id):
    # Update multiple caches independently
    cache1.increment(f'likes:{post_id}')
    cache2.add_to_set(f'likers:{post_id}', user_id)
    # May be temporarily inconsistent
    queue.push('sync_likes', {'post_id': post_id, 'user_id': user_id})

2. SQL vs NoSQL

SQL (PostgreSQL, MySQL)      NoSQL (MongoDB, Cassandra)
          ↓                              ↓
Structured schema              Flexible schema
ACID transactions             Often eventually consistent
Complex queries (JOINs)       Simple key-based queries
Scales up first (vertical)    Scales out (horizontal)
          ↓                              ↓
Use for: Financial,           Use for: Logs, analytics,
user data, orders             time-series, sessions

3. Synchronous vs Asynchronous Processing

# Synchronous (blocking)
@app.route('/upload')
def upload_file():
    file = request.files['file']
    
    # User waits for all of this
    resize_image(file)       # 2 seconds
    upload_to_s3(file)       # 3 seconds
    generate_thumbnail(file) # 2 seconds
    update_database(file)    # 1 second
    
    return {'success': True}  # 8 seconds total

# Asynchronous (non-blocking)
@app.route('/upload')
def upload_file():
    file = request.files['file']
    
    # Queue for background processing
    task_id = queue.enqueue(process_file, file)
    
    return {'task_id': task_id}  # Returns immediately

# Background worker
def process_file(file):
    resize_image(file)
    upload_to_s3(file)
    generate_thumbnail(file)
    update_database(file)
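The asynchronous variant can be tried without any external broker. This sketch uses only the standard library (a `queue.Queue` and one worker thread) as a stand-in for a real task queue; `handle_upload`, the in-memory `results` store, and the 50 ms sleep that simulates the processing steps are all assumptions made for the demonstration.

```python
# Sketch: request handler returns immediately, a worker does the slow part.
import queue
import threading
import time
import uuid

task_queue = queue.Queue()
results = {}  # task_id -> result; a real system would use a database or cache


def process_file(name):
    time.sleep(0.05)  # stand-in for resize, upload, thumbnail, database update
    return f"processed:{name}"


def worker():
    while True:
        task_id, name = task_queue.get()
        results[task_id] = process_file(name)
        task_queue.task_done()


threading.Thread(target=worker, daemon=True).start()


def handle_upload(name):
    """What the request handler does: enqueue and return right away."""
    task_id = str(uuid.uuid4())
    task_queue.put((task_id, name))
    return {"task_id": task_id, "status": "processing"}


response = handle_upload("photo.jpg")
print(response["status"])  # the client gets this before processing finishes

task_queue.join()  # wait here only so the demo can show the result
print(results[response["task_id"]])
```

The handler passes a file name, not the uploaded file object: work handed to a background worker has to be something the worker can load on its own, which is why production designs enqueue a storage path or object key.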

4. Normalization vs Denormalization

-- Normalized (avoid duplication)
CREATE TABLE users (
    id INT PRIMARY KEY,
    username VARCHAR(50)
);

CREATE TABLE posts (
    id INT PRIMARY KEY,
    user_id INT REFERENCES users(id),
    content TEXT
);

-- Need JOIN to get username with post
SELECT p.content, u.username
FROM posts p
JOIN users u ON p.user_id = u.id;

-- Denormalized (duplicate data for speed)
CREATE TABLE posts (
    id INT PRIMARY KEY,
    user_id INT,
    username VARCHAR(50),  -- Duplicated!
    content TEXT
);

-- No JOIN needed
SELECT content, username FROM posts;
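The cost of denormalization shows up on writes. The sketch below uses an in-memory SQLite database (chosen only because it ships with Python) and compares how many rows a username change touches in each layout; the table names are adapted so both versions can coexist.

```python
# Sketch: rows touched by a username change, normalized vs denormalized.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, username TEXT);
    CREATE TABLE posts_normalized (
        id INTEGER PRIMARY KEY, user_id INTEGER, content TEXT);
    CREATE TABLE posts_denormalized (
        id INTEGER PRIMARY KEY, user_id INTEGER, username TEXT, content TEXT);
""")

conn.execute("INSERT INTO users VALUES (1, 'alice')")
for post_id in range(1, 4):
    content = f"post {post_id}"
    conn.execute("INSERT INTO posts_normalized VALUES (?, 1, ?)", (post_id, content))
    conn.execute("INSERT INTO posts_denormalized VALUES (?, 1, 'alice', ?)", (post_id, content))

# Normalized: the name lives in one place
normalized_rows = conn.execute(
    "UPDATE users SET username = 'alice2' WHERE id = 1").rowcount

# Denormalized: every copy of the name must be rewritten as well
denormalized_rows = conn.execute(
    "UPDATE posts_denormalized SET username = 'alice2' WHERE user_id = 1").rowcount

print(f"normalized update touched {normalized_rows} row(s)")
print(f"denormalized update touched {denormalized_rows} additional row(s)")
```

With three posts the difference is trivial; with millions of posts per user it becomes a background job, and readers may see the old name until it completes. Denormalize the fields that are read often and changed rarely.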

Interview Tips

1. Start with Clarifying Questions

Don't jump into design immediately. Ask:

Functional Requirements:
- What features are most important?
- What's the expected user flow?
- Are there any features we can defer?

Scale Requirements:
- How many users?
- How many requests per second?
- How much data storage needed?
- Read vs write ratio?

Performance Requirements:
- What's acceptable latency?
- Uptime requirements?
- Consistency vs availability priority?

2. Think Out Loud

Interviewers want to understand your thought process:

❌ Bad: "I'll use Redis for caching"

✅ Good: "We have a read-heavy workload with 100:1 read-write 
ratio, so caching will significantly reduce database load. 
Redis is a good fit because it's in-memory, supports TTL, 
and can handle millions of ops/sec. We could also consider 
Memcached, but Redis gives us more data structures if we 
need them later."

3. Discuss Trade-offs

Every decision has pros and cons:

"We could use a microservices architecture which gives us:
+ Independent scaling of components
+ Team autonomy
+ Technology flexibility

But also:
- Increased operational complexity
- Network latency between services
- Distributed system challenges

Given our team size and requirements, I'd recommend 
starting with a monolith and splitting into services 
as we identify bottlenecks."

4. Draw Diagrams

Visual communication is crucial:

Components:
┌─────────┐  = Service/Server
│         │

├─────────┤  = Database
│         │

[ ]          = Queue

──▶          = Data flow

┄┄▶          = Async flow

5. Know Your Numbers

Memorize these latencies:

L1 cache:                               0.5 ns
L2 cache:                               7 ns
RAM:                                    100 ns
SSD random read:                        150 μs
Network round trip (same DC):           500 μs
HDD seek:                               10 ms
Network round trip (intercontinental):  150 ms

1 million requests/day = ~12 requests/second
1 billion requests/day = ~12K requests/second

Practice Problems

Beginner Level

  1. Design a Pastebin (Like pastebin.com)

    • Store and retrieve text snippets
    • Generate unique URLs
    • Set expiration times

  2. Design a Key-Value Store (Like Redis)

    • GET/SET operations
    • TTL support
    • Persistence

  3. Design a Web Crawler

    • Crawl websites systematically
    • Avoid duplicates
    • Respect robots.txt

Intermediate Level

  1. Design Twitter

    • Post tweets
    • Follow users
    • View timeline
    • Trending topics

  2. Design Uber

    • Match riders with drivers
    • Real-time location tracking
    • Fare calculation
    • ETA estimation

  3. Design Netflix

    • Video streaming
    • Recommendations
    • Search
    • Continue watching

Advanced Level

  1. Design WhatsApp

    • Real-time messaging
    • Group chats
    • Message delivery confirmation
    • End-to-end encryption

  2. Design Google Search

    • Web indexing
    • Query processing
    • Ranking algorithm
    • Auto-complete

  3. Design YouTube

    • Video upload and processing
    • Streaming at scale
    • Comments and recommendations
    • Live streaming

Resources for Learning

Books

  • "Designing Data-Intensive Applications" by Martin Kleppmann (Best overall)
  • "System Design Interview" by Alex Xu (Interview-focused)
  • "Database Internals" by Alex Petrov (Deep dive)

Online Courses

  • Grokking the System Design Interview (educative.io)
  • System Design Primer (GitHub - free)
  • Scalability Lectures (Harvard CS75)

Practice Platforms

  • Pramp - Mock interviews with peers
  • Interviewing.io - Practice with engineers
  • LeetCode - System design section

YouTube Channels

  • Gaurav Sen - System design concepts
  • Tech Dummies - Design deep dives
  • ByteByteGo - Visual explanations

Blogs

  • High Scalability (highscalability.com)
  • Engineering blogs from Netflix, Uber, and Airbnb
  • Martin Fowler's blog (martinfowler.com)

Conclusion

System design is not about memorizing solutions; it's about understanding principles and applying them to solve problems. The best system designers:

  • Ask clarifying questions before designing
  • Understand trade-offs and explain them clearly
  • Start simple and iterate
  • Consider non-functional requirements (scale, reliability, performance)
  • Think about operations and monitoring
  • Know when to optimize and when "good enough" is fine

Key Takeaways

  1. There's no perfect design - Every solution has trade-offs
  2. Start simple - Add complexity only when needed
  3. Think in terms of components - Break problems into manageable pieces
  4. Consider the entire lifecycle - Deployment, monitoring, maintenance matter
  5. Practice regularly - System design is a skill that improves with practice

Next Steps

  1. Study one system deeply this week - Pick URL shortener or Pastebin
  2. Design something you use daily - How would you build Twitter, Instagram?
  3. Join a study group - Discuss designs with peers
  4. Read engineering blogs - Learn from production systems
  5. Do mock interviews - Practice explaining your designs

Remember: Companies hire you to build systems that solve real problems for real users. System design skills make you valuable because you can think beyond individual features and see the bigger picture.

Start practicing today. Pick a system, work through the framework, and you'll be surprised how quickly you improve.