NGINX API Rate Limiting

Written by Ashnik Team

| Jun 03, 2025

3 min read

NGINX API Rate Limiting: Powerful DDoS Defense Guide

Launch-Day Horror Story

08:59 a.m.—Your new mobile app goes live. Legitimate traffic climbs exactly as marketing predicted… then a botnet hammers /v1/orders at 2 million RPS. Kubernetes nodes gasp, dashboards bleed red, log-ins crawl.

Goal: show how NGINX API rate limiting becomes a programmable safety valve—throttling abusers while real users stay blazing fast.

Why Rate Limiting Beats “Just Auto-Scale”

Myth: “Auto-scale will save me.”
Reality: Scaling costs $$$ and never stops credential stuffing. See the Kubernetes HPA docs.

Myth: “A WAF is enough.”
Reality: A WAF blocks known attack patterns, not runaway legitimate traffic. The OWASP API Security Top 10 lists inadequate rate limiting as a primary threat.

Myth: “Put a CDN in front.”
Reality: CDNs soak up L3/4 floods, but origin APIs still need traffic shaping. Cloudflare’s DDoS primer explains this “last-mile” gap.
Business takeaway: every 429 returned to a greedy client is a 503 avoided for a paying customer, and brand equity preserved.

Four Design Principles Before You Touch nginx.conf

  1. Profile, then police — baseline real RPS per tenant; limits without context punish innocents.
  2. Layer the nets — CDN ➜ Edge NGINX ➜ Gateway NGINX; each tier catches a class of abuse.
  3. Make limits elastic — pipe Grafana alerts into Ansible to tune rates hourly; traffic is never static.
  4. Log everything — $limit_req_status and $limit_conn_status feed Grafana dashboards and post-incident forensics.
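Principle 4 in practice: a log format along these lines records the rate-limit verdict next to each request. This is an illustrative sketch; `throttle` is just a name chosen here, while `$limit_req_status` and `$limit_conn_status` are built-in NGINX variables that report values such as PASSED, DELAYED, or REJECTED.

```nginx
# Illustrative access-log format that surfaces the rate-limit verdict.
log_format throttle '$remote_addr [$time_local] "$request" $status '
                    'req_limit=$limit_req_status conn_limit=$limit_conn_status';

access_log /var/log/nginx/throttle.log throttle;
```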

Implementation Recipes

  1. Per-IP Burst Buffer

    nginx

    limit_req_zone $binary_remote_addr zone=ip:10m rate=10r/s;  # 10 MiB ≈ 160 K IP counters
    limit_req_status 429;  # RFC 6585-compliant "Too Many Requests"

    server {
        listen 443 ssl http2;
        server_name api.example.com;

        location / {
            limit_req zone=ip burst=20 nodelay;
            proxy_pass http://apps;
        }
    }

    Set burst = 2 × average RPS so legitimate clients aren’t punished for momentary jitter.
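The burst math is easier to see in a toy model. The sketch below is a simulation of the same leaky-bucket arithmetic, not NGINX's actual code; `makeBucket` is an illustrative name.

```javascript
// Toy leaky bucket: "excess" drains at the configured rate, each request adds
// one unit, and anything that would push excess past `burst` is rejected.
function makeBucket(ratePerSec, burst) {
  let excess = 0; // requests queued above the steady rate
  let last = 0;   // timestamp (ms) of the previous request
  return function allow(nowMs) {
    excess = Math.max(0, excess - ((nowMs - last) / 1000) * ratePerSec);
    last = nowMs;
    if (excess + 1 > burst) return false; // would exceed burst => HTTP 429
    excess += 1;
    return true;
  };
}

// rate=10r/s, burst=20: a 25-request spike at t=0 gets 20 through,
// and two seconds of drain restores full capacity.
const allow = makeBucket(10, 20);
```

With `nodelay`, the accepted burst requests are forwarded immediately instead of being paced out at `rate`.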

  2. Tenant-Aware Limits with JWT

    (Requires the dynamic ngx_http_auth_jwt_module or NGINX Plus.)

    nginx

    js_import authutils.js;
    js_set $jwt_claim_sub authutils.jwt_sub;  # declared at http level

    limit_req_zone $jwt_claim_sub zone=tenant:20m rate=100r/s;

    server {
        location /v2/ {
            auth_jwt "API Gateway";
            auth_jwt_key_file /etc/nginx/jwt_public.pem;

            limit_req zone=tenant burst=200;
            proxy_pass http://apps_v2;
        }
    }

    Stops a single enterprise tenant from hogging all shared microservices.
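The imported module could look like this minimal sketch. `authutils.js` and `jwt_sub` are the names assumed in the config above; since `auth_jwt` has already verified the signature, the payload can be decoded without re-validating it.

```javascript
// authutils.js (hypothetical) - extracts the "sub" claim for js_set.
function jwt_sub(r) {
  try {
    var payload = (r.headersIn.Authorization || '')
      .replace('Bearer ', '')
      .split('.')[1];
    // base64url -> base64, then decode the JSON claims section
    var json = Buffer.from(
      payload.replace(/-/g, '+').replace(/_/g, '/'), 'base64').toString();
    return JSON.parse(json).sub || '';
  } catch (e) {
    return ''; // empty key: request falls outside the tenant zone
  }
}

// export default { jwt_sub };  // uncomment when loading as an njs module
```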

  3. Connection-Exhaustion Shield

    nginx

    limit_conn_zone $binary_remote_addr zone=conn:10m;

    server {
        listen 443 ssl;

        # One open connection per IP protects against slow-loris
        # (connection-starvation) attacks.
        limit_conn conn 1;
    }
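`limit_conn` pairs well with aggressive timeouts, since slow-loris clients hold connections open by trickling bytes. A common companion snippet (values here are illustrative):

```nginx
# Drop clients that trickle request headers or bodies.
client_header_timeout 10s;
client_body_timeout   10s;
send_timeout          10s;
# Reclaim sockets of timed-out clients immediately.
reset_timedout_connection on;
```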

  4. Sliding-Window Algorithm Bonus

    Leaky-bucket can feel blunt. Combine njs with keyval to maintain a rolling 60-second window—smoother throttling and fewer false positives. Full code lives in the official F5 / NGINX njs rate-limiting guide.
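The rolling-window idea can be sketched in plain JavaScript. This is an in-memory toy for a single process; the njs version in the guide persists counters in a keyval zone shared across workers.

```javascript
// Toy sliding window: remember each hit's timestamp per key and count only
// the hits that fall inside the last `windowMs` milliseconds.
class SlidingWindow {
  constructor(limit, windowMs) {
    this.limit = limit;
    this.windowMs = windowMs;
    this.hits = new Map(); // key -> array of hit timestamps (ms)
  }
  allow(key, nowMs) {
    const cutoff = nowMs - this.windowMs;
    const recent = (this.hits.get(key) || []).filter(t => t > cutoff);
    if (recent.length >= this.limit) {
      this.hits.set(key, recent);
      return false; // window full => HTTP 429
    }
    recent.push(nowMs);
    this.hits.set(key, recent);
    return true;
  }
}
```

Unlike the leaky bucket, a request rejected at second 59 becomes eligible the moment the oldest hit ages out of the window, which is what smooths out false positives.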

Incident Runbook & Pitfalls

Runbook

T-0 min: Flip the CDN to aggressive bot mode; verify Anycast health.
T+2 min: Apply an emergency limit_conn conn 1 at the edge; confirm the drop in open connections.
T+5 min: Increase the /auth/refresh burst to avoid locking out users.
T+10 min: Review $limit_req_status spikes; raise tenant caps by 20%.
Rollback: Remove the emergency limits via a tagged Ansible play once load normalizes.
Post-mortem: Compare p95 latency before/after limits; codify the new baseline.

Common Pitfalls & Rapid Fixes

Pitfall: Forgot limit_conn_zone, so the limit_conn setting is silently ignored.
Fix: Define the zone before the server block; see the NGINX limit_conn docs.

Pitfall: Same limit for every endpoint.
Fix: Exclude /auth/* and /health paths.

Pitfall: Counters lost on redeploy.
Fix: Use Zone Sync (NGINX Plus) or a Redis-backed njs store.

Pitfall: Static limits in dynamic traffic.
Fix: Auto-tune via Grafana alerts feeding Ansible.

Conclusion — Rate Limiting Is a Business Strategy, Not Just a Config

Every throttled request signals that you value customer experience over raw traffic volume and predictable revenue over unpredictable scale costs. When NGINX enforces fair-use policies in microseconds, your platform gains:

  1. Resilience by Design — bots and viral spikes become load-balanced opportunities instead of outage headlines.
  2. Cost Discipline — you spend on intentional capacity, not firefighting CPU thrash.
  3. Data-Driven Trust — transparent 429 responses with “Retry-After” build developer confidence.

Rate limiting is the safety valve that lets innovation scale without blowing the gasket.

Ready for a blueprint tuned to your exact traffic patterns?

Book a 30-minute Application Delivery Diagnostic — let’s engineer a zero-downtime, always-fair API gateway together.

