Anchal Rawat

Building a Rate Limiter Middleware in Go

January 1, 2026
3 min read

In the world of microservices and public-facing APIs, traffic is a double-edged sword. While growth is the goal, an unexpected surge—whether from a viral moment, a misconfigured client, or a malicious bot—can bring your entire infrastructure to its knees.

To build resilient systems, you need a gatekeeper. In this post, I’ll walk through my implementation of a thread-safe, memory-efficient rate limiter middleware in Go, designed to protect your services without adding external dependencies.

The Strategy: Token Bucket vs. Fixed Window

While many beginners start with a “Fixed Window” algorithm (resetting counters every minute), that approach allows “bursting” at the edges of the window: a client can spend a full quota just before the reset and another full quota just after it, briefly doubling the effective rate.

For this project, I used the Token Bucket algorithm via Go’s x/time/rate package.

  • The Concept: Each user has a “bucket” of tokens. Every request consumes one. Tokens refill at a steady rate.
  • The Benefit: It allows for occasional bursts of traffic while maintaining a strict long-term average, making it much “smoother” for real-world API usage.

The Architecture: Handling Scale and Concurrency

A production-grade rate limiter has three main requirements:

  1. Identification: Distinguishing between different users (IP-based).
  2. Thread Safety: Handling thousands of concurrent requests without race conditions.
  3. Memory Management: Cleaning up data for inactive users to prevent memory leaks.

1. Tracking Visitors

We define a Visitor to store both the limiter and the last time they were seen. This allows us to track “stale” visitors who haven’t made a request recently.

type Visitor struct {
    limiter  *rate.Limiter
    lastSeen time.Time
}

var (
    visitors = make(map[string]*Visitor)
    mu       sync.Mutex
)

2. Thread-Safe Retrieval

In Go, maps are not thread-safe. Since our middleware will be hit by multiple goroutines simultaneously, we use sync.Mutex to lock the map during reads and writes.

func getVisitor(ip string) *rate.Limiter {
    mu.Lock()
    defer mu.Unlock()

    v, exists := visitors[ip]
    if !exists {
        // 1 request per second, with a burst capacity of 5
        limiter := rate.NewLimiter(1, 5)
        visitors[ip] = &Visitor{limiter, time.Now()}
        return limiter
    }

    v.lastSeen = time.Now()
    return v.limiter
}

3. The “Silent Killer”: Memory Leaks

If we never delete visitors from our map, a simple IP-spoofing attack could fill our RAM until the application crashes (OOM). To solve this, I implemented a background cleanup routine (the “Janitor”):

func init() {
    go cleanupVisitors()
}

func cleanupVisitors() {
    for {
        time.Sleep(time.Minute)

        mu.Lock()
        for ip, v := range visitors {
            if time.Since(v.lastSeen) > 3*time.Minute {
                delete(visitors, ip)
            }
        }
        mu.Unlock()
    }
}

This routine runs in its own goroutine, waking up every minute to evict anyone inactive for over 3 minutes.

Implementing the Middleware Pattern

The beauty of Go’s net/http package is the middleware pattern. We wrap our business logic in a function that checks the rate limit before ever reaching the expensive database queries or logic.

func limitMiddleware(next http.Handler) http.Handler {
    return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
        ip, _, err := net.SplitHostPort(r.RemoteAddr)
        if err != nil {
            http.Error(w, "Internal Server Error", http.StatusInternalServerError)
            return
        }

        limiter := getVisitor(ip)
        if !limiter.Allow() {
            http.Error(w, "Too Many Requests", http.StatusTooManyRequests)
            return
        }

        next.ServeHTTP(w, r)
    })
}

Production Considerations

While this implementation is “production-ready” for single-instance services, there are two things to keep in mind for massive scale:

  1. The Proxy Problem: In production, your app is likely behind Nginx or a Load Balancer. r.RemoteAddr might give you the IP of the proxy instead of the user. You should check the X-Forwarded-For header in those cases.
  2. Distributed Systems: If you run 5 instances of your API, each will have its own local memory map. A user could technically hit the limit on Instance A and then immediately jump to Instance B. For global rate limiting, you would swap the internal map for a Redis store.

Conclusion

Building your own middleware is one of the best ways to master Go’s concurrency primitives (Mutex, Goroutines, Maps). It forces you to think about edge cases like race conditions and memory management that “magic” libraries often hide from you.

Explore the full source code here: GitHub - rate-limiter-middleware

If you found this helpful, feel free to star the repo or reach out with questions!