June Retro: Scaling APIs for Growth

June 30, 2026 at 2:01 PM UTC · By Pocket Portfolio Team · Engineering

#api #june #retro #scaling

Problem

As an application grows, its APIs must handle more requests without degrading performance. Failing to scale can lead to slow response times, server crashes, and a poor user experience. We ran into exactly this in June, when our API usage doubled overnight after a successful marketing campaign. This retro covers the techniques we focused on to sustain that growth: load balancing, caching, and rate limiting.

Solution with Code

Load Balancing

One of the first steps in scaling is distributing traffic across multiple servers to prevent any single server from becoming a bottleneck.

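const http = require('http');
const { createProxyServer } = require('http-proxy');

const proxy = createProxyServer({});
const serverList = ['http://server1.example.com', 'http://server2.example.com'];
let currentIndex = 0;

// Without an error listener, a failed upstream request throws and can crash the process.
proxy.on('error', (err, req, res) => {
  if (!res.headersSent) res.writeHead(502, { 'Content-Type': 'text/plain' });
  res.end('Bad gateway');
});

http.createServer((req, res) => {
  // Round-robin: pick the current target, then advance the index for the next request.
  const target = serverList[currentIndex];
  currentIndex = (currentIndex + 1) % serverList.length;
  proxy.web(req, res, { target });
}).listen(8080);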
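
The rotation above always sends traffic to the next server in the list, even if that server is down. As a minimal sketch of one way to handle that, the selection logic can be pulled into a helper that skips targets marked unhealthy. The names `createRoundRobin`, `markDown`, and `markUp` are hypothetical and not part of `http-proxy`.

```javascript
// Hypothetical helper: round-robin selection that skips targets marked unhealthy.
function createRoundRobin(targets) {
  const unhealthy = new Set();
  let index = 0;

  return {
    markDown(target) { unhealthy.add(target); },
    markUp(target) { unhealthy.delete(target); },
    // Returns the next healthy target, or null if every target is marked down.
    next() {
      for (let i = 0; i < targets.length; i++) {
        const candidate = targets[index];
        index = (index + 1) % targets.length;
        if (!unhealthy.has(candidate)) return candidate;
      }
      return null;
    },
  };
}

const balancer = createRoundRobin([
  'http://server1.example.com',
  'http://server2.example.com',
]);
```

In the proxy handler, `balancer.next()` would take the place of `serverList[currentIndex]`. What marks a target down (a failed request, a periodic health check) is left open here and depends on your setup.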

Caching

Implement caching to reduce the load on your servers by storing frequently requested data.

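const express = require('express');
const apicache = require('apicache');

const app = express();
const cache = apicache.middleware;

// Cache responses for 5 minutes; registered before the routes so it applies to all of them.
app.use(cache('5 minutes'));

app.get('/data', (req, res) => {
  // Placeholder for a database query; the result is served from the cache on repeat requests.
  res.json({ data: 'This is cached data' });
});

app.listen(3000);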
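
To make the idea behind the middleware concrete, here is a minimal sketch of a time-to-live (TTL) cache with a cache-aside lookup. It is an illustration only and not how `apicache` is implemented. The names `createTtlCache` and `getOrLoad` are hypothetical, and the `now` parameter exists so that expiry can be tested without waiting.

```javascript
// Hypothetical in-memory TTL cache; `now` is injectable so expiry can be tested.
function createTtlCache(ttlMs, now = Date.now) {
  const entries = new Map(); // key -> { value, expiresAt }

  return {
    get(key) {
      const entry = entries.get(key);
      if (!entry) return undefined;
      if (now() >= entry.expiresAt) {
        entries.delete(key); // drop stale entries lazily on read
        return undefined;
      }
      return entry.value;
    },
    set(key, value) {
      entries.set(key, { value, expiresAt: now() + ttlMs });
    },
  };
}

// Cache-aside lookup: return the cached value, or load and store it on a miss.
// A loader that resolves to undefined is never cached, since undefined signals a miss.
async function getOrLoad(cache, key, loader) {
  const hit = cache.get(key);
  if (hit !== undefined) return hit;
  const value = await loader();
  cache.set(key, value);
  return value;
}
```

A route handler could call `getOrLoad(cache, req.originalUrl, () => queryDatabase())`, where `queryDatabase` stands in for whatever data access the route performs. This sketch never evicts entries that are not read again, so a production cache would also need a size bound.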

Rate Limiting

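Protect your API from abuse by limiting the number of requests each client can make within a time window. The example below limits each IP address to 100 requests every 15 minutes.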

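const express = require('express');
const rateLimit = require('express-rate-limit');

const app = express();

const limiter = rateLimit({
  windowMs: 15 * 60 * 1000, // 15 minutes
  max: 100, // limit each IP to 100 requests per windowMs
});

// Register the limiter before the routes so it applies to every request.
app.use(limiter);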
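
For intuition about what such a limit does, here is a minimal sketch of a fixed-window counter, one common way to implement it. It is not a description of the internals of `express-rate-limit`. The name `createFixedWindowLimiter` is hypothetical, and `now` is injectable so the window reset can be tested.

```javascript
// Hypothetical fixed-window limiter; `now` is injectable for testing.
function createFixedWindowLimiter({ windowMs, max }, now = Date.now) {
  const windows = new Map(); // client key -> { count, resetAt }

  // Returns true if the request is allowed, false if the client is over the limit.
  return function allow(key) {
    const t = now();
    let w = windows.get(key);
    if (!w || t >= w.resetAt) {
      // First request from this client, or the previous window has ended.
      w = { count: 0, resetAt: t + windowMs };
      windows.set(key, w);
    }
    w.count += 1;
    return w.count <= max;
  };
}

const allow = createFixedWindowLimiter({ windowMs: 15 * 60 * 1000, max: 100 });
```

A middleware built on this would call `allow(req.ip)` and respond with status 429 when it returns false. The map in this sketch grows with the number of distinct clients, so a real deployment would need to expire old entries or keep the counters in a shared store.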

Key Concepts

Horizontal Scaling: Involves adding more servers to handle increased traffic. This is achieved with load balancers that distribute requests evenly.
Caching: Temporarily stores data to reduce server load and improve response times. Common caching strategies include in-memory caching and distributed caching systems like Redis.
Rate Limiting: Prevents abuse by limiting the number of requests a user can make in a given time period. This protects your API from being overwhelmed by excessive traffic.
Monitoring and Alerts: Continuously monitor API performance with tools like Prometheus (metrics collection and alert rules) and Grafana (dashboards), so that traffic spikes or server issues raise alerts in near real time.
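
As a rough illustration of the monitoring concept, the sketch below records request durations and status codes in process, then derives an error rate and a 95th-percentile latency that an alert rule could check. It does not use the Prometheus or Grafana APIs. The names `createMetrics` and `shouldAlert` are hypothetical, and the default thresholds are placeholders, not values from our retro.

```javascript
// Hypothetical in-process metrics recorder (not the Prometheus or Grafana API).
function createMetrics() {
  const durationsMs = [];
  let errors = 0;

  return {
    record(durationMs, statusCode) {
      durationsMs.push(durationMs);
      if (statusCode >= 500) errors += 1; // count server errors only
    },
    summary() {
      const count = durationsMs.length;
      if (count === 0) return { count: 0, errorRate: 0, p95Ms: 0 };
      const sorted = [...durationsMs].sort((a, b) => a - b);
      // Nearest-rank percentile: smallest sample with at least 95% of samples at or below it.
      const p95Ms = sorted[Math.ceil(0.95 * count) - 1];
      return { count, errorRate: errors / count, p95Ms };
    },
  };
}

// Hypothetical alert rule; the default thresholds are placeholders.
function shouldAlert(summary, { maxErrorRate = 0.05, maxP95Ms = 500 } = {}) {
  return summary.errorRate > maxErrorRate || summary.p95Ms > maxP95Ms;
}
```

In practice the samples would be kept per time window and exported to a monitoring system instead of being held in an unbounded array.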

By implementing these strategies, you can prepare your APIs to handle increased demand and keep the experience smooth for your users as your application grows.