We had to scale our REST API from 100 to 40,000 requests per minute. At first, we thought the challenge would be infrastructure or caching. In reality, the biggest bottleneck came from how we managed database transactions.
The Problem
Our API was built on a typical stack: Go, PostgreSQL, deployed on Cloud Run. It worked well for our initial users, but as traffic grew, we started seeing:
- Response times jumping from 200ms to 5+ seconds
- Database connection pool exhaustion
- Memory usage spiking during peak hours
- Intermittent 500 errors under load
Creating a Database Transaction on Every Request
The biggest issue was transaction handling: the app opened a new transaction for every request, which quickly overwhelmed PostgreSQL.
Part of the problem was that some engineers didn't realize how expensive and inefficient it is to create a new database connection for every request.
Secondly, creating a transaction for read-only queries was unnecessary.
Lastly, transactions were opened at the top of the request handler and held open for the entire request, including time spent waiting on external services or processing data. These long-lived transactions held locks on rows and tables, blocking other transactions and causing contention and deadlocks.
What we changed:
We removed all unnecessary transactions that didn’t involve data processing. We also shortened the lifespan of transactions, opening them only when needed and closing them as soon as the operation was complete.
Result: Database connection errors dropped to zero, and average response time improved by 60%.
Monitoring and Observability
You can't optimize what you can't measure, so we added comprehensive monitoring:
- Application metrics: Response times, error rates, throughput
- Infrastructure metrics: CPU, memory, disk I/O
- Database metrics: Connection count, query performance, lock waits
We used New Relic for application metrics and GCP's built-in monitoring for infrastructure and database metrics, since we were already on Google Cloud. Having real-time dashboards helped us spot issues before they became critical.
Results
After implementing these changes:
- Response time: P95 went from 5s to 300ms
- Error rate: Dropped from 7% to 2.5%
- Database load: 50% reduction in connection usage and CPU
Key Takeaways
- Keep transactions short: Only open a transaction when you’re about to modify data, and close it immediately after.
- Avoid transactions for read-only queries: They add overhead and can hold unnecessary locks.
- Don’t open transactions at the top of request handlers: Long-lived transactions block other queries, leading to contention and deadlocks.
- Database bottlenecks appear before infrastructure bottlenecks: Connection pooling and transaction handling usually limit scale earlier than CPU or memory.
- Measure everything: Application, infra, and database metrics are the only way to see what’s really happening in production.