Fabian Szabo

REST API: From seconds to milliseconds


Introduction

There’s a small application that a few dozen other people and I have been using for many years now. The average user is quite happy with it. However, as a professional web developer, I could not help but be bothered by the latency and aged UI.


After sunsetting a major personal project, I had some spare time to find a solution, and I am thrilled to say that I managed to make the underlying REST API 100x faster.

The legacy system

As a starting point, I dug as deep into the legacy system as it would let me. The application is essentially a black box: I cannot access or modify its source code or behavior.


What I found out is that it is based on Django REST Framework, and that the underlying database is either resource-limited or handles queries synchronously. Why do I think that? If a single request takes (for example) 1 second, sending 3 requests at the same time takes 3 seconds on average.
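This queuing behavior can be illustrated with a toy simulation. A `Mutex` stands in for a database that processes queries strictly one at a time; the 100 ms "query time" is an arbitrary stand-in for the real ~1 s, and the function names are my own:

```rust
use std::sync::{Arc, Mutex};
use std::thread;
use std::time::{Duration, Instant};

/// Toy model of the legacy backend: the Mutex represents a database
/// that handles queries strictly one at a time. Each "query" sleeps
/// for `query_ms` milliseconds (scaled down from the real ~1 s).
fn simulate_serialized_requests(n: usize, query_ms: u64) -> Duration {
    let db = Arc::new(Mutex::new(()));
    let start = Instant::now();
    let handles: Vec<_> = (0..n)
        .map(|_| {
            let db = Arc::clone(&db);
            thread::spawn(move || {
                let _guard = db.lock().unwrap(); // wait for the "database"
                thread::sleep(Duration::from_millis(query_ms)); // the "query"
            })
        })
        .collect();
    for h in handles {
        h.join().unwrap();
    }
    start.elapsed()
}

fn main() {
    // Three simultaneous requests run back-to-back: ~300 ms, not ~100 ms.
    let elapsed = simulate_serialized_requests(3, 100);
    println!("3 concurrent requests took {elapsed:?}");
}
```

Total time grows linearly with the number of concurrent requests, which matches the observed 3-requests-take-3-seconds behavior.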


I also found that there is a general overhead of 0.5 seconds for each request, regardless of the data requested.


Response times also scale roughly linearly with the size of the requested date range:

  • 1 day: 0.7s
  • 2 days: 0.9s
  • 5 days: 1.3s
  • 14 days: 2.2s
  • 30 days: 4.2s
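These measurements fit a simple linear model: the ~0.5 s fixed overhead plus roughly 0.125 s per day of requested data. The coefficients below are my own rough fit to the numbers above, not anything reported by the system:

```rust
/// Rough linear model of the legacy API's latency, fitted by eye to
/// the measurements above: a fixed ~0.5 s overhead plus ~0.125 s per
/// day of requested data. The coefficients are my own estimate.
fn estimated_latency_secs(days: u32) -> f64 {
    0.5 + 0.125 * days as f64
}

fn main() {
    // Compare the model against the measured response times.
    for (days, measured) in [(1u32, 0.7), (2, 0.9), (5, 1.3), (14, 2.2), (30, 4.2)] {
        println!(
            "{days:>2} days: model {:.2}s, measured {measured}s",
            estimated_latency_secs(days)
        );
    }
}
```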

Finding a solution

There are some constraints that I had to keep in mind:

  1. I must not accidentally overwhelm (DoS) the legacy system with requests.
  2. The legacy system must remain the single source of truth.

Implementing two-way synchronization with a slow and limited legacy API was too complex for a personal project at this point, especially given the second constraint: the legacy system remains the single source of truth, and its API will still be used by some users.


However, making reads extremely fast would massively improve the user experience, and that only requires one-way synchronization. The new REST API would be a “sidecar” that serves the data in milliseconds instead of seconds.


Implementation

Since this project was all about performance, I chose to write the application in Rust.


I had never written Rust code before, and I have to say, it’s not as difficult as I thought it would be. The book on rust-lang.org was extremely helpful in teaching me the fundamentals of the language. And the documentation of popular crates (dependencies) is fantastic! I cannot remember the last time I opened the documentation for a dependency to find out how to use it; I usually just find my answers on Stack Overflow. But docs.rs was a game-changer for me. I never thought I would say this, but it was actually enjoyable to find out how tools like axum, tokio, tower, serde, reqwest, etc. work!


The new API consists of new endpoints, a sync module, and an in-memory SQLite database. Every 30 seconds, the data is fetched from the legacy system; this slow polling rate ensures that the legacy system is not overloaded. If changes are detected, the SQLite database is cleared and refilled with the freshly fetched data. The API's read endpoints serve data from the SQLite database, while write operations are proxied to the legacy system. Writes made through the new API also trigger an immediate sync, so the updated data is fetched right away.
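The change-detection step of the sync module could look something like the sketch below. This is my own minimal illustration, not the actual implementation: it hashes the payload fetched from the legacy API and only rebuilds the cache (a `Vec` standing in for the SQLite rows) when the hash differs from the previous poll:

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Sketch of a sync module's change detection (assumed design, with
/// hypothetical names): hash the payload fetched from the legacy API
/// and only rebuild the cache when the hash differs from the last poll.
struct SyncState {
    last_hash: Option<u64>,
    cache: Vec<String>, // stand-in for the in-memory SQLite rows
}

impl SyncState {
    fn new() -> Self {
        Self { last_hash: None, cache: Vec::new() }
    }

    /// Returns true if the payload changed and the cache was rebuilt.
    fn apply_poll(&mut self, payload: &[String]) -> bool {
        let mut hasher = DefaultHasher::new();
        payload.hash(&mut hasher);
        let hash = hasher.finish();
        if self.last_hash == Some(hash) {
            return false; // nothing changed; keep serving the old cache
        }
        // Clear and refill, mirroring the "clear and refill SQLite" step.
        self.cache = payload.to_vec();
        self.last_hash = Some(hash);
        true
    }
}

fn main() {
    let mut state = SyncState::new();
    let first = vec!["event A".to_string()];
    println!("first poll rebuilt: {}", state.apply_poll(&first));
    println!("repeat poll rebuilt: {}", state.apply_poll(&first));
}
```

In the real service this would run inside a tokio task on a 30-second interval, with reqwest doing the fetch and the rows written to SQLite rather than a `Vec`.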


This solution is, in my humble opinion, simple, effective, and elegant. And the performance is incredible!


Performance

My first performance test ran a query to get the data for the next 3 months (this is the default query when opening the legacy UI).

The query is sent 10 times consecutively by a single “virtual user”:

  • Legacy System:

    • Average Request Duration: 6.76s
    • Minimum Request Duration: 5.93s
    • Maximum Request Duration: 9.35s
  • New:

    • Average Request Duration: 0.065s
    • Minimum Request Duration: 0.051s
    • Maximum Request Duration: 0.144s

The improvement for a single person accessing the data is on average 100x!


But what if multiple people try to access the data at the same time? Remember, the legacy application on average doubles the request time if 2 requests are sent simultaneously.

Sending the same query as above but this time 3 at a time (30 requests in total):

  • Legacy System:

    • Average Request Duration: 18.38s
    • Minimum Request Duration: 8.73s
    • Maximum Request Duration: 43.28s
  • New:

    • Average Request Duration: 0.058s
    • Minimum Request Duration: 0.048s
    • Maximum Request Duration: 0.137s

The new API handles requests asynchronously, and it will barely slow down until the server’s CPU hits 100% usage. Sending 3 requests at a time also tripled the average performance gain, to roughly 300x!
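The counterpart to the earlier serialized-backend simulation shows why: when requests are handled concurrently (OS threads in this toy sketch, tokio tasks in the real API), n simultaneous requests finish in roughly the time of a single one. The 100 ms "handler time" is again arbitrary:

```rust
use std::thread;
use std::time::{Duration, Instant};

/// Toy counterpart to the legacy behavior: `n` requests handled
/// concurrently (OS threads here, async tasks in the real API)
/// finish in roughly the time of a single request.
fn simulate_concurrent_requests(n: usize, handler_ms: u64) -> Duration {
    let start = Instant::now();
    let handles: Vec<_> = (0..n)
        .map(|_| thread::spawn(move || thread::sleep(Duration::from_millis(handler_ms))))
        .collect();
    for h in handles {
        h.join().unwrap();
    }
    start.elapsed()
}

fn main() {
    // Three simultaneous requests overlap: ~100 ms total, not ~300 ms.
    let elapsed = simulate_concurrent_requests(3, 100);
    println!("3 concurrent requests took {elapsed:?}");
}
```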


I stuck to realistic scenarios for the tests. With a few dozen users, it’s extremely unlikely that more than 3 users open the application at (literally) the same time. I do remember one occasion when two other people and I opened the application simultaneously; that’s the reason I added the final test results.

Conclusion

This project has been a personal journey into the Rust programming language and its ecosystem. It was a thrilling challenge to find a creative solution within the constraints of the legacy system. The experience also broadened my horizons, making me more mindful of the performance of the code I write and the applications I deploy.
