How RapidPipeline Masters Peak Loads with AWS

Case Study on Scaling Workloads on AWS

January 22, 2025

Results at a Glance

  • Servers scale automatically based on current load
  • Better maintainability through Infrastructure as Code with AWS CDK
  • Critical modules run serverless; all others scale automatically

Challenge

The company operates a proprietary 3D optimization tool called RapidPipeline, previously known as RapidCompact. The platform processes 3D models for customers via a combination of worker servers and API servers. This infrastructure faced the following challenges:

  • Peak loads: A single user can upload hundreds of 3D models simultaneously, resulting in sudden peak loads.
  • Scaling and costs: Servers must scale dynamically to ensure optimal performance at minimal cost.
  • Performance: On the customer side, waiting times should be reduced so that more requests can be processed in parallel.

Our Solution

After a detailed analysis of the existing infrastructure, the following measures were implemented:

  1. Implementation of Auto Scaling Groups: Auto Scaling Groups were set up for both the worker servers and the API servers to automatically adjust the number of servers to the current load.
  2. Serverless architecture: Critical parts of the infrastructure were moved out of the monolith and deployed as a standalone service using API Gateway and AWS Lambda. This ensures high availability and low response times.
  3. Containerization: The workload was deployed as containers on EC2 instances within Auto Scaling Groups to ensure a robust and repeatable deployment. The same approach was implemented for the API servers.
  4. Scalable API: The API servers were placed behind load balancers within Auto Scaling Groups to distribute requests dynamically and maximize scalability.
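As a hedged sketch of steps 1 and 4, an Auto Scaling Group behind a load balancer can be defined with the AWS CDK in Python. The stack name, instance type, and capacity limits below are illustrative placeholders, not the actual RapidPipeline configuration:

```python
# Illustrative AWS CDK (v2) sketch: an Auto Scaling Group behind an
# Application Load Balancer. All names and sizing values are assumptions.
from aws_cdk import App, Stack
from aws_cdk import aws_autoscaling as autoscaling
from aws_cdk import aws_ec2 as ec2
from aws_cdk import aws_elasticloadbalancingv2 as elbv2
from constructs import Construct


class WorkerStack(Stack):
    def __init__(self, scope: Construct, id: str, **kwargs) -> None:
        super().__init__(scope, id, **kwargs)

        vpc = ec2.Vpc(self, "WorkerVpc", max_azs=2)

        asg = autoscaling.AutoScalingGroup(
            self, "WorkerAsg",
            vpc=vpc,
            instance_type=ec2.InstanceType("c5.xlarge"),  # placeholder
            machine_image=ec2.MachineImage.latest_amazon_linux2(),
            min_capacity=0,   # no idle servers, no idle costs
            max_capacity=20,  # upper bound for peak loads
        )

        # Scale out when average CPU utilization rises, scale in when it drops.
        asg.scale_on_cpu_utilization("CpuScaling", target_utilization_percent=60)

        # Distribute incoming requests across the instances.
        lb = elbv2.ApplicationLoadBalancer(
            self, "ApiLb", vpc=vpc, internet_facing=True
        )
        listener = lb.add_listener("Http", port=80)
        listener.add_targets("Workers", port=80, targets=[asg])


app = App()
WorkerStack(app, "RapidPipelineWorkers")
app.synth()
```

Because the whole stack is code, every change to capacity limits or scaling policies can be reviewed like any other commit before it is deployed.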

Previous State

  • Individual API and worker servers
  • No automatic scaling based on load
  • Instances managed manually
  • Critical endpoints as part of the main application

Before the new architecture was implemented, it was difficult to accommodate varying load requirements. New servers had to be added manually, and computing power was not used efficiently.

For example, if the number of jobs in the queue grew too high, additional instances had to be started by hand via a CLI command; there was no automation for this.
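The manual process can be illustrated with a small sketch. The jobs-per-instance figure and the numbers below are hypothetical; in the old setup, an operator would effectively perform this calculation by hand and then launch the missing instances via the AWS CLI:

```python
# Hypothetical sketch of the manual capacity check an operator had to
# perform before the migration; the numbers are illustrative, not the
# real RapidPipeline thresholds.
def instances_to_launch(queued_jobs: int, running_instances: int,
                        jobs_per_instance: int = 10) -> int:
    """Return how many extra worker instances should be started by hand."""
    # Capacity the current fleet can handle.
    capacity = running_instances * jobs_per_instance
    if queued_jobs <= capacity:
        return 0
    # Round the shortfall up to whole instances (ceiling division).
    shortfall = queued_jobs - capacity
    return -(-shortfall // jobs_per_instance)


# 250 queued jobs, 10 instances handling 10 jobs each -> 15 more needed.
print(instances_to_launch(250, 10))
```

With an Auto Scaling Group, exactly this decision is made continuously and automatically instead of by a person watching the queue.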

The Result

  • API and worker servers scale automatically depending on the load
  • Server costs are only incurred when the computing power is needed
  • Deployments can be done conveniently and robustly via git push
  • Critical endpoints moved to AWS Lambda for high availability

With the introduction of Auto Scaling Groups (ASGs) and the associated configuration, it is no longer necessary to keep servers running just in case. When the load exceeds a defined threshold, additional servers are added automatically and shut down again once the load subsides. Critical parts of the infrastructure were moved out of the main application and now run serverless on AWS Lambda, so these operations are unaffected by potential downtime of the main application. As a result, costs are reduced while unexpected load peaks are absorbed.
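A minimal sketch of such a serverless endpoint, assuming a Python Lambda handler behind an API Gateway proxy integration; the `/health` route and the response payload are invented for illustration:

```python
import json

# Minimal sketch of a critical endpoint running on AWS Lambda behind
# API Gateway; the route and payload are illustrative, not the actual
# RapidPipeline API.
def handler(event, context):
    # With the Lambda proxy integration, API Gateway passes the HTTP
    # request as `event` and expects a dict in this response shape.
    path = event.get("path", "/")
    if path == "/health":
        status, body = 200, {"status": "ok"}
    else:
        status, body = 404, {"error": "not found"}
    return {
        "statusCode": status,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps(body),
    }
```

Because the function is deployed independently of the monolith, it stays reachable even when the main application is down or being redeployed.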

In addition, the entire infrastructure was migrated to the AWS CDK, so every change can go through a code review, making the system more resistant to failures.

The infrastructure is now able to adapt optimally to different load scenarios in order to keep costs low and minimize waiting times.

Interested?

Contact us and find out how we can make your cloud solutions scalable, efficient and future-proof!

© Black Bridge GmbH