Load Test Simulation ⚡

The Load Test Simulation feature is designed to rigorously test the resilience of your system design or external API endpoints under stress.

A seemingly robust system can fail under real production pressure. This feature lets you simulate thousands of concurrent users making requests, giving you deep insight into how your design or API will perform during busy hours or sudden traffic spikes.


Workload Patterns

We currently offer four distinct patterns for load testing your design, allowing you to mimic different real-world traffic scenarios.

| Pattern | Description | Key Behavior | Duration Required |
| --- | --- | --- | --- |
| Ramp | Simulates a gradual increase in user traffic over a period of time. | Requests are evenly distributed and gradually increase from the start to the end of the duration. | Yes |
| Sequential | Simulates a series of users making one request after another. | Requests are executed one at a time with a 100ms delay between them. | No |
| Burst | Simulates an unexpected, sudden traffic spike in the middle of a sustained load period. | 80% of total requests are concentrated into the middle 20% of the duration. | Yes |
| Parallel | Simulates the maximum possible simultaneous load your system can handle. | All requests are fired simultaneously at the start of the simulation (t=0). | No |
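The scheduling behavior described above can be sketched in a few lines. This is an illustrative model of how each pattern could map requests to start times, not the product's actual implementation; the function name and timing details are assumptions based on the table.

```python
def schedule(pattern: str, total_requests: int, duration_s: float = 0.0) -> list[float]:
    """Return an illustrative start time (in seconds) for each request."""
    n = total_requests
    if pattern == "parallel":
        # All requests fire simultaneously at t=0.
        return [0.0] * n
    if pattern == "sequential":
        # One request at a time, with a 100 ms delay between starts.
        return [i * 0.1 for i in range(n)]
    if pattern == "ramp":
        # Requests spread evenly from the start to the end of the duration.
        return [i * duration_s / n for i in range(n)]
    if pattern == "burst":
        # 80% of requests land in the middle 20% of the duration;
        # the remainder are spread evenly across the whole run.
        burst_n = int(n * 0.8)
        lo, hi = 0.4 * duration_s, 0.6 * duration_s
        burst = [lo + i * (hi - lo) / burst_n for i in range(burst_n)]
        rest_n = n - burst_n
        rest = [i * duration_s / rest_n for i in range(rest_n)]
        return sorted(burst + rest)
    raise ValueError(f"unknown pattern: {pattern}")
```

For example, `schedule("burst", 100, 10.0)` concentrates 80 of the 100 start times between t=4s and t=6s.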

Configuration Limits

You can configure the simulation to fit your testing needs within the following parameters:

| Parameter | Minimum | Maximum | Default (Ramp/Burst) |
| --- | --- | --- | --- |
| Total Requests | 1 | 1,000 | 20 |
| Duration (Seconds) | 1 | 300 (5 minutes) | 10 |
Note: Duration is required only for the Ramp and Burst patterns. The maximum of 1,000 total requests is a current limit; we plan to raise it based on user demand.
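The limits above are straightforward to enforce client-side before submitting a run. The following validation sketch uses assumed parameter names (`pattern`, `total_requests`, `duration_s`), not the product's actual schema.

```python
from dataclasses import dataclass
from typing import Optional

REQUEST_LIMITS = (1, 1_000)              # min/max total requests
DURATION_LIMITS = (1, 300)               # min/max duration in seconds
PATTERNS_NEEDING_DURATION = {"ramp", "burst"}

@dataclass
class LoadTestConfig:
    pattern: str
    total_requests: int = 20             # default for Ramp/Burst
    duration_s: Optional[int] = None

    def validate(self) -> None:
        lo, hi = REQUEST_LIMITS
        if not lo <= self.total_requests <= hi:
            raise ValueError(f"total_requests must be in [{lo}, {hi}]")
        if self.pattern in PATTERNS_NEEDING_DURATION:
            if self.duration_s is None:
                raise ValueError(f"{self.pattern} requires a duration")
            lo, hi = DURATION_LIMITS
            if not lo <= self.duration_s <= hi:
                raise ValueError(f"duration_s must be in [{lo}, {hi}]")
```

Note that Sequential and Parallel configs validate without a duration, matching the table in the previous section.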


Analysis and Results

After each load test run completes, you receive a comprehensive summary of your system's performance.

Metric Summary

The summary provides critical metrics for measuring the overall success and efficiency of your system:

  • Total Requests / Success Rate: The number of successful and failed requests, and the overall success percentage.
  • Requests Per Second (RPS): Also known as Throughput, this is the number of requests your system processed per second.
  • Total Time: The actual duration of the load test.
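These summary metrics are related by simple arithmetic, which a short sketch makes explicit (the field names here are illustrative, not the product's response format):

```python
def summarize(successes: int, failures: int, total_time_s: float) -> dict:
    """Derive the summary metrics from raw counts and elapsed time."""
    total = successes + failures
    return {
        "total_requests": total,
        "success_rate_pct": successes / total * 100,  # overall success percentage
        "rps": total / total_time_s,                  # throughput (requests/second)
        "total_time_s": total_time_s,                 # actual test duration
    }
```

For example, 95 successes and 5 failures over 10 seconds yields a 95% success rate and a throughput of 10 RPS.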

Latency Percentiles (P-Values)

Latency metrics are key to understanding the user experience. Instead of just an average, we provide percentiles, which show the time within which a given share of requests completed, revealing the near-worst-case performance most users actually see. All values are displayed in milliseconds (ms).

| Metric | Description |
| --- | --- |
| Min Latency | The fastest single request. |
| Max Latency | The slowest single request. |
| Avg Latency | The average time taken across all successful requests. |
| P50 Latency (Median) | 50% of all requests completed in this time or faster. |
| P95 Latency | 95% of all requests completed in this time or faster. This is a crucial benchmark for customer experience. |
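One common way to compute these percentiles is the nearest-rank method, sketched below; the product may use a different interpolation, so treat this as illustrative.

```python
import math

def percentile(latencies_ms: list[float], p: float) -> float:
    """Nearest-rank percentile: the smallest sample such that
    at least p% of all samples are <= it."""
    s = sorted(latencies_ms)
    k = max(1, math.ceil(p / 100 * len(s)))
    return s[k - 1]
```

With this definition, `percentile(samples, 50)` is the median and `percentile(samples, 0)` clamps to the minimum; Min and Max Latency are simply the first and last elements of the sorted samples.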

Time Breakdown

You can inspect the results of each individual request to understand where the time was spent:

  • Duration: The total time for the request to traverse the entire design.
  • Code Time: The time spent executing custom Python code within your API Service components.
  • Network Time (Overhead): The time spent simulating network latency and traversing connections between components. This represents the architectural and data-fetching cost, excluding your business logic code.
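The breakdown above implies that a request's duration decomposes into code time plus network overhead. A minimal sketch, assuming that decomposition holds exactly (the class and field names are illustrative):

```python
from dataclasses import dataclass

@dataclass
class RequestTiming:
    duration_ms: float    # total time through the entire design
    code_time_ms: float   # time in custom Python code (API Service components)

    @property
    def network_time_ms(self) -> float:
        # Overhead: everything that isn't business logic.
        return self.duration_ms - self.code_time_ms
```

For instance, a request with a 120 ms duration and 45 ms of code time spent 75 ms on network latency and component traversal.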