Performance Testing
The performance testing framework benchmarks the simulator across varying configurations to identify throughput characteristics and bottlenecks.
Usage
python gpu_main.py --mode performance
How It Works
The PerformanceTester class (analysis/performance.py) generates a matrix of test configurations and runs each one multiple times:
- Generate test configs -- Creates combinations of r0 values, resolutions, and layer counts
- Run each test -- Instantiates a fresh simulator per config, runs warmup, then measures num_iterations runs
- Collect metrics -- Timing, GPU memory, throughput
- Generate reports -- JSON results, PNG plots, summary text
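The per-test measurement loop described above can be sketched as follows. This is an illustrative sketch, not the repository's actual code: `make_simulator` and `sim.step()` are hypothetical stand-ins for the real simulator API.

```python
import time

def run_test(make_simulator, cfg, warmup=5, num_iterations=50):
    """One test: build a fresh simulator for this config, discard
    warmup runs, then keep per-iteration wall-clock times."""
    sim = make_simulator(**cfg)          # fresh simulator per config
    for _ in range(warmup):              # warmup runs are not measured
        sim.step()
    times = []
    for _ in range(num_iterations):      # measured runs
        t0 = time.perf_counter()
        sim.step()
        times.append(time.perf_counter() - t0)
    return times
```

Using a fresh simulator per configuration keeps tests independent, so GPU memory or cached state from one config cannot skew the next.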
Default Test Matrix
If no custom configurations are specified in config.json, the default matrix is:
r0 Sweep
- r0 values: [0.1, 0.15, 0.2, 0.3]
- Resolution: 1024x1024
- Layers: 10
Resolution Sweep
- Resolutions: [2048, 4096]
- Layers: 10
- r0: 0.15
Layer Sweep
- Layer counts: [10, 15, 20]
- Resolution: 1024x1024
- r0: 0.15
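Taken together, these defaults expand to nine individual tests. A sketch of the expansion, assuming each sweep varies one parameter while holding the others at their fixed values:

```python
def default_configs():
    """Build the default test matrix: three one-parameter sweeps
    (values copied from the defaults listed above)."""
    configs = []
    for r0 in [0.1, 0.15, 0.2, 0.3]:            # r0 sweep
        configs.append({"r0": r0, "resolution": 1024, "num_layers": 10})
    for res in [2048, 4096]:                     # resolution sweep
        configs.append({"r0": 0.15, "resolution": res, "num_layers": 10})
    for n in [10, 15, 20]:                       # layer sweep
        configs.append({"r0": 0.15, "resolution": 1024, "num_layers": n})
    return configs
```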
Custom Configuration
Define test configurations in config.json under performance_testing:
{
"performance_testing": {
"enabled": { "value": true },
"iterations_per_config": { "value": 50 },
"configurations": {
"value": [
{
"num_layers": [5, 10, 20],
"resolution": [512, 1024],
"r0_values": [0.1, 0.2]
}
]
}
}
}
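A plausible reading of a configurations entry is a Cartesian product over its parameter lists; the sketch below (field names taken from the JSON above, expansion semantics assumed) turns the example entry into 3 x 2 x 2 = 12 tests:

```python
from itertools import product

def expand(entry):
    """Expand one configurations entry into the Cartesian product
    of its parameter lists (assumed semantics)."""
    combos = product(entry["num_layers"], entry["resolution"], entry["r0_values"])
    return [{"num_layers": n, "resolution": r, "r0": v} for n, r, v in combos]
```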
Output
Results are saved to performance/ inside the simulation output directory:
performance/
test_0_layers10_res1024_r0.10/
results.json
test_1_layers10_res1024_r0.15/
results.json
...
iteration_times.png
r0_performance.png
resolution_performance.png
layer_performance.png
memory_usage.png
summary_report.txt
all_results.json
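A small post-processing helper, assuming all_results.json is a JSON list of per-test records carrying the fields listed under Result Fields:

```python
import json
from pathlib import Path

def load_summary(perf_dir):
    """Load all_results.json from the performance/ directory and
    return the tests sorted by throughput, fastest first."""
    results = json.loads(Path(perf_dir, "all_results.json").read_text())
    return sorted(results, key=lambda r: r["throughput"], reverse=True)
```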
Result Fields
Each test produces:
| Field | Description |
|---|---|
| mean_time | Average iteration time (seconds) |
| std_time | Standard deviation of iteration time |
| min_time | Fastest iteration (seconds) |
| max_time | Slowest iteration (seconds) |
| gpu_memory_peak | Peak GPU memory usage (MB) |
| throughput | Iterations per second |
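The timing fields can all be derived from the raw per-iteration samples. A minimal sketch using only the standard library (not the repository's actual code):

```python
import statistics

def summarize_times(times):
    """Derive the timing result fields from a list of
    per-iteration times in seconds."""
    mean = statistics.mean(times)
    return {
        "mean_time": mean,
        "std_time": statistics.stdev(times) if len(times) > 1 else 0.0,
        "min_time": min(times),
        "max_time": max(times),
        "throughput": 1.0 / mean,  # iterations per second
    }
```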