Performance Testing

The performance testing framework benchmarks the simulator across varying configurations to identify throughput characteristics and bottlenecks.

Usage

python gpu_main.py --mode performance

How It Works

The PerformanceTester class (analysis/performance.py) generates a matrix of test configurations and runs each one multiple times:

  1. Generate test configs -- Creates combinations of r0 values, resolutions, and layer counts
  2. Run each test -- Instantiates a fresh simulator per config, runs warmup, then measures num_iterations runs
  3. Collect metrics -- Timing, GPU memory, throughput
  4. Generate reports -- JSON results, PNG plots, summary text
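The measurement loop above can be sketched as follows. This is a hypothetical reconstruction, not the actual `PerformanceTester` internals: `DummySimulator` is a stand-in for the real simulator class, and the warmup count is an assumed default.

```python
import time
import statistics

class DummySimulator:
    """Hypothetical stand-in for the real GPU simulator."""
    def step(self):
        sum(i * i for i in range(1000))   # placeholder workload

def run_test(sim, num_iterations, num_warmup=5):
    # Warmup runs are executed but not measured, so one-time costs
    # (allocation, JIT compilation) do not skew the statistics.
    for _ in range(num_warmup):
        sim.step()
    times = []
    for _ in range(num_iterations):       # measured runs
        t0 = time.perf_counter()
        sim.step()
        times.append(time.perf_counter() - t0)
    mean_t = statistics.mean(times)
    return {
        "mean_time": mean_t,
        "std_time": statistics.stdev(times),
        "min_time": min(times),
        "max_time": max(times),
        "throughput": 1.0 / mean_t,       # iterations per second
    }

metrics = run_test(DummySimulator(), num_iterations=50)
```

A fresh simulator instance per configuration (as step 2 describes) avoids state from one test leaking into the next.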

Default Test Matrix

If no custom configurations are specified in config.json, the default matrix is:

r0 Sweep

  • r0 values: [0.1, 0.15, 0.2, 0.3]
  • Resolution: 1024x1024
  • Layers: 10

Resolution Sweep

  • Resolutions: [2048, 4096]
  • Layers: 10
  • r0: 0.15

Layer Sweep

  • Layer counts: [10, 15, 20]
  • Resolution: 1024x1024
  • r0: 0.15
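The three default sweeps can be sketched as a config-generation helper. This is a hypothetical reconstruction of the matrix described above; the real `PerformanceTester` may order entries differently or deduplicate the overlapping (r0=0.15, 1024, 10 layers) case.

```python
def default_matrix():
    configs = []
    # r0 sweep at fixed resolution and layer count
    for r0 in [0.1, 0.15, 0.2, 0.3]:
        configs.append({"r0": r0, "resolution": 1024, "num_layers": 10})
    # resolution sweep at fixed r0 and layer count
    for res in [2048, 4096]:
        configs.append({"r0": 0.15, "resolution": res, "num_layers": 10})
    # layer sweep at fixed r0 and resolution
    for layers in [10, 15, 20]:
        configs.append({"r0": 0.15, "resolution": 1024, "num_layers": layers})
    return configs

configs = default_matrix()   # 4 + 2 + 3 = 9 configurations
```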

Custom Configuration

Define test configurations in config.json under performance_testing:

{
  "performance_testing": {
    "enabled": { "value": true },
    "iterations_per_config": { "value": 50 },
    "configurations": {
      "value": [
        {
          "num_layers": [5, 10, 20],
          "resolution": [512, 1024],
          "r0_values": [0.1, 0.2]
        }
      ]
    }
  }
}
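Each entry in `configurations` lists parameter values to combine. A sketch of how such an entry could expand into individual test configs, assuming a full cross-product (the actual expansion logic in `analysis/performance.py` may differ):

```python
from itertools import product

def expand(entry):
    # Cross-product of layer counts, resolutions, and r0 values.
    return [
        {"num_layers": n, "resolution": res, "r0": r0}
        for n, res, r0 in product(
            entry["num_layers"], entry["resolution"], entry["r0_values"]
        )
    ]

entry = {
    "num_layers": [5, 10, 20],
    "resolution": [512, 1024],
    "r0_values": [0.1, 0.2],
}
configs = expand(entry)   # 3 * 2 * 2 = 12 configurations
```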

Output

Results are saved to performance/ inside the simulation output directory:

performance/
  test_0_layers10_res1024_r0.10/
    results.json
  test_1_layers10_res1024_r0.15/
    results.json
  ...
  iteration_times.png
  r0_performance.png
  resolution_performance.png
  layer_performance.png
  memory_usage.png
  summary_report.txt
  all_results.json
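For post-hoc analysis, the per-test `results.json` files can be collected with a short helper. This is a usage sketch against the layout above, not part of the framework itself; the demo writes one sample file into a temporary directory:

```python
import json
import tempfile
from pathlib import Path

def load_results(perf_dir):
    """Collect per-test results.json files under the performance/ directory."""
    results = {}
    for path in sorted(Path(perf_dir).glob("test_*/results.json")):
        # Key each result by its test directory name.
        results[path.parent.name] = json.loads(path.read_text())
    return results

# Demo against a temporary directory mimicking the layout above.
with tempfile.TemporaryDirectory() as d:
    test_dir = Path(d) / "test_0_layers10_res1024_r0.10"
    test_dir.mkdir()
    (test_dir / "results.json").write_text(json.dumps({"mean_time": 0.02}))
    results = load_results(d)
```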

Result Fields

Each test produces:

Field              Description
-----              -----------
mean_time          Average iteration time (seconds)
std_time           Standard deviation of iteration time (seconds)
min_time           Fastest iteration time (seconds)
max_time           Slowest iteration time (seconds)
gpu_memory_peak    Peak GPU memory usage (MB)
throughput         Iterations per second
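The fields satisfy some simple invariants, assuming `throughput` is the reciprocal of the mean iteration time (a reasonable reading of the definitions above, not a confirmed implementation detail). A sanity-check sketch with illustrative values:

```python
def check_result(r):
    """Sanity checks implied by the field definitions above."""
    assert r["min_time"] <= r["mean_time"] <= r["max_time"]
    assert r["std_time"] >= 0
    # Assumption: throughput (iterations/second) = 1 / mean_time
    assert abs(r["throughput"] - 1.0 / r["mean_time"]) < 1e-9

check_result({
    "mean_time": 0.025, "std_time": 0.003,
    "min_time": 0.020, "max_time": 0.031,
    "gpu_memory_peak": 512.0, "throughput": 40.0,
})
```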