Performance Testing
The performance testing framework benchmarks the simulator across varying configurations to identify throughput characteristics and bottlenecks.
Usage
python gpu_main.py --mode performance
How It Works
The PerformanceTester class (analysis/performance.py) generates a matrix of test configurations and runs each one multiple times:
- Generate test configs -- Creates combinations of r0 values, resolutions, and layer counts
- Run each test -- Instantiates a fresh simulator per config, runs warmup, then measures num_iterations runs
- Collect metrics -- Timing, GPU memory, throughput
- Generate reports -- JSON results, PNG plots, summary text
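The per-test measurement loop described above can be sketched as follows. This is an illustrative sketch, not the repository's actual code: `make_simulator` and `sim.step()` are hypothetical stand-ins for the real simulator API.

```python
import time

def run_test(make_simulator, cfg, warmup=5, num_iterations=50):
    """One test: build a fresh simulator for this config, discard
    warmup runs, then keep per-iteration wall-clock times."""
    sim = make_simulator(**cfg)          # fresh simulator per config
    for _ in range(warmup):              # warmup runs are not measured
        sim.step()
    times = []
    for _ in range(num_iterations):      # measured runs
        t0 = time.perf_counter()
        sim.step()
        times.append(time.perf_counter() - t0)
    return times
```

Using a fresh simulator per configuration keeps tests independent, so GPU memory or cached state from one config cannot skew the next.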
Default Test Matrix
If no custom configurations are specified in config.json, the default matrix is:
r0 Sweep
- r0 values: [0.1, 0.15, 0.2, 0.3]
- Resolution: 1024x1024
- Layers: 10
Resolution Sweep
- Resolutions: [2048, 4096]
- Layers: 10
- r0: 0.15
Layer Sweep
- Layer counts: [10, 15, 20]
- Resolution: 1024x1024
- r0: 0.15
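Taken together, these defaults expand to nine individual tests. A sketch of the expansion, assuming each sweep varies one parameter while holding the others at their fixed values:

```python
def default_configs():
    """Build the default test matrix: three one-parameter sweeps
    (values copied from the defaults listed above)."""
    configs = []
    for r0 in [0.1, 0.15, 0.2, 0.3]:            # r0 sweep
        configs.append({"r0": r0, "resolution": 1024, "num_layers": 10})
    for res in [2048, 4096]:                     # resolution sweep
        configs.append({"r0": 0.15, "resolution": res, "num_layers": 10})
    for n in [10, 15, 20]:                       # layer sweep
        configs.append({"r0": 0.15, "resolution": 1024, "num_layers": n})
    return configs
```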
Custom Configuration
Define test configurations in config.json under performance_testing:
{
"performance_testing": {
"enabled": { "value": true },
"iterations_per_config": { "value": 50 },
"configurations": {
"value": [
{
"num_layers": [5, 10, 20],
"resolution": [512, 1024],
"r0_values": [0.1, 0.2]
}
]
}
}
}
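A plausible reading of a configurations entry is a Cartesian product over its parameter lists; the sketch below (field names taken from the JSON above, expansion semantics assumed) turns the example entry into 3 x 2 x 2 = 12 tests:

```python
from itertools import product

def expand(entry):
    """Expand one configurations entry into the Cartesian product
    of its parameter lists (assumed semantics)."""
    combos = product(entry["num_layers"], entry["resolution"], entry["r0_values"])
    return [{"num_layers": n, "resolution": r, "r0": v} for n, r, v in combos]
```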
Output
Results are saved to performance/ inside the simulation output directory:
performance/
test_0_layers10_res1024_r0.10/
results.json
test_1_layers10_res1024_r0.15/
results.json
...
iteration_times.png
r0_performance.png
resolution_performance.png
layer_performance.png
memory_usage.png
summary_report.txt
all_results.json
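A small post-processing helper, assuming all_results.json is a JSON list of per-test records carrying the fields listed under Result Fields:

```python
import json
from pathlib import Path

def load_summary(perf_dir):
    """Load all_results.json from the performance/ directory and
    return the tests sorted by throughput, fastest first."""
    results = json.loads(Path(perf_dir, "all_results.json").read_text())
    return sorted(results, key=lambda r: r["throughput"], reverse=True)
```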
Result Fields
Each test produces:
| Field | Description |
|---|---|
| mean_time | Average iteration time (seconds) |
| std_time | Standard deviation of iteration time |
| min_time | Fastest iteration (seconds) |
| max_time | Slowest iteration (seconds) |
| gpu_memory_peak | Peak GPU memory usage (MB) |
| throughput | Iterations per second |
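The timing fields can all be derived from the raw per-iteration samples. A minimal sketch using only the standard library (not the repository's actual code):

```python
import statistics

def summarize_times(times):
    """Derive the timing result fields from a list of
    per-iteration times in seconds."""
    mean = statistics.mean(times)
    return {
        "mean_time": mean,
        "std_time": statistics.stdev(times) if len(times) > 1 else 0.0,
        "min_time": min(times),
        "max_time": max(times),
        "throughput": 1.0 / mean,  # iterations per second
    }
```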