RHDL offers six simulation backends spanning three orders of magnitude in performance. This guide covers benchmarks, backend selection, and profiling.
Backend Performance Summary
| Backend | Speed | Startup | Best For |
|---|---|---|---|
| Ruby Behavioral | Baseline | Instant | Development, debugging |
| IR Interpreter | ~60K cycles/s | Fast | Quick gate-level verification |
| IR JIT | ~200–600K cycles/s | Moderate | Medium-length simulations |
| IR Compiler (AOT) | ~1–2M cycles/s | 5–8s | Long batch simulations |
| Verilator | ~5–6M cycles/s | Compile time | Maximum throughput |
| CIRCT/MLIR (Arcilator) | Native RTL parity | Compile time | RTL benchmarking |
Benchmark Commands
MOS 6502 CPU
rake bench[mos6502] # Default: 5 million cycles
Sample results:
- Interpreter: ~60K cycles/s
- JIT: ~230K cycles/s
- Compiler: ~1.58M cycles/s (6.8x over JIT)
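To put these rates in wall-clock terms, the sketch below estimates how long the default 5-million-cycle run takes on each backend, and where the AOT compiler's startup cost pays for itself relative to the JIT. It uses only the sample figures above and the 5–8s startup from the summary table. Treating JIT startup as zero is a simplifying assumption, and the helper names are illustrative rather than part of RHDL.

```ruby
# Sample throughput from the MOS 6502 benchmark above, in cycles per second.
RATES = {
  interpreter: 60_000.0,
  jit:         230_000.0,
  compiler:    1_580_000.0
}.freeze

# Wall-clock seconds to simulate `cycles` at `rate`, plus a fixed startup cost.
def run_seconds(cycles, rate, startup: 0.0)
  startup + cycles / rate
end

# Cycle count at which a faster backend with a startup cost catches up
# with a slower backend that starts instantly (assumption: slow startup = 0).
def break_even_cycles(slow_rate, fast_rate, fast_startup)
  fast_startup / (1.0 / slow_rate - 1.0 / fast_rate)
end

cycles = 5_000_000
RATES.each do |backend, rate|
  puts format('%-12s %6.1f s', backend, run_seconds(cycles, rate))
end

[5.0, 8.0].each do |startup|
  n = break_even_cycles(RATES[:jit], RATES[:compiler], startup)
  puts format('compiler overtakes JIT after ~%.2fM cycles (startup %.0f s)', n / 1e6, startup)
end
```

With these figures the crossover falls between roughly 1.3M and 2.2M cycles, which is consistent with recommending the compiler for runs above about 1M cycles.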
Apple II Full System
rake bench[apple2] # CPU + memory + I/O
Game Boy
rake bench[gameboy] # Frame-based execution
Sample results:
- IR Compiler: ~1.27 MHz (~30% of real-time)
- Verilator: exceeds real hardware speed
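The real-time percentage follows from the Game Boy's 4.194304 MHz system clock. Here is a small sketch of the arithmetic, assuming the simulated rate is counted in the same clock cycles as that figure (the benchmark output does not say which clock it measures):

```ruby
GAMEBOY_CLOCK_HZ = 4_194_304.0 # Game Boy system clock (2**22 Hz)

# Fraction of real-time speed achieved at a given simulated clock rate.
def realtime_fraction(simulated_hz, hardware_hz = GAMEBOY_CLOCK_HZ)
  simulated_hz / hardware_hz
end

fraction = realtime_fraction(1_270_000.0)
puts format('%.0f%% of real time', fraction * 100)
# One emulated second takes 1 / fraction seconds of wall-clock time.
puts format('%.1f s of wall-clock time per emulated second', 1.0 / fraction)
```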
Backend Selection Guide
| Simulation Length | Recommended Backend |
|---|---|
| < 100K cycles | Interpreter or JIT |
| 100K – 1M cycles | JIT |
| 1M – 10M cycles | Compiler (AOT) |
| > 10M cycles | Verilator or CIRCT/MLIR |

| Use Case | Recommended Backend |
|---|---|
| Development and debugging | Ruby Behavioral |
| RSpec test suite | Ruby Behavioral |
| Gate-level verification | IR Interpreter |
| Extended batch testing | IR Compiler |
| Maximum performance | Verilator |
| Native RTL benchmarking | CIRCT/MLIR (Arcilator) |
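The length-based table can be expressed as a small lookup. This is a sketch only: `recommended_backend` is a hypothetical helper, not an RHDL API. The symbols `:interpreter`, `:jit`, and `:compiler` mirror the `backend:` values passed to `RHDL::Codegen.gate_level`, while `:verilator` is just an illustrative label for the external Verilator flow.

```ruby
# Map an expected cycle count to a backend, following the table above.
# :verilator is a hypothetical label; Verilator runs outside RHDL's
# gate_level API and is not a `backend:` option.
def recommended_backend(cycles)
  raise ArgumentError, 'cycles must be non-negative' if cycles.negative?

  if cycles < 100_000
    :interpreter # the table also allows :jit in this range
  elsif cycles < 1_000_000
    :jit
  elsif cycles <= 10_000_000
    :compiler
  else
    :verilator # or CIRCT/MLIR
  end
end

[50_000, 500_000, 5_000_000, 50_000_000].each do |n|
  puts "#{n} cycles -> #{recommended_backend(n)}"
end
```

The thresholds are rules of thumb; startup cost and design size can shift them for a particular project.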
Using Each Backend
Ruby Behavioral (Default)
component = MyDesign.new('test')
component.set_input(:a, 42)
component.propagate
IR Interpreter
sim = RHDL::Codegen.gate_level([component], backend: :interpreter)
sim.poke('a', 42)
sim.evaluate
result = sim.peek('y')
IR JIT
sim = RHDL::Codegen.gate_level([component], backend: :jit)
IR Compiler (AOT)
sim = RHDL::Codegen.gate_level([component], backend: :compiler)
Verilator
# Requires Verilator installed
rhdl export --lang verilog MyComponent
verilator --cc my_component.v --exe testbench.cpp
make -C obj_dir -f Vmy_component.mk
CIRCT/MLIR
# Requires firtool and arcilator
rhdl export --lang firrtl MyComponent
firtool my_component.fir --lowering-options=emitVerilog
arcilator my_component.mlir -o sim
Profiling Tips
Ruby Profiling
require 'benchmark'
time = Benchmark.measure do
  1000.times do
    component.propagate
  end
end
puts "1000 propagations: #{time.real}s"
Gate Count as Complexity Metric
rhdl gates --stats
Gate count correlates with simulation time: assuming roughly linear scaling, a component with 400 gates simulates about 8x slower at the gate level than one with 50 gates.
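The 8x figure is simply the ratio of the two gate counts. A minimal sketch of that estimate, assuming simulation time is proportional to gate count (a first-order approximation; actual scaling depends on the backend and the circuit's structure). The helper name is illustrative.

```ruby
# Estimated slowdown of one component relative to another,
# assuming gate-level simulation time is proportional to gate count.
def relative_slowdown(gates, baseline_gates)
  raise ArgumentError, 'baseline_gates must be positive' unless baseline_gates.positive?

  gates.to_f / baseline_gates
end

puts relative_slowdown(400, 50) # prints 8.0
```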
SIMD Lane Count
For gate-level simulation, set the SIMD lane count to control batch throughput:
RHDL_BENCH_LANES=64 rake bench[mos6502]
The default is 64 lanes. Increasing beyond 64 requires wider SIMD operations.
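To illustrate why lanes make batch testing cheap, here is a sketch of bit-parallel evaluation. Each signal is stored as a 64-bit word in which bit i holds that signal's value for test vector i, so one bitwise operation evaluates a gate for all 64 vectors at once. This is a common technique for lane-parallel gate simulation; whether RHDL's backends implement lanes exactly this way is an assumption, and the helper names are illustrative.

```ruby
LANES = 64
MASK  = (1 << LANES) - 1 # keep results within 64 lanes

# Pack an array of 0/1 values (one per test vector) into a single integer.
def pack(bits)
  bits.each_with_index.reduce(0) { |word, (bit, i)| word | (bit << i) }
end

# Extract the value of one lane from a packed word.
def lane(word, index)
  (word >> index) & 1
end

# A one-bit full adder evaluated for all lanes at once.
def full_adder(a, b, cin)
  sum  = a ^ b ^ cin
  cout = (a & b) | (cin & (a ^ b))
  [sum & MASK, cout & MASK]
end

# 64 test vectors; lane i uses the low three bits of i as (a, b, cin),
# so all eight input combinations are covered.
a   = pack(Array.new(LANES) { |i| i & 1 })
b   = pack(Array.new(LANES) { |i| (i >> 1) & 1 })
cin = pack(Array.new(LANES) { |i| (i >> 2) & 1 })

sum, cout = full_adder(a, b, cin)

# Check every lane against ordinary integer addition.
LANES.times do |i|
  expected = lane(a, i) + lane(b, i) + lane(cin, i)
  actual   = lane(sum, i) + 2 * lane(cout, i)
  raise "lane #{i} mismatch" unless actual == expected
end
puts "all #{LANES} lanes match"
```

Each gate costs the same handful of integer operations regardless of how many of the 64 lanes carry useful vectors, which is why filling all lanes raises batch throughput.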
Cycle Count
Control benchmark duration:
RHDL_BENCH_CYCLES=1000000 rake bench[mos6502]
Optimization Strategies
- Start with behavioral — get correctness first
- Switch to JIT for CI — fast enough for test suites, catches gate-level bugs
- Use AOT Compiler for regression — best throughput for long test runs
- Profile hot components — gate count reveals complexity bottlenecks
- Parallelize with SIMD — 64 test vectors for free
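When comparing backends on your own design, it helps to measure throughput the same way for each. Below is a minimal sketch of a cycles-per-second timer using Ruby's monotonic clock. `StubSim` is a hypothetical stand-in so the example runs without RHDL; in practice the block would call `component.propagate` or `sim.evaluate`.

```ruby
# Run the block `cycles` times and return throughput in cycles per second.
def cycles_per_second(cycles)
  start = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  cycles.times { yield }
  elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - start
  cycles / elapsed
end

# Hypothetical stand-in for a simulator object; not part of RHDL.
class StubSim
  attr_reader :count

  def initialize
    @count = 0
  end

  # Pretend to advance the simulation by one cycle.
  def evaluate
    @count += 1
  end
end

sim  = StubSim.new
rate = cycles_per_second(100_000) { sim.evaluate }
puts format('%.0f cycles/s', rate)
```

A monotonic clock avoids skew from system time adjustments. Running each measurement a few times and discarding the first run helps separate startup cost from steady-state throughput.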
Next Steps
- RTL Simulation — behavioral simulation details
- Gate-Level Simulation — gate-level backends
- Frontends and Backends — architecture overview