Sandbox Provider Leaderboard

Sandbox Benchmarks

A leaderboard of common benchmarks for each of our sandbox providers.

Last run: March 16, 2026

Provider Leaderboard

Rank  Provider     Composite Score
1     e2b          90.2
2     daytona      89.8
3     blaxel       88.8
4     hopx         88.1
5     runloop      83.9
6     namespace    81.6
7     vercel       79.0
8     codesandbox  75.0
9     cloudflare   74.0
10    modal        69.5

Performance Over Time

[Chart: Composite Score over time]

Detailed Metrics

#   Provider     Score  Median  P95    P99    Success
1   e2b          90.2   0.44s   0.95s  3.15s  100%
2   daytona      89.8   0.71s   1.44s  1.54s  100%
3   hopx         88.1   1.06s   1.38s  1.42s  100%
4   blaxel       88.8   1.10s   1.15s  1.17s  100%
5   runloop      83.9   1.60s   1.62s  1.65s  100%
6   namespace    81.6   1.76s   1.94s  1.98s  100%
7   vercel       79.0   2.02s   2.18s  2.27s  100%
8   modal        69.5   2.24s   4.12s  4.48s  100%
9   cloudflare   74.0   2.27s   3.02s  3.20s  100%
10  codesandbox  75.0   2.39s   2.64s  2.68s  100%

Want to see a provider added? Let us know on X.

Methodology

What We Measure

Every benchmark measures Time to Interactive (TTI) — the elapsed time from calling compute.sandbox.create() to the first successful runCommand() inside the sandbox.
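
The TTI definition above can be sketched as a small timing wrapper. This is a minimal illustration, not the benchmark's actual harness: the two callables are hypothetical stand-ins for compute.sandbox.create() and the first runCommand(), and the sketch assumes a failed command raises an exception rather than being retried.

```python
import time
from typing import Any, Callable


def measure_tti(
    create_sandbox: Callable[[], Any],
    run_command: Callable[[Any], Any],
) -> float:
    """Return the seconds elapsed from the create call to the first command result."""
    start = time.perf_counter()
    sandbox = create_sandbox()  # stand-in for compute.sandbox.create()
    run_command(sandbox)        # stand-in for the first runCommand()
    return time.perf_counter() - start


# Hypothetical stubs that simulate a provider so the sketch runs on its own.
def fake_create() -> dict:
    time.sleep(0.05)  # pretend provisioning takes 50 ms
    return {"id": "sandbox-1"}


def fake_run(sandbox: dict) -> str:
    time.sleep(0.01)  # pretend the command round trip takes 10 ms
    return "ok"


tti = measure_tti(fake_create, fake_run)
print(f"TTI: {tti:.3f}s")
```

The timer starts before the create call and stops only after the command returns, so provisioning, boot, and the first command round trip all count toward the measurement.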

Each provider is tested with 100 iterations per run. Benchmarks run automatically via GitHub Actions on a recurring schedule. All results are committed to the public benchmarks repo.

Sequential Test: Sandboxes are launched one at a time, waiting for each to become interactive before starting the next.

Staggered Test: Sandboxes are launched with a 200ms delay between consecutive launches, without waiting for earlier sandboxes to become interactive.

Burst Test: All sandboxes are launched concurrently in a single burst.
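
The three launch patterns differ only in when each launch starts. The sketch below shows one way to express them with asyncio; it is an assumption about structure, not the benchmark's real code. fake_tti is a hypothetical stand-in for "create a sandbox, run a command, return the elapsed time", and the demo uses a shortened stagger delay so it finishes quickly.

```python
import asyncio
import time
from typing import Any, Callable, Coroutine, List

Launch = Callable[[], Coroutine[Any, Any, float]]


async def fake_tti() -> float:
    """Hypothetical stand-in for one sandbox launch; returns elapsed seconds."""
    start = time.perf_counter()
    await asyncio.sleep(0.01)
    return time.perf_counter() - start


async def sequential(launch: Launch, n: int) -> List[float]:
    # Wait for each sandbox to become interactive before starting the next.
    return [await launch() for _ in range(n)]


async def staggered(launch: Launch, n: int, delay_s: float = 0.2) -> List[float]:
    # Start a new launch every delay_s seconds without waiting for earlier ones.
    tasks = []
    for i in range(n):
        if i:
            await asyncio.sleep(delay_s)
        tasks.append(asyncio.create_task(launch()))
    return list(await asyncio.gather(*tasks))


async def burst(launch: Launch, n: int) -> List[float]:
    # Start every launch at once and wait for all of them.
    return list(await asyncio.gather(*(launch() for _ in range(n))))


async def main() -> dict:
    return {
        "sequential": await sequential(fake_tti, 5),
        "staggered": await staggered(fake_tti, 5, delay_s=0.02),
        "burst": await burst(fake_tti, 5),
    }


results = asyncio.run(main())
print({name: len(times) for name, times in results.items()})
```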

How We Score

The Composite Score is a weighted blend of timing metrics multiplied by the success rate. Each metric is scored against a fixed 10-second ceiling: 100 × (1 − value / 10,000ms), so a 200ms median scores 98 and anything ≥10s scores 0.

The weighted timing score is then multiplied by the success rate (0–1), so providers that fail frequently are penalized proportionally.

  • Median: 60% — primary signal for typical experience
  • P95: 25% — tail latency / consistency
  • P99: 15% — extreme tail latency
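
The scoring rule can be written out as a short calculation. This is a sketch based only on the formula and weights stated above; the function and constant names are illustrative, and clamping at zero reflects the statement that anything at or beyond 10 seconds scores 0.

```python
CEILING_MS = 10_000  # fixed 10-second ceiling
WEIGHTS = {"median": 0.60, "p95": 0.25, "p99": 0.15}


def metric_score(value_ms: float) -> float:
    """Score one timing metric on a 0-100 scale against the ceiling."""
    return max(0.0, 100.0 * (1.0 - value_ms / CEILING_MS))


def composite_score(
    median_ms: float, p95_ms: float, p99_ms: float, success_rate: float
) -> float:
    """Weighted timing score multiplied by the success rate (0-1)."""
    timing = (
        WEIGHTS["median"] * metric_score(median_ms)
        + WEIGHTS["p95"] * metric_score(p95_ms)
        + WEIGHTS["p99"] * metric_score(p99_ms)
    )
    return timing * success_rate


# A 200 ms median scores 98 on its own, as in the text above.
print(round(metric_score(200), 2))

# Timings of 0.71s / 1.44s / 1.54s with a 100% success rate.
print(round(composite_score(710, 1440, 1540, 1.0), 2))
```

With the timings 0.71s, 1.44s, and 1.54s at 100% success, this works out to roughly 89.8, which agrees with the score listed for that row under Detailed Metrics.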

Sandbox Benchmarks FAQs

Have another question? Email us.

A sandbox is any environment where you can run code in isolation. It could be a VM, a bare-metal machine, a container, or anything else with compute resources.