Sandbox Provider Leaderboard

Sandbox Benchmarks

A leaderboard of common benchmarks for each of our sandbox providers.

Last run: May 8, 2026

Performance Over Time

[Chart: Composite Score over time]

Detailed Metrics

Provider      Score   Median   P95      P99      Success
Declaw        97.9    0.19s    0.23s    0.23s    100%
Daytona       95.7    0.34s    0.56s    0.60s    100%
Tensorlake    91.7    0.57s    1.22s    1.22s    100%
Archil        92.7    0.69s    0.79s    0.81s    100%
E2B           92.4    0.69s    0.82s    0.93s    100%
Vercel        90.0    0.77s    1.25s    1.50s    100%
Upstash       83.6    1.01s    2.56s    2.64s    100%
Blaxel        81.8    1.63s    2.09s    2.13s    100%
Cloudflare    77.7    1.97s    2.51s    2.80s    100%
Namespace     44.0    2.21s    14.35s   14.61s   94%
Runloop       32.7    4.79s    9.42s    13.56s   100%
CodeSandbox   11.3    8.11s    15.66s   18.89s   100%

Want to see a provider added?

Let us know on X

Methodology

What We Measure

Every benchmark measures Time to Interactive (TTI) — the elapsed time from calling compute.sandbox.create() to the first successful runCommand() inside the sandbox.
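As a rough illustration of the TTI definition above, the sketch below times a single iteration in Python. It is not the benchmark's actual harness: `measure_tti`, `TTIResult`, `fake_create`, and `fake_run` are hypothetical names, and the two fake functions only stand in for `compute.sandbox.create()` and the first `runCommand()`. Treating any exception as a failed iteration is also an assumption.

```python
import time
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class TTIResult:
    ok: bool
    tti_ms: Optional[float]  # None when the iteration failed


def measure_tti(create_sandbox: Callable[[], object],
                run_command: Callable[[object], None]) -> TTIResult:
    """Time from the create call until the first command returns successfully."""
    start = time.perf_counter()
    try:
        sandbox = create_sandbox()  # stands in for compute.sandbox.create()
        run_command(sandbox)        # stands in for the first runCommand()
    except Exception:
        # Assumption: any error counts as a failed iteration with no timing.
        return TTIResult(ok=False, tti_ms=None)
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    return TTIResult(ok=True, tti_ms=elapsed_ms)


# Hypothetical stand-ins so the sketch runs without any provider SDK.
def fake_create() -> dict:
    time.sleep(0.02)  # pretend provisioning takes 20 ms
    return {"id": "sandbox-1"}


def fake_run(sandbox: dict) -> None:
    time.sleep(0.01)  # pretend the first command takes 10 ms


result = measure_tti(fake_create, fake_run)
print(result)
```

The clock starts before the create call and stops only after the command returns, so provisioning, boot, and command round-trip time are all included in one number.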

Each provider is tested with 100 iterations per run. Benchmarks run automatically via GitHub Actions on a recurring schedule. All results are committed to the public benchmarks repo.
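One way the per-run samples could be reduced to the Median, P95, P99, and Success columns is sketched below. The percentile method (nearest rank) and the choice to exclude failed iterations from the timing statistics are assumptions, not something the methodology states; `summarize` and `percentile` are hypothetical helper names.

```python
import math
import statistics
from typing import Dict, List, Optional


def percentile(sorted_values: List[float], p: int) -> float:
    """Nearest-rank percentile of an already sorted, non-empty list."""
    rank = math.ceil(p * len(sorted_values) / 100)
    return sorted_values[max(rank, 1) - 1]


def summarize(samples_ms: List[Optional[float]]) -> Dict[str, float]:
    """Summarize one run; None marks a failed iteration."""
    ok = sorted(s for s in samples_ms if s is not None)
    if not ok:
        raise ValueError("no successful iterations to summarize")
    return {
        "median": statistics.median(ok),
        "p95": percentile(ok, 95),
        "p99": percentile(ok, 99),
        # Assumption: success rate is successes over all attempted iterations.
        "success_rate": len(ok) / len(samples_ms),
    }


# 100 synthetic samples of 1 ms to 100 ms, all successful.
summary = summarize([float(i) for i in range(1, 101)])
print(summary)
```

With exactly 100 samples, nearest rank makes P95 the 95th smallest value and P99 the 99th, so a single slow iteration can move P99 noticeably.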

Sequential Test: Sandboxes are launched one at a time, waiting for each to become interactive before starting the next.

Staggered Test: Sandboxes are launched with 200ms delays between each.

Burst Test: All sandboxes are launched concurrently in a single burst.
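The three launch patterns can be sketched with `asyncio` as follows. This is an illustration under assumptions rather than the benchmark's code: `fake_launch` is a hypothetical stand-in for creating a sandbox and running the first command, and the staggered mode here starts launch `i` at `i × 200ms` without waiting for earlier launches to finish, which is one reading of "200ms delays between each".

```python
import asyncio
import time
from typing import Awaitable, Callable, Dict, List

Launch = Callable[[int], Awaitable[float]]


async def run_sequential(launch: Launch, n: int) -> List[float]:
    # Wait for each sandbox to become interactive before starting the next.
    return [await launch(i) for i in range(n)]


async def run_staggered(launch: Launch, n: int, delay_s: float = 0.2) -> List[float]:
    # Start launch i after i * delay_s, regardless of earlier launches.
    async def delayed(i: int) -> float:
        await asyncio.sleep(i * delay_s)
        return await launch(i)

    return list(await asyncio.gather(*(delayed(i) for i in range(n))))


async def run_burst(launch: Launch, n: int) -> List[float]:
    # Start every launch at once.
    return list(await asyncio.gather(*(launch(i) for i in range(n))))


async def fake_launch(i: int) -> float:
    """Hypothetical launch that takes about 50 ms and returns its TTI in ms."""
    start = time.perf_counter()
    await asyncio.sleep(0.05)
    return (time.perf_counter() - start) * 1000.0


async def main() -> Dict[str, float]:
    wall_times: Dict[str, float] = {}
    modes = (("sequential", run_sequential),
             ("staggered", run_staggered),
             ("burst", run_burst))
    for name, runner in modes:
        started = time.perf_counter()
        results = await runner(fake_launch, 4)
        assert len(results) == 4
        wall_times[name] = time.perf_counter() - started
    return wall_times


wall_times = asyncio.run(main())
print(wall_times)
```

With the fake launch, the per-sandbox TTI is the same in all three modes and only the total wall time differs. Against a real provider, concurrent modes may also raise individual TTIs if provisioning contends for capacity.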

How We Score

The Composite Score is a weighted blend of timing metrics multiplied by the success rate. Each metric is scored against a fixed 10-second ceiling: 100 × (1 − value / 10,000 ms), with a floor of 0. A 200 ms median therefore scores 98, and anything at or above 10 s scores 0.

The weighted timing score is then multiplied by the success rate (0–1), so providers that fail frequently are penalized proportionally.

  • Median: 60% — primary signal for typical experience
  • P95: 25% — tail latency / consistency
  • P99: 15% — extreme tail latency
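The scoring rules above can be written out as a short Python sketch. The formula, ceiling, and weights come from this section; the function names are hypothetical, and clamping negative metric scores to 0 is inferred from the statement that anything ≥10s scores 0. The inputs below are the rounded values shown in the Detailed Metrics table, so results can differ slightly from the published scores, which are presumably computed from unrounded measurements.

```python
CEILING_MS = 10_000.0
WEIGHTS = {"median": 0.60, "p95": 0.25, "p99": 0.15}


def metric_score(value_ms: float) -> float:
    """Score one timing metric against the 10-second ceiling, floored at 0."""
    return max(0.0, 100.0 * (1.0 - value_ms / CEILING_MS))


def composite_score(median_ms: float, p95_ms: float, p99_ms: float,
                    success_rate: float) -> float:
    """Weighted timing score multiplied by the success rate (0 to 1)."""
    timing = (WEIGHTS["median"] * metric_score(median_ms)
              + WEIGHTS["p95"] * metric_score(p95_ms)
              + WEIGHTS["p99"] * metric_score(p99_ms))
    return timing * success_rate


# Rounded table values: Declaw (0.19s, 0.23s, 0.23s, 100%).
declaw = composite_score(190, 230, 230, 1.00)
# Rounded table values: Namespace (2.21s, 14.35s, 14.61s, 94%).
namespace = composite_score(2210, 14350, 14610, 0.94)

print(round(declaw, 1))     # 97.9, matching the table
print(round(namespace, 1))  # 43.9 from rounded inputs; the table lists 44.0
```

The Namespace row shows both penalties at once: its P95 and P99 exceed the ceiling and contribute nothing, and the remaining median contribution is then scaled by the 94% success rate.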

Sandbox Benchmarks FAQs

Have another question? Email us.

A sandbox is any environment where you can run code in isolation. It could be a VM, a bare-metal machine, a container, or anything else with compute resources.