TwinRouterBench v0

Best Routers by Resolve Rate

Compare conditional per-call LLM routers on held-out SWE-bench Verified runs, and inspect the static tier-supervision bank used by the benchmark.

Quick Picks

Best routers for common comparisons

Success Rate Rankings

Rank | Router | Badges | Resolve Rate | Cost | Steps | Value | Notes

Static Supervision Bank

Conditional per-call labels

Protocol

Fixed assumptions behind the public leaderboard numbers.

Dynamic split

Ranking fields

Success Rate sorts by the number of resolved cases. Cost sorts by average routed spend. Value sorts by resolve rate (in percent) divided by average routed spend (in dollars).
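The three ranking fields can be illustrated with a minimal Python sketch. Everything named here is hypothetical: the `RouterResult` record, its field names, and the sort directions (most resolved first, cheapest first, highest value first) are assumptions, since the page does not state them. The sample rows are placeholder numbers for illustration, not benchmark results.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RouterResult:
    """Hypothetical per-router summary; field names are assumptions."""
    name: str
    resolved: int        # number of resolved cases
    total: int           # number of evaluated cases
    avg_cost_usd: float  # average routed spend, assumed to be in dollars

    @property
    def resolve_rate_pct(self) -> float:
        # Resolve rate expressed as a percentage of evaluated cases.
        return 100.0 * self.resolved / self.total

    @property
    def value(self) -> float:
        # Resolve-rate percentage per average dollar.
        # A zero-cost router is treated as infinitely valuable (assumption).
        if self.avg_cost_usd == 0:
            return float("inf")
        return self.resolve_rate_pct / self.avg_cost_usd


def rank_by_success(results):
    # Assumed direction: most resolved cases first.
    return sorted(results, key=lambda r: r.resolved, reverse=True)


def rank_by_cost(results):
    # Assumed direction: lowest average spend first.
    return sorted(results, key=lambda r: r.avg_cost_usd)


def rank_by_value(results):
    # Assumed direction: highest percentage-per-dollar first.
    return sorted(results, key=lambda r: r.value, reverse=True)


# Placeholder rows for illustration only (not leaderboard data).
demo = [
    RouterResult("router_a", resolved=60, total=100, avg_cost_usd=0.50),
    RouterResult("router_b", resolved=70, total=100, avg_cost_usd=1.00),
    RouterResult("router_c", resolved=40, total=100, avg_cost_usd=0.20),
]

if __name__ == "__main__":
    for label, ranking in [
        ("Success Rate", rank_by_success(demo)),
        ("Cost", rank_by_cost(demo)),
        ("Value", rank_by_value(demo)),
    ]:
        print(label, [r.name for r in ranking])
```

With these placeholder rows, the router that resolves the most cases is also the most expensive, so it leads the Success Rate ordering but falls to last on Cost and Value. This is why the three fields are worth inspecting separately.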

Static target

Static labels are conditional per-call tier targets over low, mid, mid_high, and high model tiers.
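
As a rough sketch of how the four tier names might be represented in code, the snippet below uses an ordered enum. The `Tier` class, the `parse_tier` helper, and the ordering low < mid < mid_high < high are assumptions inferred from the tier names. They are not a description of the benchmark's actual label schema.

```python
from enum import IntEnum


class Tier(IntEnum):
    """Hypothetical ordered encoding of the four model tiers."""
    LOW = 0
    MID = 1
    MID_HIGH = 2
    HIGH = 3


def parse_tier(label: str) -> Tier:
    # Map a label string such as "mid_high" to its Tier member.
    # Raises KeyError for names outside the four listed tiers.
    return Tier[label.strip().upper()]


if __name__ == "__main__":
    target = parse_tier("mid_high")
    print(target.name.lower(), int(target))
```

An integer-backed enum keeps the tiers comparable, so a predicted tier can be checked against a target with ordinary `<` and `>` operators.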