# Arena schedules contestants in a local lane (serial) and a cloud lane (parallel)

A Battle runs the same prompt against 2–6 Contestants. The local llama-swap server can only hold one model in memory at a time, so llama-swap-backed Contestants are placed in a **local lane** and run strictly one at a time, while cloud-backed Contestants (Claude Code, OpenCode-on-cloud) run all in parallel in a **cloud lane**; the two lanes run concurrently. We chose this over running everything serially (too slow for cloud) or everything in parallel (impossible for local, and it would corrupt the speed Benchmark) because the single-model constraint is physical and the serial local lane also gives each local model an uncontended, fair tokens/sec measurement.

## Consequences

- A Battle's wall-clock is roughly `max(slowest cloud contestant, sum of local contestants)`. Deep local lanes (especially all-local Q&A battles) are slow by design; the launcher warns when the local lane is deep.
- The speed Benchmark (tokens/sec) is only meaningful for local-lane Contestants, which is acceptable since external CLI agents don't report token usage anyway.
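
The two-lane scheduling and the wall-clock estimate can be illustrated with a minimal `asyncio` sketch. It is not Arena's actual implementation: the `Contestant` dataclass, its `lane` and `duration` fields, and every function name below are hypothetical, and `asyncio.sleep` stands in for invoking a real model or CLI agent.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class Contestant:
    # Hypothetical shape, for illustration only.
    name: str
    lane: str        # "local" (llama-swap-backed) or "cloud"
    duration: float  # simulated run time in seconds


async def run_contestant(c: Contestant, prompt: str) -> str:
    # Stand-in for calling the real model or external CLI agent.
    await asyncio.sleep(c.duration)
    return f"{c.name}: done"


async def run_local_lane(contestants, prompt):
    # Strictly one at a time: only one local model fits in memory at once.
    results = []
    for c in contestants:
        results.append(await run_contestant(c, prompt))
    return results


async def run_cloud_lane(contestants, prompt):
    # All cloud-backed contestants start together.
    return list(
        await asyncio.gather(*(run_contestant(c, prompt) for c in contestants))
    )


async def run_battle(contestants, prompt):
    local = [c for c in contestants if c.lane == "local"]
    cloud = [c for c in contestants if c.lane == "cloud"]
    # The two lanes themselves run concurrently with each other.
    local_results, cloud_results = await asyncio.gather(
        run_local_lane(local, prompt),
        run_cloud_lane(cloud, prompt),
    )
    return local_results + cloud_results


def estimated_wall_clock(contestants) -> float:
    # max(slowest cloud contestant, sum of local contestants)
    local_total = sum(c.duration for c in contestants if c.lane == "local")
    slowest_cloud = max(
        (c.duration for c in contestants if c.lane == "cloud"), default=0.0
    )
    return max(slowest_cloud, local_total)
```

Under these assumptions, `asyncio.run(run_battle(contestants, "some prompt"))` returns local-lane results first, in lane order, followed by cloud-lane results. Adding another local Contestant increases `estimated_wall_clock` by its full duration, while adding a cloud Contestant changes the estimate only if it becomes the slowest one in its lane.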