Model Routing

Not all AI models are created equal. Some are better at reasoning through complex plans. Others are faster at generating code. Some excel at catching subtle bugs. Codewick’s model routing system automatically assigns the best available model to each pipeline stage, so you get the right tool for each part of the job.

You don’t need to know which models exist or how they compare. Codewick handles the selection automatically. But if you’re curious — or want to fine-tune behavior — full transparency and manual overrides are available.

When a task enters the pipeline, Codewick’s routing algorithm evaluates several factors for each stage:

  • Task type — Planning stages favor models with strong reasoning. Building stages favor models with high code quality benchmarks. Debugging stages favor models that perform well on fault-localization tasks.
  • Priority mode — Your selected mode shifts how trade-offs are made (see below).
  • Token cost — Some models cost significantly more per token than others. The algorithm weighs this against expected quality.
  • Model benchmarks — Codewick tracks performance data across coding benchmarks (HumanEval, SWE-bench, and others) to inform assignments.
  • Availability — If a model is experiencing high latency or downtime, traffic routes to the next best option automatically.
  • Context window — Some stages need large context (building with many files). The router only considers models whose context window fits the payload.
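The factors above amount to a filter-then-score pass: rule out models that can't serve the stage at all, then rank the rest. Here is a minimal sketch of that idea in Python — the `Model` fields, weight values, and scoring formula are all hypothetical illustrations, not Codewick's actual algorithm:

```python
from dataclasses import dataclass

@dataclass
class Model:
    name: str
    context_window: int    # max tokens the model can accept
    benchmark: float       # 0-1 quality score for this task type
    cost_per_token: float  # dollars per token
    latency_ms: float      # typical response latency
    available: bool = True

def route(models, payload_tokens, weights):
    """Pick one model for one stage.

    Models that are down, or whose context window cannot fit the
    payload, are filtered out before scoring. `weights` maps
    'quality', 'cost', and 'speed' to their relative importance.
    """
    candidates = [m for m in models
                  if m.available and m.context_window >= payload_tokens]
    if not candidates:
        raise RuntimeError("no available model fits this payload")

    def score(m):
        # Higher benchmark is better; cost and latency act as penalties.
        return (weights["quality"] * m.benchmark
                - weights["cost"] * m.cost_per_token * 1e4
                - weights["speed"] * m.latency_ms / 1000)

    return max(candidates, key=score)

pool = [
    Model("Claude Opus 4", 200_000, 0.95, 15e-6, 900),
    Model("GPT-4o", 128_000, 0.90, 5e-6, 400),
    Model("Gemini Pro", 1_000_000, 0.85, 2e-6, 600),
]
balanced = {"quality": 0.4, "cost": 0.3, "speed": 0.3}
# A 150k-token payload rules out GPT-4o on context window alone.
choice = route(pool, 150_000, balanced)
```

Note how the context-window check is a hard filter while cost, quality, and speed are soft trade-offs — that mirrors the distinction the factor list draws.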

The routing decision happens in milliseconds before each stage starts. You never wait for routing itself.

Priority modes control how the routing algorithm weighs its decisions. You can switch modes at any time — the change applies to your next task.

  • Cost — Picks the cheapest model capable of handling each stage. Avoids premium models unless the task specifically requires them. Best for: routine tasks, exploration, learning projects.
  • Quality — Picks the highest-performing model for each stage regardless of cost. Maximizes output quality. Best for: production code, complex architecture, security-sensitive work.
  • Speed — Picks the fastest-responding model for each stage. Prioritizes low latency over all other factors. Best for: quick iterations, simple edits, rapid prototyping.
  • Balanced (default) — Weighs cost, quality, and speed together with a slight bias toward quality. Best for: general development (recommended for most work).
To switch modes:

  1. Click the priority mode indicator in the top-right corner of the chat panel (it shows your current mode as a label, e.g., “Balanced”).
  2. Select a new mode from the dropdown.
  3. The change takes effect on your next task.

Your priority mode persists across sessions until you change it.

Consider a task that activates all six pipeline stages. Here’s roughly how the modes would route:

  • Cost mode — Uses smaller, efficient models for every stage. The entire pipeline costs less, but output may be less polished.
  • Quality mode — Uses the most capable model available for each stage. Orchestration and Review get strong reasoning models. Building gets the best code generation model. Higher total cost.
  • Speed mode — Uses the fastest models across the board. Great for when you need quick answers and don’t mind occasional rough edges.
  • Balanced mode — Mixes model tiers. May use a strong model for Planning and Review but an efficient model for Building if the task is straightforward.

Codewick shows you exactly which model is assigned to each stage. After a task completes, each AI response card displays a model label — a friendly name like “Claude Opus 4” or “GPT-4o” rather than a raw API identifier.

To see the full routing breakdown for a task:

  1. Click the model label on any response card.
  2. A panel expands showing all six pipeline stages, the model assigned to each, and whether the stage was active or skipped.
  3. Active stages show the model name, token count, and time taken.
  4. Skipped stages are grayed out with a “Skipped” label.

This transparency means you’re never guessing what’s behind a response. You can compare routing decisions across different tasks to understand how the algorithm behaves.

If you want a specific model on a specific stage, you can override the automatic routing.

  1. Open the model transparency panel (click any model label).
  2. Click the model name next to the stage you want to change.
  3. A dropdown appears with all available models for that stage.
  4. Select your preferred model.

You can override one stage, multiple stages, or all six.

  • Overrides persist for your current session. Starting a new session resets to automatic routing.
  • Overridden stages show a small pin icon next to the model name so you can tell at a glance which stages are manually set.
  • To clear a single override, click the pin icon next to that stage.
  • To clear all overrides at once, use the Reset all overrides button at the bottom of the transparency panel.
  • Overridden stages ignore your priority mode — the model you selected is always used regardless of the mode.
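In effect, an override short-circuits routing for that stage: the pinned model wins unconditionally, and the priority mode only influences stages left on automatic routing. A minimal sketch of that precedence rule (hypothetical names, not Codewick's code):

```python
def model_for_stage(stage, overrides, auto_route):
    """An overridden stage always gets its pinned model; everything
    else falls through to the automatic routing function."""
    if stage in overrides:
        return overrides[stage]  # priority mode is ignored here
    return auto_route(stage)

overrides = {"Review": "Claude Opus 4"}  # one pinned stage
pinned = model_for_stage("Review", overrides, lambda stage: "GPT-4o")
auto = model_for_stage("Building", overrides, lambda stage: "GPT-4o")
```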

Codewick uses friendly model names throughout the interface. You’ll see names like:

  • Claude Opus 4
  • Claude Sonnet 4
  • GPT-4o
  • Gemini Pro

These are human-readable labels, not API identifiers. The specific model version (including any fine-tuned variants) is visible in the expanded transparency panel if you need the detail.

Display names are consistent everywhere — the transparency panel, response cards, override dropdowns, and usage breakdowns all use the same friendly names.

As new models become available, Codewick’s routing table updates automatically. You don’t need to install updates or change settings — new models appear in the routing pool and the algorithm begins considering them based on their benchmark performance.

When a new model is added:

  1. It enters the routing pool with benchmark scores from public evaluations.
  2. The algorithm may route some stages to it, weighted by the scores.
  3. Performance is monitored. If the model delivers strong results, it gets routed more frequently.
  4. If it underperforms expectations, it gets deprioritized.

This happens in the background with no action required from you. If a model you previously selected as a manual override is deprecated or removed, you’ll see a notification suggesting you reset that override.

Different models have different per-token costs. The routing algorithm factors this into every decision, weighted by your priority mode:

  • In Cost mode, token cost is the dominant factor. The algorithm picks the cheapest model that meets a quality threshold.
  • In Quality mode, token cost is nearly ignored. The best model is always chosen.
  • In Speed mode, cost is secondary to latency. Fast, affordable models tend to win.
  • In Balanced mode, cost is one of three roughly equal factors, with quality slightly favored.
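Cost mode's "cheapest model that meets a quality threshold" rule is simple to sketch. The model names are real display names from this page, but the benchmark scores and prices are made up for illustration:

```python
models = [
    {"name": "Claude Opus 4", "benchmark": 0.95, "cost_per_mtok": 15.0},
    {"name": "GPT-4o",        "benchmark": 0.90, "cost_per_mtok": 5.0},
    {"name": "Gemini Pro",    "benchmark": 0.85, "cost_per_mtok": 2.0},
]

def route_cost_mode(models, quality_threshold):
    """Cheapest model clearing the quality bar for this stage."""
    capable = [m for m in models if m["benchmark"] >= quality_threshold]
    return min(capable, key=lambda m: m["cost_per_mtok"])

easy = route_cost_mode(models, 0.70)  # a routine stage
hard = route_cost_mode(models, 0.92)  # a stage that demands a premium model
```

Raising the threshold is how Cost mode still reaches for a premium model "when the task specifically requires it."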

To understand how token costs add up across a task, see Token Usage & Budgets.

Most users never need to override routing. The automatic system handles the vast majority of cases well. But overrides are useful when:

  • You’ve found that a specific model writes better code in your project’s language or framework.
  • You want maximum quality on the Review stage for security-critical work.
  • You’re debugging a tricky issue and want the strongest reasoning model on the Debugging stage.
  • You’re exploring and want to compare output from different models on the same task.
  • A particular model handles your niche framework or language especially well.

Can I see the full list of available models?

Yes. Open the model transparency panel and click any stage’s model dropdown. The full list of currently available models appears, grouped by provider.

What happens if a model goes down mid-task?

If a model becomes unavailable while a stage is running, Codewick automatically retries with the next best model for that stage. You may see a brief delay, but the task completes without intervention.
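That fallback behaves like an ordered retry loop over the stage's ranked candidates — sketched here with a stand-in error type, since the real failure signals aren't documented:

```python
def run_stage_with_fallback(ranked_models, run):
    """Try models in routing-preference order; move on when one fails."""
    last_error = None
    for model in ranked_models:
        try:
            return run(model)
        except RuntimeError as err:  # stand-in for an availability error
            last_error = err
    raise RuntimeError("all candidate models unavailable") from last_error

def flaky(model):
    # Simulate the preferred model being down mid-task.
    if model == "GPT-4o":
        raise RuntimeError("model is down")
    return f"stage completed by {model}"

result = run_stage_with_fallback(["GPT-4o", "Gemini Pro"], flaky)
```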

Does my priority mode affect all projects?

Yes. Your priority mode is a workspace-level setting that applies to all projects. If you want different modes for different projects, switch before starting work in each one.

Can I set a default override that persists across sessions?

Not currently. Overrides are session-scoped by design, so routing stays adaptive as models improve. This may change based on user feedback during beta.