Voice-controlled multi-agent setup with dictare

Most developers don't use just one AI agent. Claude Code for architecture and complex reasoning. Codex for quick edits. Aider for git-aware changes. Gemini CLI when you want a second opinion.

The problem: switching between them means switching terminals, copying context, losing flow. With dictare, you switch agents with your voice. Mid-sentence if you want.

Setting up multiple agents¶

Each agent runs in its own terminal. dictare connects to all of them simultaneously through the OpenVIP protocol.

# Terminal 1: Claude Code
dictare agent freddie --profile claude

# Terminal 2: Codex
dictare agent bowie --profile codex

# Terminal 3: Aider
dictare agent hendrix --profile aider

# Terminal 4: Gemini CLI
dictare agent gilmour --profile gemini

dictare discovers connected agents automatically. As soon as an agent subscribes to the OpenVIP server, it's available for voice input.

Switching with your voice¶

The magic is in the agent filter. Say the agent's name and dictare routes your voice there:

"agent claude" — switches to Claude Code
"agent codex" — switches to Codex
"agent aider" — switches to Aider
"agent gemini" — switches to Gemini CLI

The switch is instant. No delay, no confirmation prompt. You hear a brief TTS announcement ("Claude") and your next words go to the new agent.

Custom agent configuration¶

Define your agents in config.toml:

[agent_profiles]
default = "claude"

[agent_profiles.claude]
command = ["claude"]
description = "Claude Code"

[agent_profiles.codex]
command = ["codex"]
description = "OpenAI Codex"

[agent_profiles.aider]
command = ["aider"]
description = "Aider"

[agent_profiles.gemini]
command = ["gemini"]
description = "Gemini CLI"

You can also customize the voice triggers in the agent filter:

[pipeline.agent_filter]
triggers = ["agent", "switch to", "go to"]

Now "switch to claude" and "go to codex" work too.

When to use which agent¶

This is personal, but here's how I work:

Claude Code for anything that needs deep reasoning. Architecture decisions, complex refactors, debugging subtle issues. "Claude, this function has a race condition in the event handler. Find it and fix it."

Codex for fast, focused edits. "Codex, add type hints to the functions in utils.py." It's quick, it stays scoped.

Aider for git-aware work. "Aider, refactor the database module into separate files for models, queries, and migrations." It understands your repo structure and commits incrementally.

Gemini CLI for exploration and second opinions. "Gemini, what are the tradeoffs between SSE and WebSockets for real-time event streaming?" When I want a different perspective on a design choice.

Practical workflow¶

Here's a real session:

Start with Claude Code for planning: "Claude, I need to add WebSocket support to the server. What's the cleanest approach given the current architecture?"
Review Claude's plan, then switch to implementation: "Agent aider. Implement the WebSocket handler in server.py following the pattern Claude suggested."
Quick fix needed in another file: "Agent codex. Add the WebSocket import to the init.py file."
Back to Claude for review: "Agent claude. Review the changes in the last three commits. Any issues?"

All of this without touching the mouse. Without switching windows. Without copying context between terminals.

Tips¶

Default agent: The first agent that connects becomes the default. You can change this in config.
Agent names are flexible: dictare matches partial names. "clau" works for "claude" if there's no ambiguity.
Focus tracking: If you focus a terminal with an agent, dictare automatically routes to that agent. Voice switching overrides this.
Status bar: The dictare status bar shows which agent is currently active, so you always know who's listening.

Multi-agent voice control sounds like a gimmick until you try it. Then you wonder how you ever worked without it.