dragfly

Voice-controlled multi-agent setup with dictare

2026-04-20

Most developers don't use just one AI agent. Claude Code for architecture and complex reasoning. Codex for quick edits. Aider for git-aware changes. Gemini CLI when you want a second opinion.

The problem: switching between them means switching terminals, copying context, losing flow. With dictare, you switch agents with your voice. Mid-sentence if you want.

Setting up multiple agents

Each agent runs in its own terminal. dictare connects to all of them simultaneously through the OpenVIP protocol.

# Terminal 1: Claude Code
dictare agent freddie --profile claude

# Terminal 2: Codex
dictare agent bowie --profile codex

# Terminal 3: Aider
dictare agent hendrix --profile aider

# Terminal 4: Gemini CLI
dictare agent gilmour --profile gemini

dictare discovers connected agents automatically. As soon as an agent subscribes to the OpenVIP server, it's available for voice input.
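The OpenVIP discovery details aren't shown here, but as a rough mental model, think of an in-memory registry where each agent announces itself on subscribe and becomes routable immediately. The sketch below is illustrative only, with made-up names, not dictare's actual code:

```python
# Illustrative only: a minimal in-memory registry mimicking how a voice
# server might track agents as they subscribe. Not dictare's actual API.

class AgentRegistry:
    def __init__(self):
        self._agents = {}  # agent name -> profile name

    def subscribe(self, name, profile):
        # An agent announces itself; from this point it is routable.
        self._agents[name] = profile

    def unsubscribe(self, name):
        # Closing the agent's terminal would trigger this.
        self._agents.pop(name, None)

    def available(self):
        return sorted(self._agents)

registry = AgentRegistry()
registry.subscribe("freddie", "claude")
registry.subscribe("bowie", "codex")
print(registry.available())  # ['bowie', 'freddie']
```

The point is that there is no manual registration step: connecting is registering.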

Switching with your voice

The magic is in the agent filter. Say an agent's name at the start of an utterance and dictare routes your voice there.

The switch is instant. No delay, no confirmation prompt. You hear a brief TTS announcement ("Claude") and your next words go to the new agent.
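Conceptually, the filter is simple: if an utterance leads with a known agent name, switch the active agent and forward the rest of the sentence; otherwise forward everything to whoever is currently active. This sketch is a hypothetical reconstruction of that logic, not dictare's implementation:

```python
# Illustrative agent-filter logic. The agent names match the terminals
# started earlier; the function itself is a hypothetical sketch.

AGENTS = {"freddie", "bowie", "hendrix", "gilmour"}

def route(utterance, active):
    """Return (new_active_agent, text_to_forward)."""
    words = utterance.strip().split()
    if words and words[0].lower() in AGENTS:
        # Leading agent name: switch, and send the rest of the
        # sentence to the newly active agent.
        return words[0].lower(), " ".join(words[1:])
    # No agent name: everything goes to the current agent.
    return active, utterance.strip()

active, text = route("bowie add type hints to utils.py", "freddie")
print(active, "<-", text)  # bowie <- add type hints to utils.py
```

This is also why switching mid-sentence works: the name is consumed by the filter and the remainder of the utterance is delivered as normal input.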

Custom agent configuration

Define your agents in config.toml:

[agent_profiles]
default = "claude"

[agent_profiles.claude]
command = ["claude"]
description = "Claude Code"

[agent_profiles.codex]
command = ["codex"]
description = "OpenAI Codex"

[agent_profiles.aider]
command = ["aider"]
description = "Aider"

[agent_profiles.gemini]
command = ["gemini"]
description = "Gemini CLI"

You can also customize the voice triggers in the agent filter:

[pipeline.agent_filter]
triggers = ["agent", "switch to", "go to"]

Now "switch to claude" and "go to codex" work too.
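Trigger matching presumably amounts to stripping a configured phrase off the front of the utterance; whatever follows is the target agent. A hypothetical version (matching longer triggers first so "go to" isn't shadowed by a shorter prefix):

```python
# Illustrative trigger matching for the agent filter. The TRIGGERS list
# mirrors the config above; the helper is hypothetical, not dictare's.

TRIGGERS = ["agent", "switch to", "go to"]

def target_agent(utterance):
    """Return the agent name if the utterance starts with a trigger, else None."""
    text = utterance.strip().lower()
    # Try longer triggers first so multi-word phrases match whole.
    for trigger in sorted(TRIGGERS, key=len, reverse=True):
        if text.startswith(trigger + " "):
            return text[len(trigger):].strip()
    return None

print(target_agent("switch to claude"))  # claude
print(target_agent("go to codex"))       # codex
print(target_agent("fix the tests"))     # None
```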

When to use which agent

This is personal, but here's how I work:

Claude Code for anything that needs deep reasoning. Architecture decisions, complex refactors, debugging subtle issues. "Claude, this function has a race condition in the event handler. Find it and fix it."

Codex for fast, focused edits. "Codex, add type hints to the functions in utils.py." It's quick, it stays scoped.

Aider for git-aware work. "Aider, refactor the database module into separate files for models, queries, and migrations." It understands your repo structure and commits incrementally.

Gemini CLI for exploration and second opinions. "Gemini, what are the tradeoffs between SSE and WebSockets for real-time event streaming?" When I want a different perspective on a design choice.

Practical workflow

Here's a real session:

  1. Start with Claude Code for planning: "Claude, I need to add WebSocket support to the server. What's the cleanest approach given the current architecture?"

  2. Review Claude's plan, then switch to implementation: "Agent aider. Implement the WebSocket handler in server.py following the pattern Claude suggested."

  3. Quick fix needed in another file: "Agent codex. Add the WebSocket import to the init.py file."

  4. Back to Claude for review: "Agent claude. Review the changes in the last three commits. Any issues?"

All of this without touching the mouse. Without switching windows. Without copying context between terminals.

Closing thoughts

Multi-agent voice control sounds like a gimmick until you try it. Then you wonder how you ever worked without it.