dragfly

Customize how dictare understands you

2026-04-27

dictare's default behavior works out of the box, but the pipeline (the chain of steps between your voice and the agent) is fully customizable. You control what triggers submission, how muting works, how agent switching responds, and what confidence threshold a transcription must meet.

All of it lives in config.toml.

Pipeline architecture

Every transcription flows through two stages:

Filters inspect the text and decide what to do with it. They can transform it, consume it (preventing further processing), or let it pass through unchanged.

Executors act on the filtered text. They deliver it to the agent, toggle mute state, switch agents, or perform other actions.

The default pipeline has three filters and three executors:

Voice → [MuteFilter] → [AgentFilter] → [SubmitFilter] → [MuteExecutor | AgentSwitchExecutor | InputExecutor]

Each filter checks for specific trigger phrases. If a trigger matches, the filter consumes the text and signals the corresponding executor. If nothing matches, the text passes to the next filter.
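As a rough sketch (hypothetical names, not dictare's actual code), the chain amounts to a short dispatch loop that stops at the first filter that consumes the text:

```python
from enum import Enum, auto

class FilterResult(Enum):
    PASS = auto()     # no trigger matched: hand the text to the next filter
    CONSUME = auto()  # trigger matched: stop here and signal the executor

def run_filters(text, filters, context):
    """Run text through each filter in order; a CONSUME result stops the chain."""
    for filt in filters:
        if filt(text, context) is FilterResult.CONSUME:
            return None  # consumed: an executor acts, nothing reaches the agent
    return text  # passed every filter, so it is delivered to the agent

# Hypothetical stand-in for the MuteFilter, written as a plain function
def mute_filter(text, context):
    if text.strip().lower() in ("ok mute", "okay mute"):
        return FilterResult.CONSUME
    return FilterResult.PASS
```

Here "hello world" passes through untouched, while "ok mute" is consumed before it ever reaches the agent.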

Submit triggers

The SubmitFilter decides when to submit text to the agent. By default, double-tapping the hotkey submits. But you can also submit with your voice.

[pipeline.submit_filter.triggers]
"*" = [
    ["ok|okay", "send|submit"],
]

Now saying "OK send" at the end of your sentence submits it immediately, without the double-tap. The trigger phrase gets stripped from the text before delivery.
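One plausible way to implement this matching (a sketch under assumptions: build_trigger_regex and strip_trigger are hypothetical helper names, and dictare's real matching rules may differ) is to compile each list of alternation groups into a regex anchored at the end of the utterance, then cut the match off before delivery:

```python
import re

def build_trigger_regex(groups):
    # ["ok|okay", "send|submit"] -> matches "ok send", "okay submit", etc.,
    # only at the end of the utterance, with optional surrounding punctuation
    parts = [r"(?:%s)" % g for g in groups]
    return re.compile(r"[,.!?\s]*\b" + r"\s+".join(parts) + r"[,.!?\s]*$",
                      re.IGNORECASE)

def strip_trigger(text, groups):
    """Return (matched, text with the trigger phrase removed)."""
    m = build_trigger_regex(groups).search(text)
    if m:
        return True, text[:m.start()].rstrip()
    return False, text
```

With the config above, "write the tests, OK send" matches, and the text delivered to the agent is just "write the tests".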

You can also set a confidence threshold — if the STT engine isn't confident enough in the transcription, it won't submit:

[pipeline.submit_filter]
confidence_threshold = 0.7

Mute control

The MuteFilter lets you pause and resume voice capture without touching the keyboard.

[pipeline.mute_filter.mute_triggers]
"*" = [["ok|okay", "mute|stop"]]

[pipeline.mute_filter.listen_triggers]
"*" = [["ok|okay", "listen"]]

Say "OK mute" and dictare stops processing your voice. Background conversations, phone calls, side discussions — none of it reaches the agent. Say "OK listen" to resume.

The mute state is reflected in the status bar and tray icon, so you always know whether dictare is listening.
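The behavior amounts to a small state machine. A minimal sketch (hypothetical MuteState and hard-coded phrases standing in for the configurable triggers above):

```python
class MuteState:
    """Tracks whether dictare is currently processing voice input."""
    def __init__(self):
        self.muted = False

def handle_utterance(text, state):
    phrase = text.strip().lower()
    if phrase in ("ok mute", "okay mute", "ok stop", "okay stop"):
        state.muted = True   # stop forwarding anything to the agent
        return None
    if state.muted:
        if phrase in ("ok listen", "okay listen"):
            state.muted = False  # resume normal processing
        return None  # while muted, all other speech is dropped
    return text
```

Note that the listen trigger is still checked while muted; everything else is silently discarded until you resume.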

Agent switching

The AgentFilter handles voice-based agent switching.

[pipeline.agent_filter]
triggers = ["agent", "switch to"]

When the filter hears "agent claude" or "switch to codex", it extracts the agent name and signals the AgentSwitchExecutor. The executor handles the actual switch, including TTS feedback announcing the new agent.
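The extraction step can be pictured as a simple prefix match (a hypothetical sketch of the AgentFilter's matching, not dictare's actual code):

```python
def extract_agent(text, triggers=("agent", "switch to")):
    """Return the agent name if the utterance starts with a switch trigger,
    else None. (Hypothetical sketch; triggers come from pipeline.agent_filter.)"""
    phrase = text.strip().lower()
    for trigger in triggers:
        if phrase.startswith(trigger + " "):
            return phrase[len(trigger):].strip()
    return None
```

So "agent claude" yields "claude", "switch to codex" yields "codex", and anything else falls through to the next filter.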

Putting it together

Here's a complete pipeline configuration:

[pipeline.agent_filter]
triggers = ["agent", "switch to", "talk to"]

[pipeline.submit_filter]
confidence_threshold = 0.6

[pipeline.submit_filter.triggers]
"*" = [
    ["ok|okay", "send|submit"],
    ["do", "it"],
]

With this config:

Saying "agent", "switch to", or "talk to" followed by a name (e.g. "agent claude") switches agents.

Saying "OK send", "okay submit", or "do it" at the end of a sentence submits it immediately.

Transcriptions with confidence below 0.6 are not submitted.

Custom filters

The pipeline is loaded by PipelineLoader, which uses dependency injection. If the built-in filters don't cover your needs, you can write your own. A filter is any class that implements the Filter protocol:

class Filter(Protocol):
    def process(self, text: str, context: dict) -> FilterResult:
        ...

Return FilterResult.PASS to let the text continue down the pipeline, or FilterResult.CONSUME to stop processing and trigger an action.
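For instance, a custom filter that consumes any utterance containing a blocked word might look like this (a sketch: FilterResult is reconstructed from the protocol above, and BlocklistFilter is a hypothetical example, not a built-in):

```python
from enum import Enum, auto

class FilterResult(Enum):
    PASS = auto()
    CONSUME = auto()

class BlocklistFilter:
    """Consume any utterance containing a blocked word,
    so it never reaches the agent."""
    def __init__(self, blocked):
        self.blocked = {w.lower() for w in blocked}

    def process(self, text: str, context: dict) -> FilterResult:
        words = {w.strip(".,!?").lower() for w in text.split()}
        if words & self.blocked:
            return FilterResult.CONSUME
        return FilterResult.PASS
```

Because the protocol is structural, the class needs no base class or registration beyond whatever PipelineLoader expects; implementing process is enough.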

Tips

The pipeline is where dictare becomes yours. Spend ten minutes with config.toml and you'll have a voice workflow that fits exactly how you think.