dragfly

dictare beyond coding: voice for any CLI tool

2026-04-06

dictare was built for AI coding agents. But the core is just a voice layer — audio capture, speech recognition, text output. That layer doesn't care whether the text goes to Claude Code or to grep.

Here's how to use dictare as a universal voice input tool.

dictare transcribe: voice as text

The simplest building block:

dictare transcribe

This captures audio, transcribes it, and prints text to stdout. One line per utterance. That's it. Standard Unix text output that pipes anywhere.

# Auto-submit mode: each utterance is a separate line, sent immediately
dictare transcribe --auto-submit

Without --auto-submit, text accumulates until you trigger submission (double-tap hotkey or voice trigger). With it, each natural pause becomes a line break.
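
To make the difference concrete, here is a toy model of the two modes. It is not dictare's implementation. The event names and the choice to join accumulated utterances with spaces are assumptions made for illustration.

```python
from typing import Iterable, Iterator, Tuple

# An event is ("utterance", text) for a transcribed phrase,
# or ("submit", "") for a manual trigger (hotkey or voice trigger).
Event = Tuple[str, str]


def output_lines(events: Iterable[Event], auto_submit: bool) -> Iterator[str]:
    """Yield the lines a downstream reader would see on stdout (toy model)."""
    pending = []
    for kind, text in events:
        if kind == "utterance":
            if auto_submit:
                # Each natural pause ends an utterance and emits a line.
                yield text
            else:
                # Hold the text until the user triggers submission.
                pending.append(text)
        elif kind == "submit" and pending:
            # Assumption: accumulated utterances are joined with spaces.
            yield " ".join(pending)
            pending = []


events = [
    ("utterance", "find the config loader"),
    ("utterance", "in the source tree"),
    ("submit", ""),
]

print(list(output_lines(events, auto_submit=True)))
print(list(output_lines(events, auto_submit=False)))
```

With auto-submit the reader gets two lines. Without it, the reader gets one combined line, and only after the submit trigger.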

The pipe pattern

Unix philosophy: small tools connected by pipes. dictare fits right in.

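# Voice-powered search (a read loop avoids xargs choking on apostrophes; -F matches literal text)
dictare transcribe --auto-submit | while IFS= read -r query; do grep -rF -- "$query" ./src/; done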

# Voice to clipboard
dictare transcribe | pbcopy  # macOS
dictare transcribe | xclip -selection clipboard   # Linux (X11)

# Voice journal (ts comes from the moreutils package)
dictare transcribe --auto-submit | ts '[%Y-%m-%d %H:%M]' >> ~/journal.md

# Voice-powered git (commits whatever is already staged)
dictare transcribe --auto-submit | while IFS= read -r line; do
  [ -n "$line" ] && git commit -m "$line"
done

Anything that reads stdin can accept voice input. No integration needed, no API, no SDK. Just a pipe.
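
If the thing at the end of the pipe is your own script, it only needs to read lines. Below is a minimal sketch of a voice command dispatcher. The phrases and actions are hypothetical. The normalization step assumes the transcriber may add capitalization and trailing punctuation, which may or may not match dictare's output.

```python
from typing import Callable, Dict, Iterable, Iterator


def normalize(utterance: str) -> str:
    """Lowercase and strip whitespace and trailing punctuation."""
    return utterance.strip().rstrip(".!?").lower()


def dispatch(lines: Iterable[str],
             commands: Dict[str, Callable[[], str]]) -> Iterator[str]:
    """Map each spoken line to a command; report lines that match nothing."""
    for line in lines:
        phrase = normalize(line)
        if not phrase:
            continue  # skip blank lines
        action = commands.get(phrase)
        yield action() if action else f"unrecognized: {phrase}"


# Hypothetical phrases and actions, for illustration only.
COMMANDS = {
    "status": lambda: "ok",
    "say hello": lambda: "hello",
}

# In-memory demo standing in for piped input.
spoken = ["Status.\n", "Say hello\n", "\n", "Open the pod bay doors\n"]
for result in dispatch(spoken, COMMANDS):
    print(result, flush=True)  # flush so the next pipe stage sees output at once
```

To use it for real, import sys and pass sys.stdin in place of the sample list, then put the script on the right-hand side of a dictare transcribe --auto-submit pipe.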

Keyboard mode

Sometimes you don't want to pipe text — you want it typed into whatever application has focus. That's keyboard mode.

dictare config set output.mode keyboard

dictare transcribes your speech and simulates keyboard input. Works with any application: your browser, a chat app, a text editor, a form field. Speak and the text appears wherever your cursor is.

This is closer to what tools like Wispr Flow do, but local and free.

Custom integrations via OpenVIP

For deeper integration, any application can subscribe to dictare's OpenVIP server and receive transcriptions as structured events:

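from openvip import Client

def do_something(text):
    # Placeholder handler: replace with your application's logic.
    print(text)

client = Client("http://localhost:8770/openvip")

# Subscribe under an application name and react to transcription events.
for event in client.subscribe("my-app"):
    if event.type == "transcription":
        do_something(event.text)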

The server streams events over Server-Sent Events (SSE), so it works from any language that can make HTTP requests. No SDK is required, though the Python SDK makes it easier.
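
For a sense of what going without the SDK involves, here is a minimal sketch of an SSE parser using only the standard library. The field handling follows the standard SSE wire format. The event name and JSON payload shape in the sample are assumptions, since the wire format of dictare's events is not shown here.

```python
import json
from typing import Dict, Iterable, Iterator


def parse_sse(lines: Iterable[str]) -> Iterator[Dict[str, str]]:
    """Yield one dict per Server-Sent Event from an iterable of text lines."""
    event_type, data = "message", []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if not line:
            # A blank line terminates the current event.
            if data:
                yield {"event": event_type, "data": "\n".join(data)}
            event_type, data = "message", []
        elif line.startswith(":"):
            continue  # comment line, often used as a keep-alive
        else:
            field, _, value = line.partition(":")
            # A single leading space after the colon is not part of the value.
            value = value[1:] if value.startswith(" ") else value
            if field == "data":
                data.append(value)
            elif field == "event":
                event_type = value
            # Other fields (id, retry) are ignored in this sketch.


# Hypothetical stream; the JSON payload shape is an assumption.
sample = [
    ": keep-alive\n",
    "event: transcription\n",
    'data: {"text": "hello world"}\n',
    "\n",
]

for event in parse_sse(sample):
    payload = json.loads(event["data"])
    print(event["event"], payload["text"])
```

Against a live server, the same function can consume the decoded lines of an HTTP response opened with urllib.request.urlopen, for example parse_sse(line.decode("utf-8") for line in response). The subscription URL is left out because the exact path is not documented here.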

Use cases beyond coding

Note-taking

dictare transcribe --auto-submit | tee -a notes.md | dictare speak

Speak your thoughts, they get saved to markdown, and read back for verification. Hands-free note-taking with spoken confirmation.

Shell commands

dictare transcribe --auto-submit | llm "Convert this to a shell command, output only the command" | sh

Say what you want in natural language; an LLM converts it to a shell command, and sh runs it immediately, with no confirmation step. Append 2>&1 | dictare speak to the end of the pipeline to hear the output.
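
Because the generated command runs unreviewed, one option is to put a filter between the LLM and the shell. The sketch below accepts only a single simple command whose program is on an allowlist. The allowlist contents and the set of rejected characters are illustrative assumptions, not a complete security policy.

```python
import shlex
from typing import FrozenSet

# Hypothetical allowlist; adjust to the programs you are willing to run by voice.
ALLOWED = frozenset({"ls", "pwd", "date", "whoami"})

# Characters that let a command chain, redirect, or substitute other commands.
SHELL_META = set(";&|<>`$(){}\n")


def is_safe(command: str, allowed: FrozenSet[str] = ALLOWED) -> bool:
    """Accept a single simple command whose program is on the allowlist."""
    if any(ch in SHELL_META for ch in command):
        return False
    try:
        argv = shlex.split(command)
    except ValueError:
        return False  # unbalanced quotes
    return bool(argv) and argv[0] in allowed


for candidate in ["ls -la", "date", "rm -rf build", "ls; rm -rf build"]:
    verdict = "run" if is_safe(candidate) else "reject"
    print(f"{candidate!r}: {verdict}")
```

When a command passes, running it with subprocess.run(shlex.split(command)) instead of piping it to sh keeps the shell from reinterpreting it.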

Accessibility

For developers with RSI, carpal tunnel, or other conditions that make typing painful, dictare provides a way to keep working. Voice for the agent interactions, keyboard for the precise edits. Or full voice with keyboard mode for everything.

Meeting notes

dictare transcribe --auto-submit | ts '[%H:%M]' | tee meeting.md

Run this during a meeting (with everyone's knowledge and consent). Timestamped transcription saved to a file in real time.
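
If ts is not installed, the timestamping step is easy to replace. This is a rough stand-in, not a reimplementation of ts: it skips blank lines, which ts does not, and the injectable clock exists only to make the demo reproducible.

```python
from datetime import datetime
from typing import Callable, Iterable, Iterator


def stamp(lines: Iterable[str], fmt: str = "[%H:%M]",
          now: Callable[[], datetime] = datetime.now) -> Iterator[str]:
    """Prefix each non-empty line with a timestamp taken when it is read."""
    for line in lines:
        text = line.rstrip("\n")
        if text:
            yield f"{now().strftime(fmt)} {text}"


def fixed_clock() -> datetime:
    """Fixed time so the demo output is reproducible."""
    return datetime(2026, 4, 6, 14, 30)


sample = ["Kickoff at two thirty\n", "\n", "Action item for Sam\n"]
for entry in stamp(sample, now=fixed_clock):
    print(entry, flush=True)  # flush so tee writes each line as it arrives
```

In a real pipeline, pass sys.stdin instead of the sample list and drop the now argument so each line gets the current time.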

Quick translations

dictare transcribe --auto-submit | llm "Translate to Spanish" | dictare speak

Speak in English, hear it in Spanish. Swap languages as needed.

Why this works

dictare doesn't try to be a platform. It's a Unix tool: it does one thing (voice to text), it does it well, and it plays nicely with everything else. The transcribe command outputs plain text. The speak command reads plain text. Pipes connect them to the rest of your toolkit.

That's the advantage of building on Unix conventions rather than proprietary APIs. Every tool you already use becomes voice-enabled, for free, with a single pipe.