dictare was built for AI coding agents. But the core is just a voice layer — audio capture, speech recognition, text output. That layer doesn't care whether the text goes to Claude Code or to grep.
Here's how to use dictare as a universal voice input tool.
dictare transcribe: voice as text
The simplest building block:
dictare transcribe
This captures audio, transcribes it, and prints text to stdout. One line per utterance. That's it. Standard Unix text output that pipes anywhere.
# Auto-submit mode: each utterance is a separate line, sent immediately
dictare transcribe --auto-submit
Without --auto-submit, text accumulates until you trigger submission (double-tap hotkey or voice trigger). With it, each natural pause becomes a line break.
The pipe pattern
Unix philosophy: small tools connected by pipes. dictare fits right in.
# Voice-powered search
dictare transcribe --auto-submit | xargs -I {} grep -r "{}" ./src/
# Voice to clipboard
dictare transcribe | pbcopy                       # macOS
dictare transcribe | xclip -selection clipboard   # Linux
# Voice journal
dictare transcribe --auto-submit | ts '[%Y-%m-%d %H:%M]' >> ~/journal.md
# Voice-powered git
dictare transcribe --auto-submit | while read -r line; do
    git commit -m "$line"
done
Anything that reads stdin can accept voice input. No integration needed, no API, no SDK. Just a pipe.
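Because the contract is nothing more than lines on stdout, you can dry-run any of these pipelines without a microphone. A stand-in like printf produces the same stream dictare would — the voice function below is purely illustrative:

```shell
# Stand-in for `dictare transcribe --auto-submit`: fake utterances,
# one per line, exactly the shape dictare emits
voice() { printf '%s\n' "first thought" "second thought"; }

# Any line-oriented consumer works unchanged
voice | while read -r line; do
    echo "heard: $line"
done
# prints "heard: first thought", then "heard: second thought"
```

Once the plumbing works, swap the stand-in for the real `dictare transcribe` invocation.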
Keyboard mode
Sometimes you don't want to pipe text — you want it typed into whatever application has focus. That's keyboard mode.
dictare config set output.mode keyboard
dictare transcribes your speech and simulates keyboard input. Works with any application: your browser, a chat app, a text editor, a form field. Speak and the text appears wherever your cursor is.
This is closer to what tools like Wispr Flow do, but local and free.
Custom integrations via OpenVIP
For deeper integration, any application can subscribe to dictare's OpenVIP server and receive transcriptions as structured events:
from openvip import Client

client = Client("http://localhost:8770/openvip")
for event in client.subscribe("my-app"):
    if event.type == "transcription":
        do_something(event.text)
SSE-based, so it works from any language that can make HTTP requests. No SDK required — though the Python SDK makes it easier.
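Since SSE is a line-oriented text protocol, even a shell pipeline can consume it. The field layout below is an assumption about what the stream looks like — inspect your own server's output with `curl -N` first. Here the stream is written to a file so the parsing step can be seen in isolation:

```shell
# A sample SSE stream (field layout is an assumption, not OpenVIP's
# documented format). In practice you would read from something like:
#   curl -N http://localhost:8770/openvip
cat <<'EOF' > /tmp/stream.txt
event: transcription
data: hello world

event: transcription
data: second utterance

EOF

# Keep only the payload of each `data:` field
sed -n 's/^data: //p' < /tmp/stream.txt
# prints "hello world", then "second utterance"
```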
Use cases beyond coding
Note-taking
dictare transcribe --auto-submit | tee -a notes.md | dictare speak
Speak your thoughts: they're appended to markdown and read back for verification. Hands-free note-taking with spoken confirmation.
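The tee -a in the middle is what forks the stream: each line is both appended to the file and passed down the pipe. A quick offline check of that plumbing, with printf standing in for dictare and a temp file standing in for notes.md:

```shell
notes=$(mktemp)   # stands in for notes.md

# tee -a writes the line to the file AND echoes it to stdout
printf '%s\n' "remember the milk" | tee -a "$notes"

grep -c "milk" "$notes"   # prints 1: the line was also saved
```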
Shell commands
dictare transcribe --auto-submit | llm "Convert this to a shell command, output only the command" | sh
Describe what you want in natural language; an LLM converts it to a shell command and executes it. Add 2>&1 | dictare speak at the end to hear the output.
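Piping an LLM's output straight into sh takes some trust. An easy middle ground is to capture the command and confirm it before running anything. The two stub functions below are placeholders for the real `dictare transcribe --auto-submit` and `llm` invocations:

```shell
# Placeholders for the real `dictare transcribe` and `llm` calls
transcribe_stub() { echo "show me the five largest files here"; }
llm_stub()       { echo "du -a . | sort -rn | head -5"; }

# Capture instead of executing blindly
cmd=$(transcribe_stub | llm_stub)
printf 'Run this? %s [y/N]\n' "$cmd"
# read -r answer && [ "$answer" = y ] && sh -c "$cmd"
```

The commented-out line shows where the confirmation gate would go in interactive use.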
Accessibility
For developers with RSI, carpal tunnel, or other conditions that make typing painful, dictare provides a way to keep working. Voice for the agent interactions, keyboard for the precise edits. Or full voice with keyboard mode for everything.
Meeting notes
dictare transcribe --auto-submit | ts '[%H:%M]' | tee meeting.md
Run this during a meeting (with everyone's knowledge and consent). Timestamped transcription saved to a file in real time.
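Note that `ts` comes from the moreutils package. If it isn't installed, a small shell loop produces the same timestamps:

```shell
# Equivalent of `ts '[%H:%M]'` without moreutils
stamp() {
    while read -r line; do
        printf '[%s] %s\n' "$(date +%H:%M)" "$line"
    done
}

# Stand-in input; pipe from `dictare transcribe --auto-submit` in real use
printf '%s\n' "action item: send the recap" | stamp
```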
Quick translations
dictare transcribe --auto-submit | llm "Translate to Spanish" | dictare speak
Speak in English, hear it in Spanish. Swap languages as needed.
Why this works
dictare doesn't try to be a platform. It's a Unix tool: it does one thing (voice to text), it does it well, and it plays nicely with everything else. The transcribe command outputs plain text. The speak command reads plain text. Pipes connect them to the rest of your toolkit.
That's the advantage of building on Unix conventions rather than proprietary APIs. Every tool you already use becomes voice-enabled, for free, with a single pipe.
