Voice Assistant: Installation & Usage

This guide covers how to get Voice Assistant running on your hardware and explains the difference between PROD and DEV modes, so you can choose the right setup for your use case.

Voice Assistant ships as a self-contained binary. Download it here: link

Once downloaded, unpack the archive.

Running Voice Assistant

The binary runs like any standard executable — no dependencies, no package manager, no install step.

Example for Raspberry Pi:

wget -c -nd http://192.168.178.240:3000/voice-control/prod-voice-assistant/releases/download/v1.0.9/voice-assistant-linux_aarch64.zip
unzip voice-assistant-linux_aarch64.zip
./voice-assistant-linux_aarch64

CLI Options

# ./voice-assistant-linux_aarch64 --help
Usage:
  voice-assistant-linux_aarch64 [--daemon|--foreground|--debug|--status|--stop|--help]

Options:
  --daemon      Start in background and suppress console output
  --foreground  Run in current console (debug mode)
  --debug       Enable debug mode (debug logs, debug files, debug pages)
  --status      Show whether background process is running
  --stop        Stop background process started by --daemon
  --help, -h    Show this help
  /?, -?        Help aliases (quote '/?' in zsh)

Two Operating Modes: DEV and PROD

PROD Mode

PROD is the default operating mode and gives you access to the full feature set:

Connection to a stable, fully-featured model
Web search integration
Unlimited concurrent connections
Compressed traffic between Voice Assistant and the model
Simple, predictable billing (per-second pricing)
Server-side AI VAD
Multi-stage audio processing pipeline (NS + VAD) — significantly reduces false triggers
No need to track model version changes
Your data is never used for model training

In PROD mode, Voice Assistant always connects to the latest stable model version; version tracking and naming changes are handled on the service side. Billing is normalized into per-second metrics, so you don't need a Google Cloud corporate account and don't have to wire up service-level cost tracking yourself.

This mode is enabled by default. On startup, Voice Assistant checks for environment variables and a .env file. If MQTT credentials are present, they'll be used automatically.

Example .env with MQTT configured:

MQTT_SERVER=192.168.1.200
MQTT_PORT=1883
MQTT_CLIENT_NAME=voice-assist
MQTT_USERNAME=your_account_name
MQTT_PASSWORD=your_account_password
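
Before starting the binary, it can help to confirm that the .env file actually contains the MQTT entries shown above. The following is a minimal sketch, not part of Voice Assistant: check_mqtt_env is a hypothetical helper, and it assumes plain KEY=value lines without quoting or "export" prefixes. This guide does not state which of the five keys are mandatory, so the check simply compares the file against the example.

```shell
#!/bin/sh
# check_mqtt_env is a hypothetical helper, not part of Voice Assistant.
# It reports which MQTT keys from the example .env are missing or empty.
check_mqtt_env() {
    env_file="${1:-.env}"
    missing=""
    for key in MQTT_SERVER MQTT_PORT MQTT_CLIENT_NAME MQTT_USERNAME MQTT_PASSWORD; do
        # A key counts as set when a line starts with KEY= followed by at least one character.
        if ! grep -q "^${key}=." "$env_file" 2>/dev/null; then
            missing="$missing $key"
        fi
    done
    if [ -n "$missing" ]; then
        echo "missing:$missing"
        return 1
    fi
    echo "ok"
}
```

Calling check_mqtt_env with no argument inspects .env in the current directory; pass a path to check a different file. Variables set directly in the environment are not covered by this check.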

DEV Mode

DEV mode is for debugging individual devices without burning tokens from your account. In this mode, Voice Assistant connects directly to the model via Google AI Studio. Your function set defined in the dashboard remains editable.

Known limitations in DEV mode:

Requires your own AI Studio API key (configured in settings)
You're responsible for tracking model name changes yourself
The models themselves can be unstable — unexpected disconnects, no web search, incorrect function call behavior, etc.
Per Google's policy, your data may be used for model training
Higher bandwidth usage — model connection requires 800 Kbps or more
Heavy usage may incur additional charges

DEV mode is intended purely for device setup and integration testing — it lets you iterate on hardware without touching your token budget.

To activate DEV mode, add the following to your .env:

# DEV mode
ASSISTANT_MODE=DEV
DEV_MODEL_NAME=gemini-2.5-flash-native-audio-preview-09-2025
STUDIO_API_KEY=<your API key from AI Studio>

To get an API key, open the "Get API key" menu in AI Studio and generate one. Full instructions: AI Studio instructions

For the correct Live API model name, refer to the official guide.
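
A quick way to see which mode a given .env file requests, and whether the DEV settings are complete, is a small check like the one below. It is a sketch under stated assumptions: env_value and check_mode are hypothetical helpers, the file is assumed to hold plain KEY=value lines, and anything other than ASSISTANT_MODE=DEV is treated as PROD because PROD is described as the default. How Voice Assistant itself parses the file is not documented here.

```shell
#!/bin/sh
# env_value and check_mode are hypothetical helpers, not part of Voice Assistant.

# Print the value of KEY from a .env file.
# Assumption: if a key appears more than once, the last occurrence is used.
env_value() {
    sed -n "s/^$1=//p" "$2" 2>/dev/null | tail -n 1
}

# Print the mode the file asks for and flag missing DEV settings.
check_mode() {
    env_file="${1:-.env}"
    mode="$(env_value ASSISTANT_MODE "$env_file")"
    # Assumption: any value other than DEV (including no value) means PROD.
    if [ "$mode" != "DEV" ]; then
        echo "PROD"
        return 0
    fi
    for key in DEV_MODEL_NAME STUDIO_API_KEY; do
        if [ -z "$(env_value "$key" "$env_file")" ]; then
            echo "DEV: $key is not set"
            return 1
        fi
    done
    echo "DEV"
}
```

The function only inspects the file. It does not verify that the model name or API key are valid.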

Heads up: Always restart Voice Assistant after changing the mode or any configuration values.
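
If Voice Assistant was started with --daemon, a restart can be scripted with the flags listed in the help output. This is a minimal sketch: VA_BIN and va_restart are hypothetical names, the binary is assumed to sit in the current directory, and the exit codes of --stop and --status are not documented in this guide, so the script does not rely on them.

```shell
#!/bin/sh
# VA_BIN is a hypothetical variable; point it at your unpacked binary.
VA_BIN="${VA_BIN:-./voice-assistant-linux_aarch64}"

# va_restart is a hypothetical helper that uses only flags shown in --help.
va_restart() {
    "$VA_BIN" --stop      # stop the background process started by --daemon
    "$VA_BIN" --daemon    # start again in the background so the new .env values are read
    "$VA_BIN" --status    # show whether the background process is running
}
```

Run va_restart after editing .env. For a process started with --foreground, stop it in its console and start it again instead.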