Voice bridge

Programmable Voice Interface for Devices

A lightweight voice terminal built around the ESP32-S3 that connects to live AI models and executes functions locally.

Instead of trying to squeeze a full assistant into a microcontroller, this project separates the system into clear layers:

ESP32 voice frontend: captures audio, runs the device UI, and detects the wake word.
Local assistant agent: runs logic and executes functions in your environment.
AI model integration: adds reasoning, speech understanding, and real-time responses.

This keeps the hardware simple while still enabling powerful voice control over scripts, devices, and automation systems.

Connect an ESP32-S3 to a live AI model and let it:

answer questions
search the web
control devices
run automations
execute server commands
play audio
generate TTS with voice and emotion control
integrate with external tools over MCP (stdio and Streamable HTTP transports)

Up to 90% bandwidth reduction via compression
AI noise cancellation
Wake word support for microcontrollers
MCP server and client
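
To put the bandwidth figure in perspective, here is a small worked calculation. The capture format (16 kHz, 16-bit, mono PCM) is an assumption for illustration; the project description does not state the actual sample rate or codec, so treat the numbers as a rough sketch only.

```python
# Assumed capture format (not stated in the project description).
SAMPLE_RATE_HZ = 16_000   # 16 kHz speech audio
BITS_PER_SAMPLE = 16      # signed 16-bit PCM
CHANNELS = 1              # mono microphone


def pcm_bitrate_kbps(sample_rate_hz: int, bits_per_sample: int, channels: int) -> float:
    """Bitrate of uncompressed PCM audio in kilobits per second."""
    return sample_rate_hz * bits_per_sample * channels / 1000


def compressed_bitrate_kbps(raw_kbps: float, reduction: float) -> float:
    """Bitrate left after removing the given fraction (0.0 to 1.0) of the raw stream."""
    return raw_kbps * (1.0 - reduction)


raw = pcm_bitrate_kbps(SAMPLE_RATE_HZ, BITS_PER_SAMPLE, CHANNELS)
best_case = compressed_bitrate_kbps(raw, 0.90)
print(f"raw PCM: {raw:.1f} kbit/s, after 90% reduction: {best_case:.1f} kbit/s")
```

Under these assumptions the raw stream is 256 kbit/s, so the "up to 90%" best case corresponds to roughly 25.6 kbit/s on the wire.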

MCP support (client and server)

MCP support works in both directions. As a client, the assistant integrates with external systems; as a server, it exposes its own capabilities, such as TTS and voice control, to other MCP clients.
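
As a rough illustration of what travels over an MCP connection, the sketch below builds a `tools/call` request. MCP messages are JSON-RPC 2.0, and the stdio transport delimits them with newlines. The tool name `speak` and its arguments are hypothetical and not taken from this project; the actual tool names exposed by the server may differ.

```python
import json


def make_tool_call(request_id: int, tool_name: str, arguments: dict) -> str:
    """Serialize an MCP `tools/call` request as one line of JSON-RPC 2.0."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }
    # The stdio transport frames messages with newlines, so the JSON itself
    # must stay on a single line (json.dumps does this by default).
    return json.dumps(message) + "\n"


# "speak" and its arguments are hypothetical, not names from this project.
line = make_tool_call(1, "speak", {"text": "Door is open", "emotion": "calm"})
print(line, end="")
```

A client would write this line to the server's standard input and read the matching response, identified by the same `id`, from its standard output.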


System Architecture

The system is intentionally split into independent components so each layer can focus on a specific task.


Voice -> Function Execution

Instead of being limited to predefined commands, the assistant uses AI models to determine which function should be executed.

Example flow:

User: "Turn on the kitchen light"
Model decides: call turn_on_light(location="kitchen")
Assistant executes locally: MQTT publish -> kitchen_light -> ON
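
The flow above can be sketched as a small dispatcher. This is a minimal illustration, not the project's implementation: the shape of the `call` dictionary, the `FUNCTIONS` registry, and the `mqtt_publish` stand-in (which records messages instead of contacting a broker) are all assumptions.

```python
from typing import Any, Callable, Dict, List, Tuple

# Stand-in for an MQTT client: records (topic, payload) pairs instead of
# talking to a broker, so the sketch runs without any network setup.
published: List[Tuple[str, str]] = []


def mqtt_publish(topic: str, payload: str) -> None:
    published.append((topic, payload))


def turn_on_light(location: str) -> str:
    mqtt_publish(f"{location}_light", "ON")
    return f"{location} light is on"


# Registry of functions the model is allowed to call, keyed by name.
FUNCTIONS: Dict[str, Callable[..., Any]] = {"turn_on_light": turn_on_light}


def dispatch(call: Dict[str, Any]) -> Any:
    """Run the function named in a model-produced call, or raise KeyError."""
    name = call["name"]
    if name not in FUNCTIONS:
        raise KeyError(f"unknown function: {name}")
    return FUNCTIONS[name](**call.get("arguments", {}))


# The shape of `call` is an assumption about what the model returns.
result = dispatch({"name": "turn_on_light", "arguments": {"location": "kitchen"}})
print(result, published)
```

Because the registry is an ordinary dictionary on the assistant side, adding or replacing an entry changes what the voice terminal can do without touching the ESP32 firmware.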

Functions can control almost anything:

smart home devices
scripts and CLI tools
webhooks
local automation systems
GPIO hardware

Voice becomes a universal control interface. Modifying functions in the dashboard requires no firmware update; changes take effect immediately.


Hardware

The hardware is intentionally minimal and easy to build.

The repository includes:

schematics
PCB design
ready-to-use Gerber files
firmware
cross-platform scripts
a text-based UI (TUI)
a list of wake word models to choose from

This allows anyone to assemble a voice terminal with standard components.

A minimal setup takes about 15-20 minutes once your hardware is assembled.

Flash the ESP32 firmware:

git clone https://github.com/visorbarnis/voice-assistant
cd voice-assistant
./configure_settings.sh
./run_upload.sh

GitHub repository: quick setup and instructions

Example Use Cases

The system is designed as a flexible voice interface rather than a fixed assistant.

Smart Home: Control lights, devices, and automation systems through MQTT or webhooks.
Developer Tools: Trigger builds, run scripts, or deploy projects by voice.
Custom Hardware: Add voice control to robotics, lab equipment, or electronics projects.
Automation Systems: Integrate with workflow engines like n8n or other automation platforms.
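
For the webhook-based cases, a function can be as small as one HTTP POST. The sketch below builds such a request without sending it. The endpoint URL, the event name, and the payload fields are hypothetical placeholders, not values defined by this project or by any particular workflow engine.

```python
import json
import urllib.request


def build_webhook_request(url: str, event: str, data: dict) -> urllib.request.Request:
    """Build (but do not send) a JSON POST request for a webhook endpoint."""
    body = json.dumps({"event": event, "data": data}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical endpoint and payload; replace with your workflow's webhook URL.
req = build_webhook_request(
    "http://localhost:5678/webhook/voice-command",
    "deploy_project",
    {"project": "demo"},
)
print(req.get_method(), req.full_url)
# Sending is a single call: urllib.request.urlopen(req, timeout=5)
```

Keeping request construction separate from sending makes the function easy to check locally before pointing it at a live automation endpoint.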

Demo

Turn Your Controller into a Voice Interface

Connect devices, execute commands, and integrate external services through one AI layer.
