
Security and Privacy

Privacy isn't an afterthought here; it's baked into the core. Leaving a home assistant open to eavesdropping is simply not an option, which is why the entire voice streaming pipeline is encrypted in transit. All traffic between your local controller and the model runs over a secure WSS (TLS) channel, authenticated with a unique device key that doubles as your hardware identifier. From your user dashboard, you can instantly revoke or suspend any key (and, by extension, the device itself).
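
The exact handshake isn't documented here, so the following is only a conceptual sketch of device-key authentication. The function names and the HMAC-over-nonce scheme are illustrative assumptions, not the actual protocol; the point is that a revoked key simply vanishes from the registry, so every later handshake fails:

```python
import hashlib
import hmac
import secrets

def sign_handshake(device_key: bytes, nonce: bytes) -> str:
    """Device side: sign a server-issued nonce with the unique device key.
    (Hypothetical scheme -- HMAC-SHA256 over the nonce.)"""
    return hmac.new(device_key, nonce, hashlib.sha256).hexdigest()

def verify_handshake(device_key: bytes, nonce: bytes, signature: str) -> bool:
    """Proxy side: look up the key in the dashboard registry, recompute
    the HMAC, and compare in constant time."""
    expected = sign_handshake(device_key, nonce)
    return hmac.compare_digest(expected, signature)

# Demo: a valid key verifies; any other key (e.g. a revoked one replaced
# in the registry) does not.
device_key = secrets.token_bytes(32)
nonce = secrets.token_bytes(16)
sig = sign_handshake(device_key, nonce)
assert verify_handshake(device_key, nonce, sig)
assert not verify_handshake(secrets.token_bytes(32), nonce, sig)
```

Because the key also serves as the hardware identifier, suspending it from the dashboard cuts off the device itself, not just one session.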

The local voice-assistant agent operates in a strict "privacy-first" sandbox. It doesn't snoop around your local network or harvest metadata. Outbound traffic is locked down to exactly two endpoints (voice-assistant.io and pr.voice-assistant.io). This makes it trivially easy to isolate the device behind a strict firewall and whitelist only what's absolutely necessary.
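
As a rough illustration of that firewall setup, the sketch below emits iptables rules that allow the device to reach only the two documented endpoints over TLS and drop everything else. The device IP is a placeholder, port 443 is an assumption, and note that iptables resolves hostnames once, at rule-insert time:

```python
DEVICE_IP = "192.168.1.50"  # placeholder; substitute your device's address
ALLOWED_HOSTS = ["voice-assistant.io", "pr.voice-assistant.io"]

def allowlist_rules(hosts, device_ip=DEVICE_IP, port=443):
    """Build an allowlist: ACCEPT TLS to each endpoint, then DROP the rest.
    iptables resolves each hostname to an IP when the rule is inserted."""
    rules = [
        f"iptables -A FORWARD -s {device_ip} -p tcp --dport {port} "
        f"-d {host} -j ACCEPT"
        for host in hosts
    ]
    # Final catch-all: anything else the device tries to send is dropped.
    rules.append(f"iptables -A FORWARD -s {device_ip} -j DROP")
    return rules

for rule in allowlist_rules(ALLOWED_HOSTS):
    print(rule)
```

If your router uses nftables or a vendor UI instead, the same shape applies: two narrow ACCEPT rules followed by a default DROP for the device's address.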



The Elephant in the Room: Proxy Dependency

It’s no secret that this is a sore subject for the self-hosted crowd. Yes, there is a reliance on a central proxy, but realistically it’s no different from relying on the Google Live API directly. We live in an AI-driven world, and right now the heavy lifting still happens in the cloud.


Under the Hood: What the proxy actually does

  • Terminates WSS connections: Handles the secure WebSocket feed from your hardware.

  • Validates auth: Checks your device key against the dashboard registry.

  • Decompresses audio: This is the proxy's primary reason for existing. The device compresses its stream down to a lean 70–80 kbps to save bandwidth, and the proxy unpacks it before forwarding it to the model.

  • Routes payloads: Proxies the audio stream alongside service messages and function calls. Crucially, your actual command payloads aren't sent to the model—only the function names, parameters, and descriptions make the round trip.

  • Tracks tokens: Keeps a real-time tally to ensure you don't blow past your budget limits.

  • Runs local VAD: Uses a lightweight Microsoft AI model for Voice Activity Detection before hitting the LLM. Dropping dead silence and background mic noise saves a massive amount of tokens and drastically cuts your running costs.

  • Resamples audio: The LLM expects 16 kHz input but spits out 24 kHz output. The proxy normalizes both streams to 16 kHz, taking the processing load off your local microcontroller.

  • Masks your identity: Routes all requests through a single master account, acting as a privacy shield between you and Google.
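
The VAD and resampling steps above can be sketched roughly. The real proxy uses a neural VAD model and a proper resampler; the simple energy gate and linear interpolation below are stand-ins that only illustrate the two ideas (drop silent frames before they cost tokens, and bring the model's 24 kHz output back to 16 kHz):

```python
import array

def rms(frame) -> float:
    """Root-mean-square energy of a frame of 16-bit PCM samples."""
    return (sum(s * s for s in frame) / len(frame)) ** 0.5

def is_speech(frame, threshold=500.0) -> bool:
    """Crude energy gate standing in for the neural VAD: frames below
    the threshold are treated as silence and never reach the LLM."""
    return rms(frame) >= threshold

def resample_24k_to_16k(samples):
    """Downsample 24 kHz PCM to 16 kHz (ratio 2/3) by linear
    interpolation -- a simplified stand-in for the proxy's resampler."""
    n_out = len(samples) * 2 // 3
    out = array.array("h")
    for i in range(n_out):
        pos = i * 1.5  # 24/16 = 1.5 input samples per output sample
        j = int(pos)
        frac = pos - j
        a = samples[j]
        b = samples[j + 1] if j + 1 < len(samples) else a
        out.append(int(a + (b - a) * frac))
    return out
```

In the real pipeline both steps run on the proxy precisely so the microcontroller never has to afford this processing.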


What the proxy DOESN'T do

  • No logging: It does not record or transcribe your voice prompts or AI responses. This isn't implemented and never will be.

  • No third-party sharing: Your data stays yours.

  • No tracking: Zero sneaky fingerprinting, telemetry harvesting, or "phoning home."

  • No payment data storage: We don't ask for your phone number or hoard credit card info. Billing is handled securely off-site via Stripe.

  • No surprise charges: We don't auto-charge your card directly. You manage a prepaid balance. While slightly more manual, it guarantees absolute peace of mind: we hold zero billing data to compromise. (You can, however, check a box to let your subscription auto-renew directly from your prepaid funds.)


Frequently Asked Questions

Is it possible to wipe all traces of my activity from your system?
Yes. Delete your account, and your data goes with it. All of it. No soft-deletes.

Can I run this without going through the proxy?
Yep. DEV mode lets you connect directly to the model's API. For now you still use the dashboard editor to tweak your functionSet, but I’m actively working on a way to let you host your functionSet locally and cut the cord from the proxy entirely. Keep in mind: going proxy-free means losing audio compression, pre-model VAD, and token protection. But for local debugging, it’s perfectly viable.

Can I get access to the voice-assistant source code?
That’s a tough one, and I don’t have a definitive answer just yet. I haven't made a final call on the licensing model, because I want to protect the project from low-effort cloning right out of the gate. That said, I’m very open to collaborating with serious contributors; feel free to reach out.