PixVoice
Open studio
The sonic atelier

Refining voice
for the social era.

Curate the perfect tone. Refine the pacing. Bring your text to life — then export it as a TikTok-ready social video. All on-device. No account. No catch.

Built for laptop + Wi-Fi · ~700MB first-run download, cached for every later visit
Chrome · Edge · Firefox · WebGPU

Meet the Voices

28 distinct personas running entirely on your hardware. Ideal for ASMR, podcasts, and cinematic narration.

American Female

Heart

Breathy intimacy. The voice that sounds like it knows you. Perfect for ASMR, meditation, and storytelling.

British Male

Lewis

Authoritative, warm, and cinematic. The classic British documentary narrator, as an AI voice.

Podcast

Puck

Upbeat and conversational. Excellent for high-energy tech podcast intros and YouTube explainer videos.

What you shape

Three dimensions.
One craft.

Tone

28 sculpted voices across American and British English plus seven more languages. Each tagged, each curated, each character-driven. Pick the one that fits the moment.

Pacing

Insert silences with [pause:500]. Adjust playback speed. Tune karaoke word-highlights to hit the beat. Rhythm is half the performance.
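
As a rough illustration of how a script with pause markers could be broken into spoken text and silences, here is a minimal TypeScript sketch. It is not PixVoice's actual parser: the Segment type and parseScript function are hypothetical names, and the sketch assumes the number in [pause:500] is a whole number of milliseconds.

```typescript
// A segment is either text to speak or a silence of a given length.
type Segment =
  | { kind: "speech"; text: string }
  | { kind: "pause"; ms: number };

// Split a script on [pause:N] markers.
// Assumption: N is a whole number of milliseconds.
function parseScript(script: string): Segment[] {
  const segments: Segment[] = [];
  const marker = /\[pause:(\d+)\]/g;
  let cursor = 0;
  let match: RegExpExecArray | null;

  while ((match = marker.exec(script)) !== null) {
    // Text between the previous marker (or the start) and this marker.
    const text = script.slice(cursor, match.index).trim();
    if (text) segments.push({ kind: "speech", text });

    segments.push({ kind: "pause", ms: Number(match[1]) });
    cursor = match.index + match[0].length;
  }

  // Whatever remains after the last marker.
  const tail = script.slice(cursor).trim();
  if (tail) segments.push({ kind: "speech", text: tail });

  return segments;
}
```

Calling parseScript("Hello. [pause:500] World.") yields three segments: speech, a 500 ms pause, then speech. A synthesizer could then render each speech segment and insert silence of the requested length between them.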

Frame

Export as a vertical video with burned-in karaoke captions in six curated styles. TikTok, Reels, Shorts — ready to upload without ever touching another editor.

Under the hood

No servers.
No invoices.

PixVoice runs Kokoro-82M for speech, with Whisper-base for word-level alignment on capable desktop devices (mobile falls back to phoneme timing). Everything stays in your browser via WebGPU. Models cache on first visit, so later sessions load in about 1.5 seconds, with no servers, no accounts, no quotas.
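
The desktop-versus-mobile split described above could be decided with a small capability check. The sketch below is one plausible way to do it, not PixVoice's confirmed logic: chooseAlignment, NavigatorLike, and the isMobile flag are hypothetical, and treating "capable" as "a WebGPU adapter is available" is an assumption. The only real browser API it relies on is navigator.gpu.requestAdapter(), which resolves to null when no adapter is available.

```typescript
// Minimal shape of the parts of `navigator` this sketch touches,
// so it can be exercised outside a browser as well.
interface NavigatorLike {
  gpu?: { requestAdapter(): Promise<unknown> };
}

type AlignmentMode = "whisper" | "phoneme";

// Pick word-level alignment (Whisper) only when WebGPU looks usable
// on a non-mobile device; otherwise fall back to phoneme timing.
// Assumption: "capable" means a WebGPU adapter can be obtained.
async function chooseAlignment(
  nav: NavigatorLike,
  isMobile: boolean
): Promise<AlignmentMode> {
  if (isMobile || !nav.gpu) return "phoneme";
  try {
    const adapter = await nav.gpu.requestAdapter();
    return adapter ? "whisper" : "phoneme";
  } catch {
    // Treat any failure to obtain an adapter as "not capable".
    return "phoneme";
  }
}
```

In a browser this would be called as chooseAlignment(navigator, isMobile), where isMobile comes from whatever device detection the page already uses.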

First-run download
  • Kokoro-82M text-to-speech ~300MB
  • Whisper-base caption alignment ~400MB
  • Total ~700MB
Every subsequent visit loads in ~1.5 seconds. Once cached, works offline.
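
The download-once, load-from-cache behaviour above follows a cache-first pattern, sketched here in TypeScript. This is an illustration under assumptions rather than PixVoice's code: ModelCache, loadModel, and the download callback are hypothetical names. In a browser, ModelCache could be backed by the Cache Storage API (caches.open, cache.match, cache.put); the sketch keeps it abstract so the logic can run anywhere.

```typescript
// Hypothetical storage interface; a browser implementation could wrap
// the Cache Storage API, a test can use an in-memory Map.
interface ModelCache {
  match(url: string): Promise<ArrayBuffer | undefined>;
  put(url: string, data: ArrayBuffer): Promise<void>;
}

interface LoadedModel {
  data: ArrayBuffer;
  fromCache: boolean;
}

// Cache-first load: return the stored copy when present,
// otherwise download once and store the result for later visits.
async function loadModel(
  url: string,
  cache: ModelCache,
  download: (url: string) => Promise<ArrayBuffer>
): Promise<LoadedModel> {
  const cached = await cache.match(url);
  if (cached !== undefined) {
    return { data: cached, fromCache: true };
  }
  const data = await download(url);
  await cache.put(url, data);
  return { data, fromCache: false };
}
```

On a first visit both model files would go through the download branch (roughly 700MB in total); on later visits the same calls resolve from the cache without touching the network, which is what makes offline use possible.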

The studio awaits.

Enter the studio