Senior Audio / ML Engineer (Local TTS / On-Device)
Fully Remote | Start: July 2026
About the Project
VoiceWunder has built a native ARA2 plugin for Pro Tools and a UXP plugin for Premiere Pro, launching soon. Our plugins are currently powered by ElevenLabs v3, and we are developing a high-performance local multi-model TTS engine aimed at professional Dialogue & ADR workflows.
We already have a production plugin with active professional users.
Your Mission
Build a local TTS / STS system aimed at matching or exceeding current top-tier cloud engines in quality and speed, running natively in our plugins.
Core Responsibilities
- Design and implement a multi-model inference router / orchestration layer
- Integrate and optimize state-of-the-art open-source TTS models
- Deliver high-quality TTS, STS, voice cloning, emotional expression, prosody control, voice design & remix
- Implement an integrated denoising pipeline
- Heavy performance optimization for Apple Silicon (MLX) and NVIDIA GPUs
- Ensure real-time performance suitable for professional DAW environments
- Fine-tune models using high-quality studio recordings
- Collaborate closely with our lead ARA2/JUCE developer
Requirements
- Strong experience with modern local TTS models and on-device inference
- Deep knowledge in model optimization, quantization, streaming and low-latency audio generation
- Experience with Apple Silicon (MLX) and/or CUDA is a strong plus
- Passion for building production-grade, expressive speech synthesis systems
Compensation
- Competitive fixed-price compensation for an 8-month project
- Milestone-based payments
What We Offer
- High-impact role: your work will be core to our Local Pro product
- Direct collaboration with experienced plugin developers and audio editors
- Fully remote & flexible schedule
How to Apply
If this sounds like a good fit, please send your resume, a short cover letter, and a portfolio or examples of relevant work to: info@voicewunder.ai. Please use the subject line: “Senior Audio / ML Engineer Application”
