The open-source platform for teams who want to collect speech data across languages, validate contributions from the community, and export production-ready voice datasets.
Record, validate, and export high-quality voice data in any language with a platform built for researchers and teams.
Speak sentences aloud and contribute recordings to language datasets in your native tongue.
Review community recordings and confirm transcription accuracy to maintain dataset quality.
Submit new sentences for contributors to record, expanding the prompt library for any language.
Create and manage targeted voice data collection campaigns with custom goals and language settings.
Track recordings, validation rates, and contributor statistics with real-time dashboard insights.
Collect voice data across dozens of languages and dialects from communities around the world.
Three simple steps to build a production-ready, community-validated voice dataset in any language.
Define your target language, sentence prompts, and validation criteria to launch a voice data collection campaign.
Contributors from around the world read and record sentences in their natural voice directly in the browser.
Review submissions for quality, validate transcriptions, and export clean, labeled datasets ready for training.
Researchers, engineers, and linguists around the world trust OpenVoice to build high-quality voice datasets.
Everything you need to know about collecting, validating, and exporting voice datasets with OpenVoice.
OpenVoice is an open-source platform for collecting, validating, and exporting voice datasets. It is built for NLP researchers, speech AI teams, linguists, and community organizations who need high-quality, labeled audio data in any language.