Mozilla is expanding its crowdsourced Common Voice project — an initiative that’s setting out to create an open source voice-recognition dataset — to include more languages.
The tech organization first announced Common Voice last June, inviting volunteers from around the world to record snippets of text with their voice through web and mobile apps.

The project serves as a sort of antithesis to the growing arsenal of proprietary voice recognition technologies being developed by the likes of Amazon, Google, Apple, and Microsoft. The aforementioned juggernauts are investing heavily in their voice-activated digital assistants Alexa, Google Assistant, Siri, and Cortana, but the respective datasets are owned by the companies themselves.
Mozilla released the first fruits of Common Voice, an English dataset, back in November: a collection of some 500 hours of speech comprising 400,000 recordings from 20,000 individuals. Today, Mozilla officially kick-starts the process of collecting voice data for three more languages: French, German, and, a little randomly, Welsh. Another 40 tongues are currently being prepped for the data collection process, with the likes of Brazilian Portuguese, Chinese (Taiwan), Indonesian, Polish, and Dutch already halfway toward being ready to start crowdsourcing voice data.
Next big platform
It has been apparent for a number of years that voice will be the next big platform in technology. Just yesterday, Amazon officially launched its new camera-infused, Alexa-powered Echo Look smart speaker that tells you what clothes you should wear. We are still very much in the early days of this movement, but it’s clear that voice will only become more pervasive.
It’s against this backdrop that Mozilla is pushing ahead with plans to create an open source dataset that can be freely used by anyone to build voice-recognition smarts into all manner of applications and services.
“We believe these interfaces shouldn’t be controlled by a few companies as gatekeepers to voice-enabled services, and we want users to be understood consistently, in their own languages and accents,” said Mozilla’s chief innovation officer, Katharina Borchert, in a blog post.
The Common Voice project serves a purpose similar to that of other open-license projects that have emerged to counter privately owned platforms. OpenStreetMap is a good example of a similarly crowdsourced project that gives developers open and freely usable maps of the world, without the costs or restrictions of rival services such as Google Maps.
English may be the lingua franca of the internet in many regards, but most people speak a different language as their native tongue. And with the voice-recognition AI revolution gaining steam, multilingual datasets that developers and technologists can use to train machine-learning models can only be a good thing.
“Going multilingual marks a big step for Common Voice, and we hope that it’s also a big step for speech technology in general,” added Michael Henretty, digital strategist for Mozilla’s Common Voice project. “Democratizing voice technology will not only lower the barrier for global innovation, but also the barrier for access to information.”