Baidu open-sources its WARP-CTC artificial intelligence software

Chinese Web company Baidu announced today that it is releasing key artificial intelligence (AI) software under an open-source Apache license. The WARP-CTC C library and optional Torch bindings are now available on GitHub from Baidu Research’s Silicon Valley AI Lab (SVAIL).

The connectionist temporal classification (CTC) approach dates back to 2006, when it was documented in a paper from the Swiss AI lab IDSIA. Baidu Research developed WARP-CTC on top of that technology in order to improve its own speech recognition capability.

“We found that currently available implementations of CTC generally required significantly more memory and/or were tens to hundreds of times slower,” the Baidu Research team wrote in a blog post on the news.

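The CTC approach is used to train recurrent neural networks (RNNs), an increasingly common component in a type of AI called deep learning, on sequence data such as speech, where the alignment between input and output is not known in advance. Recurrent nets have been shown to work well even in noisy environments.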
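For readers curious about what a CTC implementation actually computes, here is a minimal sketch in plain Python. It is not Baidu's code and does not use the WARP-CTC API; the function name `ctc_neg_log_likelihood` and its parameters are hypothetical, chosen for illustration. It assumes the network has already produced a probability distribution over the alphabet (plus a "blank" symbol) at each time step, and it sums the probability of every alignment that collapses to the target labels using the standard forward recursion. A production implementation would work in log space for numerical stability and run in parallel, which this sketch does not attempt.

```python
import math

def ctc_neg_log_likelihood(probs, labels, blank=0):
    """Return -log P(labels | probs) under CTC (illustrative sketch).

    probs  -- list of T rows; row t is a probability distribution
              over the alphabet, including the blank symbol
    labels -- target label sequence, without blanks
    """
    # Interleave blanks: [a, b] becomes [blank, a, blank, b, blank].
    ext = [blank]
    for label in labels:
        ext += [label, blank]
    S, T = len(ext), len(probs)

    # alpha[s] is the total probability of all partial alignments
    # that end at position s of the extended sequence.
    alpha = [0.0] * S
    alpha[0] = probs[0][ext[0]]
    if S > 1:
        alpha[1] = probs[0][ext[1]]

    for t in range(1, T):
        new = [0.0] * S
        for s in range(S):
            total = alpha[s]                  # stay on the same symbol
            if s >= 1:
                total += alpha[s - 1]         # advance by one position
            # Skip over a blank, allowed only between different labels.
            if s >= 2 and ext[s] != blank and ext[s] != ext[s - 2]:
                total += alpha[s - 2]
            new[s] = total * probs[t][ext[s]]
        alpha = new

    # A valid alignment ends on the last label or the trailing blank.
    p = alpha[S - 1] + (alpha[S - 2] if S > 1 else 0.0)
    return math.inf if p == 0.0 else -math.log(p)
```

With two time steps, a two-symbol alphabet (blank and "a") and uniform probabilities, three alignments collapse to "a" (a-blank, blank-a, and a-a), each with probability 0.25, so the function returns -log(0.75). The nested loop over time steps and sequence positions is the part that dominates training time for long utterances.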

Andrew Ng, Baidu Research’s chief scientist, is noted for his research on artificial neural networks running on top of graphics processing units (GPUs), and indeed WARP-CTC works on top of GPUs and x86 CPUs alike.

Facebook, Google, and Microsoft, among others, have open-sourced their AI software as well. Recently Facebook went so far as to share its AI server hardware designs with the public. Today’s move marks a big step forward in Baidu’s knowledge sharing beyond academic papers.

“A lot of open source software for deep learning exists, but previous code for training end-to-end networks for sequences (like our Deep Speech engine) has been too slow,” Baidu wrote. “We want to start contributing to the machine learning community by sharing an important piece of code that we created.”
