Facebook’s latest deep learning tech can quickly interpret text and video

A demo of AI answering questions about a "cribbed" version of "Lord of the Rings" at Facebook's F8 conference in San Francisco on March 26.
Image Credit: Screenshot

Facebook chief technology officer Mike Schroepfer today showed off the newest capabilities of the company’s artificial intelligence systems. Beyond mining images, they can recognize actions such as sports in video and answer questions about text.

“You can really get these systems to understand deep, minute differences,” Schroepfer said today at Facebook’s developer-oriented F8 conference in San Francisco.

Like Twitter, Baidu, Google, and other companies, Facebook has been adding artificial-intelligence talent and developing systems for a type of AI called deep learning. The latest innovations from Facebook fall squarely in that category.

The advances in question answering were documented in an academic paper called “Memory Networks.” Facebook first talked about the research publicly in November.


Regarding video, Schroepfer demonstrated how Facebook can now identify hundreds of sports as they happen, even when they look similar. It can tell whether a video shows figure skating or roller skating, or sledge hockey or roller hockey.

The video work follows advancements from a startup called Clarifai, which earlier this year expanded deep learning systems beyond image recognition. Now the startup’s technology can pick up on objects that appear in videos.

You can find clips from Schroepfer’s presentation here and here.