Researchers from Samsung’s AI Center in Cambridge, England and Imperial College London have created an end-to-end generative adversarial network (GAN) that animates a 2D talking head image, syncing its facial movement to an audio clip of human speech.
In addition to syncing lip movement, the face synthesis model also sprinkles in eyebrow movement and blinks to make the depictions it generates seem more natural. Syncing lips with audio today is often done during the post-editing process or through the use of computer graphics.
Researchers believe the model could be used to automatically create talking heads for characters in animated movies, fill in the blanks when video frames are dropped during low-bandwidth video calls, or provide better lip sync or dubbing of films in foreign languages. The tech could also be used for manipulative fakes.
In examples shared ahead of the research on YouTube, lead researcher Konstantinos Vougioukas depicts the dead Russian mystic Rasputin singing Beyoncé’s “Halo,” rappers 2Pac and Biggie singing their work, and Albert Einstein reciting a quote about the common language of science. More examples, a paper explaining the model, and code can be found on this website.
The news comes a month after Samsung’s AI Center in Moscow introduced AI for animating 2D still images without 3D modeling, tech that could be used to make more convincing digital avatars or deepfakes.
A number of GANs for manipulating digital media have been introduced in recent weeks as AI researchers converged on Long Beach, California for the ICML and CVPR conferences. Models discussed at CVPR include Nvidia’s GauGAN, which paints realistic but fake landscapes, and CollaGAN, a method devised by South Korean researchers to replace missing data in images.
In other recent examples, Facebook introduced MelNet, a generative model that can imitate music and voices such as Bill Gates’, and a fake video of CEO Mark Zuckerberg was posted on Instagram. A honeypot spying scheme aimed at attracting U.S. State Department employees was also discovered on LinkedIn last week.
Other GANs made in recent months include models to supply synthetic data for enterprise customers or create things like memes or African masks.
Concerned about what’s to come in 2020, the United States Congress heard testimony from deepfake experts last week. Legal and technical experts such as OpenAI policy director Jack Clark shared their concerns about the future of democracy and the marketplace of ideas, as well as the fact that the number of GANs being created far outnumbers the number of systems being built to detect them.