
ProBeat: We can’t get over how human Google Duplex sounds

Watch the above video. Then watch it again, but close your eyes. Listen carefully to the voice making a restaurant reservation.

Duplex — Google’s artificially intelligent chat agent that can arrange appointments over the phone — has started rolling out to a “small group” of Google Pixel phone owners in select cities (Atlanta, New York City, Phoenix, and San Francisco). For now, the feature only works in English, with some restaurants, and can’t handle any other businesses that take appointments.

As news of the feature's slow rollout has spread, a lot of debate has focused on whether it's worth the effort. As many have pointed out, it seems faster to just call the restaurant yourself than to enter everything Google Assistant requires and then wait for a confirmation. There are plenty of scenarios where this is useful, though: if you have a speech impediment, if phone calls give you social anxiety, if you're somewhere you can't place a call, if the restaurant is closed when you want to make the reservation, and so on.

I want to focus on the other hotly discussed part of the news: the Google Duplex voice. Many can’t get over just how humanlike it sounds, although I’ve watched the video so many times that I’ve convinced myself it doesn’t sound human.



Disclosure and transparency

In this Duplex ad from earlier this year, here is how the voice introduced itself:

Hi! I’m the Google Assistant calling to make a reservation for a client. This automated call will be recorded.

In the call we recorded, the wording has changed slightly, removing the part that makes it crystal clear this is not a human calling:

Hi, I’m calling to make a reservation for a client. I’m calling from Google, so the call may be recorded.

I’m sure Google is still iterating here — the wording will likely change a few more times. The team could in fact be A/B testing multiple versions.

Update on November 26: At this stage, some Duplex calls are made using Google’s automated system and others are conducted using human operators. The former announce that they are “automated” up-front (see the Google ad) while the human operators don’t say that (see our video) — they simply identify that they’re calling from Google and that the call will be recorded. Google says the majority of Duplex calls via Google Assistant are automated, meaning a bot is speaking. The call we recorded, however, was part of the manual baseline, meaning a human was speaking the whole time.

Some sort of disclosure is present either way because Google received a ton of criticism after its initial Duplex demo in May — many were not amused that Google Assistant mimicked a human so well. In June, the company promised that Google Assistant using Duplex would first introduce itself.

Too human

What Duplex actually says sounds extremely believable — especially the multiple thank-yous and the “ba-bye” at the end. But the pauses seem a little too long, especially at the very beginning and at the end. Getting a conversational AI’s voice to not sound robotic makes sense — it’s simply more pleasant and comfortable to talk to. But having it perfectly replicate what a human would do? Or having a human sometimes make the calls while other times it’s a bot?

This is worse than a double-edged sword. If Duplex gets things wrong and screws up the conversation, it makes Google look bad. If Duplex tries too hard to act human, it comes off as creepy and … makes Google look bad. If Duplex sometimes isn’t even Duplex, but a human, that makes it all the more confusing.

Google needs to strike a perfect balance: accurate and intelligent, but also transparent and honest.

While Duplex is a user-facing feature, currently exclusive to Pixel phones, it is ultimately businesses that interface with the conversational AI. That’s the part it can’t screw up. Google has to tread lightly on that tightrope or the whole experience will come crashing down.

More videos to come

We may have recorded the first video of Duplex in action, but I suspect this is going to birth a whole genre of new content.

Duplex is going to mess up, and it will be hilarious. Duplex is going to make serious mistakes, and it will be concerning. Duplex is going to get things too right, and it will be scary.

But hey, at least the internet will document it with plenty of videos.

ProBeat is a column in which Emil rants about whatever crosses him that week.