It has been a dream of science fiction authors since the advent of computers: hands-free interfaces that can respond to our every whim — without the need to strike a single key.
That future is now closer than ever, with engineers across dozens of industries hard at work designing computers and mobile devices that can interact through simple conversation. These systems, known as natural language interfaces (NLIs), are expected to spread from talking assistants such as Siri, Alexa, Cortana, and most recently Samsung’s Bixby to a multitude of interactive apps and programs in the coming months. In many cases, they already have.
Nevertheless, it’s no secret that chatbots and NLIs have had their fair share of growing pains. Take Microsoft’s Twitter chatbot Tay, for example, which learned only too well how to mimic internet speak and within a day was spouting racist comments at Twitter users. To live up to the hype chatbots have received over the past year, hands-free human-machine interfaces will have to understand human speech with all its mistakes, pauses, and accents. Because language varies so much from speaker to speaker, bots must learn to accommodate the fluctuating tones and stresses of natural speech. While we’ve come a long way from the simple NLIs of yesterday (remember Clippy, anyone?), the mistakes in Skype’s real-time translation software illustrate just how far we still have to go.
What are bot designers doing now to build a better bot?
Hello World: The rise of digital assistants
The chatbots of the first wave of the bot craze required the user to adapt to their idiosyncrasies rather than the other way around, a limitation that marked the industry’s early efforts and alienated many first-time users. Today, however, digital assistants such as Alexa, Siri, and Cortana supplement voice input with other available data to improve speech recognition and make sense of complex speech patterns. These bots combine data mining with neural networks (mathematical systems that learn tasks by identifying patterns in vast amounts of data) to analyze and mimic human speech.
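To make that concrete, here is a minimal sketch of the pattern-matching idea: a tiny neural network that learns to map user utterances to intents from a handful of labeled examples. The data, intent labels, and library choices are illustrative only; production assistants train far larger models on enormous speech and text corpora.

```python
# Toy intent classifier: a small neural network learns to map utterances
# to intents from labeled examples. Hypothetical data for illustration.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline

training_utterances = [
    "what's the weather like tomorrow",
    "will it rain this weekend",
    "set an alarm for 7 am",
    "wake me up at six thirty",
    "play some jazz",
    "put on my workout playlist",
]
training_intents = ["weather", "weather", "alarm", "alarm", "music", "music"]

# Bag-of-words features feeding a small multi-layer perceptron.
model = make_pipeline(
    TfidfVectorizer(),
    MLPClassifier(hidden_layer_sizes=(16,), max_iter=2000, random_state=0),
)
model.fit(training_utterances, training_intents)

print(model.predict(["will it rain tomorrow"]))  # likely ['weather']
```

With only six examples the prediction is fragile, but the mechanism is the same one the big assistants rely on at scale: patterns in past data drive the interpretation of new speech.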
For example, Cortana mines its users’ emails and calendars. Soon, with the Microsoft Office 365 cloud service, Cortana should gain the ability to search files and find pertinent documents. Financial giants Wells Fargo and Visa are also working on their own voice and biometric identification systems, which will let customers transfer funds and check balances by speaking.
From here, engineers are hoping to integrate AI into different systems so bots can respond to the user’s personality. An early example of this technology is Google’s autocomplete, which fills in your search query as you type using a combination of your location, your search history, and what it knows about your interests. However, there is still a long way to go before bots really know a user’s preferences and behavior and can anticipate and suggest potential needs. Without such “learning,” a bot cannot truly engage in meaningful two-way communication. As the bot revolution continues, users will be looking for bots that are proactive and intuitive, able to learn about the individual with whom they are communicating.
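As a toy illustration of that kind of personalization (not Google’s actual algorithm), here is a sketch that ranks query completions by blending a global popularity score with a boost from the individual user’s own history. All queries, names, and weights are invented.

```python
# Toy query-completion ranker: global popularity plus a personal boost for
# queries this particular user repeats. Real systems use far richer signals
# (location, session context, learned models).
from collections import Counter

GLOBAL_POPULARITY = {
    "weather today": 0.9,
    "weather radar": 0.7,
    "web development jobs": 0.4,
    "weekend getaways": 0.5,
}

def suggest(prefix, user_history, k=3):
    history_counts = Counter(user_history)
    candidates = [q for q in GLOBAL_POPULARITY if q.startswith(prefix)]
    # Score = global popularity + a boost proportional to personal usage.
    scored = sorted(
        candidates,
        key=lambda q: GLOBAL_POPULARITY[q] + 0.5 * history_counts[q],
        reverse=True,
    )
    return scored[:k]

print(suggest("we", user_history=["web development jobs", "web development jobs"]))
# Personal history pushes "web development jobs" above globally more popular queries.
```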
Multithreading: The holy grail of bot communication
Devices or programs that are multithreaded, or able to remember multiple situations at once, may be the key to improving conversations with AI bots. Today, a user usually must finish one use case before starting another. Normal conversation doesn’t work that way: people routinely drop a topic and pick it up again later. To truly mimic natural language, a user should be able to hold multiple conversations on multiple topics with an AI bot at the same time.
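A minimal sketch of what such multithreading could look like: the bot keeps a separate context for each topic, so an interrupted conversation can be resumed without losing what was already gathered. The topics and slot names below are hypothetical.

```python
# "Multithreaded" conversation state: one context per topic, so an
# unfinished thread can be resumed later. Topics and slots are invented.
from dataclasses import dataclass, field

@dataclass
class ConversationThread:
    topic: str
    slots: dict = field(default_factory=dict)  # details gathered so far
    complete: bool = False

class MultiThreadedBot:
    def __init__(self):
        self.threads = {}        # topic -> ConversationThread
        self.active_topic = None

    def handle(self, topic, **new_slots):
        # Switch to (or create) the thread for this topic without
        # discarding any other unfinished conversation.
        thread = self.threads.setdefault(topic, ConversationThread(topic))
        thread.slots.update(new_slots)
        self.active_topic = topic
        return thread

bot = MultiThreadedBot()
bot.handle("book_flight", destination="Berlin")
bot.handle("weather", city="Berlin")                 # interrupt with a new topic
resumed = bot.handle("book_flight", date="June 5")   # resume the earlier thread
print(resumed.slots)  # {'destination': 'Berlin', 'date': 'June 5'}
```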
One current strategy for achieving a true NLI is reinforcement learning: instead of teaching bots to mimic conversation, this approach may help them actually comprehend language. Reinforcement learning is perhaps best known for its role in AlphaGo, a program developed by Alphabet subsidiary DeepMind that mastered the complex board game Go and beat one of the best human players in a highly publicized match last year. Because of the vast number of possible moves and the creativity needed to win, the game was one of the hardest challenges for artificial intelligence. While the technique is still in its infancy, it could drastically change the way bots interact with one another to solve problems, alter the way we teach bots to converse, and even allow them to participate in conversation as if they were native speakers.
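The systems DeepMind and others build are far more sophisticated, but a stripped-down, one-step version of the idea looks something like this: the bot learns from reward alone whether to ask a clarifying question or answer immediately, rather than being told which reply to imitate. The states, actions, and rewards here are invented for illustration.

```python
# Tabular, one-step reinforcement learning on a toy dialogue decision:
# the agent learns from reward alone when to clarify vs. answer directly.
import random

STATES = ["ambiguous_request", "clear_request"]
ACTIONS = ["ask_clarification", "answer_now"]
Q = {(s, a): 0.0 for s in STATES for a in ACTIONS}

def reward(state, action):
    # Answering an ambiguous request immediately tends to fail;
    # clarifying an already clear request wastes the user's time.
    if state == "ambiguous_request":
        return 1.0 if action == "ask_clarification" else -1.0
    return 1.0 if action == "answer_now" else -0.2

alpha, epsilon = 0.1, 0.2
for _ in range(5000):
    state = random.choice(STATES)
    if random.random() < epsilon:                       # explore
        action = random.choice(ACTIONS)
    else:                                               # exploit current estimate
        action = max(ACTIONS, key=lambda a: Q[(state, a)])
    # One-step update: no future state here, so the target is just the reward.
    Q[(state, action)] += alpha * (reward(state, action) - Q[(state, action)])

for state in STATES:
    print(state, "->", max(ACTIONS, key=lambda a: Q[(state, a)]))
# Expected: ambiguous_request -> ask_clarification, clear_request -> answer_now
```

The point is the learning signal: the bot is never shown the “right” reply, only how well its choices worked, which is what separates this approach from pure imitation.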
There’s a bot for that
The bot revolution is reaching saturation. These days, it seems like a new bot is announced every hour; on Facebook alone, there are over 30,000 bots, with more created every day. Most of these bots, however, are one-trick ponies: they can tell you the weather, access your calendar, or play chess, but they can’t multitask well. As we strive to make better and better bots, the answer can’t always be “there’s a(nother) bot for that.”
Instead, to streamline communication with our devices, the future will likely be one in which a few bots come out on top and act as intermediaries between us and the programs we want to activate. For example, instead of interacting with a bank bot to check your credit card balance, a travel bot to book a flight for a work trip, and a third bot to submit an expense report for reimbursement, there will be one bot that can handle the whole trip from start to finish. This means bots must become experts not only at communicating with humans but, more importantly, with each other.
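Here is a rough sketch, with invented bot names, request keys, and replies, of how a single intermediary bot might route each request to the right specialist behind the scenes.

```python
# Sketch of one "intermediary" bot coordinating specialist bots.
# Names, intents, and replies are placeholders, not a real product's API.

def bank_bot(request):
    return "Your card balance is $1,240."

def travel_bot(request):
    return f"Booked a flight to {request['destination']}."

def expense_bot(request):
    return f"Expense report filed for {request['amount']}."

class OrchestratorBot:
    """Single point of contact that routes each step to the right specialist."""
    def __init__(self):
        self.specialists = {
            "check_balance": bank_bot,
            "book_flight": travel_bot,
            "file_expense": expense_bot,
        }

    def handle(self, intent, **details):
        specialist = self.specialists.get(intent)
        if specialist is None:
            return "Sorry, I don't know how to help with that yet."
        return specialist(details)

assistant = OrchestratorBot()
print(assistant.handle("book_flight", destination="Berlin"))
print(assistant.handle("file_expense", amount="$420"))
```

The user talks to one assistant; the bot-to-bot handoffs happen out of sight, which is exactly the kind of machine-to-machine communication the paragraph above describes.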
In the end, it will likely be a combination of many different strategies that enable bots to seamlessly communicate both across programs and devices and with humans. However, one thing is sure: The bots are coming, and they’re likely to change a lot about how we interact with our world.
Claus Jepsen is the chief architect at Unit4, a leading provider of enterprise solutions empowering people in service organizations.