This is an exciting time for those of us in computer vision — we’re seeing it merge with AI to enable all kinds of new possibilities. At the LDV Vision Summit in New York a few weeks ago, I came away with five key insights about where computer vision will impact AI:
1. Smart assistants will battle it out over vision
AI needs data with which to learn and process, and as we move closer to more “human”-like AI, it will increasingly need visual data. “This is one of the reasons all the major companies are at war to own the visual data of our activities,” said LDV Capital’s Evan Nisselson. “To do that, they need to own the camera.” Amazon recently added a camera to its Alexa-powered Echo, for example, and Google (Lens) and Facebook recently made augmented reality announcements of their own.
2. Optics alone could be enough to direct self-driving cars
We are seeing debate over whether self-driving cars need LiDAR or can depend solely on optical solutions. Tesla CEO Elon Musk, for example, doesn’t think that LiDAR, a bulky and expensive device that uses lasers to map its environment in real time, is necessary for fully autonomous driving. Humatics CTO Gregory Charvat, by contrast, said at the event that cars “need more than just optical sensor platforms [cameras], they also need LiDAR, radar, and high-precision radio navigation more precise than differential GPS.”
LiDAR and radar work by pinpointing actual objects in the surrounding environment by range and angle, whereas deep learning-based camera solutions need to run images through algorithms and are ultimately still predictions. Optical solutions are nevertheless better at actually identifying what something is — for example, a pedestrian versus a bunch of pixels that look like a Christmas tree, as Auto X Founder and CEO Jianxiong Xiao showed during a demo of his company’s impressive and low-cost self-driving solution that only uses cameras.
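To make "range and angle" concrete, here is a minimal sketch of how a single ranging return can be turned into a position relative to the car. It is an illustration only: the function name, the two-dimensional simplification, and the sample readings are assumptions, not anything shown at the summit or taken from a real sensor.

```python
import math

def polar_to_cartesian(range_m, angle_deg):
    """Convert one range/bearing return into (x, y) in metres.

    Assumed convention: x points forward, y points left,
    and the angle is measured counter-clockwise from straight ahead.
    """
    theta = math.radians(angle_deg)
    return range_m * math.cos(theta), range_m * math.sin(theta)

# Hypothetical returns: (range in metres, bearing in degrees)
returns = [(10.0, 0.0), (5.0, 90.0), (20.0, -30.0)]
points = [polar_to_cartesian(r, a) for r, a in returns]

for (r, a), (x, y) in zip(returns, points):
    print(f"range={r:5.1f} m, angle={a:6.1f} deg -> x={x:6.2f} m, y={y:6.2f} m")
```

The point of the sketch is that the position falls straight out of the measurement with a little trigonometry, whereas a camera-only system has to infer distance from pixels.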
Technology pros and cons aside, car companies typically work five years in advance, so the necessary hardware would need to be purchased now to make a 2021 deadline. For now, LiDAR and more advanced forms of radar are still expensive ($80,000 is considered cheap for the former) and bulky. Meanwhile, operating all these optical and sensor technologies in a fused way needs supercomputers small enough to fit in a car.
3. Vision could teach machines better than machine learning
As a few of the demos at LDV reminded us, machines don’t just learn through neural networks and machine learning. There are other ways they can learn to identify and analyze the world around them. Google Research scientist Tali Dekel demonstrated a technique that used computer vision to identify and then enlarge deviations from straight lines on a roof or the subtle presence of purplish color on fruit to, say, determine if there are structural problems in an old home or which tomatoes are riper than others. It seems simple enough, and yet it’s the type of thing that computer vision is better at than humans.
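The idea of enlarging deviations from a straight line can be sketched in a few lines of code. This is a deliberately simplified stand-in, not Dekel's method: it assumes the edge of a roof has already been extracted as a handful of (x, y) points, fits a least-squares line through them, and exaggerates each point's distance from that line. The function names, the sample points, and the magnification factor are all hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def magnify_deviations(xs, ys, factor=10.0):
    """Return (magnified_ys, residuals): each point pushed further from the fitted line."""
    slope, intercept = fit_line(xs, ys)
    fitted = [slope * x + intercept for x in xs]
    residuals = [y - f for y, f in zip(ys, fitted)]
    magnified = [f + factor * r for f, r in zip(fitted, residuals)]
    return magnified, residuals

# Hypothetical roof line with a barely visible sag in the middle (units arbitrary)
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [0.0, -0.01, -0.02, -0.01, 0.0]

magnified, residuals = magnify_deviations(xs, ys, factor=10.0)
print("original: ", ys)
print("magnified:", [round(v, 3) for v in magnified])
```

After magnification, the sag in the middle of the line is ten times further from the fitted line than it was in the raw points, which is what makes a subtle structural problem easy to spot.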
4. Machine vision can help with medical diagnoses
When a pathologist has, on an average day, 500 slides, each containing tens or even hundreds of thousands of individual cells that need to be analyzed for, say, the presence of cancer, it’s easy to miss a diagnosis. “This is an impossible task for a human to do as effectively as a computer, simply because we’re not able to look carefully at every single cell,” said Andrew Beck, cofounder and CEO of PathAI. “We think computers can be really good at getting the perfect diagnosis every time.”
According to an American Medical Association study, just under half of the pathologists agree on a correct diagnosis. Citing another study focused on breast cancer lymph node biopsies, Beck showed the difference between the hotspots found by a computer versus a human pathologist; the former highlighted many additional areas that turned out to contain cancer cells. “We provide pathologists with both the raw image, so they’re still looking at the data they’re used to, as well as the image processed by the learning system, which essentially identifies the areas of cancer, enabling a physician to focus in on those areas,” said Beck. The breast cancer study found that without AI, this kind of biopsy has an accuracy rate of only 85 percent, or an error rate of 15 percent. With the AI-aided solution, the error rate plummeted to 0.5 percent.
5. The field of computer vision is getting easier and easier to jump into
The commoditization of better cameras, sensors, and deep learning software libraries such as Google TensorFlow has significantly expanded access to computer vision, and we are seeing many new startups emerge as a result. In the Vision Summit’s two startup competitions, we saw everything from a technology that generates demographic insights out of Google Street View images to an app that assesses the damage and calculates repair costs of a car that’s just been in an accident — from nothing more than a picture.
“What’s emerged is this incredible commoditization of so many parts of computer vision and machine learning that used to require teams of PhDs to develop in terms of infrastructure,” said Cornell Tech Professor and Summit co-organizer Serge Belongie, “but now it’s possible for individual hackers or developers on small startup teams to bring that kind of functionality to any kind of product.”
Even so, commoditization still isn’t 100 percent plug and play. As Albert Wenger, Managing Partner at Union Square Ventures, told me, “It’s one of those curves where it’s easy to get 80 percent, and then extremely hard to get the rest done.”
So there’s still a lot of work to be done, which is a good thing for anyone interested in helping build the next big visual technology — whether it’s for business, health, or pleasure.
Ken Weiner is CTO at GumGum.