VentureBeat: You hinted at transparency and privacy just now, both of which I’m sure are important areas for Salesforce. Your customers, of course, don’t want their data compromised, and they want to understand what’s happening under the hood — like how AI is arriving at its conclusions. So what’s the work you’re doing there — what does it look like today?
Socher: You touched upon a couple different things — all of which are very important to us.
Interpretability and opening up the “black box” is going to become a more important research area, but it’s on a spectrum — nobody asks why an object detection algorithm for consumer packaged goods (CPG) classified, for example, a can of Spam as a can of beans. They don’t ask why it classified something as this versus that, because the behavior doesn’t change — they just want to automate a certain process, and they expect it to be in a ballpark range and have some error bars on it. And so interpretability, for people in some industries, is less important.
Now, if you were to make a vision classification system in medicine, you’d want to know why it’s telling you that you need to get major surgery. Even within the field of computer vision specifically, there’s a spectrum — you don’t necessarily care about some things as much as other things in terms of the interpretability of the algorithms.
VentureBeat: In that health care example you just brought up, you’d also probably want to know a bit about the datasets that were used to train the AI model, right?
Socher: Yes. The datasets come in on the fairness question. Of course, you want to know if an AI system has been trained on people from a certain ethnicity, or certain age groups, and so on. There’s a lot of complexity in the training data when considering bias and fairness. But I think it’s also an interesting angle for interpretability — like finding training examples that most closely resemble a test case and are most likely what led the algorithm to make a certain kind of decision.
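One simple way to do that kind of example-based interpretability, sketched below under the assumption that the training data has already been turned into numeric feature vectors, is to surface the training examples closest to a test case and present them as the cases the model most likely “learned from.” This is an illustrative sketch, not Salesforce’s implementation, and the names are hypothetical.

```python
# Illustrative example-based interpretability: find the training examples most
# similar to a test case, as a proxy for what led the model to its decision.
# Assumes examples are already numeric feature vectors; names are hypothetical.
import numpy as np

def most_similar_training_examples(train_X, test_x, k=5):
    """Return indices of the k training rows with highest cosine similarity to test_x."""
    train_norm = train_X / np.linalg.norm(train_X, axis=1, keepdims=True)
    test_norm = test_x / np.linalg.norm(test_x)
    similarity = train_norm @ test_norm
    return np.argsort(similarity)[::-1][:k]

# Usage: show a clinician which known cases most resemble the scan being classified.
# train_X has shape (n_examples, n_features); test_x has shape (n_features,)
```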
On the other end of the spectrum, we have machine learning classifiers where we ask humans to change their behavior. For instance, Salesforce offers lead opportunity scoring for salespeople. A salesperson on any given day might have, like, 5,000 people they could call, and we try to answer the question: Who should they really call? Our tools provide a ranked list. But when we tell salespeople this, a lot of them feel that we’re telling them how to do their job. So you have to give them reasons why one call is the right one, or why it’s ranked higher than others.
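To make that concrete, here is a minimal, hypothetical sketch (not Salesforce’s actual scoring model) of how a ranked lead list can come with reasons attached: score each lead with a linear model and report the largest per-feature contributions as the explanation.

```python
# Hypothetical lead scoring with attached reasons: rank leads by a linear score
# and surface the top feature contributions so a salesperson can see why one
# call is ranked above another. Feature names and weights are illustrative.
import numpy as np

feature_names = ["emails_opened", "days_since_contact", "company_size", "past_purchases"]
weights = np.array([0.8, -0.3, 0.5, 1.2])        # stand-in for learned weights

def score_with_reasons(lead_features, top_n=2):
    contributions = weights * lead_features       # per-feature contribution to the score
    score = contributions.sum()
    order = np.argsort(np.abs(contributions))[::-1][:top_n]
    reasons = [f"{feature_names[i]} ({contributions[i]:+.2f})" for i in order]
    return score, reasons

leads = {"Acme": np.array([5, 2, 3, 1]), "Globex": np.array([1, 30, 2, 0])}
ranked = sorted(((score_with_reasons(x), name) for name, x in leads.items()), reverse=True)
for (score, reasons), name in ranked:
    print(f"{name}: score={score:.2f}, top reasons: {', '.join(reasons)}")
```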
In some cases, then, interpretability is extremely crucial, because the AI feature will not see adoption if it doesn’t have that aspect to it. In other cases, there is an unfortunate discrepancy between the most interpretable algorithms and the most accurate ones.
It’s going to be an interesting ethical question. We make trade-offs as a society, but I think that at some point we have to say, OK, these autonomous systems aren’t perfect. Self-driving cars could save like 10,000 lives a year, but you might have five or a dozen different algorithms that are responsible for having killed like 5,000 people a year because they weren’t 100 percent accurate. Neither are humans, though — humans are even worse from an accuracy standpoint. It’s a really interesting sociological, philosophical, ethical kind of question for us to ask now.
VentureBeat: It kind of gets at the question of algorithmic fairness, right?
Socher: That’s true. On the fairness side, Salesforce thinks about that a lot — again, we are a platform company, and we release tools for other companies to build their own AI.
One of the features that we announced at Dreamforce last year was Einstein Prediction Builder, where, based on any set of columns, you can predict another column. That sounds kind of boring, but it’s close to 80 percent of enterprise and business machine learning applications. These are predictions like: Will this person pay their loan back? Should they get a mortgage? We had to be very careful, because we don’t know what kinds of columns are going in there as input. Someone could build a racist or sexist kind of classifier.
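As a rough illustration of that “predict one column from the other columns” idea (a generic scikit-learn sketch with made-up column names, not the actual Einstein Prediction Builder), it looks something like this:

```python
# Rough sketch of predicting one column from the others (generic scikit-learn,
# hypothetical column names; not the actual Einstein Prediction Builder).
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

df = pd.DataFrame({
    "income":         [40_000, 85_000, 52_000, 120_000, 33_000, 76_000],
    "loan_amount":    [10_000, 20_000, 15_000, 30_000, 12_000, 18_000],
    "years_employed": [2, 10, 4, 15, 1, 7],
    "repaid":         [0, 1, 1, 1, 0, 1],   # the column we want to predict
})

X, y = df.drop(columns="repaid"), df["repaid"]
model = RandomForestClassifier(random_state=0).fit(X, y)

# Predict the target column for a new row built from the other columns.
new_row = pd.DataFrame([{"income": 60_000, "loan_amount": 14_000, "years_employed": 5}])
print(model.predict(new_row))
```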
I’m very excited that this feature comes with a required Salesforce Trailhead module on ethical AI — to at least create awareness for the admins who are building these kinds of systems, so they think about what potential issues there could be in their datasets and what kinds of biases might be in the training data. You don’t want a loan or credit classifier that takes gender into account and, like, doesn’t give a woman startup money for her company because the training dataset didn’t include as many women starting companies.
There’s no silver bullet in this space. There are some interesting and complex algorithmic ideas if you know which classes or columns of people you want to protect — you can try to make sure that the bias in the data doesn’t get further amplified by the algorithm. But it’s an ongoing conversation that has to happen. Kathy Baxter, the architect of Ethical AI Practice at Salesforce, has this really great saying: Ethics is a mindset, not a checklist. Now, it doesn’t mean you can’t have any checklists, but it means you need to always think about the broader applications as the technology touches human life and informs important decisions.
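One concrete, if simplistic, check along those lines, assuming you already know which column you want to protect, is to compare positive-prediction rates across groups. The sketch below is purely illustrative and is not a complete fairness audit or Salesforce’s method.

```python
# A deliberately simple fairness check (demographic parity), assuming the
# protected column is known. Illustrative only; real audits need much more.
import numpy as np

def demographic_parity_gap(predictions, protected):
    """Absolute difference in positive-prediction rate between the two groups."""
    predictions = np.asarray(predictions, dtype=float)
    protected = np.asarray(protected, dtype=bool)
    return abs(predictions[protected].mean() - predictions[~protected].mean())

# Usage: flag a credit classifier whose approval rates diverge sharply by group.
approved = [1, 0, 1, 1, 0, 1, 0, 0]            # model decisions
is_group_a = [1, 1, 1, 1, 0, 0, 0, 0]          # hypothetical protected attribute
print(demographic_parity_gap(approved, is_group_a))   # 0.5 gap in this toy example
```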
VentureBeat: Where does regulation come into play? Microsoft’s Brad Smith recently called on Congress and tech leaders to prevent misuses of facial recognition. What’s Salesforce’s position?
Socher: On the subject of regulation, we have to acknowledge that it doesn’t make sense to regulate all of AI — it makes sense to regulate AI applied to certain human endeavors, like self-driving cars, drug discovery, radiology, and pathology. It’s a pretty complex space, and it’s hard to figure out where to draw the line — to figure out where we are imposing our viewpoints culturally, politically, and technologically.
There’s a lot of things that can go wrong with respect to facial recognition, for sure. There’s a lot of really bad research, sometimes even from good universities — like thinking you can classify whether somebody is gay or not from a photo of their face. As soon as you start making important decisions based on facial recognition, you can do some terrible things — like using it, and AI software more broadly, to make judgments in the judicial system. It’s obviously going to be biased.
So hopefully, yes, there will be regulations against that application of the technology.
There’s a silver lining, of course, which is that it’s easier to change one algorithm to make certain decisions than it is to change, for example, 10,000 store managers at a supermarket chain who don’t promote women as often, or something like that. But it will require research and analysis, interpretability, fairness, the right datasets, and all of that together to make sure that positive part of the future can actually happen.
VentureBeat: Do you think part of the problem is that we have unrealistic expectations of these AI systems? Is it that the public isn’t aware of their limitations?
Socher: The marketing team and I worked together on making sure we’re not talking about the brain and, you know, human AGI and all of that, which is exciting, but it’s still science fiction.
This is a little tangential, but I don’t think there’s a credible research path towards AGI — like, we don’t even know what the missing pieces are.