VentureBeat: What makes this hard enough that it might be a 10-year project?
Pawlowski: What’s hard is that it will be a 10-year project. Coming up with a programming paradigm that the community starts using, so you can start bringing the programming community along — I don’t know if you know Steve Wallach, [from Tracy Kidder’s book] Soul of a New Machine, years ago. He just retired, but he was working for me for a while. Every one-on-one we had, he said, “If I ever teach you anything, it’s this. The thing that’s easiest to program is the solution that wins. Every time.” The bottom line is, you have to bring over the programming community. You can’t just go do fancy hardware and leave them behind, because they won’t touch it. That’s the hard problem.
We’re not a software company yet. Intel wasn’t a software company when I started, either. Now they’re like Nvidia. They have a number of software engineers that in some cases exceeds the number of their hardware engineers. You just don’t see them.

Above: Micron is moving into AI.
VentureBeat: You acquired one piece here with the Fwdnxt folks. Was it a pretty comprehensive piece, or do you need more? Do you still need to find a lot of partnerships?
Pawlowski: We’re going to need a lot of partnerships and data scientists. They have an inference engine architecture they’ve developed over five, 10, 12 years, at different companies and in different academic settings. The guy who founded it was a professor at Purdue. They’ve been optimizing that architecture. They have a fairly good compiler that takes an Open Neural Network Exchange (ONNX) frontend and then maps it down to their hardware.
What I need are data scientists. I need applications. I also think we’re going to need a dynamic runtime/scheduler. If you really have this model of — if I wrote a network on hardware today, on an Intel processor, three years from now you could still run that same program. Everything is abstracted through the instruction set. What I want to do here is abstract the network, which means we’re going to need some type of dynamic runtime. That’s going to say, “OK, this thing has 8,000 multiply and accumulate units. This has 1,000. I can spread that thing out a little farther. Oh, these 150 units died. I don’t want to schedule anything on those, but I still want to be able to use the part.”
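As a rough sketch of the dynamic runtime he's describing — all names and numbers here are hypothetical, not Micron's actual design — a scheduler would discover the healthy compute units on a given part, skip the dead ones, and spread the network's work across whatever remains:

```python
# Hypothetical sketch of a dynamic runtime/scheduler: discover the
# healthy multiply-accumulate (MAC) units on this part, skip the dead
# ones, and spread the network's work across whatever remains.

def schedule(total_macs, failed_macs, work_items):
    """Assign work items round-robin across the healthy MAC units."""
    healthy = [u for u in range(total_macs) if u not in failed_macs]
    if not healthy:
        raise RuntimeError("no usable compute units on this part")
    assignment = {u: [] for u in healthy}
    for i, item in enumerate(work_items):
        assignment[healthy[i % len(healthy)]].append(item)
    return assignment

# The same program runs on a part with 8,000 MACs or 1,000 MACs; the
# runtime just spreads the work differently, and a part with some dead
# units stays usable.
plan = schedule(total_macs=1000, failed_macs={3, 7}, work_items=list(range(10)))
```

The point of abstracting at the network level rather than the instruction level is exactly this: the program never names physical units, so the runtime is free to place work around whatever hardware actually survives.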
There are a couple of entities out there that have been looking at solving the dynamic runtime problem, which I think is going to be pretty important. Especially — I’ve heard estimates. The guy who used to run litho at Intel, I ran into him a year ago in the airport. He said that they believe that when they get to sub-5nm, they’re looking at 30% of the devices being out of spec at manufacture.
VentureBeat: You mean defective, or — ?
Pawlowski: Just out of spec. We guardrail the crap out of things, assuming that it’s going to have a seven-year lifetime and things are going to degrade. Well, in this particular case, you can’t even guard that. It’s not working even to the spec within the guardrail.
I found a paper done by some people in Brazil that showed that if you assume you can do 512 cores and you get 20% degradation, the overall degradation from peak performance is about 4%. A 32-core chip is dead. On a 64-core chip, only one of the cores is active. They’re just assuming a random distribution with these values. Having that dynamic runtime for these large-scale applications, if we go to geometries finer than 7nm, is going to be something that’s equally important.
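The general effect he's pointing at — a die split into many small cores tolerates random defects better than the same die split into a few big ones — can be illustrated with a toy Monte Carlo. This is not the Brazilian paper's actual model and won't reproduce its exact figures; it's just a sketch of the principle:

```python
import random

# Toy Monte Carlo (not the paper's actual model): scatter a fixed
# number of manufacturing defects across a die, and assume any defect
# inside a core disables that whole core. Splitting the same die into
# more, smaller cores means each defect costs a smaller slice of the
# total compute.

def surviving_fraction(num_cores, num_defects, trials=2000, seed=0):
    """Average fraction of cores left alive after random defects."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        hit = {rng.randrange(num_cores) for _ in range(num_defects)}
        total += (num_cores - len(hit)) / num_cores
    return total / trials

# With the same 20 defects, the fine-grained die keeps most of its
# peak performance while the coarse-grained die loses far more.
fine = surviving_fraction(num_cores=512, num_defects=20)
coarse = surviving_fraction(num_cores=32, num_defects=20)
```

A dynamic runtime is what turns this statistical survival into usable performance: without one, every dead core is a dead part.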
Memory systems have done redundancy for years. They test. If a block is bad, they’ll swap in a redundant block. If there are more bad blocks than redundant blocks, OK, this becomes a keychain.
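The block-swapping scheme he describes can be sketched in a few lines — a simplified illustration, not any vendor's actual remapping logic:

```python
# Simplified sketch of block redundancy in a memory part: at test
# time, bad blocks are remapped to spares; when the bad blocks
# outnumber the spares, the part is scrap ("a keychain").

class BlockMap:
    def __init__(self, num_blocks, num_spares):
        self.remap = {}  # logical bad block -> physical spare block
        self.free_spares = list(range(num_blocks, num_blocks + num_spares))

    def mark_bad(self, block):
        """Swap in a spare for a failed block; False means no spares left."""
        if not self.free_spares:
            return False  # more bad blocks than redundant blocks
        self.remap[block] = self.free_spares.pop()
        return True

    def resolve(self, block):
        """Translate a logical block to the physical block actually used."""
        return self.remap.get(block, block)

m = BlockMap(num_blocks=8, num_spares=2)
m.mark_bad(3)  # block 3 is now transparently served by a spare
```

Reads and writes go through `resolve`, so software above the remapping layer never sees which physical blocks failed — the same transparency the dynamic runtime would give compute units.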
VentureBeat: Does this signal a lot of competition with the likes of Intel and Nvidia?
Pawlowski: It’s going to be more cooperative. It’s hard to compete with Intel and Nvidia in the datacenter. Nvidia has the training locked up. Even when people come in with new solutions — at least one startup told me that the hyperscalers told them, “It’s so hard to move our training algorithms from the GPU. It’s doing so well. They’re still giving us performance gains. Don’t spend your time on this.” And the last statistic I heard was that a significant portion of the inference was still run on Xeon.
We’ve been focusing — if we’re going to do anything in the datacenter, it’s to help our customers like Nvidia and Intel. But if there’s any innovation that can occur from a memory storage point of view, let’s look at it out on the edge. That’s where we’ll get the greatest efficiencies and economies of scale.

Above: Sanjay Mehrotra, CEO of Micron, at Micron Insights.
VentureBeat: Has the Moore’s law part been OK? Are you on schedule?
Pawlowski: It’s been a challenge, but that hasn’t stopped us from being able to continue to scale. Quite honestly, I had to live the Moore’s law thing forever. Thou shalt not say anything bad about Moore’s law! That was the eleventh commandment. When people ask me — it was the slowing and stopping of Dennard scaling that really forced the innovation. Now, we may not get double the transistors every two years. Maybe every three or four years. But we’ll grow in the third dimension. That really hasn’t stopped us. It’s just a question of what’s the most economical way to do it. Engineers find really creative solutions to hard problems.
VentureBeat: Intel reinforced today that they’re going to do 7nm in 2021 with the graphics chip. They seem to be back on schedule.
Pawlowski: I hope so. When I left, and that was only five years ago — it was amazing how fast that four-year lead evaporated.
VentureBeat: In that sense, it seems like the whole industry is moving in lockstep, then.
Pawlowski: I think the industry is continuing on their treadmill: “We can still see a path to scaling.” Like I say, I don’t know if it’s the aggressive two years of keeping up with Moore’s prediction, but think of where we’ve come in terms of capability because of Moore’s law over the last 40 years. It’s just incredible. I still think there’s scaling to be had. We’ll take advantage of it just like anybody else.
VentureBeat: The Fwdnxt deal, as far as what it gives you — is it more on the software side, or is there chipmaking talent there as well?
Pawlowski: Not really chipmaking talent, no. They have hardware architecture talent. They’ve translated their architecture onto FPGAs. In terms of being able to take that and make an ASIC and do the frontend and backend, that’s not been their expertise, but now they’re in the place to do that. They bring the software and the architecture — not only the hardware architecture, but knowledge of convolutional neural networks and how, if somebody presents them with a problem, they can tune that network and then use their data to train it to get the accuracy levels they’re looking for. Once they achieve the kind of accuracy they want, they map that trained algorithm onto the FPGA to do the classification side.
VentureBeat: So it does give you a variety of options as to what you want to do.
Pawlowski: It does. I’m totally looking at it as — we’re learning so much about how these things interact and how these different networks are evolving. The nice thing is, I can put a network with a million parameters on it, or a 100-gig-parameter network. It runs slower, but I would be able to understand how those large networks are going to evolve and what we would do.
I was on a panel talking about how we’ve been doing some work with CERN. Just what we’ve learned in the prototyping we’ve been doing with them is phenomenal. They’re throwing data at it at such a fast rate, and they need insights so quickly. Accuracy is good, but they don’t need the accuracy you’d need for, say, a cancer patient, where it has to be 99.999%. They’re asking, “Is it 70 or 80% likely that this is something interesting? No? Throw it out. We have more stuff coming at us. Eventually we’ll get something that hits that threshold, and that’ll be interesting.” They’re getting 40 million collisions a second.
VentureBeat: What’s your general description of the problems this can solve?
Pawlowski: Both of these problems, the health care and CERN, basically they’re taking 2D sensor images and constructing a 3D model. On the CERN one, particles collide and they create a shower of other particles. What they want to do is quickly take the measurements of those particles and say, “Do all the energies add up?” If the energy was X and you get Y, which was less than X, then there’s some energy that wasn’t accounted for, and that’s interesting science, because the law of conservation of energy says nothing should have been created or destroyed. Once they do that, they want to be able to take different images and construct a 3D model of what that decay looked like, because it doesn’t all show in the same 2D image.
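The energy-accounting step he describes can be written as a toy check — the function name, numbers, and 1% tolerance here are made up for illustration, not CERN's actual pipeline:

```python
# Toy version of the energy-accounting check: if the measured decay
# products don't sum to the collision energy (within some detector
# tolerance), the event has missing energy and is worth keeping,
# because conservation of energy says it shouldn't simply vanish.
# The 1% tolerance is an assumption made for illustration.

def missing_energy(collision_energy, measured_energies, tolerance=0.01):
    """Return the unaccounted-for energy, or 0.0 if everything adds up."""
    deficit = collision_energy - sum(measured_energies)
    if deficit > tolerance * collision_energy:
        return deficit  # energy X went in, only Y < X came out
    return 0.0

balanced = missing_energy(100.0, [40.0, 35.0, 24.9])  # within tolerance
anomalous = missing_energy(100.0, [40.0, 30.0])       # energy unaccounted for
```

Events that pass this filter are the ones worth the expensive step that follows: fusing multiple 2D detector images into a 3D model of the decay.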

Above: Micron has acquired Fwdnxt to build AI solutions integrated with memory.
Visualizing tumors takes many, many 2D images, X-rays, and whatnot, and creates a 3D volumetric model. We’re using the same 3D convolutional neural network styles. They’re different networks, because they have different inner layers that do different things, but we’re taking those and solving a similar problem of creating a 3D representation.
VentureBeat: I don’t know if you’ve heard of a company called MediView. They come out of the Cleveland Clinic, and they just raised $4.5 million in venture capital. They take an MRI of the patient’s body, capturing everything inside, and then put the data into a Microsoft HoloLens. The doctor can then visualize it all in 3D. He puts the scalpel into the patient and sees through the HoloLens that it’s going where he wants it to. He’d never otherwise have that view inside. Before, he had to guess based on all these 2D screens he’s looking at.
Pawlowski: That’s fantastic. Years ago, the head of surgery at Oregon Health Sciences University said, “You need to come here.” I was at Intel at the time. They actually scrubbed us up, took us into a double hernia surgery, and he said, “I want to show you how we do surgery now.” They were scoping everything. It wasn’t like this person was entirely laid open, because otherwise I would not have gone into that surgery. We finished and he said, “Now, let me show you how we train our surgeons.” It was like the Stone Age in terms of the tools. This would be a perfect teaching tool.