Databricks launches AutoML Toolkit for model building and deployment

Image Credit: Databricks


Databricks today introduced its AutoML Toolkit, an automated end-to-end machine learning service made to accommodate developers with a range of experience.

Available from Databricks Labs, the AutoML Toolkit can automate tasks such as hyperparameter tuning, batch prediction, and model search. It is built on existing Databricks tools like MLflow, an open source machine learning platform that integrates with tools like TensorFlow and Amazon SageMaker. AutoML Toolkit executions are automatically tracked using MLflow.
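
For readers unfamiliar with what automated hyperparameter tuning with run tracking involves, the following is a minimal sketch in plain Python. It is not the AutoML Toolkit's API or MLflow's: the `validation_loss` function, the parameter names, and the in-memory `runs` list are all hypothetical stand-ins for a real model, its settings, and an experiment tracker.

```python
import random


def validation_loss(learning_rate, depth):
    """Toy stand-in for training a model and scoring it (lower is better)."""
    return (learning_rate - 0.1) ** 2 + 0.01 * (depth - 6) ** 2


def random_search(n_trials, seed=0):
    """Sample hyperparameters at random, record every trial, return the best."""
    rng = random.Random(seed)
    runs = []  # hypothetical in-memory stand-in for an experiment tracker
    for trial in range(n_trials):
        params = {
            "learning_rate": rng.uniform(0.001, 0.5),
            "depth": rng.randint(2, 12),
        }
        loss = validation_loss(**params)
        runs.append({"trial": trial, "params": params, "loss": loss})
    best = min(runs, key=lambda run: run["loss"])
    return best, runs


best, runs = random_search(n_trials=50)
print(f"best loss {best['loss']:.4f} with {best['params']}")
```

Random search is used here only because it is the simplest strategy to show. Every trial is logged rather than just the winner, which is the basic idea behind tracking each execution.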

The toolkit also utilizes Apache Spark, an open source project created by Databricks founders and turned over to the Apache Software Foundation in 2013.

The AutoML Toolkit differs from other AutoML solutions in that it allows data scientists and engineers with varying levels of expertise to work together, Databricks head of ML project management Clemens Mewald told VentureBeat in a phone interview.


Mewald previously worked on Google’s TensorFlow and Kubeflow project teams.

“Sometimes there are people who are super familiar with low-level code and want full access, and then another person on the same team may be less familiar with code or maybe happy with a UI-based solution. So the different levels of solutions that we provide in the AutoML space address a lot of these different needs and expertise levels,” he said. “Because they’re both on the same technology stack, that allows you to move between them, if you choose to. So you can basically start at the highest level of abstraction and not write any code at all. And then once you’re done and you need more flexibility, you can go one level down and get access to more of the knobs and levers that you may need.”

Some forms of automated machine learning were previously available for Apache Spark.

Automated machine learning, which automates the creation and deployment of machine learning models, began to grow in popularity with the introduction of Google’s AutoML in 2017. Since then, public cloud leaders like Microsoft, with its Azure Machine Learning service, have also introduced solutions for automated machine learning.

Building on a previously established partnership, Databricks’ AutoML offering also integrates with Azure Machine Learning.

Databricks has introduced a series of changes to support its AutoML offerings in recent months.

With the 1.1 release of Databricks Runtime 5.4 ML in June, Databricks gained automated hyperparameter optimization through Hyperopt integration. In April, Databricks open-sourced Delta Lake, a collaborative initiative for creating data lakes that support reliable machine learning projects.

In February, Databricks raised $250 million for its data and AI platforms with funding from Andreessen Horowitz, Microsoft, and NEA.