
Datasaur, a semi-automated text data-labeling tool, raises $1 million


Datasaur, a company building a text data-labeling platform, today announced it has raised a $1 million seed round from angel investors like Segment CTO Calvin French-Owen. Coming out of stealth today, Datasaur was founded in February 2019 and uses semi-automated labeling and some pretrained models to speed up the data-labeling process and fuel the improvement of natural language processing (NLP) models.

Datasaur was founded by Ivan Lee, who has spent the past seven years working as a product manager on AI ventures at companies like Yahoo and, most recently, on Apple’s Siri team. Before working at Apple, Lee sold mobile gaming startup Loki Studios to Yahoo in 2013.

As part of the Winter 2020 batch, Datasaur will present next month at Y Combinator’s Demo Day in San Francisco.

“As a PM, I came to appreciate just how powerful AI was, but I also recognized that I was constantly trying to get more labeled data for my engineers. It was this insatiable appetite. We were spending millions of dollars gathering this data, but it was a tedious job, it was an inefficient process, and I saw a lot of these companies reinventing the wheel when it came to how they should set up their labeling processes,” Lee told VentureBeat in a phone interview.



The funding will go toward launching the Datasaur NLP platform, which was in closed beta until today, and adding functionality that helps managers do things like delegate assignments or detect bias in data sets.

Early Datasaur users include businesses, academics, and researchers working with the Indonesian government to flag online news articles and guard against election tampering.

Datasaur is going up against a number of data-labeling startups, like Labelbox, which last month raised $25 million, and CloudFactory, which raised $65 million last fall.

Above: Datasaur user interface

Image Credit: Datasaur

But Lee expects Datasaur will be able to compete by focusing solely on software for labeling text data. He believes companies in any industry hungry for insights from text data will increasingly find data-labeling tools essential.

“We’re seeing a lot of companies [that] need to set up their own labeling processes, and so we want to help bring them the same efficiencies that any of these other services have been able to build,” he said.

Datasaur currently has 10 employees in Sunnyvale, California, and Indonesia.