AWS catches up to competition with AI services for video and text understanding

Amazon Web Services launched a slew of machine learning-based services today that are aimed at making it easier for customers to embed intelligent capabilities in their applications. One new service offers video analysis, while a trio of language understanding APIs offer automatic transcription, translation, and document processing.

These tools are designed to make it easier for customers to reap the benefits of machine learning without requiring the expert knowledge necessary to build systems themselves. The services join Amazon’s existing suite of pre-built AI capabilities for customers, including its Lex language understanding service, Polly text-to-speech offering, and Rekognition image recognition service.

The move helps AWS catch up with its major competitors Microsoft and Google, as well as other cloud providers, which already offer similar services.

A new Rekognition Video service will let customers automatically analyze footage that they have in the cloud to detect important entities, sentiment, celebrities, and more. It can also output data that programs can use to track where people are within a scene.

To help customers get all of the relevant video information into the cloud, AWS launched Kinesis Video Streams. The service, which is generally available today, is designed to help customers securely ingest and store video, audio, and other time-encoded data like radar.

The announcement comes roughly a week after AWS announced updates to its Rekognition service that support recognizing faces in photos of crowds, along with real-time face-matching capabilities that make it possible to process large volumes of photos for matching with a central database of faces.

The new Transcribe service will, as its name implies, offer automatic transcription of long-form speech. It can process both high-quality recorded audio and recordings of phone conversations. AWS is starting the service with support for English and Spanish, and it plans to support additional languages soon.

Transcribe stands apart from other speech recognition services by focusing on generating transcripts with time stamps, as well as automatic punctuation generation that uses machine learning to make the resulting text more human-readable.

Customers will be able to translate text that they have in AWS, whether processed by Transcribe or brought in through other means, with a new Translate service. It offers automatic, machine learning-based translations for any text fed into it.

AWS also launched a service to provide applications with deeper understanding of content that they’ve been fed. Comprehend pulls out entities like people and places, plus key phrases and how positively users feel about the content in a document.

While that may not sound like much, that information can be used to help classify an otherwise difficult-to-process pile of documents, which has been a tough problem for computers to solve.
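As a rough illustration of how that kind of sorting might work, here is a minimal Python sketch. It assumes a Comprehend client created with the AWS SDK for Python (`boto3.client("comprehend")`) and uses its `detect_entities`, `detect_key_phrases`, and `detect_sentiment` calls. The helper names `summarize_document` and `bucket_by_sentiment` are hypothetical, invented for this example, and are not part of any AWS library.

```python
from typing import Any, Dict, List


def summarize_document(comprehend: Any, text: str, language: str = "en") -> Dict[str, Any]:
    """Collect entities, key phrases, and overall sentiment for one document.

    `comprehend` is assumed to be a boto3 Comprehend client, for example:
        import boto3
        comprehend = boto3.client("comprehend")
    """
    entities = comprehend.detect_entities(Text=text, LanguageCode=language)["Entities"]
    phrases = comprehend.detect_key_phrases(Text=text, LanguageCode=language)["KeyPhrases"]
    sentiment = comprehend.detect_sentiment(Text=text, LanguageCode=language)["Sentiment"]
    return {
        # De-duplicate entities and keep them as (type, text) pairs.
        "entities": sorted({(e["Type"], e["Text"]) for e in entities}),
        "key_phrases": [p["Text"] for p in phrases],
        "sentiment": sentiment,
    }


def bucket_by_sentiment(comprehend: Any, documents: Dict[str, str]) -> Dict[str, List[str]]:
    """Group document names by the sentiment label returned for each one."""
    buckets: Dict[str, List[str]] = {}
    for name, text in documents.items():
        label = summarize_document(comprehend, text)["sentiment"]
        buckets.setdefault(label, []).append(name)
    return buckets
```

Passing the client in as a parameter keeps the sketch free of credentials and region settings, which would have to be configured separately before any real call is made.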

All of this comes as part of the AWS re:Invent conference in Las Vegas. Earlier today, the company announced a new SageMaker service that’s designed to make it easier for developers to build custom machine learning models without deep expertise.
