Voice recognition: has AI just beaten a human?

Microsoft’s artificial intelligence (AI) system can now understand conversational human speech better than a trained transcriptionist, and is less prone to error.

The AI’s error rate in understanding speech moved down to about 5.9% from 6.3%, which puts it slightly below the human error rate.

Error rate refers to the number of times a human, or machine, mishears words.
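The report's exact scoring procedure is not described here, but speech recognition results are usually quoted as a word error rate: the number of substituted, deleted and inserted words needed to turn the transcript into the reference text, divided by the number of words in the reference. A minimal sketch of that calculation follows, assuming simple whitespace tokenisation; the function name and the sample sentences are illustrative and do not come from Microsoft's report.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length.

    Illustrative only: assumes whitespace tokenisation and exact,
    case-sensitive word matching.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far (before the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # i deletions turn i reference words into nothing
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from transcript
            insertion = curr[j - 1] + 1  # extra word in transcript
            curr.append(min(substitution, deletion, insertion))
        prev = curr

    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One misheard word out of six gives a rate of 1/6.
    rate = word_error_rate("the cat sat on the mat", "the cat sat on a mat")
    print(f"{rate:.1%}")  # prints 16.7%
```

On this measure, a rate of 5.9% corresponds to roughly six word errors for every 100 words of reference transcript.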

“We [improved] on our recently reported conversational speech recognition system by about 0.4%, and now exceed human performance by a small margin,” the report stated.

This news comes only a month after Microsoft achieved the 6.3% error rate. The system is learning fast.

The research found, however, that the error rates of human transcribers can vary from 4.1% to 9.6%, depending on how well they concentrate on the transcription.

The near-perfect accuracy, regardless, is something of a breakthrough and should have a significant impact on Microsoft’s AI tools, including its virtual personal assistant Cortana.

It is unclear, however, exactly how real-world applications, where background noise and multiple speakers are significant issues, will take form.

Perhaps the result will simply be fewer speech recognition errors when interacting with smartphones or, in the future, autonomous cars.

As reported by The Verge citing a statement from the company, Microsoft’s chief speech scientist Xuedong Huang said that they had “reached human parity,” and called the improvement in speech recognition “an historic achievement.”

This human parity was achieved by optimising “convolutional and recurrent neural networks”, using 2,000 hours of recorded voice data.

Microsoft’s AI voice recognition announcement reflects the focus the company has placed on the technology.

Indeed, last month Microsoft CEO Satya Nadella laid out the organisation’s four-pillar plan for democratizing AI, and said that its cloud platform Azure is becoming the first AI supercomputer.

Nick Ismail

Nick Ismail is a former editor for Information Age (from 2018 to 2022) before moving on to become Global Head of Brand Journalism at HCLTech. He has a particular interest in smart technologies, AI and...
