Hey, Siri — Are We There Yet? The State of ASR & Synthesized Speech for Captioning & Description


Presented at 10:30am in Cotton Creek I on Friday, November 22, 2019.

#29459

Speaker(s)

  • Jena Wallace, Content Marketing Specialist, 3Play Media

Session Details

  • Length of Session: 1-hr
  • Format: Lecture
  • Expertise Level: Beginner
  • Type of session: General Conference

Summary

We'll cover two speech technologies, automatic speech recognition (ASR) and synthesized speech, and where they succeed and fail when it comes to accessible video. We expect to see continuous improvements in ASR for captioning and transcription. However, current technology gives a starting point of about 80% accuracy, which is not acceptable for captioning.

Abstract

What is the current state of speech technology? Do we still need humans? In this presentation, we will look at where speech technology succeeds and where it fails when it comes to captioning and description. We’ll discuss whether automatic speech recognition (ASR) will be sufficient for closed captioning, or even for live captioning, and whether synthesized speech is a real option for audio description output.

We’re often asked these questions, and it’s no wonder: the more organizations can rely on automated technology, the less it costs them to meet accessibility requirements. Speech technology has never been a more viable option, but it still has weaknesses that make humans a necessity.

Key Points

  1. Why speech recognition for captioning and transcription is harder
  2. Current ASR capabilities for captioning and transcription
  3. Pros and cons of synthesized speech for audio description

Disability Areas

Cognitive/Learning, Deaf/Hard of Hearing, Vision

Topic Areas

Accessible Educational Materials, Assistive Technology, Legal, Uncategorized

Speaker Bio(s)

Jena Wallace

Jena Wallace is an accessibility advocate with over a decade of experience in the media accessibility industry.

Jena’s lived experience with single-sided deafness inspired her to pursue a career in accessibility. With 8 years of hands-on experience as an offline closed captioner, Jena is deeply knowledgeable about captioning & subtitling workflows, processes, and best practices. She managed the entirety of Captionmax’s North American prerecorded captioning division before transitioning to content marketing at 3Play Media, where she focuses on storytelling around the people and technologies transforming accessibility and disability advocacy.

Jena holds a B.A. in English from the University of Minnesota-Twin Cities. She is committed to sharing knowledge and continuous learning about the nuances, technicalities, innovations, and ethics surrounding accessibility solutions.

Handout(s)

  • AHG 2019_ Hey, Siri! Are we there yet_ (1)
  • AHG 2019_ Hey, Siri! Are we there yet_