Presented at 10:30am in Cotton Creek I on Friday, November 22, 2019.
#29459
Speaker(s)
- Casey Pearson, Events Specialist, 3Play Media
Session Details
- Length of Session: 1-hr
- Format: Lecture
- Expertise Level: Beginner
- Type of session: General Conference
Summary
We'll cover two different speech technologies, automatic speech recognition (ASR) and synthesized speech, and where they succeed and fail when it comes to accessible video. We expect to see continuous improvements in ASR for captioning and transcription. However, current technology gives us a starting point of about 80% accuracy, which is not acceptable for captioning.
Abstract
What is the current state of speech technology? Do we still need humans? In this presentation, we will look at where speech technology succeeds and where it fails when it comes to captioning and description. We’ll discuss whether automatic speech recognition (ASR) will be sufficient for closed captioning, or even for live captioning, and whether synthesized speech is a real option for audio description output.
We’re often asked these questions, and it’s no wonder: the more organizations can rely on automated technology, the less it costs to meet accessibility requirements. While speech technology has never been a more viable option, it still has weaknesses that make humans a necessity.
Key Points
- Why speech recognition for captioning and transcription is harder
- Current ASR capabilities for captioning and transcription
- Pros and cons of synthesized speech for audio description
Disability Areas
Cognitive/Learning, Deaf/Hard of Hearing, Vision
Topic Areas
Accessible Educational Materials, Assistive Technology, Legal, Uncategorized
Speaker Bio(s)
Casey Pearson
Casey is the Events Specialist for 3Play Media. She is the primary contact for AHG. The speaker for this proposal is Lily Bond.