Tools for Automating The Captioning of Video

Scheduled at 2:15pm in WB II on Wednesday, November 15.

#9081

Speaker(s)

  • Joseph Polizzotto, Access Technology Specialist Instructor, High Tech Center Training Unit
  • Joshua Hori, Accessible Technology Analyst, UC Davis

Session Details

  • Length of Session: 2-hr
  • Format: Lecture
  • Expertise Level: Intermediate
  • Type of session: General Conference

Summary

Creating synchronized video captions can be a time-consuming and tedious process. In this session, we will demonstrate how to use a range of free and low-cost tools that can speed up this process. We will discuss the use of speech recognition tools to produce a "raw" transcript and the use of a forced aligner to synchronize a transcript with a video.

Abstract

Many transcribers and video editors experience a pain point in their production workflow when it comes to creating a closed caption file (e.g., SRT, VTT, SBV). While many paid tools and services exist that can help with automatic synchronization of audio and text, why not consider using a fast and accurate tool that is open-source? And why not try to use speech recognition to quickly generate a transcript for an audio file while you're at it?

In this session, we will first demonstrate how to use a range of speech recognition tools (IBM Watson, Google Speech, and Dragon) to generate a "raw" transcript of a video, and we will compare the accuracy of each tool. Second, we will demonstrate how to use Aeneas, a free tool that can quickly and accurately synchronize a transcript with a video. We will discuss best practices and provide handouts.
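
One way the accuracy comparison described above could be quantified is word error rate (WER): the number of word-level insertions, deletions, and substitutions needed to turn a tool's "raw" transcript into a corrected reference transcript, divided by the number of words in the reference. The sketch below is an illustration only, not necessarily the measure used in the session; the function names and the tokenization rule are assumptions made for this example.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into words, ignoring punctuation.

    Assumption: accuracy is compared on words only, not on case or punctuation.
    """
    return re.findall(r"[a-z0-9']+", text.lower())


def word_error_rate(reference, hypothesis):
    """Return the word-level edit distance divided by the reference length."""
    ref = tokenize(reference)
    hyp = tokenize(hypothesis)
    if not ref:
        raise ValueError("reference transcript is empty")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,                      # deletion
                curr[j - 1] + 1,                  # insertion
                prev[j - 1] + substitution_cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)
```

Running each tool's output against the same hand-corrected reference gives one number per tool; a lower value means fewer corrections are needed before synchronization.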

Key Points

  1. Compare accuracy of a variety of speech recognition tools for generating a video transcript
  2. Share best practices for segmenting a transcript in a captioning workflow
  3. Demonstrate use of Aeneas for synchronizing audio files with text files
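
As a rough illustration of the segmenting step in the key points above, the sketch below splits a transcript into short caption-sized fragments, one per line, which is the kind of plain-text input a forced aligner can then match against the audio. The 42-character limit, the sentence-boundary rule, and the function name are assumptions chosen for this example, not recommendations taken from the session.

```python
import re
import textwrap


def segment_transcript(transcript, max_chars=42):
    """Split a transcript into caption-sized fragments.

    Assumptions: fragments never cross a sentence boundary, and each
    fragment is at most max_chars characters long (42 is illustrative).
    """
    # Collapse line breaks and repeated spaces left over from transcription.
    normalized = " ".join(transcript.split())
    # Split after sentence-ending punctuation followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", normalized)

    fragments = []
    for sentence in sentences:
        if sentence:
            fragments.extend(textwrap.wrap(
                sentence,
                width=max_chars,
                break_long_words=False,   # never split a word in two
                break_on_hyphens=False,   # keep "time-consuming" whole
            ))
    return fragments
```

Writing the result with `"\n".join(fragments)` yields a text file with one fragment per line, which can then be paired with the audio file for synchronization.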

Disability Areas

Deaf/Hard of Hearing

Topic Areas

Alternate Format, Faculty Instruction/Accessible Course Design, Uncategorized, Web/Media Access

Speaker Bio(s)

Joseph Polizzotto

Joseph Polizzotto is an access technology specialist instructor at the High Tech Center Training Unit (HTCTU) of the California Community Colleges, where he trains college faculty and staff on alternate media and assistive technology. His recent research interests include automation of closed caption files and EPUB 3 reading systems. Joseph graduated with a B.A. in History from UC Santa Cruz in 2000 and received an M.A. in TESOL from San José State University in 2004.

Joshua Hori

Joshua is the Accessible Technology Analyst for the Student Disability Center at the University of California, Davis. For the past 10 years, he has been involved with alternate media, web accessibility, and accessible technology implementations across the campus for students with disabilities, though he believes these technologies should be used by all students. He has also served as a tech consultant for the UC Davis MIND Institute in research on mobile apps and organization for students with learning disabilities and students on the spectrum. He co-authored a chapter on assistive technologies in the book The Guide to Assisting Students With Disabilities: Equal Access in Health Science and Professional Education.

Handout(s)