Challenges
The time-intensive nature of transcription has left many oral history collections undervalued in digital initiatives. Meeting accessibility standards involves not only transcribing recordings but also presenting them in an intuitive, keyboard-navigable digital interface. Developer Devin Becker’s solution in the Oral History as Data (OHD) template displays the audio at the top of the page, followed by a visualization of the entire recording that shows the color-coded tags, a key to the tags, a search bar for keyword queries, and the transcription below. This layout allows researchers to follow along with the timestamped transcript as the audio plays.

Despite this advancement, the initial transcription process has remained a significant hurdle. Since OHD’s development in 2016, machine learning speech-to-text capabilities have improved considerably. Earlier options were either free but so inaccurate that they saved negligible effort compared with working from scratch, or prohibitively expensive. Completely human-driven transcription has its own challenges: it’s tedious, slow-moving work that isn’t going to be the highlight of a student worker’s CV and, without close supervision, can result in lapses in quality not dissimilar to those of poor machine learning output.
✺
Similar lapses in identification and vulnerability to bias were also at play when student workers created tags for OHD transcriptions. Tags, ranging from locations and people to abstract emotions, were created by student workers as they identified them throughout the transcription process. This approach led to multiple challenges: an uncontrolled vocabulary; knowledge gaps among student workers, who may lack the historical, scientific, or regional knowledge to identify tags within dialogue; and tags that are not apparent when a collection of recordings is transcribed linearly.

Linear listening, that is, creating tags by listening to an oral history collection from beginning to end, may mislead transcribers: themes that repeat throughout an early recording may not occur across the rest of the collection, while themes that only begin to appear in later recordings may be missed. Distant listening, the name of this presentation, is an alternative approach that text mines transcriptions and generates tags before the student worker copyediting process, with the goal of producing richer, more accurate tagging and ultimately allowing researchers to more easily identify connections across entire oral history collections.
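
The contrast with linear listening can be illustrated with a minimal sketch. It assumes plain-text transcripts and uses simple document-frequency counting to surface terms that recur across a collection rather than within a single recording. This is an illustration only, not the OHD project's actual pipeline: the function names, the tiny stopword list, and the `min_share` threshold are all hypothetical.

```python
import re
from collections import Counter

# Hypothetical stopword list, kept tiny for illustration only.
STOPWORDS = {"the", "a", "an", "and", "of", "to", "in", "on",
             "we", "i", "was", "were", "it", "that", "at", "by"}


def tokenize(text):
    """Lowercase the text and return its words, minus stopwords."""
    words = re.findall(r"[a-z']+", text.lower())
    return [w for w in words if w not in STOPWORDS]


def candidate_tags(transcripts, min_share=0.5):
    """Return terms that appear in at least `min_share` of the transcripts.

    `transcripts` maps a recording id to its transcript text. Each term is
    counted once per recording, so a theme that dominates one early
    recording does not outweigh one spread across the whole collection.
    """
    doc_freq = Counter()
    for text in transcripts.values():
        doc_freq.update(set(tokenize(text)))
    threshold = min_share * len(transcripts)
    return sorted(term for term, n in doc_freq.items() if n >= threshold)


if __name__ == "__main__":
    # Invented sample data, not drawn from any real collection.
    sample = {
        "rec1": "The mill closed and we left the river.",
        "rec2": "The river flooded the mill.",
        "rec3": "We fished on the river.",
    }
    print(candidate_tags(sample))  # ['mill', 'river']
```

Candidate terms produced this way would still need review against a controlled vocabulary during copyediting; the sketch only shows how counting across recordings differs from tagging one recording at a time.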
✺