Challenges
Because transcription is so time-intensive, many oral history collections remain undervalued in digital initiatives. Accessibility standards require both accurate transcriptions and a keyboard-navigable interface, which the OHD template, created by Devin Becker, addresses.
Despite this front-end advancement, the initial transcription process has remained a significant hurdle. Machine learning speech-to-text has improved, but earlier free tools were inadequate, and manual transcription is tedious and prone to errors without close supervision. Tagging, handled by student workers, faced its own issues: uncontrolled vocabulary, knowledge gaps, and biases introduced by linear listening.
✺
Linear listening, or creating tags by listening to an oral history collection from beginning to end, may mislead transcribers in two ways: themes that repeat in early recordings can be treated as collection-wide when they are not, and themes that only begin to appear in later recordings can be missed. Distant listening, the approach that gives this presentation its name, is an alternative: it text mines the combined transcripts and generates tags before the student worker begins the copy editing process. The goal is richer, more accurate data that ultimately allows researchers to identify more connections across entire oral history collections.
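To make the idea of generating tags from combined transcripts concrete, here is a minimal Python sketch. It is an illustration only, not the project's actual workflow: the stopword list, the `candidate_tags` function, its `min_docs` threshold, and the sample transcripts are all hypothetical. It suggests as candidate tags the terms that appear in more than one recording, so that themes are judged across the whole collection rather than in listening order.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stopword list; a real project would use a fuller one.
STOPWORDS = {"the", "a", "an", "and", "of", "to", "in", "we", "i", "was",
             "it", "that", "on", "at", "for", "my", "our", "were", "from"}

def tokenize(text):
    """Lowercase the text and return its words, minus stopwords."""
    return [w for w in re.findall(r"[a-z']+", text.lower()) if w not in STOPWORDS]

def candidate_tags(transcripts, min_docs=2, top_n=10):
    """Rank terms by how many transcripts mention them, then by total count.

    transcripts: dict mapping a recording id to its transcript text.
    Returns a list of (term, transcripts_containing_term, total_occurrences).
    """
    doc_freq = Counter()  # number of transcripts each term appears in
    total = Counter()     # occurrences of each term across the collection
    for text in transcripts.values():
        tokens = tokenize(text)
        total.update(tokens)
        doc_freq.update(set(tokens))
    candidates = [(term, doc_freq[term], total[term])
                  for term in total if doc_freq[term] >= min_docs]
    candidates.sort(key=lambda t: (-t[1], -t[2], t[0]))
    return candidates[:top_n]

# Made-up sample data for illustration only.
transcripts = {
    "interview_01": "We worked at the mill and the mill closed in winter.",
    "interview_02": "My father logged timber for the mill.",
    "interview_03": "The flood came in winter and the timber washed away.",
}
tags = candidate_tags(transcripts)
print(tags)
```

In this sketch a term such as "timber", which first shows up in the second recording, is still surfaced because the count is taken over the whole collection. The resulting list is only a starting point for a controlled vocabulary that a person would review during copy editing.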
✺