Media transcription at cloud scale: opportunities for museum collections?

Ali Haberfield, Dec 2
Co-authored with Candice Cranmer (ACMI Collections)

A sample automatically generated caption from “Flash Back: ACMI Collections Remix White Night 2017”

As a museum of film, video games, digital culture and art, ACMI holds a large collection of video and audio files of great value to researchers, artists and the public.

How do we make sure people can find the works that are most relevant to them, and how can we make sure that they can access and make use of them once they’re found?

At its most basic, transcription creates an accompanying timecoded text for each digitised video or audio file in the collection, and has the potential to hugely improve both the accessibility and discoverability of a media collection like ACMI’s.

Spoken in crisp, government-standard English with excellent audio quality, even from 60 years ago, they closely match what Amazon Transcribe expects words to sound like, and had an accuracy rate of above 90%.

A successful transcription of the narrator voice-over of “Don’t Be A Fall Guy”

When it comes to accented English, some clips transcribed clearly and others didn’t, and our expectations about which would work and which wouldn’t were frequently wrong.

Audio track quality (such as an echoing sound on the voice) or background noise has a much bigger impact on transcription accuracy than accent variation does.

Taken from “Carmel: Storyteller”.

How does the cost work out?

The price of Amazon Transcribe is around 2c per minute, compared to a cost of around $1 per minute for commercial transcription services, roughly a fifty-fold difference.

These are, after all, not just files of unattributed audio-visual content; they are the labour of someone’s creative output, personal story or family treasure acquired by the museum to ensure their legacy.

Some of the questions we found ourselves asking to shape the next steps in this project were:

Do the obvious advantages of providing access to audio-visual content through transcription services outweigh the need to ensure transcription is always 100% accurate? If so, would a percentage of accuracy per film suffice?

For example, could we use the word-by-word confidence rating returned by Amazon Transcribe to automatically flag videos that require manual checking, and those that could be released without checking?

How can we avoid a situation where works dominated by mid-20th-century government-standard English (overwhelmingly spoken by white men in our training films and documentary-style content) continue to hold a position of privilege by having the greatest reach in our transcribed collection?

Automated transcription is a powerful blunt instrument to apply to our large video collection.
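
The question above about using word-by-word confidence ratings to flag videos for manual checking could be sketched roughly as follows. This is a minimal illustration, not a tested workflow: it assumes the transcript JSON layout that Amazon Transcribe returns (a `results.items` list in which each spoken word has a `confidence` score stored as a string), and the function names, the 0.80 per-word threshold and the 10% cut-off are hypothetical values chosen only for the example.

```python
def low_confidence_fraction(transcript, word_threshold=0.80):
    """Return the share of spoken words scored below word_threshold.

    `transcript` is a dict parsed from a transcript JSON file.
    The threshold is an illustrative assumption, not a recommended value.
    """
    confidences = [
        float(item["alternatives"][0]["confidence"])
        for item in transcript["results"]["items"]
        # Punctuation items are not spoken words, so they are skipped.
        if item.get("type") == "pronunciation" and item.get("alternatives")
    ]
    if not confidences:
        # No recognised words at all: treat the file as fully uncertain.
        return 1.0
    low = sum(1 for c in confidences if c < word_threshold)
    return low / len(confidences)


def needs_manual_check(transcript, word_threshold=0.80, max_low_fraction=0.10):
    """Flag a transcript when too many of its words have low confidence."""
    return low_confidence_fraction(transcript, word_threshold) > max_low_fraction


# A tiny hand-made transcript in the assumed layout, for demonstration only.
sample = {
    "results": {
        "items": [
            {"type": "pronunciation",
             "alternatives": [{"confidence": "0.99", "content": "Don't"}]},
            {"type": "pronunciation",
             "alternatives": [{"confidence": "0.52", "content": "be"}]},
            {"type": "punctuation",
             "alternatives": [{"confidence": "0.0", "content": "."}]},
        ]
    }
}

print("Needs manual check:", needs_manual_check(sample))
```

A per-film fraction like this is only one possible measure; a museum might equally flag on the single lowest-scoring passage, or weight names and places more heavily than common words.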
