automatic transcription of audio recordings?

5 replies [Last post]
chaosmonk

I am a member!

I am a translator!

Joined: 07/07/2017

I have many hours of footage that I need to go through, extract the interesting parts from, and organize into clips that I can easily sort through later. Is there free software that could speed this up? Specifically, does anyone know of free software that can automatically transcribe an audio recording and insert timestamps into the transcription? It does not need to be 100% accurate, as long as it is coherent enough for me to skim for the interesting parts and then easily find the corresponding spot in the video, so that I can extract the clip and clean up its transcription. (The final output I would like is a set of clips, plus a file listing each clip's filename alongside its transcription.)

jxself
Joined: 09/13/2010

CMUSphinx

https://cmusphinx.github.io/2011/07/cmusphinx-power-video-subtitle-editing-tool/

The program in question: https://otsaloma.io/gaupol/

Granted, it's for making subtitles, but it seems to meet the criteria: the output is text, and the subtitles have times in them.

Or perhaps more generally, find a way to use the pocketsphinx plugin for the GStreamer multimedia framework. But that might be more work.

chaosmonk
Joined: 07/07/2017

Thanks. That would have worked, but it looks like Gaupol has since removed[1] that feature. Maybe I can try an older version.

I once tried to use CMUSphinx on its own to automatically transcribe speech, but found it too confusing and could not get it to do anything. Maybe I'll try again.

[1] https://github.com/otsaloma/gaupol/commit/bd49d8d40fbe669282bb770cb83d2e3a189afef5

andyprough
Joined: 02/12/2015

You may want to look at these two projects:
https://github.com/raryelcostasouza/pyTranscriber
https://github.com/espy/transcribe

I don't know anything about them other than that they exist and claim to do what you are looking for.

chaosmonk
Joined: 07/07/2017

Thanks, but neither of these is suitable.

> https://github.com/raryelcostasouza/pyTranscriber

The UI is free software, but the actual work is done on Google's servers, which is a no-go for me.

> https://github.com/espy/transcribe

This looks like an in-browser equivalent to Gaupol. It facilitates manual transcription, but does not provide automatic transcription.

Today I found some relatively easy-to-use Python bindings[1] to pocketsphinx, which I was able to use to get a rough transcription of an audio recording. I'm going to see if I can work out a Python script that does what I need, timestamps included. After a day of searching, it seems that most free software tools for automatic transcription use one of three speech-to-text backends: CMUSphinx, Google, or Mozilla's DeepSpeech[2] (which uses Google's TensorFlow software but does not appear to rely on Google's servers). I may look into DeepSpeech, but right now CMUSphinx seems like the quickest way to get a usable result with software that works offline.

[1] https://github.com/bambocher/pocketsphinx-python

[2] https://github.com/mozilla/STT
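
A minimal sketch of what such a script might look like, offered as an illustration rather than a tested solution. The helper names (format_timestamp, group_words, words_from_pocketsphinx) are hypothetical. The first two are plain Python: they take per-word timings in frames and group them into timestamped lines, starting a new line after each long pause. The last one assumes the AudioFile interface described in the README of the bindings in [1] (an audio_file and frate setting, and phrase.seg() yielding segments with word, start_frame and end_frame); that interface may differ between versions, so check it against the version you install.

```python
def format_timestamp(seconds):
    """Render a number of seconds as HH:MM:SS (fractions are dropped)."""
    total = int(seconds)
    hours, rem = divmod(total, 3600)
    minutes, secs = divmod(rem, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"


def group_words(words, fps=100, max_gap=1.0):
    """Group (word, start_frame, end_frame) tuples into timestamped lines.

    A new line starts whenever the pause before a word is longer than
    max_gap seconds. fps is the number of frames per second.
    """
    lines = []
    current = []
    line_start = None
    prev_end = None
    for word, start_frame, end_frame in words:
        start = start_frame / fps
        end = end_frame / fps
        if current and start - prev_end > max_gap:
            # Long pause: close the current line before starting a new one.
            lines.append((line_start, " ".join(current)))
            current = []
        if not current:
            line_start = start
        current.append(word)
        prev_end = end
    if current:
        lines.append((line_start, " ".join(current)))
    return [f"[{format_timestamp(t)}] {text}" for t, text in lines]


def words_from_pocketsphinx(path, fps=100):
    """Yield (word, start_frame, end_frame) from an audio file.

    ASSUMPTION: relies on the AudioFile interface shown in the
    pocketsphinx-python README; not verified against every version.
    """
    from pocketsphinx import AudioFile  # imported lazily; optional dependency

    for phrase in AudioFile(audio_file=path, frate=fps):
        for seg in phrase.seg():
            # Skip markers such as <s>, </s>, <sil> or [NOISE].
            if not seg.word.startswith(("<", "[")):
                yield seg.word, seg.start_frame, seg.end_frame
```

With real audio, something like print("\n".join(group_words(words_from_pocketsphinx("recording.wav")))) would print the rough transcript, where "recording.wav" is a placeholder filename. The pause threshold is a guess; a larger max_gap gives fewer, longer lines.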

Beko
Joined: 08/31/2019

https://simon.kde.org/ seems to use CMUSphinx (I don't know whether it works).
I also found https://github.com/julius-speech/julius.

I haven't seen any reason to think either is SaaS, though I may be wrong. Good luck transcribing!