Remote culture has its advantages, especially when it comes to studying and working at the same time. Having access to the lecture recordings afterwards is super valuable, because it lets you do the studying outside office hours.
For a resourceful engineer, this also opens up new possibilities. If you think about it, do we need to listen to a two-hour lecture when we can just transcribe it into text and skim through it quickly to see if there's any useful information beyond the source materials?
The answer is, of course, no.
Here is what I did: I fired up a Google Colab notebook and installed OpenAI's Whisper.
!pip install git+https://github.com/openai/whisper.git
I extracted the audio from the lecture videos with ffmpeg, uploaded the audio files to Drive, and then mounted Drive in the notebook to load them.
from google.colab import drive
drive.mount('/content/drive')
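The exact ffmpeg command for the audio extraction step is not shown in this post. As a minimal sketch of what it might look like, assuming ffmpeg is installed and the video's audio track is AAC (so it can be copied into an .m4a file without re-encoding), something along these lines should work. The helper names and the file name in the usage note are hypothetical.

```python
import subprocess
from pathlib import Path


def build_ffmpeg_command(video_path, audio_path):
    """Build an ffmpeg command that copies the audio track out of a video."""
    return [
        "ffmpeg",
        "-i", video_path,   # input video file
        "-vn",              # drop the video stream
        "-acodec", "copy",  # copy the audio stream as-is (assumes AAC audio)
        audio_path,
    ]


def extract_audio(video_path):
    """Run ffmpeg and return the path of the extracted .m4a file."""
    audio_path = str(Path(video_path).with_suffix(".m4a"))
    subprocess.run(build_ffmpeg_command(video_path, audio_path), check=True)
    return audio_path
```

Calling `extract_audio("lecture1.mp4")` would then write `lecture1.m4a` next to the video. If the audio track is not AAC, the copy step would have to be replaced with a re-encode.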
Import Whisper and load the model. See the available model sizes in the docs. The next step is just to iterate over the files, transcribe each one, and enjoy the results via text search.
import whisper
import os
model = whisper.load_model("medium")
path = "drive/MyDrive/MY_DRIVE_PATH"
files = os.listdir(path)
files
# ['audio1025532515.m4a', 'audio1358476021.m4a']
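for f in files:
    # Transcribe one audio file from the Drive folder
    result = model.transcribe(f"{path}/{f}")
    # Save the transcript next to the audio file in Drive;
    # use a separate name for the file handle so it does not shadow f
    with open(f"{path}/{f}.txt", "w") as out:
        out.write(result['text'])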
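If the transcripts end up in the same folder as the audio, rerunning the loop would also hand the .txt files to Whisper and redo lectures that are already finished. A small sketch of one way to avoid that is below. The function name and the list of audio extensions are my own assumptions, not part of Whisper.

```python
AUDIO_EXTENSIONS = (".m4a", ".mp3", ".wav")  # assumed set of audio formats


def pending_audio_files(filenames):
    """Return audio files that do not yet have a matching .txt transcript."""
    existing = set(filenames)
    return [
        name for name in sorted(filenames)
        if name.lower().endswith(AUDIO_EXTENSIONS)
        and f"{name}.txt" not in existing
    ]
```

The loop could then iterate over `pending_audio_files(os.listdir(path))` instead of `files`.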
And that's it.
print(result["text"])
# Huomentapäivää. Aloitellaan tämän päivän luentoa....
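Besides the full text, the dictionary returned by `transcribe` also contains a `"segments"` list with `"start"`, `"end"` and `"text"` for each chunk. As a rough sketch, this could be used to prefix each line with a timestamp, which makes it easier to jump to the right spot in the recording. The helper names here are hypothetical.

```python
def format_timestamp(seconds):
    """Format a number of seconds as H:MM:SS."""
    total = int(seconds)
    hours, rest = divmod(total, 3600)
    minutes, secs = divmod(rest, 60)
    return f"{hours}:{minutes:02d}:{secs:02d}"


def timestamped_lines(result):
    """Turn a Whisper result dict into '[H:MM:SS] text' lines."""
    return [
        f"[{format_timestamp(seg['start'])}] {seg['text'].strip()}"
        for seg in result["segments"]
    ]
```

Writing `"\n".join(timestamped_lines(result))` to the .txt file instead of `result["text"]` would keep the timestamps in the saved transcript.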
Now I have access to the transcriptions in my Drive!
os.listdir(path)
# ['audio1025532515.m4a',
# 'audio1358476021.m4a',
# 'audio1025532515.m4a.txt',
# 'audio1358476021.m4a.txt',]
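For the text search part, any editor or the Drive search works. As a minimal sketch of doing it in the notebook itself, assuming the transcripts are UTF-8 .txt files in one folder, a keyword lookup could look like this. The function name is hypothetical, and splitting on ". " is only a crude way to get sentences.

```python
from pathlib import Path


def search_transcripts(folder, keyword):
    """Return (file name, sentence) pairs where the sentence contains keyword."""
    hits = []
    for txt in sorted(Path(folder).glob("*.txt")):
        text = txt.read_text(encoding="utf-8")
        # Crude sentence split; good enough for skimming a transcript
        for sentence in text.split(". "):
            if keyword.lower() in sentence.lower():
                hits.append((txt.name, sentence.strip()))
    return hits
```

For example, `search_transcripts(path, "tentti")` would list every sentence that mentions the exam, together with the file it came from.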
I found this useful, so I thought it would be nice to share. I hope you find it useful too! Thanks for reading.