Study Smarter With AI Tools
Transcribe Lecture Videos with OpenAI Whisper

Remote culture has its advantages, especially when you study and work at the same time. Having access to the lecture recordings afterwards is super valuable, because it lets you do the studying outside office hours.
For a resourceful engineer, this also opens up new possibilities. If you think about it, do we really need to listen to a two-hour lecture when we can transcribe it into text and skim through it to see whether there's any useful information beyond the source materials?
The answer is, of course, no.
I fired up a Google Colab notebook and installed OpenAI's Whisper.
!pip install git+https://github.com/openai/whisper.git
I extracted the audio from the lecture videos with ffmpeg, uploaded the audio files to Drive, and then mounted Drive in the notebook to load them.
from google.colab import drive
drive.mount('/content/drive')
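The ffmpeg extraction step mentioned above can be scripted as well. Here is a minimal sketch, assuming ffmpeg is available on the PATH and that the video's audio track is AAC, so it can be copied into an .m4a file without re-encoding. The helper names `build_ffmpeg_command` and `extract_audio` and the file name in the usage note are hypothetical, made up for illustration.

```python
import subprocess
from pathlib import Path


def build_ffmpeg_command(video_path: str, audio_path: str) -> list:
    """Build an ffmpeg command that copies the audio track out of a video."""
    return [
        "ffmpeg",
        "-i", video_path,   # input video file
        "-vn",              # drop the video stream
        "-acodec", "copy",  # copy the audio as-is (assumes AAC audio for .m4a)
        audio_path,
    ]


def extract_audio(video_path: str) -> str:
    """Run ffmpeg and return the path of the extracted .m4a file."""
    audio_path = str(Path(video_path).with_suffix(".m4a"))
    # check=True raises CalledProcessError if ffmpeg exits with an error
    subprocess.run(build_ffmpeg_command(video_path, audio_path), check=True)
    return audio_path
```

Calling `extract_audio("lecture01.mp4")` would write `lecture01.m4a` next to the video. If the audio track is not AAC, the copy will likely fail and the audio would need to be re-encoded instead.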
Import Whisper and load the model; the available models are listed in the docs. The next step is simply to iterate over the files, transcribe each one, and enjoy the results via text search.
import whisper
import os
model = whisper.load_model("medium")
path = "drive/MyDrive/MY_DRIVE_PATH"
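# Only pick up the audio files, so re-running the cell skips the .txt outputs
files = [f for f in os.listdir(path) if f.endswith(".m4a")]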
files
# ['audio1025532515.m4a', 'audio1358476021.m4a']
for f in files:
    # Transcribe each audio file and save the text next to it in Drive
    result = model.transcribe(f"{path}/{f}")
    with open(f"{path}/{f}.txt", "w", encoding="utf-8") as out:
        out.write(result["text"])
And that's it.
print(result["text"])
# Huomentapäivää. Aloitellaan tämän päivän luentoa....
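Besides the full text, the dictionary returned by `model.transcribe` also contains a `"segments"` list, where each segment has `"start"` and `"end"` times in seconds plus its own `"text"`. A minimal sketch of how that could be turned into a timestamped transcript, which helps when you want to jump to the right spot in the recording; `format_segments` is a hypothetical helper name, not part of Whisper.

```python
def format_segments(result: dict) -> str:
    """Return one line per segment, prefixed with its start time as mm:ss."""
    lines = []
    for segment in result.get("segments", []):
        # Segment start times are floats in seconds; truncate to whole seconds
        minutes, seconds = divmod(int(segment["start"]), 60)
        lines.append(f"[{minutes:02d}:{seconds:02d}] {segment['text'].strip()}")
    return "\n".join(lines)
```

Writing `format_segments(result)` to the text file instead of `result["text"]` would give one timestamped line per segment.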
Now I have access to the transcriptions in my Drive!
os.listdir(path)
# ['audio1025532515.m4a',
# 'audio1358476021.m4a',
# 'audio1025532515.m4a.txt',
# 'audio1358476021.m4a.txt',]
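With the transcripts saved as .txt files, the text search can be done in the notebook too. This is a minimal sketch, assuming all transcripts sit in one folder and are UTF-8 encoded; `search_transcripts` and its `context` parameter are hypothetical names chosen for this example.

```python
import re
from pathlib import Path


def search_transcripts(folder: str, keyword: str, context: int = 60) -> list:
    """Return (file name, snippet) pairs for each case-insensitive keyword hit."""
    pattern = re.compile(re.escape(keyword), re.IGNORECASE)
    hits = []
    for txt_file in sorted(Path(folder).glob("*.txt")):
        text = txt_file.read_text(encoding="utf-8")
        for match in pattern.finditer(text):
            # Keep some surrounding characters so the hit is readable
            begin = max(0, match.start() - context)
            end = min(len(text), match.end() + context)
            hits.append((txt_file.name, text[begin:end]))
    return hits
```

For example, `search_transcripts(path, "tentti")` would list every place where the exam is mentioned, together with the name of the transcript file it came from.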
I found this useful and thought it would be nice to share. I hope you find it useful too! Thanks for reading.




