Transcription Project: automated transcription of video files

This Python script automatically extracts audio from video files and transcribes it with Vosk, an open-source offline speech recognition toolkit. It is aimed at video lectures, generating text transcriptions that can be used to create educational materials.

Functionality:

1. Extracting audio tracks from video files.
2. Converting audio files to mono at a 16,000 Hz sample rate for better recognition.
3. Full transcription of audio to text.
4. Detailed logging of all stages of the process.
5. Deleting temporary files to save space on the server.
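
The transcription step above could look roughly like the sketch below. It is not the project's actual code: the helper names `join_results` and `transcribe_wav`, the `model_dir` argument, and the chunk size are assumptions made for illustration. It assumes a 16-bit mono PCM WAV file and a Vosk model already downloaded to a local directory.

```python
import json
import wave


def join_results(json_chunks):
    """Join the "text" fields of Vosk JSON result strings into one transcript."""
    texts = (json.loads(chunk).get("text", "") for chunk in json_chunks)
    return " ".join(t for t in texts if t)


def transcribe_wav(wav_path, model_dir, chunk_frames=4000):
    """Transcribe a 16-bit mono PCM WAV file with Vosk (hypothetical helper)."""
    # Imported lazily so join_results() works even without Vosk installed.
    from vosk import KaldiRecognizer, Model

    with wave.open(str(wav_path), "rb") as wav:
        if wav.getnchannels() != 1 or wav.getsampwidth() != 2:
            raise ValueError("expected 16-bit mono PCM audio")
        recognizer = KaldiRecognizer(Model(str(model_dir)), wav.getframerate())
        chunks = []
        while True:
            data = wav.readframes(chunk_frames)
            if not data:
                break
            # AcceptWaveform returns True once a complete utterance is recognized.
            if recognizer.AcceptWaveform(data):
                chunks.append(recognizer.Result())
        # Flush whatever is still buffered in the recognizer.
        chunks.append(recognizer.FinalResult())
    return join_results(chunks)
```

Feeding the audio in small chunks keeps memory use flat for long lectures, since the whole file never has to be loaded at once.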

Key technologies:

• Vosk: for automatic transcription.
• MoviePy: for extracting audio tracks from video.
• Pydub: for processing and normalizing audio files.
• TQDM: for displaying processing progress.
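
As a rough illustration of how MoviePy and Pydub can be combined for the extraction and conversion steps, here is a minimal sketch. The function names, the file naming scheme, and the intermediate `_raw.wav` file are assumptions, not the project's real layout. Pydub relies on ffmpeg being installed for most input formats.

```python
from pathlib import Path


def wav_path_for(video_path, work_dir):
    """Return the output WAV path for a video (hypothetical naming scheme)."""
    return Path(work_dir) / (Path(video_path).stem + "_16k_mono.wav")


def extract_mono_16k(video_path, work_dir):
    """Extract the audio track and convert it to 16 kHz, 16-bit mono WAV."""
    # Third-party imports live inside the function so the path helper
    # above stays usable without MoviePy or Pydub installed.
    try:
        from moviepy import VideoFileClip  # MoviePy 2.x
    except ImportError:
        from moviepy.editor import VideoFileClip  # MoviePy 1.x
    from pydub import AudioSegment

    out_path = wav_path_for(video_path, work_dir)
    raw_path = out_path.with_name(Path(video_path).stem + "_raw.wav")

    clip = VideoFileClip(str(video_path))
    try:
        if clip.audio is None:
            raise ValueError(f"no audio track in {video_path}")
        clip.audio.write_audiofile(str(raw_path))
    finally:
        clip.close()

    # Downmix to mono, resample to 16 kHz, and force 16-bit samples.
    audio = AudioSegment.from_file(str(raw_path))
    audio = audio.set_channels(1).set_frame_rate(16000).set_sample_width(2)
    audio.export(str(out_path), format="wav")

    raw_path.unlink()  # the intermediate file is no longer needed
    return out_path
```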

Resolved tasks and challenges:

• The audio quality issue was resolved by converting the audio to mono and resampling it to 16,000 Hz.
• Disk usage on the server from large video volumes was kept in check by automatically deleting temporary files after transcription.
• Long-running jobs were made easier to monitor by adding a progress bar that shows the current status.
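
One way to combine automatic cleanup, logging, and a progress bar is sketched below. The names `process_all` and `transcribe_one` are hypothetical, and the use of `tempfile.TemporaryDirectory` is an assumption about how cleanup might be done rather than a description of the original script. If tqdm is not installed, the sketch falls back to a plain loop.

```python
import logging
import tempfile
from pathlib import Path

logger = logging.getLogger("transcription")


def process_all(video_paths, transcribe_one):
    """Call transcribe_one(video, work_dir) per video and clean up after each."""
    try:
        from tqdm import tqdm
    except ImportError:
        def tqdm(iterable, **kwargs):  # fallback: iterate without a progress bar
            return iterable

    transcripts = {}
    for video in tqdm(list(video_paths), desc="Transcribing", unit="video"):
        # The directory and its contents are removed when the block exits,
        # even if transcription raises an exception.
        with tempfile.TemporaryDirectory(prefix="transcribe_") as work_dir:
            logger.info("Processing %s in %s", video, work_dir)
            try:
                transcripts[video] = transcribe_one(video, Path(work_dir))
            except Exception:
                logger.exception("Failed to transcribe %s", video)
    return transcripts
```

Giving each video its own temporary directory means a failed job cannot leave audio files behind for the next one.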

Results:

This project gave the client a tool for quickly and automatically creating lecture transcriptions. It significantly reduced video processing time and made ready-to-use text materials available for further work.

Tags (hashtags):

#python #transcription #speech-to-text #audioextraction #automatedworkflow #vosk #pydub #moviepy #audioprocessing #audiotranscription
Work details
Budget 200 USD
Added 30 September 2024
Freelancer
Roman K.
Kyiv, Ukraine