Building a knowledge-base pipeline for an AI training course
The client has access to an AI course hosted on a separate platform. We built a pipeline that:
1. Downloads all video files from the platform or from YouTube (some of the course videos were hosted there);
2. Extracts the audio track from each video;
3. Splits the audio into chunks of up to 15 minutes and sends each chunk to the AI API for transcription.
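Steps 2 and 3 could be sketched roughly as follows. This is a minimal illustration, not the project's actual code: it assumes the ffmpeg command-line tool is used for audio work (the write-up does not name one), and the function names, the mono 16 kHz setting, and the file paths are hypothetical. The transcription call itself is left out.

```python
import math
from pathlib import Path

CHUNK_SECONDS = 15 * 60  # the 15-minute upper bound from step 3


def chunk_bounds(duration_s, chunk_s=CHUNK_SECONDS):
    """Return (start, end) offsets in seconds that cover the whole track."""
    if duration_s <= 0:
        return []
    count = math.ceil(duration_s / chunk_s)
    return [(i * chunk_s, min((i + 1) * chunk_s, duration_s)) for i in range(count)]


def extract_audio_cmd(video: Path, audio: Path) -> list[str]:
    """Build an ffmpeg command that keeps only the audio track (assumed tool)."""
    # -vn drops the video stream; mono 16 kHz is an assumed speech-friendly setting
    return ["ffmpeg", "-y", "-i", str(video), "-vn", "-ac", "1", "-ar", "16000", str(audio)]


def cut_chunk_cmd(audio: Path, start: float, end: float, out: Path) -> list[str]:
    """Build an ffmpeg command that cuts one chunk out of the audio file."""
    # -ss sets the start offset and -t the chunk length, both in seconds
    return ["ffmpeg", "-y", "-ss", str(start), "-t", str(end - start), "-i", str(audio), str(out)]
```

Each command list can be passed to subprocess.run(cmd, check=True). A 2000-second track, for example, yields three chunks: two full 15-minute parts and a shorter remainder.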
As a result, we obtained more than 100 PDF files organized by topic, cleaned of filler words and irrelevant information.
Next, a separate notebook was created in NotebookLM, and all the processed material was uploaded to it.
The knowledge base can now be used in whatever way is needed: asking questions, building presentations, developing tests for employee training, generating audio files on a given topic for easier learning, and so on.
Stack:
Python
Gemini API
NotebookLM
#python #gemini #notebook