Autonomous AI RAG system with vector knowledge base

AI & Machine Learning


Designed and built from scratch an asynchronous Retrieval-Augmented Generation (RAG) system for intelligent search and analysis of complex technical documentation or internal company knowledge bases.

What was implemented in the project:
• Asynchronous backend: High-performance API built with FastAPI, with runtime validation of input data via Pydantic v2.
• Vector core: Native semantic search in the Qdrant database using the cosine similarity metric over locally computed embeddings (384-dimensional float32 vectors).
• AI orchestration: The agent's logic is built as a LangGraph graph (StateGraph) with a single thread-safe state, which makes it easy to add regeneration cycles or response-validation nodes.
• Chunking strategy: Text is split into 400-character chunks with a 100-character overlap, which reduces context loss at chunk boundaries and helps limit model hallucinations caused by truncated context.
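
The chunking and similarity steps listed above can be illustrated with a minimal, standard-library-only sketch. It is not the project's actual code: the function names chunk_text and cosine_similarity are hypothetical, the splitter is a plain sliding window over characters (the project's splitter may also respect sentence boundaries), and in the real system the similarity search is performed by Qdrant rather than in Python.

```python
import math


def chunk_text(text: str, size: int = 400, overlap: int = 100) -> list[str]:
    """Split text into character windows of `size` that overlap by `overlap`.

    Each window starts (size - overlap) characters after the previous one,
    so the tail of one chunk is repeated at the head of the next.
    """
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("require size > 0 and 0 <= overlap < size")
    step = size - overlap
    chunks: list[str] = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + size])
        if start + size >= len(text):
            break  # this window already reaches the end of the text
    return chunks


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen for this sketch: zero vectors match nothing
    return dot / (norm_a * norm_b)
```

With the default parameters, a 1000-character document yields three chunks starting at offsets 0, 300 and 600, and the last 100 characters of each chunk reappear at the start of the next one.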

The system is flexible: it can work with both local models (via Ollama) and cloud APIs (Gemini, Claude, OpenAI). The entire infrastructure is fully containerized using Docker Compose and ready for deployment on a server.
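
One way to keep the pipeline independent of the model provider is a small interface that every backend implements. The sketch below is an assumption about how such a layer could look, not the project's actual design: ChatModel, EchoModel, PROVIDERS, build_model and answer are hypothetical names, and EchoModel is an offline stand-in where a real setup would register adapters for Ollama, Gemini, Claude or OpenAI.

```python
from collections.abc import Callable
from typing import Protocol


class ChatModel(Protocol):
    """Minimal interface each model backend is assumed to provide."""

    def generate(self, prompt: str) -> str: ...


class EchoModel:
    """Offline stand-in that only makes this sketch runnable."""

    def generate(self, prompt: str) -> str:
        return f"echo: {prompt}"


# Hypothetical registry mapping a provider name to a factory.
# Real adapters (local or cloud) would be registered here.
PROVIDERS: dict[str, Callable[[], ChatModel]] = {"echo": EchoModel}


def build_model(provider: str) -> ChatModel:
    """Create the backend selected by name, e.g. from configuration."""
    try:
        factory = PROVIDERS[provider]
    except KeyError:
        raise ValueError(f"unknown provider: {provider!r}") from None
    return factory()


def answer(question: str, context_chunks: list[str], model: ChatModel) -> str:
    """Build a prompt from retrieved chunks and pass it to the chosen model."""
    context = "\n---\n".join(context_chunks)
    prompt = f"Context:\n{context}\n\nQuestion: {question}"
    return model.generate(prompt)
```

Because the retrieval code only depends on the generate method, switching between a local and a cloud model becomes a configuration change rather than a code change.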