Digital co-founder.
Goal: Create an intelligent AI partner for the owner of a construction and development company. The system had to combine a local knowledge base (Obsidian) with a cloud model (OpenAI GPT-4o). Key requirements:
- RAG (Retrieval-Augmented Generation): Responses must be based solely on the company's internal regulations and documents.
- Bidirectional communication: The agent must not only "read" the database but also "write" to it (create new files/regulations on command via Telegram).
- Resource efficiency: Smart indexing to avoid re-reading unchanged files.
My Contribution / Solution:
The solution is built on self-hosted n8n (deployed on Railway), Supabase as the vector database, and Google Drive as cloud storage. The architecture consists of three workflows:
1. Workflow "Smart Indexer" (ETL Pipeline):
Google Drive (Recursive Search): The workflow searches the entire drive for .md, .txt, and .pdf files, traversing nested folders and filtering out unrelated ("foreign") files.
Incremental Sync (Cost Savings): The workflow compares each file's metadata on the drive against the file_tracker table in Supabase (SQL). Only new or modified files are sent on for embedding, which saves up to 90% of OpenAI tokens.
Vectorization: Text is split into chunks, converted into vectors (OpenAI Embeddings), and stored in Supabase.
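
The incremental sync comparison can be sketched in JavaScript, the language of n8n Code nodes. This is a minimal illustration rather than the workflow's actual code: the function name selectFilesToEmbed and the file_tracker columns file_id and modified_time are assumptions, while id and modifiedTime follow the Google Drive API file resource.

```javascript
// Sketch only: the file_tracker column names (file_id, modified_time) are assumed.
function selectFilesToEmbed(driveFiles, trackerRows) {
  // Index the tracker by file ID so each lookup is constant time.
  const lastIndexed = new Map(
    trackerRows.map((row) => [row.file_id, Date.parse(row.modified_time)])
  );

  return driveFiles.filter((file) => {
    const indexedAt = lastIndexed.get(file.id);
    // Keep files that were never indexed, or that changed after the last index run.
    return indexedAt === undefined || Date.parse(file.modifiedTime) > indexedAt;
  });
}

// Hypothetical example data.
const driveFiles = [
  { id: "a1", name: "safety-rules.md", modifiedTime: "2024-05-01T10:00:00Z" },
  { id: "b2", name: "hiring-policy.md", modifiedTime: "2024-05-03T09:30:00Z" },
  { id: "c3", name: "site-checklist.txt", modifiedTime: "2024-05-04T08:00:00Z" },
];
const trackerRows = [
  { file_id: "a1", modified_time: "2024-05-01T10:00:00Z" }, // unchanged
  { file_id: "b2", modified_time: "2024-04-20T12:00:00Z" }, // modified since indexing
];

const toEmbed = selectFilesToEmbed(driveFiles, trackerRows);
console.log(toEmbed.map((file) => file.name));
```

With this sample data only hiring-policy.md (modified) and site-checklist.txt (new) are selected; safety-rules.md is skipped, so no embedding tokens are spent on it.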
2. Workflow "Brain" (Conversational AI Agent):
AI Agent (LangChain): Uses the GPT-4o model with a custom system prompt "Digital Co-Founder".
Long-term Memory: Connected to Postgres Chat Memory (in Supabase), allowing the bot to remember the context of dialogues indefinitely.
Vector Store Tool: The agent's search tool calls a custom SQL function, match_documents, to retrieve the most relevant passages from the knowledge base.
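
The body of match_documents is not shown here. As a rough illustration of what such a function computes, the sketch below assumes it ranks stored chunks by cosine similarity to the query embedding, a common choice for vector search. It runs in memory in JavaScript instead of SQL, and the names cosineSimilarity, matchDocuments, and matchCount are hypothetical.

```javascript
// Sketch only: an in-memory stand-in for a similarity search, assuming cosine similarity.
function cosineSimilarity(a, b) {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  // A zero vector has no direction, so treat it as unrelated.
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Returns the matchCount chunks most similar to the query, best match first.
function matchDocuments(queryEmbedding, documents, matchCount = 3) {
  return documents
    .map((doc) => ({
      content: doc.content,
      similarity: cosineSimilarity(queryEmbedding, doc.embedding),
    }))
    .sort((x, y) => y.similarity - x.similarity)
    .slice(0, matchCount);
}

// Hypothetical two-dimensional embeddings, kept tiny for readability.
const documents = [
  { content: "A", embedding: [1, 0] },
  { content: "B", embedding: [0, 1] },
  { content: "C", embedding: [1, 1] },
];
const topMatches = matchDocuments([1, 0], documents, 2);
console.log(topMatches.map((match) => match.content));
```

Here chunk A points in the same direction as the query, C is partly aligned, and B is orthogonal, so the top two matches are A followed by C.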
3. Workflow "Hands" (File Generator Tool):
Autonomous content creation: The agent can invoke this sub-workflow to create new documents.
Smart Parsing (JavaScript): A sanitizer script parses the AI response (even when it arrives in a non-standard format) into a filename and content.
Write-back: The file is uploaded to Google Drive and then synchronized automatically with the client's local Obsidian vault via Google Drive for desktop.
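
A sanitizer of the kind described under Smart Parsing might look like the sketch below. It is an illustration under stated assumptions, not the original script: it assumes the model replies either with a JSON object holding filename and content (optionally wrapped in a code fence) or with plain text whose first line is the filename. The function names and the .md default extension are hypothetical.

```javascript
// Sketch only: the accepted reply formats and the default extension are assumptions.
const FENCE = "`".repeat(3);

// Removes a surrounding markdown code fence, if the reply is wrapped in one.
function stripCodeFence(text) {
  const trimmed = text.trim();
  const lines = trimmed.split("\n");
  const fenced = trimmed.startsWith(FENCE) && trimmed.endsWith(FENCE) && lines.length >= 3;
  return fenced ? lines.slice(1, -1).join("\n").trim() : trimmed;
}

// Drops characters that are unsafe in filenames and guarantees an extension.
function sanitizeFilename(name) {
  let cleaned = String(name ?? "").replace(/[\\\/:*?"<>|]/g, "").trim();
  if (cleaned === "") cleaned = "untitled";
  return /\.(md|txt)$/i.test(cleaned) ? cleaned : cleaned + ".md";
}

function parseFileRequest(raw) {
  const text = stripCodeFence(String(raw ?? ""));
  try {
    const parsed = JSON.parse(text);
    if (parsed && typeof parsed === "object" && typeof parsed.content === "string") {
      return { filename: sanitizeFilename(parsed.filename), content: parsed.content.trim() };
    }
  } catch (err) {
    // Not valid JSON: fall through to the plain-text format.
  }
  // Plain-text fallback: first line is the filename, the rest is the content.
  const [firstLine, ...rest] = text.split("\n");
  return {
    filename: sanitizeFilename(firstLine.replace(/^filename\s*:\s*/i, "")),
    content: rest.join("\n").trim(),
  };
}

// Hypothetical replies in both formats.
const jsonReply =
  FENCE + "json\n" +
  JSON.stringify({ filename: "Safety/Rules: v2", content: "# Rules\nWear a helmet." }) +
  "\n" + FENCE;
const plainReply = "filename: site-plan.md\nPour concrete on Monday.";

console.log(parseFileRequest(jsonReply));
console.log(parseFileRequest(plainReply));
```

Falling back to a plain-text format instead of failing keeps the write-back step usable when the model ignores the requested JSON shape.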
Result:
The client received a fully autonomous knowledge management system:
"Live" Database: Any change to an Obsidian note is picked up automatically by the bot's "brain".
Strategic partner: The owner can consult the bot on strategy, and the bot responds based on the company's history and context rather than with generic advice.
Routine automation: The bot works as a secretary, creating drafts of contracts, ideas, and plans directly in the owner's working folder.
Reliability: Server timeouts and duplicate records were resolved through SQL optimization and Railway configuration.
#n8n #OpenAI #RAG #Supabase #VectorDatabase #PostgreSQL #Obsidian #KnowledgeManagement #WorkflowAutomation #JavaScript #Railway #SelfHosted #GoogleDriveAPI #AIagent