Freelance projects

Freelance projects

Correct the mistakes in the work of the AI agent with the vector database.

AI & Machine Learning, Databases & SQL — incorrectly specified categories?

130 USD

Project translated automatically. Log in or register, to view the original

Hello.

I am creating an application. The system is built in such a way: The user makes a request in the application chat - the AI Agent searches for information in the database and provides a response. If the user's request goes beyond the database, then the agent refers to Gemini.

The database is Supabase. Data from two government websites is uploaded there via API.

A flow is built in the backend using n8n (the same agent). It runs through the database and provides answers and comments.

In Supabase, tables with data from 2 websites are created. They are combined into 1 table document_chunks.

The problem is that the AI Agent finds information only in 1 table and does not seem to see the data from the second one. Even if a request is made and the data that needs to be found is copied (we know that this data is there), the agent does not find it.
It is also necessary to build an algorithm for searching for the agent. The information base is huge, and I assume the agent does not know what is relevant and what is not. There is a lack of a "clusters" search system.

Update #1 from 24 January

Regarding different metadata. this is not an error, but an architectural decision. One table stores two fundamentally different types of documents:

Legislation (Sejm) - has the structure "Article/Paragraph".

Court decisions (SAOS) - have the structure "Case Number/Court". The fact that some rows have certain fields while others are NULL is normal practice for mixed databases. This allows the Agent to understand whether it is citing a law or a court ruling. They cannot be unified because they are different entities.

Regarding "we record some vectors, we search for others".

Both for writing to the database (parsing) and for searching (agent), the same model is used — Google Gemini Embeddings.

The vector dimension is fixed — 768.

If the sizes were different, the database would have thrown a transaction error at the stage of writing or searching, and the system would not work at all.

Regarding "multi-stage search" - here I probably agree, because I found no other options, it probably needs to implement hybrid search

Vectors are formed by the Google Gemini Embeddings model via API.

Technically, this is implemented through the Google Gemini Embeddings node in n8n, which sends text to Google servers, receives a vector array embedding in response, and passes it for writing to the database.

The incoming request is transformed into a vector using the Google Gemini model.

A function is called in Supabase that compares the query vector with the document vectors in the database.

Results are sorted by similarity coefficient (from highest to lowest).

The database returns a set number of Top-K most relevant text fragments, regardless of whether it is Sejm law or SAOS court decision, which are then passed into the AI context.

Physically, the data is in the database. I checked this through direct queries to the database — records of SAOS court decisions are loaded correctly.

The agent does not see these decisions at the moment of forming the response. The AI agent cannot correctly match a human question with the legal texts of court decisions during the search. It sees Sejm laws but "skips" SAOS rulings, and this is the main problem.

All logic and structure of the project are implemented standardly within your Supabase account (PostgreSQL). There is nothing external or hidden there.

The search functions are standard SQL functions, they are located directly in the database schema.

The vector size is clearly defined in the data types of the columns in the document table.

Any developer with access to the database can obtain this information in a minute by simply reviewing the Schema structure and the list of functions through an SQL query or the Supabase interface.

Rows where sejm_id exists and saos_id is NULL are legislative acts. They come from the Polish Sejm database. Since this is a law, it has a number in the Sejm register, but it is not a court decision, so it cannot have saos_id.

Rows where saos_id exists and sejm_id is NULL are court decisions from the SAOS database. Each ruling has its unique number in the court system. But a ruling is not a law, the Sejm did not adopt it, so it does not have and cannot have sejm_id.

Proposals 5 Rejected 4

Yevhenii N.

0 0

Projects -
Rating -
Rating 351

Budget: 990 PLN Deadline: 2 days

Good morning,
I have analyzed the problem with your AI agent and identified the main cause — vector competition between two different types of documents in one space. The documents of the Sejm (structured articles) always win against the SAOS rulings (unstructured legal text) in standard vector searches.
Solution
Architecture with two independent search streams:

Query classifier — determines the type of search (law/precedent/both)
Parallel search — separate queries with optimal configuration
Intelligent merging — merging with cross-reference (rulings → laws)

API cost optimization
Current system: ~$0.20-$0.25/query
New system: ~$0.03-$0.04/query
Reduction: 85% (with 1000 queries/month = savings of ~800 PLN/month)
Scope of work
I will perform remotely (9-10h):
Analysis and technical plan (2-3h)
SQL functions + configuration (2-2.5h)
Query classifier (2h)
Basic n8n workflow (3-3.5h)

You will perform independently (savings of ~4-5h):

Data migration (ready SQL script with instructions)
Import workflow + configuration of credentials
Testing according to the checklist
Fine-tuning parameters

You will receive: SQL scripts, n8n workflow JSON, step-by-step instructions, checklist, Telegram support
Pricing
Standard calculation: 1000-1350 PLN (9-10h × 25-30 EUR)
Proposal: 990 PLN
Discount of ~30% from the standard rate
Optionally: full "turnkey" implementation +200 PLN (TeamViewer debugging, extended testing)

If the proposal is OK, we can start with the analysis (2-3h) — you will receive a specific plan and confirmation of feasibility.
Best regards,
Eugeniusz

Roman Z.

7 0

Budget: 500 PLN Deadline: 1 day

I am among the top 5 developers in the category of "Artificial Intelligence and Machine Learning" among ~2100 specialists on the platform.
I guarantee:
- Fast and high-quality task execution
- Strict adherence to deadlines
- Regular communication throughout the entire process
I would be happy to discuss the details of your project in private messages.
The price is conditional.

Hlib U.

4 0

Budget: 500 PLN Deadline: 3 days

Good day. I looked at the description of the problem, a typical case.

It seems that part of the data is either not indexed due to embeddings or simply does not appear in the search. Plus, the agent currently lacks proper selection logic, so it gets lost in the large database and does not understand where to search for what.

I have worked with Supabase, vector search, and RAG systems. I can find out why the second part of the data is not being found. Rebuild the search logic. Make it so that Gemini connects only when there is truly no answer in the database.
As a result, the agent will start to consistently find data and respond adequately.

I am ready to look at your current implementation and suggest a solution.

Jeo Vincent C.

4 2

Budget: 5000 PLN Deadline: 15 days

Hello,

I’m very interested in helping improve your AI Agent’s search capabilities and ensure it reliably retrieves both Sejm legislation and SAOS court decisions. I have extensive experience building and optimizing backend systems with large, multi-entity databases, including Supabase/PostgreSQL, and integrating vector-based AI search workflows.

I can implement a robust, hybrid search algorithm that accounts for multiple document types, improves relevance ranking across large datasets, and introduces clustering or entity-aware retrieval to ensure the AI Agent does not skip any source. I’m comfortable working with n8n workflows, embedding models like Google Gemini, and designing scalable solutions for high-volume, structured data.

I’m confident I can enhance the search logic, improve result accuracy, and ensure the AI Agent fully leverages your combined dataset.

Best regards,
Jeo Vincent Carretas

Symon Baikov

24 1

Budget: 1000 PLN Deadline: 5 days

Hello, Maximilian! Understanding your situation, I propose an integrative approach to solving the problem with the AI agent. First, I will conduct a detailed audit of the existing flow in n8n and the structure of your Supabase database. Relying on my experience in system architecture design, I will quickly identify and resolve the issue with searching among "document_chunks." Additionally, I will develop an optimal clustering algorithm that will enhance the accuracy and relevance of the agent's results. I will configure the system for seamless and scalable operation. Shall we discuss the project further?

Andrii Piatushka

46 0

Projects 48
Rating -
Rating 2 188

Budget: 500 PLN Deadline: 1 day

Good day. Without a real understanding of what you have done there, it is difficult to say the real deadlines and costs. Write to me in private messages, we will discuss in more detail. For now, just preliminary questions.

Are you sure that you have two tables? Or maybe you have two databases? From the picture, nothing is clear, except that there is one table. What does the column chunk_index mean in this table?

What is the dimensionality of the embeddings? What are the sizes of the chunks? Why sub-bases and not specialized solutions?

Maximilian D
Warszawa, Poland

Projects 31
Contests 1
Rating 4.9
Rating 1 102

Correct the mistakes in the work of the AI agent with the vector database.

Yevhenii N.

Roman Z.

Hlib U.

Proposals concealed

Proposals are currently absent

Jeo Vincent C.

Symon Baikov

Andrii Piatushka

Proposals concealed