Python script for bulk uploading documents from two APIs
For the mobile application to work, a database is required that the LLM queries when a user sends a request in the chat. The database must be hosted in Supabase. If the database has no answer, the LLM falls back to an external source via API.
1. Project Description
Develop an asynchronous Python script for bulk uploading documents:
Sejm API: upload all available documents of all types for the entire period.
SAOS API: upload all documents starting from the year 2000.
Store the full text, all fields, and metadata of each document. The target database is Supabase (PostgreSQL).
2. Functional Requirements
Tasks for the contractor (Python 3.8+):
A. Asynchronous Bulk Upload via API
Use aiohttp + asyncio for asynchronous operation.
Parallelize requests (10–50 simultaneous connections, without exceeding the allowed API limits).
Support pagination (maximum pageSize for SAOS is 100; for Sejm, check the documentation).
For SAOS, implement the filter judgmentDate >= 2000-01-01.
Save all received fields: metadata and the full text of the decision.
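The requirements in section A could be sketched roughly as follows. This is a minimal illustration, not the delivered script: the helper names `fetch_all_pages` and `make_saos_fetcher`, the query parameters `pageNumber` and `judgmentDateFrom`, and the `items` response key are assumptions to be checked against the SAOS documentation; only `pageSize` and the 2000-01-01 cutoff come from the brief. It also assumes the total page count is already known (for example from a first request).

```python
import asyncio
from typing import Any, Awaitable, Callable, Dict, List

# Hypothetical type: a coroutine that returns the items of one page.
FetchPage = Callable[[int], Awaitable[List[Dict[str, Any]]]]


async def fetch_all_pages(fetch_page: FetchPage, total_pages: int,
                          max_concurrency: int = 10) -> List[Dict[str, Any]]:
    """Fetch pages 0..total_pages-1 with at most max_concurrency requests in flight."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def guarded(page: int) -> List[Dict[str, Any]]:
        async with semaphore:  # blocks while max_concurrency fetches are running
            return await fetch_page(page)

    pages = await asyncio.gather(*(guarded(p) for p in range(total_pages)))
    # gather() preserves argument order, so items stay in page order.
    return [item for page in pages for item in page]


def make_saos_fetcher(session: Any, url: str, page_size: int = 100) -> FetchPage:
    """Build a page fetcher around an aiohttp.ClientSession supplied by the caller."""
    async def fetch_page(page: int) -> List[Dict[str, Any]]:
        params = {
            "pageSize": page_size,             # from the brief: max 100 for SAOS
            "pageNumber": page,                # assumed parameter name
            "judgmentDateFrom": "2000-01-01",  # assumed parameter name
        }
        async with session.get(url, params=params) as resp:
            resp.raise_for_status()
            payload = await resp.json()
        return payload.get("items", [])        # assumed response key
    return fetch_page
```

Passing the page fetcher in as a parameter keeps the concurrency limit independent of the HTTP layer, so the same function could serve both the Sejm and SAOS sources and can be exercised without network access.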
B. Saving in Supabase
Batch saving of data (batch insert, up to 1000–5000 records at a time).
Use supabase-py for integration.
Develop SQL schema:
Separate tables for sejm_documents and saos_judgments.
Store metadata in JSONB.
Indexes for key fields (date, court, document type).
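Batch saving as described in section B might look like the sketch below. The helper names `batched` and `save_in_batches` are hypothetical; the `table(...).upsert(..., on_conflict=...).execute()` chain follows supabase-py's query builder, and choosing upsert on `source_id` rather than a plain insert is an assumption made here so that a re-run after a failure does not trip over the UNIQUE constraint.

```python
from typing import Any, Dict, Iterator, List, Sequence


def batched(rows: Sequence[Dict[str, Any]], size: int) -> Iterator[List[Dict[str, Any]]]:
    """Yield consecutive slices of rows, each at most `size` long."""
    if size <= 0:
        raise ValueError("size must be positive")
    for start in range(0, len(rows), size):
        yield list(rows[start:start + size])


def save_in_batches(client: Any, table: str, rows: Sequence[Dict[str, Any]],
                    batch_size: int = 1000) -> int:
    """Upsert rows into `table` in batches and return how many rows were sent."""
    saved = 0
    for batch in batched(rows, batch_size):
        # Upsert keyed on source_id (assumption): repeated rows are updated
        # instead of raising a duplicate-key error.
        client.table(table).upsert(batch, on_conflict="source_id").execute()
        saved += len(batch)
    return saved


# Usage sketch (not executed here; requires the supabase package and real keys):
#   from supabase import create_client
#   client = create_client(SUPABASE_URL, SUPABASE_KEY)
#   save_in_batches(client, "saos_judgments", rows, batch_size=1000)
```

The client returned by `create_client` is synchronous, so in an asyncio script these calls would likely need to run in a worker thread (or use the library's async client) to avoid stalling the downloads.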
C. Reliability and Process Control
Automatically resume progress after a failure (checkpoint file).
Retries on failures (up to 3 times, with exponential backoff).
Detailed logging — execution time, received objects, errors.
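One possible shape for the reliability requirements in section C is sketched below. It reads "up to 3 times" as one initial attempt plus up to three retries, which is an interpretation rather than something the brief states; the function names, the logger name, and the JSON checkpoint layout are likewise hypothetical.

```python
import asyncio
import json
import logging
import os
from typing import Any, Awaitable, Callable, Dict

log = logging.getLogger("bulk_upload")  # hypothetical logger name


async def with_retries(call: Callable[[], Awaitable[Any]], retries: int = 3,
                       base_delay: float = 1.0) -> Any:
    """Run call(); on failure retry up to `retries` times, doubling the wait each time."""
    for attempt in range(retries + 1):
        try:
            return await call()
        except Exception as exc:
            if attempt == retries:
                log.error("giving up after %d retries: %s", retries, exc)
                raise
            delay = base_delay * (2 ** attempt)  # 1s, 2s, 4s with the defaults
            log.warning("attempt %d failed (%s); retrying in %.1fs",
                        attempt + 1, exc, delay)
            await asyncio.sleep(delay)


def save_checkpoint(path: str, state: Dict[str, Any]) -> None:
    """Write the checkpoint via a temp file so a crash cannot leave it half-written."""
    tmp = path + ".tmp"
    with open(tmp, "w", encoding="utf-8") as fh:
        json.dump(state, fh)
    os.replace(tmp, path)  # atomic rename on the same filesystem


def load_checkpoint(path: str) -> Dict[str, Any]:
    """Return the saved state, or an empty dict on the first run."""
    if not os.path.exists(path):
        return {}
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)
```

On startup the script would call `load_checkpoint`, skip whatever the state marks as done (for example a last completed page per source), and call `save_checkpoint` after each successfully stored batch.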
D. Vectorization for Search
After import — chunking each document (25 chunks/document; size to be discussed, approximately 1500–2000 characters).
Store chunks in a separate table (document_chunks), referencing the original document.
(Optional) Prepare the chunks for subsequent vectorization via an LLM API (Gemini Flash 2.5 or an equivalent).
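The chunking step in section D could be approached as in this sketch, which splits on character count and prefers to break at whitespace. The 2000-character default reflects the upper end of the range mentioned above, while the function names and the whitespace-preferring strategy are assumptions; the row keys mirror the document_chunks columns from the schema.

```python
from typing import Any, Dict, List


def chunk_text(text: str, max_chars: int = 2000) -> List[str]:
    """Split text into chunks of at most max_chars, preferring whitespace boundaries."""
    if max_chars <= 0:
        raise ValueError("max_chars must be positive")
    chunks: List[str] = []
    start = 0
    while start < len(text):
        end = min(start + max_chars, len(text))
        if end < len(text):
            # Back up to the last space or newline inside the window, if any;
            # otherwise fall through to a hard cut at max_chars.
            cut = max(text.rfind(" ", start, end), text.rfind("\n", start, end))
            if cut > start:
                end = cut
        piece = text[start:end].strip()
        if piece:
            chunks.append(piece)
        start = end
    return chunks


def build_chunk_rows(document_id: int, text: str,
                     max_chars: int = 2000) -> List[Dict[str, Any]]:
    """Turn one document into rows shaped like the document_chunks table."""
    return [
        {"document_id": document_id, "chunk_index": i, "chunk_text": chunk}
        for i, chunk in enumerate(chunk_text(text, max_chars))
    ]
```

For example, `chunk_text("aaaa bbbb cccc", 6)` yields three chunks, one per word, because each six-character window is cut back to the preceding space.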
3. Input Data
Sejm API: all documents (according to documentation), all types, all years.
SAOS API: all courts, court decisions since 2000.
Target DB: new Supabase project (PostgreSQL), account and keys provided by the client.
Expected quantity: 160,000+, 520,000+.
4. Data Structure (SQL Schema)
Table: sejm_documents
CREATE TABLE sejm_documents (
    id BIGSERIAL PRIMARY KEY,
    source_id TEXT UNIQUE NOT NULL,
    document_type TEXT,
    title TEXT,
    content TEXT,
    metadata JSONB,
    created_at TIMESTAMP DEFAULT NOW(),
    updated_at TIMESTAMP,
    is_processed BOOLEAN DEFAULT FALSE
);
Table: saos_judgments
CREATE TABLE saos_judgments (
    id BIGSERIAL PRIMARY KEY,
    source_id TEXT UNIQUE NOT NULL,
    court_type TEXT,
    case_number TEXT,
    judgment_date DATE,
    text_content TEXT,
    metadata JSONB,
    created_at TIMESTAMP DEFAULT NOW(),
    updated_at TIMESTAMP,
    is_processed BOOLEAN DEFAULT FALSE
);
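To show how an API record might be mapped onto the saos_judgments columns, here is a hedged sketch. The input keys (`id`, `courtType`, `courtCases`, `caseNumber`, `judgmentDate`, `textContent`) are assumptions about the SAOS response shape and need to be verified against the API documentation; `to_saos_row` is a hypothetical helper name.

```python
from typing import Any, Dict


def to_saos_row(record: Dict[str, Any]) -> Dict[str, Any]:
    """Map one judgment record (assumed field names) to a saos_judgments row."""
    cases = record.get("courtCases") or []
    case_number = cases[0].get("caseNumber") if cases else None
    # Keep every other received field in JSONB; the full text already has
    # its own column, so it is left out of metadata to avoid storing it twice.
    metadata = {k: v for k, v in record.items() if k != "textContent"}
    return {
        "source_id": str(record["id"]),
        "court_type": record.get("courtType"),
        "case_number": case_number,
        "judgment_date": record.get("judgmentDate"),
        "text_content": record.get("textContent"),
        "metadata": metadata,
    }
```

Storing the remaining payload in `metadata` is one way to satisfy the requirement to keep all received fields without adding a column per field.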
Table: document_chunks
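CREATE TABLE document_chunks (
    id BIGSERIAL PRIMARY KEY,
    -- As written, document_id can only reference sejm_documents;
    -- chunks of saos_judgments would need their own reference column or table.
    document_id BIGINT REFERENCES sejm_documents(id) ON DELETE CASCADE,
    chunk_index INT,
    chunk_text TEXT,
    created_at TIMESTAMP DEFAULT NOW()
);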
5. Deliverables (what must be included in the result)
Asynchronous Python script (+ detailed instructions for running)
SQL scripts for creating the necessary tables
README for deploying the project from scratch
.env.example for configuring Supabase keys
Log file, checkpoint file for monitoring progress
(Optional for MVP): chunking test documents for RAG/LLM indexing verification
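Since the deliverables include a .env.example and the owner has no programming experience, the script could validate its settings up front and fail with a readable message. The sketch below assumes the variable names `SUPABASE_URL` and `SUPABASE_KEY`, which the brief does not specify, and a hypothetical `load_config` helper.

```python
import os
from typing import Dict, Mapping, Optional

# Hypothetical variable names; the actual .env.example may use different ones.
REQUIRED_VARS = ("SUPABASE_URL", "SUPABASE_KEY")


def load_config(env: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Read required settings from the environment, failing early if any is missing."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError(
            "Missing settings: " + ", ".join(missing)
            + ". Copy .env.example to .env and fill them in."
        )
    return {name: env[name] for name in REQUIRED_VARS}
```

A loader such as python-dotenv could populate the environment from the .env file before `load_config()` is called; that dependency is a suggestion, not something the brief requires.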
6. Requirements
Documentation for the script!
Easy launch for the owner without programming experience.
Client's review of cooperation with Illia Antipiev
Ilya, thank you for your work. The project was not easy, but you managed. It took much more time than planned. Completing the task at 100% required patience and many corrections as well as changes from our side. In any case, thank you!
Freelancer's review of cooperation with Maximilian D
Thank you for the collaboration! All materials and accesses were provided on time and updated as necessary. The client was accommodating when they had to postpone execution for personal reasons. A small downside: sometimes I had to answer questions from a person who lacked context, so I had to repeat myself 😔 But credit should be given: they did find some mistakes.
-
20 days, 2323 USD
Good morning,
I propose the development of an asynchronous Python script for bulk loading documents from the Parliament and SAOS API, storing them in Supabase, and preparing them for further vectorization for LLM. The script will support batch insert, checkpoints, retries, and detailed progress logging.
I offer a **quote of 8,500 PLN net** with an estimated completion time of **3–4 weeks**.
In my work, I will use asyncio + aiohttp for parallel data fetching, supabase-py for database integration, and I will also develop an SQL schema and a document chunking system, ensuring easy deployment and complete documentation.
I have experience in Python, asynchronous bulk data fetching scripts, and working with PostgreSQL/Supabase, which allows for a stable and scalable implementation of the entire process.
…
I would be happy to schedule an online meeting to present the implementation plan, the method of chunking documents, and to consult on integration with LLM and API.
-
2 days, 224 USD
Hello!
I have experience in Python, asyncio, and Supabase, worked with big data and APIs, ready to perform your project efficiently and quickly.
-
1 day, 273 USD
Good day!
My name is Roman, and I am among the top 5 developers in the category of "Artificial Intelligence and Machine Learning" among ~1600 specialists on the platform.
I guarantee:
- Fast and high-quality execution of the task
- Clear adherence to deadlines
- Regular communication throughout the entire process
I would be happy to discuss the details of your project in private messages.
-
Winning proposal: 7 days, 224 USD
Hello
I can complete your project
I will write good documentation
For easier deployment, I can create a Docker container
-
7 days, 232 USD
Good day!
I have commercial experience working with Python for 3+ years.
I have worked with Supabase and created automation scripts. I am ready to complete your project.
I suggest improvements from my own experience - to use a circuit breaker + retry for API requests. Also, instead of SQL scripts for creating tables, use migrations. Regarding data optimization, I have a few ideas that I would be happy to discuss.
I have only a few questions about the AI part; I do not fully understand what is required.
-
2 days, 224 USD
Dear Maximilian,
My name is Mikhail, and I am a developer with extensive experience in web application development, automation, and data collection. I would be happy to offer my services for the successful completion of your project. From your specifications, I understand that the best solution for you would be to write a script in Python using the following stack: requests/selenium, sqlalchemy, asyncio/threading. I am a professional in the field of automation and have written many projects related to parallel parsing; it does not matter how complex the resource is from which data needs to be extracted, it will be extracted with maximum speed and quality. To bypass API protection, I will use proxies, and at the end, I will provide a filled database created through sqlalchemy and all the code, and if necessary, I will connect neural networks. I am confident that I can implement your ideas and bring the project to a successful conclusion.
I would be glad to have the opportunity to discuss your project in more detail and answer any questions you may have.
-
3 days, 224 USD
Hello, I have reviewed your task and am interested in implementing it; I would like to collaborate with you. I invite you to a personal meeting for a more detailed discussion.
-
1 day, 213 USD
Good day, I am ready to take on your project. I have skills in Python.
-
6 days, 246 USD
ready to help you out
i think maybe you can use go instead of python
it looks better for this usecase
-
11 days, 218 USD
Good day, I will do everything as you say. I hope for cooperation, write in private messages!
-
1 day, 224 USD
Hello.
I was interested to learn about your project. I am confident that I can do effective and quality work that meets your requirements and expectations. I have over 8 years of experience. I am ready to discuss the details and start working. I look forward to your response.
-
2 days, 224 USD
Hello, I have extensive experience in web development. I am ready to do it quickly and efficiently.
Message me privately – we will discuss the details.
-
3 days, 273 USD
Good evening, Maximilian!
In general, the task is clear, but for an accurate response regarding deadlines and price, I would like to clarify some questions that arose after analyzing your task.
Please write in private messages – we will discuss the details and your wishes.
-
25 days, 1093 USD
Hello! I propose to implement it in Go, divide it into sprints, and start with the simple parts (exclude vectorization, chunks, retries in the early stages), because otherwise the project may not be completed. Minimum starting price.
-
1 day, 224 USD
Hello! I am ready to complete this project with extensive experience in developing various applications.
-
7 days, 224 USD
Hi,
I'm excited to apply for the role involving asynchronous document ingestion and Supabase integration. With deep experience in Python (3.8+), aiohttp, asyncio, and supabase-py, I can deliver a robust, scalable ETL pipeline tailored to your API and database needs.
I’ve previously built similar systems for high-volume document processing, including pagination, batching (1k–5k inserts), checkpoint-based recovery, and JSONB metadata storage in PostgreSQL. I also understand the importance of chunking and structuring documents for future vector-based search and LLM integration.
You’ll receive a fully-documented, production-ready solution — complete with schema scripts, .env templates, logs, and retry logic — designed for ease of deployment even by non-developers.
Looking forward to contributing to your project.
… Best regards,
Jeo Vincent Carretas
-
2 days, 243 USD
Hello, I am the one you need.
I have extensive experience in web development.
Message me privately to discuss the work.
-
2 days, 224 USD
Good day.
I am ready to complete your task quickly and efficiently.
Advantages:
- Ease of use.
- Free support for 2 weeks after the order is completed and error corrections.
- I start working on the day the order is accepted and complete it in the shortest possible time.
… I will be happy to cooperate.
-
1 day, 224 USD
Good day
I am ready to do your work
I will be happy to help with your task quickly and efficiently