Collection and parsing of information
We need structured information collected and parsed from the websites of Ukrainian and foreign higher education institutions, colleges, and online course platforms (Abiturients, Mudra, Coursera, Udemy, etc.), then prepared as Excel files that follow the provided structure.
Volume:
Ukrainian colleges — ~650+ entries
Ukrainian higher education institutions — ~350+ entries
Foreign higher education institutions — all from the source website
Online courses in Ukrainian — ~9,000+ entries
Online courses in foreign languages — ~26,000+ entries
Reviews — min. 180,000 entries
Deadline: 6 days, with phased submission.
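For a rough sense of scale, the volumes and deadline above can be turned into a required throughput. This is only a back-of-the-envelope sketch: it uses the lower bounds from the list, leaves out the foreign institutions because no count is given for them, and assumes collection runs around the clock. One record does not necessarily equal one HTTP request, so the real request rate may differ.

```python
# Lower-bound volumes taken from the posting; foreign institutions are
# excluded because the posting gives no count for them.
VOLUMES = {
    "ukrainian_colleges": 650,
    "ukrainian_universities": 350,
    "courses_ukrainian": 9_000,
    "courses_foreign": 26_000,
    "reviews": 180_000,
}
DEADLINE_DAYS = 6

total_records = sum(VOLUMES.values())
records_per_day = total_records / DEADLINE_DAYS
# Assumes collection runs 24 hours a day for the whole deadline.
records_per_second = records_per_day / (24 * 60 * 60)

print(f"total records: {total_records}")
print(f"per day: {records_per_day:.0f}")
print(f"per second (round the clock): {records_per_second:.2f}")
```

Time spent on development, cleaning, and checking is not counted here, so the effective collection window is shorter than the full 6 days.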
We are looking for a specialist with experience in parsing large volumes of data and preparing valid datasets.
Client's review of cooperation with Luka Grachov
Unfortunately, we could not complete the task because the contractor fell ill.
-
Hello!
My name is Semen, and I am a manager at Wanord. We specialize in parsers, large-scale data collection, and preparing structured datasets (Excel/CSV) to a ready-made technical specification.
📌 What is needed:
Collect and structure data from the websites of Ukrainian and foreign universities/colleges and online course platforms (Abiturients, Mudra, Coursera, Udemy, etc.) and produce Excel files in the specified structure. The volume is tens of thousands of records plus at least 180,000 reviews, with phased delivery over 6 days.
🔧 What we will do:
We will review your target Excel file structure and agree on the field format (types, mandatory fields, encoding, languages).
…
We will develop separate parsers for:
Ukrainian colleges (~650+);
Ukrainian universities (~350+);
Foreign universities (full list from the source website);
Online courses (UA ~9,000+, foreign ~26,000+);
Reviews (180,000+).
We will implement stable data collection that accounts for rate limits and bot protection (IP rotation, pauses, error logging).
We will perform data cleaning and validation (duplicates, empty fields, date formats, encoding).
We will prepare Excel files strictly according to your structure, plus a basic quality check (spot checks on random samples).
We will deliver the results in phases: first the universities/colleges, then the online courses, and finally the block of reviews.
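The cleaning and validation step listed above (duplicates, empty fields, date formats) could look roughly like the sketch below. It is an illustration rather than anyone's actual pipeline: the function names, the field names in the usage example, and the list of accepted date formats are all assumptions, and export to Excel is left out.

```python
from datetime import datetime

# Assumed input date formats; the real sources may use others.
DATE_FORMATS = ("%Y-%m-%d", "%d.%m.%Y", "%d/%m/%Y")


def normalize_date(value):
    """Return the date as ISO 8601 (YYYY-MM-DD), or None if no format matches."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(value.strip(), fmt).date().isoformat()
        except ValueError:
            continue
    return None


def clean_records(records, required, key_fields, date_field=None):
    """Trim strings, drop incomplete rows, normalize dates, remove duplicates.

    Returns (cleaned_rows, rejected), where rejected is a list of
    (original_row, reason) pairs kept for later review.
    """
    seen = set()
    cleaned, rejected = [], []
    for raw in records:
        # Strip surrounding whitespace from every string value.
        row = {k: v.strip() if isinstance(v, str) else v for k, v in raw.items()}
        if any(not row.get(field) for field in required):
            rejected.append((raw, "missing required field"))
            continue
        if date_field and row.get(date_field):
            iso = normalize_date(row[date_field])
            if iso is None:
                rejected.append((raw, "unparseable date"))
                continue
            row[date_field] = iso
        # Case-insensitive duplicate check on the chosen key fields.
        key = tuple(str(row.get(f, "")).casefold() for f in key_fields)
        if key in seen:
            rejected.append((raw, "duplicate"))
            continue
        seen.add(key)
        cleaned.append(row)
    return cleaned, rejected


# Usage with made-up rows (example.com URLs are placeholders).
rows = [
    {"name": " Course A ", "url": "https://example.com/a", "date": "01.09.2024"},
    {"name": "course a", "url": "https://example.com/a", "date": "2024-09-01"},
    {"name": "", "url": "https://example.com/b", "date": "2024-09-02"},
    {"name": "Course C", "url": "https://example.com/c", "date": "soon"},
]
cleaned, rejected = clean_records(
    rows, required=("name", "url"), key_fields=("url",), date_field="date"
)
```

Keeping the rejected rows together with a reason, instead of silently dropping them, makes the later spot check easier to carry out.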
💼 Experience:
We have experience parsing large volumes (hundreds of thousands of rows and more), building stable parsers that work under load, and preparing datasets for analytics/ML. We can send examples privately.
💰 Estimated budget: $1500–2300
⏱️ Deadline: up to 6 days with phased delivery (provided we have access to all sources and a final agreed file structure).
We are ready to take on the project and move straight to clarifying the Excel structure and the plan of stages. Write to me in private messages and send the template files and sources; we will then agree on the final budget and delivery schedule.
-
Hello!
I have experience in developing Python scripts for data collection. I am ready to complete this project.
Please write in private messages.
-
Hello, I do parsing work on a regular basis and can help with your task. Message me privately and we will agree on the details.
Minimum price per stage
-
Ready to cooperate.
Exact price and terms after more detailed information.
-
Hello! I can write a simple and reliable asynchronous parser so that collection runs faster. I can store the data wherever is convenient for you, and the end result will be an Excel file in the required structure.
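As an illustration of what an asynchronous parser with bounded concurrency might look like, here is a minimal sketch using only the Python standard library. It is not the bidder's implementation. The names `fetch_with_retry` and `crawl` are hypothetical, and `fetch` is an injected coroutine that the caller must supply (for example, a wrapper around an HTTP client), so no particular network library is assumed.

```python
import asyncio
import logging

logger = logging.getLogger("parser")


async def fetch_with_retry(fetch, url, retries=3, pause=1.0):
    """Call the injected coroutine `fetch(url)`, pausing longer after each failure."""
    for attempt in range(1, retries + 1):
        try:
            return await fetch(url)
        except Exception as exc:  # broad on purpose: log and retry any failure
            logger.warning(
                "attempt %d/%d failed for %s: %s", attempt, retries, url, exc
            )
            if attempt < retries:
                await asyncio.sleep(pause * attempt)
    return None  # all attempts failed


async def crawl(fetch, urls, max_concurrency=5, retries=3, pause=1.0):
    """Fetch all URLs with at most `max_concurrency` requests in flight.

    Returns a dict mapping each URL to its result, or None if it never succeeded.
    """
    semaphore = asyncio.Semaphore(max_concurrency)

    async def worker(url):
        async with semaphore:
            return url, await fetch_with_retry(fetch, url, retries, pause)

    results = await asyncio.gather(*(worker(u) for u in urls))
    return dict(results)
```

A caller would run it with something like `asyncio.run(crawl(my_fetch, urls))`, where `my_fetch` is their own coroutine. The semaphore caps the load on the target site, and failed URLs come back as `None` so they can be logged and retried in a later pass.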
-
Good day.
I am interested in your project.
I would like to discuss everything in more detail.
-
Good day!
I would like to clarify that 6 days is too short a timeframe for a project of this scale and volume of data. From my experience, tasks related to parsing large amounts of information require significantly more time for quality implementation.
I have extensive experience in browser automation using tools like Selenium and Playwright. I have created complex parsers for various platforms, including dynamic and protected websites. For example, I developed a complex parser for THREADS (X) that works with obfuscated dynamic HTML. Using computer vision, we collected data on the number of subscribers, likes, comments, reposts, and direct messages, analyzed the virality of posts, accumulated them in a database, and created an analytical dashboard. The system included more than six separate scripts, such as auto-liking, commenting, and monitoring personal topics.
I also have experience in building reliable pipelines for cleaning and storing large datasets, as well as integrating with APIs. I work with tools to bypass restrictions and ensure the stability and scalability of solutions.
If you need help with proper parsing and with preparing valid, structured datasets, I am ready to discuss timelines and stages of work to ensure a quality result.
…
Best regards