Parser for public organization data from three open sources
Automated data collection on public sector organizations
Task:
Develop a tool that collects and structures data on public organizations from open online sources and exports the result to an Excel (.xlsx) file.
Integration sources:
YouControl (analytics and company profiles in Ukraine)
Clarity Project (tenders and transparency of public procurement)
Zakupivli.pro (Prozorro – public procurement)
Main work stages:
Search and collection of basic data
organization name
EDRPOU code
region
address
responsible persons
Collection of contact information
phones
email
other available communication channels
Formation of a unified database
field unification
data cleaning and structuring
export to .xlsx (Excel)
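
The unification stage can be illustrated with a minimal sketch. It assumes each source has already been parsed into a list of dictionaries and that records are matched on the EDRPOU code. The field names, the "first non-empty value wins" rule, and the left-padding of codes to 8 digits are illustrative assumptions, not the project's actual schema or rules.

```python
# Hypothetical unified schema: these field names are illustrative only.
SCALAR_FIELDS = ("name", "region", "address", "responsible_persons")
LIST_FIELDS = ("phones", "emails")


def normalize_edrpou(raw):
    """Keep digits only and left-pad to 8 characters; None if no digits."""
    digits = "".join(ch for ch in str(raw) if ch.isdigit())
    if not digits:
        return None
    # Assumption: short codes lost leading zeros (common after Excel round-trips).
    return digits.zfill(8)


def empty_row(code):
    """Create a blank unified record for one EDRPOU code."""
    row = {field: "" for field in SCALAR_FIELDS}
    row["edrpou"] = code
    for field in LIST_FIELDS:
        row[field] = []
    return row


def merge_records(sources):
    """Merge per-source record lists into one row per EDRPOU code."""
    merged = {}
    for records in sources:
        for rec in records:
            code = normalize_edrpou(rec.get("edrpou", ""))
            if code is None:
                continue  # skip records that cannot be matched
            row = merged.setdefault(code, empty_row(code))
            for field in SCALAR_FIELDS:
                value = str(rec.get(field, "")).strip()
                if value and not row[field]:
                    row[field] = value  # first non-empty value wins
            for field in LIST_FIELDS:
                for item in rec.get(field, []):
                    if item not in row[field]:
                        row[field].append(item)  # de-duplicate, keep order
    return list(merged.values())


def export_xlsx(rows, path):
    """Write merged rows to .xlsx; requires pandas and openpyxl."""
    import pandas as pd  # imported lazily so the merge logic has no dependencies

    flat = [{**row, **{f: ", ".join(row[f]) for f in LIST_FIELDS}} for row in rows]
    pd.DataFrame(flat).to_excel(path, index=False)
```

Called as merge_records([source_a, source_b, source_c]) followed by export_xlsx(rows, "organizations.xlsx"), this yields one spreadsheet row per organization. Keying on the EDRPOU code rather than on the name avoids duplicates caused by spelling differences between sources.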
Result:
The client received an Excel file containing the consolidated database of organizations, ready to use for analytics, partner search, monitoring, or marketing tasks.
Technologies:
Python (Requests, BeautifulSoup, Selenium)
pandas, openpyxl
Data processing and normalization
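
As an example of what contact normalization might involve, the sketch below pulls phone numbers and email addresses out of free text and converts phones to a single +380 format. The regular expressions and the accepted phone layouts (12 digits starting with 380, or 10 digits starting with 0) are assumptions chosen for illustration. The project's actual cleaning rules are not described here.

```python
import re

# Deliberately simple patterns; real-world pages may need stricter rules.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)+")
PHONE_RE = re.compile(r"\+?\d[\d\s()\-]{8,}\d")


def normalize_phone(raw):
    """Return a phone as +380XXXXXXXXX, or None if the layout is not recognized."""
    digits = re.sub(r"\D", "", raw)
    if len(digits) == 12 and digits.startswith("380"):
        return "+" + digits          # already in international form
    if len(digits) == 10 and digits.startswith("0"):
        return "+38" + digits        # national form, e.g. 044 123 45 67
    return None


def extract_contacts(text):
    """Collect unique normalized phones and lower-cased emails from free text."""
    emails = sorted({match.lower() for match in EMAIL_RE.findall(text)})
    phones = []
    for candidate in PHONE_RE.findall(text):
        phone = normalize_phone(candidate)
        if phone and phone not in phones:
            phones.append(phone)
    return {"phones": phones, "emails": emails}
```

Storing every phone in one canonical form makes duplicates across the three sources easy to detect, since the same number is often written with different spacing and brackets.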