This simple-to-use yet effective system for managing parsers was built to populate content on business-listing websites for the USA and the UK. It consists of two logical components:
* a module for launching and managing parsers
* a website as a user interface.

Main features:
* over 450 implemented parsers of varying complexity
* high parser performance through parallel execution
* built-in anti-detection measures: proxy rotation, no headless flag (browsers run on a virtual display), and anti-detect tools for browsers controlled by Selenium
* control of parser operations: launching a parser with parameters (including or excluding ranges of states, provinces, pages, etc.), launching all parsers in a category, and stopping a run early
* monitoring of parser operations: number of successfully processed points, number of blocked proxies, overall operational status
* ability to download log files for problem analysis
* ability to update the list of parsers without stopping all of them.
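
The launch parameters, parallel execution and early stop listed above could be combined roughly as follows. This is only a minimal sketch, not the project's actual code: the names `select_states`, `run_parser` and `run_all` are hypothetical, and the parsing work is replaced by a short sleep.

```python
import multiprocessing as mp
import time


def select_states(all_states, include=None, exclude=None):
    """Apply optional include/exclude filters to the list of states to parse."""
    excluded = set(exclude or ())
    return [
        s for s in all_states
        if (include is None or s in include) and s not in excluded
    ]


def run_parser(name, states, stop_event, results):
    """Hypothetical parser loop: one 'point' per state, stop flag checked between points."""
    processed = 0
    for _state in states:
        if stop_event.is_set():  # early stop requested by the operator
            break
        time.sleep(0.01)  # stand-in for real fetching and parsing
        processed += 1
    results.put((name, processed))


def run_all(parser_names, states, stop_event=None):
    """Run every parser in its own process and collect the processed-point counts."""
    stop_event = stop_event or mp.Event()
    results = mp.Queue()
    procs = [
        mp.Process(target=run_parser, args=(name, states, stop_event, results))
        for name in parser_names
    ]
    for proc in procs:
        proc.start()
    # Drain the queue before joining so no child blocks on a full pipe.
    collected = dict(results.get(timeout=30) for _ in procs)
    for proc in procs:
        proc.join()
    return collected
```

A caller would invoke something like `run_all(["parser_a", "parser_b"], select_states(["AL", "AK", "AZ"], exclude=["AK"]))` under an `if __name__ == "__main__":` guard, and set the shared event from the user interface to stop the run early.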

Technical stack:
* Frameworks: FastAPI
* Libraries: Bootstrap, pymysql, Pillow (used for tasks involving image parsing)
* Parsing infrastructure: multiprocessing, requests, BeautifulSoup, Selenium, undetected-chromedriver, xvfb
* Other tools: Docker and docker-compose, Sentry
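
The proxy rotation and blocked-proxy count listed among the main features might look something like the sketch below. The `ProxyRotator` class and the `as_requests_proxies` helper are assumptions made for illustration; only the shape of the `proxies` mapping comes from the `requests` library.

```python
from itertools import cycle


class ProxyRotator:
    """Hypothetical round-robin rotator that skips proxies marked as blocked."""

    def __init__(self, proxies):
        self._proxies = list(proxies)
        self._blocked = set()
        self._cycle = cycle(self._proxies)

    def next_proxy(self):
        """Return the next usable proxy, or raise if every proxy is blocked."""
        for _ in range(len(self._proxies)):
            proxy = next(self._cycle)
            if proxy not in self._blocked:
                return proxy
        raise RuntimeError("all proxies are blocked")

    def mark_blocked(self, proxy):
        """Record that a target site has blocked this proxy."""
        self._blocked.add(proxy)

    @property
    def blocked_count(self):
        """Number of blocked proxies, as a monitoring page could display it."""
        return len(self._blocked)


def as_requests_proxies(proxy):
    """Build the mapping accepted by the `proxies` argument of requests.get."""
    return {"http": proxy, "https": proxy}
```

A parser would call `next_proxy()` before each request, pass `as_requests_proxies(proxy)` to `requests.get`, and call `mark_blocked(proxy)` when the response indicates a block.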
Work details
Added 12 May
Freelancer
Oleksandr Y.
Ukraine Kyiv