
Viktor Gayoha

Offer Viktor work on your next project.

Chernovtsy, Ukraine
1 month 6 days ago
Available for hire
2 Safes completed
2 months 1 day ago
2 clients
age 29 years
on the service 2 years

Rating

Successful projects
No data
Average rating
No data
Rating
783
Data Parsing
64th place out of 770
Python 1
597th place out of 4476

Language proficiency level

Ukrainian: fluent
Russian: advanced
Polish: intermediate
English: pre-intermediate

Skills and abilities

Portfolio


  • Parsing a protected SPA website, Bypassing Cloudflare and anti-bot systems

    Data Parsing
    Objective: Gather 100% accurate data on over 1000 exhibitors (name, country, booth number, hidden emails and phone numbers, categories) from the official Salone del Mobile website.

    Main challenges:

    Aggressive anti-bot protection (Cloudflare): Standard requests (requests/httpx) returned 403 Forbidden. Regular headless browsers (Selenium, Playwright) and even frameworks like undetected-chromedriver were instantly blocked.

    Complex SPA architecture (React / Next.js): The website did not have standard HTML links. All navigation occurred exclusively through React event handlers (onClick), making traditional URL collection impossible. Additionally, contact details were hidden in non-semantic tags.

    My solution:
    To achieve perfect accuracy and bypass protection, I developed a custom hybrid approach:

    Connection via Chrome DevTools Protocol (CDP): Instead of launching a new instance of an automated browser, my script used Playwright to connect to an already running, "live" session of Google Chrome (http://localhost:9222). Because the session belonged to a real user profile, it carried genuine cookies, history, and Canvas fingerprints, so the traffic looked like that of a legitimate user. Cloudflare was bypassed without solving any CAPTCHAs.

    Intelligent navigation: The script mimicked human behavior: it resolved dynamic locators, issued real mouse clicks to trigger React state changes, and used the site's internal router to return to the list while preserving the pagination position.

    HTML parsing: The captured page state was processed through BeautifulSoup and complex regular expressions (Regex) for accurate extraction of "broken" or poorly formatted links and phone numbers.

    Technologies used:

    Python 3.12

    Playwright (Sync API): interaction with the DOM and connection via CDP.

    BeautifulSoup4 & Regex: precise searching and data extraction.

    Pandas: structuring and exporting data into clean CSV (UTF-8 with BOM) and Excel.

    Result:
    The script autonomously collected and perfectly formatted data for over 1200 companies. The created architecture allows for scalable parsing without the risk of getting banned by IP.
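
The CDP connection and regex extraction described above could look roughly like the sketch below. It is an illustration under stated assumptions, not the project's actual code: Chrome is assumed to have been started with --remote-debugging-port=9222, the function names and regex patterns are hypothetical, and tag stripping is done with the standard library instead of BeautifulSoup so that the parsing part runs without extra packages.

```python
import re
from html import unescape

# Illustrative patterns (assumptions): real pages usually need site-specific tuning.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE_RE = re.compile(r"\+?\d[\d\s().-]{6,}\d")


def fetch_rendered_html(url, cdp_endpoint="http://localhost:9222"):
    """Attach to an already running Chrome over CDP and return the page HTML."""
    # Imported lazily so the parsing helper below works without Playwright.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.connect_over_cdp(cdp_endpoint)
        context = browser.contexts[0]  # the existing user profile, cookies included
        page = context.pages[0] if context.pages else context.new_page()
        page.goto(url)
        return page.content()


def extract_contacts(html):
    """Pull e-mail addresses and phone numbers out of a captured HTML fragment."""
    # Replace tags with spaces so values split across elements do not fuse.
    text = unescape(re.sub(r"<[^>]+>", " ", html))
    emails = sorted(set(EMAIL_RE.findall(text)))
    # Normalize phones to digits with an optional leading plus sign.
    phones = sorted({re.sub(r"[^\d+]", "", m) for m in PHONE_RE.findall(text)})
    return {"emails": emails, "phones": phones}
```

In use, the HTML returned by fetch_rendered_html for an exhibitor page would be passed to extract_contacts, and the resulting dictionaries collected into rows for export.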
  • Scraper for generating B2B leads (Corporate databases)

    Data Parsing
    Objective: Develop an automated web scraper in Python to collect structured contact and financial data of potential B2B clients from public business directories.

    My solution and technical implementation:

    Parsing HTML tables: The script efficiently navigates through directory pages and extracts the necessary information from the complex tabular structure of the websites using the BeautifulSoup library.

    Operational stability: To prevent blocking by target servers, custom HTTP headers were configured to mimic requests from a real browser. This ensured uninterrupted data collection during long sessions.

    Deep data cleaning: The collected "raw" information often contained extraneous characters and formatting artifacts. Using the Pandas library, I implemented logic for automatic cleaning of key metrics. For example, the fields "Company Revenue" and "Number of Employees" were programmatically cleaned of text and converted into strict numerical values.

    Preparation for CRM: The final dataset is automatically exported in a valid CSV format with the correct column structure.

    Technologies used:
    Python, BeautifulSoup, Pandas, HTTP Headers Configuration.

    Result:
    The client received a fully automated lead generation tool. The output is a perfectly clean CSV file that can be instantly imported into any CRM system without the need for additional manual processing or formatting error corrections.
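
A minimal sketch of the cleaning step described above, converting raw "Company Revenue" and "Number of Employees" strings into numbers before CSV export. The column names, the K/M/B suffix convention, and the header values are assumptions made for illustration, and plain Python stands in for Pandas to keep the example self-contained.

```python
import csv
import io
import re
import urllib.request

# Hypothetical browser-like headers; the values actually used are not given in the source.
BROWSER_HEADERS = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64)",
    "Accept": "text/html,application/xhtml+xml",
    "Accept-Language": "en-US,en;q=0.9",
}
MULTIPLIERS = {"k": 1_000, "m": 1_000_000, "b": 1_000_000_000}


def build_request(url):
    """Prepare (but do not send) a request that carries the custom headers."""
    return urllib.request.Request(url, headers=BROWSER_HEADERS)


def parse_number(raw):
    """Turn text such as '$1.5M' or '1,200 employees' into a float, else None."""
    match = re.search(r"(\d[\d,]*\.?\d*)\s*([kmb])?\b", raw.lower())
    if not match:
        return None
    value = float(match.group(1).replace(",", ""))
    return value * MULTIPLIERS.get(match.group(2), 1)


def clean_rows(rows):
    """Normalize the assumed 'revenue' and 'employees' fields of each row."""
    cleaned = []
    for row in rows:
        employees = parse_number(row["employees"])
        cleaned.append({
            "company": row["company"].strip(),
            "revenue": parse_number(row["revenue"]),
            "employees": int(employees) if employees is not None else None,
        })
    return cleaned


def to_csv(rows):
    """Serialize cleaned rows to CSV text with a fixed column order."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=["company", "revenue", "employees"])
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()
```

Values that cannot be parsed come back as None and are written as empty cells, which most CRM importers treat as missing data.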
  • Extended E-commerce parser (Selenium and bypassing anti-bot protection)

    Data Parsing
    Objective: Develop a robust web scraper to collect real-time product data from dynamic e-commerce platforms (such as eBay) for price monitoring and analytics.

    Main challenges:

    Dynamic content: Data was loaded through complex JavaScript/AJAX requests rather than being simply present in HTML.

    Anti-bot systems: Platforms used advanced algorithms to block automated actions.

    Unstable layout: The structure of the pages (DOM) could change, causing regular hard-coded parsers to break instantly.

    My solution:

    Bypassing protection: I used Selenium with flexible stealth configurations for the webdriver. To make the script appear like a real person, I added natural behavior simulation (random delays between clicks, scrolling), which allowed data collection without the risk of being blocked.

    Code resilience (Fallback Selectors): I implemented a system of dynamic fallback selectors. If the online store slightly changed its design or layout, the script did not crash with an error but automatically switched to a backup method of element searching and continued working.

    Automatic navigation: I set up reliable pagination, allowing the autonomous collection of hundreds of listings from multiple pages in a single run.

    Deep data cleaning: Raw data from online stores often contains junk. I applied regular expressions (Regex) to clean the text (for example, extracting the pure price without currency and spaces) and used Pandas to sort the final dataset by ascending price.

    Technologies used: Python, Selenium (Stealth), Pandas, Regex (Regular expressions).

    Result:
    The client received not just a script, but a reliable tool. The output consisted of perfectly formatted, sorted, and production-ready CSV files that could be immediately uploaded to analytical systems or databases.
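
The fallback-selector and price-cleaning ideas from this project can be sketched as follows. This is a hedged illustration rather than the delivered script: the selector strings and function names are hypothetical, the lookup is passed in as a callable so the example runs without Selenium, and the price pattern assumes comma thousands separators with a dot as the decimal mark.

```python
import re


def find_with_fallback(lookup, selectors):
    """Try selectors in order; return (selector, result) for the first non-empty hit.

    With Selenium, `lookup` could be
    `lambda css: driver.find_elements(By.CSS_SELECTOR, css)`,
    which returns an empty list instead of raising when nothing matches.
    """
    for selector in selectors:
        result = lookup(selector)
        if result:
            return selector, result
    return None, None


def parse_price(raw):
    """Extract a numeric price from text such as 'US $1,299.99'; None if absent."""
    match = re.search(r"\d[\d,]*(?:\.\d+)?", raw)
    return float(match.group().replace(",", "")) if match else None


def sort_by_price(listings):
    """Return listings with a parsed price, ordered from cheapest to most expensive."""
    priced = [item for item in listings if item["price"] is not None]
    return sorted(priced, key=lambda item: item["price"])
```

Keeping the lookup as a parameter means the same fallback logic can be exercised against a plain dictionary in tests and against a live webdriver in production.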

Reviews and compliments on completed projects 2

Quality
Professionalism
Cost
Contactability
Deadlines

Incredibly satisfied with the collaboration! Very cool approach, the performer does not just wait for instructions, but shows initiative and finds optimal solutions to complex issues. Always in touch, responds instantly, communication is top-notch. A professional who truly understands their craft. Completed everything quickly, efficiently, and thoughtfully. I will definitely reach out again!

Quality
Professionalism
Cost
Contactability
Deadlines

Thank you very much!
Excellent performer - did everything quickly and clearly
Super support - accommodating - we received even more than was specified in the terms of reference
We will work together again!


Activity

  Latest proposals 10
Parsing PDF bank statements
68 USD
PDF book parser (text + images)
225 USD
Development of an AI assistant for automated call monitoring and analytics
394 USD
Telegram Script
150 USD
Telegram chatbot for booking detailing studio
68 USD
It is necessary to collect and launch 10 websites using AI.
56 USD
Parsing product images for an online store
188 USD
Parsing product data from a supplier's website
45 USD
Automation/Software for reading bank PUSH notifications (P2P, crypto, banks)
101 USD
Create a parser for Allegro for the special equipment niche.
338 USD