Industrial-grade Sofascore tennis data parser – work from the portfolio of freelancer Yelisey | example from the Data Parsing category (№1958396)

#Parsing #Python #Automation #DataScience #Sofascore #Scraper

Created a modular library [see photo 1] and a set of Python scripts for automated data collection of all tennis matches and players from the Sofascore website.

Features:
- Collects all historical and upcoming matches within a date range (id, statistics, points, odds, player strength).
- Parses information for each player and their rating.
- Built-in handling of anti-bot protection: automatic proxy rotation, dynamic user-agent, cookies.
- Multithreading: configurable via settings, speeds up collection (16,400 matches/hour [see photo 2] and 42,000 players/hour [see photo 3]).
- Retry system with automatic re-fetch of data missed because of 403 and 429 responses [see photo 4].
- All settings are managed through the config.py file (dates, proxies, threads, delays).
- Export: clean CSV files, fully compatible with pandas, ready for ML and analytics.
- Logs, progress bar, ETA (remaining time), speed output per minute/hour.
- Detailed documentation in Russian and English, with code and console run examples.
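
The retry and proxy-rotation features listed above can be illustrated with a minimal sketch. It is not the project's actual code: the function name `fetch_with_retry`, its parameters, the proxy labels, and the injected `fetch` callable are all assumptions made for illustration, and the real library uses curl_cffi for the network layer. The sketch rotates to the next proxy on every attempt and backs off exponentially when the server answers 403 or 429.

```python
import itertools
import time
from typing import Callable, Iterable, Optional

# Statuses that the feature list says trigger a re-fetch.
RETRY_STATUSES = {403, 429}


def fetch_with_retry(
    url: str,
    fetch: Callable[[str, str], tuple[int, str]],  # hypothetical transport: (url, proxy) -> (status, body)
    proxies: Iterable[str],
    max_attempts: int = 4,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[str]:
    """Return the response body, or None if every attempt was blocked."""
    proxy_cycle = itertools.cycle(proxies)
    for attempt in range(max_attempts):
        proxy = next(proxy_cycle)  # rotate to the next proxy on every attempt
        status, body = fetch(url, proxy)
        if status == 200:
            return body
        if status not in RETRY_STATUSES:
            raise RuntimeError(f"unexpected status {status} for {url}")
        if attempt < max_attempts - 1:
            sleep(base_delay * 2 ** attempt)  # exponential backoff between attempts
    return None  # the caller can queue this URL for a later re-fetch pass


def fake_fetch(url: str, proxy: str) -> tuple[int, str]:
    # Stand-in for a real HTTP call: only the third proxy is not rate-limited.
    return (200, "ok") if proxy == "proxy-c" else (429, "")


result = fetch_with_retry(
    "https://example.invalid/match/1",
    fake_fetch,
    ["proxy-a", "proxy-b", "proxy-c"],
    sleep=lambda seconds: None,  # skip real waiting in the demo
)
blocked = fetch_with_retry(
    "https://example.invalid/match/2",
    fake_fetch,
    ["proxy-a"],
    sleep=lambda seconds: None,
)
```

Passing the transport and the sleep function as arguments keeps the retry policy testable without network access. A `None` result marks a URL that should go into the re-fetch queue.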

Result:
The project was delivered to the client with a fully automated data collection and update process that stays fast and stable on large volumes.

Stack: Python 3.11+, curl_cffi, pandas, threading, proxies.
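
As a rough illustration of how the threading and CSV parts of this stack might fit together, here is a sketch under stated assumptions. `THREADS`, `collect_match`, `collect_all`, `to_csv`, and the column names are hypothetical; the real project reads its settings from config.py and fetches data over the network. The sketch uses the standard-library `concurrent.futures.ThreadPoolExecutor` (built on `threading`) and the `csv` module.

```python
import csv
import io
from concurrent.futures import ThreadPoolExecutor

THREADS = 8  # hypothetical value; the description says this lives in config.py
FIELDS = ["id", "home", "away"]  # hypothetical column names


def collect_match(match_id: int) -> dict:
    # Placeholder for the network request and parsing of one match.
    return {"id": match_id, "home": f"player_{match_id}", "away": f"player_{match_id + 1}"}


def collect_all(match_ids, threads: int = THREADS) -> list[dict]:
    # pool.map runs collect_match concurrently and yields results in input order.
    with ThreadPoolExecutor(max_workers=threads) as pool:
        return list(pool.map(collect_match, match_ids))


def to_csv(rows: list[dict], out) -> None:
    # Write a header row followed by one line per match.
    writer = csv.DictWriter(out, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)


rows = collect_all(range(1, 4))
buffer = io.StringIO()  # in-memory stand-in for an output file
to_csv(rows, buffer)
csv_text = buffer.getvalue()
```

A file written this way has a single header row and a fixed column set, so it can be loaded with `pandas.read_csv` without extra options.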


Added 12 July 2025


Yelisey H.
