Tennis matches data parser for Melbet (melbet.com) — dataset col
I developed a synchronous parser that collects tennis match data from the betting website Melbet (melbet.com). It is designed specifically to produce a clean dataset for future machine learning models.
The parser runs in near real time and uses Selenium WebDriver to navigate tennis sections and event pages. It iterates through the required DOM nodes and extracts structured data: tournament, players, start time, markets and odds. Parsing speed is throttled with tunable delays between requests and page transitions, which keeps the process stable and avoids overloading the website.
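The tunable delay between page loads can be illustrated with a small pacing helper. This is a minimal sketch, not the project's actual code: the class name RequestPacer, its parameters and the jitter idea are assumptions made for illustration.

```csharp
using System;
using System.Threading;

// Hypothetical helper: enforces a minimum gap (plus optional random jitter)
// between consecutive page loads in a synchronous crawl loop.
public sealed class RequestPacer
{
    private readonly TimeSpan _minGap;
    private readonly TimeSpan _maxJitter;
    private readonly Random _random;
    private DateTime _nextAllowedUtc = DateTime.MinValue;

    public RequestPacer(TimeSpan minGap, TimeSpan maxJitter, int seed = 0)
    {
        _minGap = minGap;
        _maxJitter = maxJitter;
        _random = new Random(seed);
    }

    // Returns how long the caller has to wait at time nowUtc and
    // reserves the following slot (wait + minimum gap + jitter).
    public TimeSpan Reserve(DateTime nowUtc)
    {
        TimeSpan wait = _nextAllowedUtc > nowUtc
            ? _nextAllowedUtc - nowUtc
            : TimeSpan.Zero;
        double jitterMs = _random.NextDouble() * _maxJitter.TotalMilliseconds;
        _nextAllowedUtc = nowUtc + wait + _minGap + TimeSpan.FromMilliseconds(jitterMs);
        return wait;
    }

    // Blocks the current thread until the next request is allowed.
    public void WaitTurn()
    {
        TimeSpan wait = Reserve(DateTime.UtcNow);
        if (wait > TimeSpan.Zero)
        {
            Thread.Sleep(wait);
        }
    }
}
```

In a crawl loop, pacer.WaitTurn() would be called right before each driver.Navigate().GoToUrl(url), so the delay is applied in one place and can be tuned through the constructor arguments.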
Collected data is cleaned, validated and stored in an MS SQL Server database using a normalized schema (matches, tournaments, markets, odds). On top of that, I implemented CSV export so that the data can be easily used for analytics and for training ML models (e.g. for odds or match outcome prediction).
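The CSV export step might look roughly like the sketch below. The OddsRow type and its column names are hypothetical (the real schema is not shown here). The sketch quotes fields that contain commas, quotes or line breaks, and formats numbers and dates with the invariant culture so the file reads the same on any machine.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.IO;

// Hypothetical flattened row: one market price for one match.
public sealed class OddsRow
{
    public string Tournament { get; set; } = "";
    public string Player1 { get; set; } = "";
    public string Player2 { get; set; } = "";
    public DateTime StartTimeUtc { get; set; }
    public string Market { get; set; } = "";
    public decimal Odds { get; set; }
}

public static class CsvExporter
{
    // Quotes a field only when it contains a comma, a quote or a line break.
    public static string Escape(string field)
    {
        if (field == null) return "";
        bool mustQuote = field.IndexOfAny(new[] { ',', '"', '\r', '\n' }) >= 0;
        return mustQuote ? "\"" + field.Replace("\"", "\"\"") + "\"" : field;
    }

    // Writes a header line followed by one line per row, using CRLF endings.
    public static void Write(TextWriter writer, IEnumerable<OddsRow> rows)
    {
        writer.Write("Tournament,Player1,Player2,StartTimeUtc,Market,Odds\r\n");
        foreach (OddsRow r in rows)
        {
            string[] fields =
            {
                Escape(r.Tournament),
                Escape(r.Player1),
                Escape(r.Player2),
                r.StartTimeUtc.ToString("yyyy-MM-dd'T'HH:mm:ss'Z'", CultureInfo.InvariantCulture),
                Escape(r.Market),
                r.Odds.ToString(CultureInfo.InvariantCulture)
            };
            writer.Write(string.Join(",", fields));
            writer.Write("\r\n");
        }
    }
}
```

Passing a StreamWriter opened on a file path writes the export to disk, and a StringWriter is convenient for checking the output in tests. Using the invariant culture matters for odds: on a machine with a comma decimal separator, 1.85 would otherwise be written as 1,85 and split into two columns.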
I designed and implemented the whole solution: database schema, synchronous crawling logic with rate limiting, Selenium error handling, data mapping into SQL tables and the CSV export module.
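One common way to handle transient Selenium failures is a bounded retry around each page action. The helper below is an illustrative sketch under that assumption, not the project's actual error handling. The name Retry.Execute and its parameters are hypothetical.

```csharp
using System;
using System.Threading;

// Hypothetical helper: retries an action a limited number of times
// when the thrown exception is classified as transient.
public static class Retry
{
    public static T Execute<T>(Func<T> action, int maxAttempts, TimeSpan delay,
                               Func<Exception, bool> isTransient)
    {
        if (maxAttempts < 1)
        {
            throw new ArgumentOutOfRangeException(nameof(maxAttempts));
        }

        for (int attempt = 1; ; attempt++)
        {
            try
            {
                return action();
            }
            catch (Exception ex) when (attempt < maxAttempts && isTransient(ex))
            {
                // Transient failure with attempts left: pause, then try again.
                // Non-transient errors and the final failure propagate to the caller.
                if (delay > TimeSpan.Zero)
                {
                    Thread.Sleep(delay);
                }
            }
        }
    }
}
```

With Selenium, the isTransient predicate would typically match exceptions such as StaleElementReferenceException or WebDriverTimeoutException from the OpenQA.Selenium namespace, while the action re-locates the element and reads it again on every attempt.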
Tech stack: C#, .NET, Selenium WebDriver, MS SQL Server, ADO.NET / ORM, CSV export, ML-ready dataset preparation.