Web scraping script for sports data from Sofascore
127 USD
What is needed is Python web scraping code: a library or set of functions I can use to do the following:
- Get upcoming tennis match data: odds, player names, rankings, and the standard match data listed on the website
- Get historical tennis match data: statistics, point-by-point, tennis power, odds, player names and rankings, and the standard match data listed on the website
- Get individual tennis player data
- Get ATP and WTA rankings (tennis rankings)
All of this data can probably be found as JSON responses in the website's network activity. I need someone to create Python functions or a library that fetches the data and returns, for example, a DataFrame with the fetched JSON stored in columns as a string or as JSON.
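A minimal sketch of the output shape described above, assuming each fetched payload is already a parsed JSON object. The column names (`match_id`, `kind`, `payload_json`) and the sample payload are hypothetical, not taken from the site. pandas is imported only inside `to_dataframe`, so the row-building part runs on the standard library alone.

```python
import json
from typing import Any, Iterable


def payloads_to_rows(payloads: Iterable[tuple[int, str, Any]]) -> list[dict]:
    """Turn (match_id, kind, payload) triples into flat rows.

    The raw payload is kept as a JSON string so nothing is lost, whatever
    structure the site returns. All column names here are hypothetical.
    """
    rows = []
    for match_id, kind, payload in payloads:
        rows.append(
            {
                "match_id": match_id,
                "kind": kind,  # e.g. "statistics" or "odds"
                "payload_json": json.dumps(payload, ensure_ascii=False, sort_keys=True),
            }
        )
    return rows


def to_dataframe(rows: list[dict]):
    """Convert rows to a pandas DataFrame (requires pandas to be installed)."""
    import pandas as pd

    return pd.DataFrame(rows, columns=["match_id", "kind", "payload_json"])


# Made-up sample payload, for illustration only.
sample = [(1, "odds", {"home": 1.8, "away": 2.1})]
rows = payloads_to_rows(sample)
```

Storing the payload as a string keeps the frame flat; `json.loads` on the `payload_json` column recovers the original object when a specific field is needed.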
I already have code that does this kind of thing for football matches, but it fetches only around 500-1,500 historical matches before it starts getting 403 errors. The tennis scraping solution needs to be fast and reliable: it should fetch 300,000 matches in a matter of hours or days, and upcoming/scheduled matches in seconds or minutes. My current football scraper uses curl_cffi, different header setups (language and browser), the cheapest rotating residential proxy I could find, and pauses of a few seconds between requests. I wrote its stealth features myself, so there should be a lot of room for improvement.
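One common way to handle the 403 responses mentioned above is to retry with growing, jittered pauses instead of a fixed delay. The sketch below is an illustration, not the client's existing code: `fetch` is a hypothetical callable returning `(status_code, body)` that could wrap curl_cffi or any other client, and whether backoff alone is enough against this site's blocking is not established here.

```python
import random
import time
from typing import Callable, Optional


def fetch_with_backoff(
    fetch: Callable[[str], tuple[int, str]],
    url: str,
    max_attempts: int = 5,
    base_delay: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[str]:
    """Call fetch(url) until it returns HTTP 200 or attempts run out.

    `fetch` is a hypothetical callable returning (status_code, body).
    On 403 or 429 the wait doubles each attempt, with random jitter so
    that retries do not fall into a regular rhythm.
    """
    for attempt in range(max_attempts):
        status, body = fetch(url)
        if status == 200:
            return body
        if status not in (403, 429):
            return None  # other errors are not retried in this sketch
        if attempt < max_attempts - 1:
            delay = base_delay * (2 ** attempt)
            sleep(delay + random.uniform(0, delay / 2))
    return None
```

The `sleep` parameter is injectable so the function can be exercised without real waiting; in production the default `time.sleep` applies.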
Please ask if you need any more info about this project.
Client's review of cooperation with Yelisey H.
Great end result. Good work.
Freelancer's review of cooperation with Joachim Virta
I am sincerely grateful to Joachim for this project. Clear technical requirements, fast and friendly communication, and deep involvement in the process made working together truly enjoyable. The customer always gave timely feedback, provided all necessary access, and made decisions quickly, which allowed me to fully focus on the technical side and deliver the best result. I would be glad to work with Joachim again in the future and definitely recommend him as a reliable and understanding client.
Thank you for the opportunity to work on an interesting and meaningful task!
-
5 days, 521 USD
Hello Joachim,
I've carefully analyzed your project to scrape tennis data from Sofascore. This is a challenging task, and I have the expertise to build the robust, high-speed solution you need.
The 403 error you are encountering with your current script is a clear sign that Sofascore is successfully "fingerprinting" and blocking your requests. A simple curl_cffi approach with basic headers and proxies is often not enough for a target this sophisticated.
To solve this, I will build a professional-grade scraping library in Python using a much more powerful architecture:
Core Engine (Playwright): I will use Playwright, not just a requests library. This allows us to automate a real browser instance, making our script's behavior nearly indistinguishable from a human user.
…
Advanced Anti-Fingerprinting: I will implement stealth techniques to avoid detection. This includes creating custom browser "contexts" with randomized user agents, screen resolutions, and other browser-level properties that anti-bot systems look for.
Intelligent Request Management: The script will be designed to mimic human browsing patterns, not just make rapid-fire requests. It will also handle proxy rotation intelligently to minimize the risk of IP blocking.
As a PCAP™ certified Python developer, I specialize in building these kinds of reliable data extraction systems. The final deliverable will be a clean Python library with functions like get_upcoming_matches(), get_historical_data(match_id), etc., that return the data in a DataFrame as you requested.
Total Estimate:
Timeline: 5 days
Price: $450 USD
Your budget of €110 is unfortunately not sufficient for developing a system that can bypass a modern anti-bot solution at the scale you require. My price reflects the development of a professional tool that will be fast, reliable, and capable of handling your data needs.
I am ready to build a scraper that truly works.
-
2 days, 127 USD
Hi Joachim.
I recently worked on a web scraper for a college project. I also created a Streamlit application for viewing and analyzing the scraped data with dynamic charts.
Python is one of my strong skills, since I have built many data projects with it.
For your project, I can scrape the data, clean it, and give you a Streamlit app that you can view live whenever it updates, without the need for further coding. Or, if you don't need the app and want me to work on the unfinished code you already have, I can do that too.
Feel free to contact me to start working on the project.
-
5 days, 521 USD
Hi, thanks for the detailed breakdown. I’ve worked on similar scraping projects, including high-volume sports data collection and proxy-based stealth automation. For Sofascore, I can build you a Python library with well-structured functions that return upcoming matches, historical match stats, player info, and rankings, all through fast and resilient scraping.
My plan is to use `httpx` with `curl_cffi` and session rotation, matching headers and device fingerprints precisely. To avoid 403 errors on long scraping runs, I’ll optimize proxy handling and retry logic. The functions will output clean pandas DataFrames with raw JSON payloads per record so you can explore or store the data easily.
I’d also propose caching session tokens or cookies when possible to reduce authentication overhead and detect anti-bot behavior early. If you have your current football script, I’d be glad to improve on it directly.
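The session rotation and cookie caching proposed here could be organized roughly as follows. This is a sketch under assumptions, not the freelancer's implementation: the `Profile` and `ProfilePool` names, the `max_uses` threshold, and the placeholder proxy addresses are all hypothetical. Each profile bundles one proxy with the headers and cookies used alongside it, so a single identity stays consistent until it is rotated out.

```python
from dataclasses import dataclass, field


@dataclass
class Profile:
    """One consistent identity: a proxy plus the headers sent with it."""

    proxy: str
    headers: dict
    cookies: dict = field(default_factory=dict)  # cached per profile


class ProfilePool:
    """Hand out profiles round-robin, rotating on a use limit or a 403."""

    def __init__(self, profiles: list[Profile], max_uses: int = 200):
        if not profiles:
            raise ValueError("at least one profile is required")
        self._profiles = list(profiles)
        self._max_uses = max_uses
        self._index = 0
        self._uses = 0

    def current(self) -> Profile:
        """Return the profile for the next request, rotating after max_uses."""
        if self._uses >= self._max_uses:
            self.rotate()
        self._uses += 1
        return self._profiles[self._index]

    def rotate(self) -> None:
        """Move to the next profile and reset the use counter."""
        self._index = (self._index + 1) % len(self._profiles)
        self._uses = 0

    def report_status(self, status_code: int) -> bool:
        """On 403, drop cached cookies and rotate; return True if rotated."""
        if status_code == 403:
            self._profiles[self._index].cookies.clear()
            self.rotate()
            return True
        return False


# Placeholder proxies and headers, for illustration only.
pool = ProfilePool(
    [
        Profile("http://proxy-a.example.invalid:8000", {"Accept-Language": "en-US"}),
        Profile("http://proxy-b.example.invalid:8000", {"Accept-Language": "de-DE"}),
    ],
    max_uses=2,
)
```

Calling `pool.report_status(response_status)` after every request lets a block be detected on the first 403 rather than after a run of failures.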
Could you confirm if you already have a proxy provider, or would you like me to recommend a better one based on your volume?
Looking forward to working with you.
Best,
… Daniel
-
Winning proposal: 2 days, 127 USD
Good afternoon!
I am ready to implement your project in Python using requests to the internal API of Sofascore.
✅ What I will do:
Write a stable Python script (library/function) that will parse data on tennis matches from Sofascore (odds, statistics, ATP and WTA rankings, etc.).
Provide a convenient interface for obtaining data in JSON format and pandas DataFrame.
…
Solve the blocking problem (403 errors) with reliable header rotation and proxies (if needed, I can offer tested residential proxy solutions).
⚙️ Stack:
Python (requests, pandas)
Direct JSON requests to the server's internal API (without browser automation)
🚀 I guarantee:
Data stability and accuracy
High speed (hundreds of thousands of requests per day)
Clean and understandable code with comments
Cost: 110 EUR
Deadline: 3 days
Available for clarifying details!
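The throughput figure in this proposal can be checked against the brief's target of 300,000 matches with some simple arithmetic. The sketch below assumes five requests per match (one each for standard match data, statistics, point-by-point, tennis power, and odds) and reads "hundreds of thousands of requests per day" as 300,000 per day; both numbers are assumptions for illustration, not figures confirmed by either party.

```python
def days_needed(matches: int, requests_per_match: int, requests_per_day: int) -> float:
    """Days to fetch everything at a steady daily request rate."""
    return matches * requests_per_match / requests_per_day


def required_rate_per_second(matches: int, requests_per_match: int, hours: float) -> float:
    """Sustained requests per second needed to finish within `hours`."""
    return matches * requests_per_match / (hours * 3600)


MATCHES = 300_000           # from the project brief
REQUESTS_PER_MATCH = 5      # assumption: one request per data type
REQUESTS_PER_DAY = 300_000  # assumption: reading of "hundreds of thousands"

total_days = days_needed(MATCHES, REQUESTS_PER_MATCH, REQUESTS_PER_DAY)
rate_for_one_day = required_rate_per_second(MATCHES, REQUESTS_PER_MATCH, hours=24)
```

Under these assumptions the full history takes 5 days, and finishing within a single day would need a sustained rate of about 17.4 requests per second.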
-
2 days, 127 USD
Good evening. I can help you with your project; I have experience in this field. Write to me and we will discuss all the details.
-
1 day, 127 USD
Hello.
I did something similar for football on the Flashscore website (it's in my portfolio). It ran at approximately 2,000 matches per minute. If this option suits you, I suggest discussing it in more detail in private messages.