Website parsing

Translated ∙ 180 USD

  1. 5093
     30  0
    Work example:
    Mobile app with admin
    7 days ∙ 607 USD

    Estimate: 35,000 UAH. Deadline: 7 days after I get access to the technical specification and the code example.

    For a task like this, I would not write four separate scripts but one shared processing chain: loading, proxies, parsing, normalization, deduplication, hashing by 3 fields, export, and error logging. Note that proxies and the directories' protection often take more time than the page parsing itself, so I will check this on the first 1-2 sites.
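
A minimal Python sketch of the shared chain described above, assuming each site is exposed as a loader function that returns raw records. The names `normalize` and `run_chain` and the record shape are hypothetical, not taken from the client's reference code; proxies, deduplication, and export are left out to keep the example short.

```python
import logging
from typing import Callable, Dict, Iterable, Iterator

logger = logging.getLogger("pipeline")

Record = Dict[str, str]


def normalize(record: Record) -> Record:
    """Lowercase and trim keys, collapse runs of whitespace in values."""
    return {
        key.strip().lower(): " ".join(str(value).split())
        for key, value in record.items()
    }


def run_chain(sources: Dict[str, Callable[[], Iterable[Record]]]) -> Iterator[Record]:
    """Pull records from every source through the same shared stages."""
    for name, load in sources.items():
        try:
            for raw in load():
                record = normalize(raw)
                record["source"] = name  # remember which site the record came from
                yield record
        except Exception:
            # A failing site is logged and skipped; the other sources still run.
            logger.exception("source %s failed", name)
```

Adding a fifth site would then mean registering one more loader in `sources`, with no change to the shared stages.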

    Questions
    > which specific directories and what output is needed - CSV, database, API, or file for your system
    > should deduplication be done only within one site or across all 4 sources

    Similar examples from Ingello
    > https://business.ingello.com/prime-eva - similar in working with product data and operational automation
    > https://business.ingello.com/vorfahr - close in integrations, processing chains, and data
    > https://systems-fl.ingello.com - main page for system development

    To start, we need the technical specification, the code example, test proxies or the requirements for them, the output format, and the criteria for treating a record as a duplicate. Overall, the task is manageable and we can keep it simple: first we build a stable core, then we connect the 4 sources =)

  2. 4126
     54  1

    4 days ∙ 112 USD

    Hello! I am interested in the task "Website Parsing". I have experience with API integrations, data exchange, parsers, webhooks, and process automation. I will be able to carefully connect the necessary services, handle errors/retries, and create a solution that will work reliably after launch.

  3. 8288
     100  0

    1 day ∙ 90 USD

    Good day
    I do web scraping professionally.
    I will complete everything efficiently and as quickly as possible.
    Feel free to reach out.

  4. 10389
     147  0

    8 days ∙ 90 USD

    Good day. To give an estimate, I need to review the websites themselves. The price shown is the minimum for an order of this kind.

  5. 1422    13  0
    8 days ∙ 90 USD

    Hello, I can implement all 4 parsers according to the specifications and the provided reference.
    I will set up operation through proxies, port the deduplication logic, implement hashing on the required fields, and build a complete data processing pipeline.
    I have experience in developing complex parsers and data collection systems.

    As a result, you will receive finished parsers with unified logic, stable data processing, and room for further scaling.

    After reviewing the specifications and the example code, I will be able to immediately assess the exact timelines and costs.

    Please let me know which stack the reference code is written in and which specific websites need to be parsed.

  6. 1520    2  0
    4 days ∙ 112 USD

    Hello!

    An excellent, technically sound specification. The reference code is a huge plus: we won't have to guess the desired deduplication logic, and I will simply integrate your ready-made algorithm into the new architecture.

    I specialize in complex web automation (Python) and building fault-tolerant data pipelines.

    Many developers will provide you with 4 separate scripts, which will be very difficult and expensive to maintain in the future. I propose assembling this as a single modular pipeline, where each site catalog is just a separate module connected to a common core.

    How the architecture will be structured (Pipeline):

    Collection and Proxy (Extractor): We set up proxy rotation with a retry mechanism. If the catalog times out or bans the IP, the script will not crash with an error but will switch proxies gracefully and resume from the same point. To get past Cloudflare protection or handle JS-rendered pages, I use Playwright; for fast sites, I use asynchronous aiohttp.

    Transformation (Transformer): Parsing the required fields and stripping junk tags from them.

    Hashing: We generate a unique composite key based on the 3 specified fields (MD5 or SHA-256).

    Deduplication (Filter): I will port the logic from your reference code. I will implement hash checking "on the fly" (via generators) so that the script runs quickly and does not consume all of the server's RAM when processing large catalogs.
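
The hashing and on-the-fly deduplication steps could look roughly like this in Python. This is an illustrative sketch, not the logic from the reference code: the three field names in `KEY_FIELDS` are placeholders, and SHA-256 is picked here simply as one of the two options mentioned.

```python
import hashlib
from typing import Dict, Iterable, Iterator, Sequence

Record = Dict[str, str]

# Hypothetical field names; the real three come from the client's specification.
KEY_FIELDS = ("name", "phone", "address")


def composite_key(record: Record, fields: Sequence[str] = KEY_FIELDS) -> str:
    """SHA-256 over the trimmed, lowercased values of the chosen fields."""
    parts = [record.get(field, "").strip().lower() for field in fields]
    # The unit separator keeps ("ab", "c") distinct from ("a", "bc").
    return hashlib.sha256("\x1f".join(parts).encode("utf-8")).hexdigest()


def deduplicate(records: Iterable[Record]) -> Iterator[Record]:
    """Yield each record the first time its composite key is seen."""
    seen = set()
    for record in records:
        key = composite_key(record)
        if key not in seen:
            seen.add(key)
            yield record
```

Because `deduplicate` is a generator, only the set of hashes is held in memory, not the records themselves. Sharing one `seen` set across all sites would give global deduplication; one set per site would keep it isolated.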

    Two clarifying questions:

    Should deduplication work globally (looking for duplicates among all 4 sources) or isolated within each individual site?

    In what format should the pipeline deliver the final cleaned data (CSV, JSON, or direct write to your database)?

    I am waiting for links to the sites and your code example in private messages. I can start analyzing as soon as the details are agreed upon!

  7. 650    2  0
    1 day ∙ 90 USD

    Good day!

    Developing parsers with pipeline logic is our specialty, so the task is completely clear. Having a technical specification and a code example is a big plus: we will keep a consistent style and port your logic without deviations.

    What we will implement:

    4 parsers for the directory websites, according to the technical specification.
    Proxy integration (rotation + throttling for stable operation without blocks).
    Deduplication logic, ported from your reference.
    Hashing on 3 fields for duplicate control.
    Everything assembled into a single pipeline according to the described scheme.
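
One possible shape for rotation with retries and throttling, sketched in Python with the standard library only. The `fetch` callable, the proxy strings, and the default limits are assumptions for illustration; in practice `fetch` would wrap whichever HTTP client the project uses.

```python
import itertools
import time
from typing import Callable, Sequence


def fetch_with_rotation(
    url: str,
    proxies: Sequence[str],
    fetch: Callable[[str, str], str],
    max_attempts: int = 5,
    delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Try the URL through successive proxies, pausing between attempts."""
    if not proxies:
        raise ValueError("at least one proxy is required")
    last_error = None
    rotation = itertools.islice(itertools.cycle(proxies), max_attempts)
    for attempt, proxy in enumerate(rotation):
        if attempt:
            sleep(delay)  # throttle before retrying through the next proxy
        try:
            return fetch(url, proxy)
        except OSError as error:  # covers TimeoutError and ConnectionError
            last_error = error
    raise RuntimeError(f"all {max_attempts} attempts failed for {url}") from last_error
```

Passing `fetch` and `sleep` in as parameters keeps the rotation logic testable without network access or real waiting.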
    To provide an accurate price and timeline right away, please clarify:

    Are the 4 websites similar or different in complexity (JS rendering, anti-bot, authorization)?
    Are the proxies yours or should we connect our own?
    We will discuss the details in private.

  8. 172    1  1
    1 day ∙ 112 USD

    Good afternoon. I am ready to complete this project; I have extensive experience in developing various applications.

  9. 3411    32  0
    3 days ∙ 90 USD

    Hello! To assess the volume of work, please provide links to the websites in private, as well as a more detailed technical specification.

  10. 1984    25  1
    1 day ∙ 112 USD

    Hello, I have experience and can build such a pipeline for you, but we need more input data. Shall we chat in private?

  11. 420    2  0
    5 days ∙ 90 USD

    Hello! I am ready to implement 4 parsers according to your specifications. Please send the links to the websites in a private message.

  12. 1476    14  1
    5 days ∙ 90 USD

    I will create parsers with proxies and deduplication logic as a pipeline in Python. I have experience integrating hashing for data uniqueness and working with code examples. Can you clarify which specific fields need to be hashed for deduplication?

  13. 727    6  0
    8 days ∙ 180 USD

    A detailed estimate will follow after I review the technical specification.

  14. 234  
    2 days ∙ 90 USD

    Hello. I can implement 4 parsers according to your specifications: I will rewrite the logic from the reference, set up proxies, add deduplication and hashing based on 3 fields, and also assemble everything into a single pipeline so that the data is processed sequentially and reliably. After reviewing the code example, I will clarify the details and propose the final architecture of the solution.

  15. 333  
    5 days ∙ 90 USD

    Good day! I specialize in parsing with Python and Java and have experience with proxy rotation, deduplication, and pipeline architecture. I will implement 4 parsers based on your code example: I will port the deduplication logic, add hashing on 3 fields, and connect proxies. The code will be clean, with logging and error handling. Before starting, I will clarify the list of websites and possible protections (Cloudflare, JS rendering). I will deliver on time.

  16. 2147    33  0
    7 days ∙ 90 USD

    Good day, I have created parsers for various websites. Code examples are not needed. I need the addresses of the websites, then I can provide a more accurate estimate of time and cost.

  17. 93816    1268  1   10
    7 days ∙ 112 USD

    Hello. I have extensive experience in developing parsers. Can I see the websites for parsing?

  18. 1580    3  0
    7 days ∙ 79 USD

    Hello!

    I have extensive experience in developing solutions for parsing and processing data (various sources, protection against blocking, automation). I am ready to implement the assigned task in the shortest possible time.

    I suggest discussing the details in private messages.

  19. 3926    15  0
    7 days ∙ 607 USD

    Hello.
    I can develop a parser for you in the shortest possible time. Payment is hourly.
    The number of hours must be agreed in advance and depends on which platforms/sites we are going to parse. Please send them to me in a private message.

    The last project I worked on was a parser for foreign platforms (olx, vinted, jofogas), with monitoring and the logic you mentioned, but in the format of a Telegram bot. Reviews are in my profile or at the link https://freelancehunt.com/project/vosstanovlenie-podderzhka-dorabotka-telegram-bota-dlya/1592141.html

    Feel free to write, I would be happy to do this for you.

  20. 1251    35  1   3
    1 day ∙ 100 USD

    Hello, I am ready to do it. Please send the technical specifications in private, I will review them, and we will discuss the terms of cooperation.

  21. 6366    74  1
    1 day ∙ 22 USD

    Good day. I have extensive experience in parsing. I need to look at the sources. I would be happy to collaborate.

  22. 315  
    6 days ∙ 135 USD

    Hello, I am interested in the project. I work with Python, web scraping, Requests/BeautifulSoup/Selenium, data processing, and saving results in CSV/Excel. I am ready to consider implementing 4 parsers for your websites with proxy integration, deduplication, and hashing on the required fields. I can also review the code example and transfer the necessary logic to a new pipeline. For an accurate estimate, I would like to familiarize myself with the technical specifications, the list of websites, a code example, and the format of the final data.

  23. 1490    28  0
    2 days ∙ 22 USD

    I can do it. Write to me to discuss the details.

  24. 108  
    3 days ∙ 22 USD

    Hello!

    I am ready to implement all 4 parsers according to the technical specifications. I can transfer and adapt the deduplication logic from the reference project, set up the work through proxies, implement hashing based on the specified fields, and assemble everything into a single pipeline.

    If you provide the technical specifications and a sample code, I will be able to quickly assess the timeline and start working.

  25. 3219    84  0
    2 days ∙ 45 USD

    Ready to take it on. Need to see the websites.
    Need to clarify the order details, write to me!
    I use Python, uv, GitHub, Docker.

  26. 6824    164  1
    4 days ∙ 90 USD

    Good evening. I have extensive experience in parsing and can start once the technical specification is agreed. Please message me privately.

  27. 471    1  0
    3 days ∙ 67 USD

    Good evening. Send me the specifications and I will start implementing the parsers.

  28. 10123    117  0
    3 days ∙ 101 USD

    Hello.

    I develop bots and parsers in Node.js. I'm ready to take it on. Write to me, and we will discuss.

  29. 243  
    4 days ∙ 34 USD

    Bogdan, greetings.

    I have reviewed your task. It's great to have a ready-made specification and reference code; that immediately removes a lot of questions. I will write all 4 parsers in Python (Scrapy or BeautifulSoup, depending on how the sites serve the data).

    I will set up the entire pipeline as required: I will connect proxies for stable collection and implement the deduplication and three-field hashing logic based on your example.

    Please send me the links to the directories and your reference code in private messages. I will quickly review the structure and can immediately get to work.

  30. 3206    31  0
    2 days ∙ 180 USD

    Greetings! An excellent, clear task, squarely within my specialization. I will implement the parsers as a reliable, fault-tolerant pipeline in Python (Scrapy/BeautifulSoup).

    I will faithfully port the deduplication and 3-field hashing logic from your reference and set up proxy rotation for uninterrupted operation. Since there is a ready-made technical specification and example code, I will do everything quickly and without unnecessary questions.

    I am ready to start immediately after reviewing the reference. Let's discuss the details!

  31. 702    1  0
    3 days ∙ 90 USD

    Hello! I have extensive experience in writing parsers. I am ready to collaborate. I offer quality and fast work. Write to me.

  32. 673    5  0
    7 days ∙ 45 USD

    Hello, I have parsed a catalog of over 50,000 products for an eCommerce platform, using proxy rotation and hash-based deduplication, so that experience applies directly to your 4 catalogs!

    Which specific catalogs need to be parsed, and are there any rate limits on data collection?

    I suggest we get in touch: I will give you a free technical consultation, we can draw up a development plan, and I will tell you about my team! ✨

  33. Another 13 proposals hidden


Client
Bohdan Ostapov
Ukraine  1  0
Project published
10 days 1 hour ago
371 views
Tags
  • scrapy
  • Beautiful Soup
  • python