Write a parser to collect contact data from a list of links
You need to write a parser that goes through a list of company websites and collects contact information.
1. All the websites belong to Finnish companies in the same line of business. Site structure and design differ from site to site, and so does the level of site protection.
2. You need to collect all the contact data that is available. (The order and availability of the data may differ depending on the site.)
- Department name, if available
- Name, if available
- Position, if available
- Telephone: mandatory
- Email: mandatory
3. The parser should search for data in the header and footer, and should also open the "Contact" / "About us" section and search there, because the header and footer often contain no contact data, or only the company's general email address rather than that of the CEO and other staff.
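The search described in points 2 and 3 could be sketched roughly as follows. This is a minimal illustration using only the Python standard library, not a finished parser. The keyword list (`CONTACT_HINTS`), both regular expressions, and the function names are assumptions made for this example. The phone pattern is deliberately loose and will need tuning against the real sites.

```python
import re
from html.parser import HTMLParser
from urllib.parse import urljoin

# Assumed hints: English words plus common Finnish ones
# ("yhteys..." = contact, "meistä" = about us).
CONTACT_HINTS = ("contact", "about", "yhteys", "meista", "meistä")

EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
# Loose, assumed pattern: +358 or a leading 0, then 6-11 digits with optional spaces/dashes.
PHONE_RE = re.compile(r"(?<!\d)(?:\+358|0)(?:[\s-]?\d){6,11}(?!\d)")


class LinkCollector(HTMLParser):
    """Collects (href, link text) pairs from <a> tags."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def find_contact_links(html, base_url):
    """Return absolute URLs of links whose href or text hints at a contact/about page."""
    collector = LinkCollector()
    collector.feed(html)
    found = []
    for href, text in collector.links:
        if href.startswith(("mailto:", "tel:")):
            continue  # these are contact data themselves, not pages to visit
        haystack = (href + " " + text).lower()
        if any(hint in haystack for hint in CONTACT_HINTS):
            url = urljoin(base_url, href)
            if url not in found:
                found.append(url)
    return found


def extract_emails_and_phones(text):
    """Return (emails, phones) found in page text, in order of appearance, without repeats."""
    emails = list(dict.fromkeys(EMAIL_RE.findall(text)))
    phones = list(dict.fromkeys(PHONE_RE.findall(text)))
    return emails, phones
```

A real run would fetch the main page, apply `extract_emails_and_phones` to its text, then repeat for every URL returned by `find_contact_links`.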
Contact data
- Contact details (i.e. department name, position, name, phone, email)
- The contact details may be on the main page or on a separate dedicated page. They can appear in the site header, in the footer, or anywhere else on the page. A site may list only one phone number and one email address.
Results
The results should be delivered as a CSV file.
The data must be structured: for example, each phone number and email address must be linked to each other so that it is clear which phone number belongs to which email. If possible, the data should be cleaned of excess entries (duplicates and irrelevant data).
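One way to keep a phone number tied to its email is to store one contact per CSV row. The sketch below assumes that layout and removes duplicates after normalising the phone number and lower-casing the email. The column names in `FIELDS`, the normalisation rules, and the function names are all illustrative assumptions, not part of the task description.

```python
import csv
import re

# Assumed column layout: one contact per row keeps phone and email linked.
FIELDS = ["site", "department", "name", "position", "phone", "email"]


def normalize_phone(raw):
    """Drop separators and rewrite a national leading 0 as +358 (Finland's country code)."""
    digits = re.sub(r"[^\d+]", "", raw)
    if digits.startswith("00"):
        return "+" + digits[2:]       # 00358... -> +358...
    if digits.startswith("0"):
        return "+358" + digits[1:]    # 040...   -> +35840...
    return digits


def dedupe(records):
    """Keep the first record for each (site, phone, email) combination."""
    seen = set()
    unique = []
    for rec in records:
        key = (
            rec.get("site", ""),
            normalize_phone(rec.get("phone", "")),
            rec.get("email", "").lower(),
        )
        if key not in seen:
            seen.add(key)
            unique.append(rec)
    return unique


def write_csv(records, out):
    """Write records to an open text file object, filling missing fields with ''."""
    writer = csv.DictWriter(out, fieldnames=FIELDS)
    writer.writeheader()
    for rec in records:
        row = {field: rec.get(field, "") for field in FIELDS}
        row["phone"] = normalize_phone(row["phone"])
        row["email"] = row["email"].lower()
        writer.writerow(row)
```

When writing to disk, opening the file with `newline=""` and `encoding="utf-8"` lets the `csv` module control line endings and keeps Finnish characters intact.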
Summary
The final product is a working parser, delivered with source code and documentation, in which you can replace the links yourself and which performs the tasks described above.
Additionally
Attached to the task are a file with a sample of the links, screenshots showing which data to collect from each site, and an example of roughly how the collected data should look.
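The requirement that the links can be replaced independently could be met by keeping them in a plain text file, one per line, outside the source code. The helper below is a hypothetical sketch of that idea. The file format (blank lines and `#` comments ignored, `https://` assumed when no scheme is given) is an assumption, not something the task specifies.

```python
def load_links(lines):
    """Clean raw lines into URLs: skip blanks and # comments, add a scheme, drop repeats."""
    links = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        if "://" not in line:
            line = "https://" + line  # assumed default scheme
        if line not in links:
            links.append(line)
    return links
```

Because an open file yields its lines when iterated, it can be passed in directly, for example `load_links(open("links.txt", encoding="utf-8"))`, where `links.txt` is a placeholder name.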
Applications 5
- Good afternoon. Could you share the links to the sites the information should be collected from?