HTML scraper exporting to a CSV file or a spreadsheet (e.g. Excel), to run on Mac
I need a simple HTML scraper that exports data to a CSV file or a spreadsheet (e.g. Excel) and runs on macOS. It can be written in Python.
The HTML files to be scraped contain the contents of the Land and Mortgage Register for a given property in Poland.
The HTML files are stored on the local disk of the macOS computer, so you do NOT need a bot that visits https://przegladarka-ekw.ms.gov.pl/eukw_prz/KsiegiWieczyste/wywarkaKW and solves the reCAPTCHA.
The goal of the scraper:
Export data to one spreadsheet
Additional information and functions:
- Input files are named after the Land and Mortgage Register number. Because there are 3 files per number, the number is followed by the characters "-1", "-2" or "-3", depending on which input file it is. You can change the names if necessary. Example: KR1P/00445050/1-1; KR1P/00445050/1-2; KR1P/00445050/1-3.
- The 3 HTML files (file 1 - main, file 2, file 3) contain data that is exported to multiple columns in one row of the spreadsheet. Each column holds a different piece of data from these files. One of the files may contain no data to extract; in that case the script inserts the "-" character in the corresponding cells and the row is filled from the other two files.
- For every Land and Mortgage Register number, the structure of the three HTML files is the same, although between numbers they differ in content (text) and may differ in the number of rows (e.g. three mortgages are entered instead of one). A difference in the number of rows therefore amounts to repeating the first row as many times as the content of the given Land and Mortgage Register requires.
- Some source files (about 20% of them) contain a certain string (expression) that marks the HTML file so that the script does not check the other two files for that Land and Mortgage Register number. In that case the script scrapes only this one file and exports the data to a single, automatically created new spreadsheet. This one spreadsheet will be used for all of the situations described above; there is no need to create more than one file. The sheet will consist of one column. Each row will hold a Land and Mortgage Register number found in file 1.
I have a list of all expressions/words that appear in the files, so the script can adjust its behaviour whenever it finds an exact match.
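To make the file handling above concrete, here is a minimal sketch of how the input files could be grouped by register number and how the "single file" case could be detected. It rests on assumptions that are not part of this brief: the slashes in the register number are replaced by underscores in file names (e.g. KR1P_00445050_1-2.html), and SKIP_MARKERS is a placeholder for the real list of expressions.

```python
import re
from collections import defaultdict

# Assumed naming: "<register number>-<1|2|3>.html", slashes replaced by
# underscores, e.g. "KR1P_00445050_1-2.html". This is an illustration only.
NAME_RE = re.compile(r"^(?P<register>.+)-(?P<part>[123])\.html?$", re.IGNORECASE)

# Placeholder value; the real expressions come from the client's list.
SKIP_MARKERS = ["EXAMPLE MARKER"]


def group_files(names):
    """Map each register number to a {part number: file name} dict."""
    groups = defaultdict(dict)
    for name in names:
        match = NAME_RE.match(name)
        if match:  # silently ignore files that do not follow the naming scheme
            groups[match["register"]][int(match["part"])] = name
    return dict(groups)


def is_single_file_case(main_html, markers=SKIP_MARKERS):
    """Return True if file 1 contains any expression that means files 2 and 3 are skipped."""
    return any(marker in main_html for marker in markers)
```

A caller would read file 1 for each group, call is_single_file_case on its text, and only open files 2 and 3 when it returns False.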
Suggestions:
1. Semi-ready scripts are available at:
a) https://medium.freecodecamp.org/how-to-scrape-websites-with-python-and-beautifulsoup-5946935d93fe
b) BeautifulSoup
c) https://scrapy.org
When I receive an e-mail address, I will send:
- HTML source files
- the final file (spreadsheet), which serves as a template showing what the file with the imported data should look like. The target format is .csv or .xls
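As an illustration of the target output, a rough sketch of assembling one spreadsheet row from the three files and writing it as CSV might look as follows. The column names are hypothetical (the real ones would follow the template file), and the per-file dictionaries stand in for whatever values the HTML parsing step extracts.

```python
import csv

# Hypothetical column names; the real ones would follow the template spreadsheet.
COLUMNS = ["register_number", "owner", "mortgage"]
MISSING = "-"  # inserted when a file yields no data for a cell


def build_row(register_number, *parts):
    """Merge per-file dicts of extracted values into one row, filling gaps with "-"."""
    merged = {"register_number": register_number}
    for part in parts:
        merged.update(part or {})  # a file with no data may be None or empty
    return {column: merged.get(column) or MISSING for column in COLUMNS}


def write_rows(rows, stream):
    """Write a header line and one CSV line per row to an open text stream."""
    writer = csv.DictWriter(stream, fieldnames=COLUMNS)
    writer.writeheader()
    writer.writerows(rows)
```

When writing to a real file, open it with newline="" as the csv module documentation recommends, for example open("output.csv", "w", newline="", encoding="utf-8").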
The matter is urgent and I need a script as soon as possible.
I'm asking for:
- a price quote
- the completion deadline
- the payment method
- an e-mail address
Please watch the video instructions and send me a quote.
If you have any questions, please ask me.
Communication in English is preferred, but Russian is also fine.