Scraping a contact database of designers, architects, and foremen - Kyiv
Task: compile as complete a database as possible of contact details for practicing interior designers, architects, and contractors in Kyiv, drawing on several open sources.
What was done:
I wrote a Python script built on browser automation (Selenium + Chrome). It collected data in parallel from four platforms: the bazadizainerov.com catalog, ads on OLX.ua and Kabanchik.ua, and company profiles on Google Maps.
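The parallel collection could look roughly like the sketch below. It assumes a thread pool with one worker per source, which is only one possible way to do it. `collect_from` and `collect_all` are hypothetical names, and the collector is a stub that returns a placeholder record so the example runs without a browser. In a real run each worker would typically start its own Selenium driver rather than share one across threads.

```python
from concurrent.futures import ThreadPoolExecutor

# The four sources named in this post.
SOURCES = ["bazadizainerov.com", "OLX.ua", "Kabanchik.ua", "Google Maps"]


def collect_from(source):
    """Hypothetical per-source collector (stub).

    A real version would open its own Selenium driver, walk the listing
    pages of `source`, and return one dict per profile. This stub returns
    a single placeholder record so the sketch runs without a browser.
    """
    return [{"source": source, "name": "placeholder", "url": None}]


def collect_all(sources, max_workers=4):
    """Run one collector per source in parallel and merge the results."""
    profiles = []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map yields results in the same order as `sources`.
        for batch in pool.map(collect_from, sources):
            profiles.extend(batch)
    return profiles


profiles = collect_all(SOURCES)
```

Threads fit this kind of job because the work is mostly waiting on page loads, not computation.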
From each profile, the script extracted the name, phone number, email, Instagram account, description, and a direct link. After collection, the records were deduplicated automatically by phone number, and the final result was exported to Excel.
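As an illustration of the extraction and deduplication step, here is a minimal sketch using only the standard `re` module. The patterns, the function names (`normalize_phone`, `extract_contacts`, `dedupe_by_phone`), and the choice to normalize numbers to the `+380...` form are assumptions for the example, not the exact code from the project. It also assumes that records without a phone number are kept as they are, since they cannot be matched by phone.

```python
import re

# Simplified patterns; real-world profiles may need looser or stricter rules.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE_RE = re.compile(r"\+?\d[\d\s()\-]{8,}\d")


def normalize_phone(raw):
    """Reduce a Ukrainian phone number to +380XXXXXXXXX, or return None."""
    digits = re.sub(r"\D", "", raw)
    if len(digits) == 10 and digits.startswith("0"):
        digits = "38" + digits      # 067... -> 38067...
    elif len(digits) == 11 and digits.startswith("80"):
        digits = "3" + digits       # 8067... -> 38067...
    if len(digits) == 12 and digits.startswith("380"):
        return "+" + digits
    return None


def extract_contacts(text):
    """Pull the first valid phone number and first email out of free text."""
    phone = None
    for match in PHONE_RE.finditer(text):
        phone = normalize_phone(match.group())
        if phone:
            break
    email = EMAIL_RE.search(text)
    return {"phone": phone, "email": email.group() if email else None}


def dedupe_by_phone(records):
    """Keep the first record per phone number; keep phoneless records as is."""
    seen = set()
    unique = []
    for record in records:
        phone = record.get("phone")
        if phone is None:
            unique.append(record)
            continue
        if phone not in seen:
            seen.add(phone)
            unique.append(record)
    return unique


records = [
    extract_contacts("Anna, +38 (067) 123-45-67, anna@example.com"),
    extract_contacts("Anna design studio 067 123 45 67"),
    extract_contacts("no phone here, write to oleh@example.com"),
]
unique = dedupe_by_phone(records)
```

Normalizing before comparing is what lets the same number written as `+38 (067) 123-45-67` and `067 123 45 67` count as one contact.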
Result:
- 4 sources processed.
- 2,600+ profiles reviewed.
- 1,400 unique contacts in the final database after deduplication.
- 1,200 contacts with phone numbers.
- 870 contacts with email.
Technologies: Python, Selenium, ChromeDriver, openpyxl, regex.
#parsing #python #selenium #web_scraping #automation #data_collection #scraping