Order parser
I am looking for a Python developer to build a stable job vacancy parser for the Bundesagentur für Arbeit website (https://www.arbeitsagentur.de/jobsuche/).
The final tool should collect job vacancies from the list and job detail pages, apply mandatory filters, and return a structured table according to my columns.
The project includes a simple dashboard (web interface) so that I can initiate the collection myself (keyword, city/radius, publication period), click "Start," and receive a ready Excel/CSV/Google Sheet.
Important: there is a captcha (hCaptcha) on BA.
Materials that I will attach to the project
Job Page.png — shows where exactly to take: Position, Unternehmen, Straße und Hausnummer, PLZ/Ort, Ansprechperson (AP), Telefon, E-Mail, Homepage, Veröffentlichungsdatum, Referenz-Nr. (nr.), Einsatzort, Link auf Anzeige.
Filters BA.png — shows which filters must be enabled before parsing:
Exclude temporary work (exclude)
Exclude external job boards (exclude)
Publication date (period: 24h / 7 days / 14 days / 1 month)
Mailing Datenbank.xlsx — reference columns that need to be filled (you can take the column names directly from this file).
Columns of the output table (exactly as in the file)
– Datum Scraping
– Veröffentlichungsdatum
– nr. (Referenz-Nr.)
– Position
– Unternehmen
– Straße und Hausnummer
– PLZ, Ort
– Telefon
– Internet (Homepage)
– Ansprechperson (AP)
– E-Mail – AP Firma
– Einsatzort
– Link auf Anzeige
– all positions
If a field is missing on the page, mark it as empty.
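As a rough illustration of the column rule above, here is a minimal sketch that forces every scraped record into the listed columns and blanks anything missing. The column names are copied from the list in this brief; the authoritative headers are in Mailing Datenbank.xlsx. The unclear "all positions" entry is left out. The sketch assumes "empty" means a blank cell rather than the literal word. `normalize_row` and `rows_to_csv` are hypothetical helper names.

```python
import csv
import io

# Column names as listed in this brief (assumption: the real headers should be
# read from "Mailing Datenbank.xlsx" and may differ slightly).
COLUMNS = [
    "Datum Scraping",
    "Veröffentlichungsdatum",
    "nr. (Referenz-Nr.)",
    "Position",
    "Unternehmen",
    "Straße und Hausnummer",
    "PLZ, Ort",
    "Telefon",
    "Internet (Homepage)",
    "Ansprechperson (AP)",
    "E-Mail – AP Firma",
    "Einsatzort",
    "Link auf Anzeige",
]


def normalize_row(raw: dict) -> dict:
    """Return a row with exactly COLUMNS as keys; missing or None values become ''."""
    row = {}
    for col in COLUMNS:
        value = raw.get(col)
        row[col] = "" if value is None else str(value).strip()
    return row


def rows_to_csv(rows) -> str:
    """Serialize rows to CSV text with the fixed header order."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=COLUMNS, lineterminator="\n")
    writer.writeheader()
    for raw in rows:
        writer.writerow(normalize_row(raw))
    return buf.getvalue()
```

Keys that are not in the column list are dropped, so an unexpected field found on a page cannot shift the table layout.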
Filters (mandatory)
Temporary work = false (no temporary work in the result)
External job boards = hide (filter out external job boards)
Publication date = last X days (according to the parameter in the dashboard, 24h / 7 days / 14 days / 1 month)
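The mandatory filters could be represented roughly as below. This is only a sketch: the `SearchFilters` class, the `keep_vacancy` check and the record keys `is_temporary_work`, `is_external` and `published` are hypothetical names, not anything taken from the BA site. The period options are mapped to days on the assumption that "1 month" means 30 days, as the dashboard section suggests.

```python
from dataclasses import dataclass
from datetime import date, timedelta

# Dashboard period options mapped to days (assumption: "1 month" = 30 days).
PERIOD_DAYS = {"24h": 1, "7 days": 7, "14 days": 14, "1 month": 30}


@dataclass(frozen=True)
class SearchFilters:
    keyword: str
    city: str
    radius_km: int
    period: str = "7 days"
    exclude_temporary_work: bool = True
    exclude_external_boards: bool = True

    def __post_init__(self):
        # Reject period values the dashboard does not offer.
        if self.period not in PERIOD_DAYS:
            raise ValueError(f"unknown period: {self.period!r}")

    def earliest_date(self, today: date) -> date:
        """Oldest publication date that still falls inside the period."""
        return today - timedelta(days=PERIOD_DAYS[self.period])


def keep_vacancy(vacancy: dict, filters: SearchFilters, today: date) -> bool:
    """Second check on a parsed record, applied after the site's own filters."""
    if filters.exclude_temporary_work and vacancy.get("is_temporary_work"):
        return False
    if filters.exclude_external_boards and vacancy.get("is_external"):
        return False
    return vacancy["published"] >= filters.earliest_date(today)
```

Running a check like this on the parsed records, in addition to enabling the filters on the site, would make it easy to count how many vacancies were filtered out for the report.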
Deduplication
Primary key: Referenz-Nr. (nr.)
Data quality requirements
Address split: Street / PLZ / Ort
Phone/Email — clean values (without "E+11", without extra characters)
Job URL — clickable URL of the job card (not internal ID)
Publication date: if "vor X Tagen" — convert to exact date
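The data quality rules above might look roughly like this in code. It is a sketch under assumptions: all function names are hypothetical, handling of "heute" and "gestern" is a guess at what the site may show, and a phone value already mangled into scientific notation (such as "4.9E+11") is treated as unrecoverable and returned empty.

```python
import re
from datetime import date, timedelta


def parse_publication_date(text: str, today: date) -> str:
    """Convert 'vor X Tagen' or 'DD.MM.YYYY' to an ISO date; '' if not recognized."""
    text = text.strip().lower()
    m = re.fullmatch(r"vor\s+(\d+)\s+tag(?:en)?", text)
    if m:
        return (today - timedelta(days=int(m.group(1)))).isoformat()
    if text == "heute":    # assumption: the site may show "heute"
        return today.isoformat()
    if text == "gestern":  # assumption: the site may show "gestern"
        return (today - timedelta(days=1)).isoformat()
    m = re.fullmatch(r"(\d{1,2})\.(\d{1,2})\.(\d{4})", text)
    if m:
        day, month, year = map(int, m.groups())
        try:
            return date(year, month, day).isoformat()
        except ValueError:
            return ""
    return ""


def split_plz_ort(text: str) -> tuple[str, str]:
    """Split '90402 Nürnberg' into ('90402', 'Nürnberg'); PLZ is '' if absent."""
    m = re.match(r"\s*(\d{5})\s+(.+?)\s*$", text)
    return (m.group(1), m.group(2)) if m else ("", text.strip())


def clean_phone(text: str) -> str:
    """Keep a leading '+' and digits; return '' for scientific-notation debris."""
    text = text.strip()
    if re.search(r"\d[eE]\+?\d+$", text):  # e.g. '4.91512E+11': digits are lost
        return ""
    digits = re.sub(r"\D", "", text)
    if not digits:
        return ""
    return ("+" if text.startswith("+") else "") + digits


def clean_email(text: str) -> str:
    """Extract the first e-mail-like token and lowercase it; '' if none found."""
    m = re.search(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+", text)
    return m.group(0).lower() if m else ""
```

Writing phone numbers to Excel as text cells rather than numbers is what usually prevents the "E+11" rendering in the first place.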
Technical requirements
Re-running does not create duplicates.
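One possible way to meet the no-duplicates requirement is to merge each new run into the previously saved rows, keyed by Referenz-Nr. The sketch below assumes rows are plain dicts and that the key column is called "nr."; `merge_without_duplicates` is a hypothetical helper. Rows without a reference number are skipped here, which is an assumption to confirm with the client.

```python
def merge_without_duplicates(existing, new, key="nr."):
    """Append rows from `new` whose reference number is not yet known.

    Returns (merged_rows, added_count, skipped_count).
    """
    seen = {(row.get(key) or "").strip() for row in existing}
    seen.discard("")
    merged = list(existing)
    added = skipped = 0
    for row in new:
        ref = (row.get(key) or "").strip()
        if not ref or ref in seen:
            skipped += 1  # duplicate, or no reference number to key on
            continue
        seen.add(ref)
        merged.append(row)
        added += 1
    return merged, added, skipped
```

The skipped count can go straight into the "number of duplicates" line of the report.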
Dashboard (minimum)
Fields: keyword, city/radius, publication period (24h/7/14/30), checkboxes for filters temporary work/external
Start button → after execution, provides an Excel download and a push to Google Sheet.
Acceptance criteria
Output Excel/CSV exactly according to the structure of "Mailing Datenbank.xlsx" (column names from the file).
Random check of 50 vacancies: addresses split, contacts clean, publication dates — exact dates, URLs open.
If there is no data, the corresponding column shows empty.
README with instructions and a brief report (how many collected, filtered out temporary work/external, number of duplicates, number of 429/5xx, how captcha was handled).
Phases and test
Test task (mandatory): collect 20 vacancies "Lagermitarbeiter/in" within 7 days, applying filters; submit Excel/CSV according to the structure of the file; mark missing fields as empty.
Phase 1 (contract): complete parser BA + deduplication + rule empty + output to Excel/CSV.
Client's review of cooperation with Sergey Andreyev
Order parser
Good communication, quick, quality work. I recommend.
Freelancer's review of cooperation with Oleksandra Kilimnik
Order parser
I liked the collaboration, understanding, and problem-solving.
-
Hello, I have been developing in Python for over 5 years, and I recently completed a LinkedIn parsing project that also involved a captcha. I am familiar with the website; one option is to use requests for speed, another is an anti-detect browser for reliability. Generating Excel output is not a problem; we can use the pandas library. I also propose an interface built with Flask: when the program runs, there will be a local web interface with buttons, a clean design, and logging to follow the process. I would be happy to discuss the details and move forward with the collaboration!
-
I am a Python developer with over 7 years of experience. The task looks clear. I am ready to take it on!
-
I am ready to complete your order promptly and to a high standard. I have experience with similar projects and always adhere to deadlines and technical specifications. I will be happy to collaborate!
-
Good day,
I am ready to take on the project of developing a stable job parser for the Bundesagentur für Arbeit website. My task will be to create a tool that automatically collects job listings, applies the mandatory filters, and returns a structured table according to your columns. I will also include a simple dashboard for starting the collection and obtaining results in Excel/CSV/Google Sheet format.
Considering the presence of CAPTCHA on BA, I will implement the appropriate handling.
I look forward to the opportunity to apply my skills. My rate is $16 per hour. To begin, I need to study all the materials and the task in more detail.
Arthur
-
Good day! I am ready to complete this project. I have extensive experience in developing various applications.
-
Hello.
I develop parsers in NodeJS. I am ready to take on the task. Write to me, we will discuss.
-
Good day. I have looked at the task. I already have an implementation plan, so feel free to reach out.