Prom.ua parser
The script turns the Prom.ua web catalog into a structured data source suitable for automation and analysis.
Technically, it works around the lack of a full public API by combining HTML parsing with the internal GraphQL endpoint (`/graphql`), which the site itself uses to load data on the product page.
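As a rough illustration of the GraphQL side, the sketch below builds a POST request to `/graphql` with the standard library. Only the endpoint path comes from the description above; the operation name `ProductDetails`, the query body, and the field names are hypothetical placeholders, and the real query would have to be copied from what the product page actually sends.

```python
import json
import urllib.request

# Assumption: the path "/graphql" comes from the description; the operation
# name, the query text, and every field name are hypothetical placeholders.
GRAPHQL_URL = "https://prom.ua/graphql"

PRODUCT_QUERY = """
query ProductDetails($productId: Int!) {
  product(id: $productId) {
    id
    name
    price
  }
}
"""


def build_graphql_request(product_id: int) -> urllib.request.Request:
    """Build (but do not send) a JSON POST request for one product."""
    payload = {
        "operationName": "ProductDetails",
        "variables": {"productId": product_id},
        "query": PRODUCT_QUERY,
    }
    return urllib.request.Request(
        GRAPHQL_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def fetch_product(product_id: int, timeout: float = 10.0) -> dict:
    """Send the request and return the decoded JSON response."""
    request = build_graphql_request(product_id)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Keeping request construction separate from sending makes the payload easy to inspect without touching the network.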
It combines two levels of data: static HTML (product list, basic parameters) and dynamic GraphQL (delivery, payment, availability, regions, seller business logic). This removes the fragmentation in which part of the data is visible only in the interface and part is available only through the API.
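The static HTML level could look roughly like the following sketch, which pulls product links out of a category page with the standard `html.parser` module. The marker attribute `data-qaid="product_link"` is an assumption made for illustration; the actual markup has to be checked against the live page.

```python
from html.parser import HTMLParser


class ProductLinkParser(HTMLParser):
    """Collect the href and text of anchors marked as product links."""

    def __init__(self, marker_attr="data-qaid", marker_value="product_link"):
        super().__init__()
        # Assumption: these attribute names are illustrative, not confirmed.
        self.marker_attr = marker_attr
        self.marker_value = marker_value
        self.products = []
        self._current = None  # product currently being read, or None

    def handle_starttag(self, tag, attrs):
        attr_map = dict(attrs)
        if tag == "a" and attr_map.get(self.marker_attr) == self.marker_value:
            self._current = {"url": attr_map.get("href", ""), "title": ""}

    def handle_data(self, data):
        # Accumulate text only while inside a marked anchor.
        if self._current is not None:
            self._current["title"] += data

    def handle_endtag(self, tag):
        if tag == "a" and self._current is not None:
            self._current["title"] = self._current["title"].strip()
            self.products.append(self._current)
            self._current = None


def parse_product_list(html: str) -> list:
    """Return a list of {"url", "title"} dicts found in the HTML."""
    parser = ProductLinkParser()
    parser.feed(html)
    parser.close()
    return parser.products
```

Each resulting item carries only what the listing exposes; the delivery, payment, and availability details would then be filled in per product from the GraphQL response.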
It also scales data access: instead of opening pages manually, it crawls categories automatically, following pagination and processing products sequentially.
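A minimal sketch of such a crawl loop is shown below. The `?page=N` URL scheme and the helper names are assumptions; the fetch and parse steps are passed in as functions so the loop itself stays independent of how pages are downloaded or parsed.

```python
from typing import Callable, Iterator


def page_url(category_url: str, page: int) -> str:
    """Hypothetical URL scheme; the real site may paginate differently."""
    return category_url if page == 1 else f"{category_url}?page={page}"


def crawl_category(
    category_url: str,
    fetch_html: Callable[[str], str],
    parse_products: Callable[[str], list],
    max_pages: int = 50,
) -> Iterator[dict]:
    """Yield products page by page until a page brings nothing new."""
    seen_urls = set()
    for page in range(1, max_pages + 1):
        html = fetch_html(page_url(category_url, page))
        found_new = False
        for product in parse_products(html):
            if product["url"] in seen_urls:
                continue  # skip duplicates across and within pages
            seen_urls.add(product["url"])
            found_new = True
            yield product
        if not found_new:
            break  # empty page, or the site served an earlier page again
```

Stopping on "nothing new" rather than on an empty page alone guards against sites that repeat the last page for out-of-range page numbers. A delay between requests can be added inside `fetch_html` to keep the crawl polite.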
From an engineering perspective, it normalizes the data: the different response formats (HTML and GraphQL JSON) are brought into a single structure and saved to a file.
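One way to picture this normalization step is the sketch below, which merges an item parsed from HTML with a GraphQL response into one record and serializes records as JSON Lines. The record fields and the GraphQL keys (`inStock`, `deliveryOptions`, `paymentOptions`) are assumptions chosen for illustration, as is the JSON Lines output format.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Optional


@dataclass
class ProductRecord:
    url: str
    title: str
    price: Optional[float] = None
    in_stock: Optional[bool] = None
    delivery_options: list = field(default_factory=list)
    payment_options: list = field(default_factory=list)


def normalize(html_item: dict, graphql_response: dict) -> ProductRecord:
    """Merge one HTML item and one GraphQL response into a single record."""
    # Assumption: the response nests data under "data" -> "product".
    product = (graphql_response.get("data") or {}).get("product") or {}
    raw_price = product.get("price")
    return ProductRecord(
        url=html_item["url"],
        title=html_item.get("title", "").strip(),
        price=float(raw_price) if raw_price is not None else None,
        in_stock=product.get("inStock"),
        delivery_options=[d["name"] for d in product.get("deliveryOptions") or []],
        payment_options=[p["name"] for p in product.get("paymentOptions") or []],
    )


def to_jsonl(records) -> str:
    """Serialize records as JSON Lines, one record per line."""
    return "".join(
        json.dumps(asdict(record), ensure_ascii=False) + "\n" for record in records
    )


def save_jsonl(records, path: str) -> None:
    """Write the records to a UTF-8 JSON Lines file."""
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(to_jsonl(records))
```

Missing or null GraphQL fields fall back to `None` or an empty list, so every saved line has the same set of keys regardless of what each source returned.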
It also serves as a monitoring tool, allowing automated tracking of prices, availability, delivery conditions, and changes made by sellers.
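Monitoring of this kind usually comes down to comparing two saved snapshots. The sketch below assumes each snapshot is a dict keyed by product URL and that the tracked field names are `price`, `in_stock`, and `delivery_options`; both the structure and the names are illustrative, not taken from the script.

```python
# Assumption: field names are illustrative and should match the saved records.
TRACKED_FIELDS = ("price", "in_stock", "delivery_options")


def diff_snapshots(previous: dict, current: dict, fields=TRACKED_FIELDS) -> list:
    """Compare two snapshots keyed by product URL and list the changes."""
    changes = []
    for url, new in current.items():
        old = previous.get(url)
        if old is None:
            changes.append({"url": url, "change": "added"})
            continue
        for name in fields:
            if old.get(name) != new.get(name):
                changes.append(
                    {
                        "url": url,
                        "change": "modified",
                        "field": name,
                        "old": old.get(name),
                        "new": new.get(name),
                    }
                )
    for url in previous:
        if url not in current:
            changes.append({"url": url, "change": "removed"})
    return changes
```

Running the crawl on a schedule and diffing each new snapshot against the previous one yields a change log of price moves, stock changes, and products that appeared or disappeared.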
As a result, it is not just a parser but a mini ETL pipeline (extract → transform → load) that turns the marketplace's web interface into a database suitable for analysis and automation.