Technical assignment: Automated analytical report generation system (using n8n and AI)
Goal: Create an automated pipeline in n8n that collects and processes data from various sources, and then generates full-text analytical reports based on that data, mimicking the logic and structure of reports written by a human analyst.
Main tasks by stages
Stage 1: Data collection and integration
1.1. Statistical data: Implement the upload of tabular data (Excel/CSV) with statistical data and indicators, prices, production volumes, etc. Two modes should be supported:
Automatic: from email attachments, Google Drive / Dropbox folders.
Manual: through a web form or by running an n8n script with an attached file.
1.3. Internal reports: Add the ability to upload examples of completed analytical reports (PDF/Doc) to extract specific data from them or to use as a template.
Stage 2: Data processing and structuring
2.1. Cleaning and filtering: Automatically remove unnecessary rows/columns, standardize data formats (dates, numbers, currencies).
2.2. Aggregation and metric calculation:
Group data by specified parameters (regions, countries, prices, time periods).
Calculate key metrics: dynamics (growth/drop in %, YoY, MoM), shares, averages, anomaly detection, etc.
2.3. Preparing data for AI: Formulate final structured data (in JSON or text format) – "analyst's reference," which will contain all necessary figures, facts, and excerpts from news for writing the report.
Stage 3: Generating the text analytical report
3.1. Formulating a request (prompt) to AI: The n8n script should dynamically create an expanded prompt for the language model (LLM), including:
Context: "You are a financial analyst. Prepare a report for [Period] on the [Market Name] market."
Structured data: All figures and metrics calculated in Stage 2.
Template and instructions: "Follow the structure of the attached example report. Start with general conclusions, then detail by regions. Explain price growth using facts from the news. Conclude the report with a forecast, etc."
3.2. Interaction with the language model: Set up API integration with a generative language model (e.g., OpenAI, Anthropic Claude, or similar).
3.3. Creating text: Send the prompt and receive the finished, coherent text of the analytical report.
Stage 4: Formatting and exporting the result
4.1. Compiling the final document: Combine the generated text (from Stage 3) with key tables and charts (generated in Stage 2).
4.2. Export: Export the final report in one of the selected formats:
Google Docs (preferably, for easy editing)
PDF file
Sending text and tables via email.
Data storage
5.1. Database: Set up a database (PostgreSQL, but SQLite or Airtable may also be used) to store all collected and processed data.
5.2. Architecture: The database structure should allow for easy addition of new sources, data types, and conducting retrospective analysis over long periods.
Input data (provided by the client)
Examples of Excel/CSV tables with data.
Template and examples of target analytical reports.
A list of key fields, metrics, and parameters for analysis.
Access to necessary APIs, RSS sources.
Requirements for the performer
Deep experience with n8n and building complex, scalable scripts.
Experience in ETL (Extract, Transform, Load) processes and data automation.
Understanding of the logic of working with relational databases (PostgreSQL/SQLite).
Experience working with APIs of generative language models (LLM), ability to create effective prompts (prompt engineering).
Understanding the basics of statistics and data analysis. If you read to the end, please include the last two sentences from my text in your response, so I know you are not a robot.