AI Lead Analyzer for 8 RSS Feeds
Goal:
Create an autonomous AI assistant for monitoring and qualifying freelance projects for a marketing agency (SEO/SMM/PPC). The key requirement is the aggregation of data from 8+ different RSS feeds, complete automatic deduplication of projects to save costs, and intelligent analysis of each unique lead using OpenAI before sending it to the client's Telegram group.
My Contribution:
The project had two fundamental problems:
Information Noise: Programming (Python, PHP) and design projects were landing in relevant marketing categories (e.g., "AI" or "Bots").
Mass Duplicates: The same project often appeared in 3-4 different RSS feeds at once, which produced 3-4 identical notifications and, worse, 3-4 separate paid OpenAI analyses of the same lead.
I designed a multi-stage pipeline in n8n whose core is the deduplication system. Instead of simple filtering, it combines "streaming" deduplication (within a single run) with "persistent memory" (n8n Data Tables), so that no project is analyzed twice, regardless of when it arrived or which feed it came from.
Solution:
The final solution is a single n8n workflow that runs on a schedule every 10 minutes and consists of 5 logical blocks:
1. Collection and Aggregation Block:
The Schedule Trigger starts 8 parallel RSS Read nodes, each monitoring its own category (SEO, SMM, PPC, Leads, etc.).
The Merge (Combine All) node collects all 8 streams into one array of projects.
2. Preparation Block:
The Set (Edit Fields1) node standardizes the data and creates a fullText field (from title and content) for future analysis.
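The preparation step can be sketched as a plain JavaScript function of the kind that could run in an n8n Code node. Only fullText is named in the workflow; the input field names (title, content, guid, link) and the HTML stripping are assumptions about what the feeds deliver, not the actual Set node configuration.

```javascript
// Sketch: normalize one RSS item and build fullText from title and content.
// Input field names and the HTML cleanup are assumptions, not the real node config.
function prepareItem(item) {
  const clean = (value) =>
    String(value ?? "")
      .replace(/<[^>]*>/g, " ") // drop HTML tags that feeds often embed
      .replace(/\s+/g, " ")     // collapse runs of whitespace
      .trim();

  const title = clean(item.title);
  const content = clean(item.content);

  return {
    // Assumption: fall back to the link when a feed omits guid.
    guid: String(item.guid ?? item.link ?? ""),
    title,
    link: String(item.link ?? ""),
    fullText: [title, content].filter(Boolean).join("\n\n"),
  };
}
```

Giving every item the same shape at this point means the later deduplication and analysis stages do not depend on which feed the item came from.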
3. Deduplication Block (Key Stage):
Data Table (Get row(s)): Loads from "memory" (Processed_Leads) the complete list of guids of all previously processed projects.
Merge (Merge_Deduplicate): Uses the keepNonMatches mode. It compares the stream of new projects (Input 1) with the list of old guids (Input 2) and only passes on those projects that are not in "memory."
Remove Duplicates (Node 1): Removes duplicates within the current run (in case one project came from 2 RSS feeds simultaneously).
Remove Duplicates (Node 2): An additional "on-the-fly" check against n8n's internal memory, acting as a final safeguard for uniqueness.
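The combined effect of this block (persistent memory plus in-run deduplication) can be illustrated with a short JavaScript sketch. It is not the n8n node configuration: dedupeLeads and the sample data are hypothetical, and skipping items without a guid is an assumption.

```javascript
// Sketch: keep only leads whose guid is neither in persistent memory
// nor already seen earlier in the same run.
function dedupeLeads(items, processedGuids) {
  // Start from the guids already stored in Processed_Leads.
  const seen = new Set(processedGuids);
  const fresh = [];
  for (const item of items) {
    if (!item.guid) continue;          // assumption: items without a guid are skipped
    if (seen.has(item.guid)) continue; // known from memory or from this run
    seen.add(item.guid);
    fresh.push(item);
  }
  return fresh;
}

// Hypothetical data: "a" was processed earlier, "b" arrives from two feeds.
const incoming = [
  { guid: "a", title: "SEO audit" },
  { guid: "b", title: "PPC setup" },
  { guid: "b", title: "PPC setup" },
  { guid: "c", title: "SMM plan" },
];
const freshLeads = dedupeLeads(incoming, ["a"]);
```

Here freshLeads keeps one copy of "b" plus "c", so only two items would reach the paid analysis step.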
4. AI Analysis and Storage Block:
Message a model (OpenAI): Receives only unique projects. The GPT-4 prompt analyzes fullText and returns JSON with a score, reason, and "trash" marker.
Data Table (Insert row): Immediately records the guid of the just-analyzed project in "memory" (Processed_Leads), so later runs filter it out at the deduplication stage and it is never analyzed again.
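Because the model's reply is free text that should contain JSON, a defensive parsing step is useful before the result is used further. The sketch below assumes the reply keys are score, reason, and is_trash; the function name and the fallback behavior for unparseable replies are hypothetical choices, not part of the described workflow.

```javascript
// Sketch: parse and validate the model's JSON reply.
// Keys score, reason, is_trash are assumed from the description; the fallback is a design choice.
function parseAnalysis(rawReply) {
  const text = String(rawReply);
  // Take the outermost {...} so that extra text around the JSON is ignored.
  const start = text.indexOf("{");
  const end = text.lastIndexOf("}");
  // Assumption: an unparseable reply is kept (not marked as trash) so a human can review it.
  const fallback = { score: 0, reason: "Model reply could not be parsed", is_trash: false };
  if (start === -1 || end <= start) return fallback;

  let data;
  try {
    data = JSON.parse(text.slice(start, end + 1));
  } catch {
    return fallback;
  }

  const score = Number(data.score);
  return {
    score: Number.isFinite(score) ? score : 0,
    reason: typeof data.reason === "string" ? data.reason : "",
    is_trash: data.is_trash === true,
  };
}
```

Validating the shape here keeps a malformed reply from breaking the notification step.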
5. Notification Block:
Code (JavaScript): A "sanitizer" node that cleans the title and reason of special characters (*, _, [ ]), which could break Telegram formatting.
Telegram (2 nodes): Sends the formatted message with the AI score to two recipients: me (for monitoring) and the client's working group.
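A minimal sketch of the sanitizer idea follows. It removes the characters listed above (*, _, [ and ]) from the title and reason; the function names and the message layout are hypothetical, since the actual template is not given here.

```javascript
// Sketch: strip characters that would be read as Markdown formatting in Telegram.
function sanitizeForTelegram(text) {
  return String(text ?? "")
    .replace(/[*_\[\]]/g, "") // remove *, _, [ and ]
    .replace(/\s+/g, " ")     // collapse whitespace left behind
    .trim();
}

// Hypothetical message layout; only the sanitized fields come from the description.
function formatLeadMessage(lead) {
  const title = sanitizeForTelegram(lead.title);
  const reason = sanitizeForTelegram(lead.reason);
  // The only formatting characters left are the ones added deliberately here.
  return `*${title}*\nScore: ${lead.score}\n${reason}`;
}
```

Removing the characters, rather than escaping them, is the simpler option and matches the "cleaning" described above; escaping would preserve the original text but depends on the parse mode the Telegram node uses.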
Result:
The result is a fully autonomous AI assistant that monitors 8 sources 24/7. The client received a system that:
Saves money: Duplicates are filtered out before anything is sent to OpenAI, preventing unnecessary API costs.
Saves time: The client receives analyzed leads with a score and a brief summary instead of a "raw" stream.
Keeps relevance high: The OpenAI prompt additionally filters out "trash" (is_trash: true) that slipped through the RSS categories.
Runs reliably: Using Data Tables as persistent "memory" ensures that the system does not resend old projects even after the workflow is restarted.
#n8n #OpenAI #GPT4 #WorkflowAutomation #LeadGeneration #RSS #APIIntegration #DataTables #Deduplication #Telegram #JavaScript #Freelance #MarketingAutomation #SEO #PPC #SMM #Automation #Marketing