Description
Build a Node.js daemon (cron job) that collects audience and follower-count statistics for Instagram users and stores them in a database.
Requirements:
- Audience statistics and followers statistics must be stored in separate MongoDB collections: audience_analytics (a regular collection) and followers_analytics (a time series collection).
- Audience statistics are collected once per day for each user and updated in the audience_analytics (regular) collection. A single audience analytics record is stored per user and is updated every day.
- Followers statistics are collected every hour and inserted as a new record into the followers_analytics collection, so each user gets one new record per hour.
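The two cadences above can be expressed as a small scheduling check. The sketch below is only an illustration, assuming the daemon wakes up once per hour and keeps track of the last audience run for each user; the names `jobsDue` and `lastAudienceRunAt` are hypothetical and not part of this spec.

```javascript
// Hypothetical helper: decides which jobs to run for one user on an hourly tick.
const DAY_MS = 24 * 60 * 60 * 1000;

function jobsDue(now, lastAudienceRunAt) {
  // Followers stats are collected on every hourly tick.
  const jobs = ["followers"];
  // Audience stats are collected at most once per day per user.
  if (!lastAudienceRunAt || now.getTime() - lastAudienceRunAt.getTime() >= DAY_MS) {
    jobs.push("audience");
  }
  return jobs;
}
```

For a user whose audience stats were refreshed less than 24 hours ago, only the followers job is returned.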
Technical Notes
Facebook API
To collect the following stats from the Facebook API, you need to acquire a long-lived Facebook API access token with the following permissions: **pages_show_list**, **instagram_basic**, **instagram_manage_insights**, **pages_read_engagement**, **pages_manage_metadata**, **public_profile**. [How to get a long-lived access token.](<https://developers.facebook.com/docs/instagram-basic-display-api/guides/long-lived-access-tokens/>)
In the endpoints below, replace 00000000000000000 with the Instagram user ID.
followers_analytics is collected from these two Facebook Graph API endpoints:
v14.0/00000000000000000/insights?metric=follower_count&period=day (change in followers for that day)
AND
v14.0/00000000000000000?fields=followers_count (total followers count at the moment of request)
audience_analytics is collected from this Facebook Graph API endpoint:
v14.0/00000000000000000/insights?metric=audience_city,audience_country,audience_gender_age,audience_locale,online_followers&period=lifetime
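As a rough sketch, the three request URLs could be built as shown below. The host `https://graph.facebook.com` and the `access_token` query parameter are taken from the paging links in the example response further down; the function names are hypothetical. User IDs are passed as strings here, on the assumption that real Instagram user IDs can have 17 digits, which is more than a JavaScript number can represent exactly (`Number.MAX_SAFE_INTEGER` is 9007199254740991).

```javascript
const GRAPH_BASE = "https://graph.facebook.com/v14.0";

const AUDIENCE_METRICS = [
  "audience_city",
  "audience_country",
  "audience_gender_age",
  "audience_locale",
  "online_followers",
];

// Joins a path and query parameters into a full Graph API URL.
function buildUrl(path, params, accessToken) {
  const url = new URL(`${GRAPH_BASE}/${path}`);
  for (const [key, value] of Object.entries(params)) {
    url.searchParams.set(key, value);
  }
  url.searchParams.set("access_token", accessToken);
  return url.toString();
}

// Change in followers per day.
function followersChangeUrl(igUserId, token) {
  return buildUrl(`${igUserId}/insights`, { metric: "follower_count", period: "day" }, token);
}

// Total followers count at the moment of the request.
function followersTotalUrl(igUserId, token) {
  return buildUrl(`${igUserId}`, { fields: "followers_count" }, token);
}

// Lifetime audience breakdown.
function audienceUrl(igUserId, token) {
  return buildUrl(`${igUserId}/insights`, { metric: AUDIENCE_METRICS.join(","), period: "lifetime" }, token);
}
```

`URLSearchParams` percent-encodes the commas in the metric list, which is a valid encoding of the same query string.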
Mongo collection structure
followers_analytics time series collection should have the following schema:
{
"_id": ObjectId, // generated automatically by MongoDB
"timestamp": ISODate, // time the stats were collected
"metadata": {
"user_id": Number, // Instagram user ID
},
"followers_count": Number, // total number of followers at the moment of the request
"followers_change": [ // Array of followers change for current and previous day
{
"value": Number,
"end_time": String
},
{
"value": Number,
"end_time": String
}
]
}
followers_count is taken from the v14.0/00000000000000000?fields=followers_count API response.
followers_change is taken from the data[0].values array of the v14.0/00000000000000000/insights?metric=follower_count&period=day API response.
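A minimal sketch of how the time series collection for this schema might be created with the raw MongoDB driver. `timeField` and `metaField` follow the schema above; the `granularity` value is an assumption based on the hourly insert cadence, and the helper name `ensureFollowersCollection` is hypothetical. `db` is expected to be a `Db` instance from the official `mongodb` package.

```javascript
const FOLLOWERS_COLLECTION = "followers_analytics";

// Time series options matching the schema above.
const followersTimeSeriesOptions = {
  timeseries: {
    timeField: "timestamp",
    metaField: "metadata",
    granularity: "hours", // assumption: one record per user per hour
  },
};

// Creates the collection on first run and returns a handle to it.
async function ensureFollowersCollection(db) {
  const existing = await db.listCollections({ name: FOLLOWERS_COLLECTION }).toArray();
  if (existing.length === 0) {
    await db.createCollection(FOLLOWERS_COLLECTION, followersTimeSeriesOptions);
  }
  return db.collection(FOLLOWERS_COLLECTION);
}
```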
Below is an example Facebook API response for v14.0/00000000000000000/insights?metric=follower_count&period=day:
{
"data": [
{
"name": "follower_count",
"period": "day",
"values": [
{
"value": 1,
"end_time": "2022-08-30T07:00:00+0000"
},
{
"value": 0,
"end_time": "2022-08-31T07:00:00+0000"
}
],
"title": "Follower Count",
"description": "Total number of unique accounts following this profile",
"id": "00000000000000000/insights/follower_count/day"
}
],
"paging": {
"previous": "https://graph.facebook.com/v14.0/00000000000000000/insights?access_token=****&pretty=0&metric=follower_count&period=day&since=1661630445&until=1661803245",
"next": "https://graph.facebook.com/v14.0/00000000000000000/insights?access_token=*****&pretty=0&metric=follower_count&period=day&since=1661976047&until=1662148847"
}
}
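Given a response like the one above, a followers_analytics record could be assembled roughly as follows. This is a sketch, not a definitive implementation: the function name `toFollowersRecord` is hypothetical, and the shape of the `?fields=followers_count` response (an object with a `followers_count` property) is an assumption, since that response is not shown in this document.

```javascript
// Builds one followers_analytics record from the two Graph API responses.
function toFollowersRecord(igUserId, totalResponse, insightsResponse, now = new Date()) {
  // followers_change comes from data[0].values of the insights response.
  const values = insightsResponse.data?.[0]?.values ?? [];
  return {
    timestamp: now,
    metadata: { user_id: igUserId },
    followers_count: totalResponse.followers_count,
    // Keep only the two fields named in the schema.
    followers_change: values.map(({ value, end_time }) => ({ value, end_time })),
  };
}
```

The resulting object can then be passed to `insertOne` on the followers_analytics collection, which produces one new record per user per hour.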
audience_analytics collection should have the following schema:
{
"_id": ObjectId, // generated automatically by MongoDB
"timestamp": ISODate, // time of the latest update
"metadata": {
"user_id": Number, // Instagram user ID
},
"audience_city": Array, // Array of values from API response
"audience_country": Array, // Array of values from API response
"audience_gender_age": Array, // Array of values from API response
"audience_locale": Array, // Array of values from API response
"online_followers": Array // Array of values from API response
}
Below is an example Facebook API response for v14.0/00000000000000000/insights?metric=audience_city,audience_country,audience_gender_age,audience_locale,online_followers&period=lifetime:
{
"data": [
{
"name": "audience_city",
"period": "lifetime",
"values": [
{
"value": {
"London, England": 3,
"Almere, Flevoland": 2,
"Odessa, Odessa Oblast": 7,
"Kharkiv, Kharkiv Oblast": 40,
"Barcelona, Cataluña": 4,
"Sumy, Sumy Oblast": 2,
"Amsterdam, Noord-Holland": 4,
"Kamianets-Podilskyi, Khmelnytskyi Oblast": 3,
"Alanya, Antalya Province": 2,
"Cherkasy, Cherkasy Oblast": 2,
"Lisbon, Lisbon District": 2,
"Berlin, Berlin": 18,
"Khmelnytskyi, Khmelnytskyi Oblast": 4,
"Valky, Kharkiv Oblast": 3,
"Lviv, Lviv Oblast": 23,
"Kyiv, Kyiv": 86,
"Prague, Prague": 3,
"Merefa, Kharkiv Oblast": 3,
"Moscow, Moscow": 2,
"Cologne, Nordrhein-Westfalen": 3,
"Uzhhorod, Zakarpattia Oblast": 5,
"Rivne, Rivne Oblast": 2,
"Vladivostok, Primorsky Krai": 2,
"Dnipro, Dnipropetrovsk Oblast": 6,
"Warsaw, Masovian Voivodeship": 13,
"Hoogeveen, Drenthe": 2,
"Stockholm, Stockholm County": 2,
"Dikanka, Poltava Oblast": 3,
"Chernivtsi, Chernivtsi Oblast": 6,
"Kraków, Lesser Poland Voivodeship": 2,
"Ivano-Frankivsk, Ivano-Frankivsk Oblast": 7,
"Hannoversch Münden, Niedersachsen": 2,
"Korotych, Kharkiv Oblast": 2,
"Ternopil, Ternopil Oblast": 5,
"Wroclaw, Lower Silesian Voivodeship": 8,
"Istanbul, Istanbul Province": 3,
"Toronto, Ontario": 2,
"Vinnytsia, Vinnytsia Oblast": 7,
"Frankfurt, Hessen": 2,
"Mukacheve, Zakarpattia Oblast": 2,
"Gdansk, Pomeranian Voivodeship": 2,
"Lyubotyn, Kharkiv Oblast": 2,
"Chortkiv, Ternopil Oblast": 2,
"Poltava, Poltava Oblast": 11,
"Okhtyrka, Sumy Oblast": 2
}
}
],
"title": "Audience City",
"description": "The cities of this profile's followers",
"id": "00000000000000000/insights/audience_city/lifetime"
},
{
"name": "audience_country",
"period": "lifetime",
"values": [
{
"value": {
"DE": 38,
"BE": 2,
"RU": 7,
"PT": 5,
"DK": 1,
"HR": 2,
"FR": 3,
"UA": 252,
"HU": 1,
"QA": 1,
"BR": 2,
"SE": 4,
"SK": 1,
"GB": 6,
"IE": 2,
"GE": 2,
"CA": 4,
"US": 4,
"EE": 1,
"IL": 2,
"AE": 1,
"CH": 3,
"MX": 1,
"CN": 1,
"IT": 2,
"ES": 8,
"AT": 1,
"CZ": 5,
"PH": 1,
"PK": 1,
"PL": 32,
"RO": 1,
"TR": 7,
"NL": 11,
"BA": 1
}
}
],
"title": "Audience Country",
"description": "The countries of this profile's followers",
"id": "00000000000000000/insights/audience_country/lifetime"
},
{
"name": "audience_gender_age",
"period": "lifetime",
"values": [
{
"value": {
"F.18-24": 15,
"F.25-34": 117,
"F.35-44": 15,
"F.45-54": 1,
"F.55-64": 3,
"F.65+": 2,
"M.18-24": 19,
"M.25-34": 135,
"M.35-44": 33,
"M.45-54": 2,
"M.55-64": 2,
"M.65+": 2,
"U.13-17": 1,
"U.18-24": 10,
"U.25-34": 55,
"U.35-44": 3,
"U.65+": 1
}
}
],
"title": "Gender and Age",
"description": "The gender and age distribution of this profile's followers",
"id": "00000000000000000/insights/audience_gender_age/lifetime"
},
{
"name": "audience_locale",
"period": "lifetime",
"values": [
{
"value": {
"it_IT": 1,
"es_LA": 1,
"ru_RU": 163,
"pl_PL": 7,
"da_DK": 1,
"tr_TR": 2,
"fr_FR": 3,
"de_DE": 6,
"ar_AR": 1,
"en_GB": 7,
"ru_UA": 1,
"en_US": 133,
"uk_UA": 84,
"zh_CN": 1,
"es_ES": 2,
"pt_PT": 1
}
}
],
"title": "Location",
"description": "The locales by country codes of this profile's followers",
"id": "00000000000000000/insights/audience_locale/lifetime"
},
{
"name": "online_followers",
"period": "lifetime",
"values": [
{
"value": {
},
"end_time": "2022-08-31T07:00:00+0000"
}
],
"title": "Online Followers",
"description": "Total number of this profile's followers that were online during the specified period",
"id": "00000000000000000/insights/online_followers/lifetime"
}
]
}
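A response like the one above could be turned into the daily update of the single per-user audience_analytics record along these lines. This is a sketch under assumptions: the function name `toAudienceUpdate` is hypothetical, audience_analytics is treated as a regular collection, and each schema field is filled with the `values` array of the metric of the same name.

```javascript
const AUDIENCE_FIELDS = [
  "audience_city",
  "audience_country",
  "audience_gender_age",
  "audience_locale",
  "online_followers",
];

// Builds the filter, update and options for a MongoDB upsert of one user's audience record.
function toAudienceUpdate(igUserId, insightsResponse, now = new Date()) {
  const fields = {};
  for (const name of AUDIENCE_FIELDS) {
    const metric = (insightsResponse.data || []).find((m) => m.name === name);
    // A metric missing from the response is stored as an empty array.
    fields[name] = metric ? metric.values : [];
  }
  return {
    filter: { "metadata.user_id": igUserId },
    update: { $set: { timestamp: now, ...fields } },
    options: { upsert: true }, // first run inserts, later runs update in place
  };
}
```

The three parts map directly onto the arguments of the driver's `updateOne(filter, update, options)` call.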
Overall architecture

Acceptance criteria:
- The latest version of the Yarn package manager is used.
- The latest stable Node.js version is used.
- The raw MongoDB driver is used to connect to MongoDB.
- The Profile Analytics daemon is built as a package inside a Yarn workspace.
- Analytics data is stored in MongoDB in a time series collection: https://www.mongodb.com/docs/manual/core/timeseries-collections/
- The analytics daemon collects data every hour and uses a queue to determine which Instagram profiles need to be updated. Profiles are queued for an update every hour. The queue can be split between multiple copies of the analytics daemon for scalability.
- A Mongo collection is used as the queue, if possible. If that is not possible, use Kafka or RabbitMQ (Kafka is preferred).
- Analytics data is collected via long-lived Instagram user access tokens, falling back to a shared token if the user token is not available.
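One possible way to use a Mongo collection as the queue and to apply the token fallback is sketched below. Everything here is an assumption made for illustration: the queue document shape (`user_id`, `next_run_at`, `locked_by`, `locked_at`), the `access_token` field on a profile, and the function names. `queue` is expected to be a `Collection` from the official `mongodb` driver; recent driver versions return the matched document (or `null`) from `findOneAndUpdate`, while older versions wrap it in a result object.

```javascript
const HOUR_MS = 60 * 60 * 1000;

// Claims the next due profile. The update is atomic per document, so several
// daemon copies polling the same collection should not claim the same entry.
async function claimNextProfile(queue, workerId, now = new Date()) {
  return queue.findOneAndUpdate(
    { next_run_at: { $lte: now } },
    {
      $set: {
        locked_by: workerId,
        locked_at: now,
        // Re-queue the profile for the next hourly run at claim time.
        next_run_at: new Date(now.getTime() + HOUR_MS),
      },
    },
    { sort: { next_run_at: 1 }, returnDocument: "after" }
  );
}

// Prefers the user's own long-lived token and falls back to the shared one.
function resolveAccessToken(profile, sharedToken) {
  return profile.access_token || sharedToken;
}
```

Each daemon copy can call `claimNextProfile` in a loop until it returns `null`, which splits the hourly workload across copies without a separate broker.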