Ranking algorithms in MongoDB

| A little context

Surfacing content has become a common task in the development of modern web applications. Deciding which content to surface, and when to surface it, is becoming increasingly complex. Users expect more personalised and contextual feeds and recommendations in the applications they use.

In this article I will create a dynamic, real-time content ranking algorithm that we will use to sort a feed.

| The App

For brevity, we will create a simple app with a single feed of user-generated posts.

Users can:

  • View a post (V)
  • Like a post (L)
  • Share a post (S)
  • Follow another user (F)

| The Algorithm

The posts in the feed of the app will be sorted by our ranking algorithm.

Constraints

i) Posts with more views, likes, and shares will rank higher than others. Each of these variables (views, likes, and shares) will be weighted differently: for instance, one like will increase the ranking more than one view, and so will one share.

ii) Newer posts will be ranked higher than older posts. We want this feed to be dynamic and ever-changing (we are assuming that users are regularly posting new content to our app). Constraint (i) on its own produces a 'rich get richer' behaviour: posts with more views, likes, and shares are given a higher ranking and thus surface higher in our feed. This offers those posts more opportunity for further views, likes, and shares, which increases their ranking again, and so on.

By introducing a temporal aspect to our algorithm, we can choose how we handle the ranking of a post as it ages. In our case we want the ranking to decay as the post gets older. This will promote the surfacing of new content over older content, keeping our feed fresh.

iii) New posts with no views, likes, or shares must be given a chance to surface.

Taking the above constraints into consideration we can create our algorithm as follows:

$$ f(t) = \frac{0.9L + 0.3V + 2S + 12f + 60n}{0.6 + 0.002t^{2}} $$

Where:

L is number of likes

V is number of views

S is number of shares

f is a boolean (1/0) set to 1 if the user requesting the feed follows the user who created the post

n is a boolean (1/0) set to 1 if the post is less than 1 hour old

t is the age of the post
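To make the formula concrete, here is a minimal JavaScript sketch of the score for a single post. The function name `rankScore` and the property names are hypothetical, and it assumes t is measured in hours, which the formula itself does not state.

```javascript
// Weights copied from the formula above.
const WEIGHTS = { likes: 0.9, views: 0.3, shares: 2, follows: 12, isNew: 60 };

/**
 * Score one post for one viewer.
 * Assumption: ageHours plays the role of t and is measured in hours.
 */
function rankScore({ likes, views, shares, follows, ageHours }) {
  const n = ageHours < 1 ? 1 : 0; // post is less than 1 hour old
  const f = follows ? 1 : 0;      // viewer follows the author
  const engagement =
    WEIGHTS.likes * likes +
    WEIGHTS.views * views +
    WEIGHTS.shares * shares +
    WEIGHTS.follows * f +
    WEIGHTS.isNew * n;
  // The denominator grows with the square of the age, so the score decays.
  const decay = 0.6 + 0.002 * ageHours ** 2;
  return engagement / decay;
}

// A brand-new post with no engagement still gets a non-zero score (60 / 0.6).
const fresh = rankScore({ likes: 0, views: 0, shares: 0, follows: false, ageHours: 0 });

// The same engagement scores lower as the post ages.
const dayOld = rankScore({ likes: 100, views: 1000, shares: 20, follows: false, ageHours: 24 });
const weekOld = rankScore({ likes: 100, views: 1000, shares: 20, follows: false, ageHours: 168 });
```

The `n` term is what satisfies constraint (iii): a post with no views, likes, or shares still surfaces during its first hour, after which it has to earn its place through engagement.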


The ranking can be computed either on the application server or in the database. Computing it in the database makes more sense as the data set grows: combined with pagination, it means the server no longer pulls every post out of the database unnecessarily.
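As a rough illustration of ranking inside the database, the sketch below builds a MongoDB aggregation pipeline that computes the score, sorts by it, and returns one page. The field names (`likes`, `views`, `shares`, `authorId`, `createdAt`), the function name `buildFeedPipeline`, and the choice of hours as the unit for t are all assumptions made for this example. The `$$NOW` variable requires MongoDB 4.2 or later.

```javascript
const MS_PER_HOUR = 60 * 60 * 1000;

// followedIds: ids of the users that the requesting user follows (hypothetical input).
function buildFeedPipeline(followedIds, page = 0, pageSize = 20) {
  return [
    // Age of the post in hours; subtracting two dates yields milliseconds.
    {
      $addFields: {
        ageHours: {
          $divide: [{ $subtract: ["$$NOW", "$createdAt"] }, MS_PER_HOUR],
        },
      },
    },
    // Same weights and decay as the formula above.
    {
      $addFields: {
        score: {
          $divide: [
            {
              $add: [
                { $multiply: [0.9, "$likes"] },
                { $multiply: [0.3, "$views"] },
                { $multiply: [2, "$shares"] },
                { $cond: [{ $in: ["$authorId", followedIds] }, 12, 0] },
                { $cond: [{ $lt: ["$ageHours", 1] }, 60, 0] },
              ],
            },
            { $add: [0.6, { $multiply: [0.002, "$ageHours", "$ageHours"] }] },
          ],
        },
      },
    },
    { $sort: { score: -1 } },
    // Pagination: only one page of posts leaves the database.
    { $skip: page * pageSize },
    { $limit: pageSize },
  ];
}
```

With a hypothetical `posts` collection, this could be run as `db.posts.aggregate(buildFeedPipeline(followedIds))`. The database still has to score every candidate post, but only the requested page is sent back to the server.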

As an extension, we could take the rate of likes, views, and shares into consideration and make it contribute to our ranking (higher rate = higher vitality). We could also use a post's lastUpdated value, increasing the ranking further when it is below a certain threshold.

