Research notes

Explore existing real-time recommendation systems.

1. Scaling the Instagram Explore recommendations system

1.1 Retrieval

This stage retrieves the content that will be ranked in later stages.
It narrows the search down from billions to hundreds.
There are four types of candidate sources:

Two tower model:

Two tower model for retrieval:

Two tower model for history:

1.2 First stage ranking

A lightweight model that is less precise but also less computationally intensive, so it can recall thousands of candidates.
Train the first stage ranker to predict the output of the second stage ranker.
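
Training one model to reproduce another model's output is commonly called knowledge distillation. The sketch below illustrates the idea only: the `teacher` function is a hypothetical stand-in for the heavy second-stage ranker, the student is a plain linear model fitted with SGD, and all names and hyperparameters are assumptions rather than anything taken from the Instagram system.

```python
import random

def train_student(features, teacher_scores, lr=0.05, epochs=100, seed=0):
    """Fit a linear student to the teacher's scores with SGD on squared error."""
    rng = random.Random(seed)
    weights = [0.0] * len(features[0])
    bias = 0.0
    indices = list(range(len(features)))
    for _ in range(epochs):
        rng.shuffle(indices)
        for i in indices:
            x, target = features[i], teacher_scores[i]
            pred = bias + sum(w * v for w, v in zip(weights, x))
            err = pred - target
            # Gradient step for the loss 0.5 * err**2.
            weights = [w - lr * err * v for w, v in zip(weights, x)]
            bias -= lr * err
    return weights, bias

def student_score(weights, bias, x):
    """Cheap score used by the first-stage ranker."""
    return bias + sum(w * v for w, v in zip(weights, x))

def teacher(x):
    """Hypothetical second-stage ranker, stubbed as a fixed function."""
    return 0.7 * x[0] - 0.2 * x[1] + 0.1

data_rng = random.Random(1)
features = [[data_rng.random(), data_rng.random()] for _ in range(200)]
teacher_scores = [teacher(x) for x in features]  # labels come from the teacher
weights, bias = train_student(features, teacher_scores)
```

The labels here are the teacher's scores rather than user feedback, which is what lets the cheap model approximate the ordering the expensive model would produce.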

1.3 Second stage ranking

Multi-task, multi-label neural network that consumes user-item interactions.

1.4 Final ranking

Apply rules so that the final ranking complies with business requirements.
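
A rule-based final pass can be sketched as a filter over the already ranked list. The notes do not say which rules apply, so the two below (a blocklist and a per-author cap) are hypothetical placeholders, as are the `id` and `author` field names.

```python
def apply_business_rules(ranked_items, blocked_ids, max_per_author=2):
    """Walk the ranked list in order and drop items that break a rule."""
    kept = []
    per_author = {}
    for item in ranked_items:
        if item["id"] in blocked_ids:
            continue  # hypothetical rule 1: never show blocked items
        seen = per_author.get(item["author"], 0)
        if seen >= max_per_author:
            continue  # hypothetical rule 2: cap items per author
        per_author[item["author"]] = seen + 1
        kept.append(item)
    return kept

ranked = [
    {"id": 1, "author": "x"},
    {"id": 2, "author": "x"},
    {"id": 3, "author": "x"},
    {"id": 4, "author": "y"},
]
final = apply_business_rules(ranked, blocked_ids={2}, max_per_author=1)
```

Because the function only removes items, the relative order produced by the second-stage ranker is preserved.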

2. How to design and implement an MVP

From the article:

To train item embeddings, we adopt the simple but effective word2vec approach, specifically, the skip-gram model. (This is also used by Instagram, Twitter, and Alibaba.) I’ve previously written about how to create embeddings via word2vec and DeepWalk and won’t go into details here.
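
To make the skip-gram idea concrete: each user session is treated like a sentence and each item ID like a word, and the model learns from (center, context) pairs drawn from a sliding window. The sketch below only builds those training pairs; the session data, the function name, and the window size are illustrative assumptions, and the embedding training itself is left out.

```python
def skipgram_pairs(sessions, window=2):
    """Return (center, context) item pairs within `window` positions."""
    pairs = []
    for session in sessions:
        for i, center in enumerate(session):
            start = max(0, i - window)
            stop = min(len(session), i + window + 1)
            for j in range(start, stop):
                if j != i:
                    pairs.append((center, session[j]))
    return pairs

# Hypothetical browsing sessions, one list of item IDs per session.
sessions = [["shoe_1", "shoe_2", "sock_1"], ["sock_1", "hat_3"]]
pairs = skipgram_pairs(sessions, window=1)
```

In practice a library would typically handle both pair generation and training. For example, gensim's `Word2Vec` accepts lists of tokens and switches to skip-gram with `sg=1`.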

To generate candidates, we apply k-nearest neighbours (à la YouTube’s implementation). However, exact kNN is slow and we don’t really need the precision at this stage. Thus, we’ll use approximate nearest neighbours (ANN) instead.
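
The trade-off between exact and approximate search can be illustrated with a small sketch. Exact kNN compares the query against every item, while the approximate variant below hashes vectors with random hyperplanes (one simple form of locality-sensitive hashing) and only compares items that land in the same bucket. The article does not say which ANN method it uses, so the `HyperplaneLSH` class and its parameters are assumptions chosen for brevity.

```python
import math
import random

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def exact_knn(query, embeddings, k):
    """Score every item: precise, but cost grows with the catalogue size."""
    ranked = sorted(embeddings, key=lambda i: cosine(query, embeddings[i]), reverse=True)
    return ranked[:k]

class HyperplaneLSH:
    """Approximate search: only items in the query's hash bucket are scored."""

    def __init__(self, dim, n_planes=4, seed=0):
        rng = random.Random(seed)
        self.planes = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(n_planes)]
        self.buckets = {}
        self.embeddings = {}

    def _key(self, vec):
        # One bit per hyperplane: which side of the plane the vector falls on.
        return tuple(sum(p * v for p, v in zip(plane, vec)) >= 0 for plane in self.planes)

    def add(self, item_id, vec):
        self.embeddings[item_id] = vec
        self.buckets.setdefault(self._key(vec), []).append(item_id)

    def query(self, vec, k):
        candidates = self.buckets.get(self._key(vec), [])
        ranked = sorted(candidates, key=lambda i: cosine(vec, self.embeddings[i]), reverse=True)
        return ranked[:k]

# Toy item embeddings (hypothetical values).
embeddings = {"a": [1.0, 0.1], "b": [0.0, 1.0], "c": [-1.0, 0.0]}
index = HyperplaneLSH(dim=2)
for item_id, vec in embeddings.items():
    index.add(item_id, vec)
```

The approximate index can miss true neighbours that hash to a different bucket. That loss of precision is the price paid for not scanning the whole catalogue, which is acceptable at the candidate generation stage.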

Identify potential algorithms suitable for cold-start recommendations

User cold start

Strategies to address user cold start problem:

Show popular or nearby items.
Ask the user to pick a few interesting items during their first visit.
Use a sequence of item IDs instead of the user ID as the query.
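
Two of these strategies are easy to sketch. The first falls back to the most popular items when nothing is known about the user. The second builds a query vector from the items seen in the current session, so no user ID or user history is needed. Averaging the item embeddings is an assumption made here for simplicity, and the function names and data layout are hypothetical.

```python
from collections import Counter

def popular_items(interactions, k):
    """Most-interacted items, used when the user has no history at all."""
    counts = Counter(item_id for _user_id, item_id in interactions)
    return [item_id for item_id, _count in counts.most_common(k)]

def session_query_vector(recent_item_ids, item_embeddings):
    """Average the embeddings of recently viewed items to form the query."""
    vectors = [item_embeddings[i] for i in recent_item_ids if i in item_embeddings]
    if not vectors:
        return None  # nothing to go on: caller should fall back to popular items
    dim = len(vectors[0])
    return [sum(v[d] for v in vectors) / len(vectors) for d in range(dim)]

# Hypothetical interaction log of (user_id, item_id) pairs and item embeddings.
interactions = [("u1", "a"), ("u2", "a"), ("u3", "b")]
item_embeddings = {"a": [1.0, 0.0], "b": [0.0, 1.0]}

fallback = popular_items(interactions, k=1)
query = session_query_vector(["a", "b"], item_embeddings)
```

The resulting query vector can be passed to the same nearest-neighbour lookup that serves known users.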

Item cold start

Strategies to address item cold start problem:

Elicit user interactions with the item as fast as possible
Content based filtering
- Item metadata
- Expert Knowledge
Random exploration
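
Random exploration can be sketched as an epsilon-greedy mix: most slots go to the normally ranked items, and with a small probability a slot is given to a randomly chosen new item so that it starts collecting interactions. The function name, the `epsilon` value, and the slot-by-slot policy are assumptions for illustration, not a method described in the sources above.

```python
import random

def mix_in_new_items(ranked_items, new_items, n_slots, epsilon=0.1, rng=None):
    """Fill n_slots, giving each slot to a random new item with probability epsilon."""
    rng = rng or random.Random()
    ranked = list(ranked_items)
    fresh = list(new_items)
    result = []
    while len(result) < n_slots and (ranked or fresh):
        # Explore if new items remain and either the coin flip says so
        # or there are no ranked items left to exploit.
        explore = bool(fresh) and (not ranked or rng.random() < epsilon)
        if explore:
            result.append(fresh.pop(rng.randrange(len(fresh))))
        else:
            result.append(ranked.pop(0))
    return result

feed = mix_in_new_items(["a", "b", "c"], ["new_1", "new_2"], n_slots=4,
                        epsilon=0.2, rng=random.Random(0))
```

A higher `epsilon` gathers data on new items faster but shows more unproven content to users.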

System cold start

No interactions between users and items exist yet.