System Design for Recommendations and Search // Eugene Yan // MLOps Meetup #78

Join us at our first in-person conference on June 25 all about AI Quality: www.aiqualityc...
MLOps Community Meetup #78! Last Wednesday we talked to Eugene Yan, an Applied Scientist at Amazon.
//Abstract
What does system design for industrial recommendations and search look like? In this talk, Eugene Yan shares how it's often split into:
- Latency-constrained online vs. less-demanding offline environments, and
- Fast but coarse candidate retrieval vs. slower but more precise ranking
We'll also see examples of system design from companies such as Alibaba, Facebook, JD, DoorDash, LinkedIn, and maybe do a quick walk-through on how to implement a candidate retrieval MVP.
//Bio
Eugene Yan designs, builds, and operates machine learning systems that serve customers at scale. He's currently an Applied Scientist at Amazon. Previously, he led the data science teams at Lazada (acquired by Alibaba) and uCare.ai. He writes & speaks about data science, data/ML systems, and career growth at eugeneyan.com and tweets at @eugeneyan.
// Relevant links
eugeneyan.com
applyingml.com
www.oreilly.co...
-------------- ✌️Connect With Us ✌️ ------------
Join our slack community: go.mlops.commu...
Follow us on Twitter: @mlopscommunity
Sign up for the next meetup: go.mlops.commu...
Catch all episodes, Feature Store, Machine Learning Monitoring and Blogs: mlops.community/
Connect with Demetrios on LinkedIn: / dpbrinkm
Connect with Eugene on / eugeneyan
Timestamps:
[00:10] System Design for Recommendations and Search
[01:37] Why: Batch vs. Real-time
[02:05] Batch
Recommender (key-value DB)
Recommendations refreshed periodically
[02:21] Real-time
Recommender (REST/gRPC)
Recommendations generated in real-time
[02:37] Batch benefits
Pre-computed
Decouple compute from serving
Lower operational load
[03:25] Real-time benefits
Responsive to time-sensitive context
Reduce cost on non-visiting users
[06:50] Focus on real-time aka on-demand
[07:00] Offline vs Online aspect
[07:11] Offline aspect
Host batch processes such as training, index/graph building
Load data into feature stores
[07:23] Online aspect
Uses artifacts from the offline environment to serve requests
Candidate retrieval and ranking
[07:40] Retrieval
Fast but coarse
Searches millions of items to get hundreds of candidates
Approximate nearest neighbors, graphs, etc.
[08:05] Ranking
Slower but more precise
Ranks hundreds of candidates
Adds more features
Classification or learning to rank
[08:49] Online Retrieval
[09:37] Offline Ranking
[10:50] Online Retrieval
[11:15] Offline Retrieval
[12:25] How: Industry Examples
[12:45] Building item embeddings for candidate retrieval (Alibaba)
[15:31] Building a graph network for ranking (Alibaba)
[17:06] Building embeddings for retrieval in search (Facebook)
[19:10] Building graphs for query expansion and retrieval (DoorDash)
[22:32] Unnecessary real-time over-engineering
[25:05] Real-time timely decision
[26:27] How: Industry Examples (Retrieval)
[26:43] Collaborative Filtering
[30:32] Candidate Retrieval at YouTube (via penultimate embedding)
[32:06] Candidate Retrieval at Instagram (via word2vec)
[33:53] How: Industry Examples (Ranking)
[33:56] Ranking at Google (via sigmoid)
[35:00] Ranking at YouTube (via weighted logistic regression)
[35:31] Ranking at Alibaba (via Transformer)
[36:16] How: Building an MVP
[36:22] Training: Self-supervised Representation Learning
[37:20] Ranking: Logistic Regression
[37:21] Retrieval: Approximate nearest neighbors
[38:40] Ranking: Logistic Regression
[39:00] Serving: Multiple instances + Load Balancer (or SageMaker)
[39:38] From two-stage to four-stage
[41:54] Further reading
[43:44] Applied ML page
[52:52] Keeping the habit
[55:26] Recommended books for machine learning
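
The retrieval-then-ranking split in the outline above can be sketched roughly as follows. This is a minimal illustration, not the code from the talk: the item names, embeddings, popularity feature, and model weights are all hypothetical, and a brute-force dot-product scan stands in for a real approximate nearest neighbor index.

```python
import math
from typing import Dict, List, Tuple

# Hypothetical item embeddings. In a real system these would be built
# offline and loaded into an approximate nearest neighbor (ANN) index.
ITEM_EMBEDDINGS: Dict[str, List[float]] = {
    "item_a": [1.0, 0.0],
    "item_b": [0.9, 0.1],
    "item_c": [0.0, 1.0],
    "item_d": [-1.0, 0.0],
}

# Hypothetical extra feature that is only consulted at ranking time.
ITEM_POPULARITY: Dict[str, float] = {
    "item_a": 0.1,
    "item_b": 0.9,
    "item_c": 0.5,
    "item_d": 0.5,
}


def dot(u: List[float], v: List[float]) -> float:
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def retrieve(query: List[float], k: int) -> List[str]:
    """Fast but coarse: return the k items most similar to the query.

    Brute-force scan used here as a stand-in for an ANN lookup.
    """
    by_similarity = sorted(
        ITEM_EMBEDDINGS,
        key=lambda item: dot(query, ITEM_EMBEDDINGS[item]),
        reverse=True,
    )
    return by_similarity[:k]


def rank(query: List[float], candidates: List[str]) -> List[Tuple[str, float]]:
    """Slower but more precise: score candidates with a logistic model.

    The weights are hand-picked for illustration, not learned.
    """
    w_similarity, w_popularity, bias = 1.0, 2.0, -1.0
    scored = []
    for item in candidates:
        z = (
            w_similarity * dot(query, ITEM_EMBEDDINGS[item])
            + w_popularity * ITEM_POPULARITY[item]
            + bias
        )
        scored.append((item, 1.0 / (1.0 + math.exp(-z))))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


def recommend(query: List[float], k_retrieve: int = 2) -> List[Tuple[str, float]]:
    """Two-stage pipeline: retrieve a few candidates, then rank them."""
    return rank(query, retrieve(query, k_retrieve))
```

Calling recommend([1.0, 0.0]) retrieves item_a and item_b by similarity alone, and the ranking stage then places item_b first because the added popularity feature outweighs its slightly lower similarity. That reordering is the point of the second stage: it can afford extra features because it only sees a handful of candidates.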

Comments: 40

@MLOps 2 years ago
sorry for my audio quality I had the nice mic set up and was talking into it the whole time but zoom was set to receive audio input from my earpods....🤦‍♂️
@MLOps 3 months ago
Join us at our first in-person conference on June 25 all about AI Quality: www.aiqualityconference.com/
@ankitbhatia6736 2 years ago
Great content, no distractions, to the point. Thanks a lot.
@Public_Daniel 2 years ago
Eugene is a legend, great interview!
@MLOps
1 year ago
yes he is!
@50sKid 7 months ago
This was an amazing presentation and there's a reason it's your most popular video now. Thank you.
@ahsanshafiqchaudhry 2 years ago
Very interesting talk! I like how questions are answered based on evidence/use-case i.e. how real time recommendation is a bit of an overkill.
@Rbtamaki 6 months ago
Really insightful. Thank you very much for putting the time and effort on the presentation. I really appreciated and learned from the video
@shilinwang1847 1 year ago
IT WAS SO COOL AND INSIGHTFUL! MANY THANKS!
@fuzzywuzzy318 6 months ago
this is a singaporean channel! nice to see singapore high quality youtube content!!!!!!!!!!!!
@RenZhang88 2 years ago
@31:39 On this. I think, there is the last linear layer project the data into the number of videos to do the softmax. The weights of that layer associated with each video is the vector for each video. Intuitively, if the user vector has large dot product with this video vector, it will have large logit for the softmax thus most probably a match.
@leoxiaoyanqu 2 years ago
Very great talk, lots of great explanations and diagrams all-in-one! Thanks for sharing!
@WangRuinju 2 years ago
Great talk! Thanks for sharing!
@ApdullahYAYIK 3 months ago
A minor correction: Skipgram already uses Negative Sampling @MLOps
@ApdullahYAYIK 3 months ago
Sum of user scores for CFI2I and SWINGI2I should be at the numerator, please correct me if I am wrong.
@danielhe539 2 years ago
Great details and examples, Eugene.
@bharatsharma2907 2 years ago
Great! Thanks for sharing
@MLOps
2 years ago
Thanks for watching
@ray811030 1 year ago
You put the candidate retrieval and ranking model in the same machine(For example, using SM) Under the SM, user_id -> invoke ANN(db) to get candidates(a bunch of item_ids) -> invoke FS with item_id and user_id to get features separately -> invoke ranking model -> return a bunch of items with score in the sorted manner descendingly. Everything should be done within 200 ms p99
@ray811030
1 year ago
Also, how can we expose our candidate generation and ranking services via generic APIs, so other users can mix-and-match as required? We’ll want to consider these in the long-term roadmap. I'm wondering sh
@goelnikhils 1 year ago
Hi Eugene, Thanks for the great video. One question has been troubling me is that for recommendation engine why we can't simply use a GNN to generate user and item embeddings and then use a similarity method such as cosine or dot product to rank items vis a vis a classical two tower model. For all the user, item meta data and other user-item implicit interactions (click, purchase etc.) and other contextual ranking signals embeddings can be generated. These embeddings can be concatenated and then do a dot product with item to rank and serve online. Do you see any challenges in this. Pls advise on priority as I am preparing for an int. Thanks in advance.
@maryamaghili1148 2 years ago
very interesting talk! thanks for sharing.
@hby4pi 2 years ago
Great Content Man
@chineduezeofor2481 1 year ago
Awesome interview
@Fordance100 1 year ago
Great overview.
@bowang1825 2 years ago
Great talk
@advaitdubhashi9825 1 year ago
Great session !!
@gpprudhvi 2 years ago
Pretty clear and interesting!
@TheSiddhaartha 1 year ago
Which type of databases can be used for storing vetted content and ranking done through Deep Learning? Any video/article which recommends databases?
@apekshapriya1650 1 year ago
Thanks for this wonderful talk! There is one point though which I would like to clear. At 14:50, when you talk about the request coming from a user, the user's browser history items is also seen to get the candidate sets. At that point of time, is the present item that a user is currently looking at is also being seen as the input?
@madhubagroy 2 years ago
This is gold!
@TheEmanrese 2 years ago
Great content!
@Gerald-iz7mv 11 months ago
TPP = The Personalization Platform?
@dinumahawar9819 5 months ago
🎉🎉🎉❤❤❤
@bulgakovwork2022 1 year ago
Is it possible to download this information from some resources?
@doj-i 1 year ago
🔥
@bibiworm 2 years ago
Is it possible to get the slides? Great talk.
@MLOps
2 years ago
yess! eugeneyan.com/speaking/mlops-community-recsys/
@hby4pi 2 years ago
Works better with .75 speed
@davidoh0905 22 days ago
how do I poo poo it lol what does it mean
