Requirement Gathering

Here’s an example dialogue between the candidate and the interviewer. The candidate’s main goal is to frame the problem by asking clarifying questions. Aspects worth clarifying include the key objective of the ML system, the user experience, the data sources, and the scale.

[Interviewer] How would you design a product recommendation system on Amazon?

[Candidate] When you say product recommendation, do you mean the recommendations served on the homepage, in search results, or in a “products like this” section?

[Interviewer] Good question. Let’s presume that we want the recommendation served on search.

[Candidate] Understood. If a user searches for a product, say athletic socks, the search results should list products relevant to that user. Is that the right understanding of the user experience?

[Interviewer] Yes, that is correct.

[Candidate] Are there any specific business metrics that we’re aiming to optimize with these recommendations? For instance, are we focusing on increasing overall sales, enhancing user engagement, or maybe reducing the churn rate?

[Interviewer] That’s an important point. Let’s say our main goal is to increase overall sales while also ensuring that users find the recommendations relevant.

[Candidate] Noted. To make the recommendations effective, we would need to understand the user’s preferences and shopping habits. What kind of user data would we have access to? For example, past purchases, browsing history, clicked items, or any demographic data?

[Interviewer] We do have access to past purchase history, browsing history, clicked items, and some basic demographic data.

[Candidate] Do we also have product data, such as product name, description, price, and ratings?

[Interviewer] Yes, you have data on product details.

[Candidate] How quickly should the recommender system return results? I know that latency for search systems on large-scale online platforms is usually between 200 and 300 milliseconds. Should I design for that as well?

[Interviewer] That sounds right.

[Candidate] Great! Lastly, how real-time should the recommendations be? Should they instantly reflect recent user actions, or can they be based on a daily update?

[Interviewer] They don’t need to be updated in real-time but should ideally be updated at least once a day.

Problem Formulation

Design a product recommendation system that takes a user’s search query as input and returns a list of products relevant to that user, based on the user’s history and product metadata.
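To make the inputs and outputs of this formulation concrete, here is a minimal Python sketch of the interface. Everything in it is hypothetical and for illustration only: the `Product` and `UserHistory` types, the `recommend` function, and the toy scoring rule (keyword overlap plus a small boost for previously clicked items) are assumptions, not the design a production system would use.

```python
from dataclasses import dataclass, field


@dataclass
class Product:
    # Product metadata mentioned in the requirements dialogue.
    product_id: str
    name: str
    description: str
    price: float
    rating: float


@dataclass
class UserHistory:
    # Hypothetical container for the user signals available to the system.
    purchased_ids: set[str] = field(default_factory=set)
    clicked_ids: set[str] = field(default_factory=set)


def recommend(query: str, history: UserHistory,
              catalog: list[Product], k: int = 10) -> list[Product]:
    """Return up to k products ranked by a toy relevance score."""
    query_terms = set(query.lower().split())

    def score(product: Product) -> float:
        text = f"{product.name} {product.description}".lower()
        overlap = len(query_terms & set(text.split()))
        if overlap == 0:
            return 0.0  # no query match: not a candidate
        # Toy personalization (assumption): boost items the user clicked before.
        boost = 0.5 if product.product_id in history.clicked_ids else 0.0
        return overlap + boost

    scored = [(score(p), p) for p in catalog]
    relevant = [(s, p) for s, p in scored if s > 0]
    relevant.sort(key=lambda pair: pair[0], reverse=True)
    return [p for _, p in relevant[:k]]


catalog = [
    Product("p1", "Athletic Socks", "cushioned running socks", 9.99, 4.5),
    Product("p2", "Dress Socks", "formal cotton socks", 12.00, 4.2),
    Product("p3", "Coffee Mug", "ceramic mug", 7.00, 4.8),
]
history = UserHistory(clicked_ids={"p2"})
results = recommend("athletic socks", history, catalog, k=2)
print([p.product_id for p in results])
```

The point of the sketch is the contract, not the scoring: the query and the user’s history go in, and a ranked list of products comes out.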

Scale

When designing a recommender system for Amazon, or for any other large-scale platform, consider the number of users, the number of products, the query load, and the latency target. These factors determine how scalable the design needs to be. Here are the approximate figures for Amazon.

  • Monthly Active Users: ~350 Million
  • Number of Products: ~20 Million
  • Queries per Second: 1,000 to 2,000
  • Search Result Latency: 200 to 300 milliseconds