Thompson Sampling : Data Science Concepts

The coolest Multi-Armed Bandit solution!
Multi-Armed Bandit Intro : • Multi-Armed Bandit : D...
Table of Conjugate Priors:
en.m.wikipedia.org/wiki/Conju...
My Patreon : www.patreon.com/user?u=49277905

Comments: 47

  • @Ludecan
    @Ludecan · 2 years ago

    Man, what a good explanation! I was looking for Bayesian regression and found your video on it; got it. Now I searched for Thompson sampling and it's your channel again! You're saving my day hahaha. Very clear and insightful explanations. Thank you very much!

  • @komuna5984
    @komuna5984 · 2 years ago

    Your explanation made me say WOW!

  • @LuxSaJo
    @LuxSaJo · 2 years ago

    I really like your videos! Your explanations are so much better than the ones given by my professors!

  • @ritvikmath

    @ritvikmath · 2 years ago

    Thanks!

  • @xxshogunflames
    @xxshogunflames · 3 years ago

    Very neat, first time I've come across Thompson sampling!

  • @mihajlom1k1
    @mihajlom1k1 · 2 years ago

    Very well explained video, helped me a lot!

  • @uraskarg710
    @uraskarg710 · 1 year ago

    Very clear explanation! Thank you so much!

  • @paulseidel5819
    @paulseidel5819 · 3 years ago

    I like the very clear explanation, with a reference to the math details for those who want that. Also appreciate the limitations at the end. Thinking of applications to portfolio optimization.

  • @ritvikmath

    @ritvikmath · 3 years ago

    Thanks!

  • @Rfleck-lh8yl
    @Rfleck-lh8yl · 3 years ago

    This is pretty awesome, thanks for the great explanation!

  • @ritvikmath

    @ritvikmath · 3 years ago

    Thanks!

  • @cjkarr
    @cjkarr · 2 years ago

    Excellent video - thank you!

  • @ajayram198
    @ajayram198 · 1 year ago

    Beautiful explanation! Had come across Thompson Sampling during Udemy's online course on Recommender Systems.

  • @jimmywang6177
    @jimmywang6177 · 3 years ago

    Cool video! There are a lot of videos about DS implementation, but I find this channel provides the math foundations behind the scenes. While a good implementation is important, I believe the theoretical foundation is also very cool and crucial to a successful analysis.

  • @ritvikmath

    @ritvikmath · 3 years ago

    Thanks!

  • @niallmurray2915
    @niallmurray2915 · 1 month ago

    Great explanation!!!

  • @SliverHell
    @SliverHell · 1 year ago

    Damn bro. You are good at this

  • @vyaslkv
    @vyaslkv · 2 months ago

    Loved the explanation! Before this video, I thought I could never learn TS. Thank you :)

  • @user-or7ji5hv8y
    @user-or7ji5hv8y · 3 years ago

    This is really interesting. Never heard of it.

  • @mohammadhassanmomenian2793
    @mohammadhassanmomenian2793 · 1 year ago

    Amazing. Thanks 😉

  • @HaseebAli-gs5bf
    @HaseebAli-gs5bf · 3 years ago

    The posteriors that emerge given the formulas have a standard deviation of 1 after one visit. Does this result depend on the fact that the quality of the restaurants actually has a known standard deviation of 1?
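
    (For reference: under the video's normal-normal conjugate model with known observation noise $\sigma$ and a diffuse prior $\mathcal{N}(\mu_0, \sigma_0^2)$, the posterior variance after $n$ visits is

        $$\sigma_n^2 = \left(\frac{1}{\sigma_0^2} + \frac{n}{\sigma^2}\right)^{-1} \xrightarrow{\;\sigma_0 \to \infty\;} \frac{\sigma^2}{n},$$

    so, assuming a diffuse prior, the standard deviation of roughly 1 after one visit does come directly from the assumed $\sigma = 1$.)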

  • @afandidzakaria6881
    @afandidzakaria6881 · 3 years ago

    Can you produce a video explaining the moving least squares method? Thank you in advance.

  • @user-bt5il9zq8p
    @user-bt5il9zq8p · 10 months ago

    Awesome!!!

  • @shipan5940
    @shipan5940 · 2 years ago

    How do you pick whether the next visit is to restaurant 1 or 2? Could you explain that?
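
    (The rule Thompson sampling uses is: draw one sample from each restaurant's current posterior and visit whichever sample is largest. A minimal sketch in Python; the posterior parameters below are illustrative, not from the video:)

        import numpy as np

        # Illustrative current posteriors for restaurants 1 and 2.
        post_mean = np.array([5.0, 6.0])   # posterior means
        post_var = np.array([2.0, 0.5])    # posterior variances

        # Draw one sample per restaurant from its posterior,
        # then visit the restaurant whose sample is largest.
        samples = np.random.normal(post_mean, np.sqrt(post_var))
        next_visit = np.argmax(samples) + 1  # 1 or 2
        print(next_visit)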

  • @tingsun3388
    @tingsun3388 · 1 year ago

    Could you please also have a video on Importance Sampling?

  • @hEmZoRz
    @hEmZoRz · 1 month ago

    Absolutely fantastic content once again, many thanks! However, I have one important question: you never revisited the assumption that we know sigma_i beforehand, even though in practice it's an unobservable quantity. What should one do with it? Is estimating it from historical data (if such data are available) a big no-no?
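
    (One standard answer, sketched under assumptions not covered in the video: treat sigma_i as unknown too and put a normal-inverse-gamma conjugate prior on the pair (mu_i, sigma_i^2). The posterior stays closed-form, and the Thompson step draws sigma_i^2 first and then mu_i. All prior parameters below are illustrative:)

        import numpy as np

        def nig_thompson_draw(x, mu0=0.0, lam0=1.0, a0=1.0, b0=1.0):
            """Draw (mu, sigma2) from the normal-inverse-gamma posterior,
            given at least one observed reward in x."""
            x = np.asarray(x, dtype=float)
            n, xbar = len(x), x.mean()
            lam_n = lam0 + n
            mu_n = (lam0 * mu0 + n * xbar) / lam_n
            a_n = a0 + n / 2.0
            b_n = (b0 + 0.5 * ((x - xbar) ** 2).sum()
                   + 0.5 * lam0 * n * (xbar - mu0) ** 2 / lam_n)
            sigma2 = 1.0 / np.random.gamma(a_n, 1.0 / b_n)  # inverse-gamma draw
            mu = np.random.normal(mu_n, np.sqrt(sigma2 / lam_n))
            return mu, sigma2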

  • @talhafaiz3597
    @talhafaiz3597 · 1 year ago

    Can you please mention the source you studied for this video, like a journal paper or a textbook you followed? It would help me a lot. Thanks.

  • @cocowu4887
    @cocowu4887 · 2 years ago

    How do you get the initial posterior distribution of 20 and -12?

  • @dr.kingschultz
    @dr.kingschultz · 1 year ago

    You are the best

  • @muhammadal-qurishi7110
    @muhammadal-qurishi7110 · 3 years ago

    What about CRF? Are you able to do it?

  • @knivetsil
    @knivetsil · 2 years ago

    Is one shortcoming of this method that the variance of the posterior does not scale to the sample variance of the observations for that restaurant? Like, if I went to Restaurant A 50 times and Restaurant B 50 times, and my sample values from Restaurant A were distributed N(5, 1) but my sample values from Restaurant B were distributed N(6, 10), then you would think that my posterior for Restaurant B should have much wider variance than my posterior for Restaurant A. But Thompson Sampling doesn't seem to account for that, instead just scaling posterior variance by the number of observations per restaurant. Am I missing something here?
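
    (A quick numeric check of this point, under the video's known-variance model with sigma assumed equal to 1 and a diffuse prior; the data below are illustrative:)

        import numpy as np

        rng = np.random.default_rng(0)
        a = rng.normal(5, 1, size=50)    # restaurant A: tightly clustered rewards
        b = rng.normal(6, 10, size=50)   # restaurant B: wildly varying rewards

        # Known-variance posterior: depends only on the visit count n.
        sigma0_sq, sigma_sq, n = 100.0, 1.0, 50
        post_var = 1 / (1 / sigma0_sq + n / sigma_sq)
        print(post_var)                        # identical (~0.02) for A and B
        print(a.var(ddof=1), b.var(ddof=1))    # yet the sample spreads differ hugely

    So the observation is correct for this model; making the posterior widen with the sample spread requires treating sigma as unknown, e.g., the normal-inverse-gamma setup sketched earlier in the thread.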

  • @Enerdzizer
    @Enerdzizer · 1 year ago

    Is it correct to multiply by sigma squared in the posterior formula? It seems we need to multiply by sigma only; otherwise we get the wrong scale: squared length instead of length.
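
    (For reference, the standard normal-normal posterior mean with known observation variance $\sigma^2$ is

        $$\mu_{\text{post}} = \sigma_{\text{post}}^2\left(\frac{\mu_0}{\sigma_0^2} + \frac{\sum_i x_i}{\sigma^2}\right), \qquad \sigma_{\text{post}}^2 = \left(\frac{1}{\sigma_0^2} + \frac{n}{\sigma^2}\right)^{-1}.$$

    Each bracketed term has units of 1/length, so multiplying by the variance $\sigma_{\text{post}}^2$ (units of length squared) yields a mean in units of length; multiplying by $\sigma_{\text{post}}$ alone would give the wrong scale.)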

  • @nickmillican22
    @nickmillican22 · 3 years ago

    I wonder if anybody can bring some "explore-exploit" thinking to this. Here, Thompson sampling arrives at the optimal solution provided that the 'environment' (restaurant quality) is constant. But what about a changing environment (say, restaurants occasionally going under new management)? In this case, it seems that time spent exploring should always remain higher than it would in a constant environment. Is there an analogous sampling routine for such a situation?

  • @nickmillican22

    @nickmillican22 · 3 years ago

    Been thinking about this; I may have a partial solution. Since, once sufficient data is available, the 'better' option might always outcompete the 'lesser' option, a change to the environment that makes the lesser option the better one will go undetected. So perhaps the goal is to increase the uncertainty in the posteriors in proportion to the number of future events. One way (I think) to do this would be to weight the data by something like [1 / total planned visits to any restaurant]. In this way, much of the 'uninformation' of the prior is maintained, permitting increased exploration. But even if this is okay, what do you do if you plan to visit restaurants infinitely many times?
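
    (This is close to an established idea: discounted Thompson sampling for non-stationary bandits, where old observations are geometrically down-weighted each round. A minimal sketch, assuming the same diffuse-prior Gaussian model as the video; the discount factor gamma and the prior precision are illustrative choices:)

        import numpy as np

        def discounted_ts_update(eff_n, eff_sum, arm, reward, gamma=0.99):
            # Decay every arm's effective statistics, then add the new reward.
            eff_n *= gamma
            eff_sum *= gamma
            eff_n[arm] += 1.0
            eff_sum[arm] += reward
            # Diffuse-prior Gaussian posterior with observation noise sigma = 1.
            post_var = 1.0 / (1e-4 + eff_n)   # 1e-4 = weak prior precision
            post_mean = post_var * eff_sum
            return post_mean, post_var

    Because gamma < 1 caps the effective count at 1/(1 - gamma), the posterior variance stays bounded away from zero and exploration never dies out, even over infinitely many visits.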

  • @charlessimmons3709
    @charlessimmons3709 · 3 years ago

    Do you have the article you mentioned (4:42) with the table of prior/posterior distributions?

  • @constantin1481

    @constantin1481 · 3 years ago

    He was possibly referring to this paper: statweb.stanford.edu/~cgates/PERSI/papers/conjprior.pdf

  • @ritvikmath

    @ritvikmath · 3 years ago

    Just linked! Sorry about that.

  • @charlessimmons3709

    @charlessimmons3709 · 3 years ago

    @@ritvikmath Thanks!

  • @user-or7ji5hv8y
    @user-or7ji5hv8y · 3 years ago

    Actually, you had the board covered for the entire video, so I couldn't take an unobstructed photo this time.

  • @ritvikmath

    @ritvikmath · 3 years ago

    Sorry! Will try to remember that

  • @MsKisshello
    @MsKisshello · 2 years ago

    Do you know Top-Two Thompson Sampling?

  • @karansaxena976
    @karansaxena976 · 3 years ago

    Ritvik, you are cool.

  • @ritvikmath

    @ritvikmath · 3 years ago

    Wow thanks!

  • @MrTSkV
    @MrTSkV · 3 years ago

    At 8:47 you say "we sample from those posteriors." Do you mean "priors"?

  • @TonkatsuChickenTJ

    @TonkatsuChickenTJ · 2 years ago

    On the first visit, the posterior is equal to the prior.

  • @9okku
    @9okku · 2 years ago

    You can use this Python code:

        # Thompson Sampling with Gaussian rewards

        # Importing the libraries
        import numpy as np
        import matplotlib.pyplot as plt
        import pandas as pd

        # Importing the dataset
        dataset = pd.read_csv('Ads_CTR_Optimisation.csv')

        # Implementing Thompson Sampling
        N = 1000                                 # number of rounds
        d = 10                                   # number of ads (arms)
        ads_selected = []
        numbers_of_selections = np.zeros(d)      # Ni(n): times each ad was chosen
        prior_var = 1e2 * 1e2                    # diffuse prior: N(0, 100^2)
        variance_posterior = [prior_var] * d
        mean_posterior = [0.0] * d
        sum_sample = [0.0] * d                   # running sum of rewards per ad

        for n in range(N):
            # Sample one value from each ad's posterior; play the argmax.
            # Note: np.random.normal expects a standard deviation, hence the sqrt.
            ad = 0
            max_sample = -float('inf')
            for i in range(d):
                sample = np.random.normal(mean_posterior[i],
                                          np.sqrt(variance_posterior[i]))
                if sample > max_sample:
                    max_sample = sample
                    ad = i
            ads_selected.append(ad)
            numbers_of_selections[ad] += 1
            reward = dataset.values[n, ad]
            sum_sample[ad] += reward
            # Conjugate normal-normal update (observation noise sigma = 1).
            variance_posterior[ad] = 1 / (1 / prior_var + numbers_of_selections[ad])
            mean_posterior[ad] = variance_posterior[ad] * sum_sample[ad]

        # Visualising the results - Histogram
        plt.hist(ads_selected)
        plt.title('Histogram of ads selections')
        plt.xlabel('Ads')
        plt.ylabel('Number of times each ad was selected')
        plt.show()
        print(numbers_of_selections[4] / sum(numbers_of_selections) * 100)