How to Detect and Remove Outliers in the Data | Python

⭐️ Content Description ⭐️
In this video, I explain how to detect and remove outliers in a dataset using Python. Removing outliers is a useful step in data cleaning and preprocessing. The methods covered are z-score, interquartile range (IQR), and percentile.
Text-based Tutorial: www.hackersrealm.net/post/det...
GitHub Code Repo: bit.ly/datascienceconcepts
🌐 Website: www.hackersrealm.net
🔔 Subscribe: bit.ly/hackersrealm
🗓️ 1:1 Consultation with Me: calendly.com/hackersrealm/con...
📷 Instagram: / aswintechguy
🔣 Linkedin: / aswintechguy
🎯 GitHub: github.com/aswintechguy
⚡️ Data Structures & Algorithms tutorial playlist: bit.ly/dsatutorial
😎 Hackerrank problem solving solutions playlist: bit.ly/hackerrankplaylist
🤖 ML projects tutorial playlist: bit.ly/mlprojectsplaylist
🐍 Python tutorial playlist: bit.ly/python3playlist
💻 Machine learning concepts playlist: bit.ly/mlconcepts
✍🏼 NLP concepts playlist: bit.ly/nlpconcepts
🕸️ Web scraping tutorial playlist: bit.ly/webscrapingplaylist
Make a small donation to support the channel 🙏🙏🙏:-
🆙 UPI ID: hackersrealm@apl
💲 PayPal: paypal.me/hackersrealm
🕒 Timeline
00:00 Introduction to Detection of Outliers
02:01 Z-score Method
12:00 Inter Quartile Range Method
17:12 Percentile Method
#detectoutliers #mlconcepts #hackersrealm #removeoutliers #anomalydetection #deeplearning #machinelearning #datascience #model #project #artificialintelligence #beginner #analysis #python #tutorial #aswin #ai #dataanalytics #data #bigdata #programming #datascientist #technology #coding #datavisualization #computerscience #pythonprogramming #analytics #tech #dataanalysis #programmer #statistics #developer #ml #coder #dataanalyst
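
As a rough illustration of the three methods named in the timeline, here is a minimal NumPy sketch. It is not the code from the video: the sample `values`, the function names, and the cut-offs (3 standard deviations, 1.5 x IQR, 1st and 99th percentiles) are assumptions chosen for the example.

```python
import numpy as np

# Hypothetical sample: the values 10..19 twice, plus one extreme reading.
values = np.array(list(range(10, 20)) * 2 + [100], dtype=float)


def zscore_bounds(x, threshold=3.0):
    """Bounds at mean +/- threshold * standard deviation."""
    mean, std = x.mean(), x.std()
    return mean - threshold * std, mean + threshold * std


def iqr_bounds(x, k=1.5):
    """Bounds at Q1 - k * IQR and Q3 + k * IQR."""
    q1, q3 = np.percentile(x, [25, 75])
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def percentile_bounds(x, lower_pct=1, upper_pct=99):
    """Bounds at the chosen lower and upper percentiles."""
    lower, upper = np.percentile(x, [lower_pct, upper_pct])
    return lower, upper


def outliers(x, bounds):
    """Return the values that fall outside (lower, upper)."""
    lower, upper = bounds
    return x[(x < lower) | (x > upper)]


for name, bounds in [
    ("z-score", zscore_bounds(values)),
    ("IQR", iqr_bounds(values)),
    ("percentile", percentile_bounds(values)),
]:
    print(name, bounds, outliers(values, bounds))
```

On this sample all three rules flag only the value 100. On real data they can disagree: the z-score rule works best on roughly bell-shaped data, the IQR rule depends less on the extreme values themselves, and the percentile rule tends to flag a similar share of points regardless of how extreme they are.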

Comments: 52

  • @asadnaeem123
    1 day ago

    Amazing tutorial. Bro, you made my day. Lots of love from Pakistan.

  • @HackersRealm
    1 day ago

    Glad to hear that!!!

  • @pankajgoikar4158
    1 year ago

    You are amazing bro. I don't have words to thank you. You have cleared up many concepts for me. Lots of love from UK and god bless you. 😊

  • @HackersRealm
    1 year ago

    Thank you so much for your kind words ❤️

  • @grandson_f_phixis9480
    2 months ago

    Thank you very much sir!!

  • @negusuworku2375
    5 months ago

    This is very helpful. Excellent.

  • @HackersRealm
    5 months ago

    Glad you liked it!!!

  • @ocraking
    1 month ago

    what an amazing video

  • @insight_generator
    5 months ago

    This video helped me a lot. Thanks!

  • @HackersRealm
    5 months ago

    Glad it was helpful!!!

  • @DJnaidu22
    3 months ago

    Bro, I have a doubt, please explain briefly. These three techniques are used for trimming or capping outliers in the dataset. But why don't we use only z-score to find outliers? What's the difference between these three techniques?

  • @ArniFuentes
    1 day ago

    Thank you so much!!! A question: for what types of distribution can the box plot be used? For example, if the data follows a uniform distribution, does it make sense to look for outliers? What do you recommend?

  • @HackersRealm
    12 hours ago

    You can use a box plot to check for outliers in any distribution. If there are outliers, process them; if not, leave the data as it is.

  • @ArniFuentes
    11 hours ago

    @HackersRealm thanks for your answer

  • @sushmitarawat6438
    11 months ago

    Too good....and simple thanks a lot☺️🙏🏼

  • @HackersRealm
    11 months ago

    Glad you like it sushmita!!!

  • @sushmitarawat6438
    11 months ago

    @HackersRealm could you suggest a paid internship that I could start next month?

  • @HackersRealm
    11 months ago

    @sushmitarawat6438 For an ML-based internship, it's better to compete in hackathons or contests. You could check out hackerearth, techgig, etc., for that

  • @sushmitarawat6438
    11 months ago

    @HackersRealm ok

  • @debangshubarua5345
    1 year ago

    Good video... Do I need to check all the numeric columns one by one and perform the capping operation?

  • @HackersRealm
    1 year ago

    You can use a loop to do it for all numeric columns at once...

  • @massoudkadivar8758
    5 months ago

    Thank you so much. I have a question: do we need to do this process for each column one by one?

  • @HackersRealm
    5 months ago

    Yes, that's correct, you can use loops to automate this.

  • @DJnaidu22
    3 months ago

    really a great explanation

  • @HackersRealm
    3 months ago

    Glad you liked it!!!

  • @vietttt0104
    1 year ago

    Great tutorial!! Thanks a lot!! I have a question: how could I do this for the whole dataset, not just a single column?

  • @HackersRealm
    1 year ago

    You can iterate over the columns and process the whole dataset

  • @aniketlode4808
    1 year ago

    @HackersRealm So to iterate, we would use a for loop, passing each column name as i?

  • @HackersRealm
    1 year ago

    @aniketlode4808 yeah

  • @mohamads9759
    3 months ago

    Very Great.

  • @HackersRealm
    3 months ago

    Glad you liked it!!!

  • @titi-cu8dx
    6 months ago

    What about dealing with categorical columns in the context of outliers?

  • @HackersRealm
    6 months ago

    I don't think there will be outliers in categories

  • @adityachoudhari3596
    2 years ago

    Yo bro, I'm also learning AI and ML concepts. I just need to work on some project or get training in this. Please tell me if you can help.

  • @HackersRealm
    2 years ago

    Check the iris dataset analysis project in the playlist to start

  • @santoryuu989
    2 years ago

    What do you think is the best method out of these three?

  • @HackersRealm
    2 years ago

    You can use any method as it's producing similar results, but instead of deleting samples, trim it in the range
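
The reply above appears to recommend clipping values into the allowed range rather than deleting rows. A minimal pandas sketch of the two options follows; the column name `value`, the sample numbers, and the use of IQR fences are assumptions for illustration, not code from the video.

```python
import pandas as pd

# Hypothetical column; 250 stands in for an extreme reading.
df = pd.DataFrame({"value": [12, 14, 15, 15, 16, 17, 18, 19, 21, 250]})

# IQR fences for the column.
q1 = df["value"].quantile(0.25)
q3 = df["value"].quantile(0.75)
iqr = q3 - q1
lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr

# Option 1, trimming: drop the rows that fall outside the fences.
trimmed = df[df["value"].between(lower, upper)]

# Option 2, capping: keep every row and clip values to the fences.
capped = df.assign(value=df["value"].clip(lower=lower, upper=upper))

print(len(trimmed), len(capped), capped["value"].max())
```

Trimming shrinks the data set (9 rows here), while capping keeps all 10 rows and pulls 250 down to the upper fence (24.375 on this sample).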

  • @Serene__Soul98
    2 years ago

    Hi.. my dataset has 19 columns and at least 10 columns show outliers. So do I have to perform this process for every column each time?

  • @HackersRealm
    2 years ago

    Yes, it's better to do the process in a loop and fix it for better results

  • @avashchand9623
    2 years ago

    @HackersRealm Can you kindly show this process too? I've been searching for it everywhere and can't find it.

  • @HackersRealm
    2 years ago

    @avashchand9623 Which process are you referring to?

  • @aniketlode4808
    1 year ago

    @HackersRealm I think he is asking for the process of looping the columns

  • @nihalkausar2215
    2 months ago

    Please, after I have handled each column's outliers, how do I save it and which data frame should I continue using?
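
Several replies on this page suggest looping over the numeric columns. A minimal pandas sketch of that idea is below. The helper name `cap_outliers_iqr`, the sample frame, and the choice of IQR capping are assumptions for illustration, not code from the video.

```python
import pandas as pd


def cap_outliers_iqr(df, k=1.5):
    """Return a copy of df with each numeric column clipped to its IQR fences."""
    capped = df.copy()
    for col in capped.select_dtypes(include="number").columns:
        q1 = capped[col].quantile(0.25)
        q3 = capped[col].quantile(0.75)
        iqr = q3 - q1
        capped[col] = capped[col].clip(lower=q1 - k * iqr, upper=q3 + k * iqr)
    return capped


# Hypothetical wine-like frame; "label" is non-numeric and is left untouched.
df = pd.DataFrame({
    "residual_sugar": [1.9, 2.0, 2.1, 2.2, 2.3, 2.4, 2.5, 2.6, 15.5],
    "alcohol": [9.4, 9.5, 9.8, 10.0, 10.2, 10.5, 10.8, 11.0, 11.2],
    "label": list("aabbaabba"),
})

df_clean = cap_outliers_iqr(df)
print(df_clean)
```

Because the function works on a copy, `df` stays unchanged and `df_clean` is the frame to continue with. It could then be saved with something like `df_clean.to_csv("cleaned.csv", index=False)`, where the file name is only a placeholder.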

  • @ricesweat9951
    8 months ago

    Why did you decide to use residual sugar as the column to find outliers? Any tips and tricks on which columns should be used to find outliers within the dataset?

  • @HackersRealm
    8 months ago

    we can use boxplot or violinplot to find the outliers. You can see some dots outside the line which can be considered as outliers.
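
To decide which columns deserve attention without inspecting every plot, one option is to count the points that the usual 1.5 x IQR whisker rule would draw as dots. The sketch below assumes that rule; the helper name `count_boxplot_outliers` and the sample frame are hypothetical.

```python
import pandas as pd


def count_boxplot_outliers(df, k=1.5):
    """Count, per numeric column, the values beyond the k * IQR whiskers."""
    numeric = df.select_dtypes(include="number")
    q1 = numeric.quantile(0.25)
    q3 = numeric.quantile(0.75)
    iqr = q3 - q1
    # Compare every column against its own lower and upper fence.
    below = numeric.lt(q1 - k * iqr, axis="columns")
    above = numeric.gt(q3 + k * iqr, axis="columns")
    return (below | above).sum().sort_values(ascending=False)


# Hypothetical sample with two extreme residual_sugar readings.
df = pd.DataFrame({
    "residual_sugar": [1.9, 2.0, 2.1, 2.2, 2.3, 2.4, 2.5, 14.0, 15.5],
    "alcohol": [9.4, 9.5, 9.8, 10.0, 10.2, 10.5, 10.8, 11.0, 11.2],
})

counts = count_boxplot_outliers(df)
print(counts)
```

Columns at the top of `counts` are the ones whose box plots would show the most points outside the whiskers.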

  • @karthika8610
    1 year ago

    Which method is the most preferred?

  • @HackersRealm
    1 year ago

    It's not about preference, it depends on where and which use case you're trying to solve

  • @madhulikasuman2803
    3 months ago

    @HackersRealm What if 40% of the data are outliers?

  • @HackersRealm
    3 months ago

    @madhulikasuman2803 It depends on the nature of the data. You need to understand the domain and see why this is the case. We could apply a data transformation, such as a log transformation, to change it
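
A small NumPy sketch of the log transformation idea mentioned in the reply above: on right-skewed data, values flagged by the IQR rule on the raw scale may no longer be flagged after a log transform. The sample values and the helper name `count_iqr_outliers` are assumptions for illustration.

```python
import numpy as np


def count_iqr_outliers(x, k=1.5):
    """Count values outside Q1 - k * IQR and Q3 + k * IQR."""
    q1, q3 = np.percentile(x, [25, 75])
    iqr = q3 - q1
    return int(((x < q1 - k * iqr) | (x > q3 + k * iqr)).sum())


# Hypothetical right-skewed values that double at each step.
skewed = np.array([1, 2, 4, 8, 16, 32, 64, 128, 256, 512, 1024], dtype=float)

before = count_iqr_outliers(skewed)
after = count_iqr_outliers(np.log1p(skewed))  # log1p(x) = log(1 + x)
print(before, after)
```

Here the two largest values are flagged on the raw scale and none after the transform. Whether a transform is appropriate still depends on the domain, as the reply says.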

  • @Niyati_11
    7 months ago

    My df is empty while finding the outliers. Any idea why it is so?

  • @HackersRealm
    7 months ago

    In which cell did you face the issue?

  • @nihsacinan19
    10 months ago

    8:35 outliers=26
