Machine Learning Tutorial Python - 6: Dummy Variables & One Hot Encoding

Machine learning models work very well on datasets that contain only numbers. But how do we handle text information in a dataset? A simple approach is integer (label) encoding, but when the categorical variables are nominal, plain label encoding can be problematic because it implies an order among the categories that does not exist. One hot encoding is the technique that helps in this situation. In this tutorial, we will use the pandas get_dummies method to create dummy variables, which lets us perform one hot encoding on a given dataset. Alternatively, we can use OneHotEncoder from sklearn.preprocessing to create the dummy variables.
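
The get_dummies workflow described above can be sketched roughly as follows. This is a minimal illustration, not the tutorial's notebook: the rows are made-up toy values, the column names (town, area, price) are assumptions, and "west windsor" is an arbitrary choice for the dummy column that gets dropped.

```python
import pandas as pd
from sklearn.linear_model import LinearRegression

# Hypothetical toy data; the tutorial itself reads a CSV file instead.
df = pd.DataFrame({
    "town": ["monroe township", "monroe township", "west windsor",
             "west windsor", "robinsville", "robinsville"],
    "area": [2600, 3000, 2600, 3000, 2600, 3000],
    "price": [550000, 565000, 585000, 615000, 575000, 600000],
})

# One indicator column per town; dtype=int keeps the values as 0/1.
dummies = pd.get_dummies(df["town"], dtype=int)

# Append the dummy columns, then drop the original text column and one
# dummy column (here "west windsor") to avoid the dummy variable trap.
merged = pd.concat([df, dummies], axis="columns")
final = merged.drop(["town", "west windsor"], axis="columns")

X = final.drop("price", axis="columns")
y = final["price"]

model = LinearRegression()
model.fit(X, y)

# Query a 2800 sq ft home in robinsville.
# Column order is: area, monroe township, robinsville.
query = pd.DataFrame([[2800, 0, 1]], columns=X.columns)
predicted = model.predict(query)
print(predicted)
```

A row with zeros in both remaining dummy columns stands for the dropped town, which is why no information is lost by removing one column.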
#MachineLearning #PythonMachineLearning #MachineLearningTutorial #Python #PythonTutorial #PythonTraining #MachineLearningCource #OneHotEncoding #sklearntutorials #scikitlearntutorials
Code in tutorial: github.com/codebasics/py/blob...
Exercise csv file: github.com/codebasics/py/blob...
Exercise solution: github.com/codebasics/py/blob...
Topics that are covered in this Video:
0:00 Introduction
0:47 How to handle text data in machine learning model?
1:38 Nominal vs Ordinal Variables
2:44 Theory (Explain one hot encoding using home prices in different townships)
3:39 Coding (Start)
3:51 Pandas get_dummies method
7:48 Create a model that uses dummy columns
12:45 Label Encoder
13:29 fit_transform() method
15:40 sklearn OneHotEncoder
19:59 Exercise (To predict prices of car based on car model, age, mileage)
Do you want to learn technology from me? Check codebasics.io/?... for my affordable video courses.
Next Video:
Machine Learning Tutorial Python - 7: Training and Testing Data: • Machine Learning Tutor...
Popular Playlists:
Data Science Full Course: • Data Science Full Cour...
Data Science Project: • Machine Learning & Dat...
Machine learning tutorials: • Machine Learning Tutor...
Pandas: • Python Pandas Tutorial...
matplotlib: • Matplotlib Tutorial 1 ...
Python: • Why Should You Learn P...
Jupyter Notebook: • What is Jupyter Notebo...
To download csv and code for all tutorials: go to github.com/codebasics/py, click on a green button to clone or download the entire repository and then go to relevant folder to get access to that specific file.
Tools and Libraries:
Scikit learn tutorials
Sklearn tutorials
Machine learning with scikit learn tutorials
Machine learning with sklearn tutorials
🌎 My Website For Video Courses: codebasics.io/?...
Need help building software or data analytics and AI solutions? My company www.atliq.com/ can help. Click on the Contact button on that website.
#️⃣ Social Media #️⃣
🔗 Discord: / discord
📸 Dhaval's Personal Instagram: / dhavalsays
📸 Codebasics Instagram: / codebasicshub
🔊 Facebook: / codebasicshub
📱 Twitter: / codebasicshub
📝 Linkedin (Personal): / dhavalsays
📝 Linkedin (Codebasics): / codebasics
🔗 Patreon: www.patreon.com/codebasics?fa...

Comments: 659

  • @codebasics
    @codebasics 2 years ago

    Check out our premium machine learning course with 2 Industry projects: codebasics.io/courses/machine-learning-for-data-science-beginners-to-advanced

  • @celestineokpataku
    @celestineokpataku 4 years ago

    I have watched only 4 mins so far and I had to pause and write this comment. I will say this is one of the best tutorials I have seen in data science. Sir, you need to take this to another level. What a great teacher you are!

  • @codebasics

    @codebasics

    4 years ago

    Thanks for the feedback my friend 😊👍

  • @vaishalibisht518
    @vaishalibisht518 5 years ago

    Wonderful video. This is so far the easiest explanation I have seen for one hot encoding. I have been struggling for a very long time to find a proper video on this topic and my quest ended today. Thanks a lot, sir.

  • @TheSignatureGuy
    @TheSignatureGuy 4 years ago

    For anyone stuck with the categorical_features error, use ColumnTransformer:

        from sklearn.compose import ColumnTransformer
        from sklearn.preprocessing import OneHotEncoder

        ct = ColumnTransformer([("town", OneHotEncoder(), [0])], remainder='passthrough')
        X = ct.fit_transform(X)
        X

    Then you should be able to continue the tutorial without further issue.

  • @muhammadhattahakimkeren

    @muhammadhattahakimkeren

    4 years ago

    thanks bro

  • @fatimahazzahra6181

    @fatimahazzahra6181

    4 years ago

    thanks a lot! it helps

  • @souvikdas3189

    @souvikdas3189

    11 months ago

    Thank you brother.

  • @User_2337random

    @User_2337random

    10 months ago

    Hey, thanks for the code. I tried using it, but it gives me an error even after converting X to an array: "TypeError: A sparse matrix was passed, but dense data is required. Use X.toarray() to convert to a dense numpy array."
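
One way to make the result dense regardless of what the transformer returns is sketched below. It assumes toy data with the town name in column 0 and the area in column 1; the variable name X_encoded is made up for illustration. ColumnTransformer can hand back either a SciPy sparse matrix or a NumPy array depending on how many zeros the output contains, so the sketch checks before converting.

```python
import numpy as np
import pandas as pd
from scipy import sparse
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder

# Hypothetical data: column 0 is the town name, column 1 is the area.
X = pd.DataFrame({
    "town": ["monroe township", "west windsor", "robinsville", "west windsor"],
    "area": [2600, 3000, 2600, 3200],
}).values

ct = ColumnTransformer([("town", OneHotEncoder(), [0])], remainder="passthrough")
X_encoded = ct.fit_transform(X)

# Convert only when a sparse matrix actually came back.
if sparse.issparse(X_encoded):
    X_encoded = X_encoded.toarray()

# Force a plain float array so downstream estimators get dense numeric data.
X_encoded = np.asarray(X_encoded, dtype=float)
print(type(X_encoded), X_encoded.shape)
```

Passing sparse_threshold=0 to ColumnTransformer is another option if a dense result is always wanted.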

  • @TheSignatureGuy

    @TheSignatureGuy

    10 months ago

    @@User_2337random I know you said "despite converting X to an array", but just double-check that you have used the .toarray() method correctly. The error message seems pretty clear on this one. This function may help confirm that a dense NumPy array is being passed:

        import numpy as np

        def is_dense(matrix):
            return isinstance(matrix, np.ndarray)

    Pass in X for matrix and it should return True. Good luck fixing this.

  • @codebasics
    @codebasics 4 years ago

    Exercise solution: github.com/codebasics/py/blob/master/ML/5_one_hot_encoding/Exercise/exercise_one_hot_encoding.ipynb Everyone, the error with categorical_features is fixed. Check the new notebook on my GitHub (link in the video description). Thanks to Kush Verma for sending me a pull request with the fix.

  • @urveshdave1861

    @urveshdave1861

    4 years ago

    Thank you for the wonderful explanation sir. However I am getting an error as __init__() got an unexpected keyword argument 'catergorical_features' for the line for my code onehotencoder = OneHotEncoder(catergorical_features = [0]). Is it because of change of versions? what is the solution to this?

  • @bishwarupdey10

    @bishwarupdey10

    4 years ago

    _init__() got an unexpected keyword argument 'categorical_features' sir I get this error when I specify categorical features

  • @sejalmittal1326

    @sejalmittal1326

    4 years ago

    @@urveshdave1861 Have you got any answer for this? I am having the same error

  • @sejalmittal1326

    @sejalmittal1326

    4 years ago

    @@urveshdave1861 okay .. i will do that. thanks

  • @tanvisingh9298

    @tanvisingh9298

    4 years ago

    @@urveshdave1861 Hey I am also getting the same error. how did you resolve it?

  • @jhagaurav8292
    @jhagaurav8292 5 years ago

    Sir pls continue your machine learning tutorials ,yours tutorials are one of the best I have seen so far .

  • @codebasics

    @codebasics

    5 years ago

    sure Gaurav, I just started deep learning series. check it out

  • @samrahafeez5001

    @samrahafeez5001

    3 years ago

    @@codebasics Kindly explain the concept of dummies in deep learning as well

  • @snom3ad
    @snom3ad 5 years ago

    This was really well done! Kudos to you! It's hard to find clear and concise free tutorials nowadays. Subscribed and hope to see more awesome stuff!

  • @tech-n-data
    @tech-n-data 1 year ago

    Your ability to simplify things is amazing, thank you so much. You are a natural teacher.

  • @noubaddi8567
    @noubaddi8567 3 years ago

    This guy is AMAZING! I have spent 2 days trying dozens of other methods and this is the only one that worked for my data and didn't come back as an error. This guy totally saved my sanity; I was growing desperate, as in DESPERATE! Thank you, thank you, thank you!

  • @codebasics

    @codebasics

    3 years ago

    I am glad it was helpful to you 🙂👍

  • @venkatesanrf
    @venkatesanrf 3 years ago

    Hi, your explanation is very simple and effective. Answers for the practice session:
    A) Price of Mercedes Benz, 4 yr old, mileage 45000 = 36991.31721061
    B) Price of BMW X5, 7 yr old, mileage 86000 = 11080.74313219
    C) Accuracy = 0.9417050937281082 (94 percent)

  • @ANIMESH_JAIN04

    @ANIMESH_JAIN04

    1 month ago

    Same bro

  • @fathoniam8997

    @fathoniam8997

    28 days ago

    same bro.... thx for replying so that i can check my results

  • @tanmaykapure81
    @tanmaykapure81 2 years ago

    This is the best machine learning playlist i have came across on youtube😃👍, Hats off to you sir.

  • @tushargahtori1570
    @tushargahtori1570 1 year ago

    Even in 23 your video is such a relief..kudos to your teaching.

  • @mk9834
    @mk9834 4 years ago

    I was shocked after the first 5 minutes of the video and have never thought it would be so easy and fast! Thanks ALOT1

  • @codebasics

    @codebasics

    4 years ago

    Miyuki... I am glad you liked it

  • @timse699
    @timse699 2 years ago

    You teach with passion! thank you for the series!

  • @ZehraKhuwaja65
    @ZehraKhuwaja65 9 months ago

    I must say this is the best course I've come across so far.

  • @sreenufriendz
    @sreenufriendz 5 years ago

    Anyone can be a teacher , but real teacher eliminates the fear from students .. you did the same !! Excellent knowledge and skills

  • @codebasics

    @codebasics

    5 years ago

    Sreenivasulu, your comment means a lot to me, thanks 😊

  • @shrutijain1628
    @shrutijain1628 3 years ago

    this ML tutorial is by far the best one i have seen it is so easy to learn and understand and your exersise also helps me to apply what i have learn so far thank you.

  • @codebasics

    @codebasics

    3 years ago

    Glad it helped!

  • @elinem5311
    @elinem5311 3 years ago

    thank you, this helped me so much with multivariate regression with many categorical features!

  • @Genz111-o4r
    @Genz111-o4r 3 years ago

    I was confuse from where to start studying ml and then my friend suggested this series.... It's great :-)

  • @rishabhjain7572

    @rishabhjain7572

    3 years ago

    any other courses or source you are following? and any development you have begun ?

  • @sauravmaurya6097

    @sauravmaurya6097

    2 years ago

    want to know how much this playlist is helpful? kindly reply.

  • @carti8778

    @carti8778

    2 years ago

    @@sauravmaurya6097 its quite helpful if u are a beginner. Beginner in sense of {not from engineering or programming background }. U can accompany this with coursera’s andrew ng course.

  • @carti8778

    @carti8778

    2 years ago

    @@sauravmaurya6097 if u already know calculus and python programming (intermediate level) , ML would feel easy . After doing this go to the deep learning series bcz thats what used in industries.

  • @AruLcomments
    @AruLcomments 4 years ago

    You are doing a wonderful job, people like you inspire me to learn and share the knowledge i gain. It is very useful for me. All the best.

  • @sarafatima2252
    @sarafatima2252 3 years ago

    definitely one of the best videos to learn from!

  • @ZOSELY
    @ZOSELY 11 months ago

    I wish I could give this videos 2 thumbs up! Great explanation of all the steps in one-hot encoding! Thank you!!

  • @world2blogs
    @world2blogs 5 months ago

    You are the best teacher on YouTube, better than any I have seen before.

  • @scriptfox614
    @scriptfox614 4 years ago

    The import linear regression statement lol. Amazing tutorial. :D

  • @programmingwithraahim
    @programmingwithraahim 2 years ago

    15:50 Write your code like this:

        ct = ColumnTransformer(
            [('one_hot_encoder', OneHotEncoder(categories='auto'), [0])],
            remainder='passthrough'
        )
        X = ct.fit_transform(X)
        X

    This works fine; otherwise it will give an error.

  • @AxelWolf26

    @AxelWolf26

    2 years ago

    what is the use of this " (categories='auto') " and " 'one_hot_encoder' "

  • @jollycolours

    @jollycolours

    2 years ago

    Thank you, you're a lifesaver! I was trying multiple ways since categorical_features has now been deprecated.

  • @adilmajeed8439

    @adilmajeed8439

    2 years ago

    @@jollycolours Correct, the categorical_features parameter is deprecated, so these are the steps to follow:

        from sklearn.compose import ColumnTransformer
        ct = ColumnTransformer([('one_hot_encoder', OneHotEncoder(), [0])], remainder='passthrough')
        X = np.array(ct.fit_transform(X), dtype=float)

  • @asamadawais
    @asamadawais 2 years ago

    Simply excellent explanation with very simple examples!

  • @omharne1386
    @omharne1386 1 year ago

    I will say this is one of the best tutorial i have seen in ML

  • @debaratighatak2211
    @debaratighatak2211 2 years ago

    I learned a lot from the exercise that you gave at the end of the video, thank you so much sir!

  • @wangangcwayi9420
    @wangangcwayi9420 4 years ago

    You have gift of explaining things even to the layman. Big Up to you

  • @codebasics

    @codebasics

    4 years ago

    Thanks a ton Wangs for your kind words of appreciation.

  • @thanusan
    @thanusan 5 years ago

    Excellent video - thank you!

  • @ankitparashar7
    @ankitparashar7 4 years ago

    Merc: 36991.317 BMW: 11080.743 Score: 94.17%

  • @codebasics

    @codebasics

    4 years ago

    Your answer is perfect Ankit. Good job, here is my answer sheet for comparison: github.com/codebasics/py/blob/master/ML/5_one_hot_encoding/Exercise/exercise_one_hot_encoding.ipynb

  • @vishalrai2859

    @vishalrai2859

    4 years ago

    thanks for posting the answer bro

  • @mutiulmuhaimin9156

    @mutiulmuhaimin9156

    4 years ago

    Could we upvote this comment to the top? Been looking for this for quite some time now. This is important, and this comment matters.

  • @Augustus1003

    @Augustus1003

    4 years ago

    @@codebasics I used pandas dummy variable instead of using onehotencoding, because it is too confusing.

  • @clashcosmos4641

    @clashcosmos4641

    3 years ago

    Got the same answer using OneHotEncoder after correcting tons of errors and watching videos over and over.

  • @rooshanghous6912
    @rooshanghous6912 8 months ago

    This is an amazing tutorial! saved me so much time and brought so much clarity!!! Thank you!

  • @shekharbabar2496
    @shekharbabar2496 4 years ago

    the best video series on ML sir ....Thank you very much sir....

  • @claude-olivierbatungwanayo9059
    @claude-olivierbatungwanayo9059 5 years ago

    Excellent as usual!

  • @mapa5000
    @mapa5000 1 year ago

    You make it easy with your explanation !! Thank you !!

  • @srinivasreddy1709
    @srinivasreddy1709 4 years ago

    Hi Dhaval, your explanation on all the topics is crystal clear. Can you please make videos on NLP also

  • @lyejiajun
    @lyejiajun 5 years ago

    @codebasics, I am applying this tutorial's approach in my own project, but I have run into an issue: I cannot call drop(["Feature1", "Feature1_None", ...]) on a numpy.ndarray. I can't follow your example because I don't know which columns I should drop. Any advice? I am using the KNN algorithm; does it automatically drop one column for me?

  • @vishwa4908
    @vishwa4908 4 years ago

    Awesome, you're explaining concepts in very simple manner.

  • @codebasics

    @codebasics

    4 years ago

    Vishwa I am happy to help 👍

  • @rajbir_singh0517
    @rajbir_singh0517 5 years ago

    Sir I have read the onehotencoder documentation , it says it has drop function which can be set and it will drop Column to prevent from dummy variable trap. Can you please provide some insight on it and in which version it is available
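
A small sketch of the drop option asked about here, under the assumption that the installed scikit-learn is recent enough to support it (to the best of my knowledge the parameter was introduced in version 0.21, but the documentation for your installed version is the authority). The town names are toy values.

```python
import numpy as np
from sklearn.preprocessing import OneHotEncoder

# Hypothetical single categorical column, shaped (n_samples, 1).
towns = np.array([
    ["monroe township"],
    ["west windsor"],
    ["robinsville"],
    ["west windsor"],
])

# drop="first" removes the first category (in sorted order) of each feature,
# so three towns produce two indicator columns instead of three.
encoder = OneHotEncoder(drop="first")
encoded = encoder.fit_transform(towns).toarray()

print(encoder.categories_)  # the sorted categories found during fit
print(encoded)
```

Here "monroe township" sorts first, so it becomes the baseline that is represented by a row of zeros.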

  • @ruSEXtreme
    @ruSEXtreme 5 years ago

    Would you get the same prediction results if you keep all of the dummy columns and in the first deep layer, remove the bias scalar (b)?

  • @manasaraju8552
    @manasaraju8552 1 year ago

    difficult topics are easily understood, Thank you so much for the content sir

  • @annette4718
    @annette4718 4 years ago

    This was ridiculously helpful. Thank you so much!!

  • @codebasics

    @codebasics

    4 years ago

    Netté, I am glad you liked it

  • @carlavirhuez4785
    @carlavirhuez4785 5 years ago

    Thank you! Very helpful!

  • @hiver6411
    @hiver6411 3 years ago

    the god of data science......Amazing explanation sir..kudos to your patience in explanation

  • @codebasics

    @codebasics

    3 years ago

    Glad it was helpful!

  • @flamboyantperson5936
    @flamboyantperson5936 6 years ago

    Please make regression video using preprocessing library with standaridization and normalization variables

  • @uvinodh90
    @uvinodh90 4 years ago

    Thanks for the excellent tutorial.... I see there is a decrease in score between this and the exercise data. Maybe due to an extra column in exercise data ? With increase in columns on X, Will the linearRegression score decrease ?

  • @bandhammanikanta1664
    @bandhammanikanta1664 4 years ago

    First of all, 1000*Thanks for sharing such content on youtube.. I got an accuracy of 94.17% on training data.

  • @codebasics

    @codebasics

    4 years ago

    Bandham, I am glad you liked it buddy 👍

  • @oshtontsen5428
    @oshtontsen5428 5 years ago

    Great, concise video!

  • @Adityasharma-zb7no
    @Adityasharma-zb7no 5 years ago

    maximum how much distinct categorical variable we can apply onehotencoding?

  • @Hamad2802
    @Hamad2802 4 years ago

    Sir, I have a problem with the last few lines of your code. When I have a greater number of town features, how do I write the code to predict the house price? I need some help.

  • @byl2263
    @byl2263 4 years ago

    Thanks for the video! I want to ask why it has to be converted into array after OHE? Apologies if it’s a fun question. I’m real new in Python. Thanks!

  • @isaackobbyanni4583
    @isaackobbyanni4583 3 years ago

    Thank you for this series. Such great help

  • @codebasics

    @codebasics

    3 years ago

    Glad it was helpful!

  • @chamangupta4624
    @chamangupta4624 3 years ago

    Beautiful explanation, very helpful

  • @jayasreecarey7843
    @jayasreecarey7843 1 year ago

    Many Thanks ! Great Explanation :)

  • @gokkulkumarvd9125
    @gokkulkumarvd9125 3 years ago

    How can I like this video more than 100 times!

  • @codebasics

    @codebasics

    3 years ago

    I am happy this was helpful to you.

  • @zigzag4273
    @zigzag4273 4 years ago

    If you don't mind me asking, when calculating the score do you pass the train set or the test set ?

  • @datasciencewithshreyas1806
    @datasciencewithshreyas1806 3 years ago

    One of the best explanation for Encoding 👌👍

  • @codebasics

    @codebasics

    3 years ago

    Glad it was helpful!

  • @dmitrikochubei3569
    @dmitrikochubei3569 4 years ago

    Thank you! great video!

  • @hildestromsvag
    @hildestromsvag 2 years ago

    thank you! For making it possible to understand:)

  • @shekharbabar2496
    @shekharbabar2496 4 years ago

    I have one query: when I plug in the same values from the dataframe for a car model, the predicted value differs a lot from the actual value for that car model. Why is this happening?

  • @SrinivasA-vk7if
    @SrinivasA-vk7if 29 days ago

    Excellent video.., thank you so much.

  • @leelavathigarigipati3887
    @leelavathigarigipati3887 4 years ago

    Thank you so much for the detailed step by step explanation.

  • @codebasics

    @codebasics

    4 years ago

    Glad it was helpful!

  • @RA-pi1lg
    @RA-pi1lg 5 years ago

    Thank you for great videos

  • @unamattina6023
    @unamattina6023 2 years ago

    so in dummy variable trap do we always have to keep 2 columns in our original dataset? let's say we have one another town named A_town and when we call get_dummies() and concat() methods we will have 4 more columns. and we need to drop some of the columns bc these 4 columns basically same and that will occur a problem for our model. which columns will be drop in these situation?

  • @ishraqhussain6938
    @ishraqhussain6938 3 years ago

    Sir, If we remove one of the dummy variable manually & fit the dataset to the LinearRegression class of sklearn.linear_model library then what the regressor will do? Can the regressor automatically detect that one of the dummy variable is already removed or the regressor removes one more dummy variable.

  • @maxb.w5170
    @maxb.w5170 2 years ago

    What would help inform a decision to drop one of the dummy variables? You mentioned the linear regression classifier will typically be able to handle and nonlinear interaction between most dummy variables. When should we drop one?

  • @giovannaluciagc
    @giovannaluciagc 1 year ago

    Thank you! it was really well explained

  • @muthaiyanthandapani7002
    @muthaiyanthandapani7002 3 years ago

    Hi Sir, any specific reason for first doing labelencoder before applying one hot encoder?. can we apply one hot encoder directly without doing label encoding?.

  • @swaruppanda2842
    @swaruppanda2842 5 years ago

    nicely explained👌

  • @kartikshrivastava6408
    @kartikshrivastava6408 4 years ago

    @codebasics I'm trying to draw a scatter plot in PyCharm using

        plt.scatter(df.Mileage,df.SellPrice($))

    but it gives me the error below:

        Traceback (most recent call last):
          return object.__getattribute__(self, name)
        AttributeError: 'DataFrame' object has no attribute 'Age'

    When I remove ($) it doesn't give the error and plots fine. Why is this happening? Same problem with Age(yrs). Should we remove (yrs) and ($) when plotting, as in

        plt.scatter(df.Mileage,df.SellPrice)
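
Attribute-style access such as df.Mileage only works when the column name is a valid Python identifier. A short sketch of the bracket-style alternative follows; the rows are made up, and the column names "Sell Price($)" and "Age(yrs)" are assumed to match the exercise file rather than taken from it.

```python
import pandas as pd

# Hypothetical rows; the column names are assumptions about the exercise CSV.
df = pd.DataFrame({
    "Mileage": [69000, 35000, 57000],
    "Sell Price($)": [18000, 34000, 26100],
    "Age(yrs)": [6, 3, 5],
})

# Works because "Mileage" is a valid Python identifier.
mileage = df.Mileage

# Names containing spaces, parentheses, or "$" need bracket access.
price = df["Sell Price($)"]
age = df["Age(yrs)"]

print(price.tolist(), age.tolist())
```

The selected Series can then be passed to plt.scatter in the usual way, so the columns do not have to be renamed just for plotting.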

  • @eliashossain9849
    @eliashossain9849 4 years ago

    Could you explain to me, why is the train_test_split not in this program?

  • @pythonocean7879
    @pythonocean7879 5 years ago

    perfection

  • @abhinavb717
    @abhinavb717 1 year ago

    I am getting 84% accuracy without encoding variable, but after encoding i am getting 94% accuracy on model. Thank you for your teaching. Doing great Job

  • @owonubijobsunday4764
    @owonubijobsunday4764 1 year ago

    ❤🎉🎉 Thank you. You earned a subscriber

  • @chintansavaliya
    @chintansavaliya 4 years ago

    I'm getting Error while get_dummies() function like this: TypeError: 'DataFrame' objects are mutable, thus they cannot be hashed

  • @ramanandr7562
    @ramanandr7562 1 year ago

    Thank you sir🎉. You made my ML Journey Better.. 🤩

  • @flamboyantperson5936
    @flamboyantperson5936 6 years ago

    Thank you. Please could you make more videos I understand your timing problem but still please:))

  • @pranavakailash8751
    @pranavakailash8751 3 years ago

    This helped me a lot in my assignment, thank you so much code basics

  • @codebasics

    @codebasics

    3 years ago

    Glad it helped!

  • @narendarreddy6075
    @narendarreddy6075 4 years ago

    Sir , how can we create dummy variables for numerical values (ex:: if there are 1,2 ,3 classes in a ship or train).
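
For a category stored as numbers, one possible approach is to name the column explicitly, since get_dummies leaves numeric columns alone unless told otherwise. The sketch below uses made-up data; the column names pclass and fare are hypothetical.

```python
import pandas as pd

# Hypothetical data: pclass is a category that happens to be stored as 1, 2, 3.
df = pd.DataFrame({
    "pclass": [1, 3, 2, 3],
    "fare": [71.3, 7.9, 13.0, 8.1],
})

# Listing the column in `columns` forces it to be treated as categorical.
encoded = pd.get_dummies(df, columns=["pclass"], dtype=int)

print(encoded.columns.tolist())
```

Each class becomes its own 0/1 column prefixed with the original column name, and the numeric fare column is left unchanged.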

  • @alimahmood4158
    @alimahmood4158 4 years ago

    hi there brother I have a lot of data related to the exercise with models engine type a lot of attributes and instances some of them are numerical data and some are categorical how can I apply one-hot encoding on that data please replay

  • @madmausiqui837
    @madmausiqui837 2 years ago

    The categorical_features parameter has been dropped by sklearn, so my program isn't running. Are there any changes to be made in the program in order to make it work?

  • @sanjanatarekar5942
    @sanjanatarekar5942 2 years ago

    Hi, Since OneHotEncoder's categorical_features has been deprecated... Can you please mention here how to proceed?

  • @ankitlakshya450
    @ankitlakshya450 2 years ago

    Hi if we want to do one hot coding for multiple categorical columns we need to give multiple index numbers ?
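
Encoding several categorical columns at once can be sketched by passing a list of column positions to ColumnTransformer. The data and the column names (car_model, fuel, mileage) below are invented for illustration, and sparse_threshold=0 is used only so that the result is always a dense array.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder

# Hypothetical data with two categorical columns (positions 0 and 1).
df = pd.DataFrame({
    "car_model": ["BMW X5", "Audi A5", "BMW X5", "Mercedes"],
    "fuel": ["petrol", "diesel", "diesel", "petrol"],
    "mileage": [69000, 35000, 57000, 22500],
})

ct = ColumnTransformer(
    [("categorical", OneHotEncoder(), [0, 1])],  # both index numbers in one list
    remainder="passthrough",                     # keep mileage as it is
    sparse_threshold=0,                          # always return a dense array
)
X = ct.fit_transform(df)

print(X.shape)
```

Three car models plus two fuel types plus the untouched mileage column give six output columns.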

  • @sarveshamrute4982
    @sarveshamrute4982 3 years ago

    sir if we have output variable as a multiclass categorial variable, then after applying label encoding, will priority effect matter. I dont think it will because there will be no effect in algorithm maths, as its an output variable.

  • @Laila_657
    @Laila_657 6 years ago

    Excellent!

  • @purnanandabaisnab2856
    @purnanandabaisnab2856 2 years ago

    nice teaching, really outstanding thanks a lot

  • @mohammadismailhashime5239
    @mohammadismailhashime5239 2 years ago

    Very nice explanation, appreciated

  • @yunusshaikh7478
    @yunusshaikh7478 1 year ago

    Hello sir. Thank you for such wonderful content. I am curious to know how we can plot the graph and analyze this data and our predictions.

  • @weshallneversurrender
    @weshallneversurrender 2 years ago

    The Data Science GOAT! One day I will send you a nice donation for all that you have contributed to my journey sir!

  • @armagaan007
    @armagaan007 5 years ago

    Wait wait... I don't see the point 😕 The first half of the video does the same thing as one hot encoding(the second half of video)but second half is more tedious and takes more steps Then why not use the pd.get_dummies instead of onehotencoding??? What's the advantage of using onehot?

  • @codebasics

    @codebasics

    5 years ago

    I personally like pd.get_dummies as it is convenient to use. I wanted to just show two different ways of doing same thing and there are some subtle differences between the two. Check this: stackoverflow.com/questions/36631163/pandas-get-dummies-vs-sklearns-onehotencoder-what-is-more-efficient

  • @armagaan007

    @armagaan007

    5 years ago

    @@codebasics thank you :]... btw you make grt videos

  • @6223086
    @6223086 5 years ago

    thank you so much it has helped me in my work

  • @codebasics

    @codebasics

    4 years ago

    Hey Eugene, I am glad to hear that it helped you in your work. Stay in touch for more videos and share our channel if you really find it worth.

  • @user-bp7go9gr2s
    @user-bp7go9gr2s 1 year ago

    can we use either one for encoding ? or is there a deciding factor of when to use one hot encoder and when to use dummy encoding ?

  • @prashantgajjar1431
    @prashantgajjar1431 3 years ago

    Hi, can we find out which integer values LabelEncoder assigns to the categories? Here only three categories are available, so the encoded values can be easily distinguished. What if there are 10 categories, and how do we know the exact encoded value for each category?
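
One way to inspect the mapping is through the fitted encoder's classes_ attribute, where the position of each label is the integer assigned to it. The sketch below uses toy town names; the variable names are made up.

```python
from sklearn.preprocessing import LabelEncoder

# Hypothetical category values.
towns = ["west windsor", "monroe township", "robinsville", "west windsor"]

le = LabelEncoder()
codes = le.fit_transform(towns)

# classes_ holds the labels in sorted order; the index of a label is its code.
mapping = {str(label): code for code, label in enumerate(le.classes_)}

# inverse_transform turns the integer codes back into the original labels.
original = le.inverse_transform(codes)

print(mapping)
print(list(original))
```

The same lookup works for any number of categories, so it scales to the 10-category case without manual inspection.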

  • @rajeshmandapati4407

    @rajeshmandapati4407

    2 years ago

    yes sir same question

  • @nomanshaikhali3355
    @nomanshaikhali3355 3 years ago

    What is the work done by label encoder just showing the values into a 2D array but never used??

  • @divya7520
    @divya7520 4 years ago

    Even if I don't perform dummy variable drop method for 'homeprices.csv', am getting same accuracy & predicted price as with Onehotencoder. Is it because the no.of dummy variables category is less (say 3 here)? Or this method will be effective for large no. of dummy variables?

  • @kartikjha5704
    @kartikjha5704 8 months ago

    @codebasics What if there is a new category in the test data set? How can we handle that? Please reply.
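
One option scikit-learn offers for unseen categories is handle_unknown="ignore" on OneHotEncoder, which encodes a category that was not present during fit as a row of zeros instead of raising an error. A minimal sketch with invented car names:

```python
from sklearn.preprocessing import OneHotEncoder

# Hypothetical training and test values; "Tesla Model 3" never appears in training.
train = [["Audi A5"], ["BMW X5"], ["Audi A5"]]
test = [["BMW X5"], ["Tesla Model 3"]]

encoder = OneHotEncoder(handle_unknown="ignore")
encoder.fit(train)

# The unseen category becomes all zeros rather than causing an exception.
encoded_test = encoder.transform(test).toarray()

print(encoded_test)
```

Whether an all-zero row is an acceptable stand-in depends on the model; it is a pragmatic default, not the only possible treatment.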

  • @abhishekrawat3544
    @abhishekrawat3544 3 years ago

    why did we drop and separate X and Y columns?cant we apply linera regression on the same dataset

  • @nastaran1010
    @nastaran1010 7 months ago

    Thank you so much, very clear

  • @nikhilrajput5030
    @nikhilrajput5030 4 years ago

    model is also predicting, When we give two categories as 1. That is not true because either car is mercedes or it is BMW X5. Please sir help me out how to solve this error?

  • @HashimAli-tz8fw
    @HashimAli-tz8fw 10 months ago

    I achieved the same result using a different method that doesn't require dropping columns or concatenating dataframes. This alternative approach can lead to cleaner and more efficient code:

        df = pd.get_dummies(df, columns=['CarModel'], drop_first=True)