Discussing All The Types Of Feature Transformation In Machine Learning

github: github.com/krishnaik06/Types-...

Comments: 63

  • @krishnaik06
    @krishnaik06 3 years ago

    Please take care, everyone.

  • @sarangtamrakar8723

    @sarangtamrakar8723

    3 years ago

    you too sir.....

  • @shivu.sonwane4429

    @shivu.sonwane4429

    3 years ago

    Yes, take care, you and your team. For people in home isolation 👇🏻 I've almost recovered from COVID in home isolation. I'm sharing what helped me recover in case it helps someone. • Steam at least 3 times a day • Plenty of fluids: water (preferably warm), lemonade, coconut water • Salt water gargles • Vitamin C supplement • Plenty of rest • Meditation for peace of mind • Balanced diet • Regain smell: smell ajwain, kapoor and cloves • Lie on your stomach periodically. Monitor oxygen every 2 hours. Seek medical assistance if it's 92 or below. Please add if I missed anything. Add ajwain and kapoor to the water while taking steam, and drink malvani kadha (tulsi, adrak, jaggery, lavang, black pepper, ajwain, gavti cha, dalchini).

  • @shivaragiman

    @shivaragiman

    3 years ago

    @shivu.sonwane4429 How can we monitor oxygen levels at home?

  • @sunilsharanappa7721

    @sunilsharanappa7721

    3 years ago

    Use a pulse oximeter: it measures the oxygen level (oxygen saturation). It's not very accurate, but it's good enough for home use.

  • @SALESENGLISH2020
    @SALESENGLISH2020 3 years ago

    Pray your team members recover quickly. India needs good teachers.

  • @teegnas
    @teegnas 3 years ago

    A very important video for reviewing all the important feature techniques in one go ... thanks for uploading!

  • @bhargavikoti4208
    @bhargavikoti4208 3 years ago

    As usual neatly explained..👍👍thank you for uploading 🙏

  • @mostafakhazaeipanah1085
    @mostafakhazaeipanah1085 2 years ago

    What a useful and informative video. Most ML courses focus on algorithms and forget the importance of data preparation.

  • @geianmarkdenorte9874
    @geianmarkdenorte9874 3 years ago

    I was looking for these, master Krish! Take care too.

  • @ayushsingh-qn8sb
    @ayushsingh-qn8sb 3 years ago

    If I have applied some encoding technique, do I have to scale the encoded features?

  • @dheerendrasinghbhadauria9798
    @dheerendrasinghbhadauria9798 3 years ago

    Krish bhai, please upload a PDF of summary notes along with each video.

  • @ajaykushwaha-je6mw
    @ajaykushwaha-je6mw 2 years ago

    Hi Krish, why are we not splitting our data into train and test sets before applying the transformation?

  • @umaanil3344
    @umaanil3344 3 years ago

    Sir, what about the 'df_scaled' variable? I am getting an error at that point saying that df_scaled is not defined. Can you please explain?

  • @tanujajoshi1901
    @tanujajoshi1901 3 years ago

    Hey Krish, can you explain Generative Adversarial Networks (GANs), especially the coding part, for a dataset other than an image dataset? It would be of great help.

  • @priyayadav3990
    @priyayadav3990 3 years ago

    In transformation, we transform the distribution into a normal distribution. Then, after the transformation, we also need to perform standardisation (scale down). Please tell me if I am wrong.

  • @nagrajkaranth123
    @nagrajkaranth123 3 years ago

    Sir, Sudhanshu sir tested positive? My god, I hope he gets well soon.

  • @shivaragiman
    @shivaragiman 3 years ago

    Get well soon, you people need more to us 👍👍👍👍👍

  • @yashpandey5484
    @yashpandey5484 2 years ago

    Sir, is scaling required after performing a log transformation?

  • @SomeoneElsesSomeoneElse
    @SomeoneElsesSomeoneElse 2 years ago

    With respect to StandardScaler(): if you split the dataset before scaling the features, don't you risk having skewed features? Put differently, if you train your model to learn that values of 1 get a certain weight, and in your test set the data isn't standardized around the same mean as the train set, then the model will invariably have worse accuracy unless the train set and test set features have the same mean, right? Shouldn't the test set be samples of the full dataset, removed only to serve as an "out-of-sample" test, rather than a separate dataset?

  • @sandipansarkar9211
    @sandipansarkar9211 3 years ago

    great explanation

  • @Sivaramakrishnanv7
    @Sivaramakrishnanv7 3 years ago

    Under the Join button, I can see a (6 months: ₹283.20) plan. You have not mentioned this plan in the join video. Can you please explain it here, sir?

  • @kiyotube222
    @kiyotube222 3 years ago

    Get well soon Sudh!!

  • @captainmustard1
    @captainmustard1 a year ago

    thank you sir, it is just an amazing video!!

  • @poojapatil7128
    @poojapatil7128 2 years ago

    I have completed my 1-year post-graduation program in data science from a leading institute, but the various techniques I learned from your videos for free were not even mentioned in the curriculum. Thank you for your easy and detailed explanation.

  • @abhishek_dataman6348
    @abhishek_dataman6348 3 years ago

    Do we need to check these transformation techniques in all binary classification problems?

  • @wahabali828
    @wahabali828 a year ago

    thank you very much sir

  • @write2ruby
    @write2ruby 2 years ago

    Very Informative

  • @MdMahmudulHasanSuzan--
    @MdMahmudulHasanSuzan-- 2 years ago

    How can I perform scaling on k-fold data?

  • @pseudounknow5559
    @pseudounknow5559 3 years ago

    Greetings from Poland

  • @cherubyGreens

    @cherubyGreens

    3 years ago

    Thanks mate!

  • @vidulakamat6564
    @vidulakamat6564 3 years ago

    While doing the transformation, do we need to transform both numerical and categorical (encoded) features or only numerical ones? If target is continuous, do we need to transform that as well?

  • @sunilsharanappa7721

    @sunilsharanappa7721

    3 years ago

    No, you shouldn't scale categorical data. If the feature is categorical, each value has a separate meaning, so normalizing will turn these features into something different. There are several ways to deal with categorical data: a) Integer encoding: each unique label is mapped to an integer. b) One-hot encoding: each label is mapped to a binary vector. c) Learned embedding: a distributed representation of the categories is learned. If the target is continuous, then yes, you do need to scale the target variable if it has a large spread of values. --Sunil Sharanappa

  • @vidulakamat6564

    @vidulakamat6564

    3 years ago

    @sunilsharanappa7721 thank you

  • @imtiazali-xu8gw
    @imtiazali-xu8gw a month ago

    Sir, please make a video on the Box-Cox transformation.

  • @prakashkafle454
    @prakashkafle454 3 years ago

    I pray for your team's speedy recovery, Krish. We are also getting worse news day by day here in Nepal ...

  • @ishantyagi2701
    @ishantyagi2701 2 years ago

    Should standardization be applied to the whole dataset, or after we split it into train and test data?

  • @Craeson1

    @Craeson1

    a year ago

    It is generally best to fit the standardization on the training set only, and then apply that same fitted scaling to the test set. This is because the test set should represent unseen data, and you want the evaluation on the test set to be as close as possible to how the model would perform on new, unseen data. Fitting the standardization on the entire dataset before splitting it into training and test sets could result in information leakage, as statistics from the test set would influence training.
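
A minimal sketch of the "fit on train, apply to test" idea from the reply above, using only the Python standard library. The function names and the age values are hypothetical and are not taken from the video's notebook. scikit-learn's StandardScaler follows the same pattern through its fit and transform methods.

```python
from statistics import fmean, pstdev


def fit_standardizer(train_values):
    """Learn the mean and population standard deviation from training data only."""
    mean = fmean(train_values)
    std = pstdev(train_values)
    if std == 0:
        std = 1.0  # avoid division by zero for a constant feature
    return mean, std


def standardize(values, mean, std):
    """Apply previously learned statistics; nothing is re-estimated here."""
    return [(v - mean) / std for v in values]


# Hypothetical, already-split values of a single feature.
train_age = [22.0, 35.0, 58.0, 41.0]
test_age = [30.0, 64.0]

mean, std = fit_standardizer(train_age)             # statistics come from train only
train_scaled = standardize(train_age, mean, std)
test_scaled = standardize(test_age, mean, std)      # reuse the train statistics

print(fmean(train_scaled))  # close to 0 by construction
print(fmean(test_scaled))   # generally not 0, which is expected
```

The scaled test values will usually not have a mean of exactly zero. That is expected: the test set is treated like new data, which is scaled with whatever statistics were learned during training.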

  • @alihaiderabdi9939
    @alihaiderabdi9939 3 years ago

    Praying for the employees of iNeuron; inshallah everyone will get well soon.

  • @ashiqhussainkumar1391
    @ashiqhussainkumar1391 3 years ago

    To be honest, I don't prefer any lecture series except NPTEL. But after seeing 20-25 of your videos, I personally feel this channel is a better resource for practical implementation of ML... Initially I didn't subscribe because I felt your profile looked young and you might not know the material the way you teach it 😁😁😁... Subscribed. Thanks to you and to NPTEL.

  • @mosart03
    @mosart03 3 years ago

    Are we supposed to scale categorical features along with continuous features?

  • @sunilsharanappa7721

    @sunilsharanappa7721

    3 years ago

    No, you shouldn't scale categorical data. If the feature is categorical, each value has a separate meaning, so normalizing will turn these features into something different. There are several ways to deal with categorical data: a) Integer encoding: each unique label is mapped to an integer. b) One-hot encoding: each label is mapped to a binary vector. c) Learned embedding: a distributed representation of the categories is learned. --Sunil Sharanappa
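
To make options (a) and (b) from the reply above concrete, here is a small standard-library sketch. The helper names and the `embarked` values are hypothetical illustrations, not code from the video. In practice a library encoder would normally be used instead of hand-written helpers.

```python
def integer_encode(labels):
    """Map each unique label to an integer (categories sorted for a stable order)."""
    categories = sorted(set(labels))
    mapping = {category: index for index, category in enumerate(categories)}
    return [mapping[label] for label in labels], mapping


def one_hot_encode(labels):
    """Map each label to a binary vector with a single 1 at its category's position."""
    categories = sorted(set(labels))
    vectors = [[1 if label == category else 0 for category in categories]
               for label in labels]
    return vectors, categories


# Hypothetical categorical column.
embarked = ["S", "C", "Q", "S"]

codes, mapping = integer_encode(embarked)
vectors, categories = one_hot_encode(embarked)

print(mapping)     # {'C': 0, 'Q': 1, 'S': 2}
print(codes)       # [2, 0, 1, 2]
print(categories)  # ['C', 'Q', 'S']
print(vectors)     # [[0, 0, 1], [1, 0, 0], [0, 1, 0], [0, 0, 1]]
```

Integer codes impose an ordering (here C < Q < S) that may not mean anything, which is one reason one-hot vectors are often used for unordered categories.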

  • @mdadilhussain2967
    @mdadilhussain2967 3 years ago

    I guess you should first do fit_transform and then train_test_split. If you split first, the mean is calculated from the train data and then applied to the test data, so the test data won't have a mean of zero. Please clear up this doubt.

  • @fintech5816

    @fintech5816

    2 years ago

    Hi Adil, did you find the answer to your question? If yes, please share.

  • @70ME3E

    @70ME3E

    8 months ago

    from an SO answer: "Normalization across instances should be done after splitting the data between training and test set, using only the data from the training set. This is because the test set plays the role of fresh unseen data, so it's not supposed to be accessible at the training stage. Using any information coming from the test set before or during training is a potential bias in the evaluation of the performance."

  • @sarthakphatate4595
    @sarthakphatate4595 3 years ago

    good

  • @mayurgupta4004
    @mayurgupta4004 2 years ago

    When we use a Gaussian transformation, does it convert our distribution to a Gaussian distribution where mean = median, or to a standard Gaussian distribution where mean = 0 and variance = 1?

  • @sandipansarkar9211
    @sandipansarkar9211 2 years ago

    finished watching

  • @shubhamkondekar5382
    @shubhamkondekar5382 3 years ago

    Krish Naik is best

  • @satviksaxena3868
    @satviksaxena3868 3 years ago

    Hope the team will recover soon, Take Care !!

  • @ashutoshtiwari5222
    @ashutoshtiwari5222 3 years ago

    Sir, please take care of yourself. 🥺😢

  • @venkatraaman4509
    @venkatraaman4509 3 years ago

    Hi, for example, I have features for age, height, and weight, and I want to apply Gaussian transformations. In my case, the logarithmic transformation gives a good fit for age and the reciprocal transformation gives a good fit for height. My question: may I use both features (age with the log transformation and height with the reciprocal transformation) in my training data? Kindly reply, sir.

  • @venkatraaman4509

    @venkatraaman4509

    3 years ago

    @Krish Naik, sir, kindly reply to me.

  • @me_debankan4178

    @me_debankan4178

    a year ago

    Yeah, I have the same question. Do you have a solution?

  • @nishanthviswajith1496
    @nishanthviswajith1496 3 years ago

    I know Python programming, and I'm learning data science by self-study. My problem is that I have a 4-year gap in employment. Will I get a job in the data science field? I need your suggestions. I'm 26 years old.

  • @anandbihari3135

    @anandbihari3135

    2 years ago

    Same story, bro. Yes, you will get a job as a data scientist; just focus on prep and projects. I took a gap to prepare for UPSC and RBI. In 2016 I got a campus placement at Amazon as an SDE. But after a 4-year break and the COVID situation, I started preparing for DS and was fortunate enough to start with Sky as a data engineer for 10 LPA. So surely you will also get placed.

  • @nishanthviswajith1496

    @nishanthviswajith1496

    2 years ago

    @anandbihari3135 What skills are required for a data engineer?

  • @208gamer4

    @208gamer4

    6 months ago

    @nishanthviswajith1496 Did you get a job, bro?

  • @208gamer4

    @208gamer4

    6 months ago

    @nishanthviswajith1496 I'm doing an MCA; is there any scope, bro?

  • @foreignworker-2163
    @foreignworker-2163 3 years ago

    Pray for your team!

  • @pankajkumarbarman765
    @pankajkumarbarman765 3 years ago

    1st view 💞💞❤️

  • @moonSTAR1893
    @moonSTAR1893 9 months ago

    Hello. There is an important mistake in this tutorial, so I have to stop watching it. Problem: you use, for example, MinMax Scaler on the whole X_train with differently scaled variables inside. Let's assume "age" is distributed over 18-65 while "fare" goes from 5 to 2000. Scaling age with the global min/max of the dataset distorts your features. In this case, for age 20 you would get z = (X - Xmin)/(Xmax - Xmin) = (20 - 5)/(2000 - 5) = 15/1995 = 0.0075. With per-feature scaling using just age, you would instead get z = (20 - 18)/(65 - 18) = 0.0426, corresponding to a roughly 5-fold numerical difference. The maximal age of 65 would get z = (65 - 5)/(2000 - 5) = 0.03! This means age would have a maximal value of 0.03 instead of 1!
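
The arithmetic in the comment above can be reproduced with a short standard-library sketch. The three rows are hypothetical values chosen only to match the ranges the comment assumes (age 18-65, fare 5-2000). For reference, scikit-learn's MinMaxScaler is documented as scaling each feature (column) individually, so the "global" case below would arise only if the minimum and maximum were taken over the whole table by hand.

```python
def min_max(value, lo, hi):
    """Min-max scaling: map value from the range [lo, hi] onto [0, 1]."""
    return (value - lo) / (hi - lo)


# Hypothetical rows covering the ranges assumed in the comment.
rows = [
    {"age": 18, "fare": 5},
    {"age": 20, "fare": 2000},
    {"age": 65, "fare": 300},
]

# Per-feature scaling: min and max are taken from the age column only.
age_lo = min(row["age"] for row in rows)
age_hi = max(row["age"] for row in rows)
per_feature = min_max(20, age_lo, age_hi)        # (20 - 18) / (65 - 18)

# Global scaling: min and max are taken over every value in the table.
all_values = [value for row in rows for value in row.values()]
global_scaled = min_max(20, min(all_values), max(all_values))  # (20 - 5) / (2000 - 5)

print(round(per_feature, 4))    # 0.0426
print(round(global_scaled, 4))  # 0.0075
print(min_max(65, age_lo, age_hi))  # 1.0, the largest age maps to 1 per feature
```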