198 - Feature selection using Boruta in Python

Science and Technology

Code generated in the video can be downloaded from here:
github.com/bnsreenu/python_fo...
pypi.org/project/Boruta/
pip install Boruta
XGBoost documentation:
xgboost.readthedocs.io/en/lat...
Dataset:
archive.ics.uci.edu/ml/datase...

Comments: 43

  • @evyatarcoco
    3 years ago

    Dear sir, your episodes are great! I like to learn about new tools and libraries. Keep teaching us! Thanks

  • @DigitalSreeni
    3 years ago

    Keep watching

  • @greendsnow
    2 years ago

    I can't get enough of these videos. And he knows that.

  • @DigitalSreeni
    2 years ago

    ☺️

  • @RadhakrishnanBL
    3 years ago

    Awesome video man. It really helped me.

  • @michaelmecham6145
    3 years ago

    Really well explained, thanks from Australia

  • @DigitalSreeni
    3 years ago

    Glad it was helpful!

  • @fassesweden
    3 years ago

    Thank you for this video! Great stuff!

  • @DigitalSreeni
    3 years ago

    Glad you enjoyed it!

  • @joytishfromm8223
    2 years ago

    Really good video! Thank you so much!

  • @jeffabc1997
    a month ago

    nice content, thank you!

  • @channelforstream6196
    3 years ago

    I would also be interested in more traditional machine learning. Most work done by data scientists I’ve seen is just preprocessing and postprocessing anyway

  • @pablovaras9435
    a year ago

    great video

  • @lalitsingh5150
    3 years ago

    Thank you for another great video.

  • @DigitalSreeni
    3 years ago

    Thanks for watching!

  • @lalitsingh5150
    3 years ago

    @DigitalSreeni Sir, XGBoost is giving an error in Boruta; SVM is working fine. ValueError: Please check your X and y variable. The provided estimator cannot be fitted to your data. Invalid Parameter format for seed expect int but value='RandomState(MT19937)'

  • @mohammadhassan5240
    3 years ago

    Best channel on YouTube

  • @DigitalSreeni
    3 years ago

    Thanks

  • @yogaforyou4213
    3 years ago

    Thank you so much for your video

  • @DigitalSreeni
    3 years ago

    Glad it was helpful!

  • @RajeshSharma-bd5zo
    3 years ago

    Nice video 🤘🤘

  • @DigitalSreeni
    3 years ago

    Thanks ✌

  • @0xcalmaf976
    3 years ago

    Thanks a lot for sharing your knowledge with us! Would you consider making a tutorial with the Brats or LITS challenges? We would love it :)

  • @DigitalSreeni
    3 years ago

    I plan on recording videos on multiclass semantic segmentation, which will help you segment BRATS or LITS slice by slice. I haven’t experimented with 3D Unet yet, but I plan on doing it sometime in the next few months.

  • @0xcalmaf976
    3 years ago

    @DigitalSreeni That's great! One request: could you please choose a public dataset so that we can follow along? Thanks!

  • @bikashchandragupta6333
    2 years ago

    Hello Sir, I am following your tutorial but facing an error: "ValueError: Please check your X and y variable. The providedestimator cannot be fitted to your data. Invalid Parameter format for seed expect int but value='RandomState(MT19937)'". Any help with this issue would be highly appreciated.

  • @leamon9024
    2 years ago

    Hello sir, would you cover a feature selection technique that uses hierarchical or k-means clustering, if possible? I saw that scikit-learn seems to have this function (sklearn.cluster.FeatureAgglomeration), but few people talk about it. Thanks in advance.

  • @manonathan5892
    3 years ago

    Thanks for the video. May I know how Boruta is different from Random Forest's feature importance? Are they the same?
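
    For readers with the same question: a random forest's feature importance only ranks features, while Boruta adds a decision rule by comparing each real feature against shuffled "shadow" copies of the data. The sketch below illustrates only that comparison step. It is not the BorutaPy implementation: absolute Pearson correlation is swapped in for a tree-based importance so that the example needs nothing beyond the standard library, the statistical test that Boruta applies to the hit counts is omitted, and the names `abs_corr`, `shadow_hits`, and the toy data are hypothetical.

```python
import random


def abs_corr(xs, ys):
    """Absolute Pearson correlation; a stand-in for a tree-based importance."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return abs(cov / (sx * sy)) if sx and sy else 0.0


def shadow_hits(columns, y, n_iter=50, seed=0):
    """Count how often each real feature beats the best shadow feature."""
    rng = random.Random(seed)
    hits = {name: 0 for name in columns}
    for _ in range(n_iter):
        # Shadow features: shuffled copies, so any link to y is destroyed.
        shadow_best = 0.0
        for values in columns.values():
            shadow = values[:]
            rng.shuffle(shadow)
            shadow_best = max(shadow_best, abs_corr(shadow, y))
        # A real feature scores a hit when it beats the best shadow.
        for name, values in columns.items():
            if abs_corr(values, y) > shadow_best:
                hits[name] += 1
    return hits


# Toy data (hypothetical): y depends on "signal" only.
data_rng = random.Random(1)
n = 200
signal = [data_rng.gauss(0, 1) for _ in range(n)]
noise = [data_rng.gauss(0, 1) for _ in range(n)]
y = [2.0 * s + data_rng.gauss(0, 0.5) for s in signal]

hits = shadow_hits({"signal": signal, "noise": noise}, y)
print(hits)
```

    A feature that beats the shadows in nearly every round is a candidate for confirmation, and one that rarely does is a candidate for rejection. Plain importance scores offer no such reference point.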

  • @anjalisetiya2335
    2 years ago

    Can this algorithm be applied to feature selection on mixed data types, i.e. data with both boolean and continuous variables? Please let me know.

  • @zakirshah7895
    3 years ago

    Hello Teacher, nice video. I am doing classification using a CNN. Is there a good way to do feature selection, as I am using a hybrid model? The accuracy is low, maybe because of redundant features from the two models.

  • @kannansingaravelu
    3 years ago

    Hi Sreeni. Thanks for the excellent videos. In many cases, once BorutaPy has finished running, the number of tentative features printed in the summary is different from (lower than) the number shown during the run. For example, in one of my use cases with 196 features, iteration 100 ended with 46 tentative features while the summary printed only 28. Why is this different? How is this treated in Boruta?

  • @DigitalSreeni
    3 years ago

    Sorry, didn't notice that behavior. I hope the documentation provides some explanation.

  • @awa8766
    3 years ago

    I'm curious to know if you could point out what the issue is. I have a dataset where my number of labels (y) is 55 and the number of independent variables (X) is 100, so the combined dataframe (X and y together) is 55x101. I used a procedure similar to the one you presented; the only difference in data types is that my y_train is int64 and my X_train is float64. I ran XGBoost and BorutaPy, but I get an error when fitting the feature selector to X_train and y_train: "Please check your X and y variable. The providedestimator cannot be fitted to your data. Invalid Parameter format for seed expect int but value='RandomState(MT19937)'". I can't find an open issue on either the BorutaPy or the XGBoost forums with the same error. I'd appreciate your input!

  • @RadhakrishnanBL
    3 years ago

    Any help solving this error? "XGBoostError: Invalid Parameter format for seed expect int but value='RandomState(MT19937)'"
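
    The message in this error says that the estimator received a `RandomState` object where it expected an integer seed. One possible workaround, sketched here under assumptions, is to coerce `random_state` to an int before the estimator sees it. `to_int_seed`, `IntSeedMixin`, and `StrictEstimator` are hypothetical names. `StrictEstimator` is only a stand-in that rejects non-integer seeds so that the example runs without xgboost or boruta installed, and `random.Random` stands in for `numpy.random.RandomState`.

```python
import random


def to_int_seed(value, upper=2**31 - 1):
    """Return an int seed, drawing one if given an RNG-like object."""
    if value is None or isinstance(value, int):
        return value
    # Duck-typed: numpy.random.RandomState and random.Random both offer
    # randint(low, high).
    return int(value.randint(0, upper))


class IntSeedMixin:
    """Coerce random_state to an int before the wrapped estimator sees it."""

    def set_params(self, **params):
        if "random_state" in params:
            params["random_state"] = to_int_seed(params["random_state"])
        return super().set_params(**params)


class StrictEstimator:
    """Stand-in (hypothetical) for an estimator that requires an int seed."""

    def __init__(self):
        self.random_state = None

    def set_params(self, **params):
        seed = params.get("random_state")
        if seed is not None and not isinstance(seed, int):
            raise ValueError(f"seed expect int but value={seed!r}")
        self.__dict__.update(params)
        return self


class SeedSafeEstimator(IntSeedMixin, StrictEstimator):
    """The mixin must come first so its set_params runs before the estimator's."""


# With xgboost installed, the same pattern might be applied like this
# (an untested assumption, not a confirmed fix):
#   class SeedSafeXGB(IntSeedMixin, XGBClassifier): ...

est = SeedSafeEstimator().set_params(random_state=random.Random(42))
print(type(est.random_state).__name__)  # prints: int
```

    The mixin leaves integer seeds and `None` untouched, so it only changes behavior when an RNG object is passed in. Whether this resolves the error depends on the installed versions of Boruta and XGBoost.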

  • @gurdeepsinghbhatia2875
    3 years ago

    nice sir

  • @DigitalSreeni
    3 years ago

    Keep watching

  • @aditya_baser
    2 years ago

    There are 7 features with rank 1. How do you rank those features further among themselves?

  • @MrTapan1994
    a year ago

    I tried testing with all the features and with the Boruta-selected features, and the accuracy doesn't change. So is the idea to use fewer features while keeping the metric the same?

  • @DigitalSreeni
    a year ago

    Selecting fewer features using a feature selection technique like Boruta has several potential benefits:

    - Improved model performance: By selecting only the most relevant features, the model may be able to better distinguish between signal and noise in the data, leading to improved model performance.
    - Reduced overfitting: Selecting a subset of relevant features can help to reduce the risk of overfitting, which occurs when the model becomes too complex and fits to noise in the data rather than the underlying patterns.
    - Improved interpretability: By reducing the number of features, the resulting model may be more interpretable, making it easier to understand the factors that are driving the model's predictions.
    - Reduced computational cost: By working with a smaller number of features, the computational cost of training and evaluating the model may be reduced, which can be particularly important in large datasets or in cases where real-time predictions are required.

  • @carlosleandrosilvadospraze4005
    3 years ago

    Professor, congratulations again on the video! I'm very grateful! I have a question. Could I use the feature selector at the end of a pre-trained CNN (on the flattened layer)? I would like to reduce the dimensionality using an ML method.

  • @DigitalSreeni
    3 years ago

    Technically the output of a pre-trained CNN would be a bunch of features so I don't see why you cannot perform feature selection on those features. I should admit that I have never tried so I cannot guide you on what to expect.

  • @carlosleandrosilvadospraze4005
    3 years ago

    @DigitalSreeni Thank you! 😊

  • @sallahamine9467
    a year ago

    Why does the Boruta algorithm not work with AdaBoost or CatBoost?
