Multiple Time Series Forecasting With Scikit-Learn

You have a lot of time series data and want to predict the next step (or steps). What should you do now? Train a model for each series? Is there a way to fit one model for all the series together? Which is better?
I have seen many data scientists approach this problem by creating a separate model for each product. Although this is one possible solution, it's not likely to be the best.
Here I will demonstrate how to train a single model to forecast multiple time series at the same time. This technique usually creates powerful models that help teams win machine learning competitions and can be used in your project.
And you don’t need deep learning models to do that!
Timestamps
0:00 Intro
1:28 Melt the data, stack the series
7:18 Split the data
10:29 Set up a 1-step target
13:57 Create 4 fundamental features (feature engineering)
26:16 Choose an evaluation metric
31:34 Establish a baseline
35:18 Train the model
37:34 Evaluate the model
39:11 Extend the model to multi-step forecasting
43:04 Forecast new data
45:37 Next steps
Code: github.com/ledmaster/english_...
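The first steps in the outline (melting the data, setting up a 1-step target, building lag features, splitting by time) can be sketched roughly as follows. This is a minimal illustration and not the code from the video: the tiny table and the names `stacked`, `diff_sales_1` and `target` are made up for the example, and only `Product_Code` and `lag_sales_1` appear in the comments on this page.

```python
import pandas as pd

# Hypothetical wide table: one row per product, one column per week.
wide = pd.DataFrame({
    "Product_Code": ["P1", "P2"],
    "W0": [10, 3], "W1": [12, 4], "W2": [11, 6], "W3": [15, 5],
})

# Melt: stack all series into one long table (one row per product and week).
stacked = wide.melt(id_vars="Product_Code", var_name="Week", value_name="Sales")
stacked["Week"] = stacked["Week"].str[1:].astype(int)  # "W3" -> 3
stacked = stacked.sort_values(["Product_Code", "Week"]).reset_index(drop=True)

# Lags, differences and the target are computed within each product,
# so values never leak from one series into the next.
sales_by_product = stacked.groupby("Product_Code")["Sales"]
stacked["lag_sales_1"] = sales_by_product.shift(1)   # previous week's sales
stacked["diff_sales_1"] = sales_by_product.diff(1)   # change since previous week
stacked["target"] = sales_by_product.shift(-1)       # next week's sales

# Split by time rather than at random: earlier weeks train, later weeks validate.
labeled = stacked.dropna(subset=["target"])
train = labeled[labeled["Week"] < 2]
valid = labeled[labeled["Week"] >= 2]
```

Grouping by product before shifting is the design choice that matters here: a plain `shift` on the stacked table would hand the first week of one product the last week of the previous one.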
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
// SUPPORT THE CHANNEL 👇❤️
Sign up for a Coursera course:
imp.i384100.net/EaDmQe
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
// SOCIAL MEDIA
LinkedIn: / mariofilho
Kaggle: kaggle.com/mariofilho
Twitter: / mariofilhoml
Blog: forecastegy.com
Some links above may come from partnerships where I earn a commission if you buy a product, at no additional cost to you. Thanks for the support!

Comments: 41

  • @nehan.2199
    2 years ago

    This is very helpful, thank you! Where can I find the dataset to download?

  • @Forecastegy
    2 years ago

    Great, here it is: archive.ics.uci.edu/ml/datasets/Sales_Transactions_Dataset_Weekly

  • @tom199520000
    7 months ago

    I just checked out this amazing video after your feature selection engineering video! I have no idea why this video isn't popular!!! Respect for the effort you put into this!

  • @Luckasborges
    2 years ago

    Learning ML and English together! Here we go! hehe Congrats on the new channel, Mario!

  • @towhidultonmoy3046
    2 years ago

    Keep it up! You have a long way to go brother. Best wishes!

  • @vamsikrishnabhadragiri9742
    2 years ago

    Why didn't you standardize the data? Since sales for different products fall in different ranges, doesn't that affect the model's performance?

  • @sancarlitos1125
    2 years ago

    Excellent explanation! Thanks for sharing it! I was working on a similar forecast, and I was wondering: when the product number changes, let's say from 0 to 1, should the rolling window and the lag be modified? Otherwise we would be using information from the previous product. Thank you very much!

  • @kaianchan7768
    1 year ago

    Thanks for this tutorial. Will you provide some videos about many features? Thanks!

  • @Septumsempra8818
    1 year ago

    Are we going to get a video on cross-validation and selecting the right model? Your time series videos have been a wealth of knowledge.

  • @ElChe-Ko
    1 year ago

    Nice! It would be interesting to see what to do if the time series have different lengths.

  • @igorkuivjogifernandes3012
    2 years ago

    Hi, Mario. Awesome video... it helped me a lot. One question: what could we do if the train set has uneven periodicity (the periodicity is 2 days for one product, 7 days for another, 3 days for another and so on, or even worse, some products have only 1 or 2 observations), but my test set has even periodicity (every product has a periodicity of 7 days)?

  • @Dragnar21
    2 years ago

    First of all, thank you for the video and the extraordinary explanation. I would like to know how you would structure your data if the series are not all the same length.

  • @diegosccp09
    2 years ago

    You are a legend. I'm using this for a master's assessment.

  • @JoaoVitorBRgomes
    2 years ago

    You're the man!

  • @pcdowling
    8 months ago

    Thank you.

  • @alirezajabbari2537
    2 years ago

    Thank you, Mario! You saved me on my 4th-year project. Ciao!

  • @Forecastegy
    2 years ago

    Glad to hear that!

  • @user-fh7gb2yf5z
    1 year ago

    Good afternoon, Mario. Do you have any tips for using an LSTM for multi-step-ahead predictions in a MISO system?

  • @Mohammad-vr9dj
    1 year ago

    Thanks for the useful video. Is it possible to model independent spatial sequences simultaneously? I have a dataset that consists of 1000 independent spatial sequences with dimension 2*7 (2 for x and y, and length 7 for the positions over time). I implemented it with a simple RNN, LSTM and GRU. Can I do it with transformers (attention mechanism)? Could you point me to a practical example?

  • @Mohammad-vr9dj
    1 year ago

    Thanks for your useful video. If our dataset has two target columns, how should we write the code?

  • @Orlandobelli
    2 years ago

    Good video. Can we forecast multiple time series with an ARIMA model?

  • @faraza5161
    1 year ago

    The SimpleImputer will fill the missing values with the mean of the entire column. Shouldn't that be done product-wise as well? Thanks for a wonderful lecture btw :-)
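The difference between a column-wide mean and a product-wise mean can be seen in a small sketch. The data and the column name `lag_filled` are made up for illustration, and the per-product fill uses plain pandas rather than anything shown in the video.

```python
import numpy as np
import pandas as pd
from sklearn.impute import SimpleImputer

# Hypothetical stacked data: two products on very different sales scales.
df = pd.DataFrame({
    "Product_Code": ["P1", "P1", "P1", "P2", "P2", "P2"],
    "lag_sales_1": [np.nan, 10.0, 12.0, np.nan, 100.0, 120.0],
})

# Column-wide mean: every gap gets the same value, whatever the product.
global_filled = SimpleImputer(strategy="mean").fit_transform(df[["lag_sales_1"]]).ravel()

# Product-wise mean: each gap gets the mean of its own series.
product_mean = df.groupby("Product_Code")["lag_sales_1"].transform("mean")
df["lag_filled"] = df["lag_sales_1"].fillna(product_mean)
```

In a real train/validation setup the per-product means would presumably be computed on the training rows only and then applied to later data, to avoid leaking future values.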

  • @Learner_123
    1 year ago

    Thank you for making the topic simple. Since you have combined all the product sales to train and validate your model, how can one use this model to predict sales for a single product only?

  • @zabmaz10
    1 year ago

    I have the same question, but I guess one way is to convert the product code into dummy variables and use those as features in the random forest.
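The dummy-variable idea suggested in this reply could look roughly like the sketch below. It is an assumption-laden illustration: the data, the `target` column and the forest settings are invented for the example, and the video itself may handle the product code differently.

```python
import pandas as pd
from sklearn.ensemble import RandomForestRegressor

# Hypothetical stacked training table with one lag feature and a 1-step target.
data = pd.DataFrame({
    "Product_Code": ["P1"] * 4 + ["P2"] * 4,
    "lag_sales_1": [10, 12, 11, 15, 3, 4, 6, 5],
    "target": [12, 11, 15, 14, 4, 6, 5, 7],
})

# One dummy column per product tells the single model which series a row belongs to.
X = pd.get_dummies(
    data[["Product_Code", "lag_sales_1"]], columns=["Product_Code"], dtype=float
)
model = RandomForestRegressor(n_estimators=50, random_state=0)
model.fit(X, data["target"])

# To forecast one product, pass only that product's rows to the same model.
is_p2 = data["Product_Code"] == "P2"
p2_forecast = model.predict(X[is_p2])
```

With thousands of products the dummy matrix becomes very wide, so an ordinal or target-style encoding of the product code may be worth comparing against this approach.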

  • @StatiR_br
    2 years ago

    Hi Mario! First of all, congratulations on the video! I have a question: in this context we have several products (Product_Code) and only one fitted model. With the dataset as it is, will (or could) the model use, for example, the last 'lag_sales_1' of one Product_Code to predict the sales of the next Product_Code? The model won't know where one Product_Code ends and the next begins. Or am I getting confused? Thanks in advance!

  • @guilhermeparreira5448
    1 year ago

    I agree with you. This way of modeling would only work if all the products had similar average sales (and even then, barely). I think the more correct approach would be to also include the product code as a covariate in the model.

  • @zulhas9
    1 year ago

    Hi Mario, thanks for the wonderful presentation. One question: how can you use the "Sales" feature to predict sales? If you use that feature, you have to pass it as an argument when you call the .predict function. In reality, you would not have that information available.

  • @mamyrak1114
    1 month ago

    Can I follow the same process if, instead of a week number, I have a date like yyyy-mm-dd? And how should I handle the year?
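One possible way to work with yyyy-mm-dd dates is to parse them, sort each series chronologically, and derive calendar features from the parsed column. This is only a sketch with invented data and column names (`year`, `week_of_year`), not something covered in the video description.

```python
import pandas as pd

# Hypothetical stacked data with a date string instead of a week number.
df = pd.DataFrame({
    "Product_Code": ["P1", "P1", "P1"],
    "date": ["2021-01-04", "2021-01-11", "2021-01-18"],
    "Sales": [10, 12, 11],
})

# Parse the strings so that sorting and calendar features work correctly.
df["date"] = pd.to_datetime(df["date"], format="%Y-%m-%d")
df = df.sort_values(["Product_Code", "date"]).reset_index(drop=True)

# The year and the ISO week number become ordinary numeric features.
df["year"] = df["date"].dt.year
df["week_of_year"] = df["date"].dt.isocalendar().week.astype(int)

# Lags are built exactly as with week numbers, once rows are in date order.
df["lag_sales_1"] = df.groupby("Product_Code")["Sales"].shift(1)
```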

  • @VG-yw2mp
    11 months ago

    Why don't we use product_code as one of the features while training?

  • @Gabriel-iw3hc
    11 months ago

    How do I forecast the future with this method, e.g. week 52? I think I would also need to forecast the other series used as features.

  • @stonesupermaster
    1 year ago

    Hello Mario, I have a question: how does the model know that we're trying to predict multiple products at once? I've been trying to train a model to predict the sales of 2000 SKUs, and my main concern now is how to do it efficiently. I watched everything you did but I still have the same problem. Do you know where I can find an example of it? Thank you very much for your video.

  • @AskApt05
    7 days ago

    Hi @stonesupermaster, I'm facing the same problem. Have you found a solution? It would be really helpful if you could share it. Thanks!

  • @jackcarter97
    4 months ago

    How do I find the season effect features?

  • @ozan4702
    2 years ago

    Why should the difference be a feature? Given sales and lagged sales, the difference is already known.

  • @XiboquinhaMilGrau
    1 year ago

    I wasn't expecting that one, haha

  • @efremyohannes2334
    2 years ago

    How can I model time series for unevenly distributed data using scikit-learn?

  • @aacharyadhruvi8301
    2 years ago

    Where can I get Sales_Transactions_Dataset_Weekly.csv?

  • @Forecastegy
    2 years ago

    Here: archive.ics.uci.edu/ml/datasets/Sales_Transactions_Dataset_Weekly

  • @vivianealveslima9358
    2 years ago

    The code on GitHub is unavailable =S

  • @Forecastegy
    2 years ago

    Oops! Fixed, this is the right link: github.com/ledmaster/english_tutorials/tree/main/multiple_time_series