What is Box Cox Transformation? - Time Series Analysis in Python

Science & Technology

Understanding the Box-Cox transform can take your forecasting skills to the next level. In this video, we will break down what it is, why it is important and how you can use it in Python! 📈
Python notebook from video 🐍: github.com/egorhowell/KZread...
Medium article on Box-Cox transform 📝: medium.com/towards-data-scien...
Book a 1:1 mentoring call → calendly.com/egorhowell/consu...
💌 Weekly data science tips & FREE resume template: newsletter.egorhowell.com
MY FAVOURITE STUFF:
🎵 Epidemic Sound - Where I get my music: share.epidemicsound.com/v015na
🚀 VidIQ - How I optimise my KZread videos: vidiq.com/egorhowell
🗓️ Notion - How I manage all my stuff - affiliate.notion.so/egorhowell
(PS: Some of these links are affiliate links that I get a kickback from 😎)
CONNECT WITH ME:
Blog -- / egorhowell
Twitter (X) -- / egorhowell
GitHub -- github.com/egorhowell
LinkedIn -- / egorhowell
WHO AM I?
Hi, I'm Egor! I am a Data Scientist with a master's in Physics currently living in London. I share data science tutorials, advice and general tech topics!
TIMESTAMPS
0:00 Intro
0:15 Python tutorial
01:30 What is the Box-Cox transform
01:44 Why it's important
04:05 Box-Cox in Python
06:24 Outro
DISCLAIMERS
This content is for educational and entertainment purposes only and should not be considered as professional advice. Views and opinions are my own and do not represent or reflect the opinions of my current or past employer or any organisations I am associated with. This description also contains affiliate links from which I may receive a small commission.

Comments: 16

@daansan26 11 months ago
Another great video Egor! Well done
@egorhowell
11 months ago
Cheers David!
@tsunetasora 10 months ago
4:18 The seasonality and residuals look very multiplicative. Box-Cox is said to transform skewed data into normally distributed data. One thing that confuses me is whether Box-Cox also turns a multiplicative time series into an additive one, because at 5:44, after the transformation, the seasonality and residuals seem additive.
@egorhowell
10 months ago
Yes, you are right, the Box-Cox transform can also be used to make the time series additive. I discuss this in another video in this course: kzread.info/dash/bejne/Zoqlmq2qkquxmJs.html
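A minimal sketch of why this works, assuming the special case lambda = 0, where the Box-Cox transform reduces to the natural log. The series below is made up for illustration (a trend multiplied by a seasonal factor, with no noise); for other lambda values the components only become approximately additive.

```python
import numpy as np
from scipy.stats import boxcox

# Hypothetical multiplicative series: trend * seasonal factor (no noise).
t = np.arange(48)
trend = 100.0 + 5.0 * t
seasonal = 1.0 + 0.3 * np.sin(2 * np.pi * t / 12)
series = trend * seasonal

# With lmbda=0 the Box-Cox transform is the natural log.
transformed = boxcox(series, lmbda=0)

# log(trend * seasonal) = log(trend) + log(seasonal): the components now add.
additive_sum = np.log(trend) + np.log(seasonal)
is_additive = np.allclose(transformed, additive_sum)
```

Here `is_additive` checks that the transformed series equals the sum of the logged trend and logged seasonal component.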
@antonschulte2538 10 months ago
Thanks Egor. It's very understandable :) Helps a lot! Is there a way to reduce the effect of outliers on the transformation?
@egorhowell
10 months ago
Thanks Anton! Hmm I am not too sure, my advice would be to remove them when calculating the box-cox lambda value. However, let me research and I will send some resources your way! :)
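One way to sketch the suggestion above: estimate lambda on the points that are not flagged as outliers, then apply that fixed lambda to the whole series. The data, the injected outliers and the IQR rule are all assumptions made for illustration; any outlier rule could be swapped in.

```python
import numpy as np
from scipy.stats import boxcox

rng = np.random.default_rng(0)
# Hypothetical positive, right-skewed series with two injected outliers.
series = rng.lognormal(mean=3.0, sigma=0.4, size=200)
series[[50, 120]] *= 20

# Flag outliers with a simple IQR rule (an assumption, not the only option).
q1, q3 = np.percentile(series, [25, 75])
upper = q3 + 3.0 * (q3 - q1)
mask = series <= upper

# Estimate lambda on the non-outlier points only...
_, lam = boxcox(series[mask])

# ...then apply that fixed lambda to the whole series.
transformed = boxcox(series, lmbda=lam)
```

Passing `lmbda` explicitly makes `scipy.stats.boxcox` skip the estimation step and return only the transformed array.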
@antonschulte2538
9 months ago
@@egorhowell Thanks a lot:)
@egorhowell
1 month ago
No problem
@tsunetasora 10 months ago
Another thing that confuses me is that Box-Cox is said to transform skewed data into normally distributed data. How can a time series show skewness? For cross-sectional data, the distribution of Y at a given x (draw a vertical line in the plot) is meaningful, because there can be multiple ys for a given x, so we can see how frequently Y takes some value. Time series are essentially different: each historical number has been observed and is unique, so I cannot figure out a distribution from a single fixed value. Then how is it meaningful to apply Box-Cox to transform skewed data into normally *distributed* data?
@egorhowell
10 months ago
I understand your point: for a normal regression problem, we can see how the target variable is distributed relative to some feature. The goal is not necessarily to show the skewness, because each data point belongs to a different distribution, so in reality it's quite difficult. However, I must say that this is reaching the edge of my current knowledge, so it will be an interesting one to research! Thanks for the question.
@dariustrabalza6629 6 months ago
hey, are you supposed to apply the boxcox after applying log and diff or instead of log?
@egorhowell
6 months ago
Hey! Normally you would apply the Box-Cox transform and then the differencing transform. In reality, it depends on what you are after.
- diff(boxcox(x)) calculates relative changes
- log(boxcox(x)) calculates the absolute differences
Hope that makes sense!
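A small sketch of the Box-Cox-then-difference order, assuming lambda = 0 so that Box-Cox is just the natural log. In that case the differences are log returns, which approximate relative changes when the changes are small; for other lambda values the differences live on the transformed scale and are not relative changes in general. The 2% growth series is made up for illustration.

```python
import numpy as np
from scipy.stats import boxcox

# Hypothetical series growing by 2% per step.
series = 100.0 * 1.02 ** np.arange(24)

# Box-Cox first (lmbda=0 is the log), then first differences.
log_diff = np.diff(boxcox(series, lmbda=0))

# Relative change between consecutive points, for comparison.
rel_change = np.diff(series) / series[:-1]
```

Each entry of `log_diff` equals log(1.02), which is close to the 0.02 found in `rel_change`.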
@bhavikpatel6088 23 days ago
Hi Egor, let's say my end goal is to train, validate and test on monthly data from Jan 2023 to June 2024, and to write a function that takes a month as user input, possibly the next month (July 2024), to predict the total monthly sales amount. If I apply Box-Cox or any other transformation, the range of the original values is going to change (mostly decrease). Do you think we must revert the transformation when the user gives an input month for the prediction?
@egorhowell
21 days ago
Yes, your prediction is going to be on the Box-Cox-transformed scale, so you need to apply the inverse Box-Cox transform to get back to the original time series units.
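A minimal sketch of that round trip, assuming SciPy is used for the transform. The sales figures are invented, and the "forecast" is just the last transformed value standing in for a real model's output; the key point is that the lambda fitted on the training data is reused for the inverse.

```python
import numpy as np
from scipy.stats import boxcox
from scipy.special import inv_boxcox

# Hypothetical monthly sales, Jan 2023 to Jun 2024 (18 made-up values).
sales = np.array([120., 135., 150., 160., 158., 170., 182., 190., 175.,
                  200., 230., 260., 140., 150., 168., 181., 179., 195.])

# Fit lambda on the training data and keep it for later.
transformed, lam = boxcox(sales)

# Stand-in for a model forecast on the transformed scale (here: last value).
forecast_transformed = transformed[-1]

# Map the forecast back to original sales units with the same lambda.
forecast = inv_boxcox(forecast_transformed, lam)
```

In a real pipeline `lam` would be stored alongside the fitted model so the prediction function can invert the transform for any requested month.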
@praveenbehara 5 months ago
Hi @Egor: Thank you for the videos. While going through the 3rd video, I saw the plotting function missing the following code.
def plotting(..., text=False, lam=None):
    if text:
        fig.add_annotation(x='1952-12-20', y=10, text=f'Lambda = {lam:.3f}',
                           align='left', yanchor='bottom', showarrow=False,
                           font=dict(size=20, color='black', family='Courier New, monospace'),
                           bordercolor='black', borderwidth=2, bgcolor='white')
@egorhowell
5 months ago
Hey, thanks for noticing this! I see this in the video, are you referring to the one on GitHub?