System Design Interview: TikTok architecture with
We attempt to design a large-scale distributed video hosting platform like TikTok or Instagram Reels.
The engineering involved in building these systems is complex, and our attempt does not (even nearly) cover all the challenges that these engineering teams face. We instead have a mock system design interview setup. Yogita will have 45 minutes to design an architecture that can scale, is performant, fault-tolerant, and meets the functional requirements.
00:00 Intro
00:34 Problem Statement
01:24 Requirement listing
04:00 Capacity Estimation
06:34 Design skeleton APIs
08:34 Choosing datastores
12:10 Comparing datastores
19:16 Ingestion Engine
24:21 Video pipeline
30:59 Last mile delivery
33:46 What is a CDN?
35:52 Network Protocol
38:03 End to end request flow
39:54 Caching
41:19 Evaluation and verdict
45:03 Final Architecture
Yogita's Channel (sudoCODE): / @sudocode
InterviewReady: interviewready.io/?_aff=SUDOCODE
Social Media:
Github: github.com/coding-parrot/
Instagram: / applepie404
LinkedIn: / gaurav-sen-56b6a941
Twitter: / gkcs_
#SystemDesign #InterviewReady #SoftwareEngineering
Comments: 723
If you are preparing for a system design interview, try get.interviewready.io. All the best 😁
@karunagadde
2 years ago
S3 is not a file storage
@vishal733
2 years ago
Hi. Could you please share the name of the online tool you are using for collaborating?
@ManishSharma-pe1jf
2 years ago
@vishal733 All online meeting services, such as Webex, Zoom, etc., have a built-in whiteboard.
@sayantangangopadhyay669
2 years ago
I have two questions on the final architecture diagram. First, why is the raw video sent directly from ingestion to S3? S3 only takes the final video after the workers have processed it, right? Second, why does the arrow point from the different devices to the CDN instead of from the CDN to the devices?
@tanvirbinazam548
2 years ago
What software is used for drawing in this video?
Thank you both for putting this together and providing this content openly. This is very helpful for those trying to prepare for this exact type of interview scenario and who might not be familiar with the format. Excellent job!
Very detailed, touches very important system design aspects. Gives many pointers for further research! A zillion Thanks!
These kinds of mock discussions on SD are really helpful. They give the viewer a thought process for dealing with such questions. Kindly do more of these kinds of videos ...
@user-ux3qt4sx3e
2 years ago
Why do u have two spaces around "viewer"
@suchismitagoswami5609
2 years ago
++
One of the most valuable pieces of content on YouTube for young IT engineers
You both are just too good!! I love the authenticity and simplicity. The actual interview does take this similar course. Keep up the great work.
Awesome, guys! It is really valuable to see such an interview in action. It feels like you are the one being interviewed. Good job, thank you! 🤩
Another awesome delivery, thanks Gaurav. One thought: we increased the storage to ~6x to account for different resolutions and formats, which we could handle instead by introducing two entities in the system. First, to avoid multiple formats, we can give users a dedicated video player that understands only our format. Second, a resolution manager placed before the streaming engine can upgrade or downgrade the resolution according to the user's bandwidth or request. Take Netflix and YouTube as examples: they have their own media players that understand their recording formats. Yes, one extra task will be converting uploaded videos to the application's format at upload time, but that pays off by saving the 6x storage cost. Resolution can also be handled at runtime in two ways:
- Always keep a high-resolution copy and downgrade it at runtime before serving it to the user. The downside is more storage because of the high-resolution copies.
- Always keep a low-resolution copy for reference, along with some pixel pattern files to convert the low-resolution copy to high resolution at runtime. The upside is that we can reduce the storage cost significantly.
For conversion performance, a dedicated system with predefined resolution converter filters can work.
@gkcs
2 years ago
Brilliant points, thanks!
@shirsh3
2 years ago
It would also be a good idea to take a look at ffmpeg and the creation of "ts" files
@edwardspencer9397
2 years ago
Yes it is common sense to create your own video player which supports all devices instead of creating 20 formats lol.
@lhxperimental
2 years ago
@edwardspencer9397 It's not just about creating an app which can play video. You'll of course have an app. Different formats have different properties. Some have small file sizes but require hardware acceleration to perform well, which may not be available on all devices. So even if you create your own player, it will do software decoding, which will be slow: users will complain about phones getting warm, high battery consumption and sluggish performance. Instead, you create different formats that are optimized for a particular family of hardware. There can always be a basic format as a fallback, but you should cover the large percentage of devices with formats optimized for them.
@edwardspencer9397
2 years ago
@lhxperimental The "large percentage of devices" argument is no longer true. Businesses always prefer users who have medium or high-end phones/devices capable of hardware acceleration, because the others owning low-end phones are mostly poor people who have no intention of spending money on subscriptions or visiting advertisers. So even if a poor guy uninstalls something due to overheating issues, it shouldn't be a problem.
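The trade-off debated in this thread (serve a format the device can decode in hardware, and fall back to a basic format otherwise) can be sketched in a few lines. This is only an illustration: the codec preference order, the fallback choice, and the function name `pick_codec` are assumptions, not anything stated in the video or the comments.

```python
from typing import Iterable

# Assumed preference order, most storage-efficient first (hypothetical).
PREFERRED_CODECS = ["av1", "hevc", "h264"]
# Assumed baseline format that every device can at least decode in software.
FALLBACK_CODEC = "h264"


def pick_codec(hardware_decoders: Iterable[str], stored_codecs: Iterable[str]) -> str:
    """Return the most efficient stored codec the device decodes in hardware.

    Falls back to the baseline codec when no stored variant matches the
    device's hardware decoders.
    """
    hardware = {codec.lower() for codec in hardware_decoders}
    stored = {codec.lower() for codec in stored_codecs}
    for codec in PREFERRED_CODECS:
        if codec in hardware and codec in stored:
            return codec
    return FALLBACK_CODEC


if __name__ == "__main__":
    # A device with HEVC and H.264 hardware decoders, video stored in all three.
    print(pick_codec(["hevc", "h264"], ["av1", "hevc", "h264"]))
```

The more variants are stored, the more devices get a hardware-friendly stream, which is exactly the storage-versus-compatibility tension the commenters disagree about.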
Excellent video ! Thanks Yogita for putting yourself out there for our benefit.
This was probably the best video so far. Please try to make more such videos
Thanks so much Sen-sei
I LOVE THIS VIDEO!!! You brought a pro and the back and forth brings that dual insight
This video is so good. It is so helpful for talking to an engineering manager.
@pratikpatil5452
2 years ago
Liar, it's nowhere near real-world projects...!! Although they are really good, it only gives us an idea of an MVP and of how to crack interviews!! Real-world scenarios are much worse and terrifying👻😱!!
I love this video and got to know the system design approach at least at a basic level.
Kudos on this interview. So refreshing to see a mock sys design on youtube where the interviewer takes it seriously, challenges, questions and pushes the decisions of the interviewee.👏
Two of my fav youtubers on system design
Hey Gaurav, this is very helpful for freshers and people with 1-2 years of experience in this field, because this is how we deal with upper management. I always get those diagrams and do my implementation based on them, but only now do I know how they come to the conclusion of what needs to be done. Thanks for this. 👍
That was really amazing... like how smoothly she explains bits and pieces of the problem. loved it. Learned a lot. . . Thanks a lot for this content guyz.
@gkcs
1 year ago
You're very welcome!
There should be more sessions like this. It's super helpful. I loved it!
Thanks a lot Gaurav for this extremely useful video. I must appreciate Yogita for this very detailed system design and component choices right from the queue, S3, CDN, Diff DB's, etc were awesome and especially the processing part of the video via workers. Thank you both!!
It was too good! Informative. Hoping to see more such videos. Thanks Gaurav and Yogita.
Fantastic video, guys! Thanks so much for sharing! Very insightful!
Amazing video!!! Learnt a lot. The parallel workflow thing blew my mind. I thought it could be done later on, maybe post the original upload in a slower way. But that matrix thing was amazing!!
Great discussion. Yogita, huge respect. The way you explained the different choices you made is an eye-opener for people like me who are going to take the bull by the horns soon. Subscribed to your channel as well. Thank you Gaurav.
I'm just 10 minutes in the video and it's already great! Thank you for this! :D
Scrolling tiktok for 45 min. - No Watch whole video for 45 min. - Yes, it's great.
Great video... The way she used all of her info and Gaurav summarized, it is just great in a short time. Thank you
The best mock I saw in my 2 months studying for my interview.
super informative , sudoCode effort was really great. Keep making more such content, lets take airbnb as next system.
Awesome stuff ! Thanks for this, Gaurav !
When i started watching i thought ill quit in between but the session was so nice and non boring and interactive that I watched the hole video thanks a lot for this
@ashish7516
2 years ago
this video was not on hole, are you sure watched this video only ?
I am watching this video after almost 2 years. Thanks for uploading these kinds of videos, they are very helpful.
@gkcs
4 months ago
Thank you!
Excellent session very helpful..u guys r actual heroes for dev like us..
amazing, thank you both for this
Very helpful. Have used all the knowledge gathered so far in the playlist. Thanks for sharing this discussion!
@gkcs
1 year ago
You're welcome!
This is the way to learn system design with respect to requirements
One of the best videos to understand system design. Thanks guys
This video is amazing guys, great work
really enjoyed the session and also learned new things, keep uploading more
Thank you so much, Gaurav and Yogita. I got to learn a lot from this particular video. Please keep posting such videos for the community. Thanks again.
She came really prepared for this question! Didn’t she 😂 she was playing back what she prepped really nicely for this video. Great stuff folks 👍
A few ideas!
- Most requests are for videos that are trending, and trends die in about a month. So instead of storing all the transcoded files, we can have a live transcoder and store the result in a cache (or CDN) with a TTL of about a month (the exact time can be decided by data analysis). Twitter did this and was able to save millions on storage costs.
- We can keep live websockets open with the online users, so that whenever the video is complete we can notify them, and maybe also the users who were tagged or who are very engaged with an account.
- Instead of dividing videos into chunks after receiving the whole video, let the client do the chunking and upload chunks only. This would result in far fewer failures: if an upload fails after 95% of the video has been uploaded, you don't need to re-upload the entire file.
- Maybe have caches on top of the databases.
@VikashSharmaVS
1 year ago
S3 also has multiple storage tiers. You can set a rule to move files to a lower tier after a set time, and further down after that.
@mostaza1464
1 year ago
Agree with chunking the video on the client side!
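The client-side chunking idea from this thread can be sketched as follows. It is a minimal illustration, not the design from the video: `FakeChunkStore` is a hypothetical in-memory stand-in for the upload service, the tiny `CHUNK_SIZE` is chosen only for demonstration, and the `fail_at` parameter exists purely to simulate a dropped connection.

```python
import hashlib
from typing import Dict, List, Optional, Set

CHUNK_SIZE = 4  # bytes; unrealistically small, for demonstration only


def split_into_chunks(data: bytes, chunk_size: int = CHUNK_SIZE) -> List[bytes]:
    """Split the video bytes into fixed-size chunks (the last may be shorter)."""
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]


class FakeChunkStore:
    """Hypothetical in-memory stand-in for the server-side upload service."""

    def __init__(self) -> None:
        self.chunks: Dict[int, bytes] = {}

    def put(self, index: int, chunk: bytes, checksum: str) -> None:
        # Reject chunks that were corrupted in transit.
        if hashlib.sha256(chunk).hexdigest() != checksum:
            raise ValueError(f"checksum mismatch for chunk {index}")
        self.chunks[index] = chunk

    def received(self) -> Set[int]:
        return set(self.chunks)

    def assemble(self, total_chunks: int) -> bytes:
        return b"".join(self.chunks[i] for i in range(total_chunks))


def upload(data: bytes, store: FakeChunkStore, fail_at: Optional[int] = None) -> bool:
    """Upload only the chunks the server does not have yet.

    Returns True once every chunk is stored, False if the simulated
    connection drops at chunk index `fail_at`.
    """
    already_stored = store.received()
    for index, chunk in enumerate(split_into_chunks(data)):
        if index in already_stored:
            continue  # resume: skip chunks from an earlier attempt
        if index == fail_at:
            return False  # simulated network failure
        store.put(index, chunk, hashlib.sha256(chunk).hexdigest())
    return True
```

If the first attempt fails on the last chunk, the retry sends only that chunk, which is the "no need to re-upload 95% of the file" behaviour described in the comment.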
This was very informative, thank you !
By watching this video I have fallen in love with System Design 😅
this is so good, thank you Gaurav and Yogita!
Thanks Yogita and Gaurav, looking forward to more such videos
this video is just so precious . many thanks
Thanks Gaurav Sen & Yogita for informative contents. You guys are great. I was looking for such videos since long time. Finally found one. Thanks again.
@gkcs
5 months ago
Our pleasure!
This is so practical and relevant. Thank you.
Great video as always Gaurav. Well done. Look forward to more such interviews. :)
Very helpful discussion around databases. Thanks Yogita and Gaurav!
I searched so many videos for the difference between SQL and NoSQL but didn't understand the use case. Here I got a clear picture of the use case for NoSQL. Thanks for this, keep posting your videos, especially Yogita
i think the integrations of s3/cdn and cache/cdn are something i would like to learn more as a followup. Great video btw!
Amazing ....u guys rock...thanks for sharing , waiting for more 🙂🙂
One of the best videos on this channel.
This is really informative. Good job folks. Looking for more sessions like these.
Long time subscriber of Yogita's channel here!
Thanks @gaurav for making such an extremely handy and useful video. Kudos for that. 👍 Can we please have a part 2 of this video where you discuss: 1. Exception handling and reporting. 2. A ballpark estimate for each component of this system. 3. What strategy to use a month or a year later to decrease the load on the file system.
Many thanks for sharing. It is helpful to see the chain of thoughts, when architecting the solution.
Awesome thanks Gaurav and Yogita 👍
Really enjoyed it... Thanks a lot... So much knowledge in a 45 min video.
More of this please! ♥️
Wow, this is so awesome!
Very good for some one who is interested in designing solutions...hits the basics really hard.
This was really nice discussion, AWS has got a good endorsement…. On a lighter note
I learned a lot from this video. Thank you very much.
Good one, @yogita explained very well.
Fabulous video.. Thank you @Gaurav and @Yogitha
amazing video...You should do videos like these more often....
Wow, it was really great. I had been waiting a long time for this kind of video to understand in detail how system design discussions are done, which you did. Thank you both so much. I request you to come up with similar videos for different complex use cases like banking or insurance, etc.
Very Informative! Thanks for sharing
Thank you Gaurav for the video. These kinds of interactive videos explore more and more questions and help in understanding SD
Great take at the design problem. :) However, I'd have a different approach for replication. We're replicating the video in S3 for 2 reasons:
1. Fault tolerance
2. Latency due to geographical location
I'd suggest replicating to far fewer S3 locations, and that too only for (1). To tackle (2) we can have this approach:
1. Buffer around 1 second or so of the video on the device upfront.
2. When the user starts watching the video, lazily load the rest of the video in chunks.
The buffering strategy further depends on (to name a few):
1. Device network quality
2. Prediction of potential videos which the user might want to watch, based on some ranking algorithm
Also, regarding hot video metadata caching:
1. We can cache the API response at the CloudFront end.
2. Redis can also be used alternatively. Redis might be a better approach here because it is distributed, and if the video is deleted/modified by the OP then we can update it accordingly.
@kanuj.bhatnagar
2 years ago
"1. We can cache the API response at the CloudFront end." -> AWS has the Global Accelerator for this purpose. It's costly, but if you're ingesting ~1.2TB of videos every day, you can afford it.
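The metadata-caching point in this thread (serve hot video metadata from a cache, but update it when the uploader deletes or modifies the video) can be sketched as a small TTL cache with explicit invalidation. This is an in-process stand-in written for illustration, not Redis or CloudFront; the class name `MetadataCache`, the `load` callback, and the injectable `clock` are all assumptions.

```python
import time
from typing import Callable, Dict, Tuple


class MetadataCache:
    """In-memory TTL cache for video metadata (a stand-in for a shared cache)."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._entries: Dict[str, Tuple[float, dict]] = {}

    def get(self, video_id: str, load: Callable[[str], dict]) -> dict:
        """Return cached metadata, calling `load` on a miss or after expiry."""
        now = self._clock()
        entry = self._entries.get(video_id)
        if entry is not None and entry[0] > now:
            return entry[1]
        value = load(video_id)  # e.g. a database read in a real system
        self._entries[video_id] = (now + self._ttl, value)
        return value

    def invalidate(self, video_id: str) -> None:
        """Drop an entry, for example when the uploader edits or deletes the video."""
        self._entries.pop(video_id, None)
```

The TTL bounds how stale an entry can get on its own, while `invalidate` covers the delete/modify case the commenter raises, where waiting for expiry would serve outdated metadata.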
Gaurav sir, you were clean bowled. The interviewer was impressed throughout. Thanks so much for the efforts.
Ultimate knowledge 🔥
Thanks a lot for this awesome content 🙏
Thanks for this!
This is my first system Design video that I watch till end 😅
Wow very very educative !! Big ups !!
This video is very informative , thanks to both of u .
Great discussion...The most important parts starts at 19:20 and 38:04 to be specific
Great job, thanks!
Very useful video! Thank you
Hi, first of all, thank you both so much for sharing how things work. I wish you the best for your future
Very well designed ... Loved it 👍
Awesome content !!
Amazing video....lot of questions were addressed. This duo should do a video series covering other case studies like : stock broker platform , uber , whatsapp etc
@gkcs
2 years ago
kzread.info/dash/bejne/qKqcpZhtmLTAfc4.html
This concept of video is awesome
Super one, good work you both 👍
wow Yogita is a real pro It was amazing !!!!
Great video. Made me like and subscribe within 3 mins
Are there any more such videos? Love to see that 🙂. Very insightful.
Coincidentally Akamai CDN was down just a few days after this video was uploaded
I read that some people have already talked about this. As another solution for this requirement, I feel you need not wait for all the formats and resolutions to become available one at a time. You can push them to a queue, and then a worker group can keep processing them. This allows more parallelism. In this way the video with the lower resolution/size can be made available for preview, while the UI can show the uploader that the rest are being processed. Otherwise, the original video can be uploaded directly and the format and resolution part can be handled later. Many times we edit the videos. Once all formats are available, the video can be made viewable to the public.
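The queue idea in this comment can be sketched with a priority queue that hands out the smallest resolutions first, so a preview variant is ready before the expensive ones. This is an illustrative sketch under assumptions: the resolution and format lists and the function names are hypothetical, and a real pipeline would use a distributed queue with many workers rather than a single in-process heap.

```python
import heapq
import itertools
from typing import List, Tuple

# Hypothetical target variants; the video does not fix these exact values.
RESOLUTIONS = [1080, 720, 480, 240]
FORMATS = ["mp4", "webm"]

Job = Tuple[int, int, str, str]  # (height, sequence number, video id, format)


def build_jobs(video_id: str) -> List[Job]:
    """Create one transcode job per (resolution, format), cheapest first."""
    sequence = itertools.count()  # tie-breaker keeps insertion order stable
    heap: List[Job] = []
    for height in RESOLUTIONS:
        for fmt in FORMATS:
            heapq.heappush(heap, (height, next(sequence), video_id, fmt))
    return heap


def drain(heap: List[Job]) -> List[Tuple[int, str]]:
    """Pop jobs in priority order, as a single worker would pick them up."""
    order = []
    while heap:
        height, _, _video_id, fmt = heapq.heappop(heap)
        order.append((height, fmt))
    return order
```

With this ordering, the uploader could be shown a preview as soon as the first low-resolution job finishes, while the remaining variants are still being processed.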
Fantastic video.
Nice session !!
Hey Gaurav, Love to see this amazing and informative video. Please make more mock interviews video. All the best and Happy Deepawali 💥
Great video! One piece of feedback: I didn't see any use of the 1.2TB figure you calculated. A translation into how many servers (with resources like CPU, RAM, disk, IO, etc.) would be needed for the ingestion pipeline as well as storage would have been helpful. Also, some interesting scenarios like thundering herd and data compression to reduce cost would have been of great help. And don't you think putting all the videos in the CDN would be cost heavy? There should be some strategy based on popularity/recency/TTL to upload videos to and remove them from the CDN.
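As a rough follow-up to this comment, the ingestion figure can be turned into storage and bandwidth numbers with simple arithmetic. The inputs are taken from the comments above (about 1.2 TB of raw video per day, and about 6x storage for the different resolutions and formats); the replication factor of 3 is purely an assumption for illustration, and the function names are hypothetical.

```python
import math


def storage_estimate_tb(raw_tb_per_day: float, variant_multiplier: float,
                        replicas: int, days: int) -> float:
    """Total storage in TB after `days` of ingestion."""
    return raw_tb_per_day * variant_multiplier * replicas * days


def average_ingest_mb_per_second(raw_tb_per_day: float) -> float:
    """Average raw upload bandwidth, using decimal units (1 TB = 1,000,000 MB)."""
    seconds_per_day = 24 * 60 * 60
    return raw_tb_per_day * 1_000_000 / seconds_per_day


if __name__ == "__main__":
    yearly_tb = storage_estimate_tb(1.2, variant_multiplier=6, replicas=3, days=365)
    print(f"Storage after one year: {yearly_tb:.0f} TB")
    print(f"Average ingest rate: {average_ingest_mb_per_second(1.2):.1f} MB/s")
```

Under these assumptions the yearly total comes to roughly 7.9 PB, while the average raw ingest rate is only about 14 MB/s, which suggests storage and transcoding, rather than upload bandwidth, dominate the server count the commenter asks about.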
You guys are amazing
Thank you for this video