System Design for Flash Sales: Sharding vs. Zookeeper vs. Redis vs. Kafka

Science & Technology

In this video, we dive into the exciting world of flash sales during festival seasons. Discover how limited stocks of popular items like iPhones and MacBook Pros are booked in the blink of an eye. We explore four high-level system design approaches to ensure fair and efficient booking processes. From using Kafka and Redis to sharding and Apache Zookeeper, we compare the advantages and limitations of each approach. Join us to uncover the secrets behind successful flash sale booking systems! Don't forget to like and comment if you find this information helpful! In this comprehensive breakdown, we also address key challenges like preventing overselling and ensuring response times of under 100 milliseconds. Whether you're a tech enthusiast or preparing for interviews, this video provides valuable insights into building robust systems for flash sales. Don't miss out on mastering the art of booking your favourite gadgets!
Credits
1. Some references taken from here: System Design Fight Club: • Google Interview Quest...
2. A good article on ZooKeeper: github.com/saptarshibasu/zook...
3. How Redis counters are not consistent: github.com/nateware/redis-obj...
4. Apache ZooKeeper multi-instance setup: • Apache Zookeeper clust...
5. Apache ZooKeeper data models: • Zookeeper Data Model |...
6. Apache ZooKeeper playlist: • apache zookeeper
7. Redis increment counter: redis.io/commands/incr/
Chapters
00:00 Intro
01:49 Requirement
05:08 Kafka
09:29 Redis counter
15:50 Zookeeper
23:06 Sharded counter
26:26 Comparison

Comments: 38

  • @mohitshaw1025
    5 months ago

    Thanks for this video. I watched multiple videos on this topic, and this has been one of the most insightful ones.

  • @anubhav_shrivastava

    5 months ago

    thank you very much. It means a lot.

  • @ankitg200

    1 month ago

    agreed

  • @deepakjain248
    4 months ago

    I didn't quite understand what you mean by storing counts for each consumer. Or are you storing the inventory count in the DB (Approach 1)? Isn't that the same as the final approach then?

  • @deepakjain248
    4 months ago

    Why can't we have a separate inventory counter for each Redis instance in approach 2?

  • @santoshbanerjee3407
    8 months ago

    Very comprehensive coverage of the problem space. Nicely done! The comparison matrix shared at the end is really helpful. Regarding the Zk-based approach, I'd like to add that typically Zk is not kept on high-volume transaction paths. It's mostly used for global consensus and cluster-membership kinds of use cases (e.g. coordinating with RegionServers in HBase, and similarly for Hadoop).

  • @anubhav_shrivastava

    7 months ago

    Thanks for adding this information. Yeah, maybe that is the reason I could not find a convincing reference for using ZooKeeper in such cases.

  • @akshayparseja4718
    7 months ago

    The issue with Redis where a network problem could cause an undersell: how will that get resolved? Won't this happen with ZooKeeper too?

  • @anubhav_shrivastava

    4 months ago

    You are technically correct. But an advantage of ZooKeeper is that ZooKeeper clients are configured with a comma-separated list of ZooKeeper servers, so the chances of a majority of nodes going down or of losing a transaction are lower.
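
    For reference, a minimal sketch of what that looks like from the client side, assuming the kazoo Python client (just one common ZooKeeper library; the host names and znode path are made up). The client is handed the whole ensemble, so it can fail over to another server as long as a quorum is still alive:

```python
from kazoo.client import KazooClient

# Comma-separated ensemble: the client connects to any reachable server and
# transparently reconnects to another one if that server dies.
zk = KazooClient(hosts="zk1:2181,zk2:2181,zk3:2181")
zk.start()

# Counter recipe backed by a znode; writes go through the leader and are
# acknowledged only after a majority of servers have persisted them.
stock = zk.Counter("/flash-sale/iphone-15/stock", default=0)
stock += 200   # seed the sale
stock -= 1     # attempt to reserve one unit
print(stock.value)

zk.stop()
```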

  • @dhruvjainiitkgpcse
    5 months ago

    What would happen if a shard goes down in the final approach? How are we keeping track of the number of items sold, etc.? Also, is it just one SQL database with 200 as the value, or is there replication?

  • @anubhav_shrivastava

    4 months ago

    There will be replication as well, so a DB shard going down will not mean loss of data. You will know how many items are left to be sold.

  • @jlecampana
    8 months ago

    Great video Anubhav! Question: how do you actually handle the counter for the final "sharding" solution? Is the schema basically 2 tables, Inventory (counter) and Orders? Do we need some type of row locking in there somewhere? Thank you for this fantastic lesson :)

  • @anubhav_shrivastava

    4 months ago

    Yes, two tables: one for the count and the other for orders. I don't think we need row-level locking for the counts.
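
    For anyone curious, a minimal sketch of that two-table layout on a single shard (SQLite is used purely for illustration; the table and column names are made up). The conditional UPDATE is what keeps concurrent buyers from driving the count below zero, with no application-level locking:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE inventory (item_id TEXT PRIMARY KEY, remaining INTEGER);
    CREATE TABLE orders    (order_id INTEGER PRIMARY KEY AUTOINCREMENT,
                            item_id  TEXT,
                            user_id  TEXT UNIQUE);    -- one purchase per user
    INSERT INTO inventory VALUES ('iphone-15', 200);  -- this shard's slice of the stock
""")

def place_order(user_id: str, item_id: str = "iphone-15") -> bool:
    """Reserve one unit and record the order in a single transaction on this shard."""
    with conn:  # commit on success, roll back on error
        cur = conn.execute(
            "UPDATE inventory SET remaining = remaining - 1 "
            "WHERE item_id = ? AND remaining > 0",
            (item_id,),
        )
        if cur.rowcount == 0:        # no row updated -> sold out on this shard
            return False
        conn.execute("INSERT INTO orders (item_id, user_id) VALUES (?, ?)",
                     (item_id, user_id))
        return True

print(place_order("user-42"))  # True while stock remains
```

    If the UPDATE touches zero rows, this shard is sold out and the request can be rejected or redirected.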

  • @susantaghosh504

    2 months ago

    @@anubhav_shrivastava If multiple threads are trying to decrease the count simultaneously in the same shard, then how will consistency be maintained?

  • @thegt
    9 months ago

    Love the terminal. You must be a Matrix fan. Dark black background with the green from the Matrix.

  • @vipulgrover1473
    11 months ago

    What advantages do we get by dividing our database into 3 shards instead of having a single one?

  • @anubhav_shrivastava

    11 months ago

    When you make an insert query to the database it takes a lock, and other transactions wait for that lock to be released. It will work fine up to a certain scale, but there will be a point at which your database starts to slow down.

  • @rakes568
    9 months ago

    Umm, what about Redis locks or Postgres advisory locks? We can acquire a lock for every request and decrement a counter in a relational DB. A caveat I can see is that requests will get processed sequentially, but the locks should be held for a very short duration, so it might not be a problem. This is what's generally followed even for buying a product from a regular inventory system, to ensure an order gets placed only if inventory is available.
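
    A rough sketch of that pattern, assuming Postgres with the psycopg2 driver (any driver would do; the table and key names are invented). The advisory lock serializes buyers of one item for the duration of the transaction and is released automatically at commit or rollback:

```python
import psycopg2

conn = psycopg2.connect("dbname=shop")  # assumed connection string

def try_buy(item_key: int) -> bool:
    with conn, conn.cursor() as cur:
        # Block until this transaction holds the per-item advisory lock;
        # pg_advisory_xact_lock releases it automatically at commit/rollback.
        cur.execute("SELECT pg_advisory_xact_lock(%s)", (item_key,))
        cur.execute(
            "UPDATE inventory SET remaining = remaining - 1 "
            "WHERE item_key = %s AND remaining > 0",
            (item_key,),
        )
        return cur.rowcount == 1   # False once the item is sold out
```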

  • @anubhav_shrivastava

    8 months ago

    I think this should work!

  • @soumik76
    5 days ago

    Cassandra uses Version Vectors if I am not wrong, not CRDTs

  • @anubhav_shrivastava

    2 days ago

    Hey Soumik, I read the same when I was reading Cassandra in Action. A quick search shows that Cassandra does use CRDTs: www.baeldung.com/java-conflict-free-replicated-data-types

  • @vipulgrover1473
    11 months ago

    Why didn't we use DynamoDB directly for storing the counter?

  • @anubhav_shrivastava

    11 months ago

    I am not sure about DynamoDB, but there are many other ready-made solutions as well, like Google Spanner and Cassandra, which should also work fine.

  • @SemihSahinCS

    3 months ago

    I think by default both DynamoDB and Cassandra are eventually consistent, which leads to overselling. By configuring them with strong consistency, or by using Spanner, we should be able to solve this though.
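
    A small sketch of how that could look with DynamoDB and boto3 (the table name, key, and attribute names are assumptions). The condition expression makes the decrement a guarded, atomic write, so the counter cannot go negative even under heavy concurrency:

```python
import boto3
from botocore.exceptions import ClientError

table = boto3.resource("dynamodb").Table("inventory")  # assumed table layout

def reserve_one(item_id: str) -> bool:
    try:
        # The write is rejected server-side if stock has already reached zero,
        # so "check then decrement" happens as one atomic operation.
        table.update_item(
            Key={"item_id": item_id},
            UpdateExpression="SET stock = stock - :one",
            ConditionExpression="stock > :zero",
            ExpressionAttributeValues={":one": 1, ":zero": 0},
        )
        return True
    except ClientError as err:
        if err.response["Error"]["Code"] == "ConditionalCheckFailedException":
            return False   # sold out
        raise
```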

  • @munnukumar5038
    2 months ago

    There is one problem with solution 4. Let's assume the counter range for shard 1 is exhausted. We need some kind of coordinator service, such as ZooKeeper, which will deregister that shard node so that its requests are shifted to some other shard using consistent hashing. Even after this, there is one edge case: what if some user had already bought the product while mapped to shard 1 but is now mapped to shard 2? The primary key constraint would not work here, and that user could buy twice.

  • @haridotvenkat
    10 months ago

    Regarding the last technique (DB sharding), if the hashing function does not guarantee uniform distribution of keys, it would end up facing an undersell issue. Having said that, the solution with ZooKeeper is the best, but not so popular!

  • @anubhav_shrivastava

    9 months ago

    Yeah. I have not yet seen anyone using ZooKeeper independently. It is mostly used along with some Apache project.

  • @santoshbanerjee3407

    8 months ago

    The non-uniform distribution issue could be addressed by consistent hashing, as has already been mentioned in the video. The key feature to exploit in consistent hashing to ensure near-uniform distribution is virtual nodes.
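
    A toy illustration of that point (the shard names and virtual-node count are arbitrary): each shard is placed on the ring many times, and keys then spread close to evenly across the shards:

```python
import bisect
import hashlib
from collections import Counter

def _hash(key: str) -> int:
    return int(hashlib.md5(key.encode()).hexdigest(), 16)

class ConsistentHashRing:
    """Toy hash ring: each shard appears on the ring many times (virtual nodes),
    which is what smooths out the key distribution."""

    def __init__(self, shards, vnodes_per_shard=100):
        self._ring = sorted(
            (_hash(f"{shard}#{i}"), shard)
            for shard in shards
            for i in range(vnodes_per_shard)
        )
        self._points = [point for point, _ in self._ring]

    def shard_for(self, key: str) -> str:
        idx = bisect.bisect(self._points, _hash(key)) % len(self._ring)
        return self._ring[idx][1]

ring = ConsistentHashRing(["shard-1", "shard-2", "shard-3"])
print(Counter(ring.shard_for(f"user-{i}") for i in range(30000)))
# With 100 virtual nodes per shard, the three counts come out close to 10000 each.
```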

  • @anubhav_shrivastava

    8 months ago

    @@santoshbanerjee3407 That's correct. In fact, I created a video on consistent hashing: kzread.info/dash/bejne/dZegmtiCZZiserA.html Do check it out in your free time.

  • @pankajk9073

    2 months ago

    ZK is good for read-heavy use cases. It could not handle the write requirements of flash sales.

  • @BoldCoffee2013
    4 months ago

    Q: Why do organisations have a different implementation for flash sales? If they create a new item in the inventory DB with 1000 iPhones, the order service will take care of everything as part of the regular order flow.

  • @anubhav_shrivastava

    4 months ago

    That's true. If you look closely, this solution is nothing but how to maintain a count at large scale. It can definitely be used in regular inventory management.

  • @prateek_jesingh
    6 months ago

    Nice and well-made video, Anubhav. Just the explanation could have been clearer and more detailed. More software engineers should subscribe to this channel.

  • @lalitbhagtani01
    2 months ago

    Your explanation of Redis is not correct. Distributed Redis is a sharded database based on consistent hashing. The problem you have stated arises in the case of a leaderless replication solution like Cassandra or DynamoDB. Distributed Redis will behave just like the sharded database in your solution 4.
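
    To illustrate the counter behaviour on Redis: since a single key always lives on one node (or one slot of a Redis Cluster), a small server-side script can make the check-and-decrement atomic. A sketch with redis-py; the key name is made up:

```python
import redis

r = redis.Redis()  # a single node, or one slot owner in a Redis Cluster

# "Check then decrement" runs atomically on the server, so two concurrent
# buyers cannot both pass the check for the last remaining unit.
RESERVE = r.register_script("""
    local left = tonumber(redis.call('GET', KEYS[1]) or '0')
    if left <= 0 then
        return -1
    end
    return redis.call('DECR', KEYS[1])
""")

r.set("stock:iphone-15", 200)
remaining = RESERVE(keys=["stock:iphone-15"])
print("sold out" if remaining == -1 else f"reserved, {remaining} left")
```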

  • @davendrasingh518
    11 months ago

    Your database will be overwhelmed in case you have billions of users and few items; say, at Amazon scale, everyone wants to purchase an iPhone X.

  • @anubhav_shrivastava

    11 months ago

    It is not a single database, but a cluster. AWS uses DynamoDB, which is a distributed database, for most of their use cases. Likewise there is Google Spanner. As I said at 22:17, there are ready-made solutions, but here I am trying to give a sense of the challenges and considerations you need for the problem statement. I found yesterday's post by AWS interesting; it gives an idea of the scale involved and how well DynamoDB scales during the Prime Day sale: aws.amazon.com/blogs/aws/prime-day-2023-powered-by-aws-all-the-numbers/

  • @feelyourbeat7820
    5 months ago

    Worst explanation

  • @anubhav_shrivastava

    5 months ago

    Sad to know. Happy to have a chat on LinkedIn if you think something could be improved.
