How indexes work in Distributed Databases, their trade-offs, and challenges

Science and Technology

In the video, I explained how indexing works in a distributed database like DynamoDB and the challenges it brings. I discussed creating different types of indexes, such as global secondary indexes, for efficient querying. By using global secondary indexes, queries can be optimized by directly accessing the required data shards. I also touched on local secondary indexes for specific query patterns. Maintaining indexes can be costly, and the choice between global and local secondary indexes depends on the query requirements and consistency needs. The video ended with a suggestion to explore further on index creation using B+ Trees.
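
As a rough illustration of the description above, the following sketch contrasts a scatter-gather query with a lookup through a global secondary index. Everything in it is an assumption for illustration: the blog table sharded by `author_id`, the `category` attribute, the two-shard layout, and all function names are hypothetical, and in-memory dictionaries stand in for real shards. In a real system the index would itself be partitioned by `category`; here it is a single dictionary to keep the sketch short.

```python
from collections import defaultdict

NUM_SHARDS = 2  # assumed shard count, for illustration only

# Each data shard maps the primary key (author_id, blog_id) to a row.
data_shards = [dict() for _ in range(NUM_SHARDS)]

# Hypothetical global secondary index: category -> list of primary keys.
# Storing the full primary key (including the partition key author_id)
# lets the router find the owning shard without asking every shard.
gsi_by_category = defaultdict(list)


def shard_for(author_id):
    """Pick the data shard that owns an author (simple modulo placement)."""
    return author_id % NUM_SHARDS


def put_blog(author_id, blog_id, category, title):
    """Write the row to its data shard and keep the index in sync."""
    row = {"author_id": author_id, "blog_id": blog_id,
           "category": category, "title": title}
    data_shards[shard_for(author_id)][(author_id, blog_id)] = row
    # The extra write below is the maintenance cost of the index.
    gsi_by_category[category].append((author_id, blog_id))


def query_scatter_gather(category):
    """Without an index: scan every shard and merge the results."""
    rows = []
    for shard in data_shards:
        rows.extend(r for r in shard.values() if r["category"] == category)
    return rows


def query_with_gsi(category):
    """With the index: visit only the shards that hold matching rows."""
    rows, shards_touched = [], set()
    for author_id, blog_id in gsi_by_category.get(category, []):
        shard_id = shard_for(author_id)
        shards_touched.add(shard_id)
        rows.append(data_shards[shard_id][(author_id, blog_id)])
    return rows, shards_touched


put_blog(1, 101, "mysql", "Indexes 101")
put_blog(2, 102, "redis", "Persistence")
put_blog(3, 103, "mysql", "Replication")

rows, shards_touched = query_with_gsi("mysql")
print([r["blog_id"] for r in rows], shards_touched)
```

Both query functions return the same rows; the difference is that the indexed path only contacts the shards named by the index entries, while the scan contacts every shard regardless of where the matches live.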

Comments: 39

  • @ozmenta9444 · 4 months ago

    Making sure in-depth, quality content reaches everyone is what separates you from the rest, who are making money just by dwelling on the surface. The word "thanks" alone can't show the gratitude of the many, including me, who have benefited a lot. I hope this continues forever!!

  • @swati12091993 · 16 days ago

    Thanks Arpit, for making such videos. After watching a couple of videos on the internals of a database, including yours, I have started enjoying learning about how things work in the background. Thanks for your effort!

  • @prateekraj1084 · 3 months ago

    Instead of reading multiple blogs, going through your vlog saves time and brings interest back to the topic.

  • @prashantrajgor03 · 4 months ago

    How will creating GSIs solve the two major problems: (1) a shard is slow, (2) a shard is dead?

  • @shubhamkumar6383 · 4 months ago

    Hi Arpit, big fan! The database topic you explained in your System Design playlist was exactly what I was asked in my interview at INDIA MART for a Technical Lead position; it seems the interviewer and I studied from the same place 😅. Many of the challenges thrown at me in the Director of Engineering round came from the microservices playlist. I was able to clear both rounds because of your videos. Thanks a ton!

  • @AsliEngineering · 4 months ago

    this is such great news 🔥 Many many congratulations Shubham 🙌

  • @PranitKothari · 4 months ago

    Amazing. Nice detailed explanation!

  • @ishantsagar1759 · 4 months ago

    Very well explained, Arpit. Before watching this video, I was literally confused as to why the Partition Key is always required to create an LSI. I understand the complete picture now 👌

  • @harshitgangwar4500 · 4 months ago

    Very well explained ❤ Learned something new today :) Gonna dig a little deeper into this.

  • @techwithgd · 4 months ago

    Thanks for this video. We too are planning to work on sharding/partitioning in a few months and would love to take on this project.

  • @shwetashetye8254 · 3 months ago

    Absolutely awesome content!

  • @aniruddhadeshmukh9445 · 1 month ago

    fantastic video

  • @raj_kundalia · 2 months ago

    Thank you so much!

  • @VerywellPeople-bs7ol · 4 months ago

    Good video ❤

  • @abhaykatiyar3539 · 4 months ago

    Sharding can be done in relational or non-relational databases, but I think non-relational DBs are preferred because they have less overhead. For example, performing a join on a sharded DB is kind of a nightmare, but since NoSQL is imperative, you specify the join logic in the application code and handle it there. In a nutshell, SQL has a join feature that is hard to make work when the DB is sharded, whereas NoSQL has no such concept, so sharding makes much more sense there.

  • @sankuM · 4 months ago

    What is meant by 'imperative' here? Does NoSQL handle joins very differently?

  • @pratikdey8062 · 22 days ago

    awesome

  • @riteeksrivastava6157 · 7 days ago

    Hi Arpit, thanks for explaining the concept. I have one question regarding the global secondary index: what if the cardinality of the secondary attribute is very high, as with a `created_at` kind of field? Will sharding the index on that value scale? I also need to read more about it, but I would like to know your opinion.

  • @mohammedsafiahmed1639 · 4 months ago

    Is an LSI a separate object than the main data itself? Can't we sort the main data itself by author key and the secondary attribute? That is, inside each node the data would be sorted by author and then by the secondary attribute.

  • @vivek2319 · 4 months ago

    What I feel about your KZread channel is that even if someone cannot afford your courses and still watches all the videos (which are FREE, btw!), they are more likely to ace the interviews.

  • @AsliEngineering · 4 months ago

    yes. and also ace their career. it is just that I go slightly more practical and in-depth than this in my courses helping people build the right intuition.

  • @tarunstv796 · 4 months ago

    Hey Arpit, Great content! Is there a video on distributed sequence generator?

  • @shouryagupta6969 · 4 months ago

    I'm just curious: what if, in the global secondary index, we stored the page_no (the actual on-disk storage page number) instead of the row_id (or primary key)? This would speed up reads that go through GSIs a bit, since it skips the step of looking up the key in the data shard and can access the data directly by page number. I understand that the performance difference might not be huge, but in some niche, over-optimized scenarios this might come in handy. The downside, I believe, is that index creation would take somewhat longer, but IMO that can be written off.

  • @harshchiki7796 · 1 month ago

    Which app do you use to write and present in this (and other) videos (on the iPad)? Thanks for the great content, btw!

  • @rohitreddy6794 · 4 months ago

    Thanks

  • @AsliEngineering · 4 months ago

    Thank you so much Rohit 🙌

  • @tesla1772 · 4 months ago

    In the first case, where we store blog_id (the primary key) in the GSI, we get a list of blog_ids when we query for a particular category. How do we then know which DB shard each blog_id resides in, given that sharding is based on author id?

  • @Raja-kl4op · 4 months ago

    Hi Arpit, same doubt here, Could you please help us with this one.

  • @chinmaykhamkar7372 · 4 months ago

    +1

  • @makarandpundlik1083 · 4 months ago

    I think there is confusion between author_id (which he described as the partition key) and blog_id (which we are assuming to be the partition key).

  • @kelvingandhi4124 · 4 months ago

    +1 In that case, again there will be data collection from all DB shards and combining results as blog_ids are spread across multiple shards ! Don't see any difference from actually submitted query... 🤔

  • @PrateekSaini · 4 months ago

    With the naive implementation, the DB routing layer fired queries to both shards and merged the results (scatter-gather). How does a GSI change that? The data still resides on the data nodes, so the routing layer still has to fire queries to both nodes. How does it solve anything?

  • @karanchatwani5180 · 3 months ago

    The first approach was querying the main shard with the category key which was not indexed, hence more latency. The second approach was querying the main shard with the primary key (user id) which is always indexed as it is a primary key, hence less latency.

  • @aqilaghamirzayev8189 · 4 months ago

    Thanks for the good explanation. But is it OK to use SQL for storing blog data? Isn't NoSQL OK? Which specific database would you recommend for storing blog data?

  • @AsliEngineering · 4 months ago

    SQL works like a charm. No need to unnecessarily go for NoSQL solutions unless your data becomes massive.

  • @ShaikhZahid349 · 4 months ago

    Which video should I start with for system design?

  • @piyushpathak1186 · 4 months ago

    But how does the global secondary index solve the problem of one of the shards being slow or dead?

  • @AsliEngineering · 4 months ago

    It makes pagination and querying efficient. If you store the complete data in the GSI, then it removes the need to query the data shards.
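
The reply above can be sketched roughly as follows. This is a hypothetical in-memory model, not DynamoDB's API: the names `covering_gsi`, `put_blog`, and `query_by_category` and the two-shard layout are assumptions made for illustration. The index stores a full copy of each row, so a category query is answered from the index alone, even when a data shard is unavailable.

```python
from collections import defaultdict

# Two hypothetical data shards keyed by (author_id, blog_id).
data_shards = [dict(), dict()]

# Covering index: category -> full copies of the matching rows.
covering_gsi = defaultdict(list)


def put_blog(author_id, blog_id, category, title):
    """Write the row to its shard and copy it into the index."""
    row = {"author_id": author_id, "blog_id": blog_id,
           "category": category, "title": title}
    data_shards[author_id % len(data_shards)][(author_id, blog_id)] = row
    covering_gsi[category].append(dict(row))  # duplicate storage


def query_by_category(category):
    """Answer from the index only; no data shard is contacted."""
    return list(covering_gsi.get(category, []))


put_blog(1, 101, "mysql", "Indexes 101")
put_blog(3, 103, "mysql", "Replication")

# Simulate shard 1 (which owns authors 1 and 3) going down.
data_shards[1] = None

titles = [row["title"] for row in query_by_category("mysql")]
print(titles)
```

The trade-off in this sketch is the one the description mentions: every write now lands in two places, and the index holds a second copy of the data that has to be kept consistent with the shard.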

  • @eatajerkpal99 · 3 months ago

    hey arpit where can i find the notes you are presenting, from all videos?
