How Instagram efficiently serves HashTags ordered by count

Ғылым және технология

System Design for SDE-2 and above: arpitbhayani.me/masterclass
System Design for Beginners: arpitbhayani.me/sys-design
Redis Internals: arpitbhayani.me/redis
Build Your Own Redis / DNS / BitTorrent / SQLite - with CodeCrafters.
Sign up and get 40% off - app.codecrafters.io/join?via=...
In the video, I delved into how Instagram efficiently searches for hashtags using partial indexes in PostgreSQL. Instagram relies heavily on PostgreSQL for user-created data. By utilizing partial indexes, Instagram optimized hashtag searches by indexing only popular hashtags with over 100 photos, reducing the index size significantly. This optimization technique improved query performance, showcasing the power of understanding database features like partial indexes. Applying simple first principles led to a substantial enhancement in database performance, demonstrating the importance of database knowledge in optimizing production systems.
Recommended videos and playlists
If you liked this video, you will find the following videos and playlists helpful
System Design: • PostgreSQL connection ...
Designing Microservices: • Advantages of adopting...
Database Engineering: • How nested loop, hash,...
Concurrency In-depth: • How to write efficient...
Research paper dissections: • The Google File System...
Outage Dissections: • Dissecting GitHub Outa...
Hash Table Internals: • Internal Structure of ...
Bittorrent Internals: • Introduction to BitTor...
Things you will find amusing
Knowledge Base: arpitbhayani.me/knowledge-base
Bookshelf: arpitbhayani.me/bookshelf
Papershelf: arpitbhayani.me/papershelf
Other socials
I keep writing and sharing my practical experience and learnings every day, so if you resonate then follow along. I keep it no fluff.
LinkedIn: / arpitbhayani
Twitter: / arpit_bhayani
Weekly Newsletter: arpit.substack.com
Thank you for watching and supporting! it means a ton.
I am on a mission to bring out the best engineering stories from around the world and make you all fall in
love with engineering. If you resonate with this then follow along, I always keep it no-fluff.

Пікірлер: 58

  • @rahulsarkar4206
    @rahulsarkar4206 Жыл бұрын

    Man, you are next level. Your chanel is a binge watch for engineers.

  • @xskrish
    @xskrish Жыл бұрын

    5:31 "this is the beauty of Instagram, they never optimise" - truth has been spoken 😅

  • @AsliEngineering

    @AsliEngineering

    Жыл бұрын

    I admire them for this. They keep things extremely simple. Simple systems scale.

  • @dipankarkumarsingh
    @dipankarkumarsingh Жыл бұрын

    Finished ✅ ... ( Self Note ) : New Learning: 1) It's great to see the query execution plan .. an insight can help a lot in optimization ( till now I just knew the query execution plan in theory ... How to access it was new to me ) 2) partial Indexing can effectively be used in tail graph ... The idea of partial Indexing is similar to Heavy Light Decomposition ( CP Background) ( it is used in graph ) so yeah .. cool to see a idea getting implemented in lite version and gaining Huge performance. Now I think ... I can try to use Heavy Light Decomposition in Vector databases 🤔 (a must try 😁 thing )

  • @okcodingbeast
    @okcodingbeast7 ай бұрын

    Hi Arpit, your approach to engineering is truly inspiring. Despite the intimidating jargon like Partial Indexes, distributed transactions, etc, your emphasis on first principles makes Computer Science feel like Common Sense. Keep up the great work and continue to inspire us. Thanks!

  • @rajatiitr
    @rajatiitr Жыл бұрын

    I AM LATE TO FIND THIS CHANNEL, BUT LUCKY..... THIS IS ASLI ENGINEERING CHANNEL. THANKS ARPIT SIR FOR PROVIDING SUCH CONTENT.

  • @j-stormz5472
    @j-stormz547211 ай бұрын

    I’m really feeling lucky to have found your channel. Keep up the good work !

  • @tawseefbhat977
    @tawseefbhat977 Жыл бұрын

    as always concise n effective.. love it

  • @shantanutripathi
    @shantanutripathi Жыл бұрын

    That's a perfect usecase for Partial Indexing! Great Video.

  • @dennysam829
    @dennysam829 Жыл бұрын

    Amazing way of explaining!

  • @piyushverma1219
    @piyushverma1219 Жыл бұрын

    Simple yet so effective

  • @gambo9124
    @gambo9124 Жыл бұрын

    Thanks Arpit!!

  • @shishirchaurasiya7374
    @shishirchaurasiya7374 Жыл бұрын

    Got some really nice insights from the topic explained Arpit sir you are really amazing in explaining and breaking down hard things into simpler chunks

  • @utkarshshrivastav700
    @utkarshshrivastav700 Жыл бұрын

    GEM. Thanks for such awesome and mindblowing content.

  • @MohitSharma-yu2gt
    @MohitSharma-yu2gt Жыл бұрын

    Brilliant Insights 🙌

  • @yadneshkhode3091
    @yadneshkhode3091 Жыл бұрын

    Thank you for providing good content ❤️👍

  • @kjgopika8936
    @kjgopika8936 Жыл бұрын

    Thanks for this video.

  • @VM-ue6hu
    @VM-ue6hu Жыл бұрын

    too good !!!

  • @teetanrobotics5363
    @teetanrobotics5363 Жыл бұрын

    Hey, this course is not for beginner-friendly. The first video provided an intro but from the next video onwards we went straight into partial indexing which most people aren't aware of. If possible, could you also insert some basic videos, just to get the audience up to speed. Thanks for the amazing explanations though. You're extremely knowledgable and a great presenter.

  • @lakshjain6959
    @lakshjain69594 ай бұрын

    Do we have something similar to partial indexing in SQL server?

  • @saifulhasan2532
    @saifulhasan25326 ай бұрын

    What if we can have another boolean column which tell us whether hashtag popular or not. Default value of this column is false but Whenever hashtag count value reached to certain threshold(in this case 100) then it will update to true. What is pros and cons of this method over the partial indexing?

  • @ankurbansal18
    @ankurbansal189 ай бұрын

    Bhaiya What if I want to seach a tag with media < 100 ? (Like what if its not a popular tag and is a part of long tail)

  • @5590priyank
    @5590priyank Жыл бұрын

    Is it same as functional indexes?

  • @oshogarg5215
    @oshogarg5215 Жыл бұрын

    very good explanation but as a beginner I dont know about what are indexing , partial indexing in database ... will figure it out through Google , thanks.

  • @gaurangmittal5563
    @gaurangmittal5563 Жыл бұрын

    Hi Arpit bhaiya saw this video and even read the medium article, i just had one question , the article content is 10 years old, do you still think they will be using the same optimization techniques on the hashtags services or it could have changed by now.

  • @AsliEngineering

    @AsliEngineering

    Жыл бұрын

    does that even matter? what matters is the key concept we learned from it.

  • @AmanGupta-fb1fz
    @AmanGupta-fb1fz Жыл бұрын

    Hi, instead of creating a new partial index, what if they would have fired the sql query with count >= 100 filter directly? That would also sort around 169 rows only, right? Won't that be similar in performance as with partial indexes.

  • @AsliEngineering

    @AsliEngineering

    Жыл бұрын

    No. Because to power that you would have to create index on all data and that would make it bulky because of the long tail putting a stress on the database.

  • @prasathkrish924

    @prasathkrish924

    Жыл бұрын

    @@AsliEngineering Didn't the where works first, so after filtering the count>=100, there will only be 169 rows. So sorting only those rows is just enough right? Asking out of curiosity

  • @balasravandindukurthi4702

    @balasravandindukurthi4702

    Жыл бұрын

    @Aman, @Prasanth - I think Arpit's point here is to support the range queries efficiently you would have to create index on the table. Lets say you create index on all the rows then the index would be huge resulting in stress on database. Since we anyways want to show top hashtags only we can index on only rows that have high media count. This is what we call as partial index (creating index only on subset of rows that matches our filter criteria)

  • @atishayjain5321
    @atishayjain5321 Жыл бұрын

    hii arpit bhaiya can you please make the video on collaborative software like git automerge how they works?

  • @AsliEngineering

    @AsliEngineering

    Жыл бұрын

    There is a podcast coming on Collaborative Editors 2 weeks from now. You can find it on my channel home page.

  • @raj_kundalia
    @raj_kundalia Жыл бұрын

    interesting

  • @pankajvermacr7
    @pankajvermacr75 ай бұрын

    hey, is there a way to achieve this in Other Databases like MySQL, or MongoDB ?

  • @AsliEngineering

    @AsliEngineering

    5 ай бұрын

    Mongo does support partial index but MySQL does not.

  • @nishaaa_maurya
    @nishaaa_maurya3 ай бұрын

    If Instagram first sort on the basis of media_count. So, sorting on the basis of media count will not take heavy resource ? If not, please explain, It would be a great help. Thank You !!

  • @AsliEngineering

    @AsliEngineering

    3 ай бұрын

    Sorting is an extremely expensive operation. Even keeping a secondary index that is ordered by this ever changing attribute will lead to multiple tree rebalances degrading the database performance.

  • @mahendars1417
    @mahendars14175 ай бұрын

    Can we not just say include only those tags where media count is greater than 100 in the original query ??? I'm not great with databases, please forgive my naive question.

  • @5590priyank
    @5590priyank Жыл бұрын

    So if we create partial index with >100, if our query has >500 then also database would be smart enough to reuse above index?

  • @kamalhm-dev

    @kamalhm-dev

    Жыл бұрын

    Yes, because it's a subset of >100, it will also work if you use >1000 or even bigger

  • @adityasanthosh702
    @adityasanthosh702 Жыл бұрын

    Why did MySQL did not implement Partial Indexing? Also, can you compare between MySQL and PostgreSQL? What is the better database of those two

  • @AsliEngineering

    @AsliEngineering

    Жыл бұрын

    both has some advantages and disadvantages. more than performance there are a ton of other factors that decides which one is picked.

  • @anishkelkar6434
    @anishkelkar6434 Жыл бұрын

    The prices in the video shown are around 15% lower than the actual price offered on the website! Is there any discount code which we can encash?

  • @AsliEngineering

    @AsliEngineering

    Жыл бұрын

    They are old promotion videos. Yet to record the new one. Will be recording the new one today. There is no discount. I feel they are unfair to people who paid in full.

  • @SaketAnandPage
    @SaketAnandPage10 ай бұрын

    How the new tags which entered the count > 100 will get indexed?

  • @AsliEngineering

    @AsliEngineering

    10 ай бұрын

    Upon every update the database will see if an index needs to be updated or not. So partial indexes (like regular indexes) will be kept updated upon every commit.

  • @SaketAnandPage

    @SaketAnandPage

    10 ай бұрын

    Thanks @@AsliEngineering

  • @random4573
    @random4573 Жыл бұрын

    What is the cost of keeping index upto to date??

  • @AsliEngineering

    @AsliEngineering

    Жыл бұрын

    Nothing extra. It operates like a regular index just with an added filter.

  • @nehagour6928
    @nehagour6928 Жыл бұрын

    Such an amazing optimisation Instagram applied , Arpit brilliant insights

  • @gokulyc
    @gokulyc5 ай бұрын

    Good explanation. source link?

  • @AsliEngineering

    @AsliEngineering

    5 ай бұрын

    Instagram engineering blog.

  • @rjphotos2393
    @rjphotos2393 Жыл бұрын

    Isn't the article 10 yrs old? Are they still using these techniques?

  • @AsliEngineering

    @AsliEngineering

    Жыл бұрын

    I don't know if they are using the same technique or not. Most probably not but the optimization was pretty cool. It shows how partial indexes can be used to power queries over long tail distribution.

  • @rjphotos2393

    @rjphotos2393

    Жыл бұрын

    @@AsliEngineering Yup, true that!

  • @akash-kumar737
    @akash-kumar737 Жыл бұрын

    Thanks man I am about to work on a product and was stuck at autocomplete feature. Was looking to go with custom solution but now will use this feature of Postgresql. 😊

  • @dpm-07
    @dpm-07 Жыл бұрын

    Most awaited video. after watching this video 'itni kushi ..... intni kushi....' ref: kzread.info/dash/bejne/mYegsplqldvIoLQ.html

  • @dpm-07

    @dpm-07

    Жыл бұрын

    please make a video on heavy hitters with Twitter trending Example. Many of us has doubts related to scaling, choosing right ds or db. or how you can in prove those ....

Келесі