Deep Dive Into the Repository Design Pattern in Python

In this video, I’ll take a closer look at the repository design pattern in Python. This is a very useful pattern that allows you to keep your data storage separate from your data operations.
💡 Get my FREE 7-step guide to help you consistently design great software: arjancodes.com/designguide
🔥 GitHub repository: git.arjan.codes/2024/repository
💻 ArjanCodes Blog: www.arjancodes.com/blog
✍🏻 Take a quiz on this topic: www.learntail.com/quiz/rvgics
Try Learntail for FREE ➡️ www.learntail.com/
🎓 Courses:
The Software Designer Mindset: www.arjancodes.com/mindset
The Software Architect Mindset: Pre-register now! www.arjancodes.com/architect
Next Level Python: Become a Python Expert: www.arjancodes.com/next-level...
The 30-Day Design Challenge: www.arjancodes.com/30ddc
🛒 GEAR & RECOMMENDED BOOKS: kit.co/arjancodes.
👍 If you enjoyed this content, give this video a like. If you want to watch more of my upcoming videos, consider subscribing to my channel!
Social channels:
💬 Discord: discord.arjan.codes
🐦Twitter: / arjancodes
🌍LinkedIn: / arjancodes
🕵Facebook: / arjancodes
📱Instagram: / arjancodes
♪ Tiktok: / arjancodes
👀 Code reviewers:
- Yoriz
- Ryan Laursen
- Dale Hagglund
🎥 Video edited by Mark Bacskai: / bacskaimark
🔖 Chapters:
0:00 Intro
0:37 Repository code example
4:13 About the pattern
8:04 Better software testing
9:36 Warnings and Caveats
11:24 Final thoughts
#arjancodes #softwaredesign #python
DISCLAIMER - The links in this description might be affiliate links. If you purchase a product or service through one of those links, I may receive a small commission. There is no additional charge to you. Thanks for supporting my channel so I can continue to provide you with free content each week!

Comments: 108

  • @ArjanCodes
    4 months ago

    💡 Get my FREE 7-step guide to help you consistently design great software: arjancodes.com/designguide

  • @yurykliachko1815
    5 months ago

    this is a good pattern when your entity (Post in this case) is stored partially in different storages (sql DB + cloud storage, sql db + nosql DB etc), it hides all the complexity. Thank you for this guide!

  • @ArjanCodes

    @ArjanCodes

    4 months ago

    Glad you enjoyed the topic!

  • @FernandoCordeiroDr
    5 months ago

    I had to use this pattern recently. I was working on a Django app that had to work with both MongoDB and Postgres' PGVector. I created a repository for each, and then a factory function that determines which repo to use based on environment variables. These repos are then used inside the methods of normal Django models. The main benefit is that adding an integration with another vector database is just a matter of creating a new repository.
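The factory approach described in this comment could look roughly like the sketch below. Everything in it is an assumption made for illustration: the `VECTOR_BACKEND` variable name, the class names, and the dict-backed placeholders that stand in for real MongoDB and PGVector clients so that the example runs without a database.

```python
import os
from typing import Dict, List, Mapping, Optional, Protocol, Type


class VectorRepository(Protocol):
    """Hypothetical interface that every backend must satisfy."""

    def add(self, key: str, vector: List[float]) -> None: ...

    def get(self, key: str) -> Optional[List[float]]: ...


class _DictBackedRepository:
    """Shared in-memory storage so the sketch runs without a database."""

    def __init__(self) -> None:
        self._items: Dict[str, List[float]] = {}

    def add(self, key: str, vector: List[float]) -> None:
        self._items[key] = vector

    def get(self, key: str) -> Optional[List[float]]:
        return self._items.get(key)


class MongoVectorRepository(_DictBackedRepository):
    """Placeholder for a MongoDB-backed repository (no real driver used)."""


class PgVectorRepository(_DictBackedRepository):
    """Placeholder for a PGVector-backed repository (no real driver used)."""


# Registry of known backends; supporting a new database means adding one entry.
BACKENDS: Dict[str, Type[_DictBackedRepository]] = {
    "mongo": MongoVectorRepository,
    "pgvector": PgVectorRepository,
}


def make_repository(env: Optional[Mapping[str, str]] = None) -> VectorRepository:
    """Pick a repository based on the (assumed) VECTOR_BACKEND variable."""
    env = os.environ if env is None else env
    name = env.get("VECTOR_BACKEND", "mongo")
    if name not in BACKENDS:
        raise ValueError(f"Unknown vector backend: {name!r}")
    return BACKENDS[name]()
```

Passing the environment as an optional mapping is a design choice for testability: callers use `make_repository()` in production and `make_repository({"VECTOR_BACKEND": "pgvector"})` in tests.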

  • @notead
    5 months ago

    Hey Arjan, I just want to say thank you. I was able to land a job in data engineering thanks to your course and your videos on design patterns. Seeing your approach to building applications finally made it click for me that learning a language is the "easy" part, and that understanding _how to think about systems_ not only makes me a better developer - but is a super important, generalizable skill that goes beyond just programming. Maybe that's obvious for many, but I am really grateful for that insight.

  • @ArjanCodes

    @ArjanCodes

    4 months ago

    It's an absolute pleasure hearing about your success story and your learning journey, thank you for letting me be part of it! Best of luck :)

  • @Djellowman
    5 months ago

    Great no-nonsense video!

  • @rohailtaimourInc
    5 months ago

    Hi @arjancodes, I’ve been really enjoying your videos and specifically how you always focus on how to test the code you demonstrate. Thank you for your content. I was wondering if you can cover testing functions that are decorated? They pose an interesting challenge and I didn’t find it to be straightforward to test such use cases

  • @markasiala6355
    5 months ago

    I also have used this pattern without knowing it simply by focusing on decoupling and dependency injection. I have an abstract data class and an abstract FileIO class. Gives me flexibility on how I load data into the class or write it out. This helps me track changes in the data when I compare versions of the output data (e.g., I read in data from a user-friendly Excel file but write it out to pipe-delimited text, JSON, or YAML output where a simple diff tells me what changed).

  • @adjbutler
    5 months ago

    I love your pattern videos! (I will even allow you to make up your own patterns) Or do PART 2, 3, 4 on previous patterns! Your videos are amazing! Thank you

  • @alexandarjelenic2880

    @alexandarjelenic2880

    5 months ago

    Or combining patterns, or more example of solving the same issue with various approaches.

  • @notead

    @notead

    5 months ago

    I agree! It would also be really cool to see more videos of him refactoring projects into using design patterns, especially hearing him discuss why he makes certain choices, the considerations and thoughts that cross his mind when making them.

  • @mhotzel
    5 months ago

    I always use this kind of Repository, but I didn't know, that I follow a pattern 😀. Thank you.

  • @ArjanCodes

    @ArjanCodes

    4 months ago

    Glad the video was helpful!

  • @VashdyTV
    5 months ago

    Great guide as always!

  • @ArjanCodes

    @ArjanCodes

    4 months ago

    Thank you so much!

  • @axeldelsol8503
    5 months ago

    This pattern is also very useful when you are wrapping an API offering CRUD routes for resources. Great video!

  • @Nalewkarz

    @Nalewkarz

    5 months ago

    It's much better suited to your use case than his.

  • @rrwoodyt
    5 months ago

    I like the separation and abstraction. It would have been interesting to see you make a class that could handle a generic dataclass, but that's beyond the scope of what you were trying to show. Maybe next time...

  • @2006pizzaboy15
    5 months ago

    You can also look at the Unit of Work pattern that often goes hand in hand with Repository.
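For readers who have not met Unit of Work, here is a minimal sketch of the idea: a context manager that commits when a block of repository calls succeeds and rolls back when it fails. It assumes `sqlite3` with its default transaction handling; the `SqliteUnitOfWork` and `add_post` names and the `posts` table are made up for the example.

```python
import sqlite3


class SqliteUnitOfWork:
    """Groups several writes into one transaction (illustrative only)."""

    def __init__(self, conn: sqlite3.Connection) -> None:
        self.conn = conn

    def __enter__(self) -> "SqliteUnitOfWork":
        return self

    def __exit__(self, exc_type, exc, tb) -> bool:
        if exc_type is None:
            self.conn.commit()  # every write in the block succeeded
        else:
            self.conn.rollback()  # undo all writes made inside the block
        return False  # never swallow the original exception


def add_post(conn: sqlite3.Connection, title: str, content: str) -> int:
    """Stand-in for a repository method that writes without committing."""
    cur = conn.execute(
        "INSERT INTO posts (title, content) VALUES (?, ?)", (title, content)
    )
    return cur.lastrowid
```

Used as `with SqliteUnitOfWork(conn): ...`, either all inserts in the block are persisted or none are, which keeps commit logic out of the repository itself.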

  • @SeliverstovMusic
    5 months ago

    I use a repository on top of SQLAlchemy. I have a base repo class with CRUD functions. For every table, I create a new subclass, and all CRUD operations become available for that table. Magic =)
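A rough sketch of the "base class plus one subclass per table" idea follows. Note that it swaps in the standard-library `sqlite3` module instead of SQLAlchemy to stay dependency-free, so it is not the commenter's actual code; the `table` and `columns` class attributes are an assumed convention.

```python
import sqlite3
from typing import Any, Optional, Tuple


class BaseRepository:
    """Generic CRUD helpers; subclasses only name their table and columns."""

    # Table and column names come from class attributes, never from user
    # input, which is why interpolating them into the SQL text is acceptable.
    table: str = ""
    columns: Tuple[str, ...] = ()

    def __init__(self, conn: sqlite3.Connection) -> None:
        self.conn = conn

    def add(self, **values: Any) -> int:
        cols = [c for c in self.columns if c in values]
        placeholders = ", ".join("?" for _ in cols)
        cur = self.conn.execute(
            f"INSERT INTO {self.table} ({', '.join(cols)}) VALUES ({placeholders})",
            [values[c] for c in cols],
        )
        return cur.lastrowid

    def get(self, id: int) -> Optional[tuple]:
        return self.conn.execute(
            f"SELECT id, {', '.join(self.columns)} FROM {self.table} WHERE id = ?",
            (id,),
        ).fetchone()

    def update(self, id: int, **values: Any) -> None:
        cols = [c for c in self.columns if c in values]
        assignments = ", ".join(f"{c} = ?" for c in cols)
        self.conn.execute(
            f"UPDATE {self.table} SET {assignments} WHERE id = ?",
            [*(values[c] for c in cols), id],
        )

    def delete(self, id: int) -> None:
        self.conn.execute(f"DELETE FROM {self.table} WHERE id = ?", (id,))


class PostRepository(BaseRepository):
    """All CRUD behaviour is inherited; only the metadata is declared."""

    table = "posts"
    columns = ("title", "content")
```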

  • @dadoo94000
    4 months ago

    Thank you Arjan. I use this pattern in FastAPI: endpoint layer > service layer (logic etc.) > repository layer, with FastAPI dependencies between these layers. I like it. In practice, I often need more than simple CRUD operations and add them to my repository layer. I don't think that's a good idea. Maybe we should create these different methods in the services. But I like this pattern. When I need to call an external API, I also create a repository for that. For me, repo = access to data.

  • @devilslide8463
    5 months ago

    I particularly appreciate the ease of mocking this repository. It's very convenient for testing the logic of services that utilize the repository class.

  • @ArjanCodes

    @ArjanCodes

    4 months ago

    I'm glad you enjoyed this design pattern!

  • @prinsniels
    5 months ago

    I use the pattern a lot, but in a more general way. I tend to write things on the basis of interfaces; combining that with dependency injection makes things easy to test and allows for composable programs and great flexibility. I tend to stay away from ORMs. For me they add an extra layer of complexity to programs, and in analytics it quickly ends in writing straight SQL through your ORM, so cutting out the middleman seems wise 😅 Thanks for the video!

  • @oscarmulin114

    @oscarmulin114

    5 months ago

    Agree with avoiding ORMs 100%.

  • @bachkhoahuynh9110

    @bachkhoahuynh9110

    4 months ago

    In data-centric applications, you can stay away from ORMs, but if your team uses an object-oriented domain model, ORMs are especially useful.

  • @obsidiansiriusblackheart
    5 months ago

    I find like most patterns, I have used this before but didn't know the name. Thanks for this awesome video! Your channel really helps me better understand coding and jargon in the field (I have ~10 years coding xp and 6/7 years work xp)

  • @ArjanCodes

    @ArjanCodes

    4 months ago

    I'm really happy to hear that these types of videos have been useful! :)

  • @barefeg
    4 months ago

    Awesome. Maybe follow-ups could cover how to define filters in your get_all that are not tied to SQL (e.g. the specification pattern), as well as handling transactions with the unit of work pattern.
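One way to picture storage-agnostic filters is to treat a specification as a plain predicate over the entity. The sketch below makes that assumption and applies it to an in-memory repository; the helper names (`title_contains`, `min_length`, `all_of`) are invented here. A SQL-backed repository would need to translate the same specifications into WHERE clauses, which this sketch does not attempt.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Post:
    id: int
    title: str
    content: str


# A specification is just a predicate over a Post.
Spec = Callable[[Post], bool]


def title_contains(text: str) -> Spec:
    return lambda post: text.lower() in post.title.lower()


def min_length(n: int) -> Spec:
    return lambda post: len(post.content) >= n


def all_of(*specs: Spec) -> Spec:
    """Combine specifications with a logical AND."""
    return lambda post: all(spec(post) for spec in specs)


class InMemoryPostRepository:
    def __init__(self, posts: List[Post]) -> None:
        self._posts = list(posts)

    def get_all(self, spec: Optional[Spec] = None) -> List[Post]:
        # The repository never sees SQL, only the predicate it is handed.
        if spec is None:
            return list(self._posts)
        return [post for post in self._posts if spec(post)]
```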

  • @basedmuslimbooks
    5 months ago

    I love this - can you expand your repository design patterns to other databases ? Mongodb is something im struggling with. Or graph databases

  • @davidmasipbonet2508
    5 months ago

    Why do you need create_table to be a classmethod?

  • @ajflorido
    3 months ago

    Using this pattern with SQLAlchemy, you can load different models dynamically and use the same repo to get the data for different DB engines. For example, we have models for Postgres, Oracle and MySQL where some column definitions in SQLAlchemy are quite different, and we load the correct model dynamically within the repo itself, so you can also take this pattern a step further to decouple different engines. Thanks Arjan!

  • @nightcrawer
    4 months ago

    Hey Arjan! Thanks for the post. In more complex applications using DDD, is it good practice to separate the domain from the models? My repository returns a model, and my model knows how to convert itself into a domain object.

  • @dankprole7884
    5 months ago

    I use this for reading and writing dataframes. csv, parquet or pickle, local storage or s3. Same interface 😊

  • @sandeshgowdru8869
    4 months ago

    Thanks a lot for making videos. I was looking for an architecture for getting data from multiple sources and was looking into a combination of factory, strategy, etc., but this pattern is perfect for my needs. Once again, thanks a lot for sharing this....

  • @ArjanCodes

    @ArjanCodes

    4 months ago

    I'm glad this video was helpful for your current objectives :)

  • @SkielCast
    4 months ago

    In this case you have only a couple of columns, but using row_factory = sqlite3.Row and casting to dict would have allowed the **kwargs syntax, which is especially handy here. Maybe the code would be a little easier to follow that way.
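The suggestion above can be sketched as follows. The `Post` dataclass and the `posts` table are assumptions modelled on the discussion, not the video's exact code; `sqlite3.Row` and `row_factory` are standard-library features.

```python
import sqlite3
from dataclasses import dataclass
from typing import Optional


@dataclass
class Post:
    id: int
    title: str
    content: str


def get_post(conn: sqlite3.Connection, post_id: int) -> Optional[Post]:
    """Fetch one post and build the dataclass via keyword unpacking."""
    row = conn.execute(
        "SELECT id, title, content FROM posts WHERE id = ?", (post_id,)
    ).fetchone()
    # sqlite3.Row behaves like a mapping, so dict(row) yields column -> value.
    return Post(**dict(row)) if row is not None else None


conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # rows become name-addressable
conn.execute("CREATE TABLE posts (id INTEGER PRIMARY KEY, title TEXT, content TEXT)")
conn.execute("INSERT INTO posts (title, content) VALUES (?, ?)", ("Hello", "World"))
```

This only works cleanly when the selected column names match the dataclass field names, which is one reason to list the columns explicitly rather than rely on `SELECT *`.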

  • @edgeeffect
    5 months ago

    It's one of my favourite patterns and I so dearly wish we had it in the awful legacy app we've got at work.

  • @Vijay-Yarramsetty
    4 months ago

    thanks

  • @dalenmainerman
    5 months ago

    I actually used this one unintentionally. At the earliest stages of a project, all data was stored in a bunch of CSV files (not my idea, not my decision). Implementing all data-related operations with this pattern allowed me to migrate to a real database almost effortlessly.

  • @edgeeffect

    @edgeeffect

    5 months ago

    "not my idea, not my decision" ... is the (sad) story of our lives!

  • @sharkpyro93

    @sharkpyro93

    5 months ago

    i worked in a project of a national wide editors and magazines publisher company and they used some excel sheets as db, it was miserable

  • @Naej7

    @Naej7

    5 months ago

    @sharkpyro93 95% of the world's data is stored in Excel sheets…

  • @dalenmainerman

    @dalenmainerman

    5 months ago

    @sharkpyro93 I have to work with Google Sheets as a DB on my current project. Annoying af, trying to teach my colleagues to use real databases, wish me luck

  • @sharkpyro93

    @sharkpyro93

    5 months ago

    @dalenmainerman why do I feel like I know what your colleagues look like?

  • @BradleyBell83
    5 months ago

    Any reason why ABC was used as opposed to Protocol?

  • @aimbrock

    @aimbrock

    4 months ago

    Talk about coming full circle here... I went searching for this Protocol package you mention and found a blog article claiming that Protocol is better and everyone should abandon ABC. In that same blog article he links to an ArjanCodes video that maybe answers your question: kzread.info/dash/bejne/qqqWl8qAfNKxYKQ.html Having just discovered Repository Pattern and Unit of Work and now ABC and Protocol I have no perspective to offer but I thought it funny.

  • @bachkhoahuynh9110
    5 months ago

    The repository pattern is not about switching from SQL to NoSQL. We call this switching effect the persistence ignorance principle. The main thing to consider when using the repository pattern is that you want to decouple domain logic from infrastructure logic. A repository is usually backed by an ORM because when you use raw SQL, you eventually end up implementing some ORM features yourself, such as change tracking, proxies for lazy loading, ... I only use raw SQL for complex queries.

  • @klmcwhirter
    5 months ago

    A common misconception about design patterns is that the concrete implementations need to have the same method signatures (or implement the same interface). That simply is not true! The spirit of the Repository pattern is to decouple storage from business logic. If the storage strategy changes, the business logic layer should not have to change. See the Open-Closed principle for details.

    That is hard to accomplish if every Repository in your code base has the same set of method signatures, i.e., implements the same interface (er, Protocol, as you have taught us). The methods should instead implement a business function required by the layer calling into the storage layer.

    First of all, you should NEVER embed SQL statements in today's world. That is a huge design smell that will never pass code review in an enterprise context. Second, the functions in a Repository class should be elegantly "callable" from the business layer and not just implement CRUD methods. That is a wrong usage of the Repository design pattern.

    It is a misconception that an OR/M provides a Repository - that is not true! Session management and Repository implementations are different concerns and do not belong together. Except in "Hello, World" examples, I guess. Don't do that. It just does not work when you have a complex data concept involving hundreds of tables. Yep, those are normal in real-world use cases.

    Please think about the place where you may need to move functionality from a database to an API. That will give you a correct mental model of the Repository pattern: just encapsulate the behavior needed for the underlying operational storage mechanism. At the end of the day, it is a specialized kind of Adapter.

    I love your content @ArjanCodes, please keep doing what you are doing. But this one could have been presented better. I, as an educator myself, realize there are compromises that need to be made to simplify the introduction of (potentially) new concepts. But you went too far this time. Sorry.

  • @Jakub1989YTb
    5 months ago

    Why classmethods if you are not using them to create "instances" of the class? Didn't you mean staticmethods? This is very misleading.

  • @muzafferckay2609
    5 months ago

    Implementing the repository pattern is not about switching from SQL to NoSQL or vice versa. It decouples the business logic from the persistence layer, which can be an ORM or raw SQL. As you mentioned, implementing the repository pattern limits querying, updating, and so on. It is too hard to provide every feature that an ORM does. For example, you have to define comparison operators such as in, greater than, less than, etc. Your get method should take a set of relational fields to fetch, as it is going to be used in different places. You have to define OR, AND, and more complex queries. Basically, you have to define your own query language step by step as you need it, and translate your queries to the ORM. Otherwise you have to define too many different get methods for querying.

  • @Naej7

    @Naej7

    5 months ago

    It is about switching. By decoupling the business logic from the persistence layer, you can switch the persistence class (one for SQL, one for NoSQL).

  • @broomva
    5 months ago

    And how would you abstract away the SQL queries in the repository definition, so that different types of repository implementations could be made by changing something like a SQL objects template?

  • @FolkOverplay
    5 months ago

    Is there a special reason why the tests were not refactored to use parameterize?

  • @ChrisBNisbet
    5 months ago

    Hmm, do the tests you showed us test anything other than the mock class you created for the purpose of adding tests?

  • @joelffarthing

    @joelffarthing

    5 months ago

    Imagine an application function or use case that depends on a repository; You can inject the 'fake' Repository in your test instead of the version that uses a real database. That way, you have something that implements the expected interface, but doesn't actually require a real database. He talked about this but didn't actually show an example. Architecture Patterns With Python is a great book that goes over this and other patterns in detail.

  • @ChrisBNisbet

    @ChrisBNisbet

    5 months ago

    @joelffarthing Yep, I get all that.
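The point made in this thread, that the fake repository is a tool for testing other code rather than the thing under test, might be illustrated like this. The `publish_post` use case, the `PostRepository` protocol, and the in-memory fake are all hypothetical names chosen for the sketch.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Protocol


@dataclass
class Post:
    id: int
    title: str
    content: str


class PostRepository(Protocol):
    def get(self, id: int) -> Optional[Post]: ...

    def add(self, title: str, content: str) -> Post: ...


class InMemoryPostRepository:
    """Fake used only in tests; no database involved."""

    def __init__(self) -> None:
        self._posts: Dict[int, Post] = {}

    def get(self, id: int) -> Optional[Post]:
        return self._posts.get(id)

    def add(self, title: str, content: str) -> Post:
        post = Post(len(self._posts) + 1, title, content)
        self._posts[post.id] = post
        return post


def publish_post(repo: PostRepository, title: str, content: str) -> Post:
    """Hypothetical use case: the logic under test lives here, not in the fake."""
    cleaned = title.strip()
    if not cleaned:
        raise ValueError("title must not be empty")
    return repo.add(cleaned, content)


def test_publish_post_strips_title() -> None:
    repo = InMemoryPostRepository()
    post = publish_post(repo, "  Hello  ", "World")
    assert repo.get(post.id) == Post(1, "Hello", "World")
```

The assertions exercise `publish_post`; the fake merely records what the use case asked the storage layer to do.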

  • @user-pz3wg6ch9b
    4 months ago

    Why classmethod when it's not accessing any class instance variable also not returning the class? Can be a staticmethod right.

  • @Nalewkarz
    5 months ago

    You are not limited to Python 3.12 with this. You can also do it with older versions: just use T = TypeVar("T") and then inherit from Generic[T] in the repository. But I'll allow myself some criticism. This pattern is not very useful without a stricter Port/Adapter pattern where the repository is an implementation of a concrete interface. For simple CRUD you can just go with an ORM; it's not worth the effort.
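The pre-3.12 spelling mentioned above looks roughly like this. The method set is an assumption (in particular, `add` takes an item of type `T` here, which may differ from the video's signature), and the in-memory subclass exists only to make the sketch runnable.

```python
from abc import ABC, abstractmethod
from typing import Dict, Generic, List, Optional, TypeVar

T = TypeVar("T")


class Repository(ABC, Generic[T]):
    """Generic repository declared with TypeVar/Generic (no 3.12 syntax)."""

    @abstractmethod
    def get(self, id: int) -> Optional[T]: ...

    @abstractmethod
    def get_all(self) -> List[T]: ...

    @abstractmethod
    def add(self, item: T) -> int: ...


class InMemoryRepository(Repository[T]):
    """Concrete implementation used to show the generic parameter in action."""

    def __init__(self) -> None:
        self._items: Dict[int, T] = {}

    def get(self, id: int) -> Optional[T]:
        return self._items.get(id)

    def get_all(self) -> List[T]:
        return list(self._items.values())

    def add(self, item: T) -> int:
        new_id = len(self._items) + 1
        self._items[new_id] = item
        return new_id
```

In Python 3.12 and later, the same base class can be written as `class Repository[T](ABC): ...` without declaring `T` separately.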

  • @TheOnlyEpsilonAlpha
    4 months ago

    I used it lately without knowing that I used it and without the decorators. Wrote a file handling py for crud that way without specifying the content so it could be used to handle c.r.u.d.operations and is not stuck to a specific content type

  • @SkielCast
    4 months ago

    Wouldn't it make more sense for the PostRepository methods to take a Post object rather than kwargs? That way we could leverage typing.
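As a sketch of what this comment proposes, the repository below accepts and returns `Post` objects instead of keyword arguments. The frozen dataclass with an optional `id`, and the choice to return a new `Post` carrying the generated id, are assumptions made for this example rather than the video's design.

```python
import sqlite3
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class Post:
    title: str
    content: str
    id: Optional[int] = None  # None until the post has been stored


class PostRepository:
    def __init__(self, conn: sqlite3.Connection) -> None:
        self.conn = conn
        conn.execute(
            "CREATE TABLE IF NOT EXISTS posts "
            "(id INTEGER PRIMARY KEY, title TEXT, content TEXT)"
        )

    def add(self, post: Post) -> Post:
        cur = self.conn.execute(
            "INSERT INTO posts (title, content) VALUES (?, ?)",
            (post.title, post.content),
        )
        # Return a copy that carries the database-generated id.
        return replace(post, id=cur.lastrowid)

    def update(self, post: Post) -> None:
        if post.id is None:
            raise ValueError("cannot update a post that was never stored")
        self.conn.execute(
            "UPDATE posts SET title = ?, content = ? WHERE id = ?",
            (post.title, post.content, post.id),
        )

    def get(self, id: int) -> Optional[Post]:
        row = self.conn.execute(
            "SELECT title, content FROM posts WHERE id = ?", (id,)
        ).fetchone()
        return Post(title=row[0], content=row[1], id=id) if row else None
```

With this shape a type checker can flag a misspelled field at the call site, which `**kwargs: object` cannot do.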

  • @johnabrossimow
    5 months ago

    I wrote a class to access the filepaths in the project repository my app creates.

  • @CottidaeSEA
    4 months ago

    The repository is one of my favorites, because I really don't like it when database queries are tightly coupled with logic or the entities themselves.

  • @tihon4979
    5 months ago

    Cool! What about Unit of work pattern? ;)

  • @edgeeffect

    @edgeeffect

    5 months ago

    Yeah.... this is all starting to sound a little bit like "Doctrine" ... but that's PHP???????

  • @sarveshsawant7232
    5 months ago

    Great

  • @ArjanCodes

    @ArjanCodes

    4 months ago

    Thanks!

  • @Naej7
    5 months ago

    I use it every day, because I need a InMemory version for my tests

  • @PietroBrunetti
    5 months ago

    If I don't remember wrong, I saw it in the Cosmic Python book.

  • @vikingthedude
    5 months ago

    This looks like the strategy design pattern, applied to storage. Here, the SQLite storage is a specific strategy. Another strategy could be a remote storage. Am I understanding this right?

  • @Naej7

    @Naej7

    5 months ago

    I understand what you mean, but I can’t really say it is exactly the same thing. It does use the same mechanism though, which is essentially dependency injection

  • @thomaseb97

    @thomaseb97

    5 months ago

    Most patterns are conceptually similar, at least within the same category. There is very little difference between them; they just tackle somewhat specific tasks. If it's easier for you to imagine it as the strategy pattern, go for it.

  • @peterlogg5576
    3 months ago

    Is there a reason to make the parent `Repository` an `ABC` rather than a `Protocol`? We generally use `ABC` for our repositories but I'm curious if there's a reason not to implement it as a Protocol instead?
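For comparison, a Protocol-based version of a repository interface might look like the sketch below; the method set and class names are assumptions. The practical difference is that an `ABC` refuses to instantiate a subclass that is missing an abstract method, while a `Protocol` relies on a static type checker (or, with `runtime_checkable`, an `isinstance` check that only verifies the methods exist).

```python
from typing import Dict, Optional, Protocol, runtime_checkable


@runtime_checkable
class PostRepository(Protocol):
    """Structural interface: any class with these methods qualifies."""

    def get(self, id: int) -> Optional[str]: ...

    def add(self, title: str) -> int: ...


class DictPostRepository:
    """Note: no inheritance from PostRepository is needed."""

    def __init__(self) -> None:
        self._titles: Dict[int, str] = {}

    def get(self, id: int) -> Optional[str]:
        return self._titles.get(id)

    def add(self, title: str) -> int:
        new_id = len(self._titles) + 1
        self._titles[new_id] = title
        return new_id


def shout_title(repo: PostRepository, id: int) -> str:
    """Client code depends only on the protocol, not on a concrete class."""
    title = repo.get(id)
    return "" if title is None else title.upper()
```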

  • @feldinho
    5 months ago

    Using this pattern, how would you deal with N+1 problems? Imagine you have posts with authors; it's easy to get an author with no posts or a post with no authors, but what about retrieving both together? Calling each other's repository would lead to infinite recursion while using joins in both repos would lead to duplicate logic. How would you solve this?

  • @NotNullReference

    @NotNullReference

    5 months ago

    You can add as many methods as you need. The repository pattern is just a way to abstract and decouple the data access logic from the business logic. So, if you need the posts and authors, you create a query that returns both. Which repository the method goes in depends on the "dependent" side of the query, because these are two different requests:
    - "I need the author of this book": you need a BookWithAuthor class with the author as the dependent side, so a join from book to author with a WHERE on BookId.
    - "I need the books of this author": you need an AuthorBook class with the book as the dependent side, so a join from author to book with a WHERE on AuthorId.
    Whichever your case, you create the method on the strong side: in the first case that is the BookRepository, in the second the AuthorRepository.
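A minimal sketch of the join-based approach described in this reply, using the posts/authors example from the original question. The schema, the `PostWithAuthor` read model, and the method name are assumptions; the point is that one query replaces the N+1 lookups.

```python
import sqlite3
from dataclasses import dataclass
from typing import List


@dataclass
class PostWithAuthor:
    """Read model combining both tables (hypothetical name)."""

    post_id: int
    title: str
    author_name: str


def create_schema(conn: sqlite3.Connection) -> None:
    conn.executescript(
        """
        CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE posts (
            id INTEGER PRIMARY KEY,
            title TEXT,
            author_id INTEGER REFERENCES authors(id)
        );
        """
    )


class PostRepository:
    def __init__(self, conn: sqlite3.Connection) -> None:
        self.conn = conn

    def get_all_with_author(self) -> List[PostWithAuthor]:
        # One JOIN instead of one author lookup per post.
        rows = self.conn.execute(
            "SELECT p.id, p.title, a.name "
            "FROM posts AS p JOIN authors AS a ON a.id = p.author_id "
            "ORDER BY p.id"
        ).fetchall()
        return [PostWithAuthor(*row) for row in rows]
```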

  • @Nalewkarz

    @Nalewkarz

    5 months ago

    It's not very useful without the rest of the hexagonal architecture building blocks. Basically, you just won't do it the way you think. You need some facade like "use cases" or a "service", and then pack database objects into entities, which in this case would be aggregates because they consist of two different related types of objects. Just imagine a DAO with prefetched related objects. I can recommend a very good book about such an implementation: "Implementing the Clean Architecture" by Sebastian Buczynski.

  • @bentosalvador336

    @bentosalvador336

    4 months ago

    Hey man, your question is very common, but it is also very simple to answer. A repository is NOT made for "queries" or for fetching data performantly. A repository should be used to persist the state of an "aggregate". From my point of view, you should have only the methods needed to retrieve the aggregate; then you modify it and send it back to the repository to persist it. To avoid this kind of confusion about how to use repos, take a look at the CQRS concept.

  • @luscasleo
    5 months ago

    I didn't know that it's possible to declare generic classes using brackets like that, without even needing to declare the TypeVar T. Which Python version is that?

  • @maephisto

    @maephisto

    5 months ago

    3.12

  • @Naej7

    @Naej7

    5 months ago

    The newest version 😉

  • @MrLotrus
    5 months ago

    It hurts when adding transactions

  • @maephisto
    5 months ago

    Around 5:40 we see that the new class Repository is a generic class that returns T (via get), list of T (via get_all). But why the add and update methods have no notion of T? Why do we go away from the generics and we introduce the **kwargs: object? I was expecting an add method which takes as an input a T.

  • @aflous

    @aflous

    5 months ago

    It allows more flexibility in the sense that you would not be tied to only use Post as an argument for these methods

  • @maephisto

    @maephisto

    5 months ago

    @aflous well... So why not have get and get_all return an object with random fields, then? My point is: some methods are specialized for T, others are not. And I don't think that's right. I understand flexibility, but either everything is flexible, or nothing is and it's all based on T.

  • @aflous

    @aflous

    5 months ago

    @maephisto when you perform a get or get_all, you wouldn't really need to specify any other info, and you would expect to get an object of the same type (or a list of objects of that type). For other methods like update, you need to at least specify some other info, like the data you want to supply for the update.

  • @maephisto

    @maephisto

    5 months ago

    Not fully convinced. When you add, you add T. But the example with update make sense. Thanks

  • @jwcnmr
    3 months ago

    Design Patterns are usually "discovered." Has this pattern been described elsewhere?

  • @loicquivron3872
    5 months ago

    The thumbnail looks so cursed

  • @Naej7

    @Naej7

    5 months ago

    Right ? It has a « made with AI » vibe

  • @brainforest88
    5 months ago

    Tip: Never use SELECT * in a SQL query in code. It bites back. I worked 25 years developing DB applications in Oracle (PL/SQL). Looking at SQLAlchemy queries exhausts the L1 cache in my brain. I'm used to writing my SQL straight. It's easier to understand, and I doubt I can do everything I need with an ORM, so why start with it in the first place.

  • @dankprole7884

    @dankprole7884

    5 months ago

    Agreed, I use them both so infrequently I want to double my chances of remembering so I just use sql

  • @plato4ek
    4 months ago

    9:05 So, in essence, you create a mock and put this mock under test. This is not a proper way to do testing.

  • @xtunasil0
    5 months ago

    It's the standard way to work with the java spring framework

  • @thepaulcraft957
    5 months ago

    saw it during a internship every day and now I am sick of it

  • @Naej7

    @Naej7

    5 months ago

    So since you’ve seen code every day, you’re now sick of code as well ?

  • @xiggywiggs

    @xiggywiggs

    5 months ago

    @Naej7 some days... yeah lol

  • @thepaulcraft957

    @thepaulcraft957

    5 months ago

    @Naej7 no, but it was completely overused and made simple things much too complicated

  • @Naej7

    @Naej7

    5 months ago

    @thepaulcraft957 Probably not, it’s used for a reason (often tests)

  • @thepaulcraft957

    @thepaulcraft957

    5 months ago

    @Naej7 but for many things you could use simple DTOs instead of full repositories. Testing is a good point though.