A forbidden Python technique to put ANYTHING in a dict or set.

Science and Technology

Use with caution!
Common programming wisdom tells us that mutable objects should not be hashable since mutating the object might change its hash. But sometimes you really just want to have a dict or set of mutable things that you promise won't have their hashes change while in use. In Python there is a devious trick to do just that. Its legitimate use cases are very niche, but in a pinch you can use the identity of an object as a proxy for hashing the object itself. Learn how in this intermediate level Python video!
― mCoding with James Murphy (mcoding.io)
Source code: github.com/mCodingLLC/VideosS...
Datamodel docs: docs.python.org/3/reference/d...
Identity function: docs.python.org/3/library/fun...
SUPPORT ME ⭐
---------------------------------------------------
Sign up on Patreon to get your donor role and early access to videos!
/ mcoding
Feeling generous but don't have a Patreon? Donate via PayPal! (No sign up needed.)
www.paypal.com/donate/?hosted...
Want to donate crypto? Check out the rest of my supported donations on my website!
mcoding.io/donate
Top patrons and donors: Jameson, Laura M, Dragos C, Vahnekie, Neel R, Matt R, Johan A, Casey G, Mark M, Mutual Information, Pi
BE ACTIVE IN MY COMMUNITY 😄
---------------------------------------------------
Discord: / discord
Github: github.com/mCodingLLC/
Reddit: / mcoding
Facebook: / james.mcoding
CHAPTERS
---------------------------------------------------
0:00 Intro - hashing
1:20 Alternatives
1:44 Object identity
2:15 FORBIDDEN - IdMapping
3:14 FORBIDDEN - IdSet
3:44 See it in action
4:00 The downside
5:07 Thanks
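
A minimal sketch of the kind of identity-keyed mapping the "IdMapping" chapter refers to, assuming it is built on `collections.abc.MutableMapping`. This is an illustration under that assumption, not the video's actual source code; the internal `_data` layout is hypothetical.

```python
from collections.abc import MutableMapping


class IdMapping(MutableMapping):
    """Mapping keyed by object identity, so keys need not be hashable."""

    def __init__(self):
        # id(key) -> (key, value). Storing the key itself keeps it alive,
        # so its id cannot be reused by another object while it is stored.
        self._data = {}

    def __getitem__(self, key):
        try:
            return self._data[id(key)][1]
        except KeyError as exc:
            # Report the original key rather than its numeric id.
            raise KeyError(key) from exc

    def __setitem__(self, key, value):
        self._data[id(key)] = (key, value)

    def __delitem__(self, key):
        try:
            del self._data[id(key)]
        except KeyError as exc:
            raise KeyError(key) from exc

    def __iter__(self):
        return (key for key, _ in self._data.values())

    def __len__(self):
        return len(self._data)


points = [1, 2]                 # a list is unhashable, so a dict rejects it
mapping = IdMapping()
mapping[points] = "first"
points.append(3)                # mutating the key does not affect lookup
found = mapping[points]
missing = [1, 2, 3] in mapping  # equal contents, but a different object
```

The trade-off is visible in the last line: lookups succeed only with the exact object that was stored, never with an equal copy.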

Comments: 100

  • @disgorgeengorge
    @disgorgeengorge · 9 months ago

    Storing mutables: forbidden jutsu

  • @jullien191

    @jullien191

    9 months ago

    Why?

  • @Mutual_Information
    @Mutual_Information · 9 months ago

    Interesting you chose 257 to demonstrate the problem. Is that b/c ints 256 and below always have the same id? I believe I recall a fact like that.

  • @blackdereker4023

    @blackdereker4023

    9 months ago

    Exactly. Python caches -5 to 256 integer objects so you don't have to instantiate a new one to make simple calculations.

  • @khuntasaurus88

    @khuntasaurus88

    9 months ago

    @@blackdereker4023 I think this was pre-Python 3. Now the range is a lot bigger iirc

  • @stevenluoma1268

    @stevenluoma1268

    9 months ago

    @@khuntasaurus88 No I don't think so. In python 3.9.17, I checked with the following: (No I don't use semicolons in python, I'm just trying to keep things compact) n=256; x=256; n is x; (True) n=257; x=257; n is x; (False) n=-5; x=-5; n is x; (True) n=-6; x=-6; n is x; (False) Still looks like -5 to 256 to me.

  • @khuntasaurus88

    @khuntasaurus88

    9 months ago

    @@stevenluoma1268 yep just tested on 3.11 and you are right. I used to do for i in range(-100,300): a=i; b=i; print(i) if a is b; But my dumbass forgot that i gets referenced, not copied, in Python. Now I tested it where each a = i becomes a = i-1 so that it doesn't reference the i, and it works from -5 to 256 :)

  • @cvl14

    @cvl14

    9 months ago

    @@khuntasaurus88 you can use `lambda x: x is int(float(x))` to test the property on any `int`
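
The check suggested in this thread can be run over a whole range. The sketch below assumes CPython, where sharing small integer objects is an implementation detail rather than a language guarantee; `is_cached` is a hypothetical helper name.

```python
def is_cached(x: int) -> bool:
    # int(float(x)) rebuilds the integer at runtime, so the result is the
    # same object as x only if the interpreter hands out a shared instance.
    return x is int(float(x))


cached = [n for n in range(-100, 300) if is_cached(n)]
print(cached[0], cached[-1])  # bounds of the shared range on this interpreter
```

On the CPython versions tested in this thread the reported bounds were -5 and 256. Other interpreters may behave differently.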

  • @Phaust94
    @Phaust94 · 9 months ago

    Some forbidden knowledge comes in handy! Thanks James as always for the awesome content you put up!

  • @KappakIaus
    @KappakIaus · 9 months ago

    "In addition to making esoteric and questionable python content on youtube" 😂

  • @GanerRL
    @GanerRL · 9 months ago

    I actually use this technique from time to time, especially in graphics/game stuff

  • @PeterZaitcev
    @PeterZaitcev · 9 months ago

    This type of hashing is actually quite popular and useful in games, where each entity has its own unique ID, but cloning is not a common thing, and a cloned entity is not equal to its donor even though they are identical.

  • @moondevonyt
    @moondevonyt · 9 months ago

    Mad respect to the author for breaking down the twin snake sacrifice shoot technique in Python, especially since this is some advanced wizardry that many don't dive into. But real talk: while it's cool to learn about these forbidden techniques, relying on them in production code might be a big yikes. Using object IDs for hashing might lead to confusing behaviors for future devs who come across your code. Always good to know the tricks of the trade, but there's often a reason some techniques are labeled "forbidden". Still, props for the deep dive and making it understandable.

  • @yellingintothewind

    @yellingintothewind

    9 months ago

    There are plenty of times when you need essentially this functionality, but if you do, it is generally best to be explicit. Use a normal dict, and use `myDict[id(myList)] = foo`, it does require you to keep `myList` alive for as long as the mapping is used, but future programmers won't be confused about what exactly is stored in the mapping.
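
A small sketch of the explicit style described above, using a plain dict keyed by `id(...)`. The names `labels` and `my_list` are illustrative only.

```python
labels = {}
my_list = [1, 2, 3]

# The int returned by id() is hashable even though the list is not.
labels[id(my_list)] = "foo"

my_list.append(4)                      # mutation does not change the id
label = labels[id(my_list)]

# An equal list is a different object, so it has a different id.
other = labels.get(id([1, 2, 3, 4]))
```

As the comment notes, the dict holds only the number, so `my_list` has to be kept alive elsewhere for as long as the entry is in use.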

  • @QuantumHistorian
    @QuantumHistorian · 9 months ago

    Huh, I just stumbled upon this solution myself literally 3 days ago by having "def __hash__(self): return id(self)" for a class where I knew the objects would live for the whole run time and "is" was as good as "==". I can see the downsides, but it is so helpful. In some cases, it can even be faster I think? hash() can be slow in some situations (i.e., long or nested tuples), but id() is always very fast I believe?

  • @yellingintothewind

    @yellingintothewind

    9 months ago

    In most python implementations, for non-trivial types, id is the memory location of the underlying c-struct, so `id(foo)` is constant-time, and just about the fastest thing you can ever do.
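
A sketch of the class-level variant discussed in this thread, where equality and hashing are both defined by identity. `Node` and `payload` are hypothetical names.

```python
class Node:
    def __init__(self, payload):
        self.payload = payload      # mutable state, ignored by hashing

    def __eq__(self, other):
        return self is other        # "is" is as good as "==" here

    def __hash__(self):
        return id(self)             # constant for the object's lifetime


a = Node([1])
b = Node([1])
seen = {a}
a.payload.append(2)                 # mutation does not disturb the set
a_found = a in seen
b_found = b in seen                 # same payload at creation, other object
```

A class that defines neither `__eq__` nor `__hash__` already compares and hashes by identity, so spelling it out mostly matters when a base class or decorator would otherwise change that default.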

  • @vnikolayev
    @vnikolayev · 9 months ago

    Thank you, each of your vids widens my understanding on Python!

  • @gloweye
    @gloweye · 9 months ago

    I'm pretty chuffed that I know exactly why your int(float()) example uses 257, without having to look it up.

  • @BrunsterCoelho

    @BrunsterCoelho

    9 months ago

    Came here to say this!

  • @onhazrat
    @onhazrat · 9 months ago

    🎯 Key Takeaways for quick navigation:
    00:00 🛍️ Hashing in sets and dictionaries for efficient lookups is like finding milk in a grocery store's refrigerated section.
    00:28 🔄 Hash collisions degrade performance; mutating an object changes its hash.
    00:55 🔢 Mutable things should not be hashable in Python; defining hash or setting it to None is advised.
    01:21 🧩 Using an object's identity to create custom mapping and set types for unhashable/mutable elements.
    02:15 🗺️ Mapping type uses id of key for lookups; ensures key is not garbage collected.
    03:11 🔑 Set type uses id of value for existence checks; prevents value from being garbage collected.
    04:06 🔍 In IdSet, duplicates require exact same objects, not just equal ones.
    04:33 ⚠️ Using IdMapping or IdSet can be confusing; forbidden Python knowledge.
    05:01 🚫 Forbidden technique useful only when certain objects are consistently used.
    05:27 💼 Creator's disclaimer; the technique has potential but is discouraged.
    Made with HARPA AI
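
The set behaviour summarised above (membership by id, values kept alive, duplicates only for the exact same object) can be sketched as follows. This assumes an implementation on top of `collections.abc.MutableSet` and is not the video's actual code.

```python
from collections.abc import MutableSet


class IdSet(MutableSet):
    """Set whose membership test is object identity, not equality."""

    def __init__(self, items=()):
        # id(item) -> item. Holding the item prevents garbage collection,
        # which in turn prevents its id from being reused.
        self._data = {}
        for item in items:
            self.add(item)

    def __contains__(self, item):
        return id(item) in self._data

    def __iter__(self):
        return iter(self._data.values())

    def __len__(self):
        return len(self._data)

    def add(self, item):
        self._data[id(item)] = item

    def discard(self, item):
        self._data.pop(id(item), None)


a = [1, 2]
b = [1, 2]                      # equal to a, but a separate object
ids = IdSet([a, b, a])          # the repeated a collapses, b does not
size = len(ids)
has_a = a in ids
has_copy = [1, 2] in ids
```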

  • @mrphlip
    @mrphlip · 9 months ago

    There are occasions that I've wished something like this was built-in (in "collections", maybe)... or related things like a version of "in" for lists that used "is" instead of "=="... It doesn't come up often, and it would be a confusing problem if it was used more commonly than it should be, but when you want it you know.
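
The identity-based `in` wished for here is not built in, but a one-line helper gets close. `contains_identical` is a hypothetical name.

```python
def contains_identical(items, target):
    # Compare with "is" instead of the "==" that the "in" operator falls back on.
    return any(item is target for item in items)


a = []
b = []                                        # equal to a, different object
rows = [a]

equal_match = b in rows                       # "in" accepts an equal element
identity_match = contains_identical(rows, b)
same_object = contains_identical(rows, a)
```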

  • @shahaffrsshahaffrs5190
    @shahaffrsshahaffrs5190 · 9 months ago

    You could make the IdSet/IdMapping use the id of a key only when it is emutable or when it doesn't have a hash method. That way, you narrow the problem to only the case when you want to be able to access the same value with different but equal keys. To get around this problem, you could set extra data on initialisation to specify which emutable types should use identity and which types should use equality for keys made from those types. Most of the time, the keys in sets and dicts would be of the same type, so you could define IdSet/IdMapping subclasses that should have only one type of key (which is already something that exists in the standard library, as defaultdict), just to make the usage clear and safe.

  • @KonstantinUb

    @KonstantinUb

    8 months ago

    immutable*

  • @benshapiro9731
    @benshapiro9731 · 9 months ago

    I have said it before and I will say it again: this is THE premier advanced python channel on KZread

  • @samuelthecamel
    @samuelthecamel · 9 months ago

    I was wondering why I could store class objects in sets even if they are mutable. This explains it!

  • @timseguine2
    @timseguine2 · 9 months ago

    I have generally only needed something like this for lists of integers. And for that there is a simpler hack of converting the keys to tuples of integers
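
For comparison, the tuple conversion mentioned here keys by value rather than by identity. A short sketch with illustrative names:

```python
counts = {}
key_list = [3, 1, 4]

# A tuple of ints is immutable and hashable, so it is a valid dict key.
counts[tuple(key_list)] = 1

equal_hit = (3, 1, 4) in counts          # any equal tuple finds the entry

key_list.append(1)                       # the stored snapshot is unaffected
stale_hit = tuple(key_list) in counts    # the mutated list no longer matches
```

This is the opposite trade-off from the id-based approach: equal contents match, but later mutations of the original list are not tracked.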

  • @SeaOfRandomness
    @SeaOfRandomness · 9 months ago

    we need more python esoteric knowledge ! keep it coming

  • @robertbrummayer4908
    @robertbrummayer4908 · 9 months ago

    Awesome video as always :)

  • @MagicGonads
    @MagicGonads · 6 months ago

    I've had to use a similar technique to implement FFI over mutable objects, that is two objects which have a shared state between two languages by propagating any mutation to the other side. I need to have a consistent ID for this object on both sides of the fence, and which object corresponds to this ID, and I need fast way to remember which objects are managed without keeping them alive. The result is that I need a *weak* and *invertible* mapping between ID and object. In your solution this is decidedly not a weak mapping, but if you did want to make it weak then you have to add weakref finalizers to the objects so that they propagate their destruction to the cache and the other side so we remove their ID from the mapping.
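
One possible shape for the weak variant described above, using the standard `weakref` module. This is a sketch under assumptions: `WeakIdRegistry` and `Entity` are hypothetical, Python's `id()` stands in for whatever shared ID the commenter's FFI layer uses, and the prompt cleanup shown relies on CPython's reference counting.

```python
import gc
import weakref


class WeakIdRegistry:
    """Maps id -> object without keeping the object alive."""

    def __init__(self):
        self._refs = {}  # id -> weakref.ref

    def register(self, obj):
        key = id(obj)
        self._refs[key] = weakref.ref(obj)
        # When obj is collected, drop its entry so a recycled id can
        # never resolve to a dead object.
        weakref.finalize(obj, self._refs.pop, key, None)
        return key

    def lookup(self, key):
        ref = self._refs.get(key)
        return ref() if ref is not None else None


class Entity:
    """Plain user-defined class; its instances support weak references."""


registry = WeakIdRegistry()
entity = Entity()
key = registry.register(entity)
alive = registry.lookup(key) is entity

del entity        # drop the only strong reference
gc.collect()      # not needed on CPython, harmless elsewhere
after = registry.lookup(key)
```

Built-in containers such as `list` and `dict` do not support weak references directly, so this only works for types that do, for example instances of ordinary classes.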

  • @ketexon
    @ketexon · 9 months ago

    set vs set

  • @aouerfelli
    @aouerfelli · 9 months ago

    My solution to this is tracking the hashable objects by their own hash and the unhashable ones by their id.
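
A sketch of that hybrid idea: hashable objects are tracked by value, unhashable ones by id. The helper and class names are hypothetical, and the tagged tuple keys are one way to keep the two key spaces from colliding (an id is itself an int and could otherwise clash with a stored int).

```python
def hybrid_key(obj):
    try:
        hash(obj)
    except TypeError:
        return ("id", id(obj))    # unhashable: fall back to identity
    return ("hash", obj)          # hashable: normal value semantics


class HybridSet:
    def __init__(self):
        self._data = {}           # key -> object (keeps the object alive)

    def add(self, obj):
        self._data[hybrid_key(obj)] = obj

    def __contains__(self, obj):
        return hybrid_key(obj) in self._data

    def __len__(self):
        return len(self._data)


items = HybridSet()
stored_list = [1, 2]
items.add("name")
items.add(stored_list)

by_value = "na" + "me" in items   # equal string matches
by_identity = stored_list in items
equal_copy = [1, 2] in items      # equal list, different object
```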

  • @evlezzz
    @evlezzz · 9 months ago

    I guess the confusion between mutability and hashability should be addressed somehow. The idea that if we mutate the object its hash is going to change is not entirely true. There are a lot of real-life scenarios where mutable objects are actually hashable and that doesn't cause any problems. We just need to make sure of 2 things:
    1. Hash is compatible with equality, meaning there should not be any objects that are equal but have different hashes. That is the real (and documented) reason why __hash__ is implicitly set to None if we define __eq__ in a class but do not define __hash__ itself. The default __hash__ implementation would most likely be incompatible with the modified __eq__. Notice that this has nothing to do with the mutability of an object, it just prevents random breakage of the Hashable protocol.
    2. Hash (and eq) should not depend on mutable state. That is not strictly necessary for some operations, but it is necessary when the hash is used for looking up things that might change over time.
    Basically, objects could be mutable, and you might even create two separate objects with different attributes and still have them be equivalent, if only a subset of their state (values of attributes) participates in the hash calculation and the __eq__ operation. For example, you could ignore some comments or other unimportant values. And you could also have it the other way around: an immutable object could be unhashable. That's rare but sometimes reasonable. Currently the official Python documentation is unclear about some of this stuff and it would be nice to reword it better.
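
Both points can be illustrated with a short sketch. `Ticket` and `EqOnly` are hypothetical classes; treating `_ticket_id` as fixed after construction is a convention here, not something Python enforces.

```python
class Ticket:
    def __init__(self, ticket_id, note=""):
        self._ticket_id = ticket_id   # never reassigned after construction
        self.note = note              # free to change at any time

    def __eq__(self, other):
        if not isinstance(other, Ticket):
            return NotImplemented
        return self._ticket_id == other._ticket_id

    def __hash__(self):
        # Uses only the state that __eq__ uses, and none of the mutable part.
        return hash(self._ticket_id)


class EqOnly:
    def __eq__(self, other):
        return isinstance(other, EqOnly)
    # No __hash__ defined: Python sets EqOnly.__hash__ to None.


ticket = Ticket(7, "draft")
seen = {ticket}
ticket.note = "final"                  # mutation outside the hashed state
still_found = ticket in seen
equal_found = Ticket(7, "other note") in seen
eq_only_hash = EqOnly.__hash__
```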

  • @juanmacias5922
    @juanmacias5922 · 9 months ago

    lmfao sounds like C pointers. :D

  • @antoniov.fuentes5273
    @antoniov.fuentes5273 · 5 months ago

    I recently had to simulate objects moving through a 2d grid, and because their position needed to be updated and overlaps were allowed I couldn't hash them based on their state, so this forbidden technique came in handy.

  • @Dongobog-ps9tz
    @Dongobog-ps9tz · 8 months ago

    I have one personal project that parses a library as a dictionary so I can choose which functions to use in it with a JSON file and have it automatically adapt to the library updating and still let me access everything with JSON strings

  • @user-wv1in4pz2w
    @user-wv1in4pz2w · 9 months ago

    forbidden technique is nice, but what's the use-case for it?

  • @Scymet

    @Scymet

    9 months ago

    constant lookup time for mutable objects

  • @haxwithaxe
    @haxwithaxe · 9 months ago

    I've done this in the distant past (so I'm being as vague as my memory) but I defined an id attribute and set it to a uuid and defined __hash__ to return the id attribute. It's probably not the fastest way to do it, but it's nearly guaranteed not to collide and is dependent on something I control.
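
A rough reconstruction of that idea, assuming `uuid.uuid4()` as the source of the identifier. `Tracked` and `uid` are hypothetical names, and since `__hash__` must return an int, the sketch hashes the UUID rather than returning it directly.

```python
import uuid


class Tracked:
    def __init__(self, data):
        self.uid = uuid.uuid4()     # assigned once, independent of id()
        self.data = data            # mutable payload, ignored by hashing

    def __eq__(self, other):
        return isinstance(other, Tracked) and self.uid == other.uid

    def __hash__(self):
        return hash(self.uid)


first = Tracked([1])
second = Tracked([1])               # same data, different uid
registry = {first: "a"}
first.data.append(2)                # does not affect the key
first_found = first in registry
second_found = second in registry
```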

  • @delevoxdg
    @delevoxdg · 9 months ago

    The Dark Side of the Force is a pathway to many abilities some consider to be unnatural.

  • @sociablefish
    @sociablefish · 9 months ago

    this is where i would mention c++'s const methods and references

  • @SkyyySi
    @SkyyySi · 9 months ago

    Thinking about it, it would probably be better to implement this as a fallback, so only objects without hashing will use their id instead

  • @nisbahmumtaz909
    @nisbahmumtaz909 · 9 months ago

    The store might also have that UHT milk that doesn't require refrigeration :^)

  • @mCoding

    @mCoding

    9 months ago

    I stand corrected 😂

  • @kristiandilov5249
    @kristiandilov5249 · 9 months ago

    If only I could star this channel in a favourites section

  • @Zhaxxy
    @Zhaxxy · 9 months ago

    For the issue with immutable objects, why couldn't you add a check if the object is hashable, and if not then use the id?

  • @SkyyySi
    @SkyyySi · 9 months ago

    I guess I've just used Lua (and C) too much, but this doesn't feel illegal to me at all lol

  • @PeterZaitcev

    @PeterZaitcev

    9 months ago

    This depends on the context. In some cases, this is not only legal, it is the preferred way to store data, e.g. game entities.

  • @Solarbonite

    @Solarbonite

    9 months ago

    Yeah... This is pretty common in C++. Some libraries just use the pointer size_t value directly. Honestly it's free so why not 😂

  • @debakarr
    @debakarr · 9 months ago

    I have seen another way where, say, you want to maintain a cache but want to have a mutable object as the key. That cache can be used between different Python processes. In that case one way would be to pickle the mutable object and use it as the key for the cache: Reuven M. Lerner - Practical decorators - PyCon 2019 - kzread.info/dash/bejne/f5580q98edndo7w.html

  • @chrisdaley2852

    @chrisdaley2852

    9 months ago

    I think I prefer that. Performance would be the main concern right? But in general, this seems safer.
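
A minimal sketch of the pickle-as-key idea from this thread, written as an in-memory decorator. `pickle_cache`, `total` and `calls` are hypothetical names, and this is not taken from the linked talk.

```python
import functools
import pickle


def pickle_cache(func):
    cache = {}

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        # bytes are hashable, so the pickled arguments can serve as a key
        # even when the arguments themselves (lists, dicts) cannot.
        key = pickle.dumps((args, kwargs))
        if key not in cache:
            cache[key] = func(*args, **kwargs)
        return cache[key]

    return wrapper


calls = []


@pickle_cache
def total(values):
    calls.append(list(values))      # record real invocations
    return sum(values)


first = total([1, 2, 3])
second = total([1, 2, 3])           # equal contents: served from the cache
third = total([1, 2, 3, 4])
```

Here the key reflects the value at call time, so equal lists share an entry. One caveat: equal objects are not guaranteed to pickle to identical bytes in general, so a cache like this may miss on some types.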

  • @jofofouj
    @jofofouj · 9 months ago

    This chakra too dark.

  • @cmilkau
    @cmilkau · 9 months ago

    There's nothing forbidden about it. As always you just need to know (and communicate) what you're doing. Using the ID is a bit like using a reference/pointer (where) as a key rather than the value (what).

  • @chrisdaley2852

    @chrisdaley2852

    9 months ago

    The reference/value divide has always caused issues though. That's why some languages value immutability as a language principle. It's okay to do this in moderation if used with diligence. But all it takes is for someone to use a class that has some method which replaces a child object with a copy, and if you're using that child as a key the ids won't match. Another one people have pointed out is that some objects are cached, so their ids might always match:
        a = 256
        b = 256
        print(a is b)  # True
        a = 257
        b = 257
        print(a is b)  # False
    If you want simplified and scalable code, it's better not to do this. And if you do end up doing this, you have to know what you're doing.

  • @XCanG
    @XCanG · 9 months ago

    Idea for the video: review the *ruff* type checker and maybe share your opinion on whether all the type check rules should be enabled or not? Because some of them prevent errors, but still could be cumbersome to use.

  • @benamiel6180
    @benamiel6180 · 9 months ago

    Make a vid on the raise KeyError(item) from exc line you had in your code. Explain how it works. Never seen the from keyword used like that. Please, do that.

  • @RatafakRatafak
    @RatafakRatafak · 9 months ago

    Couldn't the hash be based on the id of the object instead of on the state of the object?!

  • @AleAle-hj6db
    @AleAle-hj6db · 9 months ago

    It's like using a pointer as a key in C++. I do that every day

  • @ali-om4uv
    @ali-om4uv · 9 months ago

    Someone once told me it's forbidden and stupid to use a dict to implement the state pattern. Is this also true?

  • @szaszm_
    @szaszm_ · 9 months ago

    If the user code has the id anyway, why don't they just load the object by id, like dereferencing a pointer?

  • @0LoneTech

    @0LoneTech

    9 months ago

    The id is just a number. It doesn't mean the object still exists, so the same issue as a dangling pointer makes it unsafe to dereference. Normal references do guarantee existence, and weak references can safely detect object destruction.

  • @electronerd
    @electronerd · 9 months ago

    Another option would be to wrap the key in something hashable and use the regular collection types. This could use the id as in the video, or it could "freeze" the value in some way, or something else I haven't thought of. A similar technique would be to store a proxy for the key, such as a string or tuple representation.

  • @gloweye

    @gloweye

    9 months ago

    If you store, say, a 1-element tuple with a list inside of it, then you get the same behaviour as in the video - it goes by the id() of the list.
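
A sketch of the wrapper approach from this thread, using a small dedicated class so that regular dicts and sets can be used unchanged. `IdKey` is a hypothetical name. A plain tuple is not enough for this: hashing a tuple hashes its elements, so a tuple containing a list is still unhashable.

```python
class IdKey:
    """Hashable wrapper that compares and hashes by the wrapped object's id."""

    def __init__(self, obj):
        self.obj = obj              # strong reference keeps the id valid

    def __eq__(self, other):
        return isinstance(other, IdKey) and self.obj is other.obj

    def __hash__(self):
        return id(self.obj)


data = [1, 2]
table = {IdKey(data): "x"}          # an ordinary dict with wrapped keys

value = table[IdKey(data)]          # a fresh wrapper around the same object
copy_found = IdKey([1, 2]) in table

try:
    hash((data,))                   # tuple hashing delegates to its elements
    tuple_hashable = True
except TypeError:
    tuple_hashable = False
```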

  • @CoolDude911
    @CoolDude911 · 9 months ago

    Two arrays you store with this method could have the same contents but different ids.

  • @mCoding

    @mCoding

    9 months ago

    Indeed, this is one of the tradeoffs mentioned in the video. You can store anything in an id map, but you must look up elements using the exact same object, not just an equal one.

  • @lapidations
    @lapidations · 9 months ago

    This sounds as dangerous as C++! Great

  • @murphygreen8484
    @murphygreen8484 · 9 months ago

    Forward of you to assume I don't wander the entire grocery store looking for an item 🤣

  • @KirkWaiblinger
    @KirkWaiblinger · 9 months ago

    For extra credit, override the dict constructor with this... (wonder if there's a crazy way to make it work with dictionary literals even). Also, seems like you could do something more clever by separating primitives/value types, to be hashed as usual, from user defined objects, to be compared by id/reference only, ending up with something resembling how this would work in JS.

  • @chrisdaley2852

    @chrisdaley2852

    9 months ago

    Don't do that. If you're using these types of techniques, be explicit. Make a new class to separate the use cases entirely. The problem is that using this technique puts a responsibility on the developer to take care of any object used as a key in the mutable_dict for its entire lifetime. If that responsibility is not properly conveyed to other developers on the project or to yourself later if you don't remember, you can end up with unintended side effects. This is a problem especially if you specifically obfuscate the meaning of your original code since you're essentially hiding your own bugs.

  • @KirkWaiblinger

    @KirkWaiblinger

    9 months ago

    @@chrisdaley2852 oh yeah no 100% my comment is a joke

  • @JohnFallot
    @JohnFallot · 9 months ago

    Mmm forbidden hashing 🤤

  • @tratbagd4500
    @tratbagd4500 · 9 months ago

    Other solutions in python I've seen in some codebase was to hash them based on their string value after stringifying them

  • @danenergetics3907
    @danenergetics3907 · 9 months ago

    That anime reference caught me off guard af lol

  • @user-wv5lp3nd8c
    @user-wv5lp3nd8c · 7 months ago

    haha this is certainly the video of yours with the best jokes. Had to laugh a few times.

  • @andrey2001v
    @andrey2001v · 9 months ago

    Thanks! This is so cursed. I hope I'll never see this technique again!

  • @AntonioZL
    @AntonioZL · 9 months ago

    Father, I crave the forbidden data structure. 😢

  • @coder436
    @coder436 · 2 months ago

    I was gonna comment saying to use hash(id(x)) but then I saw that that's what you did

  • @user-hk3ej4hk7m
    @user-hk3ej4hk7m · 9 months ago

    Now don't go trying to store a set into itself

  • @gge6021
    @gge6021 · 9 months ago

    gonna ship it this week lets see if im getting fired

  • @DMSBrian24
    @DMSBrian24 · 9 months ago

    Dear god...

  • @user-zm7rc3yn3v
    @user-zm7rc3yn3v · 9 months ago

    So pointers for python

  • @romalivejournal
    @romalivejournal · 9 months ago

    You may do background colour black in Pycharm. In Settings: Colour Scheme -> General -> Text -> Default Text.

  • @alexischicoine2072
    @alexischicoine2072 · 9 months ago

    Seems like if you’re going to use the same object why not just put the value in the object.

  • @uzairmughal4976
    @uzairmughal4976 · 9 months ago

    I will be the next Voldemort. Thank you Professor

  • @GLITCH_-.-
    @GLITCH_-.- · 9 months ago

    Awww... I was really hoping we can use emoji as variable names. Is your thumbnail misleading?

  • @mCoding

    @mCoding

    9 months ago

    Haha i don't cover those kinds of topics in my channel, although i do talk about emojis in my str vs bytes video. You could certainly make a dict with emojis in it if you put quotes around them 😉. Or (highly not recommended) use a custom encoding that would allow you to use emojis in variable names. Or (medium not recommended) fork cpython and change the grammar to allow emojis in variable names. So many choices, have fun!

  • @vectoralphaAI
    @vectoralphaAI · 9 months ago

    Alright fine i wont do it. I dont want you to get in trouble.

  • @deadeye1982a
    @deadeye1982a · 9 months ago

    I guess I used it once accidentally.

  • @muhdkamilmohdbaki7054
    @muhdkamilmohdbaki7054 · 9 months ago

    Honestly, the analogy used (buying milk from the store) is not the best example. Secondly, it would have been better if the video also included real use cases where this technique is used, preferably best practice. But yeah, I've only been using Python 2 for about a month, so perhaps this technique is more common and useful than I would imagine. The basics are more than enough for my needs.

  • @Gabryx_
    @Gabryx_ · 9 months ago

    lua be like

  • @davititchanturia
    @davititchanturia · 9 months ago

    wait..., thats illegal

  • @chriswilliamson4693
    @chriswilliamson4693 · 9 months ago

    How do you find milk in a supermarket? Go to the corner diagonally opposite the entrance

  • @Ca1vema
    @Ca1vema · 9 months ago

    I see you're completely out of ideas

  • @yxh
    @yxh · 9 months ago

    Please return to not including memes and other "entertaining" visual tidbits in your vids. I get that you might be trying to appeal more to the YouTube algorithm but it's distracting

  • @yujielee

    @yujielee

    9 months ago

    idk, they use those really sparsely to the point where it's not a distraction, and imo it would be much less attention grabbing if it were just straight code talk, but to each their own
