Use Arc Instead of Vec

Rust lets you do efficient reference-counted strings and dynamic arrays using Arc basically just as easily as their owning (and deep-cloning) equivalents, String and Vec respectively. So why not use them as a reasonable default, until you actually need the mutability that String and Vec provide? Get into the weeds with me here, feat. some cool visualizations, with special guest appearance from Box.
This video assumes some familiarity with Rust and its core smart pointer types, namely Vec/String/Rc/Arc/Box, along with data structures like HashMap and BTreeMap, and traits like Clone, Hash, Ord, and serde::{Serialize, Deserialize}.
serde feature flag for Rc/Arc: serde.rs/feature-flags.html#-...
Arc docs: doc.rust-lang.org/std/sync/st...
Vec docs: doc.rust-lang.org/std/vec/str...
Smart pointers in Rust: • Crust of Rust: Smart P...
animations: www.manim.community/

Comments: 407

  • @JohnPywtorak
    @JohnPywtorak 11 months ago

    As a person relatively new to Rust, I kept thinking but Vec has the macro vec! for ease. And an Arc might not be as ergonomic to get in place. It would have been nice if that pre-step was indulged. Because you might reach for a Vec based on the ease vec! provides. So helpful though and was a great learning tool and fun watch.

  • @_noisecode

    @_noisecode

    11 months ago

    That's a great point, and one I probably should have mentioned in the video! Thankfully, Arc<[T]> is pretty easy to create: it implements From<Vec<T>>, so you can create one with `vec![1, 2, 3].into()`, and it also implements FromIterator, so you can create one by `.collect()`ing an iterator just like you would any other collection. Since the video was all about golfing unnecessary allocations etc., I should also mention that creating an Arc<[T]> often involves one more memory allocation + memcpy than creating the equivalent Vec<T> would have. There's some well-documented fine print here: doc.rust-lang.org/std/sync/struct.Arc.html#impl-FromIterator%3CT%3E-for-Arc%3C%5BT%5D%3E
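A minimal sketch of the construction routes described above, using only standard-library conversions. The function names and the sample values are made up for illustration.

```rust
use std::sync::Arc;

/// Build an `Arc<[i32]>` from an existing `Vec` via `From<Vec<T>>`.
fn from_vec() -> Arc<[i32]> {
    vec![1, 2, 3].into()
}

/// Build an `Arc<[i32]>` by collecting an iterator via `FromIterator`.
fn from_iter() -> Arc<[i32]> {
    (1..=3).collect()
}

/// Build an `Arc<str>` from a string literal and from an owned `String`.
fn from_strings() -> (Arc<str>, Arc<str>) {
    let owned = String::from("goblin");
    (Arc::from("goblin"), Arc::from(owned))
}

fn main() {
    let a = from_vec();
    let b = Arc::clone(&a); // bumps the reference count; no deep copy
    assert!(Arc::ptr_eq(&a, &b)); // both handles share one allocation
    assert_eq!(&*from_iter(), &[1, 2, 3]);

    let (x, y) = from_strings();
    assert_eq!(x, y);
}
```

The same `From` and `FromIterator` routes should work for `Rc` and `Box` in place of `Arc`.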

  • @mateusvmv

    @mateusvmv

    11 months ago

    @@_noisecode Is it the same as Vec::into_boxed_slice, which only re-allocates if the vec has excess capacity? Arc implements From without re-allocation.

  • @_noisecode

    @_noisecode

    11 months ago

    It's one more allocation, even if the Vec doesn't have any excess capacity, since in general Arc needs to move the data into its own allocation containing the reference count info. For the record, `From for Arc` does in fact allocate (see the implementation here: doc.rust-lang.org/src/alloc/sync.rs.html#1350 ).

  • @DBZM1k3

    @DBZM1k3

    11 months ago

    How does From compare to simply using into_boxed_slice and using Box::leak instead?

  • @AlgorithmAces

    @AlgorithmAces

    11 months ago

    @Ayaan K yes

  • @fabbritechnology
    @fabbritechnology 10 months ago

    For high performance code, Arc is not cheap. Cloning a small string may actually be faster depending on your cpu topology and memory access patterns. As always, measure first.

  • @MusicGod1206

    @MusicGod1206

    10 months ago

    Great point

  • @FandangoJepZ

    @FandangoJepZ

    10 months ago

    Having small strings does not make your program high performance

  • @Mempler

    @Mempler

    10 months ago

    A modern CPU (with avx-512 ext) can handle up to 64 bytes at the same time, nearly instantaneously. However, that's only for modern CPUs. Thus, if you know your architecture that you're running on, you can do pretty neat optimization

  • @warriorblood92

    @warriorblood92

    9 months ago

    what you mean by cloning small string on stack? Strings are on heap right? so cloning will occur on heap only!

  • @David_Box

    @David_Box

    9 months ago

    @@warriorblood92 strings can very much be on the stack. A "str" is stored on the stack (well actually it's stored in a read only part of the memory, different from the stack but it functions effectively the same), and you can very much clone them to the stack (even if rust makes it rather difficult to do so).

  • @mithradates
    @mithradates 11 months ago

    Nice, did not expect a full 10+ minutes advocating for Arc over Vec on my recommendations. You deserve way more subscribers.

  • @aadishm4793

    @aadishm4793

    1 month ago

    you too :-)

  • @pyromechanical489
    @pyromechanical489 11 months ago

    Arc/Rc work best when you don't really know the lifetimes of your data, but if your program is structured in a way that makes lifetimes obvious (say, loading data at the start of a block of code and reusing that), then you can use normal &'a [T] references and get the same benefits of cheap-to-copy immutable data that can be shared between threads, and doesn't even require a pointer indirection on clone!

  • @BigCappuh

    @BigCappuh

    5 months ago

    How can you share state between threads without Arc?

  • @JeremyHaak

    @JeremyHaak

    5 months ago

    @@BigCappuh Scoped threads can capture shared references.
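A small sketch of the scoped-thread approach mentioned in this thread: `std::thread::scope` (stable since Rust 1.63) lets spawned threads borrow a plain `&[T]`, so no `Arc` is involved. The `parallel_sum` helper is hypothetical and exists only to illustrate the idea.

```rust
use std::thread;

/// Sum a slice on two scoped threads that both borrow it; no `Arc` needed.
fn parallel_sum(data: &[u64]) -> u64 {
    let (left, right) = data.split_at(data.len() / 2);
    thread::scope(|s| {
        // Both closures capture shared references into `data`.
        let a = s.spawn(|| left.iter().sum::<u64>());
        let b = s.spawn(|| right.iter().sum::<u64>());
        a.join().unwrap() + b.join().unwrap()
    })
}

fn main() {
    let data: Vec<u64> = (1..=100).collect();
    assert_eq!(parallel_sum(&data), 5050);
}
```

This works because the scope guarantees every thread is joined before `data` can go out of scope. When the threads must outlive the caller, shared ownership such as `Arc` is still the usual answer.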

  • @amateurprogrammer25
    @amateurprogrammer25 11 months ago

    It occurs to me that your use case for an Arc<str> could potentially be better served by a &'static str or just an enum. If you have an in-game level editor that allows creation of new monster types, Arc<str> would be ideal, but in most cases, the entire list of monsters that will ever exist is known at compile time. If you use an enum for the monster types, you can still derive all the traits you were deriving before; with some help from the strum crate or similar you can implement as_str with custom strings containing spaces etc. very easily; your memory footprint is a _single_ word; you can #[derive(Copy)], meaning cloning is effectively instantaneous; and as a bonus, you don't need a hashmap to keep track of monsters killed or enemy stats -- just declare the enum as #[repr(usize)] and use it as the index into a Vec, or better, an array.
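The enum idea above could be sketched roughly as follows. The variant names, `COUNT`, and `KillCounts` are hypothetical, and `as_str` is written by hand here, where a crate such as `strum` could generate it.

```rust
/// Hypothetical closed set of monster kinds, known at compile time.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash, PartialOrd, Ord)]
#[repr(usize)]
enum MonsterId {
    Goblin,
    Skeleton,
    GiantSpider,
}

impl MonsterId {
    /// Number of variants; must be kept in sync with the enum by hand.
    const COUNT: usize = 3;

    /// Display name, which may contain spaces.
    fn as_str(self) -> &'static str {
        match self {
            MonsterId::Goblin => "Goblin",
            MonsterId::Skeleton => "Skeleton",
            MonsterId::GiantSpider => "Giant Spider",
        }
    }
}

/// Kill counters indexed directly by the enum discriminant, no HashMap.
#[derive(Default)]
struct KillCounts([u32; MonsterId::COUNT]);

impl KillCounts {
    fn record(&mut self, id: MonsterId) {
        self.0[id as usize] += 1;
    }

    fn get(&self, id: MonsterId) -> u32 {
        self.0[id as usize]
    }
}

fn main() {
    let mut kills = KillCounts::default();
    kills.record(MonsterId::Skeleton);
    kills.record(MonsterId::GiantSpider);
    let id = MonsterId::GiantSpider; // `Copy`: passing it around is a plain word copy
    println!("{}: {}", id.as_str(), kills.get(id));
    println!("{}: {}", MonsterId::Goblin.as_str(), kills.get(MonsterId::Goblin));
}
```

This only applies when the set of IDs is closed at compile time. IDs loaded from a config file at runtime still need some string type.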

  • @zerker2000

    @zerker2000

    11 months ago

    So much this. Hashing and string comparisons seem super overkill for a closed set known at compile time, and even if it is extensible in the editor, it still seems better to have the actual ids be a `u16` or w/e and the actual names interned in a global vec somewhere. Most operations don't care about the name! (possibly a tuple of type and instance, if you find yourself runtime spawning "guard1" "guard2" "guard3" etc)

  • @_noisecode

    @_noisecode

    11 months ago

    Thanks for mentioning this, and yes, I couldn't _possibly_ agree more that if you have a closed set of variants known at compile time, please, please use an enum--as you say, it is better in every conceivable way than making your IDs "stringly-typed", especially with the help of e.g. `strum` to get you the string versions if you do need them. Sometimes you do need actual dynamic strings though, and they follow a create-once-clone-often usage pattern like what I show in the video. In those cases, I believe my arguments for using Arc<str> over String hold. For what it's worth, the real-world code that inspired the MonsterId in this video actually _was_ an ID that was loaded from a configuration file at runtime, and so there wasn't a closed set of variants known at compile time.

  • @zerker2000

    @zerker2000

    11 months ago

    Personally in that circumstance I'd still be tempted to leak a `&'static [&'static str]`, unless you're reloading the config file _frequently_ or using those string ids over the network or something. But definitely makes more sense in that instance!

  • @alexpyattaev

    @alexpyattaev

    11 months ago

    Make struct MonsterID(u16), and custom constructors for it that maintain actual names in a global vec, all behind rwlock. To log you can dereference to the actual location with string data, other "normal" uses can be all in u16. then all your game logic is just moving u16 around, no pointers or anything.
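One possible shape for the interning scheme suggested above, assuming a process-wide table behind an `RwLock`. The names `NAMES`, `intern`, and `name` are invented for this sketch, and the linear scan is a simplification that a real implementation would probably replace with a map.

```rust
use std::sync::RwLock;

/// Global table of monster names; an ID is just an index into it.
/// (`RwLock::new` is usable in statics since Rust 1.63.)
static NAMES: RwLock<Vec<String>> = RwLock::new(Vec::new());

#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
struct MonsterId(u16);

impl MonsterId {
    /// Return the existing ID for `name`, or register a new one.
    fn intern(name: &str) -> MonsterId {
        let mut names = NAMES.write().unwrap();
        if let Some(i) = names.iter().position(|n| n == name) {
            return MonsterId(i as u16);
        }
        assert!(names.len() < u16::MAX as usize, "too many monster names");
        names.push(name.to_owned());
        MonsterId((names.len() - 1) as u16)
    }

    /// Look the name up again, e.g. for logging.
    fn name(self) -> String {
        NAMES.read().unwrap()[self.0 as usize].clone()
    }
}

fn main() {
    let a = MonsterId::intern("goblin");
    let b = MonsterId::intern("goblin");
    assert_eq!(a, b); // same name, same ID; game logic only moves a u16 around
    assert_eq!(a.name(), "goblin");
}
```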

  • @CamaradaArdi

    @CamaradaArdi

    11 months ago

    I can envision the scenario where you load a level from a file, then you either have a &'file str which you might not want to or clone the string once, which is really not that expensive.

  • @constantinhirsch7200
    @constantinhirsch7200 8 months ago

    Rust's Arc<str> is close to Java's default String type: both are immutable, and both will be automatically freed when no one has a reference anymore. Rust's String is closer to Java's StringBuilder. I see this as further validation that Arc<str> is in fact quite a sane type to use in many situations.

  • @_jsonV
    @_jsonV 11 months ago

    As a core developer/moderator for Manim, it makes me happy to randomly find Manim-related videos in my recommended. Great job with the explanation of which data structure to use when mutability is(n't) required, and great visuals too!

  • @aemogie

    @aemogie

    11 months ago

    manim-rs when /j

  • @Nick-lx4fo

    @Nick-lx4fo

    10 months ago

    @@aemogie somebody is probably working on it somewhere

  • @FoxDr
    @FoxDr 11 months ago

    Very good video, the advocated point is really useful indeed. I only have 2 nitpicks about it:
    - It addresses less experienced Rust developers, but you forgot to mention how to construct values of these types (not that it's exceedingly complicated). A pinned comment might help in that regard (since with the algo apparently taking a liking to it, you might get spammed with questions about construction).
    - I would generally never recommend `Rc`, since `Arc` works using relaxed atomic operations, which have no overhead compared to their non-atomic counterparts. And while the MESI protocol may cause cache misses when accessing the cache line where the counts have been updated, this is not relevant when working in a single-threaded environment. So in general, `Rc` and `Arc` have identical runtime costs (not just similar), making `Rc` useful only when you want to semantically prevent its content's ownership from being shared across threads, without preventing the content from being occasionally shared across threads.

  • @_noisecode

    @_noisecode

    11 months ago

    Great feedback, thank you. I think you're right and I went ahead and pinned the existing discussion of how to create an Arc--I agree I should have mentioned it explicitly in the video itself. Live and learn. :) As for Rc vs. Arc, your point is well made, but I think I will stick to my guns on recommending Rc where possible. Even if there are expert-only reasons to be sure there is no practical performance difference, this runs counter to the official guidance from the Rust standard library documentation which states that there may indeed be a performance difference (doc.rust-lang.org/std/sync/struct.Arc.html#thread-safety), and aside from performance alone, I would argue that the semantic argument is enough. If I know my type is not meant to be shared across threads, I ought to use the least powerful tool for the job (Rc) that allows me to accomplish that.

  • @zuberdave

    @zuberdave

    11 months ago

    Arc's clone uses Relaxed, but its drop does not (it uses Release). In any case the atomic increment in clone is going to be more expensive than a non-atomic increment whether it's relaxed or not. Possibly you are thinking about relaxed atomic loads/stores, which are typically no more expensive than regular loads/stores.

  • @MusicGod1206

    @MusicGod1206

    10 months ago

    @@zuberdave Great point!

  • @jordanrodrigues1279

    @jordanrodrigues1279

    5 months ago

    With Rc the compiler can rearrange the increment and decrement instructions and often cause them to cancel out. Sometimes Rc really is zero cost. That doesn't work with Arc.

  • @michawhite7613
    @michawhite7613 11 months ago

    Another benefit of Box<str> is that the characters are actually mutable, even though the length isn't. So you can convert the string to uppercase or lowercase if you need to.

  • @Tumbolisu

    @Tumbolisu

    11 months ago

    This makes me wonder if there are any unicode characters where the uppercase and lowercase versions take up different numbers of bytes. I imagine if you add diacritics you might find a situation where one version has a single unicode code point, while the other needs two.

  • @michawhite7613

    @michawhite7613

    11 months ago

    @@Tumbolisu Unicode groups characters from the same alphabet together, so I think this is unlikely to ever happen

  • @Tumbolisu

    @Tumbolisu

    11 months ago

    @@michawhite7613 I actually just found an example. U+1E97 (Latin Small Letter T With Diaeresis) does not have an uppercase version, which instead is U+0054 (Latin Capital Letter T) combined with U+0308 (Combining Diaeresis).

  • @Tumbolisu

    @Tumbolisu

    11 months ago

    @@michawhite7613 Oh and how could I forget! In UTF-8, ß is two bytes while ẞ is three bytes. The larger ẞ was only introduced into the German language a few years back, while the smaller ß is ancient.

  • @michawhite7613

    @michawhite7613

    11 months ago

    @@Tumbolisu You're right. The functions I'm thinking of are actually called `make_ascii_lowercase` and `make_ascii_uppercase`
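To make the distinction in this thread concrete, here is a short sketch: ASCII case conversion can happen in place on a `Box<str>` because the byte length never changes, while full Unicode case mapping may change the length and therefore returns a new `String`. The `shout` helper is hypothetical.

```rust
/// Uppercase ASCII letters in place; the byte length cannot change.
fn shout(mut name: Box<str>) -> Box<str> {
    name.make_ascii_uppercase();
    name
}

fn main() {
    let name: Box<str> = "goblin".into();
    assert_eq!(&*shout(name), "GOBLIN");

    // Full Unicode case mapping can change the length, so it must allocate:
    // Rust uppercases "ß" (2 bytes in UTF-8) to "SS".
    assert_eq!("ß".len(), 2);
    assert_eq!("ß".to_uppercase(), "SS");
}
```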

  • @dekrain
    @dekrain 10 months ago

    Small correction. With Arc<String>/Arc<Vec<T>>, the Arc pointer itself only stores 1 word, not 2, as the length is stored in the boxed cell on the heap in the String/Vec object, and String/Vec is Sized, unlike str/[T]. This can be potentially useful if space is at a premium, but you can also use a thin Box/Rc/Arc, which isn't available in the standard library yet (ThinBox is in alloc, but it's unstable), which stores the length (and maybe capacity) directly next to the data, keeping the pointer a single word.
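The handle sizes under discussion can be checked with `std::mem::size_of`. This is only a sketch: the `words` helper is made up, and the three-word size of `String` and `Vec` is what current standard-library implementations use rather than a documented guarantee.

```rust
use std::mem::size_of;
use std::sync::Arc;

/// Size of `T` measured in machine words (pointer-sized units).
fn words<T>() -> usize {
    size_of::<T>() / size_of::<usize>()
}

fn main() {
    // Fat pointers to unsized data: pointer + length.
    assert_eq!(words::<Arc<str>>(), 2);
    assert_eq!(words::<Arc<[u8]>>(), 2);
    assert_eq!(words::<Box<str>>(), 2);

    // Thin pointers to sized data: the length lives on the heap.
    assert_eq!(words::<Arc<String>>(), 1);
    assert_eq!(words::<Arc<Vec<u8>>>(), 1);

    // Owning, growable types: pointer + length + capacity.
    assert_eq!(words::<String>(), 3);
    assert_eq!(words::<Vec<u8>>(), 3);
}
```

The one-word `Arc<String>` pays for its smaller handle with a second pointer hop on every access.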

  • @giganooz

    @giganooz

    9 months ago

    Was just about to comment this. Also, hey man, didn't expect to run into you 😂

  • @phillipsusi1791
    @phillipsusi1791 44 minutes ago

    I'll go one further... monster name strings likely are static anyhow, and so you don't even need Arc<str> or Box<str>, you can just use &'static str directly. Then clones don't even need to increment a reference count on the heap, you just copy the pointer.

  • @enticey
    @enticey 11 months ago

    The infographics for each explanation are expertly simple and straightforward, never change them.

  • @kirglow4639
    @kirglow4639 11 months ago

    Awesome video and narration! Always exciting to see well-explained Rusty content. Keep it up!

  • @SJMG
    @SJMG 11 months ago

    That was really well done. I thought 15min on this topic was going to be a slog, but it was a well motivated, well visualized example. You've earned a sub. Keep up the good work, Logan!

  • @J-Kimble
    @J-Kimble 11 months ago

    I think this is the best explanation of Rust's internal memory management I've seen so far. Well done Sir!

  • @GuatemalanWatermelon
    @GuatemalanWatermelon 11 months ago

    The visuals were fantastic in guiding me through your explanation, great stuff!

  • @Otakutaru
    @Otakutaru 9 months ago

    You know that Rust is healthy as a language when there are videos about it that only rustaceans could fully understand and make use of

  • @nilseg
    @nilseg 11 months ago

    Very nice video ! I love how you explain this. Can't wait your next topic. I shared it on Reddit and already lot of views and good feedbacks. Continue your work ;)

  • @dragonmax2000
    @dragonmax2000 11 months ago

    Really awesome insight! Please continue making these.

  • @spikespaz
    @spikespaz 11 months ago

    Your channel is going to explode if you keep doing videos like this one.

  • @TobiasFrei
    @TobiasFrei 9 months ago

    I really admire your dense, concise way to "think" in Rust 🤓

  • @Dominik-K
    @Dominik-K 8 months ago

    Thanks a bunch for the clarifications. Memory allocations are one of the major factors in shaping performance characteristics and understanding them may not always be an easy task. Your video and especially the visualization help a lot! Great work

  • @wetfloo
    @wetfloo 11 months ago

    loved the video, and loved the discussions in the comments too. really appreciate it as the rust beginner, keep it up!

  • @robertotomas
    @robertotomas 11 months ago

    Awesome! Thank you for this explanation. I’ve heard bits and pieces of this before and it was making sense that I should start doing this as I am learning rust… but this one video gave me a ton of context; I think I’m actually going to do this as a refactor phase now 😊

  • @waynechoi883
    @waynechoi883 8 months ago

    Just making this change on a large vec in my program resulted in a 5x speed up for me. Thanks for the video!

  • @leddoo
    @leddoo 10 months ago

    love it! i often do something similar with `&'a [T]` by allocating from an arena/bump allocator. (this also has the added benefit that the allocation truncation is free)

  • @tommyponce2511
    @tommyponce2511 11 months ago

    Started watching the video thinking Arc and Vec had totally different use cases, and I'm glad you proved me wrong lol very useful info when you're trying to implement memory efficient coding. Thanks for the data man, really interesting and useful stuff. Cheers!

  • @NoBoilerplate
    @NoBoilerplate 11 months ago

    Fantastic video, wow!

  • @JeremyChone
    @JeremyChone 11 months ago

    Nice video, and very interesting take. I am going to give this pattern a try in some of my code and see how it goes. Thanks for this great video!

  • @marcb907
    @marcb907 11 months ago

    Interesting content and well explained. You should do more videos like this.

  • @jermaineallgood
    @jermaineallgood 11 months ago

    Thank you for this insight! I’d never think to use Arc instead of Vec, probably use Criterion to see performance timing between both

  • @kaikalii
    @kaikalii 11 months ago

    This is a great video. I'd love to see more like it.

  • @irlshrek
    @irlshrek 11 months ago

    this was fun! its like when they say "make it work, then make it right, then make it fast". This is a really good example for what to do in that second or third step!

  • @Calastrophe
    @Calastrophe 11 months ago

    I don't typically comment on videos. I have to say this was really well made, please keep up this level of content. I really enjoyed it.

  • @endogeneticgenetics
    @endogeneticgenetics 9 months ago

    `str` can also be accessed across threads via `&str` (since it's immutable). And cloning has no special properties I can think of here since the data is immutable. `Arc<str>` only seems advantageous if you want reference counting vs relying on something like `'static` for a string. The video was fun either way -- but can you give a reason you'd prefer the Arc or Rc fat pointer to just referencing str?

  • @andres-hurtado-lopez
    @andres-hurtado-lopez 11 months ago

    Is not only a beautiful insight on the internals of memory allocation but also does an implacable job of explaining the topic in plain English so even entry level developers can understand the good, the bad and the ugly. Keep doing such a great job divulging such an awesome programming language !

  • @nel_tu_

    @nel_tu_

    11 months ago

    i think u meant to say impeccable

  • @andres-hurtado-lopez

    @andres-hurtado-lopez

    11 months ago

    @@nel_tu_ Yup, sorry about the typo

  • @joelmontesdeoca6572
    @joelmontesdeoca6572 11 months ago

    This was fantastic. Thank you for making this video.

  • @asefsgrd5573
    @asefsgrd5573 11 months ago

    I would also mention `.as_ref()` as some impl types require the exact `str` type. Great video!

  • @otaxhu8021
    @otaxhu8021 9 months ago

    great video. I'm learning Rust and this video is very helpful for understanding different ways of storing data. I'm struggling with borrowing and ownership system but well I couldn't do any better

  • @RenderingUser
    @RenderingUser 8 months ago

    This could not have come at a more perfect time. I've been storing a list of a list of immutable data with a thousand elements in a vec

  • @ronniechowdhury3082

    @ronniechowdhury3082

    8 months ago

    You should not be storing 2 d arrays, switch to contiguous storage and store the dimensions. ndarray might be an option

  • @RenderingUser

    @RenderingUser

    8 months ago

    @@ronniechowdhury3082 I wish I knew what contiguous storage means.

  • @ronniechowdhury3082

    @ronniechowdhury3082

    8 months ago

    @@RenderingUser just create a struct that stores each row appended together in one long vec. Then store the width and height as usize. Finally add some methods that access a row or column at a time by slice. It will make your data access significantly faster.
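A rough sketch of the contiguous layout described above, assuming row-major order. The `Grid` type and its method names are invented for illustration. Rows come back as slices, but a column is not contiguous in this layout, so it is exposed as a strided iterator instead.

```rust
/// A 2D grid stored as one contiguous, row-major `Vec`.
struct Grid<T> {
    data: Vec<T>,
    width: usize,
    height: usize,
}

impl<T: Clone> Grid<T> {
    /// Create a `width` x `height` grid with every cell set to `fill`.
    fn new(width: usize, height: usize, fill: T) -> Self {
        Grid { data: vec![fill; width * height], width, height }
    }
}

impl<T> Grid<T> {
    /// One row as a contiguous slice.
    fn row(&self, y: usize) -> &[T] {
        assert!(y < self.height);
        &self.data[y * self.width..(y + 1) * self.width]
    }

    /// One column as a strided iterator over the flat buffer.
    fn column(&self, x: usize) -> impl Iterator<Item = &T> + '_ {
        assert!(x < self.width);
        self.data.iter().skip(x).step_by(self.width)
    }

    fn set(&mut self, x: usize, y: usize, value: T) {
        assert!(x < self.width && y < self.height);
        self.data[y * self.width + x] = value;
    }
}

fn main() {
    let mut grid = Grid::new(3, 2, 0);
    grid.set(1, 1, 7);
    assert_eq!(grid.row(1), &[0, 7, 0]);
    assert_eq!(grid.column(1).copied().collect::<Vec<i32>>(), vec![0, 7]);
}
```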

  • @jehugaleahsa
    @jehugaleahsa 7 months ago

    I think what would have helped me was a quick example of how you initialize an Rc<str>, Arc<str>, and Box<str>. It's pretty obvious when the str is a compile time constant, but less obvious when it's from a runtime string. Do you simply create a String and then Arc::new on it? Does memory layout change when it's a compile-time vs runtime string?

  • @_noisecode

    @_noisecode

    7 months ago

    Check the pinned comment! (Spoiler: it's Arc::from("foo"), or Arc::from(my_string)). Memory layout doesn't change, as they're both the same type (Arc<str>).

  • @thorjelly
    @thorjelly 11 months ago

    I have a few concerns recommending this to a beginner "as a default". I feel like the times when you actually want to clone the arc, such as if you want to store the same list in multiple structs without dealing with lifetimes, are quite situational. Most of the time, what you should do is dereference it into a slice to pass around, because it is more performant and it is more general. But I am afraid that using an arc "as a default" would encourage a beginner to develop the bad habit of just cloning the arc everywhere. The need to pass an immutable reference/slice is not enforced by the compiler, but it is with other data types. Worse, this could give beginners a bad misunderstanding how clone works, because arc's clone is very different from most other data type's clone. Do we want the "default" to be the absolute easiest, absolute most general solution? Or do we want the default solution to be the one that enforces the best habits? I would argue for the latter.

  • @rossjennings4755

    @rossjennings4755

    10 months ago

    So what you're saying is that we should be recommending Box<[T]> as default, then. Makes sense to me.

  • @thorjelly

    @thorjelly

    10 months ago

    @@rossjennings4755 I would say if you're using Box<[T]> you might as well just use Vec<T>, unless for some reason you want to guarantee that it will never be resized.

  • @constantinhirsch7200

    @constantinhirsch7200

    8 months ago

    When you look at the Rust language survey, one big hurdle always mentioned is the steep learning curve of Rust. Just using Arc<str> for all Strings by default may alleviate that burden. Performance is at least on par with any GC'ed language with immutable Strings (e.g. Java) and those also run fast enough most of the time. And secondly, who is to say that all Rust programs must always be optimized for runtime performance? If you do some rapid development (i.e. optimizing for developer performance) in Rust, of course you can use Arc<str> and then later on *if* the program is too slow you can still come back and optimize the critical parts. From that point of view, thinking about lifetimes a lot early in development, just to avoid the reference counting might even be considered a premature optimization.

  • @4xelchess905

    @4xelchess905

    8 months ago

    @@thorjelly The video mentions immutable data, in which case it won't be resized. But yeah totally agree on what you said, the default good practice should be Vec/Box for the owner and &[T] for the readers, and only consciously opt for Rc when useful or necessary.

  • @4xelchess905

    @4xelchess905

    8 months ago

    @@constantinhirsch7200 "who is to say that all Rust programs must always be optimized for runtime performance?" Logan Smith. Logan Smith is to say precisely that. The whole point of the video you just watched is to advocate that Arc<[T]> is more performant at runtime than Vec<T>, while being a drop-in replacement. The gripe thorjelly and I have with it is that Arc<[T]> is a lazy, half-assed optimization. If you want to delegate optimization for later, why touch the code at all, why learn smart pointers the wrong way when you could stick to cloning Strings? Wouldn't that be premature optimization, or at least premature? If you want to optimize, why use smart pointers when a slice reference is both enough and better?

  • @SophieJMore
    @SophieJMore 11 months ago

    Arc<str> is sort of similar to how a lot of other languages like Java or C# handle strings, isn't it?

  • @dsd2743
    @dsd2743 6 months ago

    As for Arc<str>: Depending on the use case, you can just Box::leak() a String and pass around the &'static str. Typically, especially if used as IDs, the total number of such strings is low anyway.
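The leaking approach could look something like this. `leak_id` is a hypothetical helper, and the leaked memory is never freed, so this is a sketch for small, long-lived sets of IDs only.

```rust
/// Leak an owned `String` to obtain a `&'static str`.
/// The allocation is intentionally never freed.
fn leak_id(name: String) -> &'static str {
    // `into_boxed_str` drops any excess capacity before leaking.
    Box::leak(name.into_boxed_str())
}

fn main() {
    // Pretend this string was read from a config file at runtime.
    let loaded = format!("goblin-{}", 7);
    let id: &'static str = leak_id(loaded);

    // `&'static str` is `Copy`: no reference count, no heap traffic.
    let copy = id;
    assert_eq!(id, "goblin-7");
    assert_eq!(id.as_ptr(), copy.as_ptr());
}
```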

  • @sanderbos4243
    @sanderbos4243 10 months ago

    Your graphics and script are a masterpiece

  • @tyu3456
    @tyu3456 11 months ago

    Awesome video!! Btw I love the font you're using, looks kinda like LaTeX

  • @_noisecode

    @_noisecode

    11 months ago

    It is! Courtesy of the Manim library--see the link in the description. :)

  • @rohankapur5776
    @rohankapur5776 10 months ago

    this was very informative. we need more rust golfing vids on youtube!

  • @ouchlock
    @ouchlock 11 months ago

    awesome, wanna more content on Rust like this

  • @timClicks
    @timClicks 11 months ago

    Love this Logan. What a wonderful explanation and a good challenge to orthodoxy. I'll provide one answer to the question that you posed a few times in the video, "Why use String (or Vec) rather than Arc?". That's because accessing the data from an Arc incurs some runtime cost to ensure that Rust's ownership semantics are upheld. That cost doesn't need to be paid by exclusively owned types.

  • @_noisecode

    @_noisecode

    11 months ago

    Thanks for the kind words. :) Accessing an Arc incurs no runtime cost with regard to Rust's ownership rules. The runtime cost of accessing the pointed-to data is about the same as for Vec: a pointer indirection. Possibly you are thinking of RefCell? RefCell does involve some slight runtime overhead due to essentially enforcing Rust's borrow checking rules at runtime.

  • @timClicks

    @timClicks

    11 months ago

    @@_noisecode Oof, I knew that I should have looked that up. You're right

  • @DenisAndrejew
    @DenisAndrejew 10 months ago

    Good food for thought and illustrations, but I very much wish you would use Rc instead of Arc in most of this, and then showed folks how to determine if you actually need to "upgrade" to Arc when necessary. Healthier practice for the whole Rust ecosystem to not default to thread-safe types & operations when not actually necessary. We'll all pay with decreased performance of the software we use proportionately to how much thread-safe code is overused. 🙂

  • @zeburgerkang
    @zeburgerkang 4 months ago

    subbed and saved for future reference... easy to understand explanation.

  • @TCSyndicate
    @TCSyndicate 1 month ago

    Commenters have pointed it out somewhat, but this video represents a misunderstanding of the purpose of different types. What you want here is a &[T], not an Arc<[T]>. The confusion is that sometimes you feel forced to make an allocation, because you're doing something like giving monsters IDs from a loaded configuration file. In that case, you make 1 allocation at the start of the program for the config file, then each monster holds a &str to that allocation. Having to make an allocation for the config file doesn't mean you need to make, or hold, an allocation for each thing that uses the config file. Consider writing a programming language implementation, with multiple parsing phases. The efficient thing to do is to make 1 String allocation at the start of the program for the source code, then a lex(&str) -> Vec<&str>, containing subslices of the original String buffer.
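As an illustration of the borrowing lexer sketched in the comment above, here is a deliberately simplistic version that splits on whitespace. A real lexer would do more, but the ownership picture is the same: one `String` allocation for the source, and tokens that are subslices of it.

```rust
/// Split source text into whitespace-separated tokens that borrow from `src`.
fn lex(src: &str) -> Vec<&str> {
    src.split_whitespace().collect()
}

fn main() {
    // The one allocation holding the source text.
    let source = String::from("let x = 42 ;");
    let tokens = lex(&source);
    assert_eq!(tokens, ["let", "x", "=", "42", ";"]);

    // Each token points into `source`'s buffer rather than owning a copy.
    let base = source.as_ptr() as usize;
    assert_eq!(tokens[1].as_ptr() as usize - base, 4);
}
```

The trade-off is that every token carries the lifetime of `source`, which is the constraint `Arc<str>` removes at the cost of a reference count.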

  • @rsnively
    @rsnively 11 months ago

    Great explanation

  • @kdurkiewicz
    @kdurkiewicz 8 months ago

    There's a disadvantage of using Rc/Arc though: these types are not serializable, while String is.

  • @_noisecode

    @_noisecode

    8 months ago

    As I mentioned in the video, there's a serde feature flag that enables support for Rc/Arc. Check the docs.

  • @Shaunmcdonogh-shaunsurfing
    @Shaunmcdonogh-shaunsurfing 11 months ago

    I’ve turned on the bell notification. Also, happy to pay for a cheat sheet on memory allocation recommendations.

  • @torsten_dev
    @torsten_dev 9 days ago

    I'd prefer a Cow.

  • @jordanrodrigues1279
    @jordanrodrigues1279 5 months ago

    Borrowing &str from Arc is pretty much free, it doesn't touch the counters. (There's a caveat.) Cloning or dropping Arc, checking for uniqueness (make_mut), stuff with weak references, those do touch the counters and typically require fences. The caveat is that Rust standard library does put the counters next to the data, so any thread updating counters causes cache line contention for other threads that are merely reading &T. And to put this in perspective, memory fences are lighter than system calls or io. There are no gains to be had unless you're CPU-intensive. (Which you might be. Zoom-zoom.)

  • @isaaccloos1084
    @isaaccloos1084 10 months ago

    Great video, hope you make more like it 👍🏻

  • @alexpyattaev
    @alexpyattaev 11 months ago

    The amount of memory allocated to a String during a clone operation is actually allocator-specific. For a 6-byte string I would not be surprised to see the allocator using an 8-byte bucket. So there will always be a degree of waste when cloning strings/vecs.

  • @krzyczak
    @krzyczak 11 months ago

    Fantastic video!

  • @scvnthorpe__
    @scvnthorpe__ 10 months ago

    Weird question but, if a Vec is meant to be growable, why does it have a defined capacity? My best guess is that some people might initialise a Vec with n null spaces for performance reasons (given general expected requirements) and then you'd need to know if you can safely go faster with allocations in the intended way... but it's, let's be real, a pretty poor guess lol.
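For what it's worth, the question above can be explored with a small experiment. Capacity is the amount of space a Vec has reserved, as opposed to `len`, the amount in use, and pushes that fit within the reservation are documented not to reallocate. The helper name is made up for this sketch.

```rust
/// Push `n` items into a Vec pre-sized for `n`; report whether the buffer moved.
fn buffer_moved_while_filling(n: usize) -> bool {
    let mut v: Vec<usize> = Vec::with_capacity(n);
    let before = v.as_ptr();
    for i in 0..n {
        v.push(i); // fits in the reserved space, so no reallocation
    }
    before != v.as_ptr()
}

fn main() {
    assert!(!buffer_moved_while_filling(1000));

    let mut v: Vec<u32> = Vec::with_capacity(4);
    assert_eq!(v.len(), 0); // nothing stored yet...
    assert!(v.capacity() >= 4); // ...but room is reserved

    v.extend(0..5); // one more than requested; the Vec grows if it has to
    assert!(v.capacity() >= 5);

    // Converting to a boxed slice gives up the spare capacity for good.
    let boxed: Box<[u32]> = v.into_boxed_slice();
    assert_eq!(boxed.len(), 5);
}
```

Reserving more than is currently needed is what keeps repeated `push` calls amortized cheap, and it is also the extra word that `Box<[T]>` and `Arc<[T]>` do not carry.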

  • @felgenh399
    @felgenh399 11 months ago

    Interesting never really thought of this. Nice vid.

  • @clockworkop
    @clockworkop 11 months ago

    Hello, the video is great and I really like the point you are making, especially the cache locality. I will definitely give this a try at some point. Even with that though, I have a few questions. By wrapping the values in Arc, you are effectively turning clones into references without lifetimes. I understand that sometimes it's better and easier to work with the full owned value, but if you need that, you can just clone the shared reference on demand. I don't know why, but this feels a bit like Swift to me. Rust has the advantage of the ownership model so if you can do the job just with shared references, I don't see the need for Arc. But of course I could be wrong so please correct me if that's the case.

  • @_noisecode

    @_noisecode

    11 months ago

    I think it's really insightful for you to compare Arc to something you might find in Swift. Arc<[T]> does share a lot of similarities with Swift's copy-on-write Array struct, and Foundation's NSArray (where `-copy` just gives you back the same array with an increased reference count). The core insight is the same: for an immutable data structure, a shallow copy is equivalent to a deep copy. Rust's superpower is of course that you can hand out non-owning &[T]s and be sure you aren't writing dangling reference bugs. And the video does not intend to dispute that! You should absolutely design your code to hand out references instead of ownership where it makes sense to do so. In my video, I'm just pointing out that Arc<[T]> can be an optimization over Vec<T> when you were already using Vec<T>--in other words, in places where you've already decided that you need to give out ownership.

  • @GoogleUser-id4sj
    @GoogleUser-id4sj 11 months ago

    Great video and animation!

  • @CamembertDave
    @CamembertDave 11 months ago

    I agree with your premise and the reasons you give from 12:45, but I found your main arguments kinda... odd? In your opening points you say this is especially useful for data that implements Clone, but the usage pattern you lay out explicitly involves not cloning the data. You clone Strings in the example, but there's clearly no reason to do that because the data is immutable - you're only cloning the Strings to get multiple references to that data. Passing around multiple references to a single piece of data is the whole point of Arc, so of course that is a better solution than duplicating the data to share it. It actually feels like that's the real reason you are wanting to use Arc, but it's not mentioned in the video. You do make a good point of explaining the inefficiency of Arc though. The example itself also strikes me as odd, because the ids are a struct which implements Clone and then to avoid the performance cost of cloning the ids all over the place you reach for Arc, when surely the more natural optimization is to avoid the unnecessary cloning by using a struct which implements Copy instead of Clone, say MonsterId(u64)? If you really need the string data for something other than simply being an id, then you can put that in the EnemyStats struct (which I would assume contains various other data you don't want to be copying around even if immutable). As I said though, I do agree with your overall point. Perhaps an example that used Vec would have cleared these points up, because although - as you quite rightly point out - String and Vec work in essentially the same way, they are quite distinct semantically in most situations. It would be obvious that calling clone on the (probably very long) enemies_spawned Vec is a bad idea, for example, even if this was immutable.
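The `MonsterId(u64)` idea from this comment could look roughly like the sketch below. The fields of `EnemyStats` and the helper `lookup_name` are assumptions made up for illustration; the video's actual types may differ.

```rust
use std::collections::HashMap;

/// A small id that is `Copy`, so passing it around never touches the heap.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash, PartialOrd, Ord)]
struct MonsterId(u64);

/// Hypothetical stats record; the human-readable name lives here, not in the id.
struct EnemyStats {
    name: String,
    health: u32,
}

/// Looks up the display name for an id, if that id is known.
fn lookup_name(stats: &HashMap<MonsterId, EnemyStats>, id: MonsterId) -> Option<&str> {
    stats.get(&id).map(|s| s.name.as_str())
}

fn main() {
    let goblin = MonsterId(1);
    let mut stats = HashMap::new();
    stats.insert(
        goblin,
        EnemyStats { name: "Goblin".to_string(), health: 30 },
    );

    // `goblin` is copied bit-for-bit here and stays usable afterwards.
    let spawned = vec![goblin, goblin, goblin];
    println!(
        "{} spawned, first is {:?}",
        spawned.len(),
        lookup_name(&stats, spawned[0])
    );
    println!("health: {}", stats[&goblin].health);
}
```

The trade-off is that the string is no longer carried by the id itself, so anything that needs the name has to go through the stats table.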

  • @soulldev
    @soulldev 11 months ago

    Great video, great channel, Arc is so good.

  • @markay7311
    @markay7311 11 months ago

    This to me seems like comparing apples and oranges. As you mentioned, Vec works well for a modifiable buffer. Yet you advocate for using a simple slice wrapped by Arc. This assumes you have the slice at compile time. How would you build your slice dynamically without Vec? It seems to me you would still need a Vec, which you can convert into [T] to wrap with Arc. Even worse, Arc is usually for multithreaded environments. Why not just use Rc? My point is, I don't see this suggestion making much sense, as these two types have very different specific use cases. The video was well made though, I appreciate the great effort.

  • @MusicGod1206

    @MusicGod1206

    10 months ago

    vec![...].into() gives you an Arc<[T]>, so it's one "clone", and from then on no expensive clones at all. So you build it initially with Vec, and then convert it. After this conversion, all the points in the video apply. Regarding Rc vs Arc, see 3:20.
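A minimal sketch of the build-then-convert flow described in this reply. The function name `build_ids` is made up; the conversions themselves (`From<Vec<T>>` and `FromIterator` for `Arc<[T]>`) are in the standard library.

```rust
use std::sync::Arc;

/// Builds the data with a growable Vec, then freezes it into an Arc<[u32]>.
fn build_ids() -> Arc<[u32]> {
    let mut scratch: Vec<u32> = Vec::new();
    for i in 0..5 {
        scratch.push(i * 10);
    }
    // One allocation + copy happens here; later clones are refcount bumps.
    scratch.into()
}

fn main() {
    let ids = build_ids();
    let shared = Arc::clone(&ids);

    // Both handles point at the same heap buffer.
    println!("same buffer: {}", Arc::ptr_eq(&ids, &shared));
    println!("strong count: {}", Arc::strong_count(&ids));

    // Collecting an iterator straight into an Arc<[T]> also works.
    let collected: Arc<[u32]> = (1..=3).collect();
    println!("collected {} items", collected.len());
}
```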

  • @markay7311

    @markay7311

    10 months ago

    @@MusicGod1206 it sounds to me like simply borrowing would do the trick

  • @expurple

    @expurple

    9 months ago

    @@markay7311 It would, but only in a single-threaded environment and only if there's an obvious owner that outlives the rest. Also, Rc/Arc don't require lifetime annotations (I don't mind these, but only for simple cases with temporary "local" borrowing)

  • @user-uf4rx5ih3v

    @user-uf4rx5ih3v

    6 months ago

    @@expurple Using a non mutable reference is perfectly fine for as many threads as you wish. Arc isn't some magic savior.
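To illustrate the point about plain shared references, here is a small sketch using `std::thread::scope`, which lets threads borrow a `&[T]` as long as they finish before the scope ends. The function name `parallel_sum` is made up. Note the limitation this thread is circling: `std::thread::spawn` requires `'static` data, which is one of the cases where Arc (or leaking) comes in.

```rust
use std::thread;

/// Sums a slice on two scoped threads that only borrow the data.
fn parallel_sum(data: &[u64]) -> u64 {
    let (left, right) = data.split_at(data.len() / 2);
    thread::scope(|s| {
        // Both closures capture shared references; nothing is cloned.
        let a = s.spawn(|| left.iter().sum::<u64>());
        let b = s.spawn(|| right.iter().sum::<u64>());
        a.join().unwrap() + b.join().unwrap()
    })
}

fn main() {
    let data = vec![1, 2, 3, 4, 5, 6];
    println!("sum = {}", parallel_sum(&data));
}
```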

  • @iamtheV0RTEX
    @iamtheV0RTEX 11 months ago

    Very neat insight! My experience with Arc so far has mostly been limited to either "fake garbage collection" with the Arc anti-pattern, or sharing immutable data between threads or async futures. I've tried avoiding cloning Vecs/Strings by passing around &[T] and &str references (and their &mut counterparts) but putting lifetime annotations in your hashmap keys is a nightmare.

  • @blehbleh9283

    @blehbleh9283

    11 months ago

    How is that an antipattern for async shared state?

  • @iamtheV0RTEX

    @iamtheV0RTEX

    11 months ago

    @@blehbleh9283 I didn't say it was, I said fake GC was the antipattern, where you give up on handling lifetimes and overuse Rc and Arc in cases where it's not necessary.

  • @blehbleh9283

    @blehbleh9283

    11 months ago

    @@iamtheV0RTEX oh okay! Thanks for teaching

  • @trustytrojan
    @trustytrojan 11 months ago

    great video, im not the most familiar with rust but this explanation resonated with me but mostly all this made me think of is how Java's String is straight up immutable from the get-go 😂

  • @habba5965

    @habba5965

    11 months ago

    Rust's str literal is also immutable.

  • @Turalcar
    @Turalcar 11 months ago

    I'd also look into compact_str (there are other similar crates but this one is the fastest of those I tried).

  • @Shaunmcdonogh-shaunsurfing
    @Shaunmcdonogh-shaunsurfing 11 months ago

    Fantastic video

  • @jeffg4686
    @jeffg4686 8 months ago

    🦀🦀🦀🦀🦀🦀🦀🦀🦀🦀 This tutorial got a 10 crab review, and a second viewing

  • @Rose-ec6he
    @Rose-ec6he 11 months ago

    I'm not fully convinced. I'd love to see a follow-up video about this. Here are my thoughts. When I first saw this pop up in my feed I was very confused, because Arc is a wrapper around a pointer and Vec is a data structure, so comparing an Arc to a Vec seems like an unfair comparison. It seems more appropriate to me to compare Arc<[T]> to Arc<Vec<T>>, and there's very little difference there, though I suppose specifically when dealing with strings it's not easy to get access to and use the underlying vec. Nonetheless, it makes more sense to me to compare the two. Until you brought up the fact that Arc implements Deref I was thinking it was all round a pointless idea, but now I'm split on the issue. Something else to consider is ease of use, which I don't think you addressed very well. Lifetimes will definitely come into play here but don't with String, so it won't be just as easy to pass around at all. Another barrier is that if you need to build the string at runtime you will normally end up with a vec anyway, which could be shrunk to size, and putting the vec behind an Arc would achieve mostly the same thing; in comparison, having an array pre-built at compile time is very rare in my experience. There are definitely extra steps and effort involved here which I'm not convinced you have considered carefully. There is no built-in way to convert from a vec to an array; there are some useful crates, but more libraries always mean more complexity in your codebase, so they're best not added without some consideration. I also think the performance benefits you state are very exaggerated, and it's never worth talking about performance without having some benchmarks to back the claims up, imo. Strings are rarely large too, so the memory reduction might be there but it would be small. Once again, there are no benchmarks to back any of this up, so I don't know and I'm not set in either perspective. I'll keep an eye on your channel. I hope to see some follow-up!

  • @dimitardimitrov3421
    @dimitardimitrov3421 10 months ago

    Best Rust channel on YT! Super high quality!

  • @cookieshade197
    @cookieshade197 11 months ago

    I'm confused by the use case presented -- if you want cloneable, immutable string data, surely you'd just pass around indices into a big Vec, or even just &str's directly if the lifetimes allow it? Good video nonetheless.

  • @iwikal

    @iwikal

    11 months ago

    Sure, you could always construct a Box<str> and then Box::leak it to get an immortal &'static str if you're fine with never reclaiming the memory. This memory leak could become a problem if it's unbounded though. Imagine the game is able to spawn arbitrarily many monsters over time, creating more and more IDs. I'm assuming by immutable he meant "immutable as long as it's in use, but after that it gets deleted". If you want to reclaim memory by getting rid of unused IDs, the Vec strategy gets iffy. What if you want to delete an ID in the middle of the Vec? Not an unsolvable problem, but it's already getting much more complex than the simple MonsterID(String) we started with. Plus, if you actually want to access the string contents you need access to the Vec, so you need to pass around a reference to it. And if you're going multithreaded you need to protect it with a Mutex or similar. I'm not a fan.
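For reference, the leak approach mentioned at the start of this reply can be sketched as below. The function name `intern_forever` is made up; `Box::leak` is the real standard-library call. As the reply says, the memory is never reclaimed, so this only suits a bounded set of ids.

```rust
/// Copies `name` to the heap and leaks it, yielding a `&'static str`.
/// The allocation is intentionally never freed.
fn intern_forever(name: &str) -> &'static str {
    let boxed: Box<str> = name.into();
    // Box::leak returns a &'static mut str, which coerces to &'static str.
    Box::leak(boxed)
}

fn main() {
    let id: &'static str = intern_forever("goblin");
    // A &'static str is Copy, so handing it around is just a pointer + length.
    let also_id = id;
    println!("{id} {also_id}");
}
```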

  • @cookieshade197

    @cookieshade197

    11 months ago

    @@iwikal Hmm, all true on paper. I would assume that, even in a very large game, all monster ID strings ever encountered during runtime are a finite set taking up at most 10kB or so in total, or maybe 1MB if we have very long text descriptions. If the game can dynamically generate large numbers of monster ID strings, or load/deload bigger data chunks, I'd try replacing the Vec with a HashMap or similar, though that gets awkward with multithreading for the same reason.

  • @iwikal

    @iwikal

    11 months ago

    @@cookieshade197 If you leak all IDs and the game keeps allocating new ones, it will run out of memory sooner or later (potentially much later). Maybe you can get away with it if you assume that nobody will leave the game running for more than 24h straight, but what if it's a server? Ideally it should be able to handle years of uptime.

  • @iwikal

    @iwikal

    11 months ago

    @@cookieshade197 To elaborate, what I mean is you can't always assume that there is a reasonably sized set of possible IDs, and even if there was you'd have to construct some kind of mechanism for reusing the old ones. Say we were talking about a ClientId instead, based partially on their IP address. It just seems wrong to me that I should let those stay around in memory after the connection is terminated, until the same client connects again. Maybe they never do, in which case the memory is wasted.

  • @masondeross

    @masondeross

    11 months ago

    @@iwikal The issue isn't running out of memory. That is almost never going to happen in a real game. The issue is cache misses. You want to be able to perform operations on a large number of monsters every single frame, and every unnecessary byte (for the particular operation, hence games using data orientated design where "wasteful" copies of monsters using different structures are fine as long as only minimal related data is kept in context for each part of logic; it isn't about total memory usage in games, which is very counterintuitive to other domains) is another monster pushed off the cache.

  • @SaHaRaSquad
    @SaHaRaSquad 11 months ago

    For short strings the smartstring library is even better: it stores strings of up to 23 bytes length in-place without any heap allocations, and imitates the String type's interface. Basically like smallvec but for strings.

  • @khuiification
    @khuiification 8 months ago

    Great video, good explanation! Would be cool with some real world examples. I don't see why you would want to use strings as the ID here in the first place, just use a u32. Its good to explain, but i can't really think of when i would need this.

  • @jeffg4686
    @jeffg4686 8 months ago

    Great tutorial. One thing I was thinking about recently is the overuse of Result - not all functions are fallible, yet many unnecessarily return Result instead of just a value for the infallible functions. I think everyone just got used to returning Result... Worth looking into. Also worth a clippy lint if there isnt one for this. For an API, it should always be Result oc, but we're often not developing apis

  • @zombie_pigdragon

    @zombie_pigdragon

    8 months ago

    Hm, do you have any examples where this has happened? I've never seen it in the wild.

  • @anaselgarhy
    @anaselgarhy 11 months ago

    Useful information, thx

  • @blehbleh9283
    @blehbleh9283 11 months ago

    Arc is a godsend for concurrency

  • @MagicNumberArg
    @MagicNumberArg 9 months ago

    Do you use Copy On Write for any cases?

  • @rikschaaf
    @rikschaaf 9 months ago

    8:18 I come from a different programming language, so correct me if I make wrong assumptions. I was surprised that the cloning of the "Goblin" string required actual copying. Are these underlying character arrays mutable or something? If not, don't you only need such copying when you modify the string, like when concatenating or when replacing certain characters?

  • @SimonBuchanNz

    @SimonBuchanNz

    9 months ago

    Yes, String is a mutable string. If you start looking into how GC languages handle strings under the hood, you start noticing a pattern of a lot of hidden heavy lifting, be that C# creating implicit StringBuilders, JS using a complicated rope data structure that you pay for later, and so on. For them, that's fine, but in Rust you want to be the one picking what the behavior is, and this is advocating actually doing that.

  • @Xld3beats
    @Xld3beats 7 months ago

    Went down the rabbit hole, the important thing I was missing is Box is not the same as Box!!!

  • @_noisecode

    @_noisecode

    7 months ago

    Indexing into Arc<[T]>/Box<[T]> etc. works just fine because they deref to [T], which is the thing that implements indexing. Try it out!
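A quick way to try it, as a minimal sketch (the helper `get_copy` is made up for illustration): indexing, slicing, and slice methods such as `get` all reach the inner `[T]` through `Deref`.

```rust
use std::sync::Arc;

/// Returns a copy of the element at `i`, or `None` if `i` is out of bounds.
/// `get` is a slice method, reached through Arc's Deref impl.
fn get_copy(items: &Arc<[i32]>, i: usize) -> Option<i32> {
    items.get(i).copied()
}

fn main() {
    let arc: Arc<[i32]> = Arc::from(vec![10, 20, 30]);
    let boxed: Box<[i32]> = vec![10, 20, 30].into_boxed_slice();

    // Plain indexing and range slicing work on both smart pointers.
    let second_from_arc = arc[1];
    let second_from_box = boxed[1];
    let tail: &[i32] = &arc[1..];

    println!("{second_from_arc} {second_from_box} {tail:?} {:?}", get_copy(&arc, 5));
}
```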

  • @Erhune
    @Erhune 11 months ago

    In your final section about Arc<String>, your diagram shows Arcs having ptr+len, but in this case String is a Sized type so the Arc<String> only has ptr. Of course that doesn't undermine your point that Arc<String> is just plain bad :)

  • @_noisecode

    @_noisecode

    11 months ago

    Ack, you're right! That Arc pointing to the String should be just a single pointer, no len. Thanks for pointing that out! My mistake.
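The pointer widths discussed here can be checked with `std::mem::size_of`. This is a sketch of what current Rust implementations report, not a layout guarantee from the language; in particular the three-word size of `String` is an implementation detail. The function name `sizes_in_words` is made up.

```rust
use std::mem::size_of;
use std::sync::Arc;

/// Sizes of the handle types, measured in machine words:
/// [Arc<String>, Arc<str>, Box<str>, String].
fn sizes_in_words() -> [usize; 4] {
    let word = size_of::<usize>();
    [
        size_of::<Arc<String>>() / word, // thin pointer: String is Sized
        size_of::<Arc<str>>() / word,    // fat pointer: ptr + len
        size_of::<Box<str>>() / word,    // fat pointer: ptr + len
        size_of::<String>() / word,      // ptr + len + capacity
    ]
}

fn main() {
    println!("{:?}", sizes_in_words());
}
```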

  • @craftminerCZ
    @craftminerCZ 11 months ago

    One thing to note about Box is that if you're trying to allocate a massive array on the heap, you'll hit one of its fundamental problems: the value is first created on the stack and only then copied onto the heap. This makes it very easy to overflow the stack when you're supposedly allocating on the heap, just by trying to Box an array that is too large for the small default stack size Rust has.

  • @MusicGod1206

    @MusicGod1206

    10 months ago

    Is there any workaround for this?
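One commonly suggested workaround, sketched under the assumption that a zero-filled byte buffer is what you need: build the buffer through `vec!`, which allocates on the heap directly, and convert it with `into_boxed_slice`, rather than writing `Box::new([0u8; N])`. The constant `N` and the function name `big_buffer` are made up for illustration.

```rust
/// 8 MiB, which is larger than many default thread stack sizes.
const N: usize = 8 * 1024 * 1024;

/// Allocates a zeroed buffer on the heap without first building an
/// N-byte array on the stack.
fn big_buffer() -> Box<[u8]> {
    vec![0u8; N].into_boxed_slice()
}

fn main() {
    let buf = big_buffer();
    println!("allocated {} bytes on the heap", buf.len());
}
```

The result is a `Box<[u8]>` rather than a `Box<[u8; N]>`, so the length is tracked at runtime instead of in the type.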

  • @Outfrost
    @Outfrost 10 months ago

    Why would the clone performance of Arc be a factor? You get a pointer to the same exact slice. That's like taking an immutable reference to a Vec, which is faster. It does not fulfil the same role as a Vec clone, so it should not be compared to it. I also don't think your stack size and cache locality argument works for anything besides a static string slice. I can't imagine the semantic gymnastics needed to justify iterating over a significant number of Arc clones pointing to the same [T] in memory. In general I think you're making a different argument than you think you're making, and giving a different programming tip than you think you're giving.

  • @sharperguy
    @sharperguy 9 months ago

    So now I wonder what kind of situations Cow would be more appropriate when modifying the data might be required.

  • @ihgnmah
    @ihgnmah 8 months ago

    If you don't support clone and your data is immutable, wouldn't &str be sufficient as you can have many shared references to read the data without explicitly cloning it?

  • @tobix4374
    @tobix4374 11 months ago

    Great Video!

  • @linkernick5379
    @linkernick5379 11 months ago

    Dear author, what tool did you use to create such informative animations? This video, at least, has been made in a very approachable and educational way. It reminds me of 3b1b's style, but with different content.

  • @avidrucker

    @avidrucker

    11 months ago

    Looks like 3B1B's "Manim"

  • @_noisecode

    @_noisecode

    11 months ago

    Yep! I've updated the description to say so with a link.

  • @adambright5416
    @adambright5416 8 months ago

    Goblin deez primeagen

  • @vmarzein
    @vmarzein 11 months ago

    Really like this video. Nice that it has no music over it

  • @ilyastaouaou

    @ilyastaouaou

    11 months ago

    I agree bro

  • @deadvirgin428
    @deadvirgin428 10 months ago

    This is so funny, so much for "memory management without GC" and y'all just end up using a GC anyways.

  • @nicolasmazzon7231
    @nicolasmazzon7231 11 months ago

    One of the few advanced rust videos that's been really insightful.

  • @Kupiakos42
    @Kupiakos42 11 months ago

    I wonder if we could save some cycles by instead having the `ptr` at 10:00 point directly to the data instead of needing to offset. It would require a negative offset for accessing strong and weak but that's much rarer than Deref.

  • @rainerwahnsinn3262
    @rainerwahnsinn3262 8 months ago

    It seems `Box<str>` is like an immutable `String`, but even better because it lacks the capacity field, since it can't ever grow. In other words, if your `String` is not mutable, you should use `Box<str>`. What am I missing?

  • @_noisecode

    @_noisecode

    8 months ago

    Cloning a Box<str> requires a deep clone of the string data. Cloning Arc<str> only does a shallow clone and bumps the refcount. If you don't need clone, you're right (as I also mention at the end), Box<str> is your best option. If you do, Arc<str> can be better. Both are better (for immutable strings) than String.
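The deep-versus-shallow difference in this reply can be observed by comparing data pointers after a clone. A minimal sketch; the two helper names are made up for illustration.

```rust
use std::sync::Arc;

/// True if a Box<str> and its clone point at the same bytes.
fn box_clone_shares(s: &str) -> bool {
    let a: Box<str> = s.into();
    let b = a.clone(); // allocates a new buffer and copies the bytes
    a.as_ptr() == b.as_ptr()
}

/// True if an Arc<str> and its clone point at the same bytes.
fn arc_clone_shares(s: &str) -> bool {
    let a: Arc<str> = s.into();
    let b = Arc::clone(&a); // only increments the reference count
    a.as_ptr() == b.as_ptr()
}

fn main() {
    println!("Box<str> clone shares data: {}", box_clone_shares("Goblin"));
    println!("Arc<str> clone shares data: {}", arc_clone_shares("Goblin"));
}
```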

  • @jean-michelnadeau5447
    @jean-michelnadeau5447 5 months ago

    I agree with the caveat that if it is used as an id of sort, Eq and Hash should use pointer (this is not unsafe).
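One way the pointer-based `Eq` and `Hash` suggested here might be written, as a sketch only. The wrapper name `InternedId` is made up. The important caveat is in the last lines of `main`: two ids built separately from the same text are not equal under this scheme, so it only makes sense if every id is created once and then cloned.

```rust
use std::collections::HashSet;
use std::hash::{Hash, Hasher};
use std::sync::Arc;

/// Hypothetical id type whose identity is the allocation, not the text.
#[derive(Clone, Debug)]
struct InternedId(Arc<str>);

impl PartialEq for InternedId {
    fn eq(&self, other: &Self) -> bool {
        // Compares addresses only; no string comparison happens.
        Arc::ptr_eq(&self.0, &other.0)
    }
}

impl Eq for InternedId {}

impl Hash for InternedId {
    fn hash<H: Hasher>(&self, state: &mut H) {
        // Hash the data address so that Hash stays consistent with Eq.
        (Arc::as_ptr(&self.0) as *const u8).hash(state);
    }
}

fn main() {
    let a = InternedId("goblin".into());
    let b = a.clone();
    let c = InternedId("goblin".into()); // same text, different allocation

    let set: HashSet<InternedId> = [a.clone(), b.clone(), c.clone()].into_iter().collect();
    println!("a == b: {}, a == c: {}, distinct ids: {}", a == b, a == c, set.len());
}
```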

  • @DylanRJohnston
    @DylanRJohnston 11 months ago

    Unless you're planning on generating monster IDs at runtime, why not just drop the Arc and use &'static str? Or for that matter, why not use a unique zero-width type and make MonsterID a trait?

  • @appelnonsurtaxe

    @appelnonsurtaxe

    11 months ago

    The ustr crate also provides a convenient _leaked_ string type with O(1) comparison time and O(n) construction. If the variants aren't known at compile time but don't need freeing after they're known, it can be a good approach.

  • @martingeorgiev999
    @martingeorgiev999 7 months ago

    The equivalent to Arc::clone would be giving a shared reference to a slice which as Arc::clone requires no allocation.

  • @FinaISpartan
    @FinaISpartan 11 months ago

    Just a reminder that if you dont need thread saftey, you're better off with Box or Rc