Comments (35)
- tialaramex: Very often if you have text, which this does, you can make huge savings by being intelligent with the text. Rust intentionally provides the simplest possible growable string buffer, String, which is literally (under the hood, you can't poke this legitimately) Vec<u8> plus the promise that this is UTF-8 text. But you might find your needs better served by one (or several) of:
  - Box<str> -- you don't need capacity, so don't store it => length == capacity
  - CompactString -- use the entire 24 bytes for SSO, up to 24 bytes of UTF-8 inline; obviously doesn't make sense if all or the vast majority of your strings are 25 bytes or longer
  - ColdString -- same idea but for 8 bytes, and also not storing capacity; this only makes sense over Box<str> if you have plenty of <= 8 byte strings
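A minimal sketch of the Box<str> point above, using only the standard library. The helper name `shrink` is made up for illustration, and the printed sizes depend on the target (the three-word layout of String is what current compilers produce, not a documented guarantee).

```rust
use std::mem::size_of;

// Hypothetical helper: drop the spare capacity and the capacity field itself.
fn shrink(s: String) -> Box<str> {
    s.into_boxed_str()
}

fn main() {
    // A String carries pointer + length + capacity.
    let mut s = String::with_capacity(64);
    s.push_str("hello");
    println!(
        "String:   {} bytes inline, len {}, capacity {}",
        size_of::<String>(),
        s.len(),
        s.capacity()
    );

    // A Box<str> carries only pointer + length; the heap buffer is exactly len bytes.
    let b = shrink(s);
    println!("Box<str>: {} bytes inline, len {}", size_of::<Box<str>>(), b.len());
}
```

The trade-off is that a Box<str> cannot grow in place, so this fits strings that are built once and then only read.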
- _alphageek: If anyone's doing this kind of optimization, dhat-rs is worth a look: it shows you exactly which fields and call sites are eating memory, instead of just a total. Saves a lot of guessing about where to start.
- el_pollo_diablo: So there are now two ways to represent the same state: None or Some(struct whose fields are all None). Even though one of these representations is never produced by the deserialization routine, anyone could construct it if the constructor is public. And even if they don't, the different representations will show up in pattern matching as separate paths for every access to the field. This looks like a good opportunity to make these types (optimized for storage) private, and to define public view objects/accessors (optimized for usage) on top of them that merge equivalent representations.
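One way the private-storage, public-accessor idea above could look. This is only a sketch: `Record`, `Extras`, and their fields are hypothetical names, not the article's types. The constructor normalizes "all fields None" to a plain None, and the accessors hide the nesting from callers.

```rust
mod record {
    // Storage-optimized type: private, so callers cannot build
    // Some(Extras { all None }) themselves.
    struct Extras {
        nickname: Option<String>,
        age: Option<u32>,
    }

    impl Extras {
        fn is_empty(&self) -> bool {
            self.nickname.is_none() && self.age.is_none()
        }
    }

    pub struct Record {
        extras: Option<Box<Extras>>,
    }

    impl Record {
        // The only way to build a Record: an empty Extras collapses to None,
        // so there is a single representation of "no extras".
        pub fn new(nickname: Option<String>, age: Option<u32>) -> Self {
            let extras = Extras { nickname, age };
            let extras = if extras.is_empty() {
                None
            } else {
                Some(Box::new(extras))
            };
            Record { extras }
        }

        // Usage-optimized accessors: callers see flat optional fields.
        pub fn nickname(&self) -> Option<&str> {
            self.extras.as_deref().and_then(|e| e.nickname.as_deref())
        }

        pub fn age(&self) -> Option<u32> {
            self.extras.as_deref().and_then(|e| e.age)
        }

        pub fn has_extras(&self) -> bool {
            self.extras.is_some()
        }
    }
}

fn main() {
    let empty = record::Record::new(None, None);
    let named = record::Record::new(Some("ferris".to_string()), None);
    println!("empty has extras: {}", empty.has_extras());
    println!("named nickname: {:?}, age: {:?}", named.nickname(), named.age());
}
```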
- Groxx: tbh "trait" feels like a very problematic name for that type, for this kind of educational purpose - `trait` is already an established concept and keyword: https://doc.rust-lang.org/book/ch10-02-traits.html It's especially problematic because traits don't have memory behaviors like this article in most cases - by default they're unsized, because it's a description of behavior, not data, and you can't even use them as a struct field without extra work. Like, replace "trait" in here with "box" and see how confusing it would be to be describing how you saved memory by boxing your box, because option doesn't box like many other languages do.
- mstange: Are there any tools that help find these kinds of things? Like a profiler that says "80% of the allocated bytes are objects of this type, with 95% of those having that field set to None"
- vlovich123: Small correction:
  > a lot of boxes means a fragmented heap. In such case it's not a problem but this might be worth keeping in mind.

  A good malloc will be able to handle this without issue, thanks to various optimizations that specifically fight fragmentation. Default Linux malloc (glibc) may have issues but I did say good malloc (and even glibc generally shouldn’t struggle with the pattern described I think).
- OptionOfT: I quite often have this issue with async. You get a state machine that is huge because of how Rust builds it.
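A small sketch of the async size problem above, using only the standard library (no executor is needed to measure a future's size). The function names and the 4096-byte buffer are made up for illustration. A local that stays alive across an `.await` is stored inside the future, and a parent that awaits the child inline embeds the whole child; boxing the child keeps only a pointer in the parent.

```rust
use std::mem::size_of_val;

async fn noop() {}

// `buf` is read after the await, so it has to live inside the state machine.
async fn big(i: usize) -> u8 {
    let buf = [1u8; 4096];
    noop().await;
    buf[i % buf.len()]
}

// Awaiting inline embeds the child future in the parent's state.
async fn outer_inline() -> u8 {
    big(0).await
}

// Boxing moves the child to the heap; the parent stores only a pointer.
async fn outer_boxed() -> u8 {
    Box::pin(big(0)).await
}

fn main() {
    // Futures are lazy: creating them runs nothing, but their size is already fixed.
    println!("big:          {} bytes", size_of_val(&big(0)));
    println!("outer_inline: {} bytes", size_of_val(&outer_inline()));
    println!("outer_boxed:  {} bytes", size_of_val(&outer_boxed()));
}
```

Boxing trades one heap allocation per call for a smaller parent future, so it is mostly worth it for large or rarely taken branches.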
- ozgrakkurt: std::alloc::Allocator when?
- squirrellous: I wonder from time to time whether you can decide the best “schema shape” beforehand, i.e. before you can run real workloads that stress the memory implications of such things. This can be very useful if you are trying to decide the boundary of some public-facing API, but for whatever reason can’t run benchmarks (lack of impl, data, time, etc). Without that, if you try to suggest a transformation like this when the schema is first conceived, it will likely be considered premature optimization.
- WhyNotHugo: TLDR: use a nullable pointer instead of fields in nested structs to save memory.
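A rough illustration of that summary. The types `Details`, `Inline`, and `Boxed` are hypothetical, not the article's. An inline `Option<Details>` reserves room for every nested field even when it is None, while `Option<Box<Details>>` is a single pointer-sized word, with None stored as the null pointer.

```rust
use std::mem::size_of;

// Hypothetical nested payload that is absent in most records.
#[allow(dead_code)]
struct Details {
    a: Option<u64>,
    b: Option<u64>,
    c: Option<u64>,
}

// Nested struct stored inline: every record pays for Details, present or not.
#[allow(dead_code)]
struct Inline {
    id: u32,
    details: Option<Details>,
}

// Nested struct behind a nullable pointer: absent records pay one word.
#[allow(dead_code)]
struct Boxed {
    id: u32,
    details: Option<Box<Details>>,
}

fn main() {
    println!("Details:              {} bytes", size_of::<Details>());
    println!("Inline:               {} bytes", size_of::<Inline>());
    println!("Boxed:                {} bytes", size_of::<Boxed>());
    println!("Option<Box<Details>>: {} bytes", size_of::<Option<Box<Details>>>());
}
```

Records that do carry details cost an extra allocation and a pointer chase, so the saving depends on most records having the field empty.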