Comment by alexvitkov

3 months ago

Thanks for the write-up. My biggest fear is not references, overloads or memory management, but rather just the layout of their structures.

We have this:

    sizeof(String) == 24
    sizeof(Option<String>) == 24

Which is cool. But Option<T> is defined like this:

    enum Option<T> {
       Some(T),
       None,
    }

I didn't find any "template specialization" tricks like the ones you would see in C++. As far as I can see, the compiler figures out some trick to squeeze Option<String> into 24 bytes. Whatever those tricks are, unless rustc has an option to export the layout of a type, you will need to implement them yourself.
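
For reference, the sizes above can be checked from Rust itself. This is only a minimal sketch: the helper name `sizes` is made up for illustration, and the exact numbers depend on the target (24 is what you get on a typical 64-bit target).

```rust
use std::mem::size_of;

/// Returns (size of T, size of Option<T>) so the two can be compared.
/// `sizes` is a hypothetical helper, not something from the standard library.
fn sizes<T>() -> (usize, usize) {
    (size_of::<T>(), size_of::<Option<T>>())
}

fn main() {
    // String has spare invalid bit patterns, so Option<String> needs no extra tag.
    let (s, os) = sizes::<String>();
    println!("String: {s} bytes, Option<String>: {os} bytes");

    // Every bit pattern of u64 is a valid value, so Option<u64> has to grow.
    let (u, ou) = sizes::<u64>();
    println!("u64: {u} bytes, Option<u64>: {ou} bytes");
}
```

The comparison with u64 shows that the trick is not free for every T; it only applies when T has bit patterns that can never be valid values.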

2 comments

vlovich123 3 months ago

You don’t need to determine the internal representation as long as you’re dealing with opaque types and invoking Rust functions on them.
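
A rough sketch of what that could look like, assuming a made-up handle type `OptString` and made-up function names. The foreign side only ever holds a pointer, so it never needs to know how Option<String> is laid out. A real export would also need a no-mangle attribute on each function, which is left out here because its spelling depends on the Rust edition.

```rust
/// Hypothetical opaque handle: foreign code only sees a pointer to this.
pub struct OptString(Option<String>);

/// Allocates a handle holding Some("hello").
pub extern "C" fn opt_string_new_some() -> *mut OptString {
    Box::into_raw(Box::new(OptString(Some(String::from("hello")))))
}

/// Allocates a handle holding None.
pub extern "C" fn opt_string_new_none() -> *mut OptString {
    Box::into_raw(Box::new(OptString(None)))
}

/// Reports whether the handle holds a string.
pub extern "C" fn opt_string_is_some(p: *const OptString) -> bool {
    // SAFETY: assumes `p` came from a constructor above and was not freed yet.
    let this = unsafe { &*p };
    this.0.is_some()
}

/// Returns the byte length of the string, or 0 for None.
pub extern "C" fn opt_string_len(p: *const OptString) -> usize {
    // SAFETY: same assumption as in `opt_string_is_some`.
    let this = unsafe { &*p };
    this.0.as_ref().map_or(0, |s| s.len())
}

/// Releases a handle; a null pointer is ignored.
pub extern "C" fn opt_string_free(p: *mut OptString) {
    if !p.is_null() {
        // SAFETY: assumes `p` came from a constructor above and is freed only once.
        unsafe { drop(Box::from_raw(p)) };
    }
}

fn main() {
    let p = opt_string_new_some();
    println!("is_some: {}, len: {}", opt_string_is_some(p), opt_string_len(p));
    opt_string_free(p);
}
```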

As for the trick that makes both 24 bytes: it’s the NonNull within String, which Option detects and uses to represent None without a separate enum tag. For what it’s worth, you can do similar tricks in C++ using zero-sized types and tags to declare a nullable state (in fact std::optional already knows to do this for pointer types, if I recall correctly).
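
The NonNull part can be sketched on stable Rust. The two function names below are made up for illustration. Types that can never be null leave the null pattern free for None, while a plain raw pointer may legitimately be null and so gets no such treatment.

```rust
use std::mem::size_of;
use std::ptr::NonNull;

/// True when Option around never-null pointer types is exactly pointer-sized.
fn pointer_niche_holds() -> bool {
    size_of::<Option<NonNull<u8>>>() == size_of::<*mut u8>()
        && size_of::<Option<Box<u32>>>() == size_of::<*mut u32>()
        && size_of::<Option<&u32>>() == size_of::<*const u32>()
}

/// True when Option around a raw pointer is larger than the pointer itself,
/// because null is a valid raw pointer value and cannot stand in for None.
fn raw_pointer_has_no_niche() -> bool {
    size_of::<Option<*mut u8>>() > size_of::<*mut u8>()
}

fn main() {
    println!("never-null pointers use the niche: {}", pointer_niche_holds());
    println!("raw pointers need a tag: {}", raw_pointer_has_no_niche());
}
```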

ithkuil 3 months ago

Yeah, currently "niche optimization" is performed when the compiler can infer that some values of the structure are illegal.

This can currently be done when a type declares that the range of an integer is not complete, using the rustc_layout_scalar_valid_range_start or _end attribute (requires #![feature(rustc_attrs)]).

In your example it works for String because String contains a Vec<u8>, which internally contains a capacity field of type struct Cap(usize), where the usize is effectively constrained to values in 0..=isize::MAX.

The only way for you to know that is to effectively be the rustc compiler or be able to consume its output.
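
The range attribute itself is nightly-only, but the same effect can be seen on stable Rust with types that already have restricted value ranges. A small sketch, with `option_is_free` being a made-up helper name:

```rust
use std::mem::size_of;
use std::num::NonZeroU32;

/// True when wrapping T in Option costs no extra bytes.
/// `option_is_free` is a hypothetical helper for this example.
fn option_is_free<T>() -> bool {
    size_of::<Option<T>>() == size_of::<T>()
}

fn main() {
    // NonZeroU32 can never be 0, so None can be stored as 0.
    println!("NonZeroU32: {}", option_is_free::<NonZeroU32>());
    // bool only uses the values 0 and 1, leaving other byte values for None.
    println!("bool: {}", option_is_free::<bool>());
    // Every bit pattern of u32 is valid, so a separate tag is needed.
    println!("u32: {}", option_is_free::<u32>());
}
```

This only tells you whether a niche was used, not which bit pattern encodes None. For the actual layout, a nightly compiler's -Zprint-type-sizes flag prints the sizes and field offsets rustc computed, which is one way to consume its output.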