lccc

LCRust ABI, Version 0

This document is not yet published in a stable form and is subject to change.

Preamble

This document is a normative specification of version 0 of the lcrust ABI. This is not an abi defined by the rust lang team, and should not be considered portable to all rust implementations.

For all purposes, you can rely on this ABI or a future version when compiling using lccc. Other implementations of rust may adopt this specification as well, at their option.

This document is intended to mirror guarantees provided by the rust language team, and the Unsafe Coding Guidelines, in terms of abi. If a difference is observed, then for all purposes, this specification will override both the language team and the Unsafe Coding Guidelines while the particular version is in use, and a new version will be prepared and released with the correct ABI as required by both sources.

As the layout and abi of scalar types, types with the repr(C) layout, and types with the repr(transparent) layout are well defined, none of those types are documented here.

Crate ABI Selection

  1. An implementation shall support this version and any other implementation-defined versions.
  2. An implementation should provide a mechanism to select the in-use version.
  3. An implementation may use an implementation-defined version for proc-macro crates, and for the proc_macro standard library, even if a selection mechanism used above would select a different version.

lccc

This section is not normative

When building a crate, you can pass -Zbuild-abi=0 (or -frustc-abi=0) to force lccc to build against abi version 0. This requires the standard library to be built with abi version 0.

To detect incompatibilities from version 0 to the latest version, -W abi-incompatibility=0 (or -D abi-incompatibility=0 for deny) or -Wabi-incompatibility=0 (or -Werror=abi-incompatibility=0) can be passed, or #[warn(lccc::abi_incompatibility_0)] (#[deny] likewise). To detect incompatibilities with prescribed layout and abi requirements with those included in this specification, -W lang-abi-incompatibility or -Wlang-abi-incompatibility may be used. This is enabled by default when -Wabi-incompatibility is used.

It is an error to depend upon an rlib or dylib crate built with a different abi version. The implementation shall issue a diagnostic if such is requested. When linking against a staticlib crate built with a different abi version, other than a #![no_std] crate that does not contain an extern crate declaration naming std, the behaviour is undefined if either crate dynamically links against std.

When linking a single crate with multiple source files, the behaviour is undefined if the source files were built with a different abi.

A crate may opt out of the abi guarantees herein by passing -Z repr-rust-layout=randomize or -frust-struct-layout=randomize to the compiler command line.

When building a proc-macro crate, all functions declared #[proc_macro], #[proc_macro_derive], and #[proc_macro_attribute] will use the function call ABI of the latest implemented version (otherwise the abi will follow the selected version).

Recommended Extensions

An implementation of this ABI is recommended to provide each of the extensions defined by this section.

Feature Tests

The implementation should define the lcrust_abi and lcrust_abi_v0 features. An implementation of any version of this ABI should define the lcrust feature, but shall not define the lcrust_v0 feature. This feature, if defined, shall have no effect on the well-formedness or semantics of the program, and merely exists to allow a user crate to assert dependence on this ABI, without depending on non-portable details.

The feature lcrust_layout may be defined by the implementation. If so, it shall permit the use of the repr(lcrust_v0) and repr(lcrust) attributes applied to structs, unions, and enums. Both shall be equivalent to repr(Rust) as described by this ABI. [Note: The difference between them is that the former in all future versions of this ABI refers to the layout from this version, version 0, and the latter refers to the layout from the current version and is subject to change in accordance with the applicable abi version.]

The feature lcrust_abi_call may be defined by the implementation. If so, it shall permit the use of extern "lcrust" and extern "lcrust-v0", which have the same call ABI as (but a distinct type from) extern "Rust" as defined by this ABI. [Note: like repr(lcrust_v0) and repr(lcrust), extern "lcrust-v0" refers to the call abi defined for extern "Rust" by this version, and extern "lcrust" refers to the call abi defined by the version in use.]

Type Layout

In this section, if a type is said to have the same layout as some other type, unless otherwise stated, it also has the same ABI as that type for the extern "C", extern "Rust", extern "cdecl", extern "stdcall", extern "fastcall", extern "thiscall", extern "vectorcall", extern "win64", extern "sysv64", extern "aapcs", extern "lcrust", and extern "lcrust-v0" abis. If an implementation defines other platform-specific or unstable abis, whether or not types explicitly defined to have the same layout herein also have the same ABI for those abis is unspecified.

!

The type ! shall have the same layout and ABI as (). A function returning ! shall have the same abi as a _Noreturn function declared in C with a return type of void.

repr(transparent) Structures

Each field of a repr(transparent) structure shall occur at offset 0, regardless of order in the type declaration.
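The rule above can be checked from ordinary Rust. The following is a minimal sketch, assuming a hypothetical wrapper type Meters with a zero-sized marker declared before its only sized field; the type and function names are illustrative and not part of this ABI.

```rust
use std::marker::PhantomData;
use std::mem::{align_of, size_of};

// Hypothetical wrapper: one sized field plus a zero-sized marker.
#[repr(transparent)]
pub struct Meters<Unit> {
    _unit: PhantomData<Unit>, // declared first, occupies no space
    value: f64,               // declared second, still at offset 0
}

/// Byte offset of `value` inside a `Meters`, computed from addresses.
pub fn value_offset<Unit>(m: &Meters<Unit>) -> usize {
    let base = m as *const Meters<Unit> as usize;
    let field = &m.value as *const f64 as usize;
    field - base
}

fn main() {
    let m = Meters::<u8> { _unit: PhantomData, value: 1.5 };
    assert_eq!(value_offset(&m), 0);
    // The wrapper has the size and alignment of its sized field.
    assert_eq!(size_of::<Meters<u8>>(), size_of::<f64>());
    assert_eq!(align_of::<Meters<u8>>(), align_of::<f64>());
}
```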

repr(Rust) Structures

The layout of a repr(Rust) structure shall be as follows:

repr(Rust) Unions

A union declared repr(Rust) shall be equivalent to a union declared repr(C).

[Note: This implies that each field of a repr(Rust) union starts at offset 0.]
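As an illustration of the offset rule, the sketch below measures field offsets of a union. It is declared repr(C) so that the result also holds on implementations that do not follow this ABI; under this ABI a repr(Rust) union would be expected to behave the same way. The type Bits and the helper field_offsets are hypothetical names.

```rust
#[repr(C)]
pub union Bits {
    pub int: u32,
    pub float: f32,
    pub bytes: [u8; 4],
}

/// Offsets of `int`, `float`, and `bytes` relative to the start of the union.
pub fn field_offsets(u: &Bits) -> [usize; 3] {
    let base = u as *const Bits as usize;
    // Referring to a union field is an unsafe operation; every bit pattern
    // is valid for all three field types, so these references are sound.
    unsafe {
        [
            &u.int as *const u32 as usize - base,
            &u.float as *const f32 as usize - base,
            &u.bytes as *const [u8; 4] as usize - base,
        ]
    }
}

fn main() {
    let u = Bits { int: 0x3f80_0000 };
    assert_eq!(field_offsets(&u), [0, 0, 0]);
    assert_eq!(std::mem::size_of::<Bits>(), 4);
    // All fields share storage, so the same bits read back as 1.0f32.
    assert_eq!(unsafe { u.float }, 1.0);
}
```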

repr(Rust) enums

DST Pointer Layout

Trait Object Vtable Layout

The vtable of any trait object has the following layout:

Trait Object Vtable Definition

The VTable for a trait impl of an object-safe trait shall be emitted as follows:

The Fields of each SingleTraitVtable in VTable<T> for U shall be populated as follows:

The creation of the vtable of a type shall ODR-use its destructor, and each function of the trait impl that is declared with a receiver, even if that destructor is trivial.
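One way to see why vtable creation needs the destructor: once a value is erased to a trait object, dropping it must dispatch to the concrete type's destructor dynamically. The sketch below shows that behaviour in portable Rust; Shape, Square, DROPS, and drop_erased are hypothetical names, and the code does not inspect the vtable layout itself.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

// Counts how many times `Square`'s destructor has run.
static DROPS: AtomicUsize = AtomicUsize::new(0);

pub trait Shape {
    fn area(&self) -> f64;
}

pub struct Square(pub f64);

impl Shape for Square {
    fn area(&self) -> f64 {
        self.0 * self.0
    }
}

impl Drop for Square {
    fn drop(&mut self) {
        DROPS.fetch_add(1, Ordering::SeqCst);
    }
}

/// Drops a boxed trait object and reports how many destructors ran.
pub fn drop_erased(shape: Box<dyn Shape>) -> usize {
    let before = DROPS.load(Ordering::SeqCst);
    drop(shape); // the concrete type is unknown here; the call is dynamic
    DROPS.load(Ordering::SeqCst) - before
}

fn main() {
    let erased: Box<dyn Shape> = Box::new(Square(2.0));
    assert_eq!(erased.area(), 4.0);
    assert_eq!(drop_erased(erased), 1);
}
```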

A trait impl for an unsized type does not generate vtables, including for non-generic trait impls.

When a crate consists of multiple translation units, for the purposes of vtable generation, the translation unit that defines a non-generic implementation is the one that contains the definition of the first non-generic function not bounded by Self: Sized that is not defined with an #[inline] attribute, other than #[inline(never)], or, if no such function exists, an unspecified translation unit that is part of the crate.

The name of a vtable for a trait impl is given in Name Mangling.

Tuple Layout

char type

Closure Type Layout

[ Note: Given the following code snippet

    let x = 5;
    let y = String::new();
    let mut z = Box::new(1337);
    let mut c = ||{
       *z = 42;
       println!("{}: {}",{x},y);
    };

The captures of c have types i32, &String, and &mut Box<i32>, respectively. ]

Future and Generator type Layout

The future type produced by an async function or async block is given as a repr(Rust) enum. The discriminant values are given in ascending order starting from 0 and no variant is considered to have a user-provided discriminant. The number of variants is 1 plus the total number of await points, and the variants are ordered in source order within the block, left-to-right within subexpressions of a complete statement (regardless of evaluation order of those subexpressions).

The nth variant type has the same layout as a closure type that captures under ordinary capture rules each local variable that:

For the purposes of this section, generator types produced by a generator closure are considered the same as the future type described in this section, with each yield expression replaced with an await point on a future that returns the resume arguments.

[Note: Source Order is agnostic of actual evaluation order. In particular, given the following:

let a: Future<Output=&mut i32>, b: Future<Output=i32>; 
async{
    *a.await = b.await; 
}

the first variant corresponds to the entry point of the async block, the second to the LHS of the assignment (a.await), and the third to the RHS of the assignment, however the execution order of variants is 0, 2, 1. ]
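The distinction between source order and execution order in the note above can be modelled with a plain enum. This is only a sketch: State, next, and execution_order are hypothetical names, the variants carry no captures, and no real future is constructed.

```rust
/// Hypothetical model of the future's variants, numbered in source order.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
#[repr(u8)]
pub enum State {
    Entry = 0,    // entry point of the async block
    AwaitLhs = 1, // suspended at `a.await` (left of `=`)
    AwaitRhs = 2, // suspended at `b.await` (right of `=`)
}

/// Next suspension point in execution order, or `None` once the block completes.
pub fn next(state: State) -> Option<State> {
    match state {
        State::Entry => Some(State::AwaitRhs), // the RHS is evaluated first
        State::AwaitRhs => Some(State::AwaitLhs),
        State::AwaitLhs => None,
    }
}

/// Discriminants visited while driving the model to completion.
pub fn execution_order() -> Vec<u8> {
    let mut order = vec![State::Entry as u8];
    let mut current = State::Entry;
    while let Some(following) = next(current) {
        order.push(following as u8);
        current = following;
    }
    order
}

fn main() {
    // Three variants: 1 plus two await points.
    assert_eq!(execution_order(), vec![0, 2, 1]);
}
```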

An await expression shall initialize the future to the variant declared by that await point prior to returning from the poll function, with each capture initialized to the appropriate variable.

DST by-value parameters

A parameter that has a !Sized type shall be passed as an invisible pointer to the parameter. The pointer has the layout and ABI defined in DST Pointer Layout.
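For orientation, pointers to unsized types are wider than pointers to sized types, because they carry metadata (a length or a vtable pointer) next to the address. The sketch below measures this in machine words; the two-word results are what current rustc produces and are an assumption here, while the normative layout is the one given in DST Pointer Layout. The helper pointer_words is a hypothetical name.

```rust
use std::fmt::Debug;
use std::mem::size_of;

/// Size of a raw pointer to `T`, measured in machine words.
pub fn pointer_words<T: ?Sized>() -> usize {
    size_of::<*const T>() / size_of::<usize>()
}

fn main() {
    // Thin pointer: address only.
    assert_eq!(pointer_words::<u32>(), 1);
    // Slice and str pointers: address plus length.
    assert_eq!(pointer_words::<[u8]>(), 2);
    assert_eq!(pointer_words::<str>(), 2);
    // Trait object pointer: address plus vtable pointer.
    assert_eq!(pointer_words::<dyn Debug>(), 2);
}
```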

Special Function Generation

Inline functions

Any function declared with an #[inline] attribute, other than the #[inline(never)] attribute, is not generated into a translation unit by default. Instead, any translation unit that ODR-uses the function, except in some implementation-defined set of uses that result in a function call, such that the call is inlined and the final object file contains no actual call to the function, generates a copy of the function, and places it in a COMDAT linkonce group, such that each final link target contains exactly one copy of the function.

If the function has a force-definition attribute (#[no_mangle], #[link_name], #[export_name], or an implementation-specific force definition attribute) and an #[inline] attribute other than #[inline(never)], then a definition as above is generated in the translation unit that contains the function, regardless of whether or not an odr-use is contained.

Destructor Generation

For the Purposes of this section, a type T has non-trivial destruction if it is any of the following:

And a type T has trivial destruction if it does not have non-trivial destruction.

[Note: PhantomData<T> has trivial destruction wrt. this section regardless of T]
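A related, portable query is std::mem::needs_drop. It is the standard library's own (conservative) notion rather than the definition used by this section, so the sketch below is only an approximation of the classification; it does agree with the note about PhantomData<T>.

```rust
use std::marker::PhantomData;
use std::mem::needs_drop;

fn main() {
    // String owns a heap buffer, so dropping it does real work.
    assert!(needs_drop::<String>());
    // PhantomData<T> stores no T, so there is nothing to destroy.
    assert!(!needs_drop::<PhantomData<String>>());
    // Plain scalars never need a destructor.
    assert!(!needs_drop::<u32>());
}
```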

The destructor of a type T is odr-used in the following contexts if T has non-trivial destruction:

The destructor of a type T is odr-used in the following contexts regardless of whether T has non-trivial destruction or trivial destruction:

When a destructor of type T is odr-used, it may cause implicit generation of the destructor, as follows:

The generation of a destructor instantiates the generic impl of the core::ops::Drop trait for the type, if any.

#[track_caller] shims

Track Caller shims shall be generated in the following contexts:

The following guarantees are made about #[track_caller] shims generated by vtables:

Standard Library Type, Guaranteed Layouts

The standard library implementation shall guarantee layout and abi compatibility of all of the following groups of types:

The standard library implementation shall guarantee layout, but not necessarily ABI, compatibility of all of the following groups of types:

The layouts of the following standard library structures are restricted to the following:

Beyond these rules, implementations SHOULD ensure that when breaking abi changes occur in the standard library, an error is issued when linking an rlib or dylib target built against an earlier abi version of the standard library. There are no further restrictions on the layout or ABI of types in the standard library.

Function ABI

Parameters that have size 0 are ignored for the purposes of function call ABI. Return values that have size 0 shall be equivalent to a function returning void declared in C.
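To make the notion of a size-0 parameter concrete, the sketch below lists a few zero-sized types and a function that takes them alongside ordinary arguments. Token and add_tagged are hypothetical names; the code only shows which parameters would be ignored under the rule above and does not observe the calling convention itself.

```rust
use std::marker::PhantomData;
use std::mem::size_of;

/// A zero-sized unit struct.
pub struct Token;

/// Only `a` and `b` have non-zero size; the other three parameters
/// would be ignored for the purposes of the call ABI.
pub fn add_tagged(a: u32, _tag: Token, _unit: (), _marker: PhantomData<String>, b: u32) -> u32 {
    a + b
}

fn main() {
    assert_eq!(size_of::<Token>(), 0);
    assert_eq!(size_of::<()>(), 0);
    assert_eq!(size_of::<PhantomData<String>>(), 0);
    assert_eq!(add_tagged(2, Token, (), PhantomData, 3), 5);
}
```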

extern "Rust" is treated identically to extern "C", except as follows:

The extern "rust-call" abi is defined as follows:

The implementation is recommended to define two additional ABIs, extern "lcrust" and extern "lcrust-v0", behind the lcrust_abi_call feature. These ABIs, if defined, shall have the same behaviour as extern "Rust".

Symbol Identity

Any external symbol given a name under this ABI that refers to an allocatable object is guaranteed to be distinct from any other external symbol that refers to an allocatable object.

An allocatable object is either a function, or an object (including promoted constant, vtable, string literal, or other object that is not directly nameable in rust source) that has a non-zero size.

Further, any static or function declared in rust source that is not given an external name (i.e. private and not referenced by an inlineable or generic function) is guaranteed to have a distinct address from any allocatable object.
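A small check of the distinct-address property for objects, sketched with two statics that hold the same value. FIRST, SECOND, and addresses are hypothetical names. Functions are deliberately left out of the sketch, since implementations that do not follow this ABI may merge identical functions.

```rust
pub static FIRST: u8 = 7;
pub static SECOND: u8 = 7;

/// Addresses of the two statics as integers.
pub fn addresses() -> (usize, usize) {
    (&FIRST as *const u8 as usize, &SECOND as *const u8 as usize)
}

fn main() {
    let (first, second) = addresses();
    // Equal contents, but two separate allocatable objects.
    assert_eq!(FIRST, SECOND);
    assert_ne!(first, second);
}
```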

Special Symbol Definitions

There shall be a function declared with external linkage, which is the mangled form of the function core::intrinsics::caller_location() (Mangled as _ZNSt10intrinsics15caller_locationEv), which shall have the signature extern"Rust" fn()->&'static Location<'static> and which shall be defined as though with the #[track_caller] attribute.

If a program links std, test, or proc_macro, there shall be a static defined with weak linkage, which is the mangled form of the static std::alloc::__global_allocator. It shall have type &'static dyn std::alloc::GlobalAllocator. If a program defines the symbol with the above name, then regardless of whether the program links std, test, or proc_macro, calls to the global allocator (including via alloc::alloc::alloc and alloc::alloc::Global) shall result in a call to the vtable function provided by that symbol, using the data pointer provided by that symbol. The behaviour is undefined if the symbol is defined by the program with weak linkage, and the program links std, test, or proc_macro.

Name Mangling

The LCRust ABI uses Itanium mangling for names. The first component of the mangled name is the crate name, optionally with a trailing disambiguator, which is implementation-defined, except that none of the core, std, alloc, or proc_macro crates shall have a trailing disambiguator.

It is extended as follows:

Destructor Name

The name for the destructor shall be the name for the Complete Object Destructor under the Itanium C++ ABI.

Inherent impls

Inherent impls use the mangling of path to the type, or the mangling of the non-marker trait for inherent impls for dyn Trait types.

Primitive inherent impl

The mangling of the inherent impls for the following primitive types are given as follows:

Trait impl Item

The mangling of a trait impl is

<unqualified-name> := .II<trait name>$<impl type>__

<unqualified-name> := .II<trait name>$<impl type>_<seq-name>_

If multiple generic impls exist in a given scope for a particular trait and particular type, the first such impl in declaration order in that scope is mangled using the first form, and subsequent impls are mangled using the second form. The trait impl is qualified in the scope it is defined in.

The VTable for a trait impl is

<special-name> := VT<impl name>

Destructor name

The generated destructor of a struct, enum, or union type is the complete-object destructor for the path.

Otherwise, the mangled name of the destructor is as follows:

<non-class-type-path> := .NC<non-class type>

<special-name> := Z<non-class-type-path><ctor-dtor-name>

Where the complete-object destructor is used always.

Unnamed Bindings

A const or static with the name _ is an unnamed binding.

The name of such an item is as follows:

<unnamed-binding-name> := .Uv_
<unnamed-binding-name> := .Uv<seq-id>_

<unqualified-name> := <unnamed-binding-name>

The first unnamed binding in a particular scope receives the name .Uv_. Subsequent unnamed bindings use the form with a seq-id.
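The naming rule can be sketched as a small function. Two assumptions are made and are not stated in this section: the seq-id uses the Itanium C++ ABI encoding (base 36, digits 0-9 followed by A-Z), and the second unnamed binding in a scope uses seq-id 0, mirroring the Itanium substitution convention. The function names are hypothetical.

```rust
/// Encode `n` as an Itanium-style seq-id: base 36, digits 0-9 then A-Z.
pub fn seq_id(mut n: usize) -> String {
    const DIGITS: &[u8; 36] = b"0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ";
    let mut out = Vec::new();
    loop {
        out.push(DIGITS[n % 36]);
        n /= 36;
        if n == 0 {
            break;
        }
    }
    out.reverse(); // digits were produced least-significant first
    String::from_utf8(out).expect("digits are ASCII")
}

/// Name of the `index`-th (0-based) unnamed binding in a scope.
/// Assumption: the second binding uses seq-id 0, the third 1, and so on.
pub fn unnamed_binding_name(index: usize) -> String {
    match index {
        0 => ".Uv_".to_string(),
        n => format!(".Uv{}_", seq_id(n - 1)),
    }
}

fn main() {
    for index in [0, 1, 2, 11, 37] {
        println!("{index}: {}", unnamed_binding_name(index));
    }
}
```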

Anonymous Block Scope Names

For the purpose of mangling local names, an anonymous block scope (for example, a block expression used as the initializer for a static/const, or in a type) is as follows:

<anon-encoding> := <data encoding>.LD_
<anon-encoding> := <data encoding>.LD<seq-id>_
<anon-encoding> := <type or template name>.LT_
<anon-encoding> := <type or template name>.LT<seq-id>_

<local-name> := Z<anon-encoding>E<entity name>[<discriminator>]
<local-name> := Z<anon-encoding>Es[<discriminator>]

When the block appears in the initializer or type of a static or const, the form with .LD is used. The data encoding is the encoding of the static/const that contains the type. The first top-level block that contains a mangled name in the initializer uses the form without a seq-id; subsequent top-level blocks use the form with a seq-id.

When the block appears in a generic parameter for a type, the form with .LT is used. The type or template name is the name of the type or template if the block appears in the declaration of the type or an inherent impl naming the type. If the block appears in a trait impl, the trait impl name is used instead. If the block is introduced in a generic argument to the type, then the untemplated name is used. Otherwise, the full name of the type is used (for example, if the block appears in a where clause).

Any block introduced in the type of the function (including a where clause) is mangled as though it is contained within the function itself, except that the signature of the function is ignored for declarations. It is mangled using data encoding.

[Example 1 (crate name example):

pub static FOO: usize = {
    pub struct Bar;
    0
};

Bar has the encoding _ZN7example3FOO.LD_E3Bar ]

track_caller shims

If a function defined with the #[track_caller] attribute is used as a function pointer type, or in the instantiation of the vtable of a trait implementation, where the trait declaration does not declare the defined function with the #[track_caller] attribute, a shim shall be emitted.

A shim name shall take one of the following forms:

<shim-name> := <function name>.CL<instantiation location encoding>__

<shim-name> := <function name>.CL<instantiation location encoding><seq-id>

<special-name> := <shim-name>

The first shim in a given instantiation location has no seq-id, the second shim in a given location has a seq-id of 0, etc.

The instantiation location of a shim is as follows:

[Note: the seq-id for instantiations are unique for the location, not the shimmed function.]

[Note: For example, in this code (for a crate name of test)

#[track_caller] fn bar(){}
#[track_caller] fn baz(){} 
fn foo(){
    let _: fn()->() = bar;
    let _: fn()->() = baz;
}
static FOO: fn()->() = bar;

There are 3 shims created:

  1. _ZN4test3barEv.CLNS_3fooEv_ (test::bar()->() {shim 0 for test::foo()}),
  2. _ZN4test3bazEv.CLNS_3fooEv0_ (test::baz()->() {shim 1 for test::foo()}), and
  3. _ZN4test3barEv.CLNS_3FOOE_ (test::bar()->() {shim 0 for test::FOO}) ]

The track_caller shim shall have the same abi as the function type unless the function has an abi not specified by the Rust Programming Language or this ABI Specification, in which case the abi of the shim is unspecified. The parameter and return types of the shim shall be the parameter and return types of the function type. Calling the shim shall call the underlying function with the parameters to the function, and the implicit source location parameter being a static reference to a Location<'static> that represents the location of the trait method definition or the use of the function name that was coerced to an fn-pointer.
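For context, the sketch below shows the implicit location parameter at work in a direct call, and then coerces the same function to an fn-pointer, which is the situation that requires a shim. The function name call_site_line is hypothetical. The location reported through the pointer is printed rather than asserted, because implementations that do not follow this ABI may report a different location than the one described above.

```rust
use std::panic::Location;

/// Reports the line of whoever called this function.
#[track_caller]
pub fn call_site_line() -> u32 {
    Location::caller().line()
}

fn main() {
    // The call and `line!()` sit on one source line, so they must agree.
    let (reported, expected) = (call_site_line(), line!());
    assert_eq!(reported, expected);

    // Coercing to an fn-pointer drops the implicit location parameter from
    // the type, so the call below goes through a shim.
    let through_pointer: fn() -> u32 = call_site_line;
    println!("via fn pointer: line {}", through_pointer());
}
```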

Edition Specific Names

Edition Specific Names (names of the form edition#name) are mangled with a suffix:

The fully mangled name of the non-specific name is mangled as follows:

<edition-specific-suffix> := .DE<edition number>__
<edition-specific-suffix> := .DE<edition number>_<number>_

The number at the end is given by the level the edition specific name applies to. Levels are counted from the last component forward, and ignore non-name components (such as template argument lists). The first level is mangled without a number, the second level is mangled with the number 0, the third with 1, etc.

[Note: For example, in the following code in the crate example:

pub fn edition2021#foo(){}
pub mod edition2018#bar{
    pub fn baz(){}
}

foo has the name _ZN7example3foo.DE2021__Ev and bar::baz has the name _ZN7example3bar3baz.DE2018_0_ ]
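The suffix rule can be sketched as a function of the edition number and the level. The name edition_suffix is hypothetical, level 0 stands for the last path component, and the number is assumed to be written in decimal. The two suffixes produced in main match the ones in the note above.

```rust
/// Suffix for an edition-specific name.
/// `level` 0 is the last path component, 1 the component before it, etc.
/// Assumption: the level number is written in decimal.
pub fn edition_suffix(edition: u32, level: usize) -> String {
    match level {
        0 => format!(".DE{edition}__"),
        n => format!(".DE{edition}_{}_", n - 1),
    }
}

fn main() {
    // `edition2021#foo`: the name itself is the last component.
    assert_eq!(edition_suffix(2021, 0), ".DE2021__");
    // `edition2018#bar` in `bar::baz`: one component before the last.
    assert_eq!(edition_suffix(2018, 1), ".DE2018_0_");
}
```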

Async Block/Fn Future type

The type of an Async block is given as follows:

<local-name> := Z<containing function encoding>.AS_
<local-name> := Z<containing function encoding>.AS<seq-id>_
<local-name> := Z<containing data encoding>.AS_
<local-name> := Z<containing data encoding>.AS<seq-id>_
<local-name> := Z<containing type anon-encoding>.AS_
<local-name> := Z<containing type anon-encoding>.AS<seq-id>_

containing function or containing data is the declaration that contains the async block, the function encoding in the case of a block within a function scope, and the static or const in the case of a block within the initializer of a static or const. In the case of a block within a type, anon-scope encoding is used.

The first async block in declaration order produced within a given context uses the form without the seq-id. Subsequent ones use the form with seq-id.

The type returned by an async function uses a different mangling:

<local-name> := Z<function encoding>.AF_

The function encoding is the encoding of the function that is declared async. There can only be one such type per function.

The name of any object (including the content of a string literal or byte string literal, or a constant value subject to const-promotion) produced in the initializer of a static item is given by the lifetime-extended temporary name in the Itanium C++ ABI.

The name of any object produced in the initializer of a named const item for which the address is taken (a named promoted constant) is given as follows:

  1. If only one such object is created, the name is given as though the const item is replaced with a static item with the same name, and the reference in both the type and the initializer are removed, except that the static is permitted to have an unsized type.
  2. If there is more than one such object, then the name of each object is as if the entire const item is replaced with a static with the same name, type, and initializer. (An object produced in the initializer of an unnamed const item cannot be accessed, except during the evaluation of the initializer, and thus is not given a name.)

The name of any object produced in the initializer of an associated const item is given as though the associated const item was a named const item with the same name, but is scoped by:

  1. In the case of an inherent impl block or a default initializer of a trait definition, the name of the trait or type,
  2. In the case of a trait impl, the name of the trait impl.

If the const or static is non-generic, and is not the default initializer of an associated const in a trait definition, or an associated const defined in a generic impl block, then it is emitted in the translation unit in which the item is defined. If the const or static is generic, or is either the default initializer of an associated const in a trait definition, or an associated const defined in a generic impl block, then it is emitted in each translation unit that odr-uses the item, and each definition is emitted in a COMDAT group, such that each final link target has at most one unique copy of each object.

Each allocatable object produced in the initializer of a const item within a final link target must be defined in the same translation unit, and each allocatable object produced in the initializer of a static within a final link target must be defined in the same translation unit as the object produced by the static item.

A const item is odr-used if:

(In particular evaluating it in a constant evaluated const fn without returning the value to initialize a const/static item, or using the const in a generic argument, does not odr-use it)

A promoted constant evaluated in any other context is not guaranteed to have a unique address or a name given by this ABI (but is still considered an object).

Complex Expression Mangling

Expression syntax is used for mangling const generics. Currently, only literals and direct template parameters (possibly enclosed in braces) are mangled, but the const_generic_adts and generic_const_exprs features allow complex instantiation-dependent expressions.

Expressions used in where bounds for generic_const_exprs are not mangled.

Tuple Initializers

Tuple, tuple-struct, and tuple-variant initializers should use type(init,...) mangling (cv<type>_<expressions>*E), with the following exceptions:

In the case of a builtin tuple type, use the tuple type for the type encoding. In the case of a tuple struct, use the tuple-struct type. In the case of a tuple-variant, use the name of the tuple-variant.

[Examples:

// crate foo;
#![feature(generic_const_exprs,const_generic_adts)]
pub struct Foo<T>(T);
impl<T> Foo<T>{
    pub const fn value(&self) -> usize{  }
} 

pub fn foo<T: StructuralEq + Copy, const V: T>() -> [(); {Foo::<T>(V).value()}] where [V; {Foo::<T>(V).value()}]: {
    [(); {Foo::<T>(V).value()}]
}

let x = foo::<(),()>();

has mangling _ZN3foo3fooIu4unitXu4unitEAcldtcv_ZN3foo3FooIT_ET0_u4uint5value ]

Struct Initializer

Struct and struct-variant initializers should use type(init) where init is the braced-init-list (il<braced-expr>*E). Each braced expr should use the name, or in the case of tuple, array index, for each field initialized in order. If the initializer has a trailing ..<expr> it should be mangled using the vendor extended unary operator v116dotdot followed by the expr.

As Cast

As casts should be mangled as the corresponding C++-style cast expression based on the type of conversion performed, according to the following table:

Not

When Not is applied to an expression with a non-dependent bool type, it uses the encoding for logical not (nt<expr>). In all other cases (when applied to an integer type, or an expression of a dependent type) it uses the encoding for bitwise not (co<expr>).

Simple Blocks

A simple block (An unsafe, const, or normal block with no statements other than items and the return expression) is mangled as the constituent expression. The kind of block ({} vs. unsafe {}) has no effect on mangling. If the block has no expression, then it is mangled as the constructor for ().

[Note: This allows {x}, unsafe{x}, and x to be mangled identically]

panicking

Panicking Runtime

The following symbols shall be defined by the implementation. All functions in this section are defined extern "Rust":

The following symbols shall be defined by the implementation, when any panic is permitted to unwind. It is implementation-defined if they are defined when a panic is not permitted to unwind. All functions in this section are defined extern "Rust".

Itanium Unwinding

Foreign Exceptions

Multiple Crate TUs

A crate may consist of multiple translation units, divided as the implementation chooses from the input files. Aside from this section, there are no limits on what may be included in each translation unit, provided all entities with external linkage that this abi requires are generated.

dylib ABI Info

Each dylib crate shall be built by the compiler to include a section, which shall not be loaded at runtime, shall have the name .note.lcrust.build-info, and shall have an alignment within the image that equals or exceeds 8 bytes.

The section shall contain a structure as follows:

#[repr(C,align(8))]
struct ABIInfo{
    abi_ver: i64,
    compiler_name_and_version: u32,
    codegen_opts: u32,
    crate_name: u32,
    padding: [u16;1],
    extra_length: u16,
    // Pseudocode: extra_length trailing Extra entries
    extra: [Extra;extra_length]
}

The Extra structure shall be declared as follows:


#[repr(C,align(8))]
pub struct Extra{
   e_type: u32,
   e_size: u16,
   e_bytes: [u8]
}

Where e_bytes is e_size trailing bytes, and e_type is an index in the binary's dynamic string table which is a null terminated multibyte string encoded in UTF-8 which represents the type of the Extra field. The valid type names, and their meanings, are unspecified. If a type name is not understood, the particular Extra field shall be ignored.
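As a reading aid, the sketch below decodes the fixed-size prefix of ABIInfo from raw section bytes, using the offsets implied by the repr(C) declaration (abi_ver at 0, three u32 fields at 8, 12, and 16, padding at 20, extra_length at 22, extra entries from 24). AbiInfoHeader, HEADER_LEN, and parse_header are hypothetical names, and little-endian byte order is an assumption, since this section does not state one. Extra entries are not decoded.

```rust
/// Fixed-size prefix of `ABIInfo` (everything before `extra`).
#[derive(Debug, PartialEq, Eq)]
pub struct AbiInfoHeader {
    pub abi_ver: i64,
    pub compiler_name_and_version: u32,
    pub codegen_opts: u32,
    pub crate_name: u32,
    pub extra_length: u16,
}

/// Offset at which the `extra` entries begin.
pub const HEADER_LEN: usize = 24;

/// Copy `N` bytes starting at `offset` into a fixed-size array.
fn read<const N: usize>(bytes: &[u8], offset: usize) -> [u8; N] {
    let mut out = [0u8; N];
    out.copy_from_slice(&bytes[offset..offset + N]);
    out
}

/// Decode the header, or return `None` if the section is too short.
/// Assumption: fields are stored little-endian.
pub fn parse_header(section: &[u8]) -> Option<AbiInfoHeader> {
    let header = section.get(..HEADER_LEN)?;
    Some(AbiInfoHeader {
        abi_ver: i64::from_le_bytes(read(header, 0)),
        compiler_name_and_version: u32::from_le_bytes(read(header, 8)),
        codegen_opts: u32::from_le_bytes(read(header, 12)),
        crate_name: u32::from_le_bytes(read(header, 16)),
        // Bytes 20..22 are the `padding` field and are skipped.
        extra_length: u16::from_le_bytes(read(header, 22)),
    })
}

fn main() {
    let mut section = Vec::new();
    section.extend_from_slice(&0i64.to_le_bytes()); // abi_ver = 0
    section.extend_from_slice(&1u32.to_le_bytes()); // string table index
    section.extend_from_slice(&2u32.to_le_bytes()); // string table index
    section.extend_from_slice(&3u32.to_le_bytes()); // string table index
    section.extend_from_slice(&[0, 0]); // padding
    section.extend_from_slice(&0u16.to_le_bytes()); // no extra entries
    println!("{:?}", parse_header(&section));
}
```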