arm-software / abi-aa
Application Binary Interface for the Arm® Architecture
License: Other
AAPCS64 says:
At all times the following basic constraints must hold:
Stack-limit ≤ SP ≤ stack-base. The stack pointer must lie within the extent of the stack.
A process may only access (for reading or writing) the closed interval of the entire stack delimited by [SP, stack-base – 1].
But we ought to allow writes below SP for stack probing purposes (while still being careful to avoid the implication that there's some kind of red zone).
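The two quoted constraints can be written down as predicates. The following is a minimal sketch only: `struct stack_extent`, `sp_in_extent` and `access_allowed` are hypothetical names introduced for illustration, not anything defined by AAPCS64, and stack-base is assumed to be one past the highest stack address.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one stack; names are illustrative only. */
struct stack_extent {
    uintptr_t limit; /* Stack-limit: lowest address of the stack */
    uintptr_t base;  /* stack-base: assumed one past the highest address */
};

/* Stack-limit <= SP <= stack-base */
static bool sp_in_extent(const struct stack_extent *s, uintptr_t sp)
{
    return s->limit <= sp && sp <= s->base;
}

/* True if [addr, addr + size) lies within the closed interval
 * [SP, stack-base - 1], as the quoted text currently requires. */
static bool access_allowed(const struct stack_extent *s, uintptr_t sp,
                           uintptr_t addr, size_t size)
{
    assert(sp_in_extent(s, sp));
    if (size == 0 || addr < sp || addr >= s->base)
        return false;
    return size <= s->base - addr; /* access must not run past stack-base */
}
```

Under this reading, a probe write just below SP is rejected by `access_allowed`, which is the case the issue suggests the wording should permit.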
It has been a while since we did a sanity check on the individual ABI documents.
Let's do a review pass over them and check for things like:
Small things we can fix as part of this issue. For bigger things, create a new issue.
R_MORELLO_JUMP_SLOT is under-documented. In reality it is the same as R_AARCH64_JUMP_SLOT: the linker initially fills in the 64-bit VA in the first half of the slot (pointing at the PLT header). This is awkward for the run-time linker, though; it would be better if the slot had the same in-memory format as R_MORELLO_RELATIVE, so that bounds can be provided by the static linker. That would align with R_AARCH64_JUMP_SLOT, which for lazy binding is initially resolved identically to R_AARCH64_RELATIVE.
Document AArch64 PLT sequences, include BTI and PAC options
Move the AArch64 Feature bits and dynamic tags as they are more suited to the SYSVABI than AAELF64.
The content of some images is not shown correctly when using GitHub's dark mode.
The following example is from abi-aa/aapcs64/aapcs64.rst:
Section headers have inconsistent capitalisation.
One example is Section 3: Introduction And Scope
The table in section 10.3 of the EHABI for AArch32 that describes the unwind codes contains references of the form "remark c", but the list of remarks is now using a numeric sequence rather than an alphabetic one. The two need to be reconciled.
What are these supplements? We could at least reference one or two.
Section 1.6.3 Transferring Control to a Landing Pad in https://itanium-cxx-abi.github.io/cxx-abi/abi-eh.html describes that there are some registers which are not callee-saved but still restored on transfer of control to a landing pad.
For AArch64 I believe these registers are x0-x3. These are at least the registers that the libgcc unwinder restores outside of the callee-saved registers.
The decision of what these registers contain and whether to use a subset or all registers can be made by a personality routine and landing pad without affecting other binaries (since this communication happens right at the moment of transfer of control).
Hence it would not be ABI that we need to specify.
I would expect that the limitation that we restore only registers x0-x3 (which is applied in the platform unwinder) should be recorded somewhere.
I guess somewhere in https://github.com/ARM-software/abi-aa/blob/main/cppabi64/cppabi64.rst#id38 ?
Both clang and GCC leave out the DW_AT_location for TLS variables, ostensibly due to the lack of a relocation to describe the location in the debug data.
For x86_64, DW_AT_location is emitted as:
DW_AT_location (DW_OP_const8u 0x0, DW_OP_form_tls_address)
With the relocation R_X86_64_DTPOFF64 to the symbol definition.
In AArch64 the closest equivalent is R_AARCH64_TLS_DTPREL, although it is marked as a dynamic relocation (normally used for the GOT entry created by the global-dynamic in the traditional dialect). If we marked this relocation as static and dynamic (and made sure that static linkers could handle it in a static context), then we could support DW_AT_location.
LLVM review that removed the DW_AT_location: https://reviews.llvm.org/D43860
For example:
__thread int foo;
int main(void) {
return foo;
}
With clang -g -O2 --target=aarch64-linux-gnu
0x00000023: DW_TAG_variable
DW_AT_name ("foo")
DW_AT_type (0x0000002b "int")
DW_AT_external (true)
DW_AT_decl_file ("/path/to/tlsdbg.c")
DW_AT_decl_line (1)
With an x86_64 target
0x00000023: DW_TAG_variable
DW_AT_name ("foo")
DW_AT_type (0x00000036 "int")
DW_AT_external (true)
DW_AT_decl_file ("/path/to/tlsdbg.c")
DW_AT_decl_line (1)
DW_AT_location (DW_OP_const8u 0x0, DW_OP_GNU_push_tls_address)
The x86_64 relocs are
Relocation section '.rela.debug_info' at offset 0x400 contains 5 entries:
Offset Info Type Symbol's Value Symbol's Name + Addend
0000000000000008 000000030000000a R_X86_64_32 0000000000000000 .debug_abbrev + 0
0000000000000011 000000040000000a R_X86_64_32 0000000000000000 .debug_str_offsets + 8
0000000000000015 000000070000000a R_X86_64_32 0000000000000000 .debug_line + 0
000000000000001f 000000060000000a R_X86_64_32 0000000000000000 .debug_addr + 8
000000000000002d 0000000a00000011 R_X86_64_DTPOFF64 0000000000000000 foo + 0
Note the R_X86_64_DTPOFF64
The AArch64 TLS sequences are designed so that a static linker can relax the model when it knows certain information. For example when linking an executable and the definition is known the Initial Exec model can be relaxed to the Local Exec model. These sequences should be documented so that code-generators can take advantage of linker TLS relaxation, and avoid generating code-sequences that a static linker may incorrectly relax.
The sysvabi is the most appropriate place for this documentation as TLS requires runtime support.
Cloning git://github.com/rst2pdf/rst2pdf (to revision d9bf8cd737c11cfc572c936173228c1103344fdf) to /tmp/pip-req-build-is58szgo
Running command git clone --filter=blob:none --quiet git://github.com/rst2pdf/rst2pdf /tmp/pip-req-build-is58szgo
fatal: remote error:
The unauthenticated git protocol on port 9418 is no longer supported.
Please see https://github.blog/2021-09-01-improving-git-protocol-security-github/ for more information.
The fix for issue #126 added a cross reference to the C & C++ language bindings, but the result is now misleading. The original text was intended to permit other languages to have their own rules for bit-field layout, but the text now could be read as implying that the C & C++ rules must apply to other languages as well.
I think the best solution is to move the link to a footnote, with text that points to the C/C++ bindings, e.g.
[footnote] The C & C++ layout rules for bit-fields are defined in [xref].
This would replace the text added in #126
the spec recommends special syntax to mark purecap functions, but the assembler should know which functions are purecap already for interworking, so it can generate CFI based on that (which is more reliable and consistent with existing practice: the meaning of the cfi directives depend on the target arch and abi settings).
if we assume that base lp64 abi functions and purecap abi functions can be mixed on a call stack and there are no further capability abi variants then there is no need for special syntax (it can just cause trouble in asm code when the target abi setting and cfi directive are inconsistent).
C23 (most recent public draft: https://www.open-std.org/jtc1/sc22/wg14/www/docs/n3054.pdf ) defines "bit-precise integer types": types _BitInt(N) and unsigned _BitInt(N) with a given width. The Arm ABI (both AAPCS32 and AAPCS64) needs to define the ABI for these types; see the x86_64 ABI https://gitlab.com/x86-psABIs/x86-64-ABI for an example.
This means specifying the size, alignment and representation for objects of those types (including whether there are any requirements on the values of padding bits in the in-memory representation), and the interface for argument passing and return (including, again, any requirements on padding bits - both padding bits up to the size of an object of that type, and any further padding beyond that within the size of a register or stack slot used for argument passing or return).
"toolchain" is not consistent across the document. Sometimes it's "tool-chain", others it's "tool chain".
It should be made consistent.
The Morello ABI specifies that at most 8 arguments of a C function can be passed explicitly in registers. Any additional arguments must be ‘spilled’ onto the stack and passed implicitly to the callee. This ABI assumes that the caller and the callee share the same stack.
I propose a new ABI that removes this assumption, thereby enabling the callee to execute on a different stack from the caller's if desired.
Before the introduction of the new varargs ABI (#112), variadic arguments were also passed via the stack. At present, they are passed via a buffer pointed to by c9. Thus, variadic functions no longer need to scan through the stack to retrieve their variadic arguments, which used to be a source of spatial safety violations.
The current ABI necessarily exposes the caller's stack pointer to the callee. Under situations where the callee is untrusted code, this poses a security risk as the callee has the power to corrupt the caller's stack frames. The proposed ABI does not mandate that the argument buffer be allocated on the stack, allowing flexibility in its location depending on the level of security needed. For example, if the callee is trusted by the caller, the buffer may well be allocated on the stack. Otherwise, the buffer can be allocated on a single-use page that is deallocated by the caller as soon as the callee returns.
I propose that the spilled arguments be passed via the same buffer used for passing variadic arguments. Here's a sketch of the implementation:
va_start macro can thus be implemented to correctly set the va_list variable.
We aim to split the transition into two phases to increase compatibility. Two clang flags (-morello-bounded-memargs and -morello-bounded-memargs=caller) are added.
-morello-bounded-memargs=caller
-morello-bounded-memargs
c9 buffer is sufficiently large, its bounds become imprecise and may range over data that the caller does not intend to pass to the callee. The compiler should detect such situations and allocate the buffer at a representable alignment.
va_start macro sets the va_list capability, it would be desirable to have that capability's lower bound increased past the region containing the spilled arguments. A carefully designed padding scheme for step 2 above will be needed to make sure that this is the case despite any address rounding.
An alternative to this approach is to use ‘directed capabilities’ proposed by Georges et al. This requires hardware and ISA changes though.
A less disruptive approach is to maintain the current ABI but use a certain intermediary to copy the spilled arguments to the other stack when the callee needs to be executed on another stack. The challenges with this approach are that a) it is slower and b) it is unclear who this intermediary should be and how such an intermediary can be implemented for all possible function signatures and the resulting sets of arguments that need to be copied over.
In the arm64e document from Apple there is a description of the key assignments [1].
It would be useful if the pauthabielf64 document could do the same, so that various parts of the system know what to expect from a key, e.g. whether it is per-thread, shared within a process, or global across all processes.
Some parts of ELF are described in the generic ELF specification (http://www.sco.com/developers/devspecs/gabi41.pdf) as being described in specific sections of the processor supplement, but the aaelf32 and aaelf64 documents don't describe them, or do describe them but not in the section the gabi says. The places I've noticed this are:
e_flags
This member holds processor-specific flags associated with the file. Flag names take the form EF_machine_flag. See ‘‘Machine Information’’ in the processor supplement for flag definitions.
We describe the e_flags field, but not in a section named "Machine Information".
.got
This section holds the global offset table. See ‘‘Coding Examples’’ in Chapter 3, ‘‘Special Sections’’ in Chapter 4, and ‘‘Global Offset Table’’ in Chapter 5 of the processor supplement for more information.
It looks like we say some things about the GOT in the section "Proxy Generating Relocations" in both aaelf32 and aaelf64, but don't have a specific "Global Offset Table" section. AC6 armlink and aarch64-none-elf-ld both appear to place the address of the .dynamic section at the start of the GOT, but I don't know if that's something that should be required by the ABI or if they just happen to do this (if so we should have some wording saying that a toolchain can put toolchain-specific stuff in the GOT).
.plt
This section holds the procedure linkage table. See ‘‘Special Sections’’ in Chapter 4 and ‘‘Procedure Linkage Table’’ in Chapter 5 of the processor supplement for more information.
aaelf32 describes the plt in "PLT Sequences and Usage Models", aaelf64 describes it in "Program Linkage Table (PLT) Sequences and Usage Models" (though that's an example and the section says the platform standard should define it) but neither has a specific "Procedure Linkage Table" section.
DT_PLTGOT
This element holds an address associated with the procedure linkage table and/or the global offset table. See this section in the processor supplement for details.
AC6 armlink appears to set this to the address of the .got section, aarch64-none-elf-ld appears to set this to the address of the .got.plt section, but neither aaelf32 nor aaelf64 say anything about what it should be set to.
Re "Procedure Call Standard for the Arm Architecture", section "8.1.5 Volatile Data Types".
The final paragraph appears to be poorly worded, and read pedantically, allows the use of LDM & STM instructions for volatile accesses on processors even where that is not correct. I believe on some Cortex-M series, those instructions need to be avoided for volatile accesses, as they may be restarted after an exception and repeat an access.
The paragraph contains the sentence
"The only guarantee applying to volatile types in these circumstances are that each byte of the type shall be accessed exactly once for each access mandated above, and that any bytes containing volatile data that lie outside the type shall not be accessed."
However, the "circumstances" is, in my reading, given by the previous sentence (regarding restrictions on bus width), and so the requirement applies only for restricted bus width.
For 32-bit volatile accesses on a 32-bit bus, the "accessed exactly once" then does not apply and nothing in the text stops the compiler from using LDM and STM incorrectly. For a 64-bit volatile access, the text appears to be recommending LDM / STM, as these satisfy the final sentence.
I think two changes are warranted:
As ARM pointed out in private email, we do not want to rule out LDM / STM on processors where they are safe. (E.g., volatile int64_t access on a Cortex-A series.) I presume a compile time decision not a runtime decision! Possibly even on Cortex-M we should allow LDM / STM for Cortex-M 64-bit accesses, and leave it to the programmer to split into 2 × 32-bit where necessary?
(Another corner case on the instructions used is based on the data-type. On a 32-bit processor with 64-bit floating point, one option is to use FP instructions for volatile uint64_t access. I'm guessing we don't want to end up requiring compilers do that on DPFP Cortex-M? Or do we....)
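The "split into 2 × 32-bit" option mentioned above can be expressed at the source level roughly as follows. This is a sketch under assumptions: `read_u64_split` is a hypothetical helper, a little-endian layout (low word at the lower address) is assumed, and nothing here forces a particular instruction selection. Which loads the compiler emits for the two volatile accesses is exactly the question the issue raises.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: read a 64-bit quantity as two separate 32-bit
 * volatile accesses, low word first (little-endian layout assumed). */
static uint64_t read_u64_split(const volatile uint32_t words[2])
{
    assert(words != 0);
    uint32_t lo = words[0]; /* first volatile access */
    uint32_t hi = words[1]; /* second volatile access */
    return ((uint64_t)hi << 32) | lo;
}
```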
I think we should convert the tables that track the changes in the ABI specifications into sections and paragraphs. At the moment, we see changes even in the rows describing previous versions, just because we need to reformat the table. I think this could be error-prone (changes could be missed).
As an example of what I mean for "reformatting", see this PR:
The link [ACLE] in aapcs32.rst leads to https://developer.arm.com/products/software-development-tools/compilers/arm-compiler-5/docs/101028/latest/1-preface, which currently gives a 404: Not Found error.
The main branch of the abi-aa repo is still called master. I'm planning to migrate the repo from 'master' to 'main' in about a week, and remove all references to 'master'.
5.7.8 Group Relocations in aaelf64.rst states:
"Generating the field for a Gn relocation directive starts by examining the residual value Yn after the bits of abs(X) corresponding to less significant fields have been masked off from X. If M is the mask specified in the table recording the relocation directive, Yn = abs(X) & ~((M & -M) - 1).
Overflow checking is performed on Yn unless the name of the relocation ends in "_NC"."
This is incorrect - the overflow check must be done on X, not on abs(X) since that creates an overflow bug. The overflow check does not need lower bits to be removed either. This paragraph can be removed since relocations already show what overflow checks to perform on X.
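The quoted formula can be tried out directly. The following is a minimal sketch: `group_residual` is a hypothetical name, and the mask values in the usage note are illustrative rather than taken from the relocation tables.

```c
#include <assert.h>
#include <stdint.h>

/* Yn = abs(X) & ~((M & -M) - 1), as quoted from section 5.7.8.
 * Hypothetical helper name; mask must have at least one bit set. */
static uint64_t group_residual(int64_t x, uint64_t mask)
{
    assert(mask != 0);
    /* abs(X) in unsigned arithmetic, so INT64_MIN does not overflow */
    uint64_t abs_x = x < 0 ? 0 - (uint64_t)x : (uint64_t)x;
    uint64_t lowest_bit = mask & (0 - mask); /* M & -M: lowest set bit of M */
    return abs_x & ~(lowest_bit - 1);        /* clear the less significant fields */
}
```

With an illustrative mask of 0xFFFF0000, both `group_residual(0x12345678, 0xFFFF0000)` and `group_residual(-0x12345678, 0xFFFF0000)` give 0x12340000. The sign of X is not visible in Yn, which appears to be why the issue asks for the overflow check to be made on X instead.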
"trade off" is not a correct term. It should be either "tradeoff" or "trade-off".
https://reviews.llvm.org/D132386 [AArch64][PAC] Lower auth/resign into checked sequence. The resign sequence ends up with raw pointers temporarily in registers. If there is a context switch then these registers will be stored in memory somewhere. To protect against an attack that can read this memory, an OS can sign registers on context switch, most likely with the generic key. However, doing so for all registers would be expensive. If it can be documented that only a subset of the registers (x16/x17 in Darwin's case) contain raw code pointers, then only these registers need to be signed on context switch.
If this convention were documented then platforms could take advantage of it.
The pauthabielf64 is not the ideal place for this documentation as it is the ELF ABI, but it is the only one that exists right now. Raising as an issue for future consideration. There may be future PAuthABI documents or pauthabielf64 may have its scope widened.
I'm working on something for llvm that deals with assigning register blocks to HFA/HVA types, and what we currently do is: if the number of members is >8, it goes on the stack (assuming it's the only parameter).
So a struct of 8 doubles goes in registers, 9 doubles goes on the stack. Makes sense since the number of argument registers is 8 per class of register.
This agrees with "Stage C – Assignment of arguments to registers and stack" rule C2.
(https://github.com/ARM-software/abi-aa/blob/2bcab1e3b22d55170c563c3c7940134089176746/aapcs64/aapcs64.rst#parameter-passing-rules)
However if you look at the definition of HFA/HVA it limits them to 4 members (presumably at any depth, so counting nested structs and arrays).
https://github.com/ARM-software/abi-aa/blob/2bcab1e3b22d55170c563c3c7940134089176746/aapcs64/aapcs64.rst#homogeneous-aggregates
"An Homogeneous Floating-point Aggregate (HFA) is an Homogeneous Aggregate with a Fundamental Data Type that is a Floating-Point type and at most four uniquely addressable members."
I don't see this limitation referenced anywhere else in the document, is there a justification for this rule?
(if there is I don't think it's being applied correctly with clang)
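The difference between the two readings can be sketched as two predicates. This is an illustration only: the function names are hypothetical, `nsrn` stands for the next free SIMD and floating-point argument register number, and the first predicate models the behaviour described in this issue rather than any particular compiler's code.

```c
#include <assert.h>
#include <stdbool.h>

enum { SIMD_ARG_REGS = 8, HFA_MAX_MEMBERS = 4 };

/* Behaviour described in the issue: only the register count matters. */
static bool fits_by_register_count(unsigned nsrn, unsigned members)
{
    assert(nsrn <= SIMD_ARG_REGS);
    return members >= 1 && nsrn + members <= SIMD_ARG_REGS;
}

/* Same check with the "at most four members" limit from the HFA
 * definition applied first. */
static bool fits_as_hfa(unsigned nsrn, unsigned members)
{
    return members <= HFA_MAX_MEMBERS && fits_by_register_count(nsrn, members);
}
```

A struct of 8 doubles passes the first predicate but not the second, which is the discrepancy being asked about.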
Some ideas I had:
To anyone that understands CHERI, CAP_INIT(S, A, CAP_SIZE, CAP_PERM) is obviously "the capability with base S, offset A (i.e. address ("value" in Morello-speak) S + A), length CAP_SIZE and permissions CAP_PERM", but that is not specified anywhere.
Moreover, this is not precise enough, as the otype is left unspecified. Function pointers should be created as sentries ("RB" in Morello-speak), and data pointers should be left unsealed.
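The reading proposed above could be spelled out along these lines. This is a software-only sketch: `struct cap_model`, its fields, and the `is_function` parameter are hypothetical and do not reflect the Morello capability encoding or any text in the specification.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of a capability; not the Morello in-memory format. */
struct cap_model {
    uint64_t base;   /* S */
    uint64_t value;  /* address: S + A */
    uint64_t length; /* CAP_SIZE */
    uint32_t perms;  /* CAP_PERM */
    bool sentry;     /* sealed as a sentry ("RB") if true, unsealed otherwise */
};

/* CAP_INIT(S, A, CAP_SIZE, CAP_PERM) under the interpretation in this issue.
 * is_function is an assumed extra input used to pick the otype. */
static struct cap_model cap_init(uint64_t s, uint64_t a, uint64_t cap_size,
                                 uint32_t cap_perm, bool is_function)
{
    assert(cap_size <= UINT64_MAX - s); /* base + length must not wrap */
    struct cap_model c;
    c.base = s;
    c.value = s + a;
    c.length = cap_size;
    c.perms = cap_perm;
    c.sentry = is_function; /* function pointers: sentries; data: unsealed */
    return c;
}
```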
CONTRIBUTING.md refers to a non-existent LICENSE.md file.
Section 5.3.4 of the AAPCS 32 bit ends with:
The layout of bit-fields within an aggregate is defined by the appropriate language binding.
Please add a note that for C/C++ this language binding is in section 8 of the very same document, so it's easier to find.
I believe that AAPCS includes the hybrid model in the aapcs64-morello document.
At the bottom of this document there is a table of C/C++ types to machine types. The entries with __capability should be capability machine types in AAPCS.
https://github.com/ARM-software/abi-aa/blob/main/aapcs64-morello/aapcs64-morello.rst#id38
Passing -c (or --compressed) to rst2pdf would make it compress the PDFs, which would reduce their size by about 80%:
abi-aa/tools/rst2pdf/generate-pdfs.sh
Line 48 in 8b9c82e
Granted, the savings are only a few MBs for all PDFs together...
I have noticed that because we have allowed merge commits, the history of the repo is a bit messy, with commits that look redundant to me:
2020-01-22 14:59 +0000 Ties Stuij M─┐ Merge pull request #8 from ARM-software/pdf-generation
2019-12-02 10:12 +0000 Ties Stuij │ o style pages
2020-01-22 09:31 +0000 Ties Stuij M─┤ Merge pull request #10 from ARM-software/frame-chain-pac-enabled
2020-01-21 17:12 +0000 Ties Stuij │ o {origin/frame-chain-pac-enabled} clarify frame chain behaviour when PAC is enabled
2020-01-21 14:48 +0000 Ties Stuij M─┤ Merge pull request #9 from rsandifo-arm/c++-mangling
2020-01-21 12:19 +0000 Richard Sandiford │ o Update C++ mangling to reflect existing practice
2019-11-26 12:03 +0000 Ties Stuij M─┤ Merge pull request #7 from fpetrogalli/front-page-readme
2019-11-21 23:27 -0600 Francesco Petrogalli │ o [front-page-readme] {origin/front-page-readme} [main readme] Editorial improvement.
2019-11-21 17:15 +0000 Ties Stuij M─┤ Merge pull request #6 from fpetrogalli/add-arm-logo-in-readme
2019-11-21 11:08 -0600 Francesco Petrogalli │ o {origin/add-arm-logo-in-readme} [abi][readme] Add Arm logo to readme files.
2019-11-21 17:14 +0000 Ties Stuij M─│─┐ Merge pull request #5 from fpetrogalli/fix-rendering-of-links-in-readmes
2019-11-21 10:59 -0600 Francesco Petrogalli │ │ o {origin/fix-rendering-of-links-in-readmes} [abi][readme] Fix links to issue tracker.
2019-11-15 09:21 +0000 Ties Stuij M─┼─┘ Merge pull request #4 from fpetrogalli/add-link-to-issues
2019-11-14 10:05 -0600 Francesco Petrogalli │ o {origin/add-link-to-issues} [abi] Add links to GitHub issue tracker in READMEs.
2019-11-12 16:51 +0000 Ties Stuij M─┤ Merge pull request #1 from fpetrogalli/add-readme-at-abi-level
2019-11-12 10:28 -0600 Francesco Petrogalli │ o {origin/add-readme-at-abi-level} [abi] Fix links in top level README file of the `abi` folder.
2019-11-12 10:20 -0600 Francesco Petrogalli │ o [abi] Add README at level.
2019-10-25 10:26 -0500 Francesco Petrogalli o─┘ [VFABIA64] Import Vector Function ABI for AArch64.
2019-10-25 10:25 -0500 Francesco Petrogalli o [AAPCS64] Import Procedure Call Standard for AArch64.
2019-10-25 10:24 -0500 Francesco Petrogalli I [README] Init commit
For this reason, I have disabled merge commits in the setting of the repository, which leaves us only the possibility of
My preference would be to disable the latter option too. Allowing only squash and merge would make sure that each commit we accept in the repo will be verified once we enable CI.
Any thoughts?
As it says on the tin, we should have some CI in place for:
inbranch vector calls get a mask argument so the call only operates on active lanes according to that mask, but the abi does not say what happens with the inactive lanes in the returned vector.
i think it should explicitly say that inactive lanes have unspecified value in the returned vector.
https://github.com/ARM-software/software-standards/blob/master/abi/aapcs64/aapcs64.rst:
compiler may ignore a volatile qualification of an automatic variable whose address is never taken unless the function calls setjmp()
This statement contradicts the ISO C11 standard that does not mention `setjmp' before “Standard headers” section, which is long after the definition of “volatile” and its meaning for program execution.
Major compilers (GCC and Clang) follow the standard correctly. See the code generated for
void f(){ volatile int x=0; x; }
(even with -O3): https://godbolt.org/z/QHGt2M (edit: add =0)
Currently these are both given as CAP_INIT(S, A, CAP_SIZE, CAP_PERM), but that does not make sense. Both always have a null symbol and the image base load offset needs to be added in. For R_MORELLO_IRELATIVE it also needs Indirect(...) around it (or an explicit capability version).