VSI OpenVMS Calling Standard
Operating System and Version:
- VSI OpenVMS x86-64 Version 9.2-1 or higher
- VSI OpenVMS IA-64 Version 8.4-1H1 or higher
- VSI OpenVMS Alpha Version 8.4-2L1 or higher
Preface
The VSI OpenVMS Calling Standard defines the requirements, mechanisms, and conventions that support procedure-to-procedure calls for OpenVMS VAX, OpenVMS Alpha, OpenVMS Industry Standard 64 (I64), and OpenVMS x86-64. The standard defines the run-time data structures, constants, algorithms, conventions, methods, and functional interfaces that enable a native user-mode procedure to operate correctly in a multilanguage environment on VAX, Alpha, Itanium®, and x86-64 systems. Properties of the run-time environment that must apply at various points during program execution are also defined.
The 32-bit user mode of OpenVMS Alpha provides a high degree of compatibility with programs written for OpenVMS VAX.
The 64-bit user mode of OpenVMS Alpha is a compatible superset of the OpenVMS Alpha 32-bit user mode.
The 32-bit and 64-bit user modes of OpenVMS I64 and x86-64 are highly compatible with OpenVMS Alpha.
The interfaces, methods, and conventions specified in this manual are primarily intended for use by implementers of compilers, debuggers, and other run-time tools, run-time libraries, and base operating systems. These specifications may or may not be appropriate for use by higher level system software and applications.
This standard is under engineering change order (ECO) control. ECOs are approved by VSI's OpenVMS Calling Standard committee.
1. About VSI
VMS Software, Inc. (VSI) is an independent software company licensed by Hewlett Packard Enterprise to develop and support the OpenVMS operating system.
2. Intended Audience
This manual primarily defines requirements for developers of compilers and debuggers, but the information can apply to procedure calling for all programmers.
3. Document Structure
This manual contains the following chapters and appendixes:
Chapter 1, Introduction provides an overview of the standard, defines goals, and defines terms used in the text.
Chapter 2, OpenVMS VAX Conventions describes the primary conventions in calling a procedure in an OpenVMS VAX environment. It defines register usage and addressing as well as vector and scalar processor synchronization.
Chapter 3, OpenVMS Alpha Conventions describes the fundamental concepts and conventions in calling a procedure in an OpenVMS Alpha environment. The chapter defines register usage and addressing, and focuses on aspects of the calling standard that pertain to procedure-to-procedure flow of control.
Chapter 4, OpenVMS I64 Conventions describes the fundamental concepts and conventions in calling a procedure in an OpenVMS I64 environment. The chapter defines register usage and addressing, and focuses on aspects of the calling standard that pertain to procedure-to-procedure flow of control.
Chapter 5, OpenVMS x86-64 Conventions describes the fundamental concepts and conventions in calling a procedure in an OpenVMS x86-64 environment. The chapter defines register usage and addressing, and focuses on aspects of the calling standard that pertain to procedure-to-procedure flow of control.
Chapter 6, Signature Information and Translated Images (Alpha and I64 Systems) describes signature information and its role in interfacing with translated OpenVMS VAX and Alpha images on Alpha and I64 systems.
Chapter 7, OpenVMS Argument Data Types defines the argument-passing data types used in calling a procedure for all OpenVMS environments.
Chapter 8, OpenVMS Argument Descriptors defines the argument descriptors used in calling a procedure for all OpenVMS environments.
Chapter 9, OpenVMS Conditions describes the OpenVMS condition and exception handling requirements for all OpenVMS environments.
Appendix A, Stack Unwinding and Exception Handling on OpenVMS I64 describes stack unwinding and exception handling for OpenVMS I64 environments.
Appendix B, Stack Unwinding and Exception Handling on OpenVMS x86-64 describes stack unwinding and exception handling for OpenVMS x86-64 environments.
Appendix C, Summary of Differences from Related Industry Software Conventions contains a brief summary of the differences of this calling standard from Intel Itanium and industry x86-64 software conventions.
4. Related Documents
VAX Architecture Reference Manual
Alpha Architecture Reference Manual
OpenVMS Programming Interfaces: Calling a System Routine
Guide to POSIX Threads Library
VAX/VMS Internals and Data Structures
OpenVMS AXP Internals and Data Structures
Itanium® Software Conventions and Runtime Architecture Guide
Intel IA-64 Architecture Software Developer's Manual
Intel 64 and IA-32 Architectures Software Developer Manuals
System V Application Binary Interface, AMD64 Architecture Processor Supplement, Version 1.0
Linux Standard Base, Version 5.0
5. VSI Encourages Your Comments
You may send comments or suggestions regarding this manual or any VSI document by sending electronic mail to the following Internet address: <docinfo@vmssoftware.com>. Users who have VSI OpenVMS support contracts through VSI can contact <support@vmssoftware.com> for help with this product.
6. OpenVMS Documentation
The full VSI OpenVMS documentation set can be found on the VMS Software Documentation webpage at https://docs.vmssoftware.com.
7. Typographical Conventions
The following conventions are used in this manual:
Convention | Meaning |
---|---|
Ctrl/x | A sequence such as Ctrl/x indicates that you must hold down the key labeled Ctrl while you press another key or a pointing device button. |
PF1 x | A sequence such as PF1 x indicates that you must first press and release the key labeled PF1 and then press and release another key (x) or a pointing device button. |
... | A horizontal ellipsis in examples indicates one of the following possibilities: |
. . . | A vertical ellipsis indicates the omission of items from a code example or command format; the items are omitted because they are not important to the topic being discussed. |
( ) | In command format descriptions, parentheses indicate that you must enclose choices in parentheses if you specify more than one. |
[ ] | In command format descriptions, brackets indicate optional choices. You can choose one or more items or no items. Do not type the brackets on the command line. However, you must include the brackets in the syntax for directory specifications and for a substring specification in an assignment statement. |
| | In command format descriptions, vertical bars separate choices within brackets or braces. Within brackets, the choices are optional; within braces, at least one choice is required. Do not type the vertical bars on the command line. |
{ } | In command format descriptions, braces indicate required choices; you must choose at least one of the items listed. Do not type the braces on the command line. |
bold type | Bold type represents the name of an argument, an attribute, or a reason. Bold type also represents the introduction of a new term. |
italic type | Italic type indicates important information, complete titles of manuals, or variables. Variables include information that varies in system output (Internal error number), in command lines (/PRODUCER=name), and in command parameters in text (where dd represents the predefined code for the device type). |
UPPERCASE TYPE | Uppercase type indicates a command, the name of a routine, the name of a file, or the abbreviation for a system privilege. |
Example | This typeface indicates code examples, command examples, and interactive screen displays. In text, this type also identifies website addresses, UNIX commands and pathnames, PC-based commands and folders, and certain elements of the C programming language. |
- | A hyphen at the end of a command format description, command line, or code line indicates that the command or statement continues on the following line. |
numbers | All numbers in text are assumed to be decimal unless otherwise noted. Nondecimal radixes—binary, octal, or hexadecimal—are explicitly indicated. |
Chapter 1. Introduction
This standard defines properties such as the run-time data structures, constants, algorithms, conventions, methods, and functional interfaces that enable a native user-mode procedure to operate correctly in a multilanguage and multithreaded environment on OpenVMS VAX, OpenVMS Alpha, OpenVMS I64, and OpenVMS x86-64 systems. These properties include the contents of key registers, format and contents of certain data structures, and actions that procedures must perform under certain circumstances.
This standard also defines properties of the run-time environment that must apply at various points during program execution. These properties vary in scope and applicability. Some properties apply at all points throughout the execution of standard-conforming user-mode code and must, therefore, be held constant at all times. Examples of such properties include those defined for the stack pointer and various properties of the call stack navigation mechanism. Other properties apply only at certain points, such as call conventions that apply only at the point of transfer of control to another procedure.
Note
In many cases, significant performance gains can be realized by selective use of nonstandard calls when the safety of such calls is known. Developers of compilers and other tools are encouraged to make full use of such optimizations.
The procedure call mechanism depends on agreement between the calling and called procedures to interpret the argument list. The argument list does not fully describe itself. This standard requires language extensions to permit a calling program to generate some of the argument-passing mechanisms expected by called procedures.
The standard specifies the following attributes of the interface between calling and called procedures:
Calling sequence—instructions at the call site, entry point, and returns
Argument list—structure of the list describing the arguments to the called procedure
Function value return—form and conventions for the return of the function value as a value or as a condition value to indicate success or failure
Register usage—which registers are preserved and who is responsible for preserving them
Stack usage—rules governing the use of the stack
Argument data types—data types of arguments that can be passed
Argument descriptor formats—how descriptors are passed for the more complex arguments
Condition handling—how exception conditions are signaled and how they are handled in a modular fashion
Stack unwinding—how the current thread of execution is aborted efficiently.
1.1. Applicability
This standard defines the rules and conventions that govern the native user-mode run-time environment on OpenVMS VAX, Alpha, I64, and x86-64 systems. It is applicable to all software that executes in OpenVMS native user mode.
In particular, the standard applies to the following:
All externally callable interfaces in OpenVMS supported, standard system software
All intermodule calls to major software components
All external procedure calls generated by OpenVMS language processors without interprocedural analysis or permanent private conventions (such as those used for language-support run-time library [RTL] routines).
1.2. Architectural Level
This standard defines an implementation-level run-time software architecture for OpenVMS operating systems.
The interfaces, methods, and conventions specified in this document are primarily intended for use by implementers of compilers, debuggers, and other run-time tools, run-time libraries, and base operating systems. These specifications may or may not be appropriate for use by higher-level system software and applications.
Compilers and run-time libraries may provide additional support of these capabilities via interfaces that are more suited for compiler and application use. This specification neither prohibits nor requires such additional interfaces.
1.3. Goals
Generally, this calling standard has the following goals:
Applies to all intermodule callable interfaces in the native software system. Specifically, the standard considers the requirements of important compiled languages including Ada, BASIC, BLISS, C, C++, COBOL, Fortran, Pascal, LISP, PL/I, and calls to the operating system and library procedures. The needs of other languages that the OpenVMS operating system may support in the future must be met by the standard or by compatible revisions to it.
Excludes capabilities for lower-level components (such as assembler routines) that cannot be invoked from the high-level languages.
Allows the calling program and called procedure to be written in different languages. The standard reduces the need for using language extensions in mixed-language programs.
Contributes to the writing of error-free, modular, and maintainable software, and promotes effective sharing and reuse of software modules.
Provides the programmer with control over fixing, reporting, and flow of control when various types of exception conditions occur.
Provides subsystem and application writers with the ability to override system messages toward a more suitable application-oriented interface.
Adds no space or time overhead to procedure calls and returns that do not establish exception handlers, and minimizes time overhead for establishing handlers at the cost of increased time overhead when exceptions occur.
The OpenVMS Alpha portion of this calling standard has the following additional goals:
Supports a 32-bit user-mode environment that provides a high degree of compatibility with the OpenVMS VAX environment.
Supports a 64-bit user-mode environment that is a compatible superset of the OpenVMS Alpha 32-bit environment.
Simplifies coexistence with OpenVMS VAX procedures that execute under the translated image environment.
Simplifies the compilation of OpenVMS VAX assembler source to native OpenVMS Alpha object code.
Supports a multilanguage, multithreaded execution environment, including efficient, effective support for the implementation of the multithreaded architecture.
Provides an efficient mechanism for calling lightweight procedures that do not need or cannot expend the overhead of setting up a stack call frame.
Provides for the use of a common calling sequence to invoke lightweight procedures that maintain only a register call frame and heavyweight procedures that maintain a stack call frame. This calling sequence allows a compiler to determine whether to use a stack frame based on the complexity of the procedure being compiled. A recompilation of a called routine that causes a change in stack frame usage does not require a recompilation of its callers.
Provides condition handling, traceback, and debugging for lightweight procedures that do not have a stack frame.
Makes efficient use of the Alpha architecture, including effectively using a larger number of registers than is contained in a conventional VAX processor.
Minimizes the cost of procedure calls.
The OpenVMS I64 portion of this calling standard has the following additional goals:
Extends all of the goals listed above for the OpenVMS Alpha environment to the OpenVMS I64 environment.
Supports a 64-bit user mode environment that is highly compatible with the OpenVMS Alpha 64-bit user mode environment.
Makes efficient use of the Itanium architecture, including using a larger number of registers than is contained in a conventional Alpha processor, as well as additional I64 architecture features.
Follows conventions established for Intel Itanium processor software generally except where required to preserve compatibility with OpenVMS VAX and Alpha environments.
The OpenVMS x86-64 portion of this calling standard has the following additional goals:
Extends all of the goals of the earlier OpenVMS environments to x86-64 compatible systems.
Follows industry conventions established for the Intel and AMD compatible x86-64 processor software generally except where required to preserve compatibility with OpenVMS for earlier environments.
The procedure calling mechanisms of this standard do not provide the following:
Checking of argument data types, data structures, and parameter access. The OpenVMS protection and memory management systems do not depend on correct interactions between user-level calling and called procedures. Such extended checking might be desirable in some circumstances, but system integrity does not depend on it.
Information for an interpretive OpenVMS Debugger. The definition of the debugger includes a debug symbol table (DST) that contains the required descriptive information.
1.4. Definitions
Address: On OpenVMS VAX systems, a 32-bit value used to denote a position in memory. On OpenVMS Alpha, OpenVMS I64, and OpenVMS x86-64 systems (collectively referred to as the 64-bit systems), a 64-bit value used to denote a position in memory. However, many 64-bit applications and user-mode facilities operate in such a manner that addresses are restricted only to values that are representable in 32 bits. This allows addresses on 64-bit systems often to be stored and manipulated as 32-bit longword values. In such cases, the 32-bit address value is always implicitly or explicitly sign-extended to form a 64-bit address for use by the hardware.
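The sign-extension rule for 32-bit addresses on 64-bit systems can be illustrated with a small sketch. The function name `sign_extend_32` is hypothetical and not part of the standard; the code only models the arithmetic, not any OpenVMS interface.

```python
def sign_extend_32(longword):
    """Return the 64-bit pattern obtained by sign-extending a 32-bit value."""
    longword &= 0xFFFF_FFFF                 # keep only the low 32 bits
    if longword & 0x8000_0000:              # bit 31 set: replicate it upward
        return longword | 0xFFFF_FFFF_0000_0000
    return longword                         # bit 31 clear: upper half is zero


low = sign_extend_32(0x7FFF_0000)    # bit 31 clear
high = sign_extend_32(0x8000_0000)   # bit 31 set
```

Under this rule, a longword with bit 31 set denotes an address in the uppermost part of the 64-bit address space rather than one just above 2 GB.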
Argument list: A vector of entries (longwords on OpenVMS VAX, quadwords on 64-bit systems) that represents a procedure parameter list and possibly a function value.
Asynchronous software interrupt: An asynchronous interruption of normal code flow caused by some software event. This interruption shares many of the properties of hardware exceptions, including forcing some out-of-line code to execute.
Bound procedure: A type of procedure that requires knowledge (at run-time) of a dynamically determined larger enclosing scope to function correctly.
Call frame: The body of information that a procedure must save to allow it to properly return to its caller. A call frame may exist on the stack or in registers. A call frame may optionally contain additional information required by the called procedure.
Condition handler: A procedure designed to handle conditions (exceptions) when they occur during the execution of a thread.
Condition value: A 32-bit value (sign-extended to a 64-bit value on 64-bit systems) used to uniquely identify an exception condition. A condition value can be returned to a calling program as a function value or it can be signaled using the OpenVMS signaling mechanism.
Descriptor: A mechanism for passing parameters where the address of a descriptor is an entry in the argument list. The descriptor contains the address of the parameter, data type, size, and additional information needed to describe fully the data passed.
Exception condition (or condition): An exceptional condition in the current hardware or software state that should be noted or fixed. Its existence causes an interruption in program flow and forces execution of out-of-line code. Such an event might be caused by an exceptional hardware state, such as arithmetic overflows, memory access control violations, and so on, or by actions performed by software, such as subscript range checking, assertion checking, or asynchronous notification of one thread by another.
During the time the normal control flow is interrupted by an exception, that condition is termed active.
Function: A procedure that returns a single value in accordance with the standard conventions for value returning. Additional values may be returned by means of the argument list.
Function pointer: See Procedure value.
Function value: Depending on context, either 1) a value that is returned as a result of calling a procedure, or 2) a procedure value (see below).
Hardware exception: A category of exceptions that reflect an exceptional condition in the current hardware state that should be noted or fixed by the software. Hardware exceptions can occur synchronously or asynchronously with respect to the normal program flow.
IP (I64 platforms): Instruction pointer—a value that identifies a bundle of instructions in memory; the address of the first (lowest addressed) byte of an aligned 16-byte sequence that encodes three Itanium architecture instructions. See also PC.
IP (x86-64 platforms): Instruction pointer—an address that identifies an instruction in memory. See also PC.
Immediate value: A mechanism for passing input parameters where the actual value is provided in the argument list entry by the calling program.
Language-support procedure: A procedure called implicitly to implement high-level language constructs. Such procedures are not intended to be explicitly called from user programs.
Leaf procedure: A procedure that makes no outbound calls. Conversely, a non-leaf procedure is one that does make outbound calls.
Library procedure: A procedure explicitly called using the equivalent of a call statement or function reference. Such procedures are usually language independent.
Natural alignment: An attribute of certain data types that refers to the placement of the data so that the lowest addressed byte of the data has an address that is a multiple of the size of the data in bytes. Natural alignment of an aggregate data type generally refers to an alignment in which all members of the aggregate are naturally aligned.
This standard defines five natural alignments:
Byte—Any byte address
Word—Any byte address that is a multiple of 2
Longword—Any byte address that is a multiple of 4
Quadword—Any byte address that is a multiple of 8
Octaword—Any byte address that is a multiple of 16
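As a minimal sketch of the definition above, natural alignment for the five scalar sizes reduces to a remainder test. The names `ALIGNMENTS` and `is_naturally_aligned` are illustrative assumptions, not identifiers from the standard.

```python
# Sizes in bytes of the five natural alignments named in the standard.
ALIGNMENTS = {"byte": 1, "word": 2, "longword": 4, "quadword": 8, "octaword": 16}


def is_naturally_aligned(address, kind):
    """True if address is a multiple of the size of the named data unit."""
    return address % ALIGNMENTS[kind] == 0


quad_ok = is_naturally_aligned(0x1008, "quadword")   # 0x1008 is a multiple of 8
octa_ok = is_naturally_aligned(0x1008, "octaword")   # but not a multiple of 16
```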
PC: A value that identifies an instruction in memory. On OpenVMS VAX, Alpha, and x86-64 systems, the address of the first (lowest addressed) byte of the sequence (unaligned on VAX and x86-64, longword aligned on Alpha) that holds the instruction. On OpenVMS I64, the IP (see above) of the bundle that contains the instruction added to the number of the slot (0, 1, or 2) for that instruction within the bundle. Sometimes used as a synonym or generic alternative to IP.
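The I64 form of a PC value described above (bundle address plus slot number) can be sketched as follows. The helper names `make_pc` and `split_pc` are hypothetical; the sketch assumes only what the definition states, namely that bundles are aligned 16-byte sequences and that slots are numbered 0, 1, or 2.

```python
BUNDLE_SIZE = 16  # an Itanium bundle is an aligned 16-byte sequence


def make_pc(ip, slot):
    """Combine a bundle address (IP) and a slot number into a PC value."""
    if ip % BUNDLE_SIZE != 0:
        raise ValueError("IP must be 16-byte aligned")
    if slot not in (0, 1, 2):
        raise ValueError("slot must be 0, 1, or 2")
    return ip + slot


def split_pc(pc):
    """Recover (ip, slot) from a PC value built by make_pc."""
    slot = pc % BUNDLE_SIZE          # the low four bits carry the slot number
    if slot > 2:
        raise ValueError("not a valid I64 PC value")
    return pc - slot, slot


pc = make_pc(0x4000, 2)
```

Because the IP is always 16-byte aligned, adding the slot number never carries into the bundle address, so the two parts can be separated again without loss.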
Procedure: A closed sequence of instructions that is entered from and returns control to the calling program.
Procedure value: An address value that represents a procedure. On OpenVMS VAX systems, a procedure value is the address of the entry mask that is interpreted by the CALLx instruction invoking the procedure. On OpenVMS Alpha systems, a procedure value is the address of the procedure descriptor for the procedure. On OpenVMS I64 systems, a procedure value is the address of a function descriptor for the procedure; it is also known as a function pointer. On OpenVMS x86-64 systems, a procedure value is a 32-bit address for either the entry point of a procedure or, if the entry point address is not representable in 32 bits, a 32-bit address for trampoline code that jumps to the actual entry point; the trampoline code may be created by the linker or be created dynamically in the case of a bound procedure value.
Process: An address space and at least one thread of execution. Selected security and quota checks are done on a per-process basis.
This standard anticipates the possibility of the execution of multiple threads within a process. An operating system that provides only a single thread of execution per process is considered a special case of a multithreaded system where the maximum number of threads per process is one.
Reference: A mechanism for passing parameters where the address of the parameter is provided in the argument list by the calling program.
Routine: Synonym for procedure or function.
Signal: A POSIX defined concept used to cause out-of-line execution of code. (This term should not be confused with the OpenVMS usage of the word that more closely equates to exception as used in this document).
Standard call: Any transfer of control to a procedure by any means that presents the called procedure with the environment defined by this document and does not place additional restrictions, not defined by this document, on the called procedure.
Standard-conforming procedure: A procedure that adheres to all the relevant rules set forth in this document.
Thread of execution (or thread): An entity scheduled for execution on a processor. In language terms, a thread is a computational entity used by a program unit. Such a program unit might be a task, procedure, loop, or some other unit of computation.
All threads executing within the same process share the same address space and other process contexts, but they have a unique per-thread hardware context that includes program counter, processor status, stack pointer, and other machine registers.
This standard applies only to threads that execute within the context of a user-mode process and are scheduled on one or more processors according to software priority. All subsequent uses of the term thread in this standard refer only to such user-mode process threads.
Thread-safe code: Code that is compiled in such a way to ensure it will execute properly when run in a threaded environment. Thread-safe code usually adds extra instructions to do certain run-time checks and requires that thread local storage be accessed in a particular fashion.
Trampoline: A code fragment (often just one or a very few instructions) that forwards a jump or call.
Undefined: Referring to operations or behavior for which there is no directing algorithm used across all implementations that support this standard. Such operations may be well defined for a particular implementation, but they still remain undefined with reference to this standard. The actions of undefined operations may not be required by standard-conforming procedures.
Unpredictable: Referring to the results of an operation that cannot be guaranteed across all implementations of this standard. These results may be well defined for a particular implementation, but they remain unpredictable with reference to this standard. All results that are not specified in this standard, but are caused by operations defined in this standard, are considered unpredictable. A standard-conforming procedure cannot depend on unpredictable results.
Chapter 2. OpenVMS VAX Conventions
This chapter describes the primary conventions in calling a procedure in an OpenVMS VAX environment.
2.1. Register Usage
In the VAX architecture, there are fifteen 32-bit-wide, general-purpose hardware registers for use with scalar and vector program operations. This section defines the rules of scalar and vector register usage.
2.1.1. Scalar Register Usage
Register | Use |
---|---|
PC | Program counter. |
SP | Stack pointer. |
FP | Current stack frame pointer. This register must always point at the current frame. No modification is permitted within a procedure body. |
AP | Argument pointer. When a call occurs, AP must point to a valid argument list. For a procedure without parameters, AP points to an argument list consisting of a single longword containing the value 0. |
R1 | Environment value. When a procedure that needs an environment value is called, the calling program must set R1 to the environment value. See bound procedure value in Section 7.3, “Miscellaneous Data Types”. |
R0, R1 | Function value return registers. These registers are not to be preserved by any called procedure. They are available as temporary registers to any called procedure. |
Registers R2 through R11 are to be preserved across procedure calls. The called procedure can use these registers, provided it saves and restores them using the procedure entry mask mechanism. The entry mask mechanism must be used so that any stack unwinding done by the condition handling mechanism restores all registers correctly. In addition, PC, FP, and AP are always preserved in the stack frame (see Section 2.2, “Stack Usage”) by the CALLS or CALLG instruction and restored by the RET instruction. However, a called procedure can use AP as a temporary register.
If JSB routines are used, they must not save or modify any preserved registers (R2 through R11) not already saved by the entry mask mechanism of the calling program.
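To make the entry mask mechanism more concrete, the following sketch builds a register save mask for a called procedure. It assumes the VAX architecture's entry mask layout, in which bit n of the mask word selects register Rn; the trap-enable bits at the high end of the word are left clear here. The function name `entry_mask` is illustrative only.

```python
PRESERVED = range(2, 12)  # R2 through R11 must be preserved across calls


def entry_mask(registers):
    """Return the save-mask bits for the preserved registers a procedure uses."""
    mask = 0
    for number in registers:
        if number not in PRESERVED:
            # R0 and R1 are scratch/return registers and are never saved.
            raise ValueError("R%d is not a preserved register" % number)
        mask |= 1 << number          # assumption: bit n selects Rn
    return mask


small = entry_mask([2, 3, 4])        # procedure that modifies R2, R3, and R4
full = entry_mask(range(2, 12))      # procedure that modifies R2 through R11
```

A procedure lists only the preserved registers it actually modifies, so that the call instruction saves them and RET (or an unwind) restores them.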
2.1.2. Vector Register Usage
This calling standard does not specify conventions for preserved vector registers, vector argument registers, or vector function value return registers. All such conventions are by agreement between the calling and called procedures. In the absence of such an agreement, all vector registers, including V0 through V15, VLR, VCR, and VMR are scratch registers. Among cooperating procedures, a procedure that preserves or otherwise manipulates the vector registers by agreement with its callers must provide an exception handler to restore them during an unwind.
2.2. Stack Usage
Figure 2.1, “Stack Frame Generated by CALLG or CALLS Instruction” shows the contents of the stack frame created for the called procedure by the CALLG or CALLS instruction.
FP always points to the call frame (the condition-handler longword) of the calling procedure. Other uses of FP within a procedure are prohibited. The bottom of the call stack is reached at the stack frame whose preserved FP is 0. Unless the procedure has a condition handler, the condition-handler longword contains all zeros. See Chapter 9, OpenVMS Conditions for more information on condition handlers.
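The frame chain described above can be walked by following each preserved FP until a value of 0 is found. The sketch below models memory as a dictionary of longwords. The byte offsets are an assumption based on the VAX CALLS/CALLG frame layout (condition handler at FP, preserved FP at FP+12, return PC at FP+16); Figure 2.1 remains the authoritative description. All names are hypothetical.

```python
HANDLER_OFFSET = 0     # condition-handler longword
SAVED_FP_OFFSET = 12   # preserved FP of the previous frame (assumed offset)
SAVED_PC_OFFSET = 16   # return address (assumed offset)


def walk_frames(memory, fp):
    """Yield (fp, handler, return_pc) for each frame, innermost first."""
    while fp != 0:
        yield fp, memory[fp + HANDLER_OFFSET], memory[fp + SAVED_PC_OFFSET]
        fp = memory[fp + SAVED_FP_OFFSET]   # a preserved FP of 0 ends the chain


# Two toy frames: the inner one has no handler, the outer one has a handler.
memory = {
    0x7000: 0,      0x700C: 0x7100, 0x7010: 0x200,   # inner frame
    0x7100: 0x5000, 0x710C: 0,      0x7110: 0x100,   # outer (bottom) frame
}
frames = list(walk_frames(memory, 0x7000))
```

A traceback or unwind facility follows the same chain, checking each condition-handler longword for a nonzero handler address along the way.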
The contents of the stack located at addresses higher than the mask/PSW longword belong to the calling program; they should not be read or written by the called procedure, except as specified in the argument list. The contents of the stack located at addresses lower than SP belong to interrupt and exception routines; they are modified continually and unpredictably.
The called procedure allocates local storage by subtracting the required number of bytes from the SP provided on entry. This local storage is freed automatically by the return instruction (RET).
Bit <28> of the mask/PSW longword is reserved to OpenVMS for future extensions to the stack frame.
2.3. Calling Sequence
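At the option of the calling program, the called procedure is invoked using either the CALLG or the CALLS instruction:
CALLG arglst, proc
CALLS argcnt, proc
CALLS pushes the argument count argcnt onto the stack as a longword and sets the argument pointer, AP, to the top of the stack. The complete sequence using CALLS follows:
push argn
.
.
.
push arg1
CALLS #n, proc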
If the called procedure returns control to the calling procedure, control must return to the instruction immediately following the CALLG or CALLS instruction. Skip returns and GOTO returns are allowed only during stack unwind operations.
The called procedure returns control to the calling procedure by executing the RET instruction.
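The CALLS sequence above can be modeled with a toy downward-growing stack: the caller pushes the arguments from last to first, the instruction pushes the count, and AP ends up addressing the count longword. This is only a sketch of the argument-list part of the instruction; the frame that CALLS builds afterward is omitted, and the `Stack` class and `calls_argument_list` function are hypothetical names.

```python
class Stack:
    """A toy downward-growing stack of longwords, keyed by byte address."""

    def __init__(self, top):
        self.sp = top
        self.memory = {}

    def push(self, value):
        self.sp -= 4                              # the stack grows downward
        self.memory[self.sp] = value & 0xFFFFFFFF


def calls_argument_list(stack, args):
    """Model the argument-list part of `CALLS #n, proc`; return AP."""
    if len(args) > 255:
        raise ValueError("the argument count must fit in one byte")
    for arg in reversed(args):      # push argn ... push arg1
        stack.push(arg)
    stack.push(len(args))           # CALLS pushes the argument count
    return stack.sp                 # AP addresses the count longword


stack = Stack(0x8000)
ap = calls_argument_list(stack, [10, 20, 30])
```

Pushing the arguments in reverse order is what places the first argument at the lowest address, immediately above the count.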
2.4. Argument List
The argument list is the primary means of passing information to and receiving results from a procedure.
2.4.1. Argument List Format
Figure 2.2, “Argument List Format” shows the argument list format.
The first longword is always present and contains the argument count as an unsigned integer in the low byte. The 24 high-order bits are reserved and must be zero. To access the argument count, the called procedure must ignore the reserved bits and access the count as an unsigned byte (for example, MOVZBL, TSTB, or CMPB).
Each remaining longword is one of the following:
An uninterpreted 32-bit value (by immediate value mechanism). If the called procedure expects fewer than 32 bits, it accesses the low-order bits and ignores the high-order bits.
An address (by reference mechanism). It is typically a pointer to a scalar data item, array, structure, record, or a procedure.
An address of a descriptor (by descriptor mechanism). See Chapter 8, OpenVMS Argument Descriptors for descriptor formats.
The standard permits programs to call by immediate value, by reference, by descriptor, or by combinations of these mechanisms. Interpretation of each argument list entry depends on agreement between the calling and called procedures. High-level languages use the reference or descriptor mechanisms for passing input parameters. OpenVMS system services and VAX BLISS, VAX C, VAX C++, or VAX MACRO programs use all three mechanisms.
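A procedure with no arguments is called with an argument list consisting of a single longword that contains an argument count of 0, as follows:
CALLS #0, proc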
A missing or null argument—for example, CALL SUB(A,,B)—is represented by an argument list entry consisting of a longword 0. Some procedures allow trailing null arguments to be omitted and others require all arguments. See each procedure's specification for details.
The argument list must be treated as read-only data by the called procedure and might be allocated in read-only memory at the option of the calling program.
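From the called procedure's side, the rules in this section can be sketched as follows: read the count from the low byte only, then treat a longword 0 as a missing argument. The sketch assumes that the arguments are passed by reference or by descriptor, so that 0 is never a valid entry; with immediate values, 0 could also be a legitimate value, and the interpretation depends on agreement between caller and callee. The function names are illustrative.

```python
def argument_count(first_longword):
    """Extract the count as an unsigned byte, ignoring the reserved bits."""
    return first_longword & 0xFF      # comparable to a MOVZBL of the low byte


def read_arguments(arglist):
    """Return the entries of an argument list, with None for null arguments."""
    count = argument_count(arglist[0])
    entries = arglist[1:1 + count]
    return [None if entry == 0 else entry for entry in entries]


# Argument list for CALL SUB(A,,B), with A at 0x1000 and B at 0x2000.
entries = read_arguments([3, 0x1000, 0, 0x2000])
```

The function builds a new list and leaves its input unchanged, in keeping with the rule that the called procedure treats the argument list as read-only.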
2.4.2. Argument Lists and High-Level Languages
Arguments are mapped from left to right to increasing argument list offsets. The leftmost (first) argument has an address of arglst+4, the next has an address of arglst+8, and so on. The only exception to this is when arglst+4 specifies where a function value is to be returned, in which case the first argument has an address of arglst+8, the second argument has an address of arglst+12, and so on. See Section 2.5, “Function Value Returns” for more information.
Each argument position corresponds to a single VAX argument list entry. For the C and C++ languages, a floating-point argument or a record struct that is larger than 32 bits may be passed by value using more than one VAX argument list entry. In this case, the argument count in the argument list reflects the actual number of argument list entries rather than the number of C or C++ language arguments.
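The offset rule can be written as a one-line calculation. The following C sketch is illustrative only: `arg_offset` is a hypothetical helper, and it assumes every argument occupies exactly one argument list entry (that is, no multi-entry C or C++ by-value arguments).

```c
#include <stdint.h>

/* Byte offset from arglst of the n-th language-level argument (1-based).
 * When has_result_slot is nonzero, arglst+4 is used for the function
 * value return, so every argument moves up by one longword. */
uint32_t arg_offset(unsigned n, int has_result_slot)
{
    return 4u * (n + (has_result_slot ? 1u : 0u));
}
```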
2.4.2.1. Order of Argument Evaluation
Because most high-level languages do not specify the order of evaluation of arguments (with respect to side effects), those language processors can evaluate arguments in any convenient order.
Note
The choice of argument evaluation order and code generation strategy is constrained only by the definition of the particular language. Do not write programs that depend on the order of evaluation of arguments.
2.4.2.2. Language Extensions for Argument Transmission
This calling standard permits arguments to be passed by immediate value, by reference, or by descriptor. By default, all language processors except VAX BLISS, VAX C, and VAX MACRO pass arguments by reference or by descriptor.
Language extensions are needed to reconcile the different argument-passing mechanisms. In addition to the default passing mechanism used, each language processor is required to give you explicit control, in the calling program, of the argument-passing mechanism for the data types supported by the language.
For example, VAX Fortran provides the following built-in functions for selecting the argument-passing mechanism:
Function | Argument-Passing Mechanism |
---|---|
%VAL(arg) | By immediate value mechanism. Corresponding argument list entry is the value of the argument. |
%REF(arg) | By reference mechanism. Corresponding argument list entry contains the address of the value of the argument. |
%DESCR(arg) | By descriptor mechanism. Corresponding argument list entry contains the address of a descriptor of the argument. |
For example:
CALL SUB1(%VAL(123), %REF(X), %DESCR(A))
For more information, see the VAX Fortran language documentation.
CALL SUB1 (123, X, A)
2.5. Function Value Returns
A function value is returned in register R0 if its data type can be represented in 32 bits, or in registers R0 and R1 if its data type can be represented in 64 bits, provided the data type is not a string data type (see Section 7.2, “String Data Types”).
If the data type requires fewer than 32 bits, then R1 and the high-order bits of R0 are undefined. If the data type requires 32 or more bits but fewer than 64 bits, then the high-order bits of R1 are undefined. Two separate 32-bit entities cannot be returned in R0 and R1 because high-level languages cannot process them.
If the maximum length of the function value is known (for example, octaword integer, H_floating, or fixed-length string), the calling program can allocate the required storage and pass the address of the storage or a descriptor for the storage as the first argument.
If the maximum length of a string function value is not known to the calling program, the calling program can allocate a dynamic string descriptor. The called procedure then allocates storage for the function value and updates the contents of the dynamic string descriptor using OpenVMS Run-Time Library procedures. For information about dynamic strings, see Section 8.3, “Dynamic String Descriptor (CLASS_D)”.
If the maximum length of a fixed-length string (see Section 8.2, “Fixed-Length Descriptor (CLASS_S)”) or a varying string (see Section 8.8, “Varying String Descriptor (CLASS_VS)”) function value is not known to the calling program, the calling program can indicate that it expects the string to be returned on top of the stack. For more information about the function value return, see Section 2.5.1, “Returning a Function Value on Top of the Stack”.
Some procedures, such as operating system calls and many library procedures, return a success or failure value as a longword function value in R0. Bit <0> of the value is set (Boolean true) for a success and clear (Boolean false) for a failure. The particular success or failure status is encoded in the remaining 31 bits, as described in Section 9.1, “Condition Values”.
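The success test on bit <0> can be sketched in C as follows. The function name `status_is_success` is a hypothetical helper chosen for this illustration; the remaining 31 bits are not interpreted here.

```c
#include <stdint.h>

/* Bit <0> set means success (Boolean true); clear means failure. */
int status_is_success(uint32_t status)
{
    return (status & 1u) != 0;
}
```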
2.5.1. Returning a Function Value on Top of the Stack
If the maximum length of the function value is not known, the calling program can optionally allocate certain descriptors with the POINTER field set to 0, indicating that no space has been allocated for the value. If the called procedure finds POINTER 0, it fills in the POINTER, LENGTH, and other extent fields to describe the actual size and placement of the function value. This function value is copied to the top of the stack as control returns to the calling program.
This is an exception to the usual practice because the calling program regains control at the instruction following the CALLG or CALLS sequence with the contents of SP restored to a value different from the one it had at the beginning of its CALLG or CALLS calling sequence.
This technique applies only to the first argument in the argument list. Also, the called procedure cannot assume that the calling program expects the function value to be returned on the stack. Instead, the called procedure must check the CLASS field. If the descriptor is one that can be used to return a value on the stack, the called procedure checks the POINTER field. If POINTER is not 0, the called procedure returns the value using the semantics of the descriptor. If POINTER is 0, the called procedure fills in the POINTER and LENGTH fields and returns the value to the top of the stack.
Also, when POINTER is 0, the contents of R0 and R1 are unspecified by the called procedure. Once the called procedure fills in the POINTER field and other extent fields, the calling program may pass the descriptor as an argument to other procedures.
2.5.1.1. Returning a Fixed-Length or Varying String Function Value
CLASS | POINTER | Called Procedure's Action |
---|---|---|
S=1 | Not 0 | Copy the function value to the fixed-length area specified by the descriptor and space fill (hex 20 if ASCII) or truncate on the right. The entire area is always written according to Section 8.2, “Fixed-Length Descriptor (CLASS_S)”. |
S=1 | 0 | Return the function value on top of the stack after filling in POINTER with the first address of the string and LENGTH with the length of the string to complete the descriptor according to Section 8.2, “Fixed-Length Descriptor (CLASS_S)”. |
VS=11 | Not 0 | Copy the function value to the varying area specified by the descriptor and fill in CURLEN and BODY according to Section 8.8, “Varying String Descriptor (CLASS_VS)”. |
VS=11 | 0 | Return the function value on top of the stack after filling in POINTER with the address of CURLEN and MAXSTRLEN with the length of the string in bytes (same value as contents of CURLEN) according to Section 8.8, “Varying String Descriptor (CLASS_VS)”. |
Other | - | Error. A condition is signaled. |
In both the fixed-length and varying string cases, the string is unaligned. Specifically, the function value is allocated on top of the stack with no unused bytes between the stack pointer value contained at the beginning of the CALLS or CALLG sequence and the last byte of the string.
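The decision a called procedure makes from the CLASS and POINTER fields can be sketched in C. This is a simplified model, not an implementation: the enumerators and the functions `choose_return_action` and `copy_fixed` are hypothetical names, and only the class codes 1 and 11 are taken from the table above.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { CLASS_S = 1, CLASS_VS = 11 };   /* class codes from the table */

typedef enum {
    COPY_FIXED,      /* CLASS_S,  POINTER not 0: copy, space fill or truncate */
    STACK_FIXED,     /* CLASS_S,  POINTER 0: return on top of the stack */
    COPY_VARYING,    /* CLASS_VS, POINTER not 0: fill in CURLEN and BODY */
    STACK_VARYING,   /* CLASS_VS, POINTER 0: return on top of the stack */
    SIGNAL_ERROR     /* any other class: a condition is signaled */
} return_action;

/* Select the action from the descriptor's CLASS and POINTER fields. */
return_action choose_return_action(uint8_t dsc_class, uint32_t pointer)
{
    switch (dsc_class) {
    case CLASS_S:  return pointer ? COPY_FIXED : STACK_FIXED;
    case CLASS_VS: return pointer ? COPY_VARYING : STACK_VARYING;
    default:       return SIGNAL_ERROR;
    }
}

/* COPY_FIXED case: write the entire fixed-length area, truncating on the
 * right or padding with spaces (hex 20, assuming ASCII data). */
void copy_fixed(char *dst, size_t dst_len, const char *src, size_t src_len)
{
    size_t n = src_len < dst_len ? src_len : dst_len;
    memcpy(dst, src, n);
    memset(dst + n, 0x20, dst_len - n);
}
```

Checking CLASS before POINTER mirrors the rule that the called procedure cannot assume the caller expects a stack return.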
2.6. Vector and Scalar Processor Synchronization
There are two kinds of synchronization between a scalar and vector processor pair: memory synchronization and exception synchronization.
Memory synchronization with the caller of a procedure that uses the vector processor is required because scalar machine writes (to main memory) might still be pending at the time of entry to the called procedure. The various forms of write-cache strategies allowed by the VAX architecture combined with the possibly independent scalar and vector memory access paths imply that a scalar store followed by a CALLx followed by a vector load is not safe without an intervening MSYNC.
Within a procedure that uses the vector processor, proper memory and exception synchronization might require use of an MSYNC instruction, a SYNC instruction, or both, prior to calling or upon being called by another procedure. Further, for calls to other procedures, the requirements can vary from call to call, depending on details of actual vector usage.
An MSYNC instruction (without a SYNC) at procedure entry, at procedure exit, and prior to a call provides proper synchronization in most cases. A SYNC instruction without an MSYNC prior to a CALLx (or RET) is sometimes appropriate. The remaining two cases, where both or neither MSYNC and SYNC are needed, are rare.
Refer to the VAX MACRO and Instruction Set Reference Manual for the specific rules on what exceptions are ensured to be reported by MSYNC and other MFVP instructions.
2.6.1. Memory Synchronization
An MSYNC instruction (a form of the MFVP instruction) must be executed before the first vector load or store to synchronize with memory operations issued by the caller. While an MSYNC instruction might typically occur in the entry code sequence of a procedure, exact placement might also depend on a variety of optimization considerations.
An MSYNC instruction must be executed after the last vector load or store to synchronize with memory operations issued after return. While an MSYNC instruction might typically occur in the return code sequence of a procedure, exact placement might also depend on a variety of optimization considerations.
An MSYNC instruction must be executed between each vector load and store and each standard call to other procedures to synchronize with memory operations issued by those procedures.
Any procedure that executes vector loads or stores is responsible for synchronizing with potentially conflicting memory operations in any other procedure. However, execution of an MSYNC instruction to ensure scalar and vector memory synchronization can be omitted when it can be determined for the current procedure that all possibly incomplete vector loads and stores operate only on memory not accessed by other procedures.
2.6.2. Exception Synchronization
Every procedure must ensure that no exception can be raised after the current frame is changed (as a result of a CALLx or RET). If a procedure executes any vector instruction that might raise an exception, then a SYNC instruction (a form of the MFVP instruction) must be executed prior to any subsequent CALLx or RET.
However, if the only exceptions that can occur are certain to be reported by an MSYNC instruction that is otherwise needed for memory synchronization, then the SYNC is redundant and can be omitted as an optimization.
Moreover, if the only exceptions that can occur are certain to be reported by one or more MFVP instructions that read the vector control registers, then the SYNC is redundant and can be omitted as an optimization.
Chapter 3. OpenVMS Alpha Conventions
This chapter describes the fundamental concepts and conventions for calling a procedure in an Alpha environment. The following sections identify register usage and addressing, and focus on aspects of the calling standard that pertain to procedure-to-procedure flow control.
3.1. Register Usage
The Alpha general-purpose registers are divided into two groups:
Integer
Floating-point
The first 32 general-purpose registers support integer processing and the second 32 support floating-point operations.
3.1.1. Integer Registers
Register | Usage |
---|---|
R0 | Function value register. In a standard call that returns a nonfloating-point function result in a register, the result must be returned in this register. In a standard call, this register may be modified by the called procedure without being saved and restored. This register is not to be preserved by any called procedure. |
R1 | Conventional scratch register. In a standard call, this register may be modified by the called procedure without being saved and restored. This register is not to be preserved by any called procedure. In addition, R1 is the preferred and recommended register to use for passing the environment value when calling a bound procedure. (See Section 3.6.4, “Simple and Bound Procedures” and Section 6.1.2, “Translated Images on I64 Systems”). |
R2-R15 | Conventional saved registers. If a standard-conforming procedure modifies one of these registers, it must save and restore it. |
R16-R21 | Argument registers. In a standard call, up to six nonfloating-point items of the argument list are passed in these registers. In a standard call, these registers may be modified by the called procedure without being saved and restored. |
R22-R24 | Conventional scratch registers. In a standard call, these registers may be modified by the called procedure without being saved and restored. |
R25 | Argument information (AI) register. In a standard call, this register describes the argument list. (See Section 3.6.1, “Call Conventions” for a detailed description). In a standard call, this register may be modified by the called procedure without being saved and restored. |
R26 | Return address (RA) register. In a standard call, the return address must be passed in this register. In a standard call, this register may be modified by the called procedure without being saved and restored. |
R27 | Procedure value (PV) register. In a standard call, the procedure value of the procedure being called is passed in this register. In a standard call, this register may be modified by the called procedure without being saved and restored. |
R28 | Volatile scratch register. The contents of this register are always unpredictable after any external transfer of control either to or from a procedure. This applies to both standard and nonstandard calls. This register may be used by the operating system for external call fixup, autoloading, and exit sequences. |
R29 | Frame pointer (FP). The contents of this register define, among other things, which procedure is considered current. Details of usage and alignment are defined in Section 3.5, “Procedure Call Stack”. |
R30 | Stack pointer (SP). This register contains a pointer to the top of the current operating stack. Aspects of its usage and alignment are defined by the hardware architecture. Various software aspects of its usage and alignment are defined in Section 3.6.1, “Call Conventions”. |
R31 | ReadAsZero/Sink (RZ). Hardware defines binary 0 as a source operand and sink (no effect) as a result operand. |
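The integer register roles listed above can be captured in a small lookup, for example in a tool that checks generated code. The following C sketch is illustrative; the type `int_reg_role` and the functions `int_reg_role_of` and `is_conventional_saved` are hypothetical names, and the mapping simply restates the table.

```c
typedef enum {
    REG_FUNCTION_VALUE, REG_SCRATCH, REG_SAVED, REG_ARGUMENT,
    REG_ARG_INFO, REG_RETURN_ADDRESS, REG_PROCEDURE_VALUE,
    REG_VOLATILE_SCRATCH, REG_FRAME_POINTER, REG_STACK_POINTER,
    REG_READ_AS_ZERO, REG_INVALID
} int_reg_role;

/* Map an integer register number (0..31) to its role in a standard call. */
int_reg_role int_reg_role_of(unsigned r)
{
    if (r == 0)  return REG_FUNCTION_VALUE;   /* R0 */
    if (r == 1)  return REG_SCRATCH;          /* R1 */
    if (r <= 15) return REG_SAVED;            /* R2-R15 */
    if (r <= 21) return REG_ARGUMENT;         /* R16-R21 */
    if (r <= 24) return REG_SCRATCH;          /* R22-R24 */
    switch (r) {
    case 25: return REG_ARG_INFO;             /* AI */
    case 26: return REG_RETURN_ADDRESS;       /* RA */
    case 27: return REG_PROCEDURE_VALUE;      /* PV */
    case 28: return REG_VOLATILE_SCRATCH;
    case 29: return REG_FRAME_POINTER;        /* FP */
    case 30: return REG_STACK_POINTER;        /* SP */
    case 31: return REG_READ_AS_ZERO;         /* RZ */
    default: return REG_INVALID;
    }
}

/* 1 if the table lists the register as a conventional saved register,
 * which a procedure must save and restore if it modifies it. */
int is_conventional_saved(unsigned r)
{
    return int_reg_role_of(r) == REG_SAVED;
}
```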
3.1.2. Floating-Point Registers
Register | Usage |
---|---|
F0 | Floating-point function value register. In a standard call that returns a floating-point result in a register, this register is used to return the real part of the result. In a standard call, this register may be modified by the called procedure without being saved and restored. |
F1 | Floating-point function value register. In a standard call that returns a complex floating-point result in registers, this register is used to return the imaginary part of the result. In a standard call, this register may be modified by the called procedure without being saved and restored. |
F2-F9 | Conventional saved registers. If a standard-conforming procedure modifies one of these registers, it must save and restore it. |
F10-F15 | Conventional scratch registers. In a standard call, these registers may be modified by the called procedure without being saved and restored. |
F16-F21 | Argument registers. In a standard call, up to six floating-point arguments may be passed by value in these registers. In a standard call, these registers may be modified by the called procedure without being saved and restored. |
F22-F30 | Conventional scratch registers. In a standard call, these registers may be modified by the called procedure without being saved and restored. |
F31 | ReadAsZero/Sink. Hardware defines binary 0 as a source operand and sink (no effect) as a result operand. |
3.2. Address Representation
An address is a 64-bit value used to denote a position in memory. However, for compatibility with OpenVMS VAX, many Alpha applications and user-mode facilities operate in such a manner that addresses are restricted only to values that are representable in 32 bits. This allows Alpha addresses often to be stored and manipulated as 32-bit longword values. In such cases, the 32-bit address value is always implicitly or explicitly sign-extended to form a 64-bit address for use by the Alpha hardware.
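The sign-extension rule can be shown with a short C sketch. The helper names `extend_address` and `fits_in_longword` are hypothetical and serve only to illustrate the arithmetic.

```c
#include <stdint.h>

/* Sign-extend a 32-bit longword address to a 64-bit address. */
uint64_t extend_address(uint32_t addr32)
{
    uint64_t a = addr32;
    if (a & 0x80000000u)
        a |= 0xFFFFFFFF00000000u;   /* replicate bit 31 into bits 63:32 */
    return a;
}

/* 1 if a 64-bit address can be stored as a longword and recovered intact. */
int fits_in_longword(uint64_t addr64)
{
    return extend_address((uint32_t)addr64) == addr64;
}
```

Under this rule, a longword with bit 31 set denotes an address at the top of the 64-bit space, not one just above 2 GB.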
3.3. Procedure Representation
One distinguishing characteristic of any calling standard is how procedures are represented. The term used to denote the value that uniquely identifies a procedure is a procedure value. If the value identifies a bound procedure, it is called a bound procedure value.
In the Alpha portion of this calling standard, all procedure values are defined to be the address of the data structure (a procedure descriptor) that describes that procedure. So, any procedure can be invoked by calling the address stored at offset 8 from the address represented by the procedure value.
Note that a simple (unbound) procedure value is defined as the address of that procedure's descriptor (see Section 3.4, “Procedure Types”). This provides slightly different conventions than would be used if the address of the procedure's code were used as it is in many calling standards.
A bound procedure value is defined as the address of a bound procedure descriptor that provides the necessary information for the bound procedure to be called (see Section 3.6.4, “Simple and Bound Procedures”).
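Because a procedure value is the address of a descriptor rather than of code, a tool that wants the code address must read it from the descriptor. The following C sketch models that lookup on a raw byte image of a descriptor. The function name `entry_address` is hypothetical, and the sketch assumes the image is in little-endian byte order.

```c
#include <stdint.h>

/* Read the 64-bit code entry address stored at offset 8 of a procedure
 * descriptor image. Bytes are assembled least significant first
 * (little-endian layout is an assumption of this sketch). */
uint64_t entry_address(const unsigned char *descriptor)
{
    uint64_t entry = 0;
    for (int i = 7; i >= 0; i--)
        entry = (entry << 8) | descriptor[8 + i];
    return entry;
}
```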
3.4. Procedure Types
This standard defines the following basic types of procedures:
Stack frame procedure—Maintains its caller's context on the stack.
Register frame procedure—Maintains its caller's context in registers.
Null frame procedure—Does not establish a context and, therefore, executes in the context of its caller.
A compiler can choose which type of procedure to generate based on the requirements of the procedure in question. A calling procedure does not need to know what type of procedure it is calling.
Every procedure must have an associated structure that describes which type of procedure it is and other procedure characteristics. This structure, called a procedure descriptor, is a quadword-aligned data structure that provides basic information about a procedure. This data structure is used to interpret the call stack at any point in a thread's execution. It is typically built at compile time and usually is not accessed at run-time except to support exception processing or other rarely executed code.
Read access to procedure descriptors is done through a procedure interface described in Section 3.5.2, “Procedure Call Tracing”. This allows for future compatible extensions to these structures.
A procedure descriptor serves two purposes:
To make invocations of that procedure visible to and interpretable by facilities such as the debugger, exception handling system, and the unwinder.
To ensure that the context of the caller saved by the called procedure can be restored if an unwind occurs. (For a description of unwinding, see Section 9.7, “Request to Unwind from a Signal”).
3.4.1. Stack Frame Procedures
A called routine may use the stack as a means to return certain types of function values (see Section 3.7.7, “Returning Data” for more information).
A called routine that allocates stack space may take an exception in its routine prologue before it becomes current. This situation must be considered because the stack expansion happens in the context of the caller (see Section 3.5, “Procedure Call Stack” and Section 3.6.5, “Entry and Exit Code Sequences” for more information).
For this reason, a fixed-stack usage version of this procedure type cannot make standard calls.
The variable-stack usage version of this type of procedure is referred to as full function and can make standard calls to other procedures.
3.4.2. Procedure Descriptor for Procedures with a Stack Frame
A stack frame procedure descriptor (PDSC) built by a compiler provides information about a procedure with a stack frame. The minimum size of the descriptor is 32 bytes defined by constant C. An optional PDSC extension in 8-byte increments supports exception handling requirements.
The fields defined in the stack frame descriptor are illustrated in Figure 3.1, “Stack Frame Procedure Descriptor (PDSC)” and described in Table 3.3, “Contents of Stack Frame Procedure Descriptor (PDSC)”.
Field Name | Contents | ||
---|---|---|---|
PDSC$W_FLAGS |
The PDSC descriptor flag bits <15:0> are defined as follows: | ||
PDSC$V_KIND |
A 4-bit field <3:0> that identifies the type of procedure descriptor. For a procedure with a stack frame, this field must specify a value 9 (defined by constant PDSC$K_KIND_FP_STACK). | ||
PDSC$V_HANDLER_VALID |
If set to 1, this descriptor has an extension for the stack handler (PDSC$Q_STACK_HANDLER) information. | ||
PDSC$V_HANDLER_ |
If set to 1, the handler can be reinvoked, allowing an occurrence of another exception while the handler is already active. If this bit is set to 0, the exception handler cannot be reinvoked. Note that this bit must be 0 when PDSC$V_HANDLER_VALID is 0. | ||
PDSC$V_HANDLER_DATA_VALID |
If set to 1, the HANDLER_VALID bit must be 1, the PDSC extension STACK_HANDLER_DATA field contains valid data for the exception handler, and the address of PDSC$Q_STACK_HANDLER_DATA will be passed to the exception handler as defined in Section 9.2, “Condition Handlers”. | ||
PDSC$V_BASE_REG_IS_FP |
If this bit is set to 0, the SP is the base register to which PDSC$L_SIZE is added during an unwind. A fixed amount of storage is allocated in the procedure entry sequence, and SP is modified by this procedure only in the entry and exit code sequence. In this case, FP typically contains the address of the procedure descriptor for the procedure. A procedure for which this bit is 0 cannot make standard calls. If this bit is set to 1, FP is the base address and the procedure has a minimum amount of stack storage specified by PDSC$L_SIZE. A variable amount of stack storage can be allocated by modifying SP in the entry and exit code of this procedure. | ||
PDSC$V_REI_RETURN |
If set to 1, the procedure expects the stack at entry to be set, so an REI instruction correctly returns from the procedure. Also, if set, the contents of the RSA$Q_SAVED_RETURN field in the register save area are unpredictable and the return address is found on the stack (see Figure 3.4, “Register Save Area (RSA) Layout”). | ||
Bit 9 |
Must be 0 (reserved). | ||
PDSC$V_BASE_FRAME |
For compiled code, this bit must be set to 0. If set to 1, this bit indicates the logical base frame of a stack that precedes all frames corresponding to user code. The interpretation and use of this frame and whether there are any predecessor frames is system software defined (and subject to change). | ||
PDSC$V_TARGET_INVO |
If set to 1, the exception handler for this procedure is invoked when this procedure is the target invocation of an unwind. Note that a procedure is the target invocation of an unwind if it is the procedure in which execution resumes following completion of the unwind. For more information, see Chapter 9, OpenVMS Conditions. If set to 0, the exception handler for this procedure is not invoked. Note that when PDSC$V_HANDLER_VALID is 0, this bit must be 0. | ||
PDSC$V_NATIVE |
For compiled code, this bit must be set to 1. | ||
PDSC$V_NO_JACKET |
For compiled code, this bit must be set to 1. | ||
PDSC$V_TIE_FRAME |
For compiled code, this bit must be 0. Reserved for use by system software. | ||
Bit 15 |
Must be 0 (reserved). | ||
PDSC$W_RSA_OFFSET |
Signed offset in bytes between the stack frame base (SP or FP as indicated by PDSC$V_BASE_REG_IS_FP) and the register save area. This field must be a multiple of 8, so that PDSC$W_RSA_OFFSET added to the contents of SP or FP (PDSC$V_BASE_REG_IS_FP) yields a quadword-aligned address. | ||
PDSC$V_FUNC_RETURN |
A 4-bit field <11:8> that describes which registers are used for the function value return (if there is one) and what format is used for those registers. Table 6.4, “Function Return Signature Encodings” lists and describes the possible encoded values of PDSC$V_FUNC_RETURN. | ||
PDSC$V_ |
A 3-bit field <14:12> that encodes the caller's desired exception-reporting behavior when calling certain mathematically oriented library routines. These routines generally search up the call stack to find the desired exception behavior whenever an error is detected. This search is performed independent of the setting of the Alpha FPCR. The possible values for this field are defined as follows: | ||
Value |
Name |
Meaning | |
0 |
PDSC$K_EXC_ |
Raise exceptions for all error conditions except for underflows producing a 0 result. This is the default mode. | |
1 |
PDSC$K_EXC_ |
Raise exceptions for all error conditions (including underflow). | |
2 |
PDSC$K_EXC_ |
Raise no exceptions. Create only finite values (no infinities, denormals, or NaNs). In
this mode, either the function result or the C language | |
3 |
PDSC$K_EXC_ |
Raise no exceptions except as controlled by separate IEEE exception enable bits. Create infinities, denormals, or NaN values according to the IEEE floating-point standard. | |
4 |
PDSC$K_EXC_ |
Perform the exception-mode behavior specified by this procedure's caller. | |
PDSC$W_ |
A 16-bit signed byte offset from the start of the procedure descriptor. This offset designates the start of the procedure signature block (if any). A 0 in this field indicates that no signature information is present. Note that in a bound procedure descriptor (as described in Section 3.6.4, “Simple and Bound Procedures”), signature information might be present in the related procedure descriptor. A 1 in this field indicates a standard default signature. An offset value of 1 is not otherwise a valid offset because both procedure descriptors and signature blocks must be quadword aligned. | ||
PDSC$Q_ENTRY |
Absolute address of the first instruction of the entry code sequence for the procedure. | ||
PDSC$L_SIZE |
Unsigned size, in bytes, of the fixed portion of the stack frame for this procedure. The size must be a multiple of 16 bytes to maintain the minimum stack alignment required by the Alpha hardware architecture and stack alignment during a call (defined in Section 3.6.1, “Call Conventions”). PDSC$L_SIZE cannot be 0 for a stack-frame type procedure, because the stack frame must include space for the register save area. The value of SP at entry to this procedure can be calculated by adding PDSC$L_SIZE to the value SP or FP, as indicated by PDSC$V_BASE_REG_IS_FP. | ||
PDSC$W_ENTRY_ |
Unsigned offset, in bytes, from the entry point to the first instruction in the procedure code segment following the procedure prologue (that is, following the instruction that updates FP to establish this procedure as the current procedure). | ||
PDSC$L_IREG_MASK |
Bit vector (0-31) specifying the integer registers that are saved in the register save area on entry to the procedure. The least significant bit corresponds to register R0. Never set bits 31, 30, 28, 1, and 0 of this mask, because R31 is the integer read-as-zero register, R30 is the stack pointer, R28 is always assumed to be destroyed during a procedure call or return, and R1 and R0 are never preserved registers. In this calling standard, bit 29 (corresponding to the FP) must always be set. | ||
PDSC$L_FREG_MASK |
Bit vector (0-31) specifying the floating-point registers saved in the register save area on entry to the procedure. The least significant bit corresponds to register F0. Never set bit 31 of this mask, because it corresponds to the floating-point read-as-zero register. | ||
PDSC$Q_STACK_HANDLER |
Absolute address to the procedure descriptor for a run-time static exception handling procedure. This part of the procedure descriptor is optional. It must be supplied if either PDSC$V_HANDLER_VALID is 1 or PDSC$V_HANDLER_DATA_VALID is 1 (which requires that PDSC$V_HANDLER_VALID be 1). If PDSC$V_HANDLER_VALID is 0, then the contents or existence of PDSC$Q_STACK_HANDLER is unpredictable. | ||
PDSC$Q_STACK_HANDLER_DATA |
Data (quadword) for the exception handler. This is an optional quadword and needs to be supplied only if PDSC$V_HANDLER_DATA_VALID is 1. If PDSC$V_HANDLER_DATA_VALID is 0, then the contents or existence of PDSC$Q_STACK_HANDLER_DATA is unpredictable. |
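Several of the field constraints in the table lend themselves to mechanical checks. The following C sketch shows how a tool might verify them. All function names are hypothetical, the value 9 is the PDSC$K_KIND_FP_STACK value given in the table, and only constraints stated in the table are checked.

```c
#include <stdint.h>

#define KIND_FP_STACK 9u   /* value of PDSC$K_KIND_FP_STACK per the table */

/* PDSC$V_KIND occupies bits <3:0> of PDSC$W_FLAGS. */
unsigned pdsc_kind(uint16_t flags)
{
    return flags & 0xFu;
}

/* PDSC$L_IREG_MASK: bits 31, 30, 28, 1, and 0 must never be set,
 * and bit 29 (FP) must always be set. */
int ireg_mask_is_valid(uint32_t mask)
{
    const uint32_t forbidden =
        (1u << 31) | (1u << 30) | (1u << 28) | (1u << 1) | (1u << 0);
    const uint32_t required = 1u << 29;
    return (mask & forbidden) == 0 && (mask & required) != 0;
}

/* PDSC$L_FREG_MASK: bit 31 (F31, read-as-zero) must never be set. */
int freg_mask_is_valid(uint32_t mask)
{
    return (mask & (1u << 31)) == 0;
}

/* PDSC$L_SIZE: nonzero multiple of 16 for a stack frame procedure. */
int frame_size_is_valid(uint32_t size)
{
    return size != 0 && size % 16 == 0;
}

/* PDSC$W_RSA_OFFSET: signed offset that must be a multiple of 8. */
int rsa_offset_is_valid(int16_t offset)
{
    return offset % 8 == 0;
}
```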
3.4.3. Stack Frame Format
A stack frame takes one of two basic forms:
Fixed size
Variable size
Even though the exact contents of a stack frame are determined by the compiler, all stack frames have common characteristics.
When PDSC$V_BASE_REG_IS_FP is 0 and PDSC$L_SIZE is 0, then the procedure utilizes no stack storage and SP contains the value of SP at entry to the procedure. (Such a procedure must be a register frame procedure).
When PDSC$V_BASE_REG_IS_FP is 0 and PDSC$L_SIZE is a nonzero value, then the procedure has a fixed amount of stack storage specified by PDSC$L_SIZE, all of which is allocated in the procedure entry sequence, and SP is modified by this procedure only in the entry and exit code sequences. (Such a procedure may not make standard calls).
When PDSC$V_BASE_REG_IS_FP is 1 and PDSC$L_SIZE is a nonzero value, then the procedure has a fixed amount of stack storage specified by PDSC$L_SIZE, and may have a variable amount of stack storage allocated by modifying SP in the body of the procedure. (Such a procedure must be a stack frame procedure).
The combination when PDSC$V_BASE_REG_IS_FP is 1 and PDSC$L_SIZE is 0 is illegal because it violates the rules for R29 (FP) usage that requires R29 to be saved (on the stack) and restored.
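The four combinations above can be summarized as a small classification function. The following C sketch is illustrative; the type `frame_form` and the function `classify_frame` are hypothetical names.

```c
#include <stdint.h>

typedef enum {
    FRAME_NO_STACK,   /* BASE_REG_IS_FP = 0, SIZE = 0: no stack storage */
    FRAME_FIXED,      /* BASE_REG_IS_FP = 0, SIZE != 0: fixed stack storage */
    FRAME_VARIABLE,   /* BASE_REG_IS_FP = 1, SIZE != 0: fixed plus variable */
    FRAME_ILLEGAL     /* BASE_REG_IS_FP = 1, SIZE = 0: not permitted */
} frame_form;

/* Classify a procedure from PDSC$V_BASE_REG_IS_FP and PDSC$L_SIZE. */
frame_form classify_frame(int base_reg_is_fp, uint32_t size)
{
    if (!base_reg_is_fp)
        return size == 0 ? FRAME_NO_STACK : FRAME_FIXED;
    return size == 0 ? FRAME_ILLEGAL : FRAME_VARIABLE;
}
```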
3.4.3.1. Fixed-Size Stack Frame
Figure 3.2, “Fixed-Size Stack Frame Format” illustrates the format of the stack frame for a procedure with a fixed amount of stack that uses the SP register as the stack base pointer (when PDSC$V_BASE_REG_IS_FP is 0). In this case, R29 (FP) typically contains the address of the procedure descriptor for the current procedure (see Section 3.5.1, “Current Procedure”).
Some parts of the stack frame are optional and occur only as required by the particular procedure. As shown in the figure, the field names within brackets are optional fields. Use of the arguments passed in memory field, which is appended to the end of the descriptor, is described in Section 3.4.3.3, “Fixed Temporary Locations for All Stack Frames” and Section 3.7.2, “Argument List Structure”.
For information describing the fixed temporary locations and register save area, see Section 3.4.3.3, “Fixed Temporary Locations for All Stack Frames” and Section 3.4.3.4, “Register Save Area for All Stack Frames”.
3.4.3.2. Variable-Size Stack Frame
Figure 3.3, “Variable-Size Stack Frame Format” illustrates the format of the stack frame for procedures with a varying amount of stack when PDSC$V_BASE_REG_IS_FP is 1. In this case, R29 (FP) contains the address that points to the base of the stack frame on the stack. This frame-base quadword location contains the address of the current procedure's descriptor.
Some parts of the stack frame are optional and occur only as required by the particular procedure. In Figure 3.3, “Variable-Size Stack Frame Format”, field names within brackets are optional fields. Use of the arguments passed in memory field, which is appended to the end of the descriptor, is described in Section 3.4.3.3, “Fixed Temporary Locations for All Stack Frames” and Section 3.7.2, “Argument List Structure”.
For more information describing the fixed temporary locations and register save area, see Section 3.4.3.3, “Fixed Temporary Locations for All Stack Frames” and Section 3.4.3.4, “Register Save Area for All Stack Frames”.
A compiler can use the stack temporary area pointed to by the SP base register for fixed local variables, such as constant-sized data items and program state, as well as for dynamically sized local variables. The stack temporary area may also be used for dynamically sized items with a limited lifetime, for example, a dynamically sized function result or string concatenation that cannot be stored directly in a target variable. When a procedure uses this area, the compiler must keep track of its base and reset SP to the base to reclaim storage used by temporaries.
3.4.3.3. Fixed Temporary Locations for All Stack Frames
The fixed temporary locations are optional sections of any stack frame that contain language-specific locations required by the procedure context of some high-level languages. This may include, for example, register spill area, language-specific exception handling context (such as language-dynamic exception handling information), fixed temporaries, and so on.
The argument home area (if allocated by the compiler) can be found with the PDSC$L_SIZE offset in the last fixed temporary locations at the end of the stack frame. It is adjacent to the arguments passed in memory area to expedite the use of arguments passed (without copying). The argument home area is a region of memory used by the called procedure for the purpose of assembling in contiguous memory the arguments passed in registers, adjacent to the arguments passed in memory, so all arguments can be addressed as a contiguous array. This area can also be used to store arguments passed in registers if an address for such an argument must be generated. Generally, 6 * 8 bytes of stack storage is allocated for this purpose by the called procedure.
If a procedure needs to reference its arguments as a longword array or construct a structure that looks like an in-memory longword argument list, then it might allocate enough longwords in this area to hold all of the argument list and, optionally, an argument count. In that case, argument items passed in memory must be copied to this longword array.
The high-address end of the stack frame is defined by the value stored in PDSC$L_SIZE plus the contents of SP or FP, as indicated by PDSC$V_BASE_REG_IS_FP. The high-address end is used to determine the value of SP for the predecessor procedure in the calling chain.
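The calculation above can be sketched in C. This is a minimal illustration, not part of the standard: the `frame_info` structure and the `frame_high_end` function are hypothetical names that carry only the two descriptor fields the paragraph mentions.

```c
#include <stdint.h>
#include <stdbool.h>

/* Hypothetical view of the two descriptor fields this computation needs. */
struct frame_info {
    uint32_t size;            /* PDSC$L_SIZE: fixed frame size in bytes */
    bool     base_reg_is_fp;  /* PDSC$V_BASE_REG_IS_FP */
};

/* Return the high-address end of the frame: PDSC$L_SIZE plus the base
   register (FP if PDSC$V_BASE_REG_IS_FP is 1, otherwise SP). This value
   is the SP of the predecessor procedure in the calling chain. */
uint64_t frame_high_end(const struct frame_info *pd, uint64_t sp, uint64_t fp)
{
    uint64_t base = pd->base_reg_is_fp ? fp : sp;
    return base + pd->size;
}
```

For a fixed-size frame the result depends only on SP; for a variable-size frame it depends only on FP, so SP can move freely within the procedure body.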
3.4.3.4. Register Save Area for All Stack Frames
The register save area is a set of consecutive quadwords in which registers saved and restored by the current procedure are stored (see Figure 3.4, “Register Save Area (RSA) Layout”). The register save area begins at the location pointed to by the offset PDSC$W_RSA_OFFSET from the frame base register (SP or FP as indicated by PDSC$V_BASE_REG_IS_FP), which must yield a quadword-aligned address. The set of registers saved in this area contains the return address followed by the registers specified in the procedure descriptor by PDSC$L_IREG_MASK and PDSC$L_FREG_MASK.
All registers saved in the register save area (other than the saved return address) must have the corresponding bit set in the appropriate procedure descriptor register save mask even if the register is not a member of the set of registers required to be saved across a standard call. Failure to do so will prevent the correct calculation of offsets within the save area.
Figure 3.4, “Register Save Area (RSA) Layout” illustrates the fields in the register save area (field names within brackets are optional fields). Quadword RSA$Q_SAVED_RETURN is the first field in the save area and it contains the contents of the return address register. The optional fields vary in size (8-byte increments) to preserve, as required, the contents of the integer and floating-point hardware registers used in the procedure.
- The return address is saved at the lowest address of the register save area (offset 0).
- All saved integer registers (as indicated by the corresponding bit in PDSC$L_IREG_MASK being set to 1) are stored, in register-number order, in consecutive quadwords, beginning at offset 8 of the register save area.
- All saved floating-point registers (as indicated by the corresponding bit in PDSC$L_FREG_MASK being set to 1) are stored, in register-number order, in consecutive quadwords, following the saved integer registers.
Note
Floating-point registers saved in the register save area are stored as a 64-bit exact image of the register (for example, no reordering of bits is done on the way to or from memory). Compilers must use an STT instruction to store the register regardless of floating-point type.
The preserved register set must always include R29 (FP), because it will always be used.
If the return address register is not to be preserved (as is the case for a standard call), then it must be stored at offset 0 in the register save area and the corresponding bit in the register save mask must not be set.
However, if a nonstandard call is made that requires the return address register to be saved and restored, then it must be stored in both the location at offset 0 in the register save area and at the appropriate location within the variable part of the save area. In addition, the appropriate bit of PDSC$L_IREG_MASK must be set to 1.
The example register save area shown in Figure 3.5, “Register Save Area (RSA) Example” illustrates the register packing when registers R10, R11, R15, FP, F2, and F3 are being saved for a procedure called with a standard call.
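The packing rules can be checked with a short C sketch. The function names are hypothetical and the masks are passed as plain 32-bit values standing in for PDSC$L_IREG_MASK and PDSC$L_FREG_MASK; only the rules stated in this section are assumed (return address at offset 0, integer registers from offset 8 in register-number order, floating-point registers after them).

```c
#include <stdint.h>

/* Number of bits set in a 32-bit register save mask. */
int count_set_bits(uint32_t mask)
{
    int n = 0;
    for (; mask != 0; mask >>= 1)
        n += (int)(mask & 1u);
    return n;
}

/* Byte offset within the register save area of saved integer register
   reg (0..31), or -1 if the mask says the register is not saved. */
int rsa_ireg_offset(uint32_t ireg_mask, int reg)
{
    if (reg < 0 || reg > 31 || ((ireg_mask >> reg) & 1u) == 0)
        return -1;
    /* Skip the return address quadword, then one quadword for each
       lower-numbered saved integer register. */
    uint32_t lower = ireg_mask & ((UINT32_C(1) << reg) - 1u);
    return 8 + 8 * count_set_bits(lower);
}

/* Byte offset of saved floating-point register reg (0..31), or -1. */
int rsa_freg_offset(uint32_t ireg_mask, uint32_t freg_mask, int reg)
{
    if (reg < 0 || reg > 31 || ((freg_mask >> reg) & 1u) == 0)
        return -1;
    /* Floating-point registers follow all saved integer registers. */
    uint32_t lower = freg_mask & ((UINT32_C(1) << reg) - 1u);
    return 8 + 8 * count_set_bits(ireg_mask) + 8 * count_set_bits(lower);
}
```

With the registers from the example (R10, R11, R15, FP, F2, and F3), the sketch places R10 at offset 8, R11 at 16, R15 at 24, FP (R29) at 32, F2 at 40, and F3 at 48. This is also why every saved register must appear in a mask: an unlisted register would shift every offset after it.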
3.4.4. Register Frame Procedure
A register frame procedure does not maintain a call frame on the stack and must, therefore, save its caller's context in registers. This type of procedure is sometimes referred to as a lightweight procedure, referring to the expedient way of saving the call context.
Such a procedure cannot save and restore nonscratch registers. Because a procedure without a stack frame must use scratch registers to maintain the caller's context, such a procedure cannot make a standard call to any other procedure.
Note
Lightweight procedures have more freedom than might be apparent. By using appropriate agreements with callers of the lightweight procedure, with procedures that the lightweight procedure calls, and by the use of unwind handlers, a lightweight procedure can modify nonscratch registers and can call other procedures.
Such agreements may be by convention (as in the case of language-support routines in the RTL) or by interprocedural analysis. However, calls employing such agreements are not standard calls and might not be fully supported by a debugger; for example, the debugger might not be able to find the contents of the preserved registers.
Because such agreements must be permanent (for upwards compatibility of object code), lightweight procedures should, in general, follow the normal restrictions.
3.4.5. Procedure Descriptor for Procedures with a Register Frame
A register frame procedure descriptor built by a compiler provides information about a procedure with a register frame. The minimum size of the descriptor is 24 bytes (defined by PDSC$K_MIN_REGISTER_SIZE). An optional PDSC extension in 8-byte increments supports exception handling requirements.
The fields defined in the register frame procedure descriptor are illustrated in Figure 3.6, “Register Frame Procedure Descriptor (PDSC)” and described in Table 3.4, “Contents of Register Frame Procedure Descriptor (PDSC)”.
Field Name | Contents
---|---
PDSC$W_FLAGS | The PDSC descriptor flag bits <15:0> are defined as follows:
PDSC$V_KIND | A 4-bit field <3:0> that identifies the type of procedure descriptor. For a procedure with a register frame, this field must specify a value 10 (defined by constant PDSC$K_KIND_FP_REGISTER).
PDSC$V_HANDLER_VALID | If set to 1, this descriptor has an extension for the stack handler (PDSC$Q_REG_HANDLER) information.
PDSC$V_HANDLER_ | If set to 1, the handler can be reinvoked, allowing an occurrence of another exception while the handler is already active. If this bit is set to 0, the exception handler cannot be reinvoked. This bit must be 0 when PDSC$V_HANDLER_VALID is 0.
PDSC$V_HANDLER_ | If set to 1, the HANDLER_VALID bit must be 1 and the PDSC extension STACK_HANDLER_DATA field contains valid data for the exception handler, and the address of PDSC$Q_STACK_HANDLER_DATA will be passed to the exception handler as defined in Section 9.2, “Condition Handlers”.
PDSC$V_BASE_REG_IS_FP | If this bit is set to 0, the SP is the base register to which PDSC$L_SIZE is added during an unwind. A fixed amount of storage is allocated in the procedure entry sequence, and SP is modified by this procedure only in the entry and exit code sequence. In this case, FP typically contains the address of the procedure descriptor for the procedure. Note that a procedure that sets this bit to 0 cannot make standard calls. If this bit is set to 1, FP is the base address and the procedure has a fixed amount of stack storage specified by PDSC$L_SIZE. A variable amount of stack storage can be allocated by modifying SP in the entry and exit code of this procedure.
PDSC$V_REI_RETURN | If set to 1, the procedure expects the stack at entry to be set, so an REI instruction correctly returns from the procedure. Also, if set, the contents of the PDSC$B_SAVE_RA field are unpredictable and the return address is found on the stack.
Bit 9 | Must be 0 (reserved).
PDSC$V_BASE_FRAME | For compiled code, this bit must be 0. If set to 1, this bit indicates the logical base frame of a stack that precedes all frames corresponding to user code. The interpretation and use of this frame and whether there are any predecessor frames is system software defined (and subject to change).
PDSC$V_TARGET_INVO | If set to 1, the exception handler for this procedure is invoked when this procedure is the target invocation of an unwind. Note that a procedure is the target invocation of an unwind if it is the procedure in which execution resumes following completion of the unwind. For more information, see Chapter 9, OpenVMS Conditions. If set to 0, the exception handler for this procedure is not invoked. Note that when PDSC$V_HANDLER_VALID is 0, this bit must be 0.
PDSC$V_NATIVE | For compiled code, this bit must be set to 1.
PDSC$V_NO_JACKET | For compiled code, this bit must be set to 1.
PDSC$V_TIE_FRAME | For compiled code, this bit must be 0. Reserved for use by system software.
Bit 15 | Must be 0 (reserved).
PDSC$B_SAVE_FP | Specifies the number of the register that contains the saved value of the frame pointer (FP) register. In a standard procedure, this field must specify a scratch register so as not to violate the rules for procedure entry code as specified in Section 3.6.5, “Entry and Exit Code Sequences”.
PDSC$B_SAVE_RA | Specifies the number of the register that contains the return address. If this procedure uses standard call conventions and does not modify R26, then this field can specify R26. In a standard procedure, this field must specify a scratch register so as not to violate the rules for procedure entry code as specified in Section 3.6.5, “Entry and Exit Code Sequences”.
PDSC$V_FUNC_RETURN | A 4-bit field <11:8> that describes which registers are used for the function value return (if there is one) and what format is used for those registers. Table 6.4, “Function Return Signature Encodings” lists and describes the possible encoded values of PDSC$V_FUNC_RETURN.
PDSC$V_ | A 3-bit field <14:12> that encodes the caller's desired exception-reporting behavior when calling certain mathematically oriented library routines. These routines generally search up the call stack to find the desired exception behavior whenever an error is detected. This search is performed independent of the setting of the Alpha FPCR. The possible values for this field are defined as follows:
 | Value 0 (PDSC$K_EXC_): Raise exceptions for all error conditions except for underflows producing a 0 result. This is the default mode.
 | Value 1 (PDSC$K_EXC_): Raise exceptions for all error conditions (including underflows).
 | Value 2 (PDSC$K_EXC_): Raise no exceptions. Create only finite values (no infinities, denormals, or NaNs). In this mode, either the function result or the C language
 | Value 3 (PDSC$K_EXC_): Raise no exceptions except as controlled by separate IEEE exception enable bits. Create infinities, denormals, or NaN values according to the IEEE floating-point standard.
 | Value 4 (PDSC$K_EXC_): Perform the exception-mode behavior specified by this procedure's caller.
PDSC$W_ | A 16-bit signed byte offset from the start of the procedure descriptor. This offset designates the start of the procedure signature block (if any). A 0 in this field indicates no signature information is present. Note that in a bound procedure descriptor (as described in Section 3.6.4, “Simple and Bound Procedures”), signature information might be present in the related procedure descriptor. A 1 in this field indicates a standard default signature. An offset value of 1 is not otherwise a valid offset because both procedure descriptors and signature blocks must be quadword aligned.
PDSC$Q_ENTRY | Absolute address of the first instruction of the entry code sequence for the procedure.
PDSC$L_SIZE | Unsigned size in bytes of the fixed portion of the stack frame for this procedure. The size must be a multiple of 16 bytes to maintain the minimum stack alignment required by the Alpha hardware architecture and stack alignment during a call (defined in Section 3.6.1, “Call Conventions”).
PDSC$W_ENTRY_ | Unsigned offset in bytes from the entry point to the first instruction in the procedure code segment following the procedure prologue (that is, following the instruction that updates FP to establish this procedure as the current procedure).
PDSC$Q_REG_HANDLER | Absolute address to the procedure descriptor for a run-time static exception handling procedure. This part of the procedure descriptor is optional. It must be supplied if either PDSC$V_HANDLER_VALID is 1 or PDSC$V_HANDLER_DATA_VALID is 1 (which requires that PDSC$V_HANDLER_VALID be 1). If PDSC$V_HANDLER_VALID is 0, then the contents or existence of PDSC$Q_REG_HANDLER is unpredictable.
PDSC$Q_REG_HANDLER_DATA | Data (quadword) for the exception handler. This is an optional quadword and needs to be supplied only if PDSC$V_HANDLER_DATA_VALID is 1. If PDSC$V_HANDLER_DATA_VALID is 0, then the contents or existence of PDSC$Q_REG_HANDLER_DATA is unpredictable.
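The bit-field positions given in the table can be illustrated with a few C helpers. This is a sketch only: the function names are hypothetical, and each helper takes the 16-bit word that holds the field as an argument, because the placement of those words within the descriptor comes from Figure 3.6 and is not assumed here.

```c
#include <stdint.h>

/* Value stated in the table for a register frame procedure. */
enum { KIND_FP_REGISTER = 10 };   /* PDSC$K_KIND_FP_REGISTER */

/* PDSC$V_KIND: bits <3:0> of the flags word. */
unsigned pdsc_kind(uint16_t flags)
{
    return flags & 0xFu;
}

/* PDSC$V_FUNC_RETURN: bits <11:8> of the word that holds it. */
unsigned pdsc_func_return(uint16_t word)
{
    return (word >> 8) & 0xFu;
}

/* Exception-mode field: bits <14:12> of the word that holds it. */
unsigned pdsc_exception_mode(uint16_t word)
{
    return (word >> 12) & 0x7u;
}
```

For example, a word value of 0x4300 decodes to a function return encoding of 3 and an exception mode of 4 (use the caller's exception mode).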
3.4.6. Null Frame Procedures
A procedure may conform to this standard even if it does not establish its own context if, in all circumstances, invocations of that procedure do not need to be visible or debuggable. This is termed executing in the context of the caller and is similar in concept to a conventional VAX JSB procedure. For the purposes of stack tracing or unwinding, such a procedure is never considered to be current.
For example, if a procedure does not establish an exception handler, does not save and restore registers, and does not extend the stack, then that procedure might not need to establish a context. Likewise, if that procedure does extend the stack, it still might not need to establish a context if the immediate caller either cannot be the target of an unwind or is prepared to reset the stack if it is the target of an unwind.
The circumstances under which procedures can run in the context of the caller are complex and are not fully specified by this standard.
As with the other procedure types previously described, the choice of whether to establish a context belongs to the called procedure. By defining a null procedure descriptor format, the same invocation code sequence can be used by the caller for all procedure types.
3.4.7. Procedure Descriptor for Null Frame Procedures
The null frame procedure descriptor built by a compiler provides information about a procedure with no frame. The size of the descriptor is 16 bytes (defined by PDSC$K_NULL_SIZE).
The fields defined in the null frame descriptor are illustrated in Figure 3.7, “Null Frame Procedure Descriptor (PDSC) Format” and described in Table 3.5, “Contents of Null Frame Procedure Descriptor (PDSC)”.
Field Name | Contents
---|---
PDSC$W_FLAGS | The PDSC descriptor flag bits <15:0> are defined as follows:
PDSC$V_KIND | A 4-bit field <3:0> that identifies the type of procedure descriptor. For a null frame procedure, this field must specify a value 8 (defined by constant PDSC$K_KIND_NULL).
Bits 4-7 | Must be 0.
PDSC$V_REI_RETURN | Bit 8. If set to 1, the procedure expects the stack at entry to be set, so an REI instruction correctly returns from the procedure. Also, if set, the contents of the PDSC$B_SAVE_RA field are unpredictable and the return address is found on the stack.
Bit 9 | Must be 0 (reserved).
PDSC$V_BASE_FRAME | For compiled code, this bit must be 0. If set to 1, indicates the logical base frame of a stack that precedes all frames corresponding to user code. The interpretation and use of this frame and whether there are any predecessor frames is system software defined (and subject to change).
Bit 11 | Must be 0 (reserved).
PDSC$V_NATIVE | For compiled code, this bit must be set to 1.
PDSC$V_NO_JACKET | For compiled code, this bit must be set to 1.
PDSC$V_TIE_FRAME | For compiled code, this bit must be 0. Reserved for use by system software.
Bit 15 | Must be 0 (reserved).
PDSC$V_FUNC_RETURN | A 4-bit field <11:8> that describes which registers are used for the function value return (if there is one) and what format is used for those registers. Table 6.4, “Function Return Signature Encodings” lists and describes the possible encoded values of PDSC$V_FUNC_RETURN.
PDSC$W_SIGNATURE_ | A 16-bit signed byte offset from the start of the procedure descriptor. This offset designates the start of the procedure signature block (if any). A 0 in this field indicates that no signature information is present. Note that in a bound procedure descriptor (as described in Section 3.6.4, “Simple and Bound Procedures”), signature information might be present in the related procedure descriptor. A 1 in this field indicates a standard default signature. An offset value of 1 is not otherwise a valid offset because both procedure descriptors and signature blocks must be quadword aligned.
PDSC$Q_ENTRY | The absolute address of the first instruction of the entry code sequence for the procedure.
3.5. Procedure Call Stack
Except for null-frame procedures, a procedure is an active procedure while its body is executing, including while any procedure it calls is executing. When a procedure is active, it may handle an exception that is signaled during its execution.
Associated with each active procedure is an invocation context, which consists of the set of registers and space in memory that is allocated and that may be accessed during execution for a particular call of that procedure.
When a procedure begins to execute, it has no invocation context. The initial instructions that allocate and initialize its context, which may include saving information from the invocation context of its caller, are termed the procedure prologue. Once execution of the prologue is complete, the procedure is said to be active.
When a procedure is ready to return to its caller, the instructions that deallocate and discard the procedure's invocation context (which may include restoring state of the caller's invocation context that was saved during the prologue), are termed a procedure epilogue. A procedure ceases to be active when execution of its epilogue begins.
A procedure may have more than one prologue if there are multiple entry points. A procedure may also have more than one epilogue if there are multiple return points. One of each will be executed during any given invocation of the procedure.
Some procedures, notably null frame procedures (see Section 3.4.6, “Null Frame Procedures”), never have an invocation context of their own and are said to execute in the body of their caller. A null frame procedure has no prologue or epilogue, and consists solely of body instructions. Such a procedure never becomes current or active in the sense that its handler may be invoked.
A call stack (for a thread) consists of the stack of invocation contexts that exists at any point in time. New invocation contexts are pushed on that stack as procedures are called and invocations are popped from the call stack as procedures return.
The invocation context of a procedure that calls another procedure is said to precede or be previous to the invocation context of the called procedure.
3.5.1. Current Procedure
The current procedure is the active procedure whose execution began most recently; its invocation context is at the top of the call stack. Note that a procedure executing in its prologue or epilogue is not active, and hence cannot be the current procedure. Similarly, a null frame procedure cannot be the current procedure.
In this calling standard, R29 is the frame pointer (FP) register that defines the current procedure. While a procedure is current, FP contains one of the following values:
- Pointer to the procedure descriptor for that procedure.
- Pointer to a naturally aligned quadword containing the address of the procedure descriptor for that procedure. For purposes of finding a procedure's procedure descriptor, no assumptions may be made about the location of this quadword. As long as all other requirements of this standard are met, a compiler is free to use FP as a base register for any arbitrary storage, including a stack frame, provided that while the procedure is current, the quadword pointed to by the value in FP contains the address of that procedure's descriptor.

The quadword at 0(FP) indicates which of the two forms is in use:
- If 0(FP)<2:0> = 0, then FP points to a quadword that contains a pointer to the procedure descriptor for the current procedure.
- If 0(FP)<2:0> ≠ 0, then FP points to the procedure descriptor for the current procedure.
By examining the first quadword of the procedure descriptor, the procedure type can be determined from the PDSC$V_KIND field.
          LDQ   R0,0(FP)      ;Fetch quadword at FP
          AND   R0,#7,R28     ;Mask alignment bits
          BNEQ  R28,20$       ;Is procedure descriptor pointer
          LDQ   R0,0(R0)      ;Was pointer to procedure descriptor
    10$:  AND   R0,#7,R28     ;Do sanity check
          BNEQ  R28,20$       ;All is well
                              ;Error - Invalid FP
    20$:  AND   R0,#15,R0     ;Get kind bits
                              ;Procedure KIND is now in R0
If PDSC$V_KIND is equal to PDSC$K_KIND_FP_STACK, the current procedure has a stack frame.
If PDSC$V_KIND is equal to PDSC$K_KIND_FP_REGISTER, the current procedure is a register frame procedure.
Either type of procedure can use either type of mechanism to point to the procedure descriptor. Compilers may choose the appropriate mechanism to use based on the needs of the procedure involved.
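The same interpretation of FP can be modeled in C for readers who do not read Alpha assembly. This is an illustrative sketch under stated assumptions: memory is modeled with ordinary host pointers on a 64-bit host, the function name `current_procedure_kind` is hypothetical, and the null-pointer check is an addition for host safety that the assembly sequence does not perform.

```c
#include <stdint.h>

/* Return the PDSC$V_KIND value found through fp, or -1 if fp is invalid.
   fp models the FP register: it points either at the descriptor itself
   or at a quadword that holds the descriptor's address. */
int current_procedure_kind(const uint64_t *fp)
{
    uint64_t q = *fp;                       /* quadword at 0(FP) */

    if ((q & 7u) == 0) {
        /* Low three bits clear: q is the address of the descriptor. */
        if (q == 0)
            return -1;                      /* host-safety check (assumption) */
        const uint64_t *pdsc = (const uint64_t *)(uintptr_t)q;
        q = *pdsc;                          /* first quadword of descriptor */
        if ((q & 7u) == 0)
            return -1;                      /* sanity check failed: invalid FP */
    }
    return (int)(q & 15u);                  /* kind bits <3:0> */
}
```

Both pointer forms yield the same result, which is why a compiler may choose either mechanism for either type of procedure.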
3.5.2. Procedure Call Tracing
Procedure call tracing relies on mechanisms that support the following functions:
- To provide the context of a procedure invocation
- To walk (navigate) the procedure call stack
- To refer to a given procedure invocation
This section describes the data structure mechanisms. The routines that support these functions are described in Section 3.5.3, “Invocation Context Access Routines”.
3.5.2.1. Referring to a Procedure Invocation from a Data Structure
When referring to a specific procedure invocation at run time, an invocation context handle, shown in Figure 3.8, “Invocation Context Handle Format”, can be used. The structure, whose size is defined by the constant LIBICB$K_INVO_HANDLE_SIZE, is a single longword field called HANDLE, which holds the invocation handle of the procedure. The handle (INVO_HANDLE in the steps that follow) is encoded as follows:
If PDSC$V_BASE_REG_IS_FP is set to 1 in the corresponding procedure descriptor, then set INVO_HANDLE to the contents of the FP register in that invocation.
If PDSC$V_BASE_REG_IS_FP is set to 0, set INVO_HANDLE to the contents of the SP register in that invocation. (That is, start with the base register value for the frame).
Shift the INVO_HANDLE contents left one bit. Because this value is initially known to be octaword aligned (see Section 3.6.1, “Call Conventions”), the result is a value whose 5 low-order bits are 0.
If PDSC$V_KIND = PDSC$K_KIND_FP_STACK, perform a logical OR on the contents of INVO_HANDLE with the value 1F (hexadecimal), and then set INVO_HANDLE to the value that results.
If PDSC$V_KIND = PDSC$K_KIND_FP_REGISTER, perform a logical OR on the contents of INVO_HANDLE with the contents of PDSC$B_SAVE_RA, and then set INVO_HANDLE to the value that results.
Note
The register number that saved the return address is included in the invocation handle of a register frame procedure to distinguish an invocation of a register frame procedure that calls another register frame procedure (where the called procedure uses no stack space and therefore has the same base register value as the caller). Similarly, the number 31 (decimal) is included in the invocation handle of a stack frame procedure to distinguish an invocation of a stack frame procedure that calls a register frame procedure where the called procedure uses no stack space.
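The encoding steps can be sketched in C as follows. The names `frame_kind` and `make_invo_handle` are hypothetical, and keeping only the low 32 bits of the shifted base address is an assumption made here because the handle is described as a longword.

```c
#include <stdint.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the two PDSC$V_KIND cases used here. */
enum frame_kind { FRAME_STACK, FRAME_REGISTER };

/* Build an invocation handle from the frame's base register value. */
uint32_t make_invo_handle(uint64_t fp, uint64_t sp, bool base_reg_is_fp,
                          enum frame_kind kind, uint8_t save_ra)
{
    /* Start with the base register value for the frame. */
    uint64_t base = base_reg_is_fp ? fp : sp;

    /* Shift left one bit. The base is octaword aligned, so the five
       low-order bits of the result are 0. Truncation to a longword is
       an assumption of this sketch. */
    uint32_t handle = (uint32_t)(base << 1);

    /* OR in 1F (hexadecimal) for a stack frame, or the number of the
       register holding the return address (PDSC$B_SAVE_RA) for a
       register frame. */
    if (kind == FRAME_STACK)
        handle |= 0x1Fu;
    else
        handle |= (uint32_t)(save_ra & 0x1Fu);
    return handle;
}
```

A stack frame procedure and a register frame procedure that share the base value 7FF0 (hexadecimal) produce different handles (FFFF and, with the return address in R26, FFFA), which is the distinction the note describes.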
3.5.2.2. Invocation Context Block
The context of a specific procedure invocation is provided through the use of a data structure called an invocation context block. The minimum size of the block is 528 bytes and is system defined using the constant LIBICB$K_INVO_CONTEXT_BLK_SIZE. The size of the last field (LIBICB$Q_SYSTEM_DEFINED[n]) defined by the host system determines the total size of the block.
The fields defined in the invocation context block are illustrated in the following figure and described in Table 3.6, “Contents of the Invocation Context Block”.
Field Name | Contents
---|---
LIBICB$L_CONTEXT_LENGTH | Unsigned count of the total length in bytes of the context block; this represents the sum of the lengths of the standard-defined portion and the system-defined section.
LIBICB$R_FRAME_FLAGS | The procedure frame flag bits <23:0> are defined as follows:
LIBICB$V_EXCEPTION_ | Bit 0. If set to 1, the invocation context corresponds to an exception frame.
LIBICB$V_AST_FRAME | Bit 1. If set to 1, the invocation context corresponds to an asynchronous trap.
LIBICB$V_BOTTOM_OF_STACK | Bit 2. If set to 1, the invocation context corresponds to a frame that has no predecessor.
LIBICB$V_BASE_FRAME | Bit 3. If set to 1, the BASE_FRAME bit is set in the FLAGS field of the associated procedure descriptor.
LIBICB$B_BLOCK_VERSION | A byte that defines the version of the context block. Because this block is currently the first version, the value is set to 1.
LIBICB$PH_PROCEDURE_ | Address of the procedure descriptor for this context.
LIBICB$Q_PROGRAM_ | Quadword that contains the current value of the procedure's program counter. For interrupted procedures, this is the same as the continuation program counter; for active procedures, this is the return address back into that procedure.
LIBICB$Q_PROCESSOR_ | Contains the current value of the processor status.
LIBICB$Q_IREG[ | Quadword that contains the current value of the integer register in the procedure (where
LIBICB$Q_FREG[ | Quadword that contains the current value of the floating-point register in the procedure (where
LIBICB$Q_SYSTEM_DEFINED[n] | A variable-sized area with locations defined in quadword increments by the host environment that contains procedure context information. These locations are not defined by this standard.
3.5.2.3. Getting a Procedure Invocation Context with a Routine
A thread can obtain its own context or the current context of any procedure invocation in the current call stack (given an invocation handle) by calling the run-time library functions defined in Section 3.5.3, “Invocation Context Access Routines”.
3.5.2.4. Walking the Call Stack
During the course of program execution, it is sometimes necessary to walk the call stack. Frame-based exception handling is one case where this is done. Call stack navigation is possible only in the reverse direction (in a latest-to-earliest or top-to-bottom sequence).
Given a program state (which contains a register set), build an invocation context block.
For the current routine, an initial invocation context block can be obtained by calling the LIB$GET_CURR_INVO_CONTEXT routine. See Section 3.5.3.2, “LIB$GET_CURR_INVO_CONTEXT”.
Repeatedly call the LIB$GET_PREV_INVO_CONTEXT routine until the end of the chain has been reached (as signified by 0 being returned). See Section 3.5.3.3, “LIB$GET_PREV_INVO_CONTEXT”.
The bottom of stack frame (end of the call chain) is indicated (LIBICB$V_BOTTOM_OF_STACK) when the target frame's saved FP value is 0.
Compilers are allowed to optimize high-level language procedure calls in such a way that they do not appear in the invocation chain. For example, inline procedures never appear in the invocation chain.
Make no assumptions about the relative positions of any memory used for procedure frame information. There is no guarantee that successive stack frames will always appear at higher addresses.
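The walking loop can be illustrated with a small host-side simulation in C. Everything here is a mock: `mock_frame`, `mock_icb`, `mock_get_prev_invo_context`, and `walk_depth` are hypothetical names, and the frame table is invented data. Only the return convention is taken from this section (1 on success, 0 when the starting context is already the bottom of the call stack); a real program would call LIB$GET_CURR_INVO_CONTEXT and LIB$GET_PREV_INVO_CONTEXT instead.

```c
#include <stddef.h>

/* Mock call chain: each frame records the index of its caller,
   or -1 if it has no predecessor. */
struct mock_frame { const char *name; int caller; };

static const struct mock_frame frames[] = {
    { "main",  -1 },
    { "parse",  0 },
    { "lex",    1 },
};

/* Mock invocation context block: current frame and bottom-of-stack flag. */
struct mock_icb { int frame; int bottom_of_stack; };

/* Mock of the "previous context" step: updates the block in place. */
int mock_get_prev_invo_context(struct mock_icb *icb)
{
    if (icb->bottom_of_stack)
        return 0;                                  /* end of chain */
    icb->frame = frames[icb->frame].caller;        /* move to the caller */
    icb->bottom_of_stack = (frames[icb->frame].caller < 0);
    return 1;
}

/* Count the frames visited when walking from start_frame to the bottom. */
int walk_depth(int start_frame)
{
    struct mock_icb icb = { start_frame, frames[start_frame].caller < 0 };
    int visited = 1;                               /* the starting frame */
    while (mock_get_prev_invo_context(&icb) == 1)
        visited++;
    return visited;
}
```

The loop moves only from the latest frame toward the earliest and never inspects frame addresses, matching the rule that no assumptions can be made about where successive frames live in memory.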
3.5.3. Invocation Context Access Routines
A thread can manipulate the invocation context of any procedure in the thread's virtual address space by calling the following run-time library functions.
3.5.3.1. LIB$GET_INVO_CONTEXT
LIB$GET_INVO_CONTEXT(invo_handle, invo_context)
Argument | OpenVMS Usage | Type | Access | Mechanism
---|---|---|---|---
invo_handle | invo_handle | longword (unsigned) | read | by value
invo_context | invo_context_blk | structure | write | by reference
|
Handle for the desired invocation. |
|
Address of an invocation context block into which the procedure context of the frame
specified by |
|
Status value. A value of 1 indicates success; a value of 0 indicates failure. |
Note
If the invocation handle that was passed does not represent any procedure context in the active call stack, the value of the new contents of the context block is unpredictable.
3.5.3.2. LIB$GET_CURR_INVO_CONTEXT
LIB$GET_CURR_INVO_CONTEXT(invo_context)
Argument | OpenVMS Usage | Type | Access | Mechanism
---|---|---|---|---
invo_context | invo_context_blk | structure | write | by reference
|
Address of an invocation context block into which the procedure context of the caller will be written. |
Zero |
This is to facilitate use in the implementation of the C language unwind
|
3.5.3.3. LIB$GET_PREV_INVO_CONTEXT
LIB$GET_PREV_INVO_CONTEXT(invo_context)
Argument | OpenVMS Usage | Type | Access | Mechanism
---|---|---|---|---
invo_context | invo_context_blk | structure | modify | by reference
|
Address of an invocation context block. The given invocation context block is updated to represent the context of the previous (calling) frame. The LIBICB$V_BOTTOM_OF_STACK flag of the invocation context block is set if the target frame represents the end of the invocation call chain or if stack corruption is detected. |
|
Status value. A value of 1 indicates success. When the initial context represents the bottom of the call stack, a value of 0 is returned. If the current operation completed without error, but a stack corruption was detected at the next level down, a value of 3 is returned. |
3.5.3.4. LIB$GET_INVO_HANDLE
LIB$GET_INVO_HANDLE(invo_context)
Argument | OpenVMS Usage | Type | Access | Mechanism
---|---|---|---|---
invo_context | invo_context_blk | structure | read | by reference
|
Address of an invocation context block. Here, only the frame pointer and stack pointer fields of an invocation context block must be defined. |
|
Invocation handle of the invocation context that was passed. If the returned value is LIB$K_INVO_HANDLE_NULL, the invocation context that was passed was invalid. |
3.5.3.5. LIB$GET_PREV_INVO_HANDLE
LIB$GET_PREV_INVO_HANDLE(invo_handle)
Argument | OpenVMS Usage | Type | Access | Mechanism
---|---|---|---|---
invo_handle | invo_handle | longword (unsigned) | read | by value
|
An invocation handle that represents a target invocation context. |
|
An invocation handle for the invocation context that is previous to that which was specified as the target. |
3.5.3.6. LIB$PUT_INVO_REGISTERS
LIB$PUT_INVO_REGISTERS(invo_handle, invo_context, invo_mask)
Argument | OpenVMS Usage | Type | Access | Mechanism
---|---|---|---|---
invo_handle | invo_handle | longword (unsigned) | read | by value
invo_context | invo_context_blk | structure | read | by reference
invo_mask | mask_quadword | quadword (unsigned) | read | by reference
|
Handle for the invocation to be updated. |
|
Address of an invocation context block that contains new register contents. Each register that is set in the |
|
Address of a 64-bit bit vector, where each bit corresponds to a register field in
the passed |
|
Status value. A value of 1 indicates success. When the initial context represents
the bottom of the call stack or when bit 30 of the |
Caution
While this routine can be used to update the frame pointer (FP), great care must be taken to assure that a valid stack frame and execution environment result; otherwise, execution may become unpredictable.
3.6. Transfer of Control
This standard states that a standard call (see Section 1.4, “Definitions”) may be accomplished in any way that presents the called routine with the required environment. In practice, however, most standard-conforming external calls are implemented with a common sequence of instructions and conventions. Because this common set of call conventions is so pervasive, it is included for reference as part of this standard.
One important feature of the calling standard is that the same instruction sequence can be used to call each of the different types of procedure. Specifically, the caller does not have to know which type of procedure is being called.
3.6.1. Call Conventions
Procedure value
The calling procedure must pass to the called procedure its procedure value. This value can be a statically or dynamically bound procedure value. This is accomplished by loading R27 with the procedure value before control is transferred to the called procedure.
Return address
The calling procedure must pass to the called procedure the address to which control must be returned during a normal return from the called procedure. In most cases, the return address is the address of the instruction following the one that transferred control to the called procedure. For a standard call, this address is passed in the return address register (R26).
Argument list
The argument list is an ordered set of zero or more argument items that together constitute a logically contiguous structure known as an argument item sequence. This logically contiguous sequence is typically mapped to registers and memory in a way that produces a physically discontiguous argument list. In a standard call, the first six items are passed in registers R16—21 or registers F16—21. (See Section 3.7.2, “Argument List Structure” for details of argument-to-register correspondence). The remaining items are collected in a memory argument list that is a naturally aligned array of quadwords. In a standard call, this list (if present) must be passed at 0(SP).
The calling procedure must pass to the called procedure information about the argument list. This information is passed in the argument information (AI) register (R25). Defined by AI$K_AI_SIZE, the structure is a quadword as shown in Figure 3.10, “Argument Information Register (R25) Format” with the fields described in Table 3.7, “Contents of the Argument Information Register (R25)”.
Table 3.7. Contents of the Argument Information Register (R25)
Field Name | Contents |
---|---|
AI$B_ARG_COUNT | Unsigned byte <7:0> that specifies the number of 64-bit argument items in the argument list (known as the “argument count”). |
AI$V_ARG_REG_INFO | An 18-bit vector field <25:8> divided into six groups of 3 bits that correspond to the six arguments passed in registers. These groups describe how each of the first six arguments is passed in registers, with the first group <10:8> describing the first argument. The encoding of the argument register usage for each group is listed below. |
Bits 26–63 | Reserved and must be 0. |

Value | Name | Meaning |
---|---|---|
0 | AI$K_AR_I64 | 64-bit argument, or 32-bit argument sign-extended to 64 bits, passed in an integer register (including addresses); or the argument is not present. |
1 | AI$K_AR_FF | F_floating argument passed in a floating register. |
2 | AI$K_AR_FD | D_floating argument passed in a floating register. |
3 | AI$K_AR_FG | G_floating argument passed in a floating register. |
4 | AI$K_AR_FS | S_floating argument passed in a floating register. |
5 | AI$K_AR_FT | T_floating argument passed in a floating register. |
6, 7 | | Reserved. |
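The field layout in Table 3.7 can be illustrated with a short sketch. The following Python fragment is illustrative only and is not part of this standard; the function name `pack_arg_info` and the Python constant names are hypothetical, while the bit positions and encoding values are taken from the table.

```python
# Illustrative sketch only: pack_arg_info and these constant names are
# hypothetical; the values 0-5 are the encodings listed in Table 3.7.
AI_K_AR_I64, AI_K_AR_FF, AI_K_AR_FD, AI_K_AR_FG, AI_K_AR_FS, AI_K_AR_FT = range(6)


def pack_arg_info(arg_count, reg_kinds=()):
    """Build an AI value: count in <7:0>, one 3-bit group per register argument in <25:8>."""
    if not 0 <= arg_count <= 255:
        raise ValueError("argument count must fit in an unsigned byte")
    if len(reg_kinds) > 6:
        raise ValueError("only the first six arguments are described")
    value = arg_count
    for index, kind in enumerate(reg_kinds):
        if not 0 <= kind <= 5:
            raise ValueError("encodings 6 and 7 are reserved")
        # The first group occupies <10:8>, the second <13:11>, and so on.
        value |= kind << (8 + 3 * index)
    return value  # bits 26-63 are left as 0


# Three arguments: integer, T_floating, integer.
ai = pack_arg_info(3, [AI_K_AR_I64, AI_K_AR_FT, AI_K_AR_I64])
print(hex(ai))  # 0x2803
```

Groups for arguments that are not present are left as 0, which matches the AI$K_AR_I64 encoding.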
Function result
If a standard-conforming procedure is a function and the function result is returned in a register, then the result is returned in R0, F0, or F0 and F1. Otherwise, the function result is returned via the first argument item or dynamically as defined in Section 3.7.7, “Returning Data”.
Stack usage
Whenever control is transferred to another procedure, the stack pointer (SP) must be octaword aligned; at other times there is no stack alignment requirement. (A side effect of this is that the in-memory portion of the argument list will start on an octaword boundary). During a procedure invocation, the SP (R30) can never be set to a value higher than the SP at entry to that procedure invocation.
The contents of the stack located above the portion of the argument list that is passed in memory (if any) belongs to the calling procedure and is, therefore, not to be read or written by the called procedure, except as specified by indirect arguments or language-controlled up-level references.
Because SP is used by the hardware in raising exceptions and asynchronous interrupts, the contents of the next 2048 bytes below the current SP value are continually and unpredictably modified. Software that conforms to this standard must not depend on the contents of the 2048 stack locations below 0(SP).
Note
One implication of the stack alignment requirement is that low-level interrupt and exception-fielding software must be prepared to handle and correct the alignment before calling handler routines, in case the stack pointer is not octaword aligned at the time of an interrupt or exception.
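As a minimal sketch of the alignment requirement, the following Python fragment tests whether a stack pointer value is octaword aligned and rounds an unaligned value down to the next octaword boundary. The helper names and the sample address are hypothetical. Rounding down is chosen here because SP must never be set higher than its value at procedure entry.

```python
# Illustrative sketch only; helper names are hypothetical.
OCTAWORD = 16  # bytes


def is_octaword_aligned(sp):
    """True if sp is a multiple of 16 bytes."""
    return sp % OCTAWORD == 0


def align_down(sp):
    """Round sp down to an octaword boundary (toward lower addresses)."""
    return sp & ~(OCTAWORD - 1)


sp = 0x7FFFFFF8              # hypothetical SP value at the time of an interrupt
aligned_sp = align_down(sp)  # value to use before calling a handler routine
```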
3.6.2. Linkage Section
Because Alpha instructions cannot contain full virtual addresses, the Alpha hardware architecture is sometimes referred to as a base register architecture. In a base register architecture, normal memory references within a limited range from a given address are expressed by using displacements relative to the contents of a register containing that address (the base register). Base registers for external program segments, either data or code, are usually loaded indirectly through a program segment of address constants.
The fundamental program section containing address constants that a procedure uses to access other static storage, external procedures, and variables is termed a linkage section. Any register used to access the contents of the linkage section is termed a linkage pointer.
A procedure's linkage section includes the procedure descriptor for the procedure, addresses of all external variables and procedures referenced by the procedure, and other constants a compiler may choose to reference using a linkage pointer.
When a standard procedure is called, the caller must provide the procedure value for that procedure in R27. Static procedure values are defined to be the address of the procedure's descriptor. Because the procedure descriptor is part of the linkage section, calling this type of procedure value provides a pointer into the linkage section for that procedure in R27. This linkage pointer can then be used by the called procedure as a base register to address locations in its linkage section. For this reason, most compilers generate references to items in the linkage section as offsets from a pointer to the procedure's descriptor.
For bound procedures, compilers usually arrange for the environment setup code to load R27 with the address of the procedure's descriptor so that it can be used as a linkage pointer as previously described. For an example, see Section 3.6.4, “Simple and Bound Procedures”.
Although not required, linkages to external procedures are typically represented in the calling procedure's linkage section as a linkage pair. As shown in Figure 3.11, “Linkage Pair Block Format” and described in Table 3.8, “Contents of the Linkage Pair Block”, a linkage pair (LKP) block with two fields should be octaword aligned and defined by LKP$K_SIZE as 16 bytes.
Field Name | Contents |
---|---|
LKP$Q_ENTRY | Absolute address of the first instruction of the called procedure's entry code sequence. |
LKP$Q_PROC_VALUE | Contains the procedure value of the procedure to be called. Normally, this field is the absolute address of a procedure descriptor for the procedure to be called, but in certain cases, it could be a bound procedure value (such as for procedures that are called through certain types of transfer vectors). |
In general, an object module contains a procedure descriptor for each entry point in the module. The descriptors are allocated in a linkage section. For each external procedure Q that is referenced in a module, the module's linkage section also contains a linkage pair denoting Q (which is a pointer to Q's procedure descriptor and entry code address).
    LDQ   R26,Q_DESC-MY_DESC(R4)      ;Q's entry address into R26
    LDQ   R27,Q_DESC-MY_DESC+8(R4)    ;Q's procedure value into R27
    MOVQ  #AI_LITERAL,R25             ;Load Argument Information register
    JSR   R26,(R26)                   ;Call to Q. Return address in R26
Because Q's procedure descriptor (statically defined procedure value) is in Q's linkage section, Q can use the value in R27 as a base address for accessing data in its linkage section. Q accesses external procedures and data in other program sections through pointers in its linkage section. Therefore, R27 serves as the root pointer through which all data can be referenced.
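The linkage pair layout can be modeled as two adjacent little-endian quadwords. The following Python sketch is illustrative only: the function names and the sample addresses are hypothetical, and the field order (entry address at offset 0, procedure value at offset 8) is inferred from the two LDQ instructions in the call sequence above.

```python
# Illustrative sketch only; names and addresses are hypothetical.
import struct

LKP_K_SIZE = 16  # bytes, as defined by LKP$K_SIZE


def make_linkage_pair(entry, proc_value):
    """Pack LKP$Q_ENTRY followed by LKP$Q_PROC_VALUE as two quadwords."""
    return struct.pack("<QQ", entry, proc_value)


def load_call_registers(linkage_section, offset):
    """Mimic the two LDQ instructions that load R26 and R27 before the JSR."""
    entry, proc_value = struct.unpack_from("<QQ", linkage_section, offset)
    return {"R26": entry, "R27": proc_value}


# A toy linkage section whose only content is the linkage pair for Q.
Q_ENTRY, Q_DESC = 0x30000, 0x20000  # hypothetical addresses
section = make_linkage_pair(Q_ENTRY, Q_DESC)
regs = load_call_registers(section, 0)
```

After the JSR, R26 holds the return address rather than the entry address, so only R27 remains useful to Q as a linkage pointer.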
3.6.3. Calling Computed Addresses
Most calls are made to a fixed address whose value is determined by the time the program starts execution. However, certain cases are possible that cause the exact address to be unknown until the code is finally executed. In this case, the procedure value representing the procedure to be called is computed in a register.
    LDQ   R26,8(R4)          ;Entry address to scratch register
    MOV   R4,R27             ;Procedure value to R27
    MOV   #AI_LITERAL,R25    ;Load Argument Information register
    JSR   R26,(R26)          ;Call entry address.
For interoperation with translated images, see Chapter 6, Signature Information and Translated Images (Alpha and I64 Systems).
3.6.4. Simple and Bound Procedures
Simple procedure
Bound procedure
A simple procedure is a procedure that does not need direct access to the stack of its execution environment. A bound procedure is a procedure that does need direct access to the stack of its execution environment, typically to reference an up-level variable or to perform a nonlocal GOTO operation. Both a simple procedure and a bound procedure have an associated procedure descriptor, as described in previous sections.
When a bound procedure is called, the caller must pass some kind of pointer to the called code that allows it to reference its up-level environment. Typically, this pointer is the frame pointer for that environment, but many variations are possible. When the caller is executing its program within that outer environment, it can usually make such a call directly to the code for the nested procedure without recourse to any additional procedure descriptors. However, when a procedure value for the nested procedure must be passed outside of that environment to a call site that has no knowledge of the target procedure, a bound procedure descriptor is created so that the nested procedure can be called just like a simple procedure.
Bound procedure values, as defined by this standard, are designed for multilanguage use and utilize the properties of procedure descriptors to allow callers of procedures to use common code to call both bound and simple procedures.
3.6.4.1. Bound Procedure Descriptors
Bound procedure descriptors provide a mechanism to interpose special processing between a call and the called routine without modifying either. The descriptor may contain (or reference) data used as part of that processing. Between native and translated images, the OpenVMS Alpha operating system uses linker and image-activator created bound procedure descriptors to mediate the handling of parameter and result passing (see Section 6.2, “Signature Information Blocks”). Language processors on OpenVMS Alpha systems use bound procedure descriptors to implement bound procedure values (see Section 3.6.4.2, “Bound Procedure Value”). Other uses are possible.
The minimum size of the descriptor is 24 bytes (defined by PDSC$K_MIN_BOUND_SIZE). An optional PDSC extension in 8-byte increments provides the specific environment values as defined by the implementation.
The fields defined in the bound procedure descriptor are illustrated in Figure 3.12, “Bound Procedure Descriptor (PDSC)” and described in Table 3.9, “Contents of the Bound Procedure Descriptor (PDSC)”.
Field Name | Contents |
---|---|
PDSC$W_FLAGS | Vector of flag bits <15:0> that must be a copy of the flag bits (except for KIND bits) contained in the quadword pointed to by PDSC$Q_PROC_VALUE. |
PDSC$V_KIND | A 4-bit field <3:0> that identifies the type of procedure descriptor. For a procedure with bound values, this field must specify a value of 0. |
PDSC$V_FUNC_RETURN | A 4-bit field <11:8> that describes which registers are used for the function value return (if there is one) and what format is used for those registers. PDSC$V_FUNC_RETURN in a bound procedure descriptor must be the same as the PDSC$V_FUNC_RETURN of the procedure descriptor for the procedure for which the environment is established. Table 6.4, “Function Return Signature Encodings” lists and describes the possible encoding values of PDSC$V_FUNC_RETURN. |
Bits 12–15 | Reserved and must be 0. |
PDSC$W_SIGNATURE_OFFSET | A 16-bit signed byte offset from the start of the procedure descriptor. This offset designates the start of the procedure signature block (if any). In a bound procedure, a 0 in this field indicates the actual signature block must be sought in the procedure descriptor indicated by the PDSC$Q_PROC_VALUE field. A 1 in this field indicates a standard default signature. (An offset value of 1 is not a valid offset because both procedure descriptors and signature blocks must be quadword aligned. See Section 6.2, “Signature Information Blocks” for details of the procedure signature block). Note that a nonzero signature offset in a bound procedure value normally occurs only in the case of bound procedures used as part of the implementation of calls from native OpenVMS Alpha code to translated OpenVMS VAX images. In any case, if a nonzero offset is present, it takes precedence over signature information that might occur in any related procedure descriptor. |
PDSC$Q_ENTRY | Address of the transfer code sequence. |
PDSC$Q_PROC_VALUE | Value of the procedure to be called by the transfer code. The value can be either the address of a procedure descriptor for the procedure or possibly another bound procedure value. |
PDSC$Q_ENVIRONMENT | An environment value to pass to the procedure. The choice of environment value is system implementation specific. For more information, see Section 3.6.4.2, “Bound Procedure Value”. |
3.6.4.2. Bound Procedure Value
The procedure value for a bound procedure is a pointer to a bound procedure descriptor that, like all other procedure descriptors, contains the address to which the calling procedure must transfer control at offset 8 (see Figure 3.12, “Bound Procedure Descriptor (PDSC)”). This transfer code is responsible for setting up the dynamic environment needed by the target nested procedure and then completing the transfer of control to the code for that procedure. The transfer code receives in R27 a pointer to its corresponding bound procedure descriptor and thus can fetch any required environment information from that descriptor. A bound procedure descriptor also contains a procedure value for the target procedure that is used to complete the transfer of control.
When the transfer code sequence addressed by PDSC$Q_ENTRY of a bound procedure descriptor is called (by a call sequence such as the one given in Section 3.6.3, “Calling Computed Addresses”), the procedure value will be in R27, and the transfer code must finish setting up the environment for the target procedure. The preferred location for this transfer code is directly preceding the code for the target procedure. This saves a memory fetch and a branching instruction and optimizes instruction caches and paging.
Q_TRANSFER:
    LDQ   R1,24(R27)     ;Environment value to R1
    LDQ   R27,16(R27)    ;Procedure descriptor address to R27
Q_ENTRY::                ;Normal procedure entry code starts here
After the transfer code has been executed and control is transferred to Q's entry address, R27 contains the address of Q's procedure descriptor, R26 (unmodified by transfer code) contains the return address, and R1 contains the environment value.
When a bound procedure value such as this is needed, the bound procedure descriptor is usually allocated on the parent procedure's stack.
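The effect of the transfer code can be traced with a small simulation. The following Python sketch is illustrative only: memory is modeled as a byte array, all addresses and helper names are hypothetical, and only the quadwords at offsets 8, 16, and 24 of the bound procedure descriptor (the offsets used by the text and the transfer code above) are modeled.

```python
# Illustrative sketch only; addresses and helper names are hypothetical.
import struct


def read_quadword(memory, address):
    """Fetch a little-endian quadword, as an Alpha LDQ instruction would."""
    return struct.unpack_from("<Q", memory, address)[0]


def run_transfer_code(memory, r27):
    """Simulate Q_TRANSFER: R27 arrives holding the bound procedure value."""
    r1 = read_quadword(memory, r27 + 24)   # LDQ R1,24(R27)
    r27 = read_quadword(memory, r27 + 16)  # LDQ R27,16(R27)
    return r1, r27


memory = bytearray(0x100)
BOUND_DESC, Q_DESC = 0x40, 0x80        # descriptor addresses in toy memory
Q_TRANSFER, ENVIRONMENT = 0x1000, 0x7FF0

struct.pack_into("<Q", memory, BOUND_DESC + 8, Q_TRANSFER)    # PDSC$Q_ENTRY
struct.pack_into("<Q", memory, BOUND_DESC + 16, Q_DESC)       # PDSC$Q_PROC_VALUE
struct.pack_into("<Q", memory, BOUND_DESC + 24, ENVIRONMENT)  # PDSC$Q_ENVIRONMENT

# The caller fetches the entry address from offset 8, exactly as for a
# simple procedure, and then transfers control with R27 = BOUND_DESC.
entry = read_quadword(memory, BOUND_DESC + 8)
r1, r27 = run_transfer_code(memory, BOUND_DESC)
```

On exit from the simulated transfer code, `r1` holds the environment value and `r27` holds the address of Q's own procedure descriptor, matching the register state described above.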
3.6.5. Entry and Exit Code Sequences
To ensure that the stack can be interpreted at any point during thread execution, all procedures must adhere to certain conventions for entry and exit as defined in this section.
3.6.5.1. Entry Code Sequence
All registers specified by this standard as saved across a standard call must contain their original (at entry) contents.
No standard calls may be made.
Note
If an exception is raised in the entry code of a procedure, that procedure's exception handler (if any) will not be invoked because the procedure is not yet current. Therefore, if a procedure has an exception handler, compilers must not move code into the procedure prologue that might cause an exception that would be handled by that handler.
If PDSC$L_SIZE is not 0, set register SP = SP − PDSC$L_SIZE.
If PDSC$V_BASE_REG_IS_FP is 1, store the address of the procedure descriptor at 0(SP).
If PDSC$V_KIND = PDSC$K_KIND_FP_REGISTER, copy the return address to the register specified by PDSC$B_SAVE_RA, if it is not already there, and copy the FP register to the register specified by PDSC$B_SAVE_FP.
If PDSC$V_KIND = PDSC$K_KIND_FP_STACK, copy the return address to the quadword at the RSA$Q_SAVED_RETURN offset in the register save area denoted by PDSC$W_RSA_OFFSET, and store the registers specified by PDSC$L_IREG_MASK and PDSC$L_FREG_MASK in the register save area denoted by PDSC$W_RSA_OFFSET. (This step includes saving the value in FP).
Execute TRAPB if required (see Section 9.5.3.2, “Exception Synchronization (Alpha Only)” for details).
If PDSC$V_BASE_REG_IS_FP is 0, load register FP with the address of the procedure descriptor or the address of a quadword that contains the address of the procedure descriptor.
If PDSC$V_BASE_REG_IS_FP is 1, copy register SP to register FP.
The ENTRY_LENGTH value in the procedure descriptor provides information that is redundant with the setting of a new frame pointer register value. That is, the value could be derived by starting at the entry address and scanning the instruction stream to find the one that updates FP. The ENTRY_LENGTH value included in the procedure descriptor supports the debugger or PCA facility so that such a scan is not required.
Entry Code Example for a Stack Frame Procedure
This is a stack frame procedure
Registers R2—4 and F2—3 are saved and restored
PDSC$W_RSA_OFFSET = 16
The procedure has a static exception handler that does not reraise arithmetic traps
The procedure uses a variable amount of stack
If the code sequence in Example 3.1, “Entry Code for a Stack Frame Procedure” is interrupted by an asynchronous software interrupt, SP will have a different value than it did at entry, but the calling procedure will still be current.
Entry Code Example for a Register Frame
3.6.5.2. Exit Code Sequence
If PDSC$V_BASE_REG_IS_FP is 1, then copy FP to SP.
If PDSC$V_KIND = PDSC$K_KIND_FP_STACK, and this procedure saves or restores any registers other than FP and SP, reload those registers from the register save area as specified by PDSC$W_RSA_OFFSET.
If PDSC$V_KIND = PDSC$K_KIND_FP_STACK, load a scratch register with the return address from the register save area as specified by PDSC$W_RSA_OFFSET. (If PDSC$V_KIND = PDSC$K_KIND_FP_REGISTER, the return address is already in scratch register PDSC$B_SAVE_RA).
Execute TRAPB if required (see Section 9.5.3.2, “Exception Synchronization (Alpha Only)” for details).
If PDSC$V_KIND = PDSC$K_KIND_FP_REGISTER, copy the register specified by PDSC$B_SAVE_FP to register FP.
If PDSC$V_KIND = PDSC$K_KIND_FP_STACK, reload FP from the saved FP in the register save area.
If a function value is not being returned using the stack (PDSC$V_STACK_RETURN_VALUE is 0), then restore SP to the value it had at procedure entry by adding the value that was stored in PDSC$L_SIZE to SP. (In some cases, the returning procedure will leave SP pointing to a lower stack address than it had on entry to the procedure, as specified in Section 3.7.7, “Returning Data”).
Jump to the return address (which is in a scratch register).
The called routine does not adjust the stack to remove any arguments passed in memory. This responsibility falls to the calling routine that may choose to defer their removal because of optimizations or other considerations.
Exit Code Example for a Stack Frame
Interruption of the code sequence in Example 3.3, “Exit Code Sequence for a Stack Frame” by an asynchronous software interrupt can result in the calling procedure being the current procedure, but with SP not yet restored to its value in that procedure. The discussion of that situation in entry code sequences applies here as well.
Exit Code Example for a Register Frame
3.7. Data Passing
This section defines the OpenVMS Alpha calling standard conventions of passing data between procedures in a call stack. An argument item represents one unit of data being passed between procedures.
3.7.1. Argument Passing Mechanisms
Immediate value
Reference
Descriptor
Argument items are not self-defining; interpretation of each argument item depends on agreement between the calling and called procedures.
This standard does not dictate which passing mechanism must be used by a given language compiler. Language semantics and interoperability considerations might require different mechanisms in different situations.
Immediate value
An immediate value argument item contains the value of the data item. The argument item, or the value contained in it, is directly associated with the parameter.
Reference
A reference argument item contains the address of a data item such as a scalar, string, array, record, or procedure. This data item is associated with the parameter.
Descriptor
A descriptor argument item contains the address of a descriptor, which contains structural information about the argument's type (such as array bounds) and the address of a data item. This data item is associated with the parameter.
3.7.2. Argument List Structure
The argument list in an OpenVMS Alpha call is an ordered set of zero or more argument items, which together comprise a logically contiguous structure known as the argument item sequence. An argument item is specified using up to 64 bits.
A 64-bit argument item can be used to pass arguments by immediate value, by reference, and by descriptor. Any combination of these mechanisms in an argument list is permitted.
Although the argument items form a logically contiguous sequence, they are in practice mapped to integer and floating-point registers and to memory in a method that can produce a physically discontiguous argument list. Registers R16—21 and F16—21 are used to pass the first six items of the argument item sequence. Additional argument items must be passed in a memory argument list that must be located at 0(SP) at the time of the call.
Item | Integer Register | Floating-Point Register | Stack |
---|---|---|---|
1 | R16 | F16 | |
2 | R17 | F17 | |
3 | R18 | F18 | |
4 | R19 | F19 | |
5 | R20 | F20 | |
6 | R21 | F21 | |
7–n | | | 0(SP) – (n–7)*8(SP) |
All argument items are passed in the integer registers or on the stack, except for argument items that are floating-point data passed by immediate value.
Floating-point data passed by immediate value is passed in the floating-point registers or on the stack.
Only one location (across an item row in Table 3.10, “Argument Item Locations”) can be used by any given argument item in a list. For example, if argument item 3 is an integer passed by value, and argument item 4 is a single-precision floating-point number passed by value, then argument item 3 is assigned to R18 and argument item 4 is assigned to F19.
A single- or double-precision complex value is treated as two arguments for the purpose of argument-item sequence rules. In particular, the real part of a complex value might be passed as the sixth argument item in register F21, in which case the imaginary part is then passed as the seventh argument item in memory.
An extended precision complex value is passed by reference using a single integer or stack argument item. (An extended precision complex value is not passed by immediate value because the component extended precision floating values are not passed by value. See also Section 3.7.5.1, “Sending Mechanism”).
The argument list that includes both the in-memory portion and the portion passed in registers can be read from and written to by the called procedure. Therefore, the calling procedure must not make any assumptions about the validity of any part of the argument list after the completion of a call.
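The location rules above can be sketched as a simple mapping from argument item position to register or stack slot. The following Python fragment is illustrative only; the function name and the `"int"`/`"float"` labels are hypothetical, with `"float"` standing for floating-point data passed by immediate value and `"int"` for everything else.

```python
# Illustrative sketch only; names are hypothetical.
def assign_locations(kinds):
    """Return the location of each argument item, in item order."""
    locations = []
    for index, kind in enumerate(kinds):
        if index < 6:
            # Items 1-6 use R16-R21 or F16-F21; the register number is fixed
            # by the item position, and only the register bank varies.
            bank = "F" if kind == "float" else "R"
            locations.append(f"{bank}{16 + index}")
        else:
            # Items 7-n live in the memory argument list starting at 0(SP).
            locations.append(f"{(index - 6) * 8}(SP)")
    return locations


# Item 3 is an integer and item 4 is a single-precision float, both by value.
print(assign_locations(["int", "int", "int", "float"]))
# ['R16', 'R17', 'R18', 'F19']

# A complex value whose real part is item 6 and imaginary part is item 7.
print(assign_locations(["int"] * 5 + ["float", "float"]))
# ['R16', 'R17', 'R18', 'R19', 'R20', 'F21', '0(SP)']
```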
3.7.3. Argument Lists and High-Level Languages
Arguments are mapped from left to right to increasing offsets in the argument item sequence. R16 or F16 is allocated to the first argument, and the last quadword of the memory argument list (if any) is allocated to the last argument.
Each source language argument corresponds to one or more contiguous Alpha calling standard argument items.
Each argument item consists of 64 bits.
A null or omitted argument—for example, CALL SUB(A,,B)—is represented by an argument item containing the value 0.
Arguments passed by immediate value cannot be omitted unless a default value is supplied by the language. (This is to enable called procedures to distinguish an omitted immediate argument from an immediate value argument with the value 0).
Trailing null or omitted arguments—for example, CALL SUB(A,,)—are passed by the same rules as for embedded null or omitted arguments.
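The treatment of null or omitted arguments can be sketched as follows. This Python fragment is illustrative only: the function name is hypothetical, `None` stands for an omitted source-language argument, and the sample addresses are invented.

```python
# Illustrative sketch only; names and addresses are hypothetical.
def build_argument_items(args):
    """Map source-language arguments to 64-bit argument items, left to right."""
    return [0 if arg is None else arg for arg in args]


ADDR_A, ADDR_B = 0x10000, 0x10008  # hypothetical addresses of A and B

embedded = build_argument_items([ADDR_A, None, ADDR_B])  # CALL SUB(A,,B)
trailing = build_argument_items([ADDR_A, None, None])    # CALL SUB(A,,)
```

In both cases the argument item sequence has three items, so an omitted argument still occupies a position in the sequence.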
3.7.4. Unused Bits in Passed Data
Whenever data is passed by value between two procedures in registers (for the first six input arguments and return values), or in memory (for arguments after the first six), the bits not used by the data are sign-extended or zero-extended as appropriate.
Data Type | Type Designator | Data Size (bytes) | Register Extension Type | Memory Extension Type |
---|---|---|---|---|
Byte logical | BU | 1 | Zero64 | Zero64 |
Word logical | WU | 2 | Zero64 | Zero64 |
Longword logical | LU | 4 | Sign64 | Sign64 |
Quadword logical | QU | 8 | Data64 | Data64 |
Byte integer | B | 1 | Sign64 | Sign64 |
Word integer | W | 2 | Sign64 | Sign64 |
Longword integer | L | 4 | Sign64 | Sign64 |
Quadword integer | Q | 8 | Data64 | Data64 |
F_floating | F | 4 | Hard | Data32 |
D_floating | D | 8 | Hard | Data64 |
G_floating | G | 8 | Hard | Data64 |
F_floating complex | FC | 2 * 4 | 2*Hard | 2*Data32 |
D_floating complex | DC | 2 * 8 | 2*Hard | 2*Data64 |
G_floating complex | GC | 2 * 8 | 2*Hard | 2*Data64 |
S_floating | FS | 4 | Hard | Data32 |
T_floating | FT | 8 | Hard | Data64 |
X_floating | FX | 16 | N/A | N/A |
S_floating complex | FSC | 2 * 4 | 2*Hard | 2*Data32 |
T_floating complex | FTC | 2 * 8 | 2*Hard | 2*Data64 |
X_floating complex | FXC | 2 * 16 | N/A | N/A |
Small structures of 8 bytes or less | N/A | ≤8 | Nostd | Nostd |
Small arrays of 8 bytes or less | N/A | ≤8 | Nostd | Nostd |
32-bit address | N/A | 4 | Sign64 | Sign64 |
64-bit address | N/A | 8 | Data64 | Data64 |
Sign Extension Type | Defined Function |
---|---|
Sign64 | Sign-extended to 64 bits. |
Zero64 | Zero-extended to 64 bits. |
Data32 | Data is 32 bits. The state of bits <63:32> is unpredictable. |
2*Data32 | Two single-precision parts of the complex value are stored in memory as independent floating-point values (each handled as Data32). |
Data64 | Data is 64 bits. |
2*Data64 | Two double-precision parts of the complex value are stored in memory as independent floating-point values (each handled as Data64). |
Hard | Passed in the layout defined by the hardware SRM. |
2*Hard | Two floating-point parts of the complex value are stored in a pair of registers as independent floating-point values (each handled as Hard). |
Nostd | State of all high-order bits not occupied by the data is unpredictable across a call or return. |
Because of the varied rules for sign extension of data when passed as arguments, both calling and called routines must agree on the data type of each argument. No implicit data-type conversions can be assumed between the calling procedure and the called procedure.
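The Sign64 and Zero64 extension types can be sketched in a few lines. The following Python fragment is illustrative only, with hypothetical helper names; it treats a 64-bit argument item as an unsigned Python integer.

```python
# Illustrative sketch only; helper names are hypothetical.
MASK64 = (1 << 64) - 1


def zero_extend(value, size_bytes):
    """Zero64: keep the low-order data bits and clear all higher bits."""
    return value & ((1 << (8 * size_bytes)) - 1)


def sign_extend(value, size_bytes):
    """Sign64: replicate the data's sign bit through bit 63."""
    bits = 8 * size_bytes
    value &= (1 << bits) - 1
    if value & (1 << (bits - 1)):
        value -= 1 << bits
    return value & MASK64


# Longword logical (LU) is Sign64 even though the source type is unsigned.
print(hex(sign_extend(0x80000000, 4)))  # 0xffffffff80000000
# Byte logical (BU) is Zero64.
print(hex(zero_extend(0xFF, 1)))        # 0xff
```

The same bit pattern therefore yields different 64-bit items depending on the declared type, which is why both procedures must agree on the data type of each argument.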
3.7.5. Sending Data
This section defines the OpenVMS Alpha calling standard requirements for mechanisms to send data and the order of argument evaluation.
3.7.5.1. Sending Mechanism
- By immediate value. An argument may be passed by immediate value only if the argument is one of the following:
One of the noncomplex scalar data types with a size known (at compile time) to be ≤ 64 bits
Either single or double precision complex
A record with a known size (at compile time)
A set, implemented as a bit vector, with a size known (at compile time) to be ≤ 64 bits
No form of string or array data type may be passed by immediate value in a standard call.
Unused high-order bits must be zero-extended or sign-extended, as appropriate for the data type, to fill all bits of each argument list item (as specified in Table 3.11, “Unused Bits in Passed Data”).
A single- or double-precision complex value is passed as two single- or double-precision floating-point values, respectively. Note that the argument count reflects that two argument positions are used rather than just one actual argument.
A record value, which may be larger than 64 bits, is passed by immediate value as follows:
Allocate as many fully occupied argument item positions to the argument value as are needed to represent the argument.
The value of the unoccupied bits is undefined in a final, partially occupied argument item position, if any.
If an argument position is passed in one of the registers, it can only be passed in an integer register (never in a floating-point register).
Other argument values that are larger than 64 bits can be passed by immediate value using nonstandard conventions, typically using a method similar to those for passing records. Thus, for example, a 26-byte string can be passed by value in four integer registers.
By reference. Nonparametric arguments (arguments for which associated information such as string size and array bounds are not required) can be passed by reference in a standard call. This includes extended precision floating and extended precision complex values.
By descriptor. Parametric arguments (arguments for which associated information such as string size and array bounds must be passed to the caller) are passed by a single descriptor in a standard call.
Note that extended floating values are not passed using the immediate value mechanism; rather, they are passed using the by reference mechanism. (However, when by value semantics is required, it may be necessary to make a copy of the actual parameter and pass a reference to that copy in order to avoid improper alias effects).
Also note that when a record is passed by immediate value, the component types are not material to how the argument is aligned; the record will always be quadword aligned.
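The number of argument item positions occupied by a record passed by immediate value follows directly from its size. The Python sketch below is illustrative only: the function names are hypothetical, the split assumes that each item holds eight consecutive bytes in little-endian order, and the unoccupied bytes of the final item are filled with zero, which is only one permissible choice because their value is undefined.

```python
# Illustrative sketch only; names and the byte-order assumption are not
# taken from this standard.
def argument_items_needed(size_bytes):
    """Number of 64-bit argument item positions for a value of this size."""
    return (size_bytes + 7) // 8


def split_into_items(data):
    """Split a byte string into quadword argument items, padding the last one."""
    padded = data + bytes(-len(data) % 8)
    return [int.from_bytes(padded[i:i + 8], "little")
            for i in range(0, len(padded), 8)]


print(argument_items_needed(26))  # 4
items = split_into_items(b"A" * 26)
```

The 26-byte case corresponds to the nonstandard string example above, which occupies four integer registers.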
3.7.5.2. Order of Argument Evaluation
Because most high-level languages do not specify the order of evaluation (with respect to side effects) of arguments, those language processors can evaluate arguments in any convenient order. The choice of argument evaluation order and code generation strategy is constrained only by the definition of the particular language. Programs should not depend on the order of evaluation of arguments.
3.7.6. Receiving Data
When it cannot be determined at compile time whether a given in-register argument item is passed in a floating-point register or an integer register, the argument information register can be interpreted at run-time to establish where the argument was passed. (See Section 3.6.1, “Call Conventions” for details).
3.7.7. Returning Data
A standard function returns its function value by one of the following mechanisms:
Immediate value
Reference
Descriptor
These mechanisms are the only standard means available for returning function values, and they support the important language-independent data types. Functions that return values by any mechanism other than those specified here are nonstandard, language-specific functions.
3.7.7.1. Function Value Return by Immediate Value
Nonfloating function value return
Floating function value return
Nonfloating Function Value Return by Immediate Value
Nonfloating-point scalar data type with size known to be ≤ 64 bits
Record with size known to be ≤ 64 bits
Set, implemented as a bit vector, with size known to be ≤ 64 bits
No form of string or array can be returned by immediate value, and two separate 32-bit entities cannot be combined and returned in R0.
A function value of less than 64 bits returned in R0 must be zero-extended or sign-extended as appropriate, depending on the data type (see Table 3.11, “Unused Bits in Passed Data”), to a full quadword.
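The extension of a short function value to a full quadword can be sketched as below. Which data types are sign-extended and which are zero-extended is given by Table 3.11 and is not reproduced here; the `signed` flag is an assumption standing in for that lookup, and the function name is hypothetical.

```python
MASK64 = (1 << 64) - 1


def extend_to_quadword(value: int, bits: int, signed: bool) -> int:
    """Return the 64-bit register image of a value that is `bits` wide."""
    if not 0 < bits <= 64:
        raise ValueError("bits must be in 1..64")
    low_mask = (1 << bits) - 1
    value &= low_mask                  # keep only the significant bits
    if signed and value >> (bits - 1):
        value |= MASK64 & ~low_mask    # replicate the sign bit upward
    return value
```

With this sketch, an 8-bit value of 0x80 becomes 0xFFFFFFFFFFFFFF80 when sign-extended and stays 0x80 when zero-extended.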
Floating Function Value Return by Immediate Value
A function value is returned by immediate value in register F0 only if it is a noncomplex single- or double-precision floating-point value (F, D, G, S, or T).
A function value is returned by immediate value in registers F0 and F1 only if it is a complex single- or double-precision floating-point value (complex F, D, G, S, or T).
Note that extended floating-point and extended complex values are returned by reference as described next.
3.7.7.2. Function Value Return by Reference
Its size is known to both the calling procedure and the called procedure, but the value cannot be returned by immediate value. (Because the function value requires more than 64 bits, the data type is a string or an array type).
It can be returned in a contiguous region of storage.
The actual-argument list and the formal-argument list are shifted to the right by one argument item. The new, first argument item is reserved for the function value. This hidden first argument is included in the count and register usage information that is passed in the argument information register (see Section 3.6.1, “Call Conventions” for details).
The calling procedure must provide the required contiguous storage and pass the address of the storage as the first argument. This address must specify storage naturally aligned according to the data type of the function value.
The called function must write the function value to the storage described by the first argument.
The this Pointer
For C++, when the this pointer is passed as an implicit first parameter and a pointer to a return value buffer is also required, then the this pointer becomes the first parameter, the buffer pointer becomes the second parameter, and the remaining normal parameters are shifted two slots to make this possible.
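The ordering of hidden and explicit argument items for return by reference can be modeled as below. This is a sketch under stated assumptions: `build_argument_list` and its parameter names are hypothetical, and argument items are represented as plain Python values rather than register or stack contents.

```python
def build_argument_list(args, result_buffer=None, this=None):
    """Order argument items for a call that may return by reference.

    args          -- the explicit arguments, in source order
    result_buffer -- address of caller-provided storage, or None
    this          -- C++ this pointer, or None
    """
    items = []
    if this is not None:
        items.append(this)            # this always comes first
    if result_buffer is not None:
        items.append(result_buffer)   # hidden return-buffer argument
    items.extend(args)                # explicit arguments shift right
    return items
```

The length of the returned list includes the hidden items, matching the rule that the hidden first argument is counted in the argument information register.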
3.7.7.3. Function Value Return by Descriptor
It cannot be returned by immediate value. (Because the function value requires more than 64 bits, the data type is a string or an array type, and so on).
Its size is not known to either the calling procedure or the called procedure.
It can be returned in a contiguous region of storage.
Noncontiguous function values are language specific and cannot be returned as a standard-conforming return value.
Records, noncontiguous arrays, and arrays with more than one dimension cannot be returned by descriptor in a standard call.
Both 32-bit and 64-bit descriptor forms can be used for function values returned by descriptor. See Chapter 8, OpenVMS Argument Descriptors, for details of the descriptor forms.
Dynamic text—Heap-managed strings of arbitrary and dynamically changeable length
Return objects created by the calling routine—Function values that are to be returned in an object allocated by and having attributes (bounds, lengths, and so on) specified by the calling routine
Return objects created by the called routine—Function values that are returned in an object allocated by and having attributes (bounds, lengths, and so on) specified by the called routine
For correct results to be obtained from this type of function return, the calling and called routines must agree by prior arrangement which of these three major cases applies, and whether 64-bit descriptor forms may be used.
The following paragraphs describe the specialized requirements for each major case:
Dynamic Text
For dynamic text return by descriptor, the calling routine passes a valid (completely initialized) dynamic string descriptor (DSC$B_CLASS = DSC$K_CLASS_D). The called routine must assign a value to the variable represented by this descriptor using the same rules that apply to a dynamic text descriptor used as an ordinary parameter.
Return Object Created by Calling Routine
For a return object created by the calling routine, the calling routine passes a descriptor in which all fields are completely loaded.
The called routine must supply a return value that satisfies that description. In particular, the called routine must truncate or pad the returned value to satisfy the requirements of the descriptor according to the semantics of the language in which the called routine is written.
The calling and called routines must agree by prior arrangement on the DSC$B_CLASS and DSC$B_DTYPE of descriptor to be used.
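The truncate-or-pad requirement can be illustrated for the simple case of a fixed-length byte string. This is only a sketch: the standard leaves the exact semantics to the language of the called routine, so the blank pad character and the function name `fit_to_descriptor` are assumptions made for illustration.

```python
def fit_to_descriptor(value: bytes, length: int, pad: bytes = b" ") -> bytes:
    """Truncate or pad `value` to exactly `length` bytes.

    `length` stands for the length the caller loaded into the descriptor.
    The blank pad byte is an assumed language rule, not a requirement of
    the calling standard.
    """
    if len(pad) != 1:
        raise ValueError("pad must be a single byte")
    return value[:length].ljust(length, pad)
```

A five-byte value returned through a three-byte descriptor is cut to three bytes; a two-byte value returned through a four-byte descriptor is padded with two blanks.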
Return Object Created by Called Routine
For a return object created by the called routine, the calling routine passes a descriptor in which the fields are set as follows:
DSC$A_POINTER field is set to 0.
DSC$B_CLASS field is loaded.
DSC$B_DTYPE field is loaded.
DSC$B_DIMCT field is loaded and the DSC$B_AFLAGS field is set to 0 if the descriptor is an array descriptor.
All other fields are unpredictable.
If the passed descriptor is an array descriptor, it must contain space for bounds information to be returned even though the DSC$B_AFLAGS field is set to 0.
The called routine must return the function value using stack return conventions and load the DSC$A_POINTER field to point to the returned data. Other descriptor information, such as origin, bounds (if supplied), and DSC$B_AFLAGS fields must be filled in appropriately to correspond to the returned data.
An important implication of a call that uses this kind of value return is that the stack pointer normally is not restored to its value prior to the call as part of the return from the called procedure. The returned value typically (but not necessarily) is left by the called routine somewhere on the stack. For that reason, this mechanism is sometimes known as the stack return mechanism.
After a return of this type, the calling routine must assume that the stack has been extended by some unknown amount (or possibly none) by the called procedure. In particular, the stack cannot be cut back until the returned value is no longer needed (which may be ensured by copying it to another location).
However, this type of return does not imply that the actual storage used by the called routine to hold the returned value must be at the address pointed to by the stack pointer; it need not even be on the stack. It could be in some read-only, static memory. (This latter case might arise when the returned value is constant or is obtained from some constant structure). For this reason, the calling routine must not assume that the data described by the return descriptor is writable.
3.8. Data Allocation
This section describes the standard static data requirements that define the Alpha alignment of data structures, record formats, and record layout. These conventions help to ensure proper data compatibility with all OpenVMS Alpha and VAX languages.
3.8.1. Data Alignment
In the Alpha environment, memory references to data that is not naturally aligned can result in alignment faults, which can severely degrade the performance of all procedures that reference the unaligned data.
Data Type | Alignment Starting Position |
---|---|
8-bit character string | Byte boundary |
16-bit integer | Address that is a multiple of 2 (word alignment) |
32-bit integer | Address that is a multiple of 4 (longword alignment) |
64-bit integer | Address that is a multiple of 8 (quadword alignment) |
 | Address that is a multiple of 4 (longword) |
 | Address that is a multiple of 8 (quadword) |
 | Address that is a multiple of 8 (quadword) |
 | Address that is a multiple of 4 (longword) |
 | Address that is a multiple of 8 (quadword) |
 | Address that is a multiple of 16 (octaword) |
For aggregates such as strings, arrays, and records, the data type to be considered for purposes of alignment is not the aggregate itself, but rather the elements of which the aggregate is composed. The alignment requirement of an aggregate is that all elements of the aggregate be naturally aligned. For example, varying 8-bit character strings must start at addresses that are a multiple of at least 2 (word alignment) because of the 16-bit count at the beginning of the string; 32-bit integer arrays start at a longword boundary, irrespective of the extent of the array.
The rules for passing a record in an argument that is passed by immediate value (see Section 3.7.5.1, “Sending Mechanism”) always provide quadword alignment of the record value independent of the normal alignment requirement of the record. If deemed appropriate by an implementation, normal alignment can be established within the called procedure by making a copy of the record argument at a suitably aligned location.
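The natural alignment test and the aggregate rule can be expressed in a few lines. This sketch assumes that each element's natural alignment equals its size in bytes, which holds for the integer and character rows of the table above; both function names are hypothetical.

```python
def is_naturally_aligned(address: int, size_in_bytes: int) -> bool:
    """True if `address` is a multiple of the element size."""
    return address % size_in_bytes == 0


def aggregate_alignment(element_sizes) -> int:
    """Alignment of an aggregate: the strictest of its elements."""
    return max(element_sizes)
```

A varying 8-bit character string has a 16-bit count followed by 8-bit characters, so `aggregate_alignment([2, 1])` is 2 (word alignment). An array of 32-bit integers yields 4 whatever its extent.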
3.8.2. Record Layout Conventions
The OpenVMS Alpha calling standard rules for record layout are designed to provide good run-time performance on all implementations of the Alpha architecture and to provide the required level of compatibility with conventional VAX operating environments.
Those optimized for optimal access characteristics (referred to as aligned record layouts)
Those compatible with conventions that are traditionally used by VAX languages (referred to as VAX compatible record layouts)
Only these two record layouts may be used across standard interfaces or between languages. Languages can support other language-specific record layout conventions, but such layouts are nonstandard.
The aligned record layout conventions should be used unless interchange is required with conventional VAX applications that use the OpenVMS VAX compatible record layouts.
3.8.2.1. Aligned Record Layout
All components of a record or subrecord are naturally aligned.
Layout and alignment of record elements and subrecords are independent of any record or subrecord in which they are embedded.
Layout and alignment of a subrecord is the same as if it were a top-level record.
Declaration in high-level languages of standard records for interlanguage use is straightforward and obvious, and meets the requirements for source-level compatibility between Alpha and VAX languages.
The components of a record must be laid out in memory corresponding to the lexical order of their appearance in the high-level language declaration of the record.
The first bit of a record or subrecord must be directly addressable (byte aligned).
Records and subrecords must be aligned according to the largest natural alignment requirements of the contained elements and subrecords.
Bit fields (packed subranges of integers) are characterized by an underlying integer type that is a byte, word, longword, or quadword in size together with an allocation size in bits. A bit field is allocated at the next available bit boundary, provided that the resulting allocation does not cross an alignment boundary of the underlying type. Otherwise, the field is allocated at the next byte boundary that is aligned as required for the underlying type. (In the latter case, the space skipped over is left permanently not allocated). In addition, if necessary, the alignment of the record as a whole is increased to that of the underlying integer type.
Unaligned bit strings, unaligned bit arrays, and elements of unaligned bit arrays must start at the next available bit in the record. No fill is ever supplied preceding an unaligned bit string, unaligned bit array, or unaligned bit array element.
All other components of a record must start at the next available naturally aligned address for the data type.
The length of a record must be a multiple of its alignment. (This includes the case when a record is a component of another record).
Strings and arrays must be aligned according to the natural alignment requirements of the data type of which the string or array is composed.
The length of an array element is a multiple of its alignment, even if this leaves unused space at its end. The length of the whole array is the sum of the lengths of its elements.
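The aligned layout rules for a record made only of scalar components can be sketched as follows. The sketch deliberately ignores bit fields, unaligned bit strings, arrays, and subrecords, and it assumes each scalar's natural alignment equals its size; the function name and the field names in the example are hypothetical.

```python
def aligned_layout(fields):
    """Lay out scalar fields using the aligned record conventions.

    fields -- list of (name, size_in_bytes) in declaration order
    Returns (offsets, record_alignment, record_size).
    """
    offset = 0
    alignment = 1
    offsets = {}
    for name, size in fields:
        # Start at the next naturally aligned address for this type.
        offset = (offset + size - 1) // size * size
        offsets[name] = offset
        offset += size
        # The record takes the largest alignment of its components.
        alignment = max(alignment, size)
    # The record length must be a multiple of its alignment.
    record_size = (offset + alignment - 1) // alignment * alignment
    return offsets, alignment, record_size
```

For the fields `[("flag", 1), ("count", 4), ("id", 2), ("total", 8)]` this yields offsets 0, 4, 8, and 16, a record alignment of 8, and a record size of 24 bytes.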
3.8.2.2. OpenVMS VAX Compatible Record Layout
The components of a record must be laid out in memory corresponding to the lexical order of their appearance in the high-level language declaration of the record.
Unaligned bit strings, unaligned bit arrays, and elements of unaligned bit arrays must start at the next available bit in the record. No fill is ever supplied preceding an unaligned bit string, unaligned bit array, or unaligned bit array element.
All other components of a record must start at the next available byte in the record. Any unused bits following the last-used bit in the last-used byte of each component must be filled out to the next byte boundary so that any following data starts on a byte boundary.
Subrecords must be aligned according to the largest alignment of the contained elements and subrecords. A subrecord always starts at the next available byte unless it consists entirely of unaligned bit data and it immediately follows an unaligned bit string, unaligned bit array, or a subrecord consisting entirely of unaligned bit data.
Records must be aligned on byte boundaries.
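For a record whose components are all whole bytes, the VAX compatible rules reduce to packing each component at the next available byte. The sketch below covers only that case (no unaligned bit strings or subrecords); the function name and the example field names are hypothetical.

```python
def vax_compatible_layout(fields):
    """Lay out byte-sized fields using the VAX compatible conventions.

    fields -- list of (name, size_in_bytes) in declaration order
    Returns (offsets, record_size).
    """
    offset = 0
    offsets = {}
    for name, size in fields:
        offsets[name] = offset  # next available byte, no fill inserted
        offset += size
    return offsets, offset
```

For the fields `[("flag", 1), ("count", 4), ("id", 2), ("total", 8)]` the offsets are 0, 1, 5, and 7 and the record occupies 15 bytes, so the longword, word, and quadword components are not naturally aligned.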
3.9. Multithreaded Execution Environments
This section defines the conventions to support the execution of multiple threads in a multilanguage Alpha environment. Specifically defined is how compiled code must perform stack limit checking. While this standard is compatible with a multithreaded execution environment, the detailed mechanisms, data structures, and procedures that support this capability are not specified in this manual.
There can be one or more threads executing within a single process.
The state of a thread is represented in a thread environment block (TEB).
The TEB of a thread contains information that determines a stack limit below which the stack pointer must not be decremented by the executing code (except for code that implements the multithread mechanism itself).
Exception handling is fully reentrant and multithreaded.
3.9.1. Stack Limit Checking
A program that is otherwise correct can fail because of stack overflow. Stack overflow occurs when extension of the stack (by decrementing the stack pointer, SP) allocates addresses not currently reserved for the current thread's stack.
Detection of a stack overflow situation is necessary because a thread, attempting to write into stack storage, could modify data allocated in that memory for some other purpose. This would most likely produce unpredictable and undesirable results or irreproducible application failures.
The requirements for procedures that can execute in a multithread environment include checking for stack overflow. This section defines the conventions for stack limit checking in a multithreaded program environment.
In the following sections, the term new stack region refers to the region of the stack from one less than the old value of SP to the new value of the SP.
Stack Guard Region
In a multithread environment, the memory beyond the limit of each thread's stack is protected by contiguous guard pages, which form the stack's guard region.
Stack Reserve Region
In some cases, it is desirable to maintain a stack reserve region, which is a minimum-sized region that is immediately above a thread's guard region. A reserve region may be desirable to ensure that exceptions or asynchronous system traps (ASTs) have stack space to execute on a thread's stack, or to ensure that the exception dispatcher and any exception handler that it might call have stack space to execute after detection of an invalid attempt to extend the stack.
This standard does not require a reserve region.
3.9.1.1. Methods for Stack Limit Checking
Because accessible memory may be available at addresses lower than those occupied by the guard region, compilers must generate code that never extends the stack past the guard pages into accessible memory that is not allocated to the thread's stack.
Note
An access can be performed by using either a load or a store operation; however, be sure to use an instruction that is guaranteed to make an access to memory. For example, do not use an LDQ R31,* instruction, because the Alpha architecture does not allow any memory access, even a read access, whose result is discarded because of the R31 destination.
This standard defines two methods for stack limit checking: implicit and explicit.
Implicit Stack Limit Checking
If the lowest addressed byte of the new stack region is guaranteed to be accessed prior to any further stack extension, then the stack can be extended by an increment that is equal in size to the guard region (without any further accesses).
If some byte (not necessarily the lowest) of the new stack region is guaranteed to be accessed prior to any further stack extension, then the stack can be extended by an increment that is equal in size to one-half the guard region (without any further accesses).
The stack frame format (see Section 3.4.3, “Stack Frame Format”) and entry code rules (see Section 3.6.5, “Entry and Exit Code Sequences”) generally do not ensure access to the lowest address of a new stack region without introducing an extra access solely for that purpose. Consequently, this standard uses the second strategy. While the amount of implicit stack extension that can be achieved is smaller, the check is achieved at no additional cost.
This standard requires that the minimum guard region size is 8192 bytes, the size of the smallest memory protection granularity allowed by the Alpha architecture.
Explicit stack limit checking must be performed unless the amount by which the SP is decremented is known to be less than or equal to 4096 and a reserve region is not required.
Some byte in the new stack region must be accessed before the SP can be decremented for a subsequent stack extension.
This access can be performed either before or after the SP is decremented for this stack extension, but it must be done before the SP can be decremented again.
No standard procedure call can be made before some byte in the new stack region is accessed.
The system exception dispatcher ensures that the lowest addressed byte in the new stack region is accessed if any kind of asynchronous interrupt occurs after the SP is decremented, but before the access in the new stack region occurs.
These conventions ensure that the stack pointer is not decremented so that it points to accessible storage beyond the stack limit without this error being detected (either by the guard region being accessed by the thread or by an explicit stack limit check failure).
As a matter of practice, the system can provide multiple guard pages in the guard region. When a stack overflow is detected as a result of access to the guard region, one or more guard pages can be unprotected for use by the exception handling facility, and one or more guard pages can remain protected to provide implicit stack limit checking during exception processing. However, the size of the guard region and the number of guard pages is system defined and is not defined by this standard.
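The decision between implicit and explicit checking stated above can be written as a small predicate. This is a sketch of the rule as worded in this section only; the function and parameter names are hypothetical, and an unknown decrement is represented by `None`.

```python
MIN_GUARD_REGION = 8192                          # bytes, required minimum
MAX_IMPLICIT_EXTENSION = MIN_GUARD_REGION // 2   # 4096 bytes


def needs_explicit_check(decrement, reserve_region_required=False):
    """Decide whether a stack extension needs an explicit limit check.

    decrement -- bytes subtracted from SP, or None if not known
    """
    if reserve_region_required:
        return True
    if decrement is None:
        return True
    return decrement > MAX_IMPLICIT_EXTENSION
```

A frame of 4096 bytes or less with no reserve region can rely on the implicit check; anything larger, or of unknown size, cannot.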
Explicit Stack Limit Checking
If the stack is being extended by an amount of unknown size or by a known size greater than the maximum implicit check size (4096), then a code sequence that follows the rules for implicit stack limit checking can be executed in a loop to access the new stack region incrementally in segments less than or equal to the minimum page size (8192 bytes). At least one access must occur in each such segment.
The first access must occur between SP and SP-4096 because, in the absence of more specific information, the previous guaranteed access relative to the current stack pointer may be as much as 4096 bytes greater than the current stack pointer address.
The last access must be within 4096 bytes of the intended new value of the stack pointer. These accesses must occur in order, starting with the highest addressed segment and working toward the lowest addressed segment.
Perform a read access using the intended new value of the stack pointer. This is nondestructive, even if the read is beyond the stack guard region, and may facilitate OS mapping of new stack pages, if appropriate, in a single operation.
Proceed with sequential accesses as just described.
Note
A simple algorithm that is consistent with this requirement (but achieves up to twice the minimum number of accesses) is to perform a sequence of accesses in a loop starting with the previous value of SP, decrementing by the minimum no-check extension size (4096) to, but not including, the first value that is less than the new value for the stack pointer.
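The simple loop in this note can be modeled by listing the addresses it would touch. This sketch only computes the probe addresses; it performs no memory access, and the function name is hypothetical.

```python
NO_CHECK_EXTENSION = 4096  # minimum no-check extension size, in bytes


def probe_addresses(old_sp: int, new_sp: int):
    """Addresses accessed, highest first, when extending from old_sp to new_sp."""
    if new_sp > old_sp:
        raise ValueError("the stack grows toward lower addresses")
    # Start at the previous SP and step down by 4096, stopping before the
    # first value that is less than the new SP.
    return list(range(old_sp, new_sp - 1, -NO_CHECK_EXTENSION))
```

Extending a stack from 131072 down by 10000 bytes gives probes at 131072, 126976, and 122880. The last probe is within 4096 bytes of the new stack pointer (121072), and the probes run from the highest address toward the lowest, as required.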
Note
An explicit stack limit check can be performed either by inline code that is part of a prologue or by a run-time support routine that is tailored to be called from a procedure prologue.
Stack Reserve Region Checking
The size of the reserve region must be included in the increment size used for stack limit checks, after which it is not included in the amount by which the stack is actually extended. (Depending on the size of the reserve region, this may partially or even completely eliminate the ability to use implicit stack limit checking).
3.9.1.2. Stack Overflow Handling
When a stack overflow is detected, one of the following results:
Exception SS$_ACCVIO may be raised.
The system may transparently extend the thread's stack, reset the TEB stack limit value appropriately, and continue execution of the thread.
Note that if a transparent stack extension is performed, a stack overflow that occurs in a called procedure might cause the stack to be extended. Therefore, the TEB stack limit value must be considered volatile and potentially modified by external procedure calls and by handling of exceptions.
Chapter 4. OpenVMS I64 Conventions
This chapter describes the fundamental concepts and conventions for calling a procedure in an OpenVMS I64 environment.
4.1. I64 Register Usage
General
Floating-point
Predicate
Branch
Application
4.1.1. I64 Register Classes
Scratch registers—may be modified by a procedure call; the caller must save these registers before a call if needed (caller save).
Preserved registers—must not be modified by a procedure call; the callee must save and restore these registers if used (callee save). A procedure using one of the preserved general registers must save and restore the caller's original contents, including the NaT bits associated with the registers, without generating a NaT consumption fault.
One way to preserve a register is not to use it at all.
Automatic registers—saved and restored automatically by the hardware call/return mechanism.
Constant or Read-only registers—contain a fixed value that cannot be changed by the program.
Special registers—used in the calling standard call/return mechanism.
Global registers—shared across a set of cooperating routines as global static storage that happens to be allocated in a register. (Details regarding the dynamic lifetime of such storage are not addressed here).
Special registers—used in the calling standard call/return mechanism. (These are the same as the set of special registers in the preceding list of registers used within a procedure).
Input registers—may be used to pass information into a procedure (in addition to the normal stacked input registers).
Output registers—may be used to pass information back from a called procedure to its caller (in addition to the normal return value registers).
Volatile registers—may be used as scratch registers within a procedure and are not preserved across a call; may not be used to pass information between procedures either as input or output.
4.1.2. I64 General Register Usage
Register | Class | Usage |
---|---|---|
R0 | Constant | Always 0. |
R1 | Special | Global data pointer (GP). Designated to hold the address of the currently addressable global data segment. Its use is subject to the following conventions: The effect of these rules is that GP must be treated as a scratch register at a point of call (that is, it must be saved by the caller), and it must be preserved from entry to exit. |
R2 | Volatile | May not be used to pass information between procedures, either as inputs or outputs. See also Section 4.1.9, “Additional Register Usage Information”. |
R3 | Scratch | May be used within and between procedures in any mutually consistent combination of ways under explicit user control. See also Section 4.1.9, “Additional Register Usage Information”. |
R4-R7 | Preserved | General-purpose preserved registers. Used for any value that needs to be preserved across a procedure call. May be used within and between procedures in any mutually consistent combination of ways under explicit user control. See also Section 4.1.9, “Additional Register Usage Information”. |
R8-R9 | Scratch | Return value. Can also be used as input (whether or not the procedure has a return value), but not in any additional ways. In addition, R9 is the preferred and recommended register to use when passing the environment value when calling a bound procedure. (See Section 4.7.7, “Simple and Bound Procedures” and Section 6.1.2, “Translated Images on I64 Systems”). |
R10-R11 | Scratch | May be used within and between procedures in any mutually consistent combination of ways under explicit user control. See also Section 4.1.9, “Additional Register Usage Information”. |
R12 | Special | Memory stack pointer (SP). Holds the lowest address of the current stack frame. At a call, the stack pointer must point to a 0 mod 16 aligned area. The stack pointer is also used to access any memory arguments upon entry to a function. Except in the case of dynamic stack allocation, code can use the stack pointer to reference stack items without having to set up a frame pointer for this purpose. |
R13 | Special | Reserved as a thread pointer (TP). |
R14-R18 | Volatile | May not be used to pass information between procedures, either as inputs or outputs. See also Section 4.1.9, “Additional Register Usage Information”. |
R19-R24 | Scratch | May be used within and between procedures in any mutually consistent combination of ways under explicit user control. See also Section 4.1.9, “Additional Register Usage Information”. |
R25 | Special | Argument information (see Section 4.7.5.3, “Argument Information (AI) Register”). |
R26-R31 | Scratch | May be used within and between procedures in any mutually consistent combination of ways under explicit user control. See also Section 4.1.9, “Additional Register Usage Information”. |
IN0-IN7 | Automatic | Stacked input registers. Code may allocate a register stack frame of up to 96 registers with the ALLOC instruction, and partition this frame into three regions: input registers (IN0, IN1, ...), local registers (LOC0, LOC1, ...), and output registers (OUT0, OUT1, ...). R32-R39 (IN0-IN7) are used as incoming argument registers. Arguments beyond these registers appear in memory, as explained in Section 4.7.4, “Parameter Passing”. |
LOC0-LOC95 | Automatic | Stacked local registers. Code may allocate a register stack frame of up to 96 registers with the ALLOC instruction, and partition this frame into three regions: input registers (IN0, IN1, ...), local registers (LOC0, LOC1, ...), and output registers (OUT0, OUT1, ...). LOC0-LOC95 are used for local storage. See Section 4.7.4, “Parameter Passing” for more information. |
OUT0-OUT7 | Scratch | Stacked output registers. Code may allocate a register stack frame of up to 96 registers with the ALLOC instruction, and partition this frame into three regions: input registers (IN0, IN1, ...), local registers (LOC0, LOC1, ...), and output registers (OUT0, OUT1, ...). OUT0-OUT7 are used to pass the first eight arguments in calls. See Section 4.7.4, “Parameter Passing” for more information. |
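The partition of a stacked register frame into input, local, and output regions can be sketched by mapping a symbolic name such as LOC0 to its general register number. The 96-register limit and the R32 starting point come from the table above; the placement of the regions in the order input, local, output is taken from the Itanium architecture rather than from this section, and the function and parameter names are hypothetical.

```python
FIRST_STACKED_REGISTER = 32   # IN0 is R32
MAX_FRAME_REGISTERS = 96      # largest frame ALLOC can create


def stacked_register(region, index, n_in, n_loc, n_out):
    """General register number of IN/LOC/OUT `index` in a frame.

    n_in, n_loc, n_out -- sizes of the three regions of the frame
    """
    if n_in + n_loc + n_out > MAX_FRAME_REGISTERS:
        raise ValueError("frame exceeds 96 stacked registers")
    sizes = {"IN": n_in, "LOC": n_loc, "OUT": n_out}
    bases = {"IN": 0, "LOC": n_in, "OUT": n_in + n_loc}
    if region not in sizes:
        raise ValueError("region must be IN, LOC, or OUT")
    if not 0 <= index < sizes[region]:
        raise ValueError("index outside the region")
    return FIRST_STACKED_REGISTER + bases[region] + index
```

In a frame with 8 inputs, 4 locals, and 2 outputs, IN7 is R39, LOC0 is R40, and OUT0 is R44.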
4.1.3. I64 Floating-Point Register Usage
Register | Class | Usage |
---|---|---|
F0 | Constant | Always 0.0. |
F1 | Constant | Always 1.0. |
F2-F5 | Preserved | Can be used for any value that needs to be preserved across a procedure call. A procedure using one of the preserved floating-point registers must save and restore the caller's original contents without generating a NaT consumption fault. |
F6-F7 | Scratch | May be used within and between procedures in any mutually consistent combination of ways under explicit user control. |
F8-F9 | Scratch | Argument/Return values. See Section 4.7.4, “Parameter Passing” and Section 4.7.6, “Return Values” for the OpenVMS specifications for use of these registers. |
F10-F15 | Scratch | Argument values. See Section 4.7.4, “Parameter Passing” for the OpenVMS specifications for use of these registers. |
F16-F31 | Preserved | Can be used for any value that needs to be preserved across a procedure call. A procedure using one of the preserved floating-point registers must save and restore the caller's original contents without generating a NaT consumption fault. |
F32-F127 | Scratch | Rotating registers or scratch registers. |
Note
VAX floating-point data is never loaded or manipulated in the Itanium floating-point registers. However, VAX floating-point values may be converted to IEEE floating-point values, which are then manipulated in the I64 floating-point registers.
4.1.4. I64 Predicate Register Usage
Register | Class | Usage |
---|---|---|
P0 | Constant | Always 1. |
P1-P5 | Preserved | Can be used for any predicate value that needs to be preserved across a procedure call. A procedure using one of the preserved predicate registers must save and restore the caller's original contents. |
P6-P13 | Scratch | Can be used within a procedure as a scratch register. |
P14-P15 | Volatile | May not be used to pass information between procedures, either as input or output. See also Section 4.1.9, “Additional Register Usage Information”. |
P16-P63 | Preserved | Rotating registers. |
4.1.5. I64 Branch Register Usage
Register | Class | Usage |
---|---|---|
B0 | Scratch | Contains the return address on entry to a procedure; otherwise a scratch register. |
B1-B5 | Preserved | Can be used for branch target addresses that need to be preserved across a procedure call. |
B6-B7 | Volatile | May not be used to pass information between procedures, either as input or output. See also Section 4.1.9, “Additional Register Usage Information”. |
4.1.6. I64 Application Register Usage
Register | Class | Usage |
---|---|---|
AR.FPSR | See Usage | Floating-point status register. This register is divided into the following fields: |
AR.RNAT | Automatic | RSE NaT collection register. Holds the NaT bits for values stored by the register stack engine. These bits are saved automatically in the register stack backing store. |
AR.UNAT | Preserved | User NaT collection register. Holds the NaT bits for values stored by the ST8.SPILL instruction. As a preserved register, it must be saved before a procedure can issue any ST8.SPILL instructions. The saved copy of AR.UNAT in a procedure's frame holds the NaT bits from the registers spilled by its caller; these NaT bits are thus associated with values local to the caller's caller. |
AR.PFS | Special | Previous function state. Contains information that records the state of the caller's register stack frame and epilogue counter. It is overwritten on a procedure call; therefore, it must be saved before issuing any procedure calls, and restored prior to returning. |
AR.BSP | Read-only | Backing store pointer. Contains the address in the backing store corresponding to the base of the current frame. This register may be modified only as a side effect of writing AR.BSPSTORE while the Register Stack Engine (RSE) is in enforced lazy mode. |
AR.BSPSTORE | Special | Backing store pointer. Contains the address of the next RSE store operation. It may be read or written only while the RSE is in enforced lazy mode. Under normal operation, this register is managed by the RSE, and application code should not write to it, except when performing a stack switching operation. |
AR.RSC | See Usage | RSE control; the register stack configuration register. This register is divided into the following fields: |
AR.LC | Preserved | Loop counter. |
AR.EC | Automatic | Epilogue counter (preserved in AR.PFS). |
AR.CCV | Scratch | Compare and exchange comparison value. |
AR.ITC | Read-only | Interval time counter. |
AR.K0-AR.K7 | Read-only | Kernel registers. |
AR.CSD | Scratch | Reserved for use as implicit operand registers in future extensions to the Itanium architecture. To ensure forward compatibility, OpenVMS considers these registers as part of the thread and process state. |
AR.SSD | Scratch | Reserved for use as implicit operand registers in future extensions to the Itanium architecture. To ensure forward compatibility, OpenVMS considers these registers as part of the thread and process state. |
4.1.7. Floating-Point Status
The AR.FPSR hardware register
A supplementary software register (a quadword)
SYS$IEEE_SET_FP_CONTROL
SYS$IEEE_SET_PRECISION_MODE
SYS$IEEE_SET_ROUNDING_MODE
The AR.FPSR hardware register is described in the Intel IA-64 Architecture Software Developer's Manual. The supplementary software register is internal to OpenVMS and is not documented for general use. This register holds information used by OpenVMS to implement the three system services and floating-point exception handling generally. It can only be accessed indirectly using the system services.
Floating-point control status bits are those bits or flags that control the operation of floating-point arithmetic operations. These bits include the trap disable flags (traps.vd, .dd, .zd, .od, .ud, and .id) as well as the ftz, wre, pc, rc, and td fields in each of the status fields (sf0, sf1, sf2, and sf3) of the AR.FPSR hardware register.
Floating-point information status bits are those bits or flags that record summary information about the execution of previous floating-point arithmetic operations. These bits include the v, d, z, o, u, and i flags in each of the status fields (sf0, sf1, sf2, and sf3).
Note
The floating-point control status is sometimes informally also called the floating-point mode or IEEE mode.
Using a compiler or linker switch, you can associate a floating-point control status with the main procedure of a program to set the floating-point state prior to the beginning of program execution. If no control status is explicitly set, a default status appropriate for full IEEE computation is used.
Full IEEE-format floating-point control status—the default, unless the status is explicitly set to another value.
VAX-format floating-point control status—can be set for programs that use VAX-format floating-point processing.
Status Field | Flags | td | rc | pc | wre | ftz |
---|---|---|---|---|---|---|
sf0 | 000000 | 0 | 00 | 11 | 0 | 0 |
sf1 | 000000 | 1 | 00 | 11 | 1 | 0 |
sf2 and sf3 | 000000 | 1 | 00 | 11 | 0 | 0 |
global trap disable bits: .id, .ud, .od, .zd, .dd, .vd | 111111 | | | | | |
inherit floating-point mode on thread creation | 0 | | | | | |
Status Field | Flags | td | rc | pc | wre | ftz |
---|---|---|---|---|---|---|
sf0 | 000000 | 0 | 00 | 11 | 0 | 0 |
sf1 | 000000 | 1 | 00 | 11 | 1 | 0 |
sf2 and sf3 | 000000 | 1 | 00 | 11 | 0 | 0 |
global trap disable bits: .id, .ud, .od, .zd, .dd, .vd | 110010 | | | | | |
inherit floating-point mode on thread creation | 0 | | | | | |
For both IEEE-format and VAX-format floating-point processing, additional floating-point status settings may be available. See your compiler documentation for other optional settings.
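As an illustration of how the table entries above combine into a single AR.FPSR value, the following C sketch packs the fields into a quadword. The bit positions are an assumption taken from the Itanium architecture definition of AR.FPSR, not from this standard: trap disable bits in bits 0 through 5 (with .vd lowest and .id highest), and four 13-bit status fields starting at bits 6, 19, 32, and 45. The function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Pack one 13-bit status field (sf0..sf3).
   Assumed layout (Itanium architecture, not defined by this standard):
   ftz = bit 0, wre = bit 1, pc = bits 2-3, rc = bits 4-5,
   td = bit 6, flags = bits 7-12. */
static uint64_t fpsr_status_field(unsigned flags, unsigned td, unsigned rc,
                                  unsigned pc, unsigned wre, unsigned ftz)
{
    assert(flags < 64 && td < 2 && rc < 4 && pc < 4 && wre < 2 && ftz < 2);
    return (uint64_t)ftz
         | ((uint64_t)wre << 1)
         | ((uint64_t)pc << 2)
         | ((uint64_t)rc << 4)
         | ((uint64_t)td << 6)
         | ((uint64_t)flags << 7);
}

/* Combine the trap disable bits and the four status fields.
   Assumed positions: traps at bit 0, sf0 at 6, sf1 at 19,
   sf2 at 32, sf3 at 45. */
static uint64_t fpsr_compose(unsigned traps, uint64_t sf0, uint64_t sf1,
                             uint64_t sf2, uint64_t sf3)
{
    assert(traps < 64);
    return (uint64_t)traps | (sf0 << 6) | (sf1 << 19) | (sf2 << 32) | (sf3 << 45);
}

/* Build the value described by the tables: the status fields are the
   same in both tables, only the trap disable bits differ. */
static uint64_t fpsr_from_table(unsigned traps)
{
    uint64_t sf0  = fpsr_status_field(0, 0, 0, 3, 0, 0); /* td=0, pc=11        */
    uint64_t sf1  = fpsr_status_field(0, 1, 0, 3, 1, 0); /* td=1, pc=11, wre=1 */
    uint64_t sf23 = fpsr_status_field(0, 1, 0, 3, 0, 0); /* td=1, pc=11        */
    return fpsr_compose(traps, sf0, sf1, sf23, sf23);
}
```

Under these assumptions, `fpsr_from_table(0x3F)` corresponds to the table with trap disable bits 111111 and `fpsr_from_table(0x32)` to the table with trap disable bits 110010. The sketch only computes values; on OpenVMS the control status itself is changed through the system services, not by code like this.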
It is generally assumed that the initial floating-point control status will remain unchanged throughout execution of the whole program. However, a procedure (or cooperating group of procedures) may temporarily modify the floating-point control status provided the control status is restored to its value on entry. The control status can be restored by one of three methods: a normal return, resignalling, or unwinding for an exception. See Section 9.5.3.4, “Floating-Point Control Status (I64 and x86-64)” for additional information.
Because the floating-point control status can vary and can be changed dynamically (even if later restored), the state of the floating-point control status is generally indeterminate when a routine (especially a shared library routine) is called. Usually this is acceptable. For example, returning a NaN and raising an exception are both valid ways to handle exceptional conditions. However, if correct operation of a routine depends on a particular floating-point control setting, then the called routine must save the control status on entry, set the needed control status, perform its operation, and restore the control status when it exits. (Whether the information status is similarly saved and restored is unspecified.)
4.1.8. User Mask
BE (Big Endian Memory Access Enable) — This bit must never be set on OpenVMS.
UP (User Performance Monitor Enable) — This bit is reserved.
AC (Alignment Check) — The application may set or clear this bit as desired. If the AC bit is clear, an unaligned memory reference may cause the system to deliver an exception to the application, or the system may emulate the unaligned reference. If the AC bit is set, an unaligned reference will always cause the system to deliver an exception to the application. At program start, the value of this bit on OpenVMS is clear.
MFL/MFH (Lower/Upper floating-point registers written) — The application should not clear either of these bits unless the values in the corresponding registers are no longer needed (for example, it may clear the MFH bit when returning from a procedure, because the upper set of floating-point registers is all scratch). Doing so otherwise may cause unpredictable behavior.
4.1.9. Additional Register Usage Information
As described in earlier sections, some registers are volatile and cannot be used to communicate information between routines (see Tables 4.1, 4.3, and 4.4). For example, B6 is used by OTS$JUMP_TO_BPV (see Section 4.7.7, “Simple and Bound Procedures”).
Static general registers R17—R18
Predicate register P15
Branch register B7
For example, R17 and R18 are used by OTS$CALL_PROC (see Section 6.1.2.3, “Indirect Calls From Native to Translated Code”).
R3—R7
R10—R11
R19—R24
R26—R31
The normal or default use for these registers is shown in the Class column of Table 4.1, “I64 General Register Usage”. However, using suitable programming language features, it is valid for any of these registers to be used as preserved, scratch, input, output, global, or not used. In every case, the unwind information (see Section A.4, “Data Structures”) for each procedure must accurately describe the actual usage.
Registers R8 and R9 may also be used as inputs (whether or not the procedure has a return value), but not in any additional ways.
General registers whose class is described as constant, special, volatile, or automatic in Section 4.1.1, “I64 Register Classes” cannot be used in any other way.
Floating-point, predicate, branch, and application registers can be used only according to the class described in Sections 4.1.2 through 4.1.6.
4.2. Address Representation
An address is a 64-bit value used to denote a position in memory. However, for compatibility with OpenVMS VAX and Alpha, many OpenVMS applications and user-mode facilities operate in such a manner that addresses are restricted to values that are representable in 32 bits. This means that OpenVMS addresses can often be stored and manipulated as 32-bit longword values. In such cases, the 32-bit address value is always implicitly or explicitly sign-extended to form a 64-bit address for use by the Itanium hardware.
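The sign-extension rule above can be sketched in C as follows. The function names are hypothetical and are used only for illustration. The conversion through `int32_t` assumes a two's-complement implementation, which holds for the compilers in common use.

```c
#include <assert.h>
#include <stdint.h>

/* Form a 64-bit address from a 32-bit longword by sign extension:
   bit 31 of the longword is replicated into bits 32 through 63. */
static uint64_t address_from_longword(uint32_t longword)
{
    return (uint64_t)(int64_t)(int32_t)longword;
}

/* An address is representable in 32 bits if sign-extending its low
   longword reproduces the full 64-bit value. */
static int address_fits_longword(uint64_t address)
{
    return address_from_longword((uint32_t)address) == address;
}

/* Narrow an address to a longword; valid only for representable addresses. */
static uint32_t longword_from_address(uint64_t address)
{
    assert(address_fits_longword(address));
    return (uint32_t)address;
}
```

Note that a longword with bit 31 set denotes an address at the top of the 64-bit address space, not an address just above 2 GB.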
4.3. Procedure Representation
A procedure value, sometimes called a function pointer, is a value that uniquely identifies a procedure and can be used to call it.
For OpenVMS, a procedure value is the address of a function descriptor, which consists of at least two quadword fields: the address of the entry point and the GP value required by that procedure.
Every procedure whose address is taken, or might be taken, must have a unique official function descriptor. The address of this function descriptor is used for the procedure value that is passed as a parameter or when two procedure values are compared. For other purposes, additional local function descriptors may be used for efficiency (notably in images other than the image that contains the procedure).
An official function descriptor for any procedure which might be callable from a VAX or Alpha translated image must include signature information. A local function descriptor used to call a procedure that might be part of a VAX or Alpha translated image must also include additional fields to facilitate the call. Both of these cases are described in Section 6.1.2, “Translated Images on I64 Systems”.
A function descriptor for a bound procedure uses a special pseudo-GP value and includes an uplevel frame pointer. Such function descriptors are described in Section 4.7.7, “Simple and Bound Procedures”.
Kinds and Roles | Size (Quadwords) |
---|---|
Local function descriptor without translated image support | 2 |
Local function descriptor with translated image support (jacket function descriptor) | 4 |
Official function descriptor without translated image support | 3 |
Official function descriptor with translated image support | 3 |
Bound function descriptor | 6 |
Note that the different kinds of function descriptor are not self-identifying (that is, they do not contain any form of tag or kind field).
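A minimal C sketch of the two-quadword form of a function descriptor follows. The type and field names are hypothetical; the standard specifies only that the descriptor holds the entry point address and the GP value, and the order shown here is an assumption. The larger kinds listed in the table carry additional fields that are not modeled.

```c
#include <assert.h>
#include <stdint.h>

/* Smallest function descriptor: two quadwords (hypothetical names). */
typedef struct {
    uint64_t entry_point; /* address of the procedure's entry point */
    uint64_t gp_value;    /* GP value required by the procedure     */
} function_descriptor;

_Static_assert(sizeof(function_descriptor) == 2 * sizeof(uint64_t),
               "descriptor must occupy exactly two quadwords");

/* A procedure value is the address of a function descriptor. */
typedef const function_descriptor *procedure_value;

/* Procedure values are compared by descriptor address, which is why each
   procedure needs a single official descriptor for this purpose. */
static int same_procedure(procedure_value a, procedure_value b)
{
    assert(a != 0 && b != 0);
    return a == b;
}
```

Because the comparison is by address, a local function descriptor with identical contents does not compare equal to the official one. This illustrates why only the official descriptor's address may be used as the procedure value.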
4.4. Procedure Types
Memory stack procedure—allocates a memory stack and may maintain part or all of its caller's context on that stack.
Register stack procedure—allocates only a register stack and maintains its caller's context in registers.
Null frame procedure—allocates neither a memory stack nor a register stack and therefore preserves no context of its caller.
Note
Unlike an Alpha null frame procedure (see Section 3.4, “Procedure Types” and Section 3.4.6, “Null Frame Procedures”), an I64 null frame procedure does not execute in the context of its caller because the Intel® Itanium® call instruction (br.call) changes the register set so that only the caller's output registers are accessible in the called routine. The caller's input and local registers cannot be accessed at all. The call instruction also changes the previous function state (AR.PFS) of the Itanium processor.
A compiler may choose which type of procedure to generate based on the requirements of the procedure in question. A calling procedure does not need to know what type of procedure it is calling.
Every memory stack procedure or register stack procedure must have an associated unwind description (see Appendix A, Stack Unwinding and Exception Handling on OpenVMS I64), which describes what type of procedure it is and other procedure characteristics. A null frame procedure may also have an associated unwind description. (If it does not, a default description applies.) This data structure is used to interpret the call stack at any given point in a thread's execution. It is typically built at compile time and usually is not accessed at run time except to support exception processing or other rarely executed code.
Read access to unwind descriptions is provided through the procedural interfaces described in Section 4.8, “Procedure Call Stack” and Section A.6, “Default Unwind Information”.
To make invocations of that procedure visible to and interpretable by facilities such as the debugger, exception handling system, and the unwinder.
To ensure that the context of the caller saved by the called procedure can be restored if an unwind occurs. (For a description of unwinding, see Section 9.7, “Request to Unwind from a Signal”).
4.5. Memory Stack
The memory stack is used for local dynamic storage, spilled registers, and parameter passing. It is organized as a stack of procedure frames, beginning with the main program's frame at the base of the stack, and continuing towards the top of the stack with nested procedure calls. At the top of the stack is the frame for the currently active procedure. (There may be some system-dependent frames at the base of the stack, prior to the main program's frame, but an application program may not make any assumptions about them).
The memory stack begins at an address determined by the operating system, and grows towards lower addresses in memory. The stack pointer register (SP) always points to the lowest address in the current, top-most, frame on the stack.
Each procedure creates its frame on entry by subtracting its frame size from the stack pointer, and removes its frame from the stack on exit by restoring the previous value of SP (usually by adding its frame size, but a procedure may save the original value of SP when its frame size may vary).
Because the register stack is also used for the same purposes as the memory stack, not all procedures need a memory stack frame. However, every non-leaf procedure must save at least its return link and the previous frame marker, either on the register stack or on the memory stack. This ensures that there is an invocation context for every non-leaf procedure on one or both of the stacks.
4.5.1. Procedure Frames
A memory stack procedure frame consists of five regions, as illustrated in Figure 4.1, “Procedure Frame”.
Scratch area. This 16-byte region is provided as scratch storage for procedures that are called by the current procedure. Leaf procedures need not allocate this region. A procedure may use the 16 bytes pointed to by the stack pointer (SP) as scratch memory, but the contents of this area are not preserved by a procedure call.
Outgoing parameters. Parameters in excess of those passed in registers are stored in this region of the stack frame. A procedure accesses its incoming parameters in the outgoing parameter region of its caller's stack frame.
Frame marker (optional). This region may contain information required for unwinding through the stack (for example, a copy of the previous stack pointer).
Dynamic allocation. This variable-sized region (initially zero length) can be created as needed.
Local storage. A procedure can store local variables, temporaries, and spilled registers in this region. For conventions affecting the layout of this area for spilled registers, see Section A.3, “Coding Conventions for Reliable Unwinding”.
Note
A stack pointer that is not octaword aligned is valid only in a variable-sized frame (see below) because the unwind descriptor (MEM_STACK_F, see Section A.4.1.3, “Descriptor Records for Prologue Regions”) for a fixed-size frame specifies the size in 16-byte units.
An application may not write to memory addresses lower than the stack pointer, because this memory area may be written to asynchronously (for example, as a result of exception processing).
Most procedures are expected to have a fixed-size frame, and the conventions are biased in favor of this. A procedure with a fixed-size frame may reference all regions of the frame with a compile-time constant offset relative to the stack pointer. Compilers should determine the total size required for each region, and pad the local storage area to make the total frame size a multiple of 16 bytes. The procedure can then create the frame by subtracting an immediate constant from the stack pointer in the prologue, and remove the frame by adding the same immediate constant to the stack pointer in the epilogue.
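The frame size computation for a fixed-size frame can be sketched as follows. This is an illustration only, with hypothetical function and parameter names; the region sizes are inputs that a compiler would determine for each procedure.

```c
#include <assert.h>
#include <stdint.h>

#define FRAME_ALIGNMENT   16u /* total frame size is a multiple of 16 bytes */
#define SCRATCH_AREA_SIZE 16u /* scratch area provided for called procedures */

/* Total size of a fixed-size frame. The padding needed to reach a
   multiple of 16 bytes is treated as part of the local storage area.
   A leaf procedure need not allocate the scratch area. */
static uint64_t fixed_frame_size(uint64_t outgoing_params, uint64_t frame_marker,
                                 uint64_t local_storage, int is_leaf)
{
    uint64_t scratch = is_leaf ? 0 : SCRATCH_AREA_SIZE;
    uint64_t unpadded = scratch + outgoing_params + frame_marker + local_storage;
    return (unpadded + FRAME_ALIGNMENT - 1) & ~(uint64_t)(FRAME_ALIGNMENT - 1);
}

/* Prologue: the stack grows toward lower addresses, so the frame is
   created by subtracting its size from SP. */
static uint64_t sp_after_prologue(uint64_t sp_on_entry, uint64_t frame_size)
{
    assert(frame_size % FRAME_ALIGNMENT == 0 && frame_size <= sp_on_entry);
    return sp_on_entry - frame_size;
}

/* Epilogue: the frame is removed by adding the same constant back. */
static uint64_t sp_after_epilogue(uint64_t sp_in_body, uint64_t frame_size)
{
    return sp_in_body + frame_size;
}
```

For example, a non-leaf procedure with 8 bytes of outgoing parameters, no frame marker, and 20 bytes of locals needs 44 bytes before padding, so `fixed_frame_size(8, 0, 20, 0)` yields a 48-byte frame.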
If a procedure has a variable-size frame (for example, a C routine that calls the alloca built-in), it should make a copy of SP to serve as a frame pointer before subtracting the initial frame size from the stack pointer. The procedure can then restore the previous value of the stack pointer in the epilogue without regard for how much dynamic storage has been allocated within the frame. It can also use the frame pointer to access the local storage region, because offsets from SP will vary.
The procedure uses an equivalent method of addressing the local storage region correctly before and after dynamic allocation.
The code satisfies the conditions imposed by the stack unwind mechanism.
To expand a stack frame dynamically, the scratch area, outgoing parameters, and frame marker regions (which are always located relative to the current stack pointer), must be relocated to the new top of stack. If the scratch area and outgoing parameter area are both clear of any live values, there is no actual work involved in relocating these areas. For procedures with dynamically-sized frames, it is recommended that the previous stack pointer value be stored in a local stacked general register instead of the frame marker, so that the frame marker is also empty. If the previous stack pointer is stored in the frame marker, the code must take care to ensure that the stack is always unwindable while the stack is being expanded (see Appendix A, Stack Unwinding and Exception Handling on OpenVMS I64).
Other issues depend on the compiler and the code being compiled. The standard calling sequence does not define a maximum stack frame size, nor does it restrict how a language system uses any stack frame region beyond those purposes described here. For example, the outgoing parameter region can be used as scratch storage whenever it is not needed for passing parameters.
4.5.2. Stack Overflow Detection
This section defines the conventions to support the execution of multiple threads in a multilanguage OpenVMS environment. Specifically defined is how compiled code must perform stack limit checking. While this standard is compatible with a multithreaded execution environment, the detailed mechanisms, data structures, and procedures that support this capability are not specified in this manual.
There can be one or more threads executing within a single process.
The state of a thread is represented in a thread environment block (TEB).
The TEB of a thread contains information that determines a stack limit below which the stack pointer must not be decremented by the executing code (except for code that implements the multithreaded mechanism itself).
Exception handling is fully reentrant and multithreaded.
4.5.2.1. Stack Limit Checking
A program that is otherwise correct can fail because of stack overflow. Stack overflow occurs when extension of the stack (by decrementing the stack pointer, SP) allocates addresses not currently reserved for the current thread's stack. This section defines the conventions for stack limit checking in a multithreaded environment.
In the following sections, the term new stack region refers to the region of the stack from one less than the old value of SP to the new value of SP.
Stack Guard Region
In a multithreaded environment, the address space beyond each thread's stack is protected by contiguous guard pages, which trap on any access. These pages form the stack guard region.
Stack Reserve Region
Exceptions or asynchronous system traps (ASTs, analogous to asynchronous signals) have stack space to execute on a thread's stack.
The exception dispatcher and any exception handler that it might call have stack space to execute after detection of an invalid attempt to extend the stack.
This calling standard does not require a stack reserve region, but it does allow a language (for example, Ada) and its run-time system to implement one.
4.5.2.1.1. Methods for Stack Limit Checking
Because accessible memory may be available at addresses lower than those occupied by the stack guard region, compilers must generate code that never extends the stack past the stack guard region into accessible memory that is not allocated to the thread's stack.
A general strategy to prevent extending the stack past the stack guard region is to access each page of memory down to and possibly including the page corresponding to the intended new value of the SP. If the stack is to be extended by an amount larger than the size of a memory page, then a series of accesses is required that works from higher to lower addressed pages. If any access results in a memory access violation, then the code has made an invalid attempt to extend the stack of the current thread.
This calling standard defines two methods for stack limit checking, implicit and explicit, which are explained in the following sections.
Implicit Stack Limit Checking
If a byte (not necessarily the lowest) of the new stack region is guaranteed to be accessed prior to any further stack extension, then the stack can be extended by an increment that is up to one-half the stack guard region (without any additional accesses).
This standard requires that the minimum stack guard region size is 8192 bytes.
Explicit stack limit checking must be performed unless the amount by which the SP is decremented is known to be less than or equal to 4096 and the application does not use a stack reserve region.
Some byte in the new stack region must be accessed before the SP can be further decremented for a subsequent stack extension.
This access can be performed either before or after the SP is decremented for this stack extension, but it must be done before the SP can be decremented again.
No standard procedure call can be made before some byte in the new stack region is accessed.
The system exception dispatcher ensures that the lowest addressed byte in the new stack region is accessed if any kind of asynchronous interrupt occurs both after the SP is decremented and before the access in the new stack region occurs.
These conventions ensure that the stack pointer is not decremented so that it points to accessible storage beyond the stack limit without this error being detected (either by the guard region being accessed by the thread or by an explicit stack limit check failure).
As a matter of practice, the system can provide multiple guard pages in the stack guard region. When a stack overflow is detected as a result of access to the stack guard region, one or more guard pages can be unprotected for use by the exception handling facility, as long as one or more guard pages remain protected to provide implicit stack limit checking during exception processing.
Explicit Stack Limit Checking
If the stack is being extended by an unknown amount or by a known amount that is greater than the maximum implicit check size (4096 bytes), then a code sequence that follows the rules for implicit stack limit checking can be executed in a loop to access the new stack region incrementally in segments that are less than or equal to the minimum stack guard region size (8192 bytes). At least one access must occur in each such segment.
The first access must occur between SP and SP-4096, because in the absence of more specific information, the previous guaranteed access relative to the current stack may be as much as 4096 bytes greater than the current stack pointer address.
The last access must be within 4096 of the intended new value of the stack pointer. These accesses must occur in order, starting with the highest addressed segment and working toward the lowest addressed segment.
Perform a read access using the intended new value of the stack pointer. This is nondestructive, even if the read is beyond the stack guard region, and may facilitate OS mapping of new stack pages, if appropriate, in a single operation.
Proceed with sequential accesses as just described.
Note
A simple algorithm that is consistent with this requirement (but may perform up to twice the minimum number of accesses) is to perform a sequence of accesses in a loop starting with the previous value of SP, decrementing by the minimum no-check extension size (4096) to, but not including, the first value that is less than the new value for the stack pointer.
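The simple algorithm in the preceding note can be sketched in C as shown below. This is a model, not prologue code: instead of reading each address, it records the addresses that would be accessed, so that the sequence can be inspected. The function name and the capacity parameter are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NO_CHECK_EXTENSION 4096u /* minimum no-check extension size */

/* Record the addresses that the simple explicit check would access,
   from the previous SP downward in steps of 4096, stopping before the
   first value that is less than the new SP. Returns the number of
   accesses; at most 'capacity' of them are stored in 'probes'. */
static size_t stack_probe_addresses(uint64_t old_sp, uint64_t new_sp,
                                    uint64_t *probes, size_t capacity)
{
    assert(new_sp <= old_sp); /* the stack grows toward lower addresses */
    size_t count = 0;
    uint64_t address = old_sp;
    for (;;) {
        if (count < capacity)
            probes[count] = address; /* real code would read this address */
        count++;
        /* Stop if the next step would wrap or fall below the new SP. */
        if (address < NO_CHECK_EXTENSION ||
            address - NO_CHECK_EXTENSION < new_sp)
            break;
        address -= NO_CHECK_EXTENSION;
    }
    return count;
}
```

The accesses are produced in order from the highest address to the lowest, and the last one always lies within 4096 bytes of the new stack pointer value, as the rules above require.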
Note
An explicit stack limit check can be performed either by inline code that is part of a prologue or by a run-time support routine that is tailored to be called from a procedure prologue.
Stack Reserve Region Checking
The size of the stack reserve region must be included in the increment size used for stack limit checks, after which it is not included in the amount by which the stack is actually extended. (Depending on the size of the stack reserve region, this may partially or even completely eliminate the ability to use implicit stack limit checking).
4.6. Register Stack
General registers R32 through R127 form a register stack that is automatically managed across procedure calls and returns. Each procedure frame on the register stack is divided into two dynamically-sized regions: one for input parameters and local variables, and one for output parameters.
On a procedure call, the registers are automatically renamed by the hardware so that the caller's output registers form the base of the register stack frame of the callee. On return, the registers are restored to the previous state, so that the input and local registers are preserved across the call.
The ALLOC instruction is used at the beginning of a procedure to allocate the input, local, and output regions; the sizes of these regions are supplied as immediate operands. A procedure is not required to issue an ALLOC instruction if it does not need to store any values in its register stack frame. It may write to the first N stacked registers, where N is the value of the argument count passed in the argument information (AI) register (see Section 4.7.5.3, “Argument Information (AI) Register”). It may not write to any other stack register without first issuing an ALLOC instruction.
ALLOC R36=rspfs, 4, 6, 5, 0
The actual registers to which the stacking registers are physically mapped are not directly addressable by the application software.
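To relate the ALLOC operands to register numbers, the following C sketch computes the layout of a register stack frame from the sizes of its regions. It assumes that the four immediate operands in the example above are, in order, the input, local, output, and rotating sizes, which follows the Itanium assembly syntax. The structure and function names are hypothetical, and the check that the rotating region fits within the frame is an assumption based on the Itanium architecture.

```c
#include <assert.h>

#define FIRST_STACKED_REGISTER 32u /* the register stack starts at R32   */
#define MAX_STACKED_REGISTERS  96u /* R32 through R127                   */

typedef struct {
    unsigned first_input;   /* first input register                    */
    unsigned first_local;   /* first local register                    */
    unsigned first_output;  /* first output register                   */
    unsigned last_register; /* last register in the frame              */
    unsigned frame_size;    /* inputs + locals + outputs               */
} register_frame_layout;

/* Compute the register numbers covered by each region of the frame. */
static register_frame_layout alloc_layout(unsigned inputs, unsigned locals,
                                          unsigned outputs, unsigned rotating)
{
    unsigned frame_size = inputs + locals + outputs;
    assert(frame_size <= MAX_STACKED_REGISTERS);
    assert(rotating % 8 == 0 && rotating <= frame_size);

    register_frame_layout layout;
    layout.first_input   = FIRST_STACKED_REGISTER;
    layout.first_local   = FIRST_STACKED_REGISTER + inputs;
    layout.first_output  = layout.first_local + locals;
    layout.last_register = FIRST_STACKED_REGISTER + frame_size - 1;
    layout.frame_size    = frame_size;
    return layout;
}
```

With the operands 4, 6, 5, 0, the inputs occupy R32 through R35, the locals R36 through R41, and the outputs R42 through R46. The destination register in the example, R36, is therefore the first local register.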
4.6.1. Input and Local Registers
The hardware makes no distinction between input and local registers. The caller's output registers automatically become the callee's register stack frame on a procedure call, with all registers initially allocated as output registers. An ALLOC instruction may increase or decrease the total size of the register stack frame, and may adjust the boundary between the input and local region and the output region.
The software conventions specify that up to eight general registers are used for parameter passing. Any registers in the input and local region beyond those eight may be allocated for use as preserved locals. Floating-point parameters may produce holes in the parameter list that is passed in the general registers; those unused input registers may also be used for preserved locals.
The caller's output registers do not need to be preserved for the caller. Once an input parameter is no longer needed, or has been copied elsewhere, that register may be reused for any other purpose within the procedure.
4.6.2. Output Registers
Up to eight output registers are used for passing parameters. If a procedure call requires fewer than eight general registers for its parameters, the calling procedure does not need to allocate more than are needed. If the called procedure expects more parameters, it will allocate extra input registers; these registers will be uninitialized.
A procedure may also allocate more than eight registers in the output region. While the extra registers may not be used for passing parameters, they can be used as extra scratch registers. On a procedure call, they will show up in the called procedure's output area as excess registers, and may be modified by that procedure. The called procedure may also allocate few enough total registers in its stack frame that the top of the called procedure's frame is lower than the caller's top-of-frame, but those registers will become available again when control returns to the caller.
4.6.3. Rotating Registers
A subset of the registers in the procedure frame may be designated as rotating registers. The rotating register region always starts with R32, and may be any multiple of eight registers in number, up to a maximum of 96 rotating registers. The renaming is under control of the Register Rename Base (RRB).
If the rotating registers include any or all of the output registers, software must be careful when using the output registers for passing parameters, because a non-zero RRB will change the virtual register numbers that are part of the output region. In general, software should ensure either that the rotating region does not overlap the output region, or that the RRB is cleared to zero before setting output parameter registers.
4.6.4. Frame Markers
The current application-visible state of the register stack is stored in an architecturally inaccessible register called the current frame marker. On a procedure call, this register is automatically saved by copying it to an application register, the previous function state (AR.PFS). The current frame marker is modified to describe a new stack frame whose input and local area is initially zero size, and whose output area is equal in size to the previous output area. On return, the previous function state register is used to restore the current frame marker to its earlier value, and the base of the register stack is adjusted accordingly.
It is the responsibility of a procedure to save the previous function state register before issuing any procedure calls of its own, and to restore it before returning.
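The frame marker transitions described above can be modeled with a small C sketch. Only the frame and local sizes are modeled; the names `size_of_frame` and `size_of_locals` are illustrative choices, and the real frame marker and AR.PFS hold additional fields that are omitted here.

```c
#include <assert.h>

/* Simplified model of a frame marker (hypothetical field names). */
typedef struct {
    unsigned size_of_frame;  /* inputs + locals + outputs */
    unsigned size_of_locals; /* inputs + locals           */
} frame_marker;

/* On a call: the caller's marker is copied to the saved previous
   function state, and the new frame consists only of the caller's
   output area, with an input and local area of zero size. */
static frame_marker frame_marker_on_call(frame_marker caller,
                                         frame_marker *saved_pfs)
{
    assert(saved_pfs != 0);
    assert(caller.size_of_locals <= caller.size_of_frame);
    *saved_pfs = caller;
    frame_marker callee = { caller.size_of_frame - caller.size_of_locals, 0 };
    return callee;
}

/* On return: the current frame marker is restored from the saved
   previous function state. */
static frame_marker frame_marker_on_return(frame_marker saved_pfs)
{
    return saved_pfs;
}
```

In this model, a second call made by the callee would overwrite `saved_pfs`, which is why a procedure must save the previous function state before making calls of its own.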
4.6.5. Backing Store for Register Stack
When the depth of the procedure call stack exceeds the capacity of the physical register file, the hardware frees physical registers by saving them to a stack in memory called the backing store. This backing store is distinct from the memory stack described in Section 4.5, “Memory Stack”.
As returns unwind the procedure call stack, the hardware also restores previously-saved physical registers from the backing store.
The operation of this register stack engine (RSE) is mostly transparent to application software. While the RSE is running, application software may not examine the contents of the backing store, and may not make any assumptions about how much of the register stack is still in physical registers or in the backing store. In order to examine previous stack frames, application software must synchronize the RSE with the FLUSHRS instruction. Synchronizing the RSE forces all stack frames up to, but not including, the current frame to be saved in backing store, allowing the software to examine the contents of the backing store without asynchronous operations modifying the memory. Modifications to the backing store require setting the RSE to enforced lazy mode after synchronizing it, which prevents the RSE from doing any operations other than those required by calls and returns. The procedure for synchronizing the RSE and setting the mode is described in the Itanium® Software Conventions and Runtime Architecture Guide.
The backing store grows towards higher addresses. The top of the stack, which corresponds to the top of the previous procedure frame, is available in the Backing Store Pointer (BSP) application register. The BSP must always point to a valid backing store address, because the operating system may need to start the RSE to process an exception.
Backing store overflow is automatically detected by the OpenVMS operating system, which will either extend the backing store to allow continued operation or will raise an exception. Unlike for the memory stack (see Section 4.5, “Memory Stack”), there are no specific rules or requirements that must be satisfied to facilitate detection of backing store overflow.
A NaT collection register is stored into the backing store following each group of 63 physical registers. The NaT bit of each register stored is shifted into the collection register. When the BSP reaches the quadword just before a 64-quadword boundary, the RSE stores the collection register. Software can determine the position of the NaT collection registers in the backing store by examining the memory address. This process is described in greater detail in the Intel IA-64 Architecture Software Developer's Manual.
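The address test implied by this layout can be sketched in C as follows. A 64-quadword boundary is a multiple of 512 bytes, so the quadword just before it is the one whose address has bits 3 through 8 all set. The function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* True if the quadword at this backing store address holds a NaT
   collection register: it is the last quadword before a 64-quadword
   (512-byte) boundary, so address bits 3 through 8 are all ones. */
static int is_nat_collection_slot(uint64_t address)
{
    assert((address & 7u) == 0); /* backing store entries are quadwords */
    return ((address >> 3) & 0x3Fu) == 0x3Fu;
}

/* Address of the NaT collection slot that covers the given backing
   store address: the last quadword of the same 512-byte block. */
static uint64_t nat_collection_slot_for(uint64_t address)
{
    assert((address & 7u) == 0);
    return address | 0x1F8u;
}
```

Each 512-byte block therefore holds 63 saved registers followed by one collection quadword, which matches the grouping described above.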
4.7. Procedure Linkage
This calling standard states that a standard call (see Section 1.4, “Definitions”) can be accomplished in any way that presents the called routine with the required environment. However, typically, most standard-conforming external calls are implemented with a common sequence of instructions and conventions. Because a common set of call conventions is so pervasive, these conventions are included for reference as part of this standard.
4.7.1. The GP Register
Every procedure that references statically-allocated data or calls another procedure requires a pointer to an associated short data segment in the GP register, so that it can access its static data and its linkage tables. Typically, an image has one such data segment, and the GP register must be set correctly prior to calling any entry point within that image. Optionally, an image may be partitioned into subcomponents called clusters in which case each cluster may have its own associated data segment (clusters may also share a common data segment). For further information on images and clusters, see the VSI OpenVMS Linker Utility Manual.
Throughout this chapter, rules regarding the use of the GP register are described in terms of images. However, these same rules apply between clusters within an image (keeping in mind that clusters within an image may share a common GP address and short data segment, while images cannot share a common GP address and short data segment).
The linkage conventions require that each image (or cluster) define exactly one GP value to refer to a location within its short data segment. This location should be chosen to maximize the usefulness of short-displacement immediate instructions for addressing scalars and linkage table entries. The image activator determines the absolute value of the GP register for each image after loading its data segment into memory.
Because the GP register remains unchanged for calls within an image, calls known to be local can be optimized accordingly. For calls between images, the GP register must be initialized with the correct GP value for the new image, and the calling function must ensure that its own GP value is saved and restored.
Note that there is a small set of compiler run-time support procedures that take a special pseudo-GP value as a kind of input parameter. See Section 4.7.7, “Simple and Bound Procedures” for more information about support for bound function descriptors. See Section 6.1.2, “Translated Images on I64 Systems” for information about support for translated images.
4.7.2. Types of Calls
Direct local calls. Direct calls within the same image can be made directly to the entry point of the target procedure. In this case, the GP register does not need to be changed.
Direct non-local calls. Calls made outside the same image are routed through an import stub (which can be inlined at compile time if the call is known or suspected to be to another image). The import stub obtains the address of the main entry point and the GP register value from the linkage table. Although coded in source as a direct call, a dynamically-linked call therefore becomes indirect.
Indirect calls. A function pointer points to a descriptor that contains both the address of the function entry point and the GP register value for the target function. The compiler must generate code for an indirect call that sets the new GP value before transferring control to the target procedure.
Special calls. Other special calling conventions are allowed to the extent that the compiler and the run-time library agree on the conventions, and provided that the stack can be unwound through such a call. Such calls are outside the scope of this document. See Section A.3.1, “Requirements for Unwinding the Stack” for a discussion of stack unwind requirements.
4.7.3. Calling Sequence
Direct and indirect procedure calls are described in the following sections. Because the compiler is not required to know whether any given call is local or to a dynamically linked image, the two types of direct calls are described together in Section 4.7.3.1, “Direct Calls”.
4.7.3.1. Direct Calls
Direct procedure calls follow the sequence of steps shown in the following figure. The following paragraphs describe these steps in detail.
Caller: Prepare call. Values in scratch registers that must be kept live across the call must be saved. They can be saved by copying them into local stacked registers, or by saving them on the memory stack. If the NaT bits associated with any live scratch registers must be saved, the compiler should use ST8.SPILL or STF.SPILL instructions. The User NaT collection register itself is preserved by the call, so the NaT bits need no further treatment at this point.
If the call is not known (at compile time) to be within the same image, the GP register must be saved.
The parameters must be set up in registers and memory as described in Section 4.7.4, “Parameter Passing”.
Caller: Call. All direct calls are made with a BR.CALL instruction, specifying B0 for the return link.
For direct local calls, the PC-relative displacement is computed at link time. Compilers may assume that the standard displacement field in the BR.CALL instruction is sufficiently wide to reach the target of the call. If the displacement is too large, the linker must supply a branch stub at some convenient point in the code; compilers must guarantee the existence of such a point by ensuring that code sections in the relocatable object files are no larger than the maximum reach of the BR.CALL instruction. With a 25-bit displacement, the maximum reach is 16 megabytes in either direction from the point of call.
Because direct calls to other images cannot be statically bound at link time, the linker must supply an import stub for the target procedure; the import stub obtains the address of the target procedure from the linkage table. The BR.CALL instruction can then be statically bound to the import stub using the PC-relative displacement.
The BR.CALL instruction performs the following actions:
Saves the return link in the return branch register
Saves the current frame marker in the AR.PFS register
Sets the base of the new register stack frame to the beginning of the output region of the old frame
Caller: Import stub (direct non-local calls only). The import stub is allocated in the image of the caller, so that the BR.CALL instruction can be statically bound to the address of the import stub. It must access the linkage table via the current GP (which means that GP must be valid at the point of call), and obtain the address of the target procedure's entry point and its GP value. The import stub then establishes the new GP value and branches to the target entry point.
If the compiler knows or suspects that the target of a call is in a separate image, it can generate calling code that performs the functions of the import stub, which saves an extra branch.
When the target of a call is in the same image, an import stub is not used (which also means that GP must be valid at the point of call).
Callee: Entry. The prologue code in the target procedure is responsible for allocating the register stack frame. It is also responsible for allocating a frame on the memory stack when necessary. It may use the 16 bytes at the top of its caller's stack frame as a scratch area.
A non-leaf procedure must save the return branch register and previous function state, either in the memory stack frame or in a local stacked general register.
The prologue must also save any preserved registers to be used in this procedure. The NaT bits for those registers must be preserved as well, by copying the NaT bits to local stacked general registers, or by using ST8.SPILL or STF.SPILL instructions. However, the User NaT collection register (AR.UNAT) must be saved first because it is guaranteed to be preserved by the call.
Callee: Exit. The epilogue code is responsible for restoring the return branch register and previous function state, if necessary, and any preserved registers that were saved. The NaT bits must be restored using the LD8.FILL or LDF.FILL instructions. The User NaT collection register must also be restored if it was saved.
If a memory stack frame was allocated, the epilogue code must deallocate it.
Finally, the procedure exits by branching through the return branch register with the BR.RET instruction.
Caller: After the call. Any saved values (including GP) should be restored.
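The reach limit quoted for direct local calls follows from the displacement width: a signed 25-bit displacement spans 2^24 bytes (16 megabytes) in each direction. The following C sketch shows the kind of check a linker could apply before deciding that a branch stub is needed. The function name is hypothetical, and the sketch assumes that both addresses are 16-byte instruction bundle addresses.

```c
#include <assert.h>
#include <stdint.h>

#define BR_CALL_DISP_BITS 25   /* displacement width stated in the text */

/* Returns 1 if a PC-relative BR.CALL at 'from' can reach 'to' directly,
   0 if the call would have to be routed through a branch stub. */
int br_call_in_reach(uint64_t from, uint64_t to)
{
    /* Assumption: Itanium code addresses are 16-byte bundle addresses. */
    assert((from & 0xF) == 0 && (to & 0xF) == 0);

    int64_t disp  = (int64_t)(to - from);
    int64_t limit = (int64_t)1 << (BR_CALL_DISP_BITS - 1);   /* 2^24 = 16 MB */
    return disp >= -limit && disp < limit;
}
```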
4.7.3.2. Indirect Calls
Indirect procedure calls follow nearly the same sequence as direct calls (see Section 4.7.3.1, “Direct Calls”), except that the branch target is established indirectly. This sequence is illustrated in Figure 4.4, “Indirect Procedure Calls”.
Caller: Function Pointer. A function pointer is always the address of a function descriptor for the target procedure (see Section 4.3, “Procedure Representation”). An indirect call loads the GP value into the GP register before branching to the entry point address.
In order to guarantee the uniqueness of a function pointer, and because its value is determined at program invocation time, code must materialize function pointers only by loading a pointer from the data segment.
Caller: Prepare call. Indirect calls are made by first loading the function pointer into a general register, loading the entry point address and the new GP value, and using the Move to Branch Register operation to move the address of the procedure entry point into the branch register to be used for the call.
Values in scratch registers that must be kept live across the call must be saved. They can be saved by copying them into local stacked registers, or by saving them on the memory stack. If the NaT bits associated with any live scratch registers must be saved, the compiler should use ST8.SPILL or STF.SPILL instructions. The User NaT collection register itself is preserved by the call, so the NaT bits need no further treatment at this point.
Unless the call is known (at compile time) to be within the same image, the GP register must be saved before the new GP value is loaded.
The parameters must be set up in registers and memory as described in Section 4.7.4, “Parameter Passing”.
Caller: Call. All indirect calls are made with the indirect form of the BR.CALL instruction, specifying B0 for the return link.
The BR.CALL instruction saves the return link in the return branch register, saves the current frame marker in the AR.PFS register, and sets the base of the new register stack frame to the beginning of the output region of the old frame. Because the indirect call sequence obtains the entry point address and new GP value from the function descriptor, control flows directly to the target procedure, without the need for any intervening stubs.
Callee: Entry; Exit. The remainder of the calling sequence is the same as for direct calls (see Section 4.7.3.1, “Direct Calls”).
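The caller-side steps of an indirect call can be modeled in C. This is a sketch only: the `gp_register` variable stands in for the GP register, the `func_desc` type models the two values that the text says a function descriptor supplies, and `sample_target` is a hypothetical callee. Real code performs these steps with register loads and a Move to Branch Register operation rather than C assignments.

```c
#include <assert.h>
#include <stdint.h>

/* Model of a function descriptor: entry point plus GP value. */
typedef struct {
    int (*entry)(int);   /* stands in for the entry code address */
    uint64_t gp;         /* GP value for the target procedure */
} func_desc;

uint64_t gp_register = 0x1000;   /* models the caller's GP register */
uint64_t gp_seen_by_target;      /* records what the callee observed */

/* Hypothetical target procedure. */
int sample_target(int x)
{
    gp_seen_by_target = gp_register;
    return x + 1;
}

/* Caller-side sequence for an indirect call through a function pointer. */
int indirect_call(const func_desc *fp, int arg)
{
    assert(fp != 0 && fp->entry != 0);
    uint64_t saved_gp = gp_register;   /* save GP before loading the new value */
    gp_register = fp->gp;              /* establish the target's GP */
    int result = fp->entry(arg);       /* branch to the entry point */
    gp_register = saved_gp;            /* after the call: restore GP */
    return result;
}
```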
4.7.4. Parameter Passing
Parameters are passed in a combination of general registers, floating-point registers, and memory, as described below, and as illustrated in Figure 4.5, “Parameter Passing in Registers and Memory”.
The parameter list is formed by placing each individual parameter into fixed-size elements of the parameter list, referred to as parameter slots. Each parameter slot is 64 bits wide; parameters larger than 64 bits are placed in as many consecutive parameter slots as are needed to contain the entire parameter. The rules for allocation and alignment of parameter slots are described in Section 4.7.5.1, “Allocation of Parameter Slots”.
The contents of the first eight parameter slots are always passed in registers, while the remaining parameters are always passed on the memory stack, beginning at the caller's stack pointer plus 16 bytes. The caller uses up to eight of the registers in the output region of its register stack for integer and VAX floating-point parameters, and up to eight floating-point registers for IEEE floating-point parameters. The maximum number of registers used is eight.
To accommodate variable argument lists in the C language, there is a fixed correspondence between parameter slots; the first parameter slot is always in either the first general output register or the first floating-point register (never both), the second parameter slot is always in the second general output register or the second floating-point register (never both), and so on. This allows a procedure to spill its register parameters easily to memory to form the argument home area before stepping through the parameter list with a pointer. The Argument Information register (AI) makes this possible, as explained in Section 4.7.5.3, “Argument Information (AI) Register”.
A procedure can assume that the NaT bits on its incoming general register arguments are clear, and that the incoming floating-point register arguments are not NaTVals. A procedure making a call must ensure only that registers containing actual parameters are clear of NaT bits or NaTVals; registers not used for actual parameters are undefined.
4.7.5. Parameter Passing Mechanisms
Immediate value
Reference
Descriptor
Argument items are not self-defining; interpretation of each argument item depends on agreement between the calling and called procedures.
This standard does not dictate which passing mechanism must be used by a given language compiler. Language semantics and interoperability considerations might require different mechanisms in different situations.
Immediate value
An immediate value argument item contains the value of the data item. The argument item, or the value contained in it, is directly associated with the parameter.
Reference
A reference argument item contains the address of a data item such as a scalar, string, array, record, or procedure. This data item is associated with the parameter.
Descriptor
A descriptor argument item contains the address of a descriptor, which contains structural information about the argument's type (such as array bounds) and the address of a data item. This data item is associated with the parameter.
By immediate value. An argument may be passed by immediate value only if the argument is one of the following:
One of the noncomplex scalar data types with a size known (at compile time) to be ≤ 64 bits
Either single or double precision complex
A record with a known size (at compile time)
A set, implemented as a bit vector, with a size known (at compile time) to be ≤ 64 bits
No form of string or array data type may be passed by immediate value in a standard call.
Unused high-order bits must be zero or sign-extended, as appropriate depending on the data type, to fill all bits of each argument list item (as specified in Table 4.10, “Unused Bits in Passed Data”).
A single-precision or double-precision complex value is passed as two single- or double-precision floating-point values, respectively. Note that the argument count reflects that two argument positions are used rather than just one actual argument.
A record value, which may be larger than 64 bits, is passed by immediate value as follows:
Allocate as many fully occupied argument item positions to the argument value as are needed to represent the argument.
If the final argument position is only partially occupied by the argument, the contents of the remaining bits are undefined.
If an argument position is passed in one of the registers, it can only be passed in an integer register (never in a floating-point register).
Other argument values that are larger than 64 bits can be passed by immediate value using nonstandard conventions, typically using a method similar to those for passing records. Thus, for example, a 26-byte string can be passed by value in four integer registers.
By reference. Nonparametric arguments (arguments for which associated information such as string size and array bounds are not required) can be passed by reference in a standard call. This includes extended precision floating and extended precision complex values.
By descriptor. Parametric arguments (arguments for which associated information such as string size and array bounds must be passed to the called procedure) are passed by a single descriptor in a standard call.
Note that extended floating values are not passed using the immediate value mechanism; rather, they are passed using the by reference mechanism. (However, when by value semantics is required, it may be necessary to make a copy of the actual parameter and pass a reference to that copy in order to avoid improper alias effects).
Also note that when a record is passed by immediate value, the component types are not material to how the argument is aligned; the record will always be quadword aligned.
4.7.5.1. Allocation of Parameter Slots
Type | Size (Bits) | Number of Slots |
---|---|---|
Integer, small set | 1-64 | 1 |
Address/pointer (including all types passed by reference or descriptor) | 64 | 1 |
IEEE single-precision floating-point (S_floating) | 32 | 1 |
IEEE single-precision floating-point complex (S_floating) | 64 | 2 |
IEEE double-precision floating-point (T_floating) | 64 | 1 |
IEEE double-precision floating-point complex (T_floating) | 128 | 2 |
IEEE quad-precision floating-point (X_floating) | 64 (by reference) | 1 |
IEEE quad-precision floating-point complex (X_floating) | 64 (by reference) | 1 |
Aggregates (noncomplex) | any | (size+63)/64 |
VAX single-precision floating-point (F_floating) | 32 | 1 |
VAX single-precision floating-point complex (F_floating) | 64 | 2 |
VAX double-precision floating-point (D_ & G_floating) | 64 | 1 |
VAX double-precision floating-point complex (D_ & G_floating) | 128 | 2 |
Note
These rules are applied based on the type of the parameter after any type-promotion rules specified by the language have been applied. For example, a short integer passed without a function prototype in C is promoted to the int type, and is then passed according to the rules for the int type.
OpenVMS does not support passing the Itanium double-precision extended floating-point type (__float80), although that type may be used from time to time in code generation sequences.
This placement policy does not ensure that parameters greater than 64 bits in size will fall on a natural alignment boundary if passed in memory. Such parameters may need to be copied by the called procedure into an aligned temporary prior to use, or accessed in a way that does not depend on natural alignment.
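The aggregate rule in the slot allocation table, (size+63)/64, rounds a size in bits up to a whole number of 64-bit slots. A minimal C sketch of that arithmetic follows; the function name is illustrative.

```c
#include <assert.h>
#include <stddef.h>

/* Number of 64-bit parameter slots needed for a value of the given
   size in bits, rounding up to a whole slot. */
size_t slots_for_bits(size_t size_bits)
{
    assert(size_bits > 0);
    return (size_bits + 63) / 64;
}
```

For example, a record of three 32-bit fields (96 bits) needs two slots, and an array of twenty 32-bit integers (640 bits) needs ten.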
4.7.5.2. Normal Register Parameters
The first eight argument slots (slots 0-7) are associated, one-to-one, with the stacked output general registers, as shown in Figure 4.5, “Parameter Passing in Registers and Memory”.
Integral scalar parameters (including addresses and pointers), VAX floating-point parameters, and aggregate parameters in these slots are passed only in the corresponding output general registers.
Aggregate parameters in these slots are passed by value only in the corresponding output general registers. The aggregate is treated as a sequence of 64-bit integral values, with each value allocated into the next available slot in aggregate memory address order. If the size of the aggregate is not an even multiple of 64 bits, then the unused bits in the last slot are undefined.
If an aggregate or VAX floating-point complex parameter straddles the boundary between slot 7 and slot 8, the part that lies within the first eight slots is passed in general registers, and the remainder is passed in memory, as described in Table 4.10, “Unused Bits in Passed Data”.
Complex values (other than IEEE quad-precision floating-point complex), in those languages that include complex types, are passed as a pair of floating-point values (either single-precision or double-precision as appropriate). It is possible for the first of the two floating-point values in a complex value to occupy the last output register slot; in this case, the second floating-point value is passed in memory. IEEE quad-precision floating-point complex values are passed by reference.
IEEE single-precision and double-precision floating-point scalar parameters are passed in the corresponding floating-point register slot. IEEE quad-precision floating-point scalar parameters are passed by reference in the corresponding output general registers.
When IEEE floating-point parameters are passed in floating-point registers, they are passed in the register format, rounded to the appropriate precision. They are never passed in the general registers unless part of an aggregate, in which case they are passed in the aggregate memory format. When VAX floating-point parameters are passed in general registers, they are passed in memory format.
Parameters allocated beyond the eighth parameter slot are never passed in registers.
Note
Bit 31 is replicated in bits 32—63, even for unsigned 32-bit integers.
Data Type | Type Designator | Data Size (bytes) | Register Extension Type | Memory Extension Type |
---|---|---|---|---|
Byte logical | DSC$K_DTYPE_BU | 1 | Zero64 | Zero64 |
Word logical | DSC$K_DTYPE_WU | 2 | Zero64 | Zero64 |
Longword logical | DSC$K_DTYPE_LU | 4 | Sign64 | Sign64 |
Quadword logical | DSC$K_DTYPE_QU | 8 | Data64 | Data64 |
Byte integer | DSC$K_DTYPE_B | 1 | Sign64 | Sign64 |
Word integer | DSC$K_DTYPE_W | 2 | Sign64 | Sign64 |
Longword integer | DSC$K_DTYPE_L | 4 | Sign64 | Sign64 |
Quadword integer | DSC$K_DTYPE_Q | 8 | Data64 | Data64 |
F_floating | DSC$K_DTYPE_F | 4 | VAXF64 | Data32 |
D_floating | DSC$K_DTYPE_D | 8 | VAXDG64 | Data64 |
G_floating | DSC$K_DTYPE_G | 8 | VAXDG64 | Data64 |
F_floating complex | DSC$K_DTYPE_FC | 2 * 4 | 2*VAXF64 | 2*Data32 |
D_floating complex | DSC$K_DTYPE_DC | 2 * 8 | 2*VAXDG64 | 2*Data64 |
G_floating complex | DSC$K_DTYPE_GC | 2 * 8 | 2*VAXDG64 | 2*Data64 |
S_floating | DSC$K_DTYPE_FS | 4 | Hard | Data32 |
T_floating | DSC$K_DTYPE_FT | 8 | Hard | Data64 |
X_floating | DSC$K_DTYPE_FX | 16 | N/A | N/A |
S_floating complex | DSC$K_DTYPE_FSC | 2 * 4 | 2*Hard | 2*Data32 |
T_floating complex | DSC$K_DTYPE_FTC | 2 * 8 | 2*Hard | 2*Data64 |
X_floating complex | DSC$K_DTYPE_FXC | 2 * 16 | N/A | N/A |
Small structures of 8 bytes or less | N/A | ≤8 | Nostd | Nostd |
Small arrays of 8 bytes or less | N/A | ≤8 | Nostd | Nostd |
32-bit address | N/A | 4 | Sign64 | Sign64 |
64-bit address | N/A | 8 | Data64 | Data64 |
Sign Extension Type | Defined Function |
---|---|
Sign64 | Sign-extended to 64 bits. |
Zero64 | Zero-extended to 64 bits. |
Data32 | Data is 32 bits. The state of bits <63:32> is unpredictable. |
2*Data32 | Two single-precision parts of the complex value are stored in memory as independent floating-point values (each handled as Data32). |
Data64 | Data is 64 bits. |
2*Data64 | Two double-precision parts of the complex value are stored in memory as independent floating-point values (each handled as Data64). |
VAXF64 | Data is 64 bits. Low-order 32 bits are the same as the F_floating memory format and the high-order 32 bits are zero. (Used only in a general register, never in a floating-point register). |
VAXDG64 | Data is 64 bits. Uses the corresponding D_floating or G_floating memory format. (Used only in a general register, never in a floating-point register). |
2*VAXF64 | Two single-precision parts of the complex value are stored in memory as independent floating-point values (each handled as VAXF64). |
2*VAXDG64 | Two double-precision parts of the complex value are stored in memory as independent floating-point values (each handled as VAXDG64). |
Hard | Passed in the layout defined by the hardware SRM. |
2*Hard | Two floating-point parts of the complex value are stored in a pair of registers as independent floating-point values (each handled as Hard). |
Nostd | State of all high-order bits not occupied by the data is unpredictable across a call or return. |
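The Sign64 and Zero64 rules can be expressed in portable C. The sketch below uses illustrative function names and takes the item width in bits as a parameter. Note that, per the table, a 32-bit unsigned longword is extended with Sign64, so bit 31 is replicated into bits <63:32>.

```c
#include <assert.h>
#include <stdint.h>

/* Sign64: replicate the top bit of a 'width'-bit item into all higher bits. */
uint64_t sign64(uint64_t value, unsigned width)
{
    assert(width >= 1 && width <= 64);
    if (width == 64)
        return value;
    uint64_t sign_bit = (uint64_t)1 << (width - 1);
    uint64_t mask = (sign_bit << 1) - 1;      /* low 'width' bits */
    value &= mask;
    return (value ^ sign_bit) - sign_bit;     /* standard sign-extension identity */
}

/* Zero64: clear all bits above a 'width'-bit item. */
uint64_t zero64(uint64_t value, unsigned width)
{
    assert(width >= 1 && width <= 64);
    if (width == 64)
        return value;
    return value & (((uint64_t)1 << width) - 1);
}
```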
4.7.5.3. Argument Information (AI) Register
In addition to the normal parameters, an implicit argument information value is passed in register R25, the Argument Information (AI) register. This value is shown in Figure 4.6, “Argument Information Register Representation”.
Argument Count is an unsigned byte that specifies the number of 64-bit argument slots used for the argument list. (Note that single and double-precision complex values use two slots, which is reflected in this count).
Value | OpenVMS Name | Meaning |
---|---|---|
0 | AI$K_AR_I64 | 64-bit, or 32-bit sign-extended to 64-bit, argument passed in an integer register (including addresses); or argument is not present. |
1 | AI$K_AR_FF | F_floating (also known as VAX single-precision floating-point) argument passed in a general register. |
2 | AI$K_AR_FD | D_floating (also known as VAX double-precision floating-point) argument passed in a general register. |
3 | AI$K_AR_FG | G_floating (also known as VAX double-precision floating-point) argument passed in a general register. |
4 | AI$K_AR_FS | S_floating (also known as IEEE single-precision floating-point) argument passed in a floating-point register. |
5 | AI$K_AR_FT | T_floating (also known as IEEE double-precision floating-point) argument passed in a floating-point register. |
6,7 | — | Reserved. |
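The register information codes fit in three bits each, so the codes for all eight register slots can be carried alongside the argument count in a single value. The C sketch below shows one way to pack and unpack such a value. The bit positions used here (count in bits <7:0>, one 3-bit code per slot starting at bit 8) are an assumption made for illustration, since the figure is not reproduced in this text. The C names substitute underscores for the dollar signs in the OpenVMS names.

```c
#include <assert.h>
#include <stdint.h>

/* Register information codes, with the values listed in the table. */
enum ai_code {
    AI_K_AR_I64 = 0,   /* integer register, or argument not present */
    AI_K_AR_FF  = 1,   /* F_floating in a general register */
    AI_K_AR_FD  = 2,   /* D_floating in a general register */
    AI_K_AR_FG  = 3,   /* G_floating in a general register */
    AI_K_AR_FS  = 4,   /* S_floating in a floating-point register */
    AI_K_AR_FT  = 5    /* T_floating in a floating-point register */
};

/* ASSUMED layout (illustration only): argument count in bits <7:0>,
   then one 3-bit code per register slot, slot 0 starting at bit 8. */
uint64_t ai_pack(unsigned count, const enum ai_code codes[8])
{
    assert(count <= 255);                     /* count is an unsigned byte */
    uint64_t ai = count;
    for (int slot = 0; slot < 8; slot++)
        ai |= (uint64_t)codes[slot] << (8 + 3 * slot);
    return ai;
}

unsigned ai_arg_count(uint64_t ai)
{
    return (unsigned)(ai & 0xFF);
}

enum ai_code ai_slot_code(uint64_t ai, int slot)
{
    assert(slot >= 0 && slot < 8);
    return (enum ai_code)((ai >> (8 + 3 * slot)) & 7);
}
```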
4.7.5.4. Memory Stack Parameters
The remainder of the parameter list, beginning with slot 8, is passed in the outgoing parameter area of the memory stack frame, as described in Section 4.5.1, “Procedure Frames”. Parameters are mapped directly to memory, with slot 8 placed at location SP+16, slot 9 placed at location SP+24, and so on. Each argument is stored in memory as a series of one or more 64-bit storage units, with unused bits in the last unit undefined.
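The mapping from slot number to stack location is linear. A minimal C sketch, with an illustrative function name, follows.

```c
#include <assert.h>
#include <stddef.h>

/* Byte offset from the caller's SP of a memory-resident parameter slot. */
size_t slot_sp_offset(unsigned slot)
{
    assert(slot >= 8);                      /* slots 0-7 are passed in registers */
    return 16 + 8 * (size_t)(slot - 8);     /* slot 8 at SP+16, slot 9 at SP+24, ... */
}
```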
4.7.5.5. Variable Argument Lists
The rules above support variable-argument list functions in both the K&R and the ANSI dialects of the C language. (Note that argument location is independent of whether a prototype is in scope).
The nth argument is in either Rn or Fn regardless of the type of parameter in the preceding register slot. Therefore, a function with variable arguments may assume that the variable arguments that lie within the first eight argument slots can be found in either the stacked input integer registers (IN0-IN7), or in the floating-point parameter registers (F8-F15). Using the information codes from the AI (Argument Information) register (see Table 4.12, “Argument Information Register Codes”), the function can then store these registers to memory using the 16-byte scratch area for IN6/F14 and IN7/F15, and up to 48 bytes at the base of its own stack frame for IN0/F8-IN5/F13, as necessary. This arrangement places all of the variable parameters in one contiguous block of memory.
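The register selection described above can be modeled in C. In this sketch, `in_regs[n]` stands for INn and `fp_regs[n]` for F(8+n), both held as raw 64-bit images. This is a simplification: real code stores floating-point registers with the appropriate store instruction, which converts from register format to memory format. The codes are the values from Table 4.12; the function names are illustrative.

```c
#include <assert.h>
#include <stdint.h>

/* Codes 4 (S_floating) and 5 (T_floating) mark arguments that were
   passed in floating-point registers; all others use general registers. */
int ai_code_is_fp_register(unsigned code)
{
    assert(code <= 5);                 /* 6 and 7 are reserved */
    return code == 4 || code == 5;
}

/* Builds the contiguous home area for slots 0-7: for each slot, take the
   value from the floating-point or the general register as the code says. */
void home_register_args(const uint64_t in_regs[8], const uint64_t fp_regs[8],
                        const unsigned codes[8], uint64_t home[8])
{
    for (int n = 0; n < 8; n++)
        home[n] = ai_code_is_fp_register(codes[n]) ? fp_regs[n] : in_regs[n];
}
```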
4.7.5.6. Pointers to Formal Parameters
Whenever the address of a formal parameter that is passed in a register is taken, the compiler must store the parameter to the stack, as it would for a variable argument list.
4.7.5.7. Languages Other than C
The placement of arguments in general registers versus floating-point registers does not depend on any notion or concept of a prototype being in scope. It is therefore applicable to all languages at all times.
4.7.5.8. Rounding Floating-point Values
There must be no difference in behavior between a floating-point parameter passed directly in a register and a floating-point parameter that has been stored to memory and reloaded. In either case, the floating-point value must be the same. This implies that floating-point parameters passed in floating-point registers must be explicitly rounded to the proper precision by the caller.
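The effect of this rule can be seen in C by rounding a wider value to single precision before it is passed. The sketch below uses hypothetical function names. It shows that once the caller has rounded the value, storing it to a 32-bit memory location and reloading it changes nothing.

```c
#include <assert.h>

/* Caller-side rounding of a value destined for a single-precision
   parameter: the result is what a 32-bit store and reload would yield. */
double round_to_single(double wide)
{
    volatile float stored = (float)wide;   /* force a real 32-bit value */
    return (double)stored;
}

/* Usage: a value that has already been rounded survives a second
   store and reload unchanged. */
void rounding_example(void)
{
    double passed = round_to_single(0.1);
    assert(round_to_single(passed) == passed);
}
```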
4.7.5.9. Order of Argument Evaluation
Because most high-level languages do not specify the order of evaluation (with respect to side effects) of arguments, those language processors can evaluate arguments in any convenient order. The choice of argument evaluation order and code generation strategy is constrained only by the definition of the particular language. Programs should not depend on the order of evaluation of arguments.
4.7.5.10. Examples
The following examples illustrate the parameter passing conventions. Floating-point types are IEEE floating-point representations.
Scalar Integers and Floats, With or Without Prototype
extern int func(int, double, double, int); func(i, a, b, j);
Slot | Variable | Allocation | Argument Register Information |
---|---|---|---|
0 | i | OUT0 | AI$K_AR_I64 |
1 | a | F9 | AI$K_AR_FT |
2 | b | F10 | AI$K_AR_FT |
3 | j | OUT3 | AI$K_AR_I64 |
Aggregates Passed by Value
extern int func(); struct { int array[20]; } a; func(i, a);
Slot | Variable | Allocation | Argument Register Information |
---|---|---|---|
0 | i | OUT0 | AI$K_AR_I64 |
1-7 | a.array[0-13] | OUT1-OUT7 | AI$K_AR_I64 (all 7 slots) |
8-10 | a.array[14-19] | In memory, at SP+16 through SP+39 | Not applicable |
extern int func(); struct { __float128 x; int array[20]; } a; func(i, a);
Slot | Variable | Allocation | Argument Register Information |
---|---|---|---|
0 | i | OUT0 | AI$K_AR_I64 |
1-2 | a.x | OUT1-OUT2 | AI$K_AR_I64 (both slots) |
3-7 | a.array[0-9] | OUT3-OUT7 | AI$K_AR_I64 (all 5 slots) |
8-12 | a.array[10-19] | In memory, at SP+16 through SP+55 | Not applicable |
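The split between registers and memory in the two aggregate examples can be checked with a short calculation. The C sketch below uses hypothetical names and assumes 8-byte slots, with slot 8 at SP+16. Given the first slot an aggregate occupies and its size in bytes, it reports how many slots travel in registers, how many in memory, and the SP-relative byte range of the memory part.

```c
#include <assert.h>
#include <stddef.h>

typedef struct {
    unsigned first_slot, last_slot;      /* slots occupied by the aggregate */
    unsigned reg_slots;                  /* how many fall within slots 0-7 */
    unsigned mem_slots;                  /* how many are passed in memory */
    size_t mem_first_off, mem_last_off;  /* SP-relative bytes; valid if mem_slots > 0 */
} agg_layout;

agg_layout layout_aggregate(unsigned first_slot, size_t size_bytes)
{
    assert(size_bytes > 0);
    agg_layout l;
    unsigned n = (unsigned)((size_bytes + 7) / 8);   /* whole 64-bit slots */
    l.first_slot = first_slot;
    l.last_slot = first_slot + n - 1;
    l.reg_slots = first_slot >= 8 ? 0 : (l.last_slot < 8 ? n : 8 - first_slot);
    l.mem_slots = n - l.reg_slots;
    unsigned first_mem = first_slot >= 8 ? first_slot : 8;
    l.mem_first_off = 16 + 8 * (size_t)(first_mem - 8);
    l.mem_last_off = l.mem_first_off + 8 * (size_t)l.mem_slots - 1;
    return l;
}
```

With `i` in slot 0, the 80-byte aggregate of the first example starts at slot 1 and ends at slot 10, with three slots in memory at SP+16 through SP+39. The 96-byte aggregate of the second example ends at slot 12, with five slots in memory at SP+16 through SP+55.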
Floating-Point Aggregates, With or Without Prototype
struct s { float a, b, c; } x; extern func(); func(x);
Slot | Variable | Allocation | Argument Register Information |
---|---|---|---|
0 | x.a & x.b | OUT0 | AI$K_AR_I64 |
1 | x.c | OUT1 | AI$K_AR_I64 (low 32 bits) |
4.7.6. Return Values
Values up to 128 bits are returned directly in the registers, according to the rules in Table 4.13, “Rules for Return Values”.
Integer, enumeration, record, and set values (bit vectors) smaller than 64 bits must be zero-filled (unsigned integers, enumerations, records, sets) or sign-extended (signed integrals) to a full 64 bits. However, for unsigned 32-bit integers, bit 31 is replicated in bits 32—63.
When floating-point values are returned in floating-point registers, they are returned in the register format, rounded to the appropriate precision. When they are returned in the general registers (for example, as part of a record), they are returned in their memory format.
Type | Size (Bits) | Location of Return Value | Alignment |
---|---|---|---|
Integer/Pointer, small Record, Set | 1-64 | R8 | LSB |
IEEE single-precision floating-point (S_floating) | 32 | F8 | N/A |
IEEE double-precision floating-point (T_floating) | 64 | F8 | N/A |
IEEE single-precision complex (S_floating) | 64 | F8, F9 | N/A |
IEEE double-precision complex (T_floating) | 128 | F8, F9 | N/A |
VAX single-precision floating-point (F_floating) | 32 | R8 | N/A |
VAX double-precision floating-point (D_ and G_floating) | 64 | R8 | N/A |
VAX single-precision floating-point complex (F_floating) | 64 | R8, R9 | N/A |
VAX double-precision floating-point complex (D_ and G_floating) | 128 | R8, R9 | N/A |
Note
X_floating and X_floating complex are not included in this table because they are returned using the hidden parameter method (see below).
The rules in Table 4.13, “Rules for Return Values” are expressed in more detail in Table 4.10, “Unused Bits in Passed Data”. F_floating and F_floating complex values in the general registers are zero-extended (Zero64), because this most closely approximates the effect of using the Alpha register format.
Hidden Parameter
Return values other than those covered by Table 4.13, “Rules for Return Values” are returned in a buffer allocated by the caller. A pointer to the buffer is passed to the called procedure as a hidden first parameter, and all normal parameters are shifted one slot to make this possible. The return buffer must be aligned at a 16-byte boundary.
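The hidden parameter convention can be pictured as a source-level rewrite. In the C sketch below, a function that returns a 256-bit record is shown in a lowered form in which the address of the caller's buffer arrives as the first parameter and the declared parameter moves from slot 0 to slot 1. The names are hypothetical, and a compiler performs this transformation internally rather than in source.

```c
#include <assert.h>
#include <stdint.h>

struct big { double v[4]; };   /* 256 bits: too large to return in registers */

/* Source-level view:   struct big make_big(double seed);
   Lowered view: the buffer address is the hidden first parameter. */
void make_big_lowered(struct big *result, double seed)
{
    assert(((uintptr_t)result & 15) == 0);   /* buffer must be 16-byte aligned */
    for (int i = 0; i < 4; i++)
        result->v[i] = seed + i;
}

/* Caller side: allocate an aligned buffer and pass its address first. */
double call_make_big(double seed)
{
    _Alignas(16) struct big buffer;
    make_big_lowered(&buffer, seed);
    return buffer.v[3];
}
```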
4.7.7. Simple and Bound Procedures
Simple procedure
Bound procedure
A simple procedure is a procedure that does not need direct access to the stack of its execution environment. In order to call a simple procedure, a simple function descriptor is created, as shown in Figure 4.7, “Simple Function Descriptor”, and described in Table 4.14, “Simple Function Descriptor”.
Field Name | Contents |
---|---|
FDSC$Q_ENTRY | Entry code address for the procedure to be called. |
FDSC$Q_GP | GP value for the procedure to be called. |
A bound procedure is a procedure that does need direct access to the stack of its execution environment, typically to reference an up-level variable or to perform a nonlocal GOTO operation.
When a bound procedure is called, the caller must pass some kind of pointer to the called code that allows it to reference its up-level environment. Typically, this pointer is a frame pointer for that environment, but many variations are possible. When the caller itself is executing within that outer environment, it can usually make such a call directly to the code for the nested procedure without recourse to any additional function descriptors. However, when a procedure value for the nested procedure must be passed outside of that environment to a call site that has no knowledge of the target procedure, a bound function descriptor is created so that the nested procedure can be called just like a simple procedure.
Bound procedure values, as defined by this standard, are designed for multilanguage use and utilize the properties of function descriptors to allow callers of procedures to use common code to call both bound and simple procedures.
A bound function descriptor is similar to a simple function descriptor, with several additional fields as shown in Figure 4.8, “Bound Function Descriptor” and described in Table 4.15, “Contents of Bound Function Descriptor”.
Field Name | Contents |
---|---|
FDSC$Q_OTS_ENTRY | Code address for a suitable library helper routine, for example, OTS$JUMP_TO_BPV |
FDSC$Q_OTS_PSEUDO_GP | Address of this bound function descriptor |
FDSC$Q_SIGNATURE | Signature information field (see Section 6.1.3, “Signature Information Fields in Function Descriptors”) |
FDSC$Q_TARGET_ENTRY | Entry code address for the procedure to be called |
FDSC$Q_TARGET_GP | GP value for the procedure to be called |
FDSC$Q_TARGET_ENVIR | Environment value for the procedure to be called |
A bound function descriptor is inherently dynamic because the environment value must be determined at run time by code executing within the bound procedure environment. Therefore, when a bound function descriptor is needed, it is usually allocated on the creating procedure's stack.
When a call is made through a bound function descriptor, control first reaches the helper routine with GP containing the address of the descriptor. The helper routine then performs the following steps:
1. Load the "real" target entry address into a volatile branch register, for example, B6.
2. Load the dynamic environment value into the appropriate uplevel-addressing register for the target function; for example, OTS$JUMP_TO_BPV uses R9.
3. Load the "real" target GP address into the GP register.
4. Transfer control (branch, not call) to the target entry address.
Control arrives at the real target procedure address with both the GP and environment register values established appropriately.
OTS$JUMP_TO_BPV::
add gp=gp,24 ; Adjust GP to point to entry address
ld8 r9=[gp],16 ; Load target entry address
mov b6=r9
ld8 r9=[gp],-8 ; Load target environment value
ld8 gp=[gp] ; Load target GP
br b6 ; Transfer to target
Because the address of a bound function descriptor is a valid function pointer, it may be passed to translated code which uses it to call back into native code; therefore, the value of the signature information field must be the same as that in the official function descriptor for the real target procedure (see Section 6.1.2, “Translated Images on I64 Systems”).
Note that there can be multiple OTS$JUMP_TO_BPV-like support routines, corresponding to different target registers in which the environment value should be placed. The code that creates the bound function descriptor is necessarily compiled by the same compiler that compiles the target procedure, and can therefore select an appropriate support routine.
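The effect of the OTS$JUMP_TO_BPV sequence can be traced with a small model. The following Python sketch is illustrative only. It assumes that the six descriptor fields are consecutive quadwords in the order given in Table 4.15, which is consistent with the displacements (24, 16, -8) used by the instruction sequence. The function names and the dictionary used as memory are hypothetical.

```python
# Assumed byte offsets: six consecutive quadwords in the order of Table 4.15.
OTS_ENTRY, OTS_PSEUDO_GP, SIGNATURE = 0, 8, 16
TARGET_ENTRY, TARGET_GP, TARGET_ENVIR = 24, 32, 40


def make_bound_descriptor(memory, address, helper, signature, entry, gp, envir):
    """Store the six descriptor fields into a dict that models memory."""
    memory[address + OTS_ENTRY] = helper
    memory[address + OTS_PSEUDO_GP] = address  # pseudo-GP is the descriptor itself
    memory[address + SIGNATURE] = signature
    memory[address + TARGET_ENTRY] = entry
    memory[address + TARGET_GP] = gp
    memory[address + TARGET_ENVIR] = envir


def jump_to_bpv(memory, gp):
    """Mirror the helper instruction by instruction; return (b6, r9, gp)."""
    gp = gp + 24        # add gp=gp,24     point at the target entry field
    r9 = memory[gp]     # ld8 r9=[gp],16   load the target entry address
    gp = gp + 16        #                  post-increment to the environment field
    b6 = r9             # mov b6=r9
    r9 = memory[gp]     # ld8 r9=[gp],-8   load the environment value
    gp = gp - 8         #                  post-decrement to the target GP field
    gp = memory[gp]     # ld8 gp=[gp]      load the target GP
    return b6, r9, gp   # br b6            branch target, environment, GP
```

A caller that treats the bound descriptor like a simple one loads the helper address and the pseudo-GP from the first two quadwords, so the helper starts with GP equal to the descriptor address.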
4.8. Procedure Call Stack
A procedure is an active procedure while its body is executing, including while any procedure it calls is executing. When a procedure is active, its designated condition handler may handle an exception that is signaled during its execution.
Associated with each active procedure is an invocation context, informally called a frame, which consists of the set of registers and space in memory that is allocated and that may be accessed during execution for a particular call of that procedure.
When a procedure begins to execute, it has a limited invocation context that includes the output registers of its caller (which have been "shifted" to start at register R32). The initial instructions may allocate and initialize additional context, including possibly saving information from the invocation context of its caller. Such instructions, if any, are termed a procedure prologue. Once execution of the prologue is complete, the procedure is said to be active.
When a procedure is ready to return to its caller, the procedure ceases to be active after it begins to execute the instructions that deallocate and discard the procedure's invocation context (which may include restoring state of the caller's invocation context that was saved during the prologue). These instructions are termed a procedure epilogue.
A null frame procedure has no prologue and no epilogue, and consists solely of body instructions. Such a procedure becomes active immediately.
A procedure may have more than one prologue if there are multiple entry points. A procedure may also have more than one epilogue if there are multiple return points. One of each will be executed during any given invocation of the procedure.
A procedure call stack (for a thread) consists of the stack of invocation contexts that exists at any point in time. New invocation contexts are pushed on that stack as procedures are called and invocations are popped from the call stack as procedures return.
The invocation context of a procedure that calls another procedure is said to precede or be previous to the invocation context of the called procedure.
4.8.1. Current Procedure
The current procedure is the active procedure whose execution began most recently; its invocation context is at the top of the call stack. Note that a procedure executing in its prologue or epilogue is not active, and hence cannot be the current procedure.
For OpenVMS, the PC (instruction pointer) register in combination with associated unwind information determines what procedure is current (for exception handling purposes). See Section A.4, “Data Structures” for a description of the unwind information data structures.
A procedure is current if either of the following conditions is true:
- The PC is in a range described by any body region unwind descriptor, but not in an epilogue.
- The PC is in a range not described by any unwind descriptor, and therefore by default must be within a null frame procedure (see Section A.4.1, “Unwind Table and Unwind Information Block”).
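The PC test described above can be sketched as follows. This Python model is illustrative only: real unwind information is encoded in the data structures of Section A.4, whereas here each region is a hypothetical (start, end) address pair with an exclusive end, and the function names are invented for the example.

```python
def in_any(pc, ranges):
    """True when pc falls inside one of the (start, end) ranges."""
    return any(start <= pc < end for start, end in ranges)


def can_be_current(pc, prologue_regions, body_regions, epilogue_ranges):
    """Apply the two conditions under which a procedure can be current."""
    if in_any(pc, body_regions):
        # Inside a body region, but an epilogue within it does not count.
        return not in_any(pc, epilogue_ranges)
    if in_any(pc, prologue_regions):
        # A procedure executing its prologue is not yet active.
        return False
    # No unwind descriptor covers the PC: a null frame procedure by default.
    return True
```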
4.8.2. Procedure Call Tracing
Procedure call tracing requires mechanisms for each of the following functions:
- To provide the context of a procedure invocation
- To walk (navigate) the procedure call stack
- To refer to a given procedure invocation
- To examine or modify the register context of an active procedure
This section describes the data structure mechanisms. The run-time library routines that support these functions are described in Section 4.8.3, “Invocation Context Block Access Routines”.
4.8.2.1. Invocation Context Block
The context of a specific procedure invocation is provided through the use of a data structure called an invocation context block (ICB). Table 4.16, “Contents of the Invocation Context Block” describes the contents of the OpenVMS I64 invocation context block.
Field | Size | Description |
---|---|---|
LIBICB$L_CONTEXT_LENGTH | Longword | Unsigned total length in bytes of the invocation context block. See Section 4.8.3.1, “Initializing the Invocation Context Block”. |
LIBICB$V_FRAME_FLAGS | 3 Bytes | See Table 4.17, “Flags in LIBICB$V_FRAME_FLAGS Field of the Invocation Context Block”. |
LIBICB$B_BLOCK_VERSION | Byte | ICB version; initial value of 2 for OpenVMS I64 (1 is for OpenVMS Alpha). See Section 4.8.3.1, “Initializing the Invocation Context Block”. |
LIBICB$IH_IREG | 128 Quadwords | Array of general registers (only those allocated; unallocated registers are uninitialized). |
LIBICB$IH_GRNAT | 2 Quadwords | General register NaT collection. |
LIBICB$FO_F2_F31 | 30 Octawords | Floating-point registers F2-F31. Array of floating-point register values in register format, as saved by a SPILL instruction. |
LIBICB$PH_F32_F127 | Quadword | Pointer to array of floating-point values in register format for registers F32-F127, as saved by SPILL instruction. A pointer value of 0 indicates that the contents of registers F32-F127 are not defined. |
LIBICB$IH_BRANCH | 8 Quadwords | Array of branch registers. |
LIBICB$IH_RSC | Quadword | Register Stack Configuration register. |
LIBICB$IH_BSP | Quadword | Backing store pointer. |
LIBICB$IH_BSPSTORE | Quadword | Backing store write pointer. |
LIBICB$IH_RNAT | Quadword | RSE NaT collection register. |
LIBICB$IH_CCV | Quadword | Compare and Exchange Value register. |
LIBICB$IH_UNAT | Quadword | User NaT collection register. |
LIBICB$IH_PFS | Quadword | Previous function state. |
LIBICB$IH_LC | Quadword | Loop count register. |
LIBICB$IH_EC | Quadword | Epilogue Count register. |
LIBICB$IH_CSD | Quadword | Copy of the AR.CSD. |
LIBICB$IH_SSD | Quadword | Copy of the AR.SSD. |
LIBICB$Q_PRED | Quadword | Predicate collection register, P0-P63. This field is a bitvector with bit 0 reserved. |
LIBICB$IH_PC | Quadword | Current instruction pointer; the slot number overlays <1:0>. |
LIBICB$IH_CFM | Quadword | Current Frame Marker. |
LIBICB$IH_UM | Quadword | User mask bits from PSR. |
LIBICB$O_GR_VALID | Octaword | General Register validity mask. |
LIBICB$L_FR_VALID | Longword | Floating-Point Register validity mask for registers F2-F31. |
LIBICB$Q_BR_VALID | Quadword | Branch Register validity mask. |
LIBICB$Q_AR_VALID | Quadword | Application Register validity mask. |
LIBICB$Q_OTHER_VALID | Quadword | PC and CFM validity mask. |
LIBICB$Q_PR_VALID | Quadword | Predicate Register validity mask. |
LIBICB$IH_ORIGINAL_ | Quadword | Original address of the general register spill area (normally &icb->LIBICB$IH_IREG[0]). |
LIBICB$IH_PSP | Quadword | Previous stack pointer. |
LIBICB$IH_RETURN_PC | Quadword | Return PC. |
LIBICB$IH_PREV_BSP | Quadword | Previous BSP. |
LIBICB$PH_CHFCTX_ADDR | Quadword | Pointer to condition handler facility context block. |
LIBICB$IH_OSSD | Quadword | Copy of OSSD from Unwind Information Block. |
LIBICB$IH_HANDLER_FV | Quadword | Condition Handler Function Value. |
LIBICB$PH_LSDA | Quadword | Address of the Language Specific Data Area of the Unwind Information Block. |
Beginning of User Override Parameters (offset LIBICB$R_UO_BASE) | | |
LIBICB$Q_UO_FLAGS | Quadword | Operational flags: LIBICB$V_UO_FLAG_CACHE_UNWIND – Cache unwind information during a walk of the call stack. See Section 4.8.3.2, “Walking the Call Stack”. |
LIBICB$IH_UO_IDENT | Quadword | User context variable; passed by value to the callback routines. See Section 4.8.5, “Invocation Context Callback Routines”. |
LIBICB$PH_UO_READ_MEM | Quadword | Pointer to user read memory routine. See Section 4.8.5.3, “The Read Memory Routine”. |
LIBICB$PH_UO_GETUEINFO | Quadword | Pointer to user get unwind entry information routine. See Section 4.8.5.1, “The Get Unwind Information Routine”. |
LIBICB$PH_UO_GETCONTEXT | Quadword | Pointer to user get initial context routine. See Section 4.8.5.2, “The Get Initial Context Routine”. |
LIBICB$PH_UO_WRITE_MEM | Quadword | Pointer to user write memory routine. See Section 4.8.5.4, “The Write Memory Routine”. |
LIBICB$PH_UO_WRITE_REG | Quadword | Pointer to user write register routine. See Section 4.8.5.5, “The Write Register Routine”. |
LIBICB$PH_UO_MALLOC | Quadword | Pointer to user memory allocate routine. See Section 4.8.5.6, “The Memory Allocation Routine”. |
LIBICB$PH_UO_FREE | Quadword | Pointer to user memory free routine. See Section 4.8.5.7, “The Memory Deallocation Routine”. |
End of user override parameters (length of LIBICB$K_UO_LENGTH) | | |
LIBICB$L_ALERT_CODE | Longword | Stack walk detailed status. Alert codes are enumerated in the LIBICB include files. See Section 4.8.3.7, “LIB$I64_GET_CURR_INVO_CONTEXT”. |
LIBICB$IH_SYSTEM_ | n Quadwords | Variable-sized area; unused and undefined at this time. |
Flag | Description |
---|---|
LIBICB$V_BOTTOM_OF_STACK | Set to 1 if this is the bottom of the stack and there is absolutely no previous frame. |
LIBICB$V_HANDLER_PRESENT | Set to 1 if this frame has a condition handler. |
LIBICB$V_IN_PROLOGUE | Set to 1 if the PC is in a prologue region. |
LIBICB$V_IN_EPILOGUE | Set to 1 if the PC is in an epilogue region. |
LIBICB$V_HAS_MEM_STK_FRAME | Set to 1 if this frame has a memory stack. |
LIBICB$V_HAS_REG_STK_FRAME | Set to 1 if this frame has a register stack. |
Static scratch registers, unless saved and described in the unwind table information, are not realizable except for an invocation context preceding an exception or AST frame.
4.8.2.2. Invocation Context Handle
To refer to a specific procedure invocation at run-time, an invocation context handle (ICH) can be used. The invocation context handle is a quadword that uniquely identifies any one of the active frames on a call stack, even when one or more of the frames correspond to procedures that have no associated stack storage.
The characteristics of the caller are used to determine the invocation context handle. If the caller has a register frame, then the RSE Backing Store Pointer (BSP) is used as the handle; otherwise, the caller's Stack Pointer is used. (The caller's Stack Pointer is sometimes called Stack Pointer on Entry or Previous Stack Pointer (PSP).)
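The selection rule for the handle can be stated in a few lines. This Python sketch is an illustrative model, not the library implementation; the function name and its three parameters are hypothetical stand-ins for the caller's characteristics described above.

```python
QUADWORD_MASK = (1 << 64) - 1


def invocation_handle(has_register_frame, bsp, psp):
    """Choose the quadword value that identifies an invocation.

    has_register_frame, bsp, and psp are hypothetical inputs that describe
    the caller: whether it has a register frame, the RSE backing store
    pointer, and the previous stack pointer.
    """
    chosen = bsp if has_register_frame else psp
    return chosen & QUADWORD_MASK
```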
4.8.3. Invocation Context Block Access Routines
Note
The OpenVMS I64 stack tracing routines use heap storage during the analysis of unwind descriptors. The default heap storage mechanism uses a LIBRTL implementation of the C RTL function malloc, the use of which may result in virtual memory being expanded using the $EXPREG system service. See Section 4.8.5, “Invocation Context Callback Routines” for information about overriding the defaults. See also Section 4.8.3.12, “LIB$I64_PREV_INVO_END”.
4.8.3.1. Initializing the Invocation Context Block
Before an invocation context block is passed to any of the invocation context block access routines, it must be initialized as follows:
1. Allocate the block on an octaword (16-byte) boundary.
2. Clear (set to all zero bytes) the entire block.
3. Initialize the LIBICB$L_CONTEXT_LENGTH field to LIBICB$K_INVO_CONTEXT_BLK_SIZE and the LIBICB$B_BLOCK_VERSION field to LIBICB$K_INVO_CONTEXT_VERSION.
4. Set any required parameters in the user override portion of the invocation context block.
5. Set the LIBICB$V_UO_FLAG_CACHE_UNWIND flag if appropriate. See also Section 4.8.3.2, “Walking the Call Stack” and Section 4.8.3.12, “LIB$I64_PREV_INVO_END” regarding subsequent use of LIB$I64_PREV_INVO_END.
Failure to do so will cause these routines to return an error status. Note that this is a change from Alpha, where initialization was not necessary.
The following routines are provided to help manage invocation context blocks:
- LIB$I64_CREATE_INVO_CONTEXT (see Section 4.8.3.3, “LIB$I64_CREATE_INVO_CONTEXT”)
- LIB$I64_FREE_INVO_CONTEXT (see Section 4.8.3.4, “LIB$I64_FREE_INVO_CONTEXT”)
- LIB$I64_INIT_INVO_CONTEXT (see Section 4.8.3.5, “LIB$I64_INIT_INVO_CONTEXT”)
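The clearing and field-initialization steps can be modeled on a byte buffer. The Python sketch below is illustrative only and makes several assumptions: the block size is a made-up stand-in for LIBICB$K_INVO_CONTEXT_BLK_SIZE, the field offsets are inferred from the field order in Table 4.16 rather than taken from the LIBICB include files, and the octaword alignment and user override steps are not modeled. The function names are hypothetical.

```python
import struct

# Hypothetical stand-ins; the real constants come from the LIBICB include files.
INVO_CONTEXT_BLK_SIZE = 4096   # stand-in for LIBICB$K_INVO_CONTEXT_BLK_SIZE
INVO_CONTEXT_VERSION = 2       # Table 4.16 gives 2 as the initial I64 version

# Assumed offsets, inferred from the field order in Table 4.16.
OFF_CONTEXT_LENGTH = 0   # longword
OFF_BLOCK_VERSION = 7    # byte that follows the 3-byte flags field


def init_invo_context():
    """Create a cleared block and fill in its length and version fields."""
    icb = bytearray(INVO_CONTEXT_BLK_SIZE)  # every byte starts as zero
    struct.pack_into("<I", icb, OFF_CONTEXT_LENGTH, len(icb))
    struct.pack_into("<B", icb, OFF_BLOCK_VERSION, INVO_CONTEXT_VERSION)
    return icb


def looks_initialized(icb):
    """Check the two fields that the initialization steps require."""
    (length,) = struct.unpack_from("<I", icb, OFF_CONTEXT_LENGTH)
    return length == len(icb) and icb[OFF_BLOCK_VERSION] == INVO_CONTEXT_VERSION
```

A block that was only cleared fails the check, which mirrors the error status that the access routines return for an uninitialized block.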
4.8.3.2. Walking the Call Stack
During the course of program execution, it is sometimes necessary to walk the call stack. Frame-based exception handling is one case where this is done. Call stack navigation is possible only in the reverse direction (in a latest-to-earliest or top-to-bottom sequence).
To walk the call stack, perform the following steps:
1. Given a program state (which contains a register set), build an invocation context. For the current routine, an initial invocation context block can be obtained by calling the LIB$I64_GET_CURR_INVO_CONTEXT routine (see Section 4.8.3.7, “LIB$I64_GET_CURR_INVO_CONTEXT”).
2. Repeatedly call the LIB$I64_GET_PREV_INVO_CONTEXT routine (see Section 4.8.3.8, “LIB$I64_GET_PREV_INVO_CONTEXT”) until the desired invocation context, or the end of the call chain, has been reached.
LIB$I64_GET_PREV_INVO_CONTEXT indicates the end of the invocation call chain if either of the following conditions is true:
- The OSSD$V_BOTTOM_OF_STACK flag is set for the target frame (see Table A.14, “Operating System-Specific Data Area”).
- The return address (IP) of the target frame is zero.
To make the stack walk more efficient, you can set the LIBICB$V_UO_FLAG_CACHE_UNWIND flag. This causes unwind information to be carried over from one call to LIB$I64_GET_PREV_INVO_CONTEXT to the next. At the conclusion of the stack walk, you must call LIB$I64_PREV_INVO_END to free any cached unwind information. This is the recommended practice, but not the default behavior.
Compilers are allowed to optimize high-level language procedure calls in such a way that they do not appear in the invocation chain. For example, inline procedures never appear in the invocation chain.
Make no assumptions about the relative positions of any memory used for procedure frame information. There is no guarantee that successive stack frames will always appear at higher addresses.
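The shape of a stack walk, including the two end-of-chain conditions, can be modeled without the run-time library. In the following Python sketch, everything is a hypothetical stand-in: `frames` is a list ordered from the current procedure to the earliest, each entry is a dictionary with `name`, `return_pc`, and `bottom` keys, and `get_prev_invo_context` only imitates the return status convention of LIB$I64_GET_PREV_INVO_CONTEXT.

```python
def at_end_of_chain(frames, index):
    """True when the frame is flagged as bottom or has a zero return address."""
    frame = frames[index]
    return (frame["bottom"] or frame["return_pc"] == 0
            or index == len(frames) - 1)


def get_prev_invo_context(frames, context):
    """Update context in place to describe the previous (calling) frame."""
    if context["bottom_of_stack"]:
        return 0  # the initial context is already the bottom of the stack
    context["index"] += 1
    context["name"] = frames[context["index"]]["name"]
    context["bottom_of_stack"] = at_end_of_chain(frames, context["index"])
    return 1


def walk(frames):
    """Return procedure names from the current frame back to the earliest."""
    context = {"index": 0, "name": frames[0]["name"],
               "bottom_of_stack": at_end_of_chain(frames, 0)}
    names = [context["name"]]
    while get_prev_invo_context(frames, context) == 1:
        names.append(context["name"])
    return names
```

In real code that sets LIBICB$V_UO_FLAG_CACHE_UNWIND, the loop would be followed by a call to LIB$I64_PREV_INVO_END; the model omits that step because it caches nothing.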
4.8.3.3. LIB$I64_CREATE_INVO_CONTEXT
This convenience routine simplifies creating and properly initializing an invocation context block. The routine allocates an invocation context block from heap storage and initializes it according to the steps described in Section 4.8.3.1, “Initializing the Invocation Context Block”. Users of this routine should call LIB$I64_FREE_INVO_CONTEXT when the invocation context block is no longer required.
LIB$I64_CREATE_INVO_CONTEXT ([malloc] [, free] [, ident])
Argument | OpenVMS Usage | Type | Access | Mechanism |
---|---|---|---|---|
malloc | function_value | procedure | read | by value |
free | function_value | procedure | read | by value |
ident | user_value | quadword | read | by value |
Argument | Description |
---|---|
malloc | A procedure reference for a user callback routine that allocates memory. See Section 4.8.5.6, “The Memory Allocation Routine” for details of this routine. This is an optional argument. The default is to use an implementation of the C RTL routine malloc. If specified, this routine is used to allocate the invocation context block and is also placed in the invocation context block field LIBICB$PH_UO_MALLOC for use during the stack walk. |
free | A procedure reference for a user callback routine that deallocates memory. This value is placed in the invocation context block field LIBICB$PH_UO_FREE. See Section 4.8.5.7, “The Memory Deallocation Routine” for details on this routine. This is an optional argument; however, it must be specified if malloc is specified. The default is to use an implementation of the C RTL routine free. |
ident | Specifies a user ident value to be placed in the invocation context block LIBICB$IH_UO_IDENT field. In turn, this value is passed to the malloc and free routines, described in Section 4.8.5.6, “The Memory Allocation Routine” and Section 4.8.5.7, “The Memory Deallocation Routine” respectively. This is an optional argument; the default value is zero. |
Return value: A non-zero value represents the address of the invocation context block allocated. A value of 0 indicates failure.
4.8.3.4. LIB$I64_FREE_INVO_CONTEXT
LIB$I64_FREE_INVO_CONTEXT (invo_context)
Argument | OpenVMS Usage | Type | Access | Mechanism |
---|---|---|---|---|
invo_context | invo_context_blk | structure | modify | by reference |
Argument | Description |
---|---|
invo_context | Address of an invocation context block. |
Return value: None.
4.8.3.5. LIB$I64_INIT_INVO_CONTEXT
LIB$I64_INIT_INVO_CONTEXT (invo_context, invo_version [, cache_unwind_flag])
Argument | OpenVMS Usage | Type | Access | Mechanism |
---|---|---|---|---|
invo_context | invo_context_blk | structure | modify | by reference |
invo_version | version_number | byte | read | by value |
cache_unwind_flag | flag | longword | read | by value |
Argument | Description |
---|---|
invo_context | Address of an invocation context block. |
invo_version | The value LIBICB$K_INVO_CONTEXT_VERSION. This is used to verify the operating environment. |
cache_unwind_flag | A flag indicating if the cache unwind flag, LIBICB$V_UO_FLAG_CACHE_UNWIND, should be set in the invocation context block. A value of zero clears the flag; a value of one sets the flag. This is an optional argument. The default is zero. |
Return value: A value of 1 indicates success. A value of 0 indicates a version number mismatch.
4.8.3.6. LIB$I64_GET_INVO_CONTEXT
LIB$I64_GET_INVO_CONTEXT(invo_handle, invo_context)
Argument | OpenVMS Usage | Type | Access | Mechanism |
---|---|---|---|---|
invo_handle | invo_handle | quadword | read | by reference |
invo_context | invo_context_blk | structure | modify | by reference |
Argument | Description |
---|---|
invo_handle | Address of the location that contains the handle for the desired invocation. |
invo_context | Address of an invocation context block into which the procedure context of the frame specified by invo_handle will be written. |
Note
The invocation context block must be properly initialized as described in Section 4.8.3.1, “Initializing the Invocation Context Block” before calling this routine.
Return value: Status value. A value of 1 indicates success; a value of 0 indicates failure.
Note
If the invocation handle that was passed does not represent any procedure context in the active call stack, the new contents of the context block are unpredictable.
4.8.3.7. LIB$I64_GET_CURR_INVO_CONTEXT
LIB$I64_GET_CURR_INVO_CONTEXT(invo_context)
Argument | OpenVMS Usage | Type | Access | Mechanism |
---|---|---|---|---|
invo_context | invo_context_blk | structure | modify | by reference |
Argument | Description |
---|---|
invo_context | Address of an invocation context block into which the procedure context of the caller will be written. |
Note
The invocation context block must be properly initialized as described in Section 4.8.3.1, “Initializing the Invocation Context Block” before calling this routine.
Zero |
This facilitates use in the implementation of the C language unwind
|
4.8.3.8. LIB$I64_GET_PREV_INVO_CONTEXT
LIB$I64_GET_PREV_INVO_CONTEXT(invo_context)
Argument | OpenVMS Usage | Type | Access | Mechanism |
---|---|---|---|---|
invo_context | invo_context_blk | structure | modify | by reference |
Argument | Description |
---|---|
invo_context | Address of a valid invocation context block. The given invocation context block is updated to represent the context of the previous (calling) frame. The LIBICB$V_BOTTOM_OF_STACK flag of the invocation context block is set if the target frame represents the end of the invocation call chain or if stack corruption is detected. |
Return value: Status value. A value of 1 indicates success. When the initial context represents the bottom of the call stack, a value of 0 is returned.
4.8.3.9. LIB$I64_GET_INVO_HANDLE
LIB$I64_GET_INVO_HANDLE(invo_context, invo_handle)
Argument | OpenVMS Usage | Type | Access | Mechanism |
---|---|---|---|---|
invo_context | invo_context_blk | structure | read | by reference |
invo_handle | invo_handle | quadword | write | by reference |
Argument | Description |
---|---|
invo_context | Address of a valid invocation context block. |
invo_handle | Address of the location into which the invocation context handle is to be written. If the call fails, the value of the invocation context handle is LIB$K_INVO_HANDLE_NULL. |
Return value: A value of 1 indicates success. A value of 0 indicates failure.
4.8.3.10. LIB$I64_GET_CURR_INVO_HANDLE
LIB$I64_GET_CURR_INVO_HANDLE(invo_handle)
Argument | OpenVMS Usage | Type | Access | Mechanism |
---|---|---|---|---|
invo_handle | invo_handle | quadword | write | by reference |
Argument | Description |
---|---|
invo_handle | Address of a quadword into which the invocation handle of the caller will be written. |
Return value: A value of 1 indicates success. A value of 0 indicates failure.
4.8.3.11. LIB$I64_GET_PREV_INVO_HANDLE
LIB$I64_GET_PREV_INVO_HANDLE (invo_handle_in, invo_handle_out)
Argument | OpenVMS Usage | Type | Access | Mechanism |
---|---|---|---|---|
invo_handle_in | invo_handle | quadword | read | by reference |
invo_handle_out | invo_handle | quadword | write | by reference |
Argument | Description |
---|---|
invo_handle_in | The address of an invocation handle that represents a target invocation context. |
invo_handle_out | Address of the location into which the invocation context handle of the previous context is to be written. If the call fails, the value of the previous invocation context handle is LIB$K_INVO_HANDLE_NULL. |
Return value: A value of 1 indicates success. A value of 0 indicates failure.
Note
Each call to this routine involves a stack walk from the top of the stack to find the procedure matching the input handle. Consequently, using this routine repeatedly is an inefficient way to walk the stack, compared to using LIB$I64_GET_PREV_INVO_CONTEXT.
4.8.3.12. LIB$I64_PREV_INVO_END
This routine should be called at the conclusion of call tracing operations to free the memory used to process unwind descriptors. The call tracing routines are LIB$I64_GET_INVO_CONTEXT, LIB$I64_GET_PREV_INVO_CONTEXT, and LIB$I64_GET_CURR_INVO_CONTEXT.
To provide efficient call tracing, some unwind information is tracked in heap storage from one call to the next. This heap storage should be freed before you release or reuse the invocation context block.
LIB$I64_PREV_INVO_END (invo_context)
Argument | OpenVMS Usage | Type | Access | Mechanism |
---|---|---|---|---|
invo_context | invo_context_blk | structure | modify | by reference |
Argument | Description |
---|---|
invo_context | Address of a valid invocation context block previously used for call tracing. |
Return value: A value of 1 indicates success. A value of 0 indicates failure.
4.8.3.13. LIB$I64_PUT_INVO_REGISTERS
LIB$I64_PUT_INVO_REGISTERS (invo_handle, invo_context [,gr_mask] [,fr_mask] [,br_mask] [,pr_mask] [,misc_mask])
Argument | OpenVMS Usage | Type | Access | Mechanism |
---|---|---|---|---|
invo_handle | invo_handle | quadword | read | by reference |
invo_context | invo_context_blk | structure | read | by reference |
gr_mask | mask_octaword | 128-bit vector | read | by reference |
fr_mask | mask_octaword | 128-bit vector | read | by reference |
br_mask | mask_byte | 8-bit vector | read | by reference |
pr_mask | mask_quadword | 64-bit vector | read | by reference |
misc_mask | mask_quadword | 64-bit vector | read | by reference |
Argument | Description |
---|---|
invo_handle | Handle for the invocation to be updated. |
invo_context | Address of a valid invocation context block that contains new register contents. At least one of the following mask arguments (gr_mask, fr_mask, br_mask, pr_mask, misc_mask) must be specified. |
gr_mask | Address of a 128-bit bit vector, where each bit corresponds to a register field in the passed invo_context. |
fr_mask | Address of a 128-bit bit vector, where each bit corresponds to a register field in the passed invo_context. |
br_mask | Address of an 8-bit bit vector, where each bit corresponds to a register field in the passed invo_context. |
pr_mask | Address of a 64-bit bit vector, where each bit corresponds to a register field in the passed invo_context. |
misc_mask | Address of a 64-bit bit vector, where each bit corresponds to a register field in the passed invo_context. Note that PC can only be updated when the invocation in question has been interrupted (either by exception or by an interrupt) and is logically previous to an invocation with the OSSD$V_EXCEPTION_FRAME bit set. |
Return value: A value of 1 indicates success. A value of 0 indicates failure, in which case nothing is changed.
Caution
Great care must be taken to assure that a valid stack frame and execution environment result; otherwise, execution may become unpredictable.
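Building the mask arguments is a matter of setting one bit per selected register. The Python sketch below is illustrative only and rests on an assumption that this section does not confirm: that bit n of a mask selects register n. The function names are hypothetical, and the actual bit assignments should be taken from the routine's full description and the LIBICB include files.

```python
def register_mask(registers, width):
    """Build a bit vector of `width` bits with one bit per listed register.

    Assumption: bit n selects register n.
    """
    mask = 0
    for number in registers:
        if not 0 <= number < width:
            raise ValueError(f"register {number} does not fit in {width} bits")
        mask |= 1 << number
    return mask


def as_quadwords(mask, width):
    """Split a mask into 64-bit pieces, lowest-order bits first."""
    count = (width + 63) // 64
    return [(mask >> (64 * i)) & 0xFFFFFFFFFFFFFFFF for i in range(count)]
```

Under that assumption, a 128-bit gr_mask that selects R4 and R70 occupies two quadwords, with bit 4 set in the first and bit 6 set in the second.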
4.8.4. Supplemental Invocation Context Access Routines
The routines described in this section can be used to perform some of the more common operations involving invocation contexts.
4.8.4.1. LIB$I64_GET_FR
Given an invocation context block and a floating-point register index such that 0 <= index < 128, copy the register value to fr_copy. For example, an index value of 4 fetches the value that represents the contents of F4 for the context.
LIB$I64_GET_FR returns failure status if the index represents a scratch register whose contents have not been realized.
LIB$I64_GET_FR (invo_context, index, fr_copy)
Argument | OpenVMS Usage | Type | Access | Mechanism |
---|---|---|---|---|
invo_context | invo_context_blk | structure | read | by reference |
index | index | longword | read | by value |
fr_copy | floating-point value | octaword | write | by reference |
Argument | Description |
---|---|
invo_context | Address of a valid invocation context block. |
index | Floating-point register index. |
fr_copy | Address of an octaword to receive the contents of the specified floating-point register. |
Return value: A value of 1 indicates success. A value of 0 indicates failure.
4.8.4.2. LIB$I64_SET_FR
Given an invocation context block, a floating-point register index, and a floating-point register value in fr_copy, write the corresponding invocation context block FREG entry, and call LIB$I64_PUT_INVO_REGISTERS to write the actual context. The invocation context block remains unchanged if the routine fails.
LIB$I64_SET_FR fails if LIB$I64_PUT_INVO_REGISTERS fails.
LIB$I64_SET_FR (invo_context, index, fr_copy)
Argument | OpenVMS Usage | Type | Access | Mechanism |
---|---|---|---|---|
invo_context | invo_context_blk | structure | modify | by reference |
index | index | longword | read | by value |
fr_copy | floating-point value | octaword | read | by reference |
Argument | Description |
---|---|
invo_context | Address of a valid invocation context block. |
index | Index into the FREG array of the invocation context block. |
fr_copy | Address of an octaword that contains the floating-point value to be written to the invocation context block. |
Return value: A value of 1 indicates success. A value of 0 indicates failure.
4.8.4.3. LIB$I64_GET_GR
Given an invocation context block and a general register index such that 0 <= index < 128, copy the register value to gr_copy. For example, an index value of 4 fetches the invocation context block IREG[4] value, which represents the contents of R4 for the context.
If the register represented by index has its corresponding NaT bit set, the read succeeds and the return status is set to 3. If the register represented by index lies beyond the allocated general registers, the read fails and gr_copy is unchanged. That is, the highest allowed index is 32 + ICB.CFM.SOF - 1.
LIB$I64_GET_GR fails if the index represents a scratch register whose contents have not been realized.
LIB$I64_GET_GR (invo_context, index, gr_copy)
Argument | OpenVMS Usage | Type | Access | Mechanism |
---|---|---|---|---|
invo_context | invo_context_blk | structure | read | by reference |
index | index | longword | read | by value |
gr_copy | integer value | quadword | write | by reference |
Argument | Description |
---|---|
invo_context | Address of a valid invocation context block. |
index | Index into the IREG array of the invocation context block. |
gr_copy | Address of a quadword to receive the value from the invocation context block. |
Return value: A value of 3 indicates success, and the NaT bit was set. A value of 1 indicates success, and the NaT bit was clear. A value of 0 indicates failure.
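The three status values of LIB$I64_GET_GR can be illustrated with a model of the index and NaT checks. The Python sketch below is not the library routine: the function name is hypothetical, it assumes that bit n of the 128-bit NaT collection corresponds to register n, and it does not model the failure case for unrealized scratch registers. The size-of-frame field is taken from the low-order seven bits of the Current Frame Marker, as the Itanium architecture defines it.

```python
def get_gr_status(index, cfm, grnat):
    """Return 0 (failure), 1 (success), or 3 (success with the NaT bit set).

    cfm is the Current Frame Marker value; grnat is the general register
    NaT collection treated as one 128-bit integer (assumed layout).
    """
    sof = cfm & 0x7F            # CFM.SOF: size of the stacked register frame
    highest = 32 + sof - 1      # highest allowed index, as stated above
    if not 0 <= index <= highest:
        return 0                # beyond the allocated general registers
    nat_set = (grnat >> index) & 1
    return 3 if nat_set else 1
```

With a frame size of 8, for example, indexes 0 through 39 are readable and index 40 fails.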
4.8.4.4. LIB$I64_SET_GR
Given an invocation context block, a general register index such that 1 <= index < 128, and a quadword value gr_copy, write the corresponding invocation context block general register, clear the corresponding NaT bit, and use LIB$I64_PUT_INVO_REGISTERS to write to the actual context. The invocation context block remains unchanged if the routine fails.
LIB$I64_SET_GR fails if LIB$I64_PUT_INVO_REGISTERS fails.
LIB$I64_SET_GR (invo_context, index, gr_copy)
Argument | OpenVMS Usage | Type | Access | Mechanism |
---|---|---|---|---|
invo_context | invo_context_blk | structure | modify | by reference |
index | index | longword | read | by value |
gr_copy | integer value | quadword | read | by reference |
Argument | Description |
---|---|
invo_context | Address of a valid invocation context block. |
index | Index into the IREG array of the invocation context block. |
gr_copy | Address of a quadword that contains the value to be written to the invocation context block. |
Return value: A value of 1 indicates success. A value of 0 indicates failure.
4.8.4.5. LIB$I64_SET_PC
Given an invocation context block and a quadword PC value in pc_copy, write the pc_copy value to the invocation context block PC and then use LIB$I64_PUT_INVO_REGISTERS to write to the actual context. The invocation context block remains unchanged if the routine fails.
LIB$I64_SET_PC fails if LIB$I64_PUT_INVO_REGISTERS fails.
LIB$I64_SET_PC (invo_context, pc_copy)
Argument | OpenVMS Usage | Type | Access | Mechanism |
---|---|---|---|---|
invo_context | invo_context_blk | structure | modify | by reference |
pc_copy | PC value | quadword | read | by reference |
Argument | Description |
---|---|
invo_context | Address of a valid invocation context block. |
pc_copy | Address of a quadword that contains the PC value to be written to the invocation context block. |
Return value: A value of 1 indicates success. A value of 0 indicates failure.
4.8.4.6. LIB$I64_GET_UNWIND_LSDA
Given a pc_value, find the address of the unwind information block language-specific data area (LSDA), if present, and write it to unwind_lsda_p. If not present, then write 0 to unwind_lsda_p.
LIB$I64_GET_UNWIND_LSDA (pc_value, unwind_lsda_p)
Argument | OpenVMS Usage | Type | Access | Mechanism |
---|---|---|---|---|
pc_value | PC value | quadword | read | by reference |
unwind_lsda_p | address | quadword | write | by reference |

pc_value | Address of a location that contains the PC value. |
unwind_lsda_p | Address of a quadword to receive the address of the language-specific data area, if there is one. |
|
A value of 1 indicates success. A value of 0 indicates failure. |
4.8.4.7. LIB$I64_GET_UNWIND_OSSD
Given a pc_value, find the address of the unwind information block operating system-specific data area, if present, and write it to unwind_ossd_p. If not present, then write 0 to unwind_ossd_p.
LIB$I64_GET_UNWIND_OSSD (pc_value, unwind_ossd_p)
Argument | OpenVMS Usage | Type | Access | Mechanism |
---|---|---|---|---|
pc_value | PC value | quadword | read | by reference |
unwind_ossd_p | address | quadword | write | by reference |

pc_value | Address of a location that contains the PC value. |
unwind_ossd_p | Address of a quadword to receive the address of the operating system-specific data area. |
|
A value of 1 indicates success. A value of 0 indicates failure. |
4.8.4.8. LIB$I64_GET_UNWIND_HANDLER_FV
Given a pc_value, find the function value (address of the procedure descriptor) for the condition handler, if present, and write it to handler_fv. If not present, then write 0 to handler_fv.
LIB$I64_GET_UNWIND_HANDLER_FV (pc_value, handler_fv)
Argument | OpenVMS Usage | Type | Access | Mechanism |
---|---|---|---|---|
pc_value | PC value | quadword | read | by reference |
handler_fv | address | quadword | write | by reference |

pc_value | Address of a location that contains the PC value. |
handler_fv | Address of a quadword to receive the function value of the procedure descriptor for the condition handler, if there is one. |
|
A value of 1 indicates success. A value of 0 indicates failure. |
4.8.4.9. LIB$I64_IS_EXC_DISPATCH_FRAME
Used to determine whether a given PC value represents an exception dispatch frame.
LIB$I64_IS_EXC_DISPATCH_FRAME (pc_value)
Argument | OpenVMS Usage | Type | Access | Mechanism |
---|---|---|---|---|
pc_value | PC value | quadword | read | by reference |
|
Address of a quadword that contains the PC value. The
|
|
Returns 1 if the operating system-specific data area is present and the EXCEPTION_FRAME flag is set. Returns 0 if the operating system-specific data area is present and the EXCEPTION_FRAME flag is clear. Returns 0 if the operating system-specific data area is not present. |
4.8.4.10. LIB$I64_IS_AST_DISPATCH_FRAME
Used to determine whether a given PC value represents an AST dispatch frame.
LIB$I64_IS_AST_DISPATCH_FRAME (pc_value)
Argument | OpenVMS Usage | Type | Access | Mechanism |
---|---|---|---|---|
pc_value | PC value | quadword | read | by reference |
|
Address of a quadword that contains the PC value. The
|
|
Returns 1 if the operating system-specific data area is present and the AST_FRAME flag is set. Returns 0 if the operating system-specific data area is present and the AST_FRAME flag is clear. Returns 0 if the operating system-specific data area is not present. |
4.8.5. Invocation Context Callback Routines
Perform a call trace on a process other than the current process.
Override the heap storage mechanism used to allocate memory used during the analysis of unwind descriptors.
The user override callback mechanism provides a user ident value that is passed to each callback routine. The user ident value is stored in the LIBICB$IH_UO_IDENT field of the invocation context block.
Note
LIB$I64_GET_CURR_INVO_HANDLE
LIB$I64_GET_PREV_INVO_HANDLE
4.8.5.1. The Get Unwind Information Routine
int (* getueinfo) (uint64 pc, void *get_ue_block, void *name, ...);
This routine should mimic SYS$GET_UNWIND_ENTRY_INFO for the target process. See Section A.7, “System Unwind Routines” for detailed argument descriptions and return status, with the following notes:
The name argument is not used, and can be ignored. If a read memory callback has been specified, the contents of LIBICB$PH_UO_READ_MEM are passed as a fourth argument, and the contents of LIBICB$PH_UO_IDENT are passed as a fifth argument, otherwise the routine is called with three arguments.
4.8.5.2. The Get Initial Context Routine
Place a function pointer for this routine in the LIBICB$PH_UO_GETCONTEXT field of the invocation context block.
int (* getcontext) (void *invo_context, uint64 ident);
Argument | OpenVMS Usage | Type | Access | Mechanism |
---|---|---|---|---|
invo_context | invo_context_blk | structure | modify | by reference |
ident | user_value | quadword | read | by value |

invo_context | The address of the invocation context block. |
ident | Specifies a user ident value from the invocation context block. |
|
A value of 1 indicates success. A value of 0 indicates failure. |
4.8.5.3. The Read Memory Routine
Place a function pointer for this routine in the LIBICB$PH_UO_READ_MEM field of the invocation context block.
int (* read_mem) (void *dst, uint64 src, size_t length, uint64 ident);
Argument | OpenVMS Usage | Type | Access | Mechanism |
---|---|---|---|---|
dst | memory_access | byte_array | write | by reference |
src | memory_address | quadword | read | by value |
length | size_t | longword | read | by value |
ident | user_value | quadword | read | by value |

dst | A local memory address and the destination for the read operation. |
src | An address in the target process to be read. |
length | The length in bytes to be read. |
ident | Specifies a user ident value from the invocation context block. |
|
A value of 1 indicates success. A value of 0 indicates failure. |
4.8.5.4. The Write Memory Routine
Place a function pointer for this routine in the LIBICB$PH_UO_WRITE_MEM field of the invocation context block.
int (* write_mem) (void *src, uint64 dst, size_t length, uint64 ident);
Argument | OpenVMS Usage | Type | Access | Mechanism |
---|---|---|---|---|
src | memory_access | byte_array | read | by reference |
dst | memory_address | quadword | read | by value |
length | size_t | longword | read | by value |
ident | user_value | quadword | read | by value |

src | A local memory address and the source for the write operation. |
dst | An address in the target process to be written. |
length | The length in bytes to be written. |
ident | Specifies a user ident value from the invocation context block. |
|
A value of 1 indicates success. A value of 0 indicates failure. |
4.8.5.5. The Write Register Routine
Place a function pointer for this routine in the LIBICB$PH_UO_WRITE_REG field of the invocation context block.
The write register routine is used to write a register in the target process. It is used by LIB$I64_PUT_INVO_REGISTERS for a register that has not been saved in memory.
int (* write_reg) (int whichReg, uint64 value_1, uint64 value_2, uint64 ident);
Argument | OpenVMS Usage | Type | Access | Mechanism |
---|---|---|---|---|
whichReg | enumeration | longword | read | by value |
value_1 | register_value | quadword | read | by value |
value_2 | register_value | quadword | read | by value |
ident | user_value | quadword | read | by value |

whichReg | Indicates the register to be written (see enum in libicb.h). |
value_1 | Specifies the register contents, or lower quadword for a FR fill operation. |
value_2 | Specifies the NaT bit for GRs, or upper quadword for a FR fill. |
ident | Specifies a user ident value from the invocation context block. |
|
A value of 1 indicates success. A value of 0 indicates failure. |
4.8.5.6. The Memory Allocation Routine
void * (* malloc) (size_t size, uint64 ident);
Argument | OpenVMS Usage | Type | Access | Mechanism |
---|---|---|---|---|
size | size_t | longword | read | by value |
ident | user_value | quadword | read | by value |

size | The length in bytes of memory to be allocated. The returned memory block should be aligned on a 16-byte boundary. |
ident | Specifies a user ident value from the invocation context block. |
|
Address of the memory block allocated, or 0 for failure. |
One Unwind Context block of size LIBICB$K_CONTEXT_BLK_SIZE
One Unwind Descriptor block of size LIBICB$K_DESCRIPTOR_BLK_SIZE
Several Unwind region blocks of size LIBICB$K_REGION_BLK_SIZE
Several Unwind region label blocks of size LIBICB$K_REGIONLABEL_BLK_SIZE
The number of the last two required depends on the complexity of the unwind descriptors for a given procedure being traced.
4.8.5.7. The Memory Deallocation Routine
void (* free) (void * ptr, uint64 ident);
Argument | OpenVMS Usage | Type | Access | Mechanism |
---|---|---|---|---|
ptr | address | quadword | read | by value |
ident | user_value | quadword | read | by value |

ptr | Address of a memory block previously allocated by a call to the user malloc routine. |
ident | Specifies a user ident value from the invocation context block. |
Function Value Returned:
None.
4.9. Data Allocation
A linkage table, containing pointers to imported data and functions, and to data in the code segments and long data segments. This area is generally protected by OpenVMS against being written after image activation is complete.
A read-only short data area, containing small initialized own data items. This area is generally protected by OpenVMS against being written after image activation is complete. (This area is optional).
A read-write short data area, containing small initialized own data items.
A read-write short bss area, containing small uninitialized own data items.
One or more long data areas, which contain large initialized data items, and initialized non-own data items of any size.
One or more long bss areas, which contain large uninitialized data items, and uninitialized non-own data items of any size.
Own data items are those that are either local to an image, or are such that all references to these items from the same image will always refer to these items. Because non-own variables cannot be referenced directly, there is no benefit to placing them in the short data area or bss area. Small own data items are placed in the short bss area or short data areas, and are guaranteed to be within 2 megabytes (in either direction) of the GP address; this allows compilers to use a short direct addressing sequence (using the add with 22-bit immediate instruction) to access any data item allocated in these areas.
The compiler should place all own data items that are 8 bytes or less in size (regardless of structure) in one of the short data areas or the short bss area. All other data items, including items that are larger than 8 bytes in size, must be placed in one of the long data areas or long bss areas. The compiler must address these items indirectly, using a linkage table entry. Linkage table entries are typically allocated by the linker in response to a relocation request generated by the compiler; an entry in the linkage table is either a pointer to a data item, or a function descriptor. A function descriptor placed in the linkage table is a local copy of an official function descriptor that is generally allocated by the linker or image activator.
This design allows for a maximum size of 4 megabytes for the short data segment, because everything must be addressable via the GP register using the 22-bit add immediate instruction. This allows for up to 256,000 individually-named variables and functions. If an image requires more than this, linker options may be used to divide the image into multiple clusters (see Section 4.7.1, “The GP Register”).
4.9.1. Data Alignment
On Itanium hardware, memory references to data that is not naturally aligned can result in alignment faults, which can severely degrade the performance of all procedures that reference the unaligned data. To avoid such performance degradation, all data values should be naturally aligned, as shown in Table 4.18, “Natural Alignment Requirements”.
malloc), and global data items greater than 8 bytes must be aligned on a 16-byte boundary.
Data Type | Alignment Starting Position |
---|---|
8-bit character string | Byte boundary |
16-bit integer | Address that is a multiple of 2 (word alignment) |
32-bit integer | Address that is a multiple of 4 (longword alignment) |
64-bit integer | Address that is a multiple of 8 (quadword alignment) |
| Address that is a multiple of 4 (longword) |
| Address that is a multiple of 8 (quadword) |
| Address that is a multiple of 8 (quadword) |
| Address that is a multiple of 4 (longword) |
| Address that is a multiple of 8 (quadword) |
| Address that is a multiple of 16 (octaword) |
For aggregates such as strings, arrays, and records, the data type to be considered for purposes of alignment is not the aggregate itself, but rather the elements of which the aggregate is composed. The alignment requirement of an aggregate is that all elements of the aggregate be naturally aligned. For example, varying 8-bit character strings must start at addresses that are a multiple of at least 2 (word alignment) because of the 16-bit count at the beginning of the string; 32-bit integer arrays start at a longword boundary, irrespective of the extent of the array.
The rules for passing a record in an argument that is passed by immediate value (see Section 4.7.4, “Parameter Passing”) always provide quadword alignment of the record value independent of the normal alignment requirement of the record. If deemed appropriate by an implementation, normal alignment can be established within the called procedure by making a copy of the record argument at a suitably aligned location.
4.9.2. Global Data
Access to global variables that are not known (at compile time) to be defined in the same image must be indirect. Each image has a linkage table in its data segment, pointed to by the GP register; code must load a pointer to the global variable from the linkage table, then access the global variable through the pointer. Access to global variables known to be defined in the same image or to static locals that are placed in the short data area may be made with a GP-relative offset.
4.9.3. Local Static Data
Access to short local static data can be made with a GP-relative offset; access to long local static data must be indirect.
4.9.4. Constants and Literals
Constants and literals may be placed in the text segment or in the data segment. If placed in the text segment, the access must be PC-relative or indirect using a linkage table entry. Literals placed in the data segment may be placed in the short initialized data area if they are 8 bytes or less in size. Larger literals must be placed in the long initialized data area or in the text segment. Literals in the long initialized data area require an indirect access using a linkage table entry.
4.9.5. Record Layout Conventions
The OpenVMS I64 calling standard rules for record layout are designed to provide good run-time performance on all implementations of the Itanium architecture and to provide the required level of compatibility with conventional VAX and Alpha operating environments.
Those optimized for optimal access characteristics (referred to as aligned record layouts)
Those compatible with conventions that are traditionally used by VAX languages (referred to as VAX compatible record layouts)
Only these record layouts may be used across standard interfaces or between languages. Languages can support other language-specific record layout conventions, but such layouts are nonstandard.
The aligned record layout conventions should be used unless interchange is required with conventional VAX applications that use the OpenVMS VAX compatible record layouts.
4.9.5.1. Aligned Record Layout
All components of a record or subrecord are naturally aligned.
Layout and alignment of record elements and subrecords are independent of any record or subrecord in which they are embedded.
Layout and alignment of a subrecord is the same as if it were a top-level record.
Declaration in high-level languages of standard records for interlanguage use is straightforward and obvious, and meets the requirements for source-level compatibility between OpenVMS I64 languages and OpenVMS Alpha and VAX languages.
The components of a record must be laid out in memory corresponding to the lexical order of their appearance in the high-level language declaration of the record.
The first bit of a record or subrecord must be directly addressable (byte aligned).
Records and subrecords must be aligned according to the largest natural alignment requirements of the contained elements and subrecords.
Bit fields (packed subranges of integers) are characterized by an underlying integer type that is a byte, word, longword, or quadword in size, together with an allocation size in bits. A bit field is allocated at the next available bit boundary, provided that the resulting allocation does not cross an alignment boundary of the underlying type. Otherwise, the field is allocated at the next byte boundary that is aligned as required for the underlying type. (In the latter case, the space skipped over is left permanently unallocated.) In addition, if necessary, the alignment of the record as a whole is increased to that of the underlying integer type.
Unaligned bit strings, unaligned bit arrays, and elements of unaligned bit arrays must start at the next available bit in the record. No fill is ever supplied preceding an unaligned bit string, unaligned bit array, or unaligned bit array element.
All other components of a record must start at the next available naturally aligned address for the data type.
The length of a record must be a multiple of its alignment. (This includes the case when a record is a component of another record).
Strings and arrays must be aligned according to the natural alignment requirements of the data type of which the string or array is composed.
The length of an array element is a multiple of its alignment, even if this leaves unused space at its end. The length of the whole array is the sum of the lengths of its elements.
4.9.5.2. OpenVMS VAX Compatible Record Layout
The components of a record must be laid out in memory corresponding to the lexical order of their appearance in the high-level language declaration of the record.
Unaligned bit strings, unaligned bit arrays, and elements of unaligned bit arrays must start at the next available bit in the record. No fill is ever supplied preceding an unaligned bit string, unaligned bit array, or unaligned bit array element.
All other components of a record must start at the next available byte in the record. Any unused bits following the last-used bit in the last-used byte of each component must be filled out to the next byte boundary so that any following data starts on a byte boundary.
Subrecords must be aligned according to the largest alignment of the contained elements and subrecords. A subrecord always starts at the next available byte unless it consists entirely of unaligned bit data and it immediately follows an unaligned bit string, unaligned bit array, or a subrecord consisting entirely of unaligned bit data.
Records must be aligned on byte boundaries.
4.9.6. Sample Code Sequences
In the sample code sequences in this section, register names of the form t1, t2, and so on, are temporary registers, and may be assigned to any available scratch register. The code sequences show necessary cycle breaks, but no other scheduling considerations have been made. It is assumed that these code sequences will be scheduled with surrounding code to make best use of the processor resources.
4.9.6.1. Addressing Own Data in the Short Data Area
addl t1=@gprel(var),gp ;; // calc. address of var
ld8 loc0=[t1] // load contents of var

movl t1=@gprel(var) ;; // form gp-relative offset of var
add t2=t1,gp ;; // calc. address of var
ld8 loc0=[t2] // load contents of var
4.9.6.2. Addressing External Data or Data in a Long Data Area
addl t1=@ltoff(var),gp ;; // calc. address of LT entry
ld8 t2=[t1] ;; // load address of var
ld8 loc0=[t2] // load contents of var
4.9.6.3. Addressing Literals in the Text Segment
L1: mov r3=ip ;; // get current IP
addl loc0=litbase-L1,r3 ;; // calc. addr. of lit. area
adds t2=(lit-litbase),loc0 ;; // calc. address of lit.
ld8 loc1=[t2] // load value of literal
Note
The first two instructions can be moved towards the beginning of the procedure, and the base address of the literal area (in LOC0) can be shared by other literal references in the same procedure.
4.9.6.4. Materializing Function Pointers
Function pointers must always be obtained from the data segment, either as an initialized quadword or through the linkage table, as shown in the following examples:
Materializing function pointers through linkage table:
addl t1=@ltoff(@fptr(func)),gp ;; // calc address of LT entry
ld8 loc0=[t1] // load function pointer
Materializing function pointers in data:
fptr: data8 @fptr(func) // initialize function ptr
4.9.6.5. Jump Tables
High-level language constructs such as case and switch statements, where there are several possible local targets of a branch, may use a number of different code generation strategies, ranging from sequential conditional branches to a direct-lookup branch table.
Two branch table methods are described: The first places the branch table in a read-only segment separate from the code segment. The second places the branch table in the code segment. The advantage of the first is that it allows the code segment to have execute-only access, while the second may require the code segment to allow read access as well. The advantage of the second is that it does not require addressing the branch table via the GP and hence may be slightly faster. Both methods avoid the need for relocation during image activation.
The branch table method descriptions that follow include examples that use 64-bit entries. It is also valid to use 32-bit, 16-bit, or even 8-bit entries, provided that the smaller entry size is known to be sufficient to represent the required displacement without overflow.
Preferred Method
If a branch table is placed in a data segment separate from the code, each entry should be a byte displacement from a dispatch address located in the code segment to the branch target for that entry.
//
// Assume case index in loc0
//
addl loc1=@ltoff($DSPTBL1), gp // addr of GOT entry
ld8 loc2=[loc1] // load addr of dsp table
shladd loc3=loc0,3,loc2 // calc addr of dsp entry
ld8 loc4=[loc3] // load dsp table entry
$DA1: mov loc5=ip // get "dispatch address"
add loc6=loc5,loc4 // calc target address
mov b6=loc6
br.cond b6 // perform dispatch
$L1: {target for case 1}
...
$L2: {target for case 2}
...
... etc
// The dispatch table is in the linkage section. It consists
// of only constants (no relocations involved)
//
$DSPTBL1:
.data8 $L1-$DA1
.data8 $L2-$DA1
. . .
Alternative Method
If a branch table is placed in the same segment as the code, each table entry should be a 64-bit byte displacement from the base of the branch table to the branch target for that entry.
addl loc1=@ltoff(brtab),gp // calc. address of
;; // linkage table entry
ld8 loc2=[loc1] ;; // load addr. of br. table
shladd loc3=loc0,3,loc2 ;; // calc. address of branch
// table entry
ld8 loc4=[loc3] ;; // load branch table entry
add loc5=loc4,loc2 ;; // calc. target address
mov b6=loc5 ;; // move address to B6...
br.cond b6 ;; // ...and branch
Chapter 5. OpenVMS x86-64 Conventions
This chapter describes the fundamental concepts and conventions for calling a procedure in an OpenVMS x86-64 environment. These conventions are based on industry standards with extensions to be compatible with other OpenVMS systems. See Section C.2, “Differences from Industry x86-64 Software Conventions” for additional information.
5.1. x86-64 Register Usage
General-purpose
Floating-point and related control/status
Segment
Legacy pseudo-registers
5.1.1. x86-64 Register Classes
Scratch registers—may be modified by a procedure call; the caller must save these registers before a call if needed (caller save).
Preserved registers—must not be modified by a procedure call; the callee must save and restore these registers if used (callee save). A procedure using one of the preserved general-purpose registers must save and restore the original content of the caller.
One way to preserve a register is not to use it at all.
Special registers—used in the calling standard call/return mechanism.
Volatile registers—may be used as scratch registers within a procedure and are not preserved across a call; may not be used to pass information between procedures either as input or output.
5.1.2. x86-64 General-Purpose Register Usage
Register | Class | Usage |
---|---|---|
%rax %eax %ax %al %ah | Scratch |
|
%rbx %ebx %bx %bl %bh | Preserved | Callee-saved registers. |
%rcx %ecx %cx %cl %ch | Scratch | Pass the 4th argument to procedures. |
%rdx %edx %dx %dl %dh | Scratch |
|
%rsi %esi %si %sil | Scratch | Pass the 2nd argument to procedures. |
%rdi %edi %di %dil | Scratch | Pass the 1st argument to procedures. |
%rbp %ebp %bp %bpl | Preserved | Used as a frame pointer, if manifested in a register. |
%rsp %esp %sp %spl | Special | Stack pointer. |
%r8 %r8d %r8w %r8l | Scratch | Pass the 5th argument to procedures. |
%r9 %r9d %r9w %r9l | Scratch | Pass the 6th argument to procedures. |
%r10 %r10d %r10w %r10l | Scratch | Pass the environment value when calling a bound procedure. |
%r11 %r11d %r11w %r11l | Volatile | Available for use in call stubs, trampolines, and other constructs. |
| Preserved | Callee-saved registers. |
RFLAGS | Preserved | The Direction Flag (DF) bit must be zero at procedure call and return. |
| Scratch | All other bits. |
%rip | Special | Instruction pointer, not directly addressable by software. |
5.1.3. x86-64 Floating-Point Register Usage (SSE)
The base x86-64 architecture provides 16 SSE floating-point registers, each 128 bits wide.
The Intel AVX (Advanced Vector Extensions) option provides 16 256-bit wide AVX registers (%ymm0 through %ymm15). The lower 128 bits of %ymm0 through %ymm15 are aliased to the respective 128-bit SSE registers (%xmm0 through %xmm15).
The Intel AVX-512 option provides 32 512-bit wide SIMD registers (%zmm0 through %zmm31). The lower 128 bits of %zmm0 through %zmm31 are aliased to the respective 128-bit SSE registers (%xmm0 through %xmm31). The lower 256 bits of %zmm0 through %zmm31 are aliased to the respective 256-bit AVX registers (%ymm0 through %ymm31).
In addition, Intel AVX-512 also provides 8 vector mask registers (%k0 through %k7), each 64 bits wide.
For the purposes of parameter passing and function return, %xmmN, %ymmN, and %zmmN refer to the same register. Only one of them can be used at a time.
Vector register is used to refer to either an SSE, AVX, or AVX-512 register (but not a vector mask register). This document often uses the name SSE to refer collectively to the SSE registers together with either the AVX or AVX-512 options.
Register | Class | Usage |
---|---|---|
%xmm0 %ymm0 %zmm0 | Scratch |
|
%xmm1 %ymm1 %zmm1 | Scratch |
|
%xmm2 %ymm2 %zmm2 | Scratch | Pass the 3rd argument to procedures. |
%xmm3 %ymm3 %zmm3 | Scratch | Pass the 4th argument to procedures. |
%xmm4 %ymm4 %zmm4 | Scratch | Pass the 5th argument to procedures. |
%xmm5 %ymm5 %zmm5 | Scratch | Pass the 6th argument to procedures. |
%xmm6 %ymm6 %zmm6 | Scratch | Pass the 7th argument to procedures. |
%xmm7 %ymm7 %zmm7 | Scratch | Pass the 8th argument to procedures. |
| Scratch | Temporary registers. |
MXCSR | Preserved | The control flags (bits 6-15) are preserved. |
| Scratch | The other bits are scratch. |
Register | Class | Usage |
---|---|---|
%k0—%k7 | Scratch | Temporary registers |
5.1.4. x86-64 Floating-Point Register Usage (FPU)
OpenVMS x86-64 applications may use the x87 registers though there is little reason to do so. Packed, single- and double-precision floating-point operations are usually performed in the SSE registers, while the 80-bit extended-precision floating-point format is not supported by the OpenVMS compilers or run-times.
Register | Class | Usage |
---|---|---|
%st0 | Scratch | 1st return value register. |
%st1 | Scratch | 2nd return value register. |
%st2—%st7 | Scratch | Temporary registers. |
%mm0—%mm7 | Scratch | The MMX registers. Overlay the x87 floating-point (%st0—%st7) registers. |
Control Word | Preserved | Stores the value of the control word. |
Status Word | Scratch | Stores the value of the status word. |
| — | Not used by applications. |
The CPU should be in x87 mode, not MMX mode, on procedure entry and exit.
5.1.5. Floating-Point Status Management on OpenVMS
The floating-point hardware registers
A supplementary software register (a quadword)
The floating-point status is normally managed by three OpenVMS system services:
SYS$IEEE_SET_FP_CONTROL
SYS$IEEE_SET_PRECISION_MODE
SYS$IEEE_SET_ROUNDING_MODE
The supplementary software register is internal to OpenVMS and is not documented for general use. This register holds information that is used by OpenVMS to implement the three system services and handle floating-point exceptions in general. It can only be accessed indirectly using the system services.
Floating-point control status bits are bits or flags that control the floating-point arithmetic operations.
Floating-point information status bits are bits or flags that record summary information about the execution of previous floating-point arithmetic operations.
Note
The floating-point control status is sometimes informally called the floating-point mode or IEEE mode.
Full IEEE-format floating-point control status is the default, unless the status is explicitly set to another value.
VAX-format floating-point control status can be set for programs that use VAX-format floating-point processing.
Bit | Field | Group | IEEE-format setting | VAX-format setting |
---|---|---|---|---|
0 | Invalid Operation | Flags | 0 | 0 |
1 | Denormal | Flags | 0 | 0 |
2 | Zero Divide | Flags | 0 | 0 |
3 | Overflow | Flags | 0 | 0 |
4 | Underflow | Flags | 0 | 0 |
5 | Inexact | Flags | 0 | 0 |
6 | Denormals are Zeros | | 0 | 0 |
7 | Invalid Operation | Masks | 1 | 0 |
8 | Denormal | Masks | 1 | 1 |
9 | Zero Divide | Masks | 1 | 0 |
10 | Overflow | Masks | 1 | 0 |
11 | Underflow | Masks | 1 | 1 |
12 | Inexact | Masks | 1 | 1 |
14:13 | Rounding Control | | 00 (nearest) | 00 |
15 | Flush to Zero | | 0 | 0 |
31:16 | Reserved | | 0 | 0 |
Note
VAX floating-point data is never loaded or manipulated in the x86-64 floating-point registers. However, VAX floating-point values may be converted to IEEE floating-point values, which are then manipulated in the x86-64 floating-point registers.
Bit | Field | Group | IEEE-format setting | VAX-format setting |
---|---|---|---|---|
0 | Invalid Operation | Masks | 1 | 0 |
1 | Denormal | Masks | 1 | 1 |
2 | Zero Divide | Masks | 1 | 0 |
3 | Overflow | Masks | 1 | 0 |
4 | Underflow | Masks | 1 | 1 |
5 | Inexact | Masks | 1 | 1 |
7:6 | Reserved | | 0 | 0 |
9:8 | Precision Control | | 11 | 11 |
11:10 | Rounding Control | | 00 (nearest) | 00 |
15:13 | Reserved | | 0 | 0 |
Using a compiler or linker switch, you can associate a floating-point control status with the main procedure of a program to set the floating-point state prior to the beginning of program execution. If no control status is explicitly set, a default status appropriate for full IEEE computation is used.
5.1.6. x86-64 Segment Register Usage
Register | Class | Usage |
---|---|---|
%cs %ds %ss %es | — | Managed by OpenVMS and implicitly used by applications |
%fs | — | Reserved to OpenVMS |
%gs | — | Reserved to OpenVMS |
5.1.7. x86-64 Bound Register Usage
Use of the x86-64 bound registers is deprecated on OpenVMS. The only support provided is to context switch the contents of the bound registers as part of the normal application context; they are otherwise unused and unsupported.
5.1.8. Legacy Pseudo-Registers
The OpenVMS MACRO compiler for x86-64 (XMACRO) generates code that uses a set of pseudo-registers to emulate the Alpha register set. The pseudo-register set consists of 32 64-bit registers (R0—R31). The contents of these pseudo-registers are well defined only at procedure calls and returns; otherwise, XMACRO uses pseudo-registers at its discretion. No special semantics are associated with the pseudo-registers, even for the registers that would otherwise be considered special or part of the Alpha hardware.
The pseudo-registers are invisible to high-level languages, except for BLISS and VSI C. BLISS linkage attributes and VSI C linkage pragmas may be used to access pseudo-registers on calls and returns. See Chapter 3, OpenVMS Alpha Conventions for more information regarding Alpha register conventions and usage.
Use of such registers for other than legacy applications from other OpenVMS environments is deprecated.
The pseudo-registers are stored as a per-thread vector of quadwords in memory.
alpha_reg_vector_t* LIB$GET_ALPHA_REG_VECTOR ();
Arguments:
None.
Return value:
ptr | Pointer to the Alpha pseudo-register vector for the current thread. |
LIB$GET_ALPHA_REG_VECTOR preserves all registers other than the return value register %rax.
Any procedure that accesses the pseudo-registers must make its own call to LIB$GET_ALPHA_REG_VECTOR to obtain the array address. Passing the array address to another procedure by any means is an error that may result in undefined behavior.
5.2. Address and Pointer Representation
An address is a 64-bit value that is used to denote a position in memory. However, for compatibility with OpenVMS VAX and Alpha, many OpenVMS applications and user-mode facilities operate in such a manner that addresses are restricted to values that are representable in 32 bits. This means that OpenVMS addresses can often be stored and manipulated as 32-bit longword values. In such cases, the 32-bit address value is always implicitly or explicitly sign-extended to form a 64-bit address for use by the x86-64 hardware.
The OpenVMS run-time environment supports a mix of 32- and 64-bit pointers. For backward compatibility, the default pointer size is 32 bits. A 32-bit pointer is converted to a 64-bit pointer by sign-extending its value. A 64-bit pointer can be converted to a valid 32-bit pointer only if the high-order 33 bits are all zero or all one.
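The two conversion rules above can be sketched in C as follows. This is a minimal illustration, not part of the standard: the function names `widen_pointer32`, `fits_in_pointer32`, and `narrow_pointer64` are hypothetical, and the sketch assumes a two's-complement implementation, which holds for every OpenVMS target.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Widen a 32-bit address to 64 bits by sign extension
   (bit 31 is replicated into bits 63:32). */
static uint64_t widen_pointer32(uint32_t p32)
{
    return (uint64_t)(int64_t)(int32_t)p32;
}

/* A 64-bit address is representable in 32 bits only if bits 63:31
   are all zero or all one, that is, if sign-extending its low
   32 bits reproduces the original value. */
static bool fits_in_pointer32(uint64_t p64)
{
    return widen_pointer32((uint32_t)p64) == p64;
}

/* Narrow a 64-bit address. The caller is expected to have checked
   fits_in_pointer32() first; the assert documents that contract. */
static uint32_t narrow_pointer64(uint64_t p64)
{
    assert(fits_in_pointer32(p64));
    return (uint32_t)p64;
}
```

For instance, the 32-bit value 80000000 widens to FFFFFFFF 80000000, whereas the 64-bit value 00000000 80000000 has no valid 32-bit form.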
5.3. Procedure Values
An x86-64 procedure value (a function pointer) is a pointer to code. To call through a procedure value, call through the value itself, not through a location in the memory pointed to by the value.
All procedure values must be representable in 32 bits. Because 32-bit addresses and pointers are always sign-extended before use (see Section 5.2, “Address and Pointer Representation”), this means that the code they point to must reside in either the (hexadecimal) range 0..00000000 7FFFFFFF or FFFFFFFF 80000000..FFFFFFFF FFFFFFFF (see the VSI OpenVMS Programming Concepts Manual, Volume I for discussion of the structure of the OpenVMS address space). If the code is not in either of these regions, the linker creates a 32-bit-addressable trampoline for it. The trampoline code simply jumps to the procedure. The address of this trampoline becomes the value for that procedure.
Unbound procedures normally do not require an associated trampoline. They need a trampoline only if code in the same image takes the address of the procedure, or if it is a universal symbol.
Bound procedure values always point to trampolines. These trampolines are created by the containing procedure at the time it is called. When the bound procedure value trampolines pass control to the procedure, they pass an environment pointer (a pointer to the containing procedure stack frame) as an additional hidden parameter to the procedure. (See Section 5.6.5, “Indirect Calls to a Bound Procedure” regarding creation and deletion of bound procedure values).
5.4. Procedure Types
Variable-size stack procedure (sometimes known as a normal procedure in industry x86-64 documentation): allocates a memory stack that is addressable using either %rbp (the frame pointer register) or %rsp (the stack pointer register). The size of the stack may vary during the procedure execution. The called procedure may maintain a part or the whole context of its caller on that stack.
Fixed-size stack procedure (sometimes known as a framepointerless procedure in industry x86-64 documentation): allocates a memory stack that is addressable only using %rsp (the stack pointer register). The size of the stack is fixed during the procedure execution. The called procedure may maintain a part or the whole context of its caller on that stack.
Null frame procedure (sometimes known as a frameless procedure in industry x86-64 documentation): allocates no memory stack (other than the implicit saving of the caller return address that is a part of the CALL instruction). No context of its caller is saved.
All types of procedures allow use of 128 bytes of temporary storage below the address given in the stack pointer. This so-called red zone is not preserved across procedure calls, but is preserved by signal and condition handlers. Outside of the kernel, procedures may use this for temporary storage. Because hardware interrupts do not preserve the red zone, kernel code cannot use it. The use of the red zone can be disabled with a compiler option or pragma.
The red zone is useful in frameless leaf procedures (that call no other procedures). It gives them 128 bytes of scratch storage without the performance overhead of setting up and taking down a stack frame.
A compiler chooses which type of procedure to generate based on the requirements of the procedure in question. A calling procedure does not need to know what type of procedure it is calling.
Every variable-size stack or fixed-size stack procedure must have an associated unwind description (see Appendix B, Stack Unwinding and Exception Handling on OpenVMS x86-64) that provides information on the procedure type and its characteristics. A null frame procedure may also have an associated unwind description. (The default description applies if there is no unwind description). This data structure is used to interpret the call stack at any given point in a thread execution. It is built at compile time and usually is not accessed at run-time except to support exception processing or other rarely executed code.
5.4.1. Variable-Size Stack Procedures
Variable-size stack procedures allocate the stack that grows towards lower addresses. The stack pointer (SP) is contained in the %rsp register. The frame pointer (FP) is contained in the %rbp register. The stack pointer is normally 0mod16 aligned and must be 0mod16 aligned when making a call. Because the return address is pushed on the stack by the caller, the stack pointer is 8mod16 aligned on entry to a procedure. The %rbp register is saved immediately below the return address. The frame pointer points to the saved %rbp.
5.4.2. Fixed-Size Stack Procedures
Fixed-size stack procedures allocate the stack that grows towards lower addresses. The stack pointer (SP) is contained in the %rsp register. No frame pointer (FP) is used, so that the %rbp register is available as an additional preserved register. The stack pointer is normally 0mod16 aligned and must be 0mod16 aligned when making a call. Because the return address is pushed on the stack by the caller, the stack pointer is 8mod16 aligned on entry to a procedure.
5.4.3. Null Frame Procedures
A null frame procedure is almost a special case of a fixed-size stack procedure. It is like a fixed-size stack procedure that has no local storage other than the return address that is pushed on the stack as a result of the call. Because no additional stack is allocated, it differs from a fixed-size stack procedure in that the alignment of the stack pointer is 8mod16 (not 0mod16).
A null frame procedure is necessarily a leaf procedure because the stack pointer must be 0mod16 aligned in order to make a call.
5.5. Stack Overflow Detection on OpenVMS x86-64
This section defines the conventions to support the execution of multiple threads in a multilanguage OpenVMS environment. Specifically defined is how compiled code must perform stack limit checking. While this standard is compatible with a multithreaded execution environment, the detailed mechanisms, data structures, and procedures that support this capability are not specified in this manual.
There can be one or more threads executing within a single process.
The state of a thread is represented in a thread environment block (TEB).
The TEB of a thread contains information that determines a stack limit below which the stack pointer must not be decremented by the executing code (except for code that implements the multithreaded mechanism itself).
Exception handling is fully reentrant and multithreaded.
5.5.1. Stack Limit Checking
A program that is otherwise correct can fail because of stack overflow. Stack overflow occurs when extension of the stack (by decrementing the stack pointer, SP) allocates addresses not currently reserved for the current thread's stack. This section defines the conventions for stack limit checking in a multithreaded environment.
Stack Guard Region
In a multithreaded environment, the address space beyond each thread's stack is protected by contiguous guard pages, which trap on any access. These pages form the stack guard region.
Stack Reserve Region
In some cases, it is useful to maintain a stack reserve region, which is a minimum-sized region that is between the current top of stack and the stack guard region. A stack reserve region can ensure that the following conditions exist:
Exceptions or asynchronous system traps (ASTs, analogous to asynchronous signals) have stack space to execute on a thread's stack.
The exception dispatcher and any exception handler that it might call have stack space to execute after detection of an invalid attempt to extend the stack.
This calling standard does not require a stack reserve region, but it does allow a language and its run-time system to implement one.
5.5.1.1. Methods for Stack Limit Checking
Because accessible memory may be available at addresses lower than those occupied by the stack guard region, compilers must generate code that never extends the stack past the stack guard region into accessible memory that is not allocated to the thread's stack.
A general strategy to prevent extending the stack past the stack guard region is to access each page of memory down to and possibly including the page corresponding to the intended new value of %rsp. If the stack is to be extended by an amount larger than the size of a memory page, then a series of accesses is required that works from higher to lower addressed pages. If any access results in a memory access violation, then the code has made an invalid attempt to extend the stack of the current thread.
For the purposes of this section, the amount by which the stack is to be extended must include the size of the red zone in addition to the size of the needed stack extension for the executing procedure.
This calling standard defines two methods for stack limit checking, implicit and explicit, which are explained in the following sections.
Implicit Stack Limit Checking
If a byte (not necessarily the lowest) of the new stack region is guaranteed to be accessed prior to any further stack extension, then the stack can be extended by an increment that is up to one-half the stack guard region (without any additional accesses).
This standard requires that the minimum stack guard region size is 8192 bytes.
Explicit stack limit checking must be performed unless the amount by which %rsp is decremented is known to be less than or equal to 4096 and the application does not use a stack reserve region.
Some byte in the new stack region must be accessed before %rsp can be further decremented for a subsequent stack extension.
This access can be performed either before or after %rsp is decremented for this stack extension, but it must be done before %rsp can be decremented again.
No standard procedure call can be made before some byte in the new stack region is accessed.
The system exception dispatcher ensures that the lowest addressed byte in the new stack region is accessed if any kind of asynchronous interrupt occurs both after %rsp is decremented and before the access in the new stack region occurs.
These conventions ensure that the stack pointer is not decremented so that it points to accessible storage beyond the stack limit without this error being detected (either by the guard region being accessed by the thread or by an explicit stack limit check failure).
As a matter of practice, the system can provide multiple guard pages in the stack guard region. When a stack overflow is detected as a result of access to the stack guard region, one or more guard pages can be unprotected for use by the exception handling facility, as long as one or more guard pages remain protected to provide implicit stack limit checking during exception processing.
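The decision between implicit and explicit checking can be sketched as a small predicate. This is an illustration only: the names `implicit_check_suffices`, `RED_ZONE_SIZE`, `MIN_GUARD_REGION`, and `MAX_IMPLICIT_CHECK` are hypothetical, and the sketch assumes that the 128-byte red zone is added to the frame size before the comparison against 4096, which is one reading of the rule that the extension amount must include the red zone.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum {
    RED_ZONE_SIZE      = 128,                  /* bytes below %rsp usable by any procedure */
    MIN_GUARD_REGION   = 8192,                 /* minimum stack guard region size */
    MAX_IMPLICIT_CHECK = MIN_GUARD_REGION / 2  /* 4096: one-half the guard region */
};

/* Returns true when a prologue may rely on implicit checking alone.
   frame_bytes is the stack space the procedure itself needs; the red
   zone is added here (an assumption of this sketch, see lead-in). */
static bool implicit_check_suffices(size_t frame_bytes, bool uses_reserve_region)
{
    assert(frame_bytes <= SIZE_MAX - RED_ZONE_SIZE);  /* guard against wraparound */
    size_t extension = frame_bytes + RED_ZONE_SIZE;
    return !uses_reserve_region && extension <= MAX_IMPLICIT_CHECK;
}
```

Under these assumptions a 3968-byte frame still qualifies for implicit checking, while a 3969-byte frame, or any frame in an application that uses a stack reserve region, requires an explicit check.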
Explicit Stack Limit Checking
If the stack is being extended by an unknown amount or by a known amount that is greater than the maximum implicit check size (4096), then a code sequence that follows the rules for implicit stack limit checking can be executed in a loop to access the new stack region incrementally in segments that are less than or equal to the minimum stack guard region size (8192). At least one access must occur in each such segment.
The first access must occur between %rsp and %rsp-4096, because in the absence of more specific information, the previous guaranteed access relative to the current stack may be as much as 4096 bytes greater than the current stack pointer address.
The last access must be within 4096 bytes of the intended new value of the stack pointer. These accesses must occur in order, starting with the highest addressed segment and working toward the lowest addressed segment.
Perform a read access using the intended new value of the stack pointer. This is nondestructive, even if the read is beyond the stack guard region, and may facilitate OS mapping of new stack pages, if appropriate, in a single operation.
Proceed with sequential accesses as just described.
Note
A simple algorithm that is consistent with this requirement (but achieves up to twice the minimum number of accesses) is to perform a sequence of accesses in a loop starting with the previous value of %rsp, decrementing by the minimum no-check extension size (4096) to, but not including, the first value that is less than the new value for the stack pointer.
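The simple algorithm in the preceding note can be modeled in C without touching real stack memory. The sketch below only computes the addresses that would be accessed; `plan_stack_probes` and `PROBE_STEP` are hypothetical names, and a real prologue would perform a load or store at each address instead of recording it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PROBE_STEP 4096u   /* minimum no-check extension size */

/* Records in probes[] the addresses the simple algorithm would access
   when the stack pointer moves from old_sp down to new_sp. Returns the
   total number of probes; at most `capacity` of them are stored. */
static size_t plan_stack_probes(uint64_t old_sp, uint64_t new_sp,
                                uint64_t *probes, size_t capacity)
{
    assert(new_sp <= old_sp);            /* the stack grows toward lower addresses */
    size_t count = 0;
    uint64_t p = old_sp;                 /* start with the previous %rsp value */
    while (p >= new_sp) {                /* stop at the first value below new_sp */
        if (count < capacity)
            probes[count] = p;
        count++;
        if (p < PROBE_STEP)              /* avoid unsigned wraparound near zero */
            break;
        p -= PROBE_STEP;
    }
    return count;
}
```

For an extension of 10000 bytes, the sketch yields three probes: at the old stack pointer, 4096 bytes below it, and 8192 bytes below it. The last probe lies within 4096 bytes of the new stack pointer, as the rules above require.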
The stack must not be extended incrementally in procedure prologues. A procedure prologue that needs to extend the stack by an amount of unknown size or known size greater than the maximum implicit check size must test new stack segments as just described in a loop that does not modify %rsp, and then update the stack with one instruction that copies the new stack pointer value into %rsp.
Note
An explicit stack limit check can be performed either by inline code that is part of a prologue or by a run-time support routine that is tailored to be called from a procedure prologue.
5.6. Procedure Call and Return
Calls may be direct, which are performed directly to the entry point of a target procedure, or indirect, which are performed through a procedure value. The target of a call may be either an unbound or a bound procedure. Returns are the same for all types of calls.
From the perspective of a compiler or assembly language programmer, all calls are local, that is, the call target is always assumed to be in the same segment as the caller. In case a call resolves to a procedure in a different segment or image, the linker creates a local code stub that forwards that call to the target.
5.6.1. Direct Local Calls to an Unbound Procedure
Within a single segment, direct local calls to an unbound procedure can be performed with a simple CALL instruction using a 32-bit PC-relative displacement. This is sufficient in the small and medium memory models (see Section 5.10.1, “Memory Models”).
If the code in a single segment grows beyond 2GB, the segment can be broken up into multiple segments.
5.6.2. Direct Local Calls to a Bound Procedure
Direct local calls to a bound procedure can only come from somewhere within the containing scope, which is why this type of call can be performed with the CALL instruction using a 32-bit PC-relative displacement. The only difference between direct local calls to a bound procedure and direct local calls to an unbound procedure is that a bound procedure requires an additional implicit parameter, the procedure’s environment pointer, to be passed in %r10.
5.6.3. Direct Local Calls to a Non-Local Procedure
Calls between images, or between segments in a single image, are performed via an entry in the Global Offset Table (GOT) that points to the target procedure. In most cases, compilers do not know whether a call target is local or external to the image or segment, and so generate a local call. The linker creates a trampoline and redirects this local call to it. The trampoline forwards the call to the target procedure via an indirect jump through the GOT entry. In cases where a compiler knows that a call target is external, it can generate an indirect call via a GOT entry itself.
5.6.4. Indirect Calls to an Unbound Procedure
Indirect calls to an unbound procedure transfer control to the address that is specified by a procedure value.
5.6.5. Indirect Calls to a Bound Procedure
There is no distinction between the unbound and bound procedure values, so the caller does not know whether the called procedure is bound or not. Therefore, the called side must make special arrangements to pass the environment pointer to the called procedure.
When code takes the address of a bound procedure, the value is not the address of the procedure itself, but a trampoline. This trampoline loads the environment pointer into %r10 and then jumps to the actual procedure.
The trampoline is created when the value of the environment pointer becomes known during run-time. Since a bound procedure value is specific to a particular activation of the containing scope, multiple recursive invocations create multiple trampolines. This means that the storage for the bound procedure trampolines must be dynamically allocated either on the stack or from the heap.
Allocating bound procedure trampolines on the stack is the common industry practice on x86-64, but this is deprecated on OpenVMS because the stack is non-executable by default. To use this method on OpenVMS, applications have to explicitly make stack memory executable either with a flag in the object file that has a .note.GNU-stack option or with a run-time call.
The preferred method of creating and allocating bound procedure trampolines on OpenVMS is to call a run-time routine. This routine dynamically allocates and manages a linked list of executable memory pages where the trampolines reside. A second routine must be called to deallocate a bound procedure trampoline. This should be done when the containing procedure exits.
void* LIB$X86_ALLOC_BOUND_PROC_VALUE (size)
Argument | OpenVMS Usage | Type | Access | Mechanism |
---|---|---|---|---|
size | integer | quadword | read | by value |
Number of bytes needed to hold a bound procedure value.
Return value:
Pointer to a block of memory of the given size.
The returned memory must be initialized by the caller to complete the creation of the bound procedure value. Typically the contents will consist of an instruction to copy the appropriate invocation context (which might be saved in the same block) into %r10 followed by an instruction to transfer control to the entry point of the target procedure.
Storage for bound procedure values is local to the thread in which they are created.
Bound procedure values logically form a stack on which any newly allocated value is added and one or more of the most recently added entries may be deleted (as a group).
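One possible way to initialize such a block is sketched below. It is an assumption-laden illustration, not the layout mandated by this standard: the function `build_bpv_trampoline` and the constant `BPV_TRAMPOLINE_SIZE` are hypothetical, the sketch embeds the environment pointer as an immediate operand rather than loading it from the block, and it assumes that %r11 may be clobbered as a scratch register on the way to the target. On OpenVMS the buffer would come from LIB$X86_ALLOC_BOUND_PROC_VALUE; here any byte buffer serves.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { BPV_TRAMPOLINE_SIZE = 23 };   /* 10 + 10 + 3 bytes of machine code */

/* Store a 64-bit value in little-endian byte order, as x86-64 expects. */
static void put_le64(uint8_t *dst, uint64_t value)
{
    for (int i = 0; i < 8; i++)
        dst[i] = (uint8_t)(value >> (8 * i));
}

/* Emit:  movabs $env, %r10 ; movabs $target, %r11 ; jmp *%r11
   Returns the number of bytes written. */
static size_t build_bpv_trampoline(uint8_t *buf, size_t buf_size,
                                   uint64_t env, uint64_t target)
{
    assert(buf != NULL && buf_size >= BPV_TRAMPOLINE_SIZE);
    size_t n = 0;
    buf[n++] = 0x49; buf[n++] = 0xBA;        /* REX.W+B, B8+2: movabs imm64, %r10 */
    put_le64(buf + n, env);    n += 8;
    buf[n++] = 0x49; buf[n++] = 0xBB;        /* REX.W+B, B8+3: movabs imm64, %r11 */
    put_le64(buf + n, target); n += 8;
    buf[n++] = 0x41; buf[n++] = 0xFF; buf[n++] = 0xE3;   /* jmp *%r11 */
    return n;
}
```

Under these assumptions the size argument passed to the allocation routine would be 23. A compiler is free to choose a different instruction sequence, for example one that reads the environment pointer from data stored in the same block.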
LIB$X86_DELETE_BOUND_PROC_VALUE (bpv)
Argument | OpenVMS Usage | Type | Access | Mechanism |
---|---|---|---|---|
bpv | address | quadword | read | by value |
Pointer to a bound procedure value (created by LIB$X86_ALLOC_BOUND_PROC_VALUE).
Return value:
None.
The effect of calling LIB$X86_DELETE_BOUND_PROC_VALUE is to delete an existing bound procedure value, as well as any additional bound procedure values that were created subsequent to it.
5.6.6. Returns
All calls push a 64-bit return address on the stack. When the called procedure returns, it uses the RET instruction to pop the return address from the stack and jump to that address.
5.7. Parameter and Return Value Passing
On OpenVMS x86-64, procedure parameters are passed in registers and/or on the stack. Procedures can return results in registers or in a memory location designated by the caller.
All calls use %rax as an argument information register as described in Section 5.7.4, “Argument Information Register (AI)”.
5.7.1. Scalar Argument Types
the six general-purpose registers (%rdi, %rsi, %rdx, %rcx, %r8, and %r9)
the eight XMM registers (%xmm0 through %xmm7)
the stack.
Nominal Type | Argument Location | Return Value Location |
---|---|---|
Pointer [Q] | The next available general-purpose register. Otherwise, in the next argument slot on the stack. | General-purpose register %rax |
Boolean [B, BU] | The next available general-purpose register. Otherwise, in the next argument slot on the stack. | General-purpose register %rax |
Integers (size ≤ 64 bits) [B, W, L, Q, BU, WU, LU, QU] | The next available general-purpose register. Otherwise, in the next argument slot on the stack. | General-purpose register %rax |
Integers (64 < size ≤ 128 bits) [O, OU] | The next two available general-purpose registers. Otherwise, in the next two argument slots on the stack. | General-purpose registers %rax (low half) and %rdx (high half) |
VAX float (F_floating, D_floating, and G_floating) [F, D, G] | The next available general-purpose register. Otherwise, in the next argument slot on the stack. | General-purpose register %rax |
IEEE single-precision float (S_floating) [FS] | Bits 31:0 of the next available XMM register. Otherwise, in the next argument slot on the stack. | Bits 31:0 of register %xmm0 |
IEEE double-precision float (T_floating) [FT] | Bits 63:0 of the next available XMM register. Otherwise, in the next argument slot on the stack. | Bits 63:0 of register %xmm0 |
IEEE quadruple-precision float (X_floating) [FX] | The next available XMM register. Otherwise, in the next two argument slots on the stack. | Register %xmm0 |
VAX complex single-precision float (F_floating) [FC] | The next available general-purpose register. Otherwise, in the next argument slot on the stack. | General-purpose register |
VAX complex double-precision float (D_floating and G_floating) [DC, GC] | The next two available general-purpose registers. Otherwise, in the next two argument slots on the stack. | Registers %rax (the real part of a value) and %rdx (the imaginary part of a value) |
IEEE complex single-precision float [FSC] | In the next available XMM register, real part in bits 31:0, imaginary part in bits 63:32. Otherwise, in the next argument slot on the stack. | Register %xmm0, the real part of a value in bits 31:0, the imaginary part in bits 63:32 |
IEEE complex double-precision float [FTC] | In bits 63:0 of the next two available XMM registers. Otherwise, in the next two argument slots on the stack. | Bits 63:0 of registers %xmm0 (the real part of a value) and %xmm1 (the imaginary part of a value) |
IEEE complex quadruple-precision float [FXC] | In the next four available argument slots on the stack. | In a caller-allocated memory buffer whose address is passed as a hidden first argument |
An argument that requires two registers is never split so that the first part is in a register and the second part is on the stack. Either both parts are in registers or both parts are on the stack.
For example, a procedure that takes ten integer scalar arguments will find the first six arguments in the general-purpose registers, and the last four on the stack. A procedure that takes ten IEEE double-precision floating-point scalars as arguments will find the first eight arguments in the XMM registers, and the last two on the stack. And, a procedure that takes six integer arguments and eight floating-point arguments, regardless of how the integer and floating-point arguments are intermixed, will find all 14 arguments in registers.
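The three cases in the preceding paragraph can be reproduced with a short allocation model. This sketch covers only arguments that occupy a single register or stack slot (pointers, integers of up to 64 bits, and S_floating or T_floating values); the names `arg_kind`, `arg_summary`, and `assign_scalar_args` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { ARG_INTEGER, ARG_IEEE_FLOAT } arg_kind;

typedef struct {
    int gp_regs;      /* arguments placed in %rdi, %rsi, %rdx, %rcx, %r8, %r9 */
    int xmm_regs;     /* arguments placed in %xmm0 through %xmm7 */
    int stack_slots;  /* arguments that spill to the stack */
} arg_summary;

/* Walks the argument list left to right. The two register files are
   consumed independently, so the interleaving of integer and
   floating-point arguments does not change the result. */
static arg_summary assign_scalar_args(const arg_kind *args, size_t n)
{
    assert(args != NULL || n == 0);
    arg_summary s = { 0, 0, 0 };
    for (size_t i = 0; i < n; i++) {
        if (args[i] == ARG_INTEGER && s.gp_regs < 6)
            s.gp_regs++;
        else if (args[i] == ARG_IEEE_FLOAT && s.xmm_regs < 8)
            s.xmm_regs++;
        else
            s.stack_slots++;
    }
    return s;
}
```

Ten integer arguments give six general-purpose registers and four stack slots, ten T_floating arguments give eight XMM registers and two stack slots, and six integers mixed with eight floating-point values use no stack slots at all.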
5.7.2. Aggregate Argument Types
This section describes how the aggregate argument types are passed to procedures.
First, the argument types are assigned to the appropriate classes, and then registers are allocated for passing them.
INTEGER class consists of integral types that fit in one of the general-purpose registers including pointers.
SSE class consists of types that fit in a floating-point register.
SSEUP class consists of types that fit into a floating-point register and can be passed and returned in the upper bytes of it.
X87, X87UP, COMPLEX_X87 classes consist of types that can be returned via the x87 FPU.
NO_CLASS is used as initializer in the algorithms. It is used for padding as well as empty structures and unions.
MEMORY class consists of types that are passed and returned in memory via the stack.
The size of each argument is rounded up to a quadword (8 bytes). Therefore, the stack will always be 8-byte aligned.
Nominal Type | Equivalent C/C++ Type(s) | Argument Passing Class |
---|---|---|
Pointer [Q] | * | INTEGER |
Boolean [B, BU] | _Bool (bool) | INTEGER |
Integers (size ≤ 64 bits) [B, W, L, Q, BU, WU, LU, QU] | char, short, int, long (signed and unsigned) | INTEGER |
Integers (64 < size ≤ 128 bits) [O, OU] | __int128 (signed and unsigned) | Split into two 8-byte chunks. Both belong to class INTEGER. |
VAX floating-point types (up to 64 bits) [F, D, G] | | INTEGER |
VAX floating-point complex (64 bits) [FC] | | INTEGER |
VAX floating-point complex (128 bits) [DC, GC] | | Split into two 8-byte chunks. Both belong to class INTEGER. |
IEEE binary floating-point types (up to 64 bits) [FS, FT] | float, double | SSE |
IEEE extended binary floating-point type (128 bits) [FX] | __float128 | Split into two halves. The first (lower addressed) 64-bits belong to class SSE and the second half to class SSEUP. |
IEEE binary floating-point complex (64 bits) [FSC] | complex float | Treat as two successive binary floating-point values, each treated as a scalar of half the size (see above). |
IEEE binary floating-point complex (128 bits) [FTC] | complex double | Treat as two successive binary floating-point values, each treated as a scalar of half the size (see above). |
IEEE binary floating-point complex (256 bits) [FXC] | complex long double | Treat as two successive binary floating-point values, each treated as a scalar of half the size (see above). |
If the size of an object is larger than eight quadwords (64 bytes), or it contains unaligned fields, it belongs to the MEMORY class.
If a C++ object is non-trivial for the purpose of calls, as specified in the C++ ABI, it is passed by an invisible reference; that is, the object is replaced in the parameter list by a pointer that has the INTEGER class.
If the size of the aggregate exceeds a single quadword, each quadword is classified separately. Each quadword is initialized to the NO_CLASS class.
- Each field of an object is classified recursively so that always two fields are considered. The two fields are the containing quadword as a whole and the lowest level field components of the quadword, considered in order:
If both classes are equal, this is the resulting class.
If one of the classes is NO_CLASS, the resulting class is the other class.
If one of the classes is MEMORY, the result is the MEMORY class.
If one of the classes is INTEGER, the result is the INTEGER class.
If one of the classes is X87, X87UP, or COMPLEX_X87, the result is the MEMORY class.
Otherwise the result is the SSE class.
- Then a post merger cleanup is done:
If one of the classes is MEMORY, the whole argument is passed in memory.
If X87UP is not preceded by X87, the whole argument is passed in memory.
If the size of the aggregate exceeds two quadwords and the first quadword is not SSE or any other quadword is not SSEUP, the whole argument is passed in memory.
If SSEUP is not preceded by SSE or SSEUP, it is converted to SSE.
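The pairwise merge rules above translate directly into a function that is applied across the fields of one quadword. The sketch below is illustrative: the enumeration `arg_class` and the functions `merge_classes` and `classify_quadword` are hypothetical names, the field classes are supplied by the caller rather than derived from a type description, and the post-merger cleanup is not modeled.

```c
#include <assert.h>
#include <stddef.h>

typedef enum {
    CLS_NO_CLASS, CLS_INTEGER, CLS_SSE, CLS_SSEUP,
    CLS_X87, CLS_X87UP, CLS_COMPLEX_X87, CLS_MEMORY
} arg_class;

static int is_x87_family(arg_class c)
{
    return c == CLS_X87 || c == CLS_X87UP || c == CLS_COMPLEX_X87;
}

/* Merge two classes; the tests appear in the same order as the rules. */
static arg_class merge_classes(arg_class a, arg_class b)
{
    if (a == b)                                   return a;
    if (a == CLS_NO_CLASS)                        return b;
    if (b == CLS_NO_CLASS)                        return a;
    if (a == CLS_MEMORY || b == CLS_MEMORY)       return CLS_MEMORY;
    if (a == CLS_INTEGER || b == CLS_INTEGER)     return CLS_INTEGER;
    if (is_x87_family(a) || is_x87_family(b))     return CLS_MEMORY;
    return CLS_SSE;
}

/* Classify one quadword from the classes of the fields it contains.
   The quadword starts as NO_CLASS and each field is merged in turn. */
static arg_class classify_quadword(const arg_class *fields, size_t n_fields)
{
    assert(fields != NULL || n_fields == 0);
    arg_class result = CLS_NO_CLASS;
    for (size_t i = 0; i < n_fields; i++)
        result = merge_classes(result, fields[i]);
    return result;
}
```

A quadword that holds an int next to a float therefore classifies as INTEGER, while a quadword that holds two floats classifies as SSE.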
If the class is MEMORY, the argument is passed on the stack.
If the class is INTEGER, the next available register of the sequence %rdi, %rsi, %rdx, %rcx, %r8, and %r9 is used.
If the class is SSE, the argument is passed in the next available floating-point register. The registers are taken in order from %xmm0 to %xmm7.
If the class is SSEUP, the quadword is passed in the next available 8-byte chunk of the last used floating-point register.
If the class is X87, X87UP, or COMPLEX_X87, the argument is passed in memory.
When a value of a boolean type is returned or passed in a register or on the stack, bit 0 contains the truth value, bits 1 to 7 must be zero, and all other bits are left unspecified. A consumer of such values can rely on it being 0 or 1 only when truncated to the low byte.
If there are no registers available for any quadword of an argument, the whole argument is passed on the stack. If registers have already been assigned for some quadwords of such an argument, the assignments are reverted.
Once registers are assigned, the arguments passed in memory are pushed on the stack in reversed (right-to-left) order.
Nominal Type | Equivalent C/C++ Type(s) | Argument Passing Class |
---|---|---|
IEEE binary floating-point vector (up to 64 bits) [M64] | __m64 | SSE |
IEEE extended binary floating-point vector (128 bits) [M128] | __m128 | Split into two halves. The first (lower addressed) 64-bits belong to class SSE and the second half to class SSEUP. |
IEEE binary floating-point vector (256 bits) [M256] | __m256 | Split into four 8-byte chunks. The first chunk belongs to class SSE and the rest to class SSEUP. |
IEEE binary floating-point vector (512 bits) [M512] | __m512 | Split into eight 8-byte chunks. The first chunk belongs to class SSE and the rest to class SSEUP. |
When passing the __m256 or __m512 arguments to functions that use varargs or stdarg, function prototypes must be provided. Otherwise, the run-time behavior is undefined.
5.7.3. Unused Bits in Passed Data
Note
Bit 31 is replicated in bits 32—63, even for unsigned 32-bit integers.
This rule applies to the argument types described in Section 5.7.1, “Scalar Argument Types” as well as the individual elements of aggregate types passed in general-purpose registers as described in Section 5.7.2, “Aggregate Argument Types”.
Data Type | Type Designator | Data Size (bytes) | Register Extension Type | Memory Extension Type |
---|---|---|---|---|
Byte logical | DSC$K_DTYPE_BU | 1 | Zero64 | Zero64 |
Word logical | DSC$K_DTYPE_WU | 2 | Zero64 | Zero64 |
Longword logical | DSC$K_DTYPE_LU | 4 | Sign64 | Sign64 |
Quadword logical | DSC$K_DTYPE_QU | 8 | Data64 | Data64 |
Byte integer | DSC$K_DTYPE_B | 1 | Sign64 | Sign64 |
Word integer | DSC$K_DTYPE_W | 2 | Sign64 | Sign64 |
Longword integer | DSC$K_DTYPE_L | 4 | Sign64 | Sign64 |
Quadword integer | DSC$K_DTYPE_Q | 8 | Data64 | Data64 |
F_floating | DSC$K_DTYPE_F | 4 | VAXF64 | Data32 |
D_floating | DSC$K_DTYPE_D | 8 | VAXDG64 | Data64 |
G_floating | DSC$K_DTYPE_G | 8 | VAXDG64 | Data64 |
F_floating complex | DSC$K_DTYPE_FC | 2 * 4 | 2*VAXF64 | 2*Data32 |
D_floating complex | DSC$K_DTYPE_DC | 2 * 8 | 2*VAXDG64 | 2*Data64 |
G_floating complex | DSC$K_DTYPE_GC | 2 * 8 | 2*VAXDG64 | 2*Data64 |
S_floating | DSC$K_DTYPE_FS | 4 | Hard | Data32 |
T_floating | DSC$K_DTYPE_FT | 8 | Hard | Data64 |
X_floating | DSC$K_DTYPE_FX | 16 | N/A | N/A |
S_floating complex | DSC$K_DTYPE_FSC | 2 * 4 | Hard | 2*Data32 |
T_floating complex | DSC$K_DTYPE_FTC | 2 * 8 | 2*Hard | 2*Data64 |
X_floating complex | DSC$K_DTYPE_FXC | 2 * 16 | N/A | N/A |
Small structures of 8 bytes or less | N/A | ≤8 | Nostd | Nostd |
Small arrays of 8 bytes or less | N/A | ≤8 | Nostd | Nostd |
32-bit address | N/A | 4 | Sign64 | Sign64 |
64-bit address | N/A | 8 | Data64 | Data64 |
Sign Extension Type | Defined Function |
---|---|
Sign64 | Sign-extended to 64 bits. |
Zero64 | Zero-extended to 64 bits. |
Data32 | Data is 32 bits. The state of bits <63:32> is unpredictable. |
2*Data32 | Two single-precision parts of the complex value are stored in memory as independent floating-point values (each handled as Data32). |
Data64 | Data is 64 bits. |
2*Data64 | Two double-precision parts of the complex value are stored in memory as independent floating-point values (each handled as Data64). |
VAXF64 | Data is 64 bits. Low-order 32 bits are the same as the F_floating memory format and the high-order 32 bits are zero. (Used only in a general register, never in a floating-point register). |
VAXDG64 | Data is 64 bits. Uses the corresponding D_floating or G_floating memory format. (Used only in a general register, never in a floating-point register). |
2*VAXF64 | Two single-precision parts of the complex value are stored in memory as independent floating-point values (each handled as VAXF64). |
2*VAXDG64 | Two double-precision parts of the complex value are stored in memory as independent floating-point values (each handled as VAXDG64). |
Hard | Passed in the layout defined by the hardware SRM. |
2*Hard | Two floating-point parts of the complex value are stored in a pair of registers as independent floating-point values (each handled as Hard). |
Nostd | State of all high-order bits not occupied by the data is unpredictable across a call or return. |
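As an illustration of the extension types defined above, the following C sketch forms the 64-bit register image for Sign64, Zero64, and VAXF64 data. The helper names are hypothetical and are not part of any OpenVMS interface; the sketch only mirrors the table definitions.

```c
#include <stdint.h>

/* Hypothetical helpers illustrating the register extension types;
   they are not part of any OpenVMS API. */

/* Sign64: sign-extend a 32-bit datum to 64 bits. Per the data type table,
   this applies to longword integers and also to longword logicals. */
static uint64_t sign64_from_longword(uint32_t bits)
{
    return (uint64_t)(int64_t)(int32_t)bits;
}

/* Zero64: zero-extend a 16-bit word logical to 64 bits. */
static uint64_t zero64_from_word(uint16_t bits)
{
    return (uint64_t)bits;
}

/* VAXF64: the low-order 32 bits hold the F_floating memory image and
   the high-order 32 bits are zero. */
static uint64_t vaxf64_from_f_image(uint32_t f_floating_image)
{
    return (uint64_t)f_floating_image;
}
```

Note that a longword logical with bit 31 set is still sign-extended, so its register image has bits <63:32> set to ones.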
5.7.4. Argument Information Register (AI)
%rax is used as the AI register. It must contain the argument information that is presented in Table 5.13, “Contents of the Argument Information Register (%rax)”.
Bit | Contents |
---|---|
7:0 (%al) | Upper bound on the number of XMM registers that are used to pass arguments |
15:8 (%ah) | Total number of passed argument slots |
47:16 | Argument Info Offset relative to the return address of the caller, or zero |
63:48 | Reserved and must be either 0x0000 or 0xFFFF? |
If the Argument Info Offset field is non-zero, it contains a signed byte offset to an Argument Info Block (AIB). This byte offset is relative to the return address of the caller, that is, an offset from the location of the instruction after the call instruction. The Argument Info Block must be close enough to the call site for the offset to fit in 32 bits. If the AIB is in the same section as the code, this offset can be calculated at compile time.
Bit | Name | Usage |
---|---|---|
7:0 | version | Format version. This format is version 1. |
15:8 | arg info count | Number of argument slots represented in this block. |
19:16 | 1st arg info | Information on the 1st argument slot. |
23:20 | 2nd arg info | Information on the 2nd argument slot. |
... | nth arg info | Information on the nth argument slot. |
The arg info count may be less than, equal to, or greater than the actual number of passed arguments. If it is less, the missing argument information fields are assumed to be 0 (AI$K_AR_I64). If it is greater, the extra entries in this block are ignored.
If all the passed arguments are integers and pointers, there is no need to pass an Argument Info Block. Instead, the Argument Info Offset should be set to zero.
Value | Name | Meaning |
---|---|---|
0 | AI$K_AR_I64 | Argument is passed in a general-purpose register if one is available, otherwise on the stack; or the argument is not present. |
1 | AI$K_AR_FF | F_floating argument is passed in a general-purpose register. |
2 | AI$K_AR_FD | D_floating argument is passed in a general-purpose register. |
3 | AI$K_AR_FG | G_floating argument is passed in a general-purpose register. |
4 | AI$K_AR_FS | Argument is passed in bits 31:0 of an XMM register. |
5 | AI$K_AR_FT | Argument is passed in bits 63:0 of an XMM register. |
6 | AI$K_AR_FXL | Low half of argument is passed in bits 63:0 of an XMM register. |
7 | AI$K_AR_FXH | High half of argument is passed in bits 127:64 of an XMM register. |
8 | AI$K_AR_MEM | Argument is pushed on the stack. |
9—15 | — | Reserved. |
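The field layouts above can be sketched in C as follows. This is a minimal illustration, not a definitive implementation: the type and function names are hypothetical, and the sketch assumes the Argument Info Block is read as a little-endian bit string in which bit 0 is the low-order bit of the first byte.

```c
#include <stdint.h>

/* Hypothetical decoded view of the AI register (%rax). */
typedef struct {
    uint8_t xmm_upper_bound;  /* bits 7:0  (%al) */
    uint8_t slot_count;       /* bits 15:8 (%ah) */
    int32_t aib_offset;       /* bits 47:16, signed; 0 means no AIB */
} sketch_ai_fields;

static sketch_ai_fields sketch_decode_ai(uint64_t rax)
{
    sketch_ai_fields f;
    f.xmm_upper_bound = (uint8_t)(rax & 0xFF);
    f.slot_count      = (uint8_t)((rax >> 8) & 0xFF);
    f.aib_offset      = (int32_t)(uint32_t)((rax >> 16) & 0xFFFFFFFFu);
    return f;
}

/* Return the 4-bit arg info code for slot n (1-based). Slots beyond the
   arg info count read as 0 (AI$K_AR_I64), as the text specifies. */
static unsigned sketch_aib_arg_info(const uint8_t *aib, unsigned n)
{
    unsigned count = aib[1];              /* bits 15:8: arg info count */
    if (n == 0 || n > count)
        return 0;
    unsigned bit  = 16 + 4 * (n - 1);     /* 1st slot occupies bits 19:16 */
    unsigned byte = aib[bit / 8];
    return (bit % 8) ? (byte >> 4) : (byte & 0x0F);
}
```

A caller that passes only integers and pointers would leave the offset field zero, in which case no block needs to be read at all.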
5.7.5. Variable Argument Lists
The x86-64 industry standards define how C-style variable argument lists (va_start, va_arg and so on) are implemented. OpenVMS also allows variable argument lists to be accessed as arrays. On prior OpenVMS architectures, a single common mechanism supports both. On OpenVMS x86-64, different mechanisms are implemented.
5.7.5.1. Standard Variable Arguments
Offset | Register | Usage |
---|---|---|
0 | %rdi | 1st general-purpose argument register |
8 | %rsi | 2nd general-purpose argument register |
16 | %rdx | 3rd general-purpose argument register |
24 | %rcx | 4th general-purpose argument register |
32 | %r8 | 5th general-purpose argument register |
40 | %r9 | 6th general-purpose argument register |
48 | %xmm0 | 1st floating-point argument register |
64 | %xmm1 | 2nd floating-point argument register |
80 | %xmm2 | 3rd floating-point argument register |
96 | %xmm3 | 4th floating-point argument register |
112 | %xmm4 | 5th floating-point argument register |
128 | %xmm5 | 6th floating-point argument register |
144 | %xmm6 | 7th floating-point argument register |
160 | %xmm7 | 8th floating-point argument register |
The register save area is always allocated in the stack frame of the called function. Any function that contains an invocation of the va_start macro must save argument registers in the register save area. The six general-purpose registers are always saved. The number of floating-point registers to be saved depends on the value passed in the %al register. In theory, code should not save more registers than indicated in %al, but in practice, it either saves none (if %al is zero) or all the registers.
The standard requires the caller to pass a floating-point register argument count in the %al register whenever the called function uses the C variable arguments. This includes not only functions explicitly declared with the variable arguments, but all unprototyped functions as well.
Note that the OpenVMS “arginfo notused” linkage does not influence whether this value is passed in %al. The passed value does not need to be exact, but must be at least an upper bound on the number of arguments passed in floating-point registers.
Offset | Field | Usage |
---|---|---|
0 | gp_offset | Byte offset from the start of the register save area of the next available saved integer argument register |
4 | fp_offset | Byte offset from the start of the register save area of the next available saved floating-point argument register |
8 | overflow_arg_area | Pointer to the first available stack argument |
16 | reg_save_area | Pointer to the register save area |
gp_offset is the byte offset within the register save area of the first unused general-purpose register.
fp_offset is the byte offset within the register save area of the first unused floating-point register.
overflow_arg_area points to the first unused stack argument.
reg_save_area points to the register save area that is already initialized.
For example, in a printf(const char *fmt, ...) function, the va_list structure is initialized as follows:
gp_offset is set to +8, the offset of the second general-purpose argument; the first argument (fmt) is already used.
fp_offset is set to +48, the offset of the first floating-point argument.
overflow_arg_area is set to FP+16, the location of the first stack argument.
When the va_arg macro is invoked, it fetches the argument from a saved register or the stack and increments one field on the va_list structure accordingly. For example, if an integer argument is requested, the va_arg macro will compare the value of gp_offset against 48. If gp_offset is less than 48, the va_arg macro will return a saved integer register and increment gp_offset. Otherwise, it will return a stack argument and increment overflow_arg_area.
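The integer case described above can be sketched in C as follows. The structure and function names are hypothetical stand-ins that mirror the va_list layout; real code would use the compiler's va_arg macro rather than anything like this.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical mirror of the va_list layout (offsets 0, 4, 8, 16). */
typedef struct {
    uint32_t gp_offset;
    uint32_t fp_offset;
    const unsigned char *overflow_arg_area;
    const unsigned char *reg_save_area;
} sketch_va_list;

/* Fetch the next 64-bit integer argument: from the register save area
   while gp_offset is below 48 (six 8-byte general-purpose slots),
   otherwise from the stack arguments. */
static uint64_t sketch_va_arg_int(sketch_va_list *ap)
{
    uint64_t value;
    if (ap->gp_offset < 48) {
        memcpy(&value, ap->reg_save_area + ap->gp_offset, sizeof value);
        ap->gp_offset += 8;
    } else {
        memcpy(&value, ap->overflow_arg_area, sizeof value);
        ap->overflow_arg_area += 8;
    }
    return value;
}
```

Starting from gp_offset = 8, five calls consume the saved %rsi through %r9 values, and every later call reads from the overflow area.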
5.7.5.2. OpenVMS Variable Argument Lists
ARGPTR, ACTUALPARAMETER and ACTUALCOUNT in BLISS
[list], argument, and argument_list_length in VSI Pascal
va_count in VSI C
All rely on OpenVMS extensions to the standard calling conventions.
On OpenVMS standard calls, the caller passes argument information in the %rax register that specifies the total number of the used argument slots and location of each register argument. In theory, this information only needs to be passed if the called procedure uses one of the above mentioned language constructs, but since the caller is not able to determine this, the argument information is passed in %rax on all OpenVMS standard calls.
The total number of passed argument slots is in %ah. If a called procedure requests an argument list, the called procedure performs the following:
Allocates the storage in its own stack frame for the entire arglist (8 * %ah).
Copies all general-purpose registers, floating-point registers, and memory arguments to the arglist as indicated by the values in %rax.
Unlike the prior OpenVMS architectures, on OpenVMS x86-64 it is not possible to create a register “home” on the stack that is contiguous with the incoming memory arguments.
5.7.6. Procedure Return Values
If the class is MEMORY, then the caller provides the space for the return value and passes the address of this storage in %rdi as if it were the first argument to the function. In effect, this address becomes a hidden first argument. This storage must not overlap any data visible to the callee through the other parameters in this argument list. On return, %rax will contain the address that was passed in %rdi by the caller.
If the class is INTEGER, the next available register of the sequence %rax, %rdx is used.
If the class is SSE, the next available floating-point register of the sequence %xmm0, %xmm1 is used.
If the class is SSEUP, the quadword is returned in the next available 8-byte chunk of the last used floating-point register.
If the class is X87, the value is returned on the X87 stack in %st0 as an 80-bit x87 number.
If the class is X87UP, the value is returned together with the previous X87 value in %st0.
If the class is COMPLEX_X87, the real part of the value is returned in %st0 and the imaginary part in %st1.
As a result, scalar values and complex floating-point values are returned in registers %rax, %rax and %rdx, %xmm0, or %xmm0 and %xmm1. The exception is an IEEE complex quadruple precision value, which is returned in a caller-provided temporary location.
5.7.7. Parameter Passing and Return Result Examples
This section includes examples that illustrate the parameter passing and return result rules.
Example 1
As an example of the register passing conventions, consider the declarations and function call shown in Figure 5.4, “Parameter Passing Example 1”. The corresponding register allocation is given in Figure 5.5, “Register Allocation Example 1” where the stack frame offset given shows the frame before calling the function.
Example 2
This example uses a structure with three fields:
An int (4 bytes) or a long (8 bytes) named a.
A short (2 bytes) named b.
A float (4 bytes) or a double (8 bytes) named c.
All four alternatives are included. This structure is followed by a declaration for a function that returns a value of that structure type and a function that has one parameter of that structure type.
// Part C Declarations: Fields of type int, short, float
typedef struct { int a; short b; float c; } structparm_isf;
structparm_isf s_isf;
extern structparm_isf set_isf();
extern void func_isf(structparm_isf p_isf);
// Part D Declarations: Fields of type long, short, float
typedef struct { long a; short b; float c; } structparm_lsf;
structparm_lsf s_lsf;
extern structparm_lsf set_lsf();
extern void func_lsf(structparm_lsf p_lsf);
Call | Field a | Field b | Field c |
---|---|---|---|
func_isd(s_isd) | %rdi | %rdi | %xmm0 |
func_lsd(s_lsd) | memory (stack) | memory (stack) | memory (stack) |
func_isf(s_isf) | %rdi | %rdi | %xmm0 |
func_lsf(s_lsf) | %rdi | %rsi | %rsi |
Call | Field a | Field b | Field c |
---|---|---|---|
set_isd(s_isd) | %rax | %rax | %xmm0 |
set_lsd(s_lsd) | memory pointed to by %rax (passed in %rdi) | memory pointed to by %rax (passed in %rdi) | memory pointed to by %rax (passed in %rdi) |
set_isf(s_isf) | %rax | %rax | %xmm0 |
set_lsf(s_lsf) | %rax | %rdx | %rdx |
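The register assignments in these tables follow from the size of each structure and from which 8-byte unit (eightbyte) each field falls in. The C sketch below makes that layout visible. It is an illustration only: the sketch_ type names and both macros are hypothetical, fixed-width integer types stand in for the 4-byte and 8-byte alternatives of field a, and the variants with a double field are assumed by analogy with the isd and lsd names used in the tables.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-ins for the four structure alternatives (assumed shapes). */
typedef struct { int32_t a; short b; double c; } sketch_isd;
typedef struct { int64_t a; short b; double c; } sketch_lsd;
typedef struct { int32_t a; short b; float  c; } sketch_isf;
typedef struct { int64_t a; short b; float  c; } sketch_lsf;

/* Under the x86-64 industry standard conventions, an aggregate larger
   than two eightbytes (16 bytes) is passed and returned in memory. */
#define SKETCH_IN_MEMORY(type) (sizeof(type) > 16)

/* Index of the eightbyte (0 or 1) that contains a field. */
#define SKETCH_EIGHTBYTE_OF(type, field) (offsetof(type, field) / 8)
```

With these definitions, the lsd variant occupies 24 bytes and therefore goes to memory. In the isf variant, a and b share the first eightbyte (an integer register) while c occupies the second (a floating-point register). In the lsf variant, b and c share the second eightbyte, which is treated as integer data because it contains the short.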
5.8. Procedure Call Stack
A procedure is an active procedure while its body is executing, including while any procedure it calls is executing. When a procedure is active, its designated condition handler may handle an exception that is signaled during its execution.
Associated with each active procedure is an invocation context, informally called a frame, which consists of the set of registers and space in memory that is allocated and that may be accessed during execution for a particular call of that procedure.
When a procedure begins to execute, it has a limited invocation context that includes the parameter passing registers of its caller. The initial instructions may allocate and initialize additional context, including possibly saving information from the invocation context of its caller. Such instructions, if any, are termed a procedure prologue. Once execution of the prologue is complete, the procedure is said to be active.
When a procedure is ready to return to its caller, the procedure ceases to be active after it begins to execute the instructions that deallocate and discard the procedure's invocation context (which may include restoring state of the caller's invocation context that was saved during the prologue). These instructions are termed a procedure epilogue.
A null frame procedure has no prologue and no epilogue, and consists solely of body instructions. Such a procedure becomes active immediately.
A procedure may have more than one prologue if there are multiple entry points. A procedure may also have more than one epilogue if there are multiple return points. One of each will be executed during any given invocation of the procedure.
A procedure call stack (for a thread) consists of the stack of invocation contexts that exists at any point in time. New invocation contexts are pushed on that stack as procedures are called and invocations are popped from the call stack as procedures return.
The invocation context of a procedure that calls another procedure is said to precede or be previous to the invocation context of the called procedure.
5.8.1. Current Procedure
The current procedure is the active procedure whose execution began most recently; its invocation context is at the top of the call stack. Note that a procedure executing in its prologue or epilogue is not active, and hence cannot be the current procedure.
For OpenVMS x86-64, the IP (instruction pointer) register in combination with associated unwind information determines what procedure is current (for exception handling purposes). See Section B.3, “Data Structures” for a description of the unwind information data structures.
5.8.2. Procedure Call Tracing
Mechanisms for each of the following functions are needed to support procedure call tracing:
To provide the context of a procedure invocation
To walk (navigate) the procedure call stack
To refer to a given procedure invocation
To examine or modify the register context of an active procedure
This section describes the data structure mechanisms. The run-time library functions that support these functions are described in Section 5.8.3, “Invocation Context Block Access Routines”.
5.8.2.1. Invocation Context Block
The context of a specific procedure invocation is provided through the use of a data structure called an invocation context block (ICB). Table 5.20, “Contents of the Invocation Context Block” describes the contents of the OpenVMS x86-64 invocation context block.
Field | Size | Description |
---|---|---|
LIBICB$L_CONTEXT_LENGTH | Longword | Unsigned total length in bytes of the invocation context block. See Section 5.8.3.1, “Initializing the Invocation Context Block”. |
LIBICB$V_FRAME_FLAGS | 3 Bytes | See Table 5.21, “Flags in LIBICB$V_FRAME_FLAGS Field of the Invocation Context Block”. |
LIBICB$B_BLOCK_VERSION | Byte | ICB version; initial value of 3 for OpenVMS x86-64. (1 is for OpenVMS Alpha, 2 is for OpenVMS I64). See Section 5.8.3.1, “Initializing the Invocation Context Block”. |
| 2 Quadwords | Internal (opaque) unwind context data. |
LIBICB$IH_IREG | 16 Quadwords | Array of general registers. |
LIBICB$IH_IP | Quadword | Current instruction pointer (IP). |
LIBICB$IH_PSEUDO_REGS | 32 Quadwords | Array of Alpha pseudo-registers. |
LIBICB$IH_RFLAGS | Quadword | Processor RFLAGS register. |
LIBICB$IH_FSGS | Quadword | |
LIBICB$IH_XSAVE_STATE | Quadword | XSAVE state control register value indicating what information is contained in the XSAVE area. This is the state-component bit map needed by the XRSTOR to restore the floating-point state from the XSAVE area (0 if the XSAVE pointer is null). |
LIBICB$PH_XSAVE | Quadword | Pointer to an XSAVE area (null if floating-point is not in use). |
LIBICB$L_XSAVE_LENGTH | Longword | The number of bytes in the block pointed to by LIBICB$PH_XSAVE (0 if LIBICB$PH_XSAVE is null). |
LIBICB$PH_CHFCTX_ADDR | Quadword | Pointer to condition handler facility context block. |
LIBICB$IH_OSSD | Quadword | Copy of OSSD from unwind information. |
LIBICB$IH_HANDLER_PV | Quadword | Condition Handler Procedure Value (if any). |
LIBICB$PH_LSDA | Quadword | Address of the Language Specific Data Area (if any). |
Beginning of User Override Parameters (offset LIBICB$R_UO_BASE) | | |
LIBICB$Q_UO_FLAGS | Quadword | Operational flags: LIBICB$V_UO_FLAG_CACHE_UNWIND – Cache unwind information during a walk of the call stack. See Section 5.8.3.2, “Walking the Call Stack”. |
LIBICB$IH_UO_IDENT | Quadword | |
LIBICB$PH_UO_READ_MEM | Quadword | |
LIBICB$PH_UO_GETUEINFO | Quadword | |
LIBICB$PH_UO_GETCONTEXT | Quadword | |
LIBICB$PH_UO_WRITE_MEM | Quadword | |
LIBICB$PH_UO_WRITE_REG | Quadword | |
LIBICB$PH_UO_MALLOC | Quadword | |
LIBICB$PH_UO_FREE | Quadword | |
End of user override parameters (length of LIBICB$K_UO_LENGTH) | | |
LIBICB$L_ALERT_CODE | Longword | Stack walk detailed status. Alert codes are enumerated in the LIBICB include files (see Section 5.8.3.7, “LIB$X86_GET_CURR_INVO_CONTEXT”). |
LIBICB$IH_SYSTEM_ | n Quadwords | Variable-sized area; unused and undefined at this time. |
Flag | Description |
---|---|
LIBICB$V_EXCEPTION_FRAME | Set to 1 if this is an exception frame. |
LIBICB$V_AST_FRAME | Set to 1 if this is an AST frame. |
LIBICB$V_BOTTOM_OF_STACK | Set to 1 if this is the bottom of the stack and there is absolutely no previous frame. |
LIBICB$V_HANDLER_PRESENT | Set to 1 if this frame has a condition handler. |
LIBICB$V_IN_PROLOGUE | Set to 1 if the IP is in a prologue region. |
LIBICB$V_IN_EPILOGUE | Set to 1 if the IP is in an epilogue region. |
5.8.2.2. Invocation Context Handle
To refer to a specific procedure invocation at run-time, an invocation context handle (ICH) can be used. The invocation context handle is a quadword that uniquely identifies any one of the active frames on a call stack.
On OpenVMS x86-64, the invocation context handle for a frame is simply the stack pointer value at procedure entry (that is, the address of the caller’s return address on the stack).
5.8.3. Invocation Context Block Access Routines
Note
The OpenVMS x86-64 stack tracing routines use heap storage during the analysis of unwind descriptors. The default heap storage mechanism uses a LIBRTL implementation of the C RTL function malloc, the use of which may result in virtual memory being expanded using the $EXPREG system service. See Section 5.8.5, “Invocation Context Callback Routines” on how to override the defaults. See also Section 5.8.3.12, “LIB$X86_PREV_INVO_END”.
5.8.3.1. Initializing the Invocation Context Block
Allocate the block on an octaword (16-byte) boundary.
Clear (set to all zero bytes) the entire block.
Initialize the LIBICB$L_CONTEXT_LENGTH field to LIBICB$K_INVO_CONTEXT_BLK_SIZE and the LIBICB$B_BLOCK_VERSION field to LIBICB$K_INVO_CONTEXT_VERSION.
Set any required parameters in the user override portion of the invocation context block.
Set the LIBICB$V_UO_FLAG_CACHE_UNWIND flag if appropriate. See also Section 5.8.3.2, “Walking the Call Stack” and Section 5.8.3.12, “LIB$X86_PREV_INVO_END” regarding subsequent use of LIB$X86_PREV_INVO_END.
Failure to do so will cause these routines to return an error status. Note that this is a change from Alpha, where initialization was not necessary.
LIB$X86_CREATE_INVO_CONTEXT (see Section 5.8.3.3, “LIB$X86_CREATE_INVO_CONTEXT”)
LIB$X86_FREE_INVO_CONTEXT (see Section 5.8.3.4, “LIB$X86_FREE_INVO_CONTEXT”)
LIB$X86_INIT_INVO_CONTEXT (see Section 5.8.3.5, “LIB$X86_INIT_INVO_CONTEXT”)
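The manual initialization steps in this section can be sketched in portable C as follows. Everything in this sketch is a hypothetical stand-in: the structure models only the first three fields of the invocation context block, the block size of 1024 bytes is an arbitrary placeholder for LIBICB$K_INVO_CONTEXT_BLK_SIZE, and the version constant simply reuses the value 3 given for OpenVMS x86-64. Real code would use the LIBICB definitions or one of the routines listed above.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Placeholder values; the real constants come from the LIBICB include files. */
enum { SKETCH_ICB_SIZE = 1024, SKETCH_ICB_VERSION = 3 };

typedef struct {
    uint32_t context_length;   /* stands in for LIBICB$L_CONTEXT_LENGTH */
    uint8_t  frame_flags[3];   /* stands in for LIBICB$V_FRAME_FLAGS    */
    uint8_t  block_version;    /* stands in for LIBICB$B_BLOCK_VERSION  */
    uint8_t  rest[SKETCH_ICB_SIZE - 8];
} sketch_icb;

static sketch_icb *sketch_icb_new(void)
{
    /* Allocate on an octaword (16-byte) boundary. */
    sketch_icb *icb = aligned_alloc(16, sizeof *icb);
    if (icb == NULL)
        return NULL;
    /* Clear the entire block. */
    memset(icb, 0, sizeof *icb);
    /* Record the block length and version. */
    icb->context_length = (uint32_t)sizeof *icb;
    icb->block_version  = SKETCH_ICB_VERSION;
    return icb;   /* caller releases the block with free() */
}
```

Any user override parameters and the cache unwind flag would be set after these steps, before the block is first passed to a call tracing routine.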
5.8.3.2. Walking the Call Stack
During the course of program execution, it is sometimes necessary to walk the call stack. Frame-based exception handling is one case where this is done. Call stack navigation is possible only in the reverse direction (in a latest-to-earliest or top-to-bottom sequence).
Given a program state (which contains a register set), build an invocation context.
For the current routine, an initial invocation context block can be obtained by calling the LIB$X86_GET_CURR_INVO_CONTEXT routine (see Section 5.8.3.7, “LIB$X86_GET_CURR_INVO_CONTEXT”).
Repeatedly call the LIB$X86_GET_PREV_INVO_CONTEXT routine (see Section 5.8.3.8, “LIB$X86_GET_PREV_INVO_CONTEXT”) until the desired invocation context, or the end of the call chain, has been reached.
LIB$X86_GET_PREV_INVO_CONTEXT indicates the end of the invocation call chain if either of the following conditions is true:
The OSSD$V_BOTTOM_OF_STACK flag is set for the target frame (see Table A.14, “Operating System-Specific Data Area”).
The return address (IP) of the target frame is zero.
To make the stack walk more efficient, you can set the LIBICB$V_UO_FLAG_CACHE_UNWIND flag. This causes unwind information to be carried over from one call to LIB$X86_GET_PREV_INVO_CONTEXT to the next. At the conclusion of the stack walk, you must call LIB$X86_PREV_INVO_END to free any cached unwind information. This is the recommended practice, but not the default behavior.
Compilers are allowed to optimize high-level language procedure calls in such a way that they do not appear in the invocation chain. For example, inline procedures never appear in the invocation chain.
Make no assumptions about the relative positions of any memory used for procedure frame information. There is no guarantee that successive stack frames will always appear at higher addresses.
5.8.3.3. LIB$X86_CREATE_INVO_CONTEXT
This convenience routine simplifies creating and properly initializing an invocation context block. The routine allocates an invocation context block from heap storage and initializes it according to the steps described in Section 5.8.3.1, “Initializing the Invocation Context Block”. Users of this routine should call LIB$X86_FREE_INVO_CONTEXT when the invocation context block is no longer required.
This routine sets the cache unwind flag LIBICB$V_UO_FLAG_CACHE_UNWIND in the invocation context block to speed the stack walk. Do not use this routine in conjunction with LIB$X86_INIT_INVO_CONTEXT, as the same initialization is performed by both routines.
LIB$X86_CREATE_INVO_CONTEXT ([malloc] [, free] [, ident])
Argument | OpenVMS Usage | Type | Access | Mechanism |
---|---|---|---|---|
malloc | function_value | procedure | read | by value |
free | function_value | procedure | read | by value |
ident | user_value | quadword | read | by value |
|
A procedure value for a user callback routine that allocates memory. See Section 5.8.5.6, “The Memory Allocation Routine” for details of
this routine. This is an optional argument. The default is to use an implementation of
the C RTL routine |
|
A procedure value for a user callback routine that deallocates memory. This value is
placed in the invocation context block field LIBICB$PH_UO_FREE.
See Section 5.8.5.7, “The Memory Deallocation Routine” for
details on this routine. This is an optional argument; however, it must be specified if
|
|
Specifies a user ident value to be placed in the invocation context block
LIBICB$IH_UO_IDENT field. In turn, this value is passed to the
|
|
A non-zero value represents the address of the invocation context block allocated. A value of 0 indicates failure. |
5.8.3.4. LIB$X86_FREE_INVO_CONTEXT
Deallocates an invocation context block that was previously allocated using LIB$X86_CREATE_INVO_CONTEXT. This routine calls LIB$X86_PREV_INVO_END as a convenience.
LIB$X86_FREE_INVO_CONTEXT (invo_context)
Argument | OpenVMS Usage | Type | Access | Mechanism |
---|---|---|---|---|
invo_context | invo_context_blk | structure | modify | by reference |
|
Address of an invocation context block. |
None. |
5.8.3.5. LIB$X86_INIT_INVO_CONTEXT
Initializes an invocation context block that the user has already allocated (on the stack, or from heap, or other storage) in accordance with Section 5.8.3.1, “Initializing the Invocation Context Block”. Use this routine as an alternative to LIB$X86_CREATE_INVO_CONTEXT, which both allocates and initializes an invocation context block.
LIB$X86_INIT_INVO_CONTEXT (invo_context, invo_version [, cache_unwind_flag])
Argument | OpenVMS Usage | Type | Access | Mechanism |
---|---|---|---|---|
invo_context | invo_context_blk | structure | modify | by reference |
invo_version | version_number | byte | read | by value |
cache_unwind_flag | flag | longword | read | by value |
|
Address of an invocation context block. |
|
The value LIBICB$K_INVO_CONTEXT_VERSION. This is used to verify the operating environment. |
|
A flag indicating if the cache unwind flag, LIBICB$V_UO_FLAG_CACHE_UNWIND, should be set in the invocation context block. A value of zero clears the flag; a value of one sets the flag. This is an optional argument. The default is zero. |
|
A value of 1 indicates success. A value of 0 indicates a version number mismatch. |
5.8.3.6. LIB$X86_GET_INVO_CONTEXT
LIB$X86_GET_INVO_CONTEXT(invo_handle, invo_context)
Argument | OpenVMS Usage | Type | Access | Mechanism |
---|---|---|---|---|
invo_handle | invo_handle | quadword | read | by reference |
invo_context | invo_context_blk | structure | modify | by reference |
|
Address of the location that contains the handle for the desired invocation. |
|
Address of an invocation context block into which the procedure context of the frame
specified by |
Note
The invocation context block must be properly initialized as described in Section 5.8.3.1, “Initializing the Invocation Context Block” before calling this routine.
|
Status value. A value of 1 indicates success; a value of 0 indicates failure. |
Note
If the invocation handle that was passed does not represent any procedure context in the active call stack, the new contents of the context block is unpredictable.
5.8.3.7. LIB$X86_GET_CURR_INVO_CONTEXT
LIB$X86_GET_CURR_INVO_CONTEXT(invo_context)
Argument | OpenVMS Usage | Type | Access | Mechanism |
---|---|---|---|---|
invo_context | invo_context_blk | structure | modify | by reference |
|
Address of an invocation context block into which the procedure context of the caller will be written. |
Note
The invocation context block must be properly initialized as described in Section 5.8.3.1, “Initializing the Invocation Context Block” before calling this routine.
Zero |
This facilitates use in the implementation of the C language unwind
|
5.8.3.8. LIB$X86_GET_PREV_INVO_CONTEXT
LIB$X86_GET_PREV_INVO_CONTEXT(invo_context)
Argument | OpenVMS Usage | Type | Access | Mechanism |
---|---|---|---|---|
invo_context | invo_context_blk | structure | modify | by reference |
|
Address of a valid invocation context block. The given invocation context block is updated to represent the context of the previous (calling) frame. The LIBICB$V_BOTTOM_OF_STACK flag of the invocation context block is set if the target frame represents the end of the invocation call chain or if stack corruption is detected. |
|
Status value. A value of 1 indicates success. When the initial context represents the bottom of the call stack, a value of 0 is returned. |
5.8.3.9. LIB$X86_GET_INVO_HANDLE
LIB$X86_GET_INVO_HANDLE(invo_context, invo_handle)
Argument | OpenVMS Usage | Type | Access | Mechanism |
---|---|---|---|---|
invo_context | invo_context_blk | structure | read | by reference |
invo_handle | invo_handle | quadword | write | by reference |
|
Address of a valid invocation context block. |
|
Address of the location into which the invocation context handle is to be written. If the call fails, the value of the invocation context handle is LIB$K_INVO_HANDLE_NULL. |
|
A value of 1 indicates success. A value of 0 indicates failure. |
5.8.3.10. LIB$X86_GET_CURR_INVO_HANDLE
LIB$X86_GET_CURR_INVO_HANDLE(invo_handle)
Argument | OpenVMS Usage | Type | Access | Mechanism |
---|---|---|---|---|
invo_handle | invo_handle | quadword | write | by reference |
|
Address of a quadword into which the invocation handle of the caller will be written. |
|
A value of 1 indicates success. A value of 0 indicates failure. |
5.8.3.11. LIB$X86_GET_PREV_INVO_HANDLE
LIB$X86_GET_PREV_INVO_HANDLE(invo_handle_in, invo_handle_out)
Argument | OpenVMS Usage | Type | Access | Mechanism |
---|---|---|---|---|
invo_handle_in | invo_handle | quadword | read | by reference |
invo_handle_out | invo_handle | quadword | write | by reference |
|
The address of an invocation handle that represents a target invocation context. |
|
Address of the location into which the invocation context handle of the previous context is to be written. If the call fails, the value of the previous invocation context handle is LIB$K_INVO_HANDLE_NULL. |
|
A value of 1 indicates success. A value of 0 indicates failure. |
Note
Each call to this routine involves a stack walk from the top of the stack to find the procedure matching the input handle. Consequently, using this routine repeatedly is an inefficient way to walk the stack, compared to using LIB$X86_GET_PREV_INVO_CONTEXT.
5.8.3.12. LIB$X86_PREV_INVO_END
This routine should be called at the conclusion of call tracing operations to free the memory used to process unwind descriptors. The call tracing routines are LIB$X86_GET_INVO_CONTEXT, LIB$X86_GET_PREV_INVO_CONTEXT, and LIB$X86_GET_CURR_INVO_CONTEXT.
To provide efficient call tracing, some unwind information is tracked in heap storage from one call to the next. This heap storage should be freed before you release or reuse the invocation context block.
Calling this routine is necessary if the LIBICB$V_UO_FLAG_CACHE_UNWIND flag is set in the LIBICB$Q_UO_FLAGS field of the invocation context block. If this flag is not set, unwind information is released and recreated at each call, and calling this routine is not required.
LIB$X86_PREV_INVO_END (invo_context)
Argument | OpenVMS Usage | Type | Access | Mechanism |
---|---|---|---|---|
invo_context | invo_context_blk | structure | modify | by reference |
|
Address of a valid invocation context block previously used for call tracing. |
|
A value of 1 indicates success. A value of 0 indicates failure. |
5.8.3.13. LIB$X86_PUT_INVO_REGISTERS
LIB$X86_PUT_INVO_REGISTERS (invo_handle, invo_context [,gr_mask] [,xmm_mask] [,ymm_mask] [,zmm_mask] [,apr_mask] [,misc_mask])
Argument | OpenVMS Usage | Type | Access | Mechanism |
---|---|---|---|---|
invo_handle | invo_handle | quadword | read | by reference |
invo_context | invo_context_blk | structure | read | by reference |
gr_mask | mask_word | 16-bit vector | read | by reference |
xmm_mask | mask_word | 16-bit vector | read | by reference |
ymm_mask | mask_word | 16-bit vector | read | by reference |
zmm_mask | mask_longword | 32-bit vector | read | by reference |
apr_mask | mask_longword | 32-bit vector | read | by reference |
misc_mask | mask_quadword | 64-bit vector | read | by reference |
|
Handle for the invocation to be updated. |
|
Address of a valid invocation context block that contains new register contents. |
At least one of the following register masks must be specified and contain a
non-zero value. Each register that is set in the | |
|
Address of a 16-bit bit vector, where each bit corresponds to a register field in
the Bits 0 through 15 correspond to IREG[0] through IREG[15]. Bit 0 corresponds to the argument information register (AI). If bit 7, which corresponds to SP, is set, then no changes are made. |
|
Address of a 16-bit bit vector, where each bit corresponds to an SSE XMM register
field in the XSAVE area, pointed to from the passed |
|
Address of a 16-bit bit vector, where each bit corresponds to an SSE YMM register
field in the XSAVE area, pointed to from the passed |
|
Address of a 32-bit bit vector, where each bit corresponds to an SSE ZMM register
field in the XSAVE area, pointed to from the passed |
Note that if the same bit position is set in more than one of the
| |
|
Address of a 32-bit bit vector, where each bit corresponds to a register field in the pointed to Alpha pseudo-register area passed. Bits 0 through 31 correspond to Alpha registers R0 through R31. If bit 30, which corresponds to SP, or 31, which corresponds to RZ are set, then no changes are made. |
|
Address of a 64-bit bit vector, where each bit corresponds to a register field in
the passed
invo_context as follows:
Note that IP can only be updated when the invocation in question has been interrupted (either by exception or by an interrupt) and is logically previous to an invocation with the OSSD$V_EXCEPTION_FRAME bit set. Note that MXCSR, FCW, and FSW can only be updated when there is a valid address and
an XSAVE area in the |
|
A value of 1 indicates success. A value of 0 is returned (and nothing is changed) in
the following circumstances:
|
Caution
Great care must be taken to assure that a valid stack frame and execution environment result; otherwise, execution may become unpredictable.
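The mask arguments are plain bit vectors, so selecting a register amounts to setting the bit whose position matches the register index. The following is a minimal sketch, not part of any OpenVMS library: the helper names gr_mask_add and gr_mask_is_usable and the constant GR_MASK_SP_BIT are hypothetical. The only facts taken from the description above are that bits 0 through 15 of gr_mask map to IREG[0] through IREG[15] and that no changes are made when bit 7 (SP) is set. The second helper checks a gr_mask that is used on its own, without any of the other masks.

```c
#include <stdint.h>

/* Hypothetical constant: bit 7 of gr_mask corresponds to SP (per the text above). */
#define GR_MASK_SP_BIT 7u

/* Return mask with the bit for IREG[index] set.
 * Indexes outside 0..15 leave the mask unchanged. */
static uint16_t gr_mask_add(uint16_t mask, unsigned index)
{
    if (index > 15u)
        return mask;
    return (uint16_t)(mask | (1u << index));
}

/* Return 1 if a gr_mask used on its own is non-zero and does not
 * select SP; return 0 otherwise. */
static int gr_mask_is_usable(uint16_t mask)
{
    if (mask == 0)
        return 0;
    if (mask & (1u << GR_MASK_SP_BIT))
        return 0;
    return 1;
}
```

A caller would typically build the mask first, check it, and only then pass its address to the update routine, since a rejected mask leaves the target context unchanged.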
5.8.4. Supplemental Invocation Context Access Routines
The routines described in this section can be used to perform some of the more common operations involving invocation contexts.
5.8.4.1. LIB$X86_GET_GR
Given an invocation context block and a general-purpose register index such that 0 <= index < 16, copy the register value to gr_copy. For example, index 4 fetches the invocation context block IREG[4] value, which represents the contents of %rsi for the context.
LIB$X86_GET_GR fails if the index represents a scratch register whose contents have not been realized.
LIB$X86_GET_GR (invo_context, index, gr_copy)
Argument |
OpenVMS Usage |
Type |
Access |
Mechanism |
---|---|---|---|---|
invo_context |
invo_context_blk |
structure |
read |
by reference |
index | index | longword | read | by value |
gr_copy | integer value | quadword | write |
by reference |
|
Address of a valid invocation context block. |
index |
Index into the IREG array of the invocation context block. |
gr_copy |
Address of a quadword to receive the value from the invocation context block. |
|
A value of 1 indicates success. A value of 0 indicates failure. |
5.8.4.2. LIB$X86_SET_GR
Given an invocation context block, a general-purpose register index such that 1 <= index < 16, and a quadword value gr_copy, writes the corresponding invocation context block general register and uses LIB$X86_PUT_INVO_REGISTERS to write to the actual context. The invocation context block remains unchanged if the routine fails.
LIB$X86_SET_GR (invo_context, index, gr_copy)
Argument |
OpenVMS Usage |
Type |
Access |
Mechanism |
---|---|---|---|---|
invo_context | invo_context_blk | structure | modify | by reference |
index | index | longword | read | by value |
gr_copy | integer value | quadword | read | by reference |
|
Address of a valid invocation context block. |
index |
Index into the IREG array of the invocation context block. |
gr_copy |
Address of a quadword that contains the value to be written to the invocation context block. |
5.8.4.3. LIB$X86_GET_XMM
Given an invocation context block and a register index that is 0 <= index < 16 for SSE (Streaming SIMD Extensions) or 0 <= index < 32 for AVX-512 (512-bit Advanced Vector Extensions), copy the register value to xmm_copy. For example, an index value of 4 fetches the value, which represents the contents of xmm4.
LIB$X86_GET_XMM returns failure status if there is no corresponding XSAVE area in the invo_context or if the index represents a register or register set not saved in the XSAVE area.
LIB$X86_GET_XMM (invo_context, index, xmm_copy)
Argument |
OpenVMS Usage |
Type |
Access |
Mechanism |
---|---|---|---|---|
invo_context |
invo_context_blk |
structure |
read |
by reference |
index |
index |
longword |
read |
by value |
xmm_copy | register contents |
16 bytes |
write |
by reference |
|
Address of a valid invocation context block. | |
index |
Index into the virtual array of XMM registers constructed from the XSAVE area. The
XSAVE area is pointed to from the invocation context block.
Note: On CPUs implementing the AVX-512 or AVX10 Advanced Vector Extensions, the additional XMM/YMM registers are part of the ZMM registers. For more information on Advanced Vector Extensions, refer to the official documentation on the Intel website. | |
xmm_copy |
Address of a 16-byte buffer to receive the contents of the specified register. |
|
A value of 1 indicates success. A value of 0 indicates failure. |
5.8.4.4. LIB$X86_SET_XMM
Given an invocation context block, a register index that is 0 <= index < 16 for SSE (Streaming SIMD Extensions) or 0 <= index < 32 for AVX-512 (512-bit Advanced Vector Extensions), and a register value in xmm_copy, writes the corresponding entry in the XSAVE area pointed to from the invocation context block, and calls LIB$X86_PUT_INVO_REGISTERS to write the actual context. The XSAVE area remains unchanged if the routine fails.
LIB$X86_SET_XMM fails if LIB$X86_PUT_INVO_REGISTERS fails.
LIB$X86_SET_XMM (invo_context, index, xmm_copy)
Argument |
OpenVMS Usage |
Type |
Access |
Mechanism |
---|---|---|---|---|
invo_context |
invo_context_blk |
structure |
modify |
by reference |
index | index | longword | read | by value |
xmm_copy | register contents | 16 bytes | read |
by reference |
|
Address of a valid invocation context block. | |
index |
Index into the virtual array of XMM registers constructed from the XSAVE area. The XSAVE area is pointed to from the invocation context block. Note: On CPUs implementing the AVX-512 or AVX10 Advanced Vector Extensions, the additional XMM/YMM registers are part of the ZMM registers. For more information on Advanced Vector Extensions, refer to the official documentation on the Intel website. | |
xmm_copy |
Address of a 16-byte buffer that contains the value to be written to the invocation context. |
|
A value of 1 indicates success. A value of 0 indicates failure. |
5.8.4.5. LIB$X86_GET_YMM
Given an invocation context block and a register index that is 0 <= index < 16 for AVX (Advanced Vector Extensions) or 0 <= index < 32 for AVX-512 (512-bit Advanced Vector Extensions), copy the register value to ymm_copy. For example, an index value of 4 fetches the value, which represents the contents of ymm4.
LIB$X86_GET_YMM returns failure status if there is no corresponding XSAVE area in the invo_context or if the index represents a register or register set not saved in the XSAVE area.
LIB$X86_GET_YMM (invo_context, index, ymm_copy)
Argument |
OpenVMS Usage |
Type |
Access |
Mechanism |
---|---|---|---|---|
invo_context | invo_context_blk | structure | read | by reference |
index | index | longword | read | by value |
ymm_copy | register contents | 32 bytes | write | by reference |
|
Address of a valid invocation context block. |
|
Index into the virtual array of YMM registers constructed from the XSAVE area. The XSAVE area is pointed to from the invocation context block. Note: On CPUs implementing the AVX-512 or AVX10 Advanced Vector Extensions, the additional XMM/YMM registers are part of the ZMM registers. For more information on Advanced Vector Extensions, refer to the official documentation on the Intel website. |
|
Address of a 32-byte buffer to receive the contents of the specified register. |
|
A value of 1 indicates success. A value of 0 indicates failure. |
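The phrase "virtual array" reflects that a YMM value is not stored contiguously: in the XSAVE format defined by Intel, the low 128 bits of YMM0 through YMM15 occupy the XMM slots, and the upper 128 bits are kept in a separate AVX state component. The sketch below illustrates only that assembly step. The structure demo_vec_state and the function demo_get_ymm are hypothetical simplifications; they do not reproduce the real XSAVE offsets or the behavior of LIB$X86_GET_YMM, and the sketch covers indexes 0 through 15 only.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified stand-in for the two pieces of saved state.
 * The real XSAVE area has a different layout and fixed offsets. */
struct demo_vec_state {
    uint8_t xmm_low[16][16];   /* low 128 bits of registers 0..15 */
    uint8_t ymm_high[16][16];  /* upper 128 bits of YMM0..YMM15 */
};

/* Copy the 32-byte value of YMM[index] into ymm_copy.
 * Returns 1 on success and 0 on failure, mirroring the status
 * convention used by the routines in this section. */
static int demo_get_ymm(const struct demo_vec_state *st, unsigned index,
                        uint8_t ymm_copy[32])
{
    if (st == NULL || ymm_copy == NULL || index >= 16u)
        return 0;
    memcpy(ymm_copy, st->xmm_low[index], 16);        /* bits 0..127 */
    memcpy(ymm_copy + 16, st->ymm_high[index], 16);  /* bits 128..255 */
    return 1;
}
```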
5.8.4.6. LIB$X86_SET_YMM
Given an invocation context block, a register index that is 0 <= index < 16 for AVX (Advanced Vector Extensions) or 0 <= index < 32 for AVX-512 (512-bit Advanced Vector Extensions), and a register value in ymm_copy, writes the corresponding entry in the XSAVE area pointed to from the invocation context block, and calls LIB$X86_PUT_INVO_REGISTERS to write the actual context. The XSAVE area remains unchanged if the routine fails.
LIB$X86_SET_YMM fails if LIB$X86_PUT_INVO_REGISTERS fails.
LIB$X86_SET_YMM (invo_context, index, ymm_copy)
Argument |
OpenVMS Usage |
Type |
Access |
Mechanism |
---|---|---|---|---|
invo_context | invo_context_blk | structure | modify | by reference |
index | index | longword | read | by value |
ymm_copy | register contents | 32 bytes | read | by reference |
|
Address of a valid invocation context block. |
|
Index into the virtual array of YMM registers constructed from the XSAVE area. The XSAVE area is pointed to from the invocation context block. Note: On CPUs implementing the AVX-512 or AVX10 Advanced Vector Extensions, the additional XMM/YMM registers are part of the ZMM registers. For more information on Advanced Vector Extensions, refer to the official documentation on the Intel website. |
|
Address of a 32-byte buffer that contains the value to be written to the invocation context. |
|
A value of 1 indicates success. A value of 0 indicates failure. |
5.8.4.7. LIB$X86_GET_ZMM
Given an invocation context block and a register index that is 0 <= index < 32 for AVX-512 (512-bit Advanced Vector Extensions), copy the register value to zmm_copy. For example, an index value of 4 fetches the value, which represents the contents of zmm4.
LIB$X86_GET_ZMM returns failure status if there is no corresponding XSAVE area in the invo_context or if the index represents a register or register set not saved in the XSAVE area.
LIB$X86_GET_ZMM (invo_context, index, zmm_copy)
Argument |
OpenVMS Usage |
Type |
Access |
Mechanism |
---|---|---|---|---|
invo_context | invo_context_blk | structure | read | by reference |
index | index | longword | read | by value |
zmm_copy | register contents | 64 bytes | write | by reference |
|
Address of a valid invocation context block. |
|
Index into the virtual array of ZMM registers constructed from the XSAVE area. The XSAVE area is pointed to from the invocation context block. |
|
Address of a 64-byte buffer to receive the contents of the specified register. |
|
A value of 1 indicates success. A value of 0 indicates failure. |
5.8.4.8. LIB$X86_SET_ZMM
Given an invocation context block, a register index that is 0 <= index < 32 for AVX-512 (512-bit Advanced Vector Extensions), and a register value in zmm_copy, writes the corresponding entry in the XSAVE area pointed to from the invocation context block, and calls LIB$X86_PUT_INVO_REGISTERS to write the actual context. The XSAVE area remains unchanged if the routine fails.
LIB$X86_SET_ZMM fails if LIB$X86_PUT_INVO_REGISTERS fails.
LIB$X86_SET_ZMM (invo_context, index, zmm_copy)
Argument |
OpenVMS Usage |
Type |
Access |
Mechanism |
---|---|---|---|---|
invo_context | invo_context_blk | structure | modify | by reference |
index | index | longword | read | by value |
zmm_copy | register contents | 64 bytes | read | by reference |
|
Address of a valid invocation context block. |
|
Index into the virtual array of ZMM registers constructed from the XSAVE area. The XSAVE area is pointed to from the invocation context block. |
|
Address of a 64-byte buffer that contains the value to be written to the invocation context. |
|
A value of 1 indicates success. A value of 0 indicates failure. |
5.8.4.9. LIB$X86_SET_IP
Given an invocation context block and a quadword IP value in ip_copy, write the ip_copy value to the invocation context block IP and then use LIB$X86_PUT_INVO_REGISTERS to write to the actual context. The invocation context block remains unchanged if the routine fails.
LIB$X86_SET_IP fails if LIB$X86_PUT_INVO_REGISTERS fails.
LIB$X86_SET_IP (invo_context, ip_copy)
Argument |
OpenVMS Usage |
Type |
Access |
Mechanism |
---|---|---|---|---|
invo_context |
invo_context_blk |
structure |
modify |
by reference |
ip_copy | integer value | quadword | read |
by reference |
|
Address of a valid invocation context block. |
ip_copy |
Address of a quadword that contains the IP value to be written to the invocation context block. |
|
A value of 1 indicates success. A value of 0 indicates failure. |
5.8.4.10. LIB$X86_GET_UNWIND_LSDA
Given an ip_value, find the address of the unwind information block language-specific data area (LSDA), and write it to unwind_lsda_p. If not present, then write 0 to unwind_lsda_p.
LIB$X86_GET_UNWIND_LSDA (ip_value, unwind_lsda_p)
Argument |
OpenVMS Usage |
Type |
Access |
Mechanism |
---|---|---|---|---|
ip_value | IP value | quadword | read |
by reference |
unwind_lsda_p | address | quadword | write | by reference |
|
Address of a location that contains the IP value. |
unwind_lsda_p |
Address of a quadword to receive the address of the language-specific data area, if there is one. |
|
A value of 1 indicates success. A value of 0 indicates failure. |
5.8.4.11. LIB$X86_GET_UNWIND_OSSD
Given an ip_value, find the address of the unwind information block operating system-specific data area, if present, and write it to unwind_ossd_p. If not present, then write 0 to unwind_ossd_p.
LIB$X86_GET_UNWIND_OSSD (ip_value, unwind_ossd_p)
Argument |
OpenVMS Usage |
Type |
Access |
Mechanism |
---|---|---|---|---|
ip_value |
IP value |
quadword |
read |
by reference |
unwind_ossd_p | address | quadword | write |
by reference |
|
Address of a location that contains the IP value. |
unwind_ossd_p |
Address of a quadword to receive the address of the operating system-specific data area. Note that the OSSD value is contained in the FDE unwind information (see Section B.3.2.3, “Frame Description Entry”) and is therefore not writable. |
|
A value of 1 indicates success. A value of 0 indicates failure. |
5.8.4.12. LIB$X86_GET_UNWIND_HANDLER_PV
Given an ip_value, find the procedure value for the condition handler, if present, and write it to handler_pv. If not present, then write 0 to handler_pv.
LIB$X86_GET_UNWIND_HANDLER_PV (ip_value, handler_pv)
Argument |
OpenVMS Usage |
Type |
Access |
Mechanism |
---|---|---|---|---|
ip_value |
IP value |
quadword |
read |
by reference |
handler_pv | address | quadword | write |
by reference |
|
Address of a location that contains the IP value. |
handler_pv |
Address of a quadword to receive the procedure value for the condition handler, if there is one. |
|
A value of 1 indicates success. A value of 0 indicates failure. |
5.8.4.13. LIB$X86_IS_EXC_DISPATCH_FRAME
Used to determine whether a given IP value represents an exception dispatch frame.
LIB$X86_IS_EXC_DISPATCH_FRAME (ip_value)
Argument |
OpenVMS Usage |
Type |
Access |
Mechanism |
---|---|---|---|---|
ip_value |
IP value |
quadword |
read |
by reference |
|
Address of a quadword that contains the IP value. The
|
|
Returns 1 if the operating system-specific data area is present and the EXCEPTION_FRAME flag is set. Returns 0 if the operating system-specific data area is present and the EXCEPTION_FRAME flag is clear. Returns 0 if the operating system-specific data area is not present. |
5.8.4.14. LIB$X86_IS_AST_DISPATCH_FRAME
Used to determine whether a given IP value represents an AST dispatch frame.
LIB$X86_IS_AST_DISPATCH_FRAME (ip_value)
Argument |
OpenVMS Usage |
Type |
Access |
Mechanism |
---|---|---|---|---|
ip_value |
IP value |
quadword |
read |
by reference |
|
Address of a quadword that contains the IP value. The
|
|
Returns 1 if the operating system-specific data area is present and the AST_FRAME flag is set. Returns 0 if the operating system-specific data area is present and the AST_FRAME flag is clear. Returns 0 if the operating system-specific data area is not present. |
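The three return cases listed for LIB$X86_IS_EXC_DISPATCH_FRAME and LIB$X86_IS_AST_DISPATCH_FRAME reduce to a single test, sketched below under stated assumptions. The flag values and the name demo_is_dispatch_frame are hypothetical, and the lookup that the real routines perform from ip_value is replaced here by a pointer that is NULL when no operating system-specific data area is present.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag values; the real OSSD flag layout is defined
 * elsewhere in the standard and may differ. */
#define DEMO_OSSD_EXCEPTION_FRAME 0x1u
#define DEMO_OSSD_AST_FRAME       0x2u

/* ossd_flags is NULL when no OSSD is present for the IP value.
 * Returns 1 only if the OSSD exists and the requested flag is set. */
static int demo_is_dispatch_frame(const uint32_t *ossd_flags, uint32_t flag)
{
    if (ossd_flags == NULL)
        return 0;                        /* OSSD not present */
    return (*ossd_flags & flag) ? 1 : 0; /* flag set or clear */
}
```

Because an absent OSSD and a clear flag both yield 0, a caller cannot distinguish the two cases from the return value alone.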
5.8.5. Invocation Context Callback Routines
Perform a call trace on a process other than the current process.
Override the heap storage mechanism used to allocate memory used during the analysis of unwind descriptors.
The user override callback mechanism provides a user ident value that is passed to each callback routine. The user ident value is stored in the LIBICB$IH_UO_IDENT field of the invocation context block.
Note
LIB$X86_GET_CURR_INVO_HANDLE
LIB$X86_GET_PREV_INVO_HANDLE
5.8.5.1. The Get Unwind Information Routine
Place a procedure value for this routine in the LIBICB$PH_UO_GETUEINFO field of the invocation context block.
int (* getueinfo) (uint64 ip, void *get_ue_block, void *name, ...);
This routine should mimic SYS$GET_UNWIND_ENTRY_INFO for the target process. See Section B.5, “System Unwind Routines” for detailed argument descriptions and return status, with the following notes:
The name argument is not used and can be ignored. If a read memory callback has been specified, the contents of LIBICB$PH_UO_READ_MEM are passed as a fourth argument and the contents of LIBICB$PH_UO_IDENT are passed as a fifth argument; otherwise, the routine is called with three arguments.
5.8.5.2. The Get Initial Context Routine
Place a function pointer for this routine in the LIBICB$PH_UO_GETCONTEXT field of the invocation context block.
The get initial context routine is used to seed the invocation context block from the target process. This routine should initialize the invocation context block structure with the preserved registers, as well as applicable control and status registers, from the target process. This callback routine is used by LIB$X86_GET_CURR_INVO_CONTEXT and should be followed by at least one call to LIB$X86_GET_PREV_INVO_CONTEXT to generate a working context.
int (* getcontext) (void *invo_context, uint64 ident);
Argument |
OpenVMS Usage |
Type |
Access |
Mechanism |
---|---|---|---|---|
invo_context |
invo_context_blk |
structure |
modify |
by reference |
ident |
user_value |
quadword |
read |
by value |
|
The address of the invocation context block. |
|
Specifies a user ident value from the invocation context block. |
|
A value of 1 indicates success. A value of 0 indicates failure. |
5.8.5.3. The Read Memory Routine
Place a function pointer for this routine in the LIBICB$PH_UO_READ_MEM field of the invocation context block.
The read memory routine is used to transfer data from the target process.
int (* read_mem) (void *dst, uint64 src, size_t length, uint64 ident);
Argument |
OpenVMS Usage |
Type |
Access |
Mechanism |
---|---|---|---|---|
dst |
memory_access |
byte_array |
write |
by reference |
src |
memory_address |
quadword |
read |
by value |
length |
size_t |
longword |
read |
by value |
ident |
user_value |
quadword |
read |
by value |
|
A local memory address and the destination for the read operation. |
|
An address in the target process to be read. |
|
The length in bytes to be read. |
|
Specifies a user ident value from the invocation context block. |
|
A value of 1 indicates success. A value of 0 indicates failure. |
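As an illustration of the callback shape, the following sketch implements a read memory routine over a locally captured window of target memory, such as the contents of a dump. It is a minimal example under stated assumptions: the structure demo_target and the function demo_read_mem are hypothetical, and passing a pointer to the structure through the ident value is one possible use of that value, not a documented requirement.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical description of a captured window of target memory. */
struct demo_target {
    uint64_t base;         /* target address of the first captured byte */
    size_t size;           /* number of captured bytes */
    const uint8_t *bytes;  /* local copy of the target memory */
};

/* Follows the read_mem prototype shown above; ident carries a pointer
 * to a demo_target. Returns 1 on success and 0 on failure. */
static int demo_read_mem(void *dst, uint64_t src, size_t length, uint64_t ident)
{
    const struct demo_target *t = (const struct demo_target *)(uintptr_t)ident;

    if (t == NULL || dst == NULL)
        return 0;
    /* Reject reads that start outside the window or run past its end. */
    if (src < t->base || src - t->base > t->size ||
        length > t->size - (size_t)(src - t->base))
        return 0;
    memcpy(dst, t->bytes + (src - t->base), length);
    return 1;
}
```

Returning 0 for any out-of-range request, rather than a partial copy, keeps the success value meaningful: the destination buffer is either filled completely or left untouched.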
5.8.5.4. The Write Memory Routine
Place a procedure value for this routine in the LIBICB$PH_UO_WRITE_MEM field of the invocation context block.
The write memory routine is used to transfer data to the target process. It is used by LIB$X86_PUT_INVO_REGISTERS for a register that has been saved in memory.
int (* write_mem) (void *src, uint64 dst, size_t length, uint64 ident);
Argument |
OpenVMS Usage |
Type |
Access |
Mechanism |
---|---|---|---|---|
src |
memory_access |
byte_array |
read |
by reference |
dst |
memory_address |
quadword |
write |
by value |
length |
size_t |
longword |
read |
by value |
ident |
user_value |
quadword |
read |
by value |
|
A local memory address and the source for the write operation. |
|
An address in the target process to be written. |
|
The length in bytes to be written. |
|
Specifies a user ident value from the invocation context block. |
|
A value of 1 indicates success. A value of 0 indicates failure. |
5.8.5.5. The Write Register Routine
Place a procedure value for this routine in the LIBICB$PH_UO_WRITE_REG field of the invocation context block.
The write register routine is used to write a register in the target process. It is used by LIB$X86_PUT_INVO_REGISTERS for a register that has not been saved in memory.
This routine is optional, and an implementation may support only a subset of registers. LIB$X86_PUT_INVO_REGISTERS returns an error if this routine is not present or is unable to write the desired register.
int (* write_reg) (int whichReg, uint64 value_1, uint64 value_2, uint64 ident);
Argument |
OpenVMS Usage |
Type |
Access |
Mechanism |
---|---|---|---|---|
whichReg |
enumeration |
longword |
read |
by value |
value_p |
address |
quadword |
read |
by value |
ident |
user_value |
quadword |
read |
by value |
|
Indicates the register to be written (see enum in libicb.h). |
|
Specifies the address of the register contents to be written. The number of bytes written is determined by the size of the register. |
|
Specifies a user ident value from the invocation context block. |
|
A value of 1 indicates success. A value of 0 indicates failure. |
5.8.5.6. The Memory Allocation Routine
The memory allocation routine is used to allocate heap storage required during the analysis of unwind descriptors. This routine should mimic the behavior of the C RTL routine malloc.
void * (* malloc) (size_t size, uint64 ident);
Argument |
OpenVMS Usage |
Type |
Access |
Mechanism |
---|---|---|---|---|
length |
size_t |
longword |
read |
by value |
ident |
user_value |
quadword |
read |
by value |
|
The length in bytes of memory to be allocated. The returned memory block should be aligned on a 16-byte boundary. |
|
Specifies a user ident value from the invocation context block. |
|
Address of the memory block allocated, or 0 for failure. |
One Unwind Context block of size LIBICB$K_CONTEXT_BLK_SIZE
One Unwind Descriptor block of size LIBICB$K_DESCRIPTOR_BLK_SIZE
Several Unwind region blocks of size LIBICB$K_REGION_BLK_SIZE
Several Unwind region label blocks of size LIBICB$K_REGIONLABEL_BLK_SIZE
The number of the last two required depends on the complexity of the unwind descriptors for a given procedure being traced.
5.8.5.7. The Memory Deallocation Routine
The memory deallocation routine is used to free heap storage allocated by the memory allocation routine (see Section 5.8.5.6, “The Memory Allocation Routine”). This routine should mimic the behavior of the C RTL routine free.
void (* free) (void * ptr, uint64 ident);
Argument |
OpenVMS Usage |
Type |
Access |
Mechanism |
---|---|---|---|---|
ptr |
address |
quadword |
read |
by value |
ident |
user_value |
quadword |
read |
by value |
|
Address of a memory block previously allocated by a call to the user malloc routine. |
|
Specifies a user ident value from the invocation context block. |
Function Value Returned:
None.
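The allocation and deallocation callbacks can be sketched as a matched pair. The example below is an assumption-laden illustration, not the mechanism used by any OpenVMS library: the names demo_malloc, demo_free, and demo_heap_stats are hypothetical, the ident value is used here to carry a pointer to a small bookkeeping structure, and the C11 routine aligned_alloc is used to obtain the 16-byte alignment recommended above.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical bookkeeping reached through the user ident value. */
struct demo_heap_stats {
    size_t live_blocks;  /* blocks allocated and not yet freed */
};

/* Follows the malloc callback prototype: returns a 16-byte aligned
 * block, or NULL (0) on failure. */
static void *demo_malloc(size_t size, uint64_t ident)
{
    struct demo_heap_stats *stats = (struct demo_heap_stats *)(uintptr_t)ident;
    /* aligned_alloc requires a size that is a multiple of the alignment. */
    size_t rounded = (size + 15u) & ~(size_t)15u;
    void *p;

    if (rounded < size)   /* size was so large that rounding overflowed */
        return NULL;
    if (rounded == 0)
        rounded = 16;
    p = aligned_alloc(16, rounded);
    if (p != NULL && stats != NULL)
        stats->live_blocks++;
    return p;
}

/* Follows the free callback prototype: releases a block obtained from
 * demo_malloc and returns no value. */
static void demo_free(void *ptr, uint64_t ident)
{
    struct demo_heap_stats *stats = (struct demo_heap_stats *)(uintptr_t)ident;

    if (ptr == NULL)
        return;
    free(ptr);
    if (stats != NULL && stats->live_blocks > 0)
        stats->live_blocks--;
}
```

Keeping a live-block count behind the ident value is one simple way to confirm that every block requested during unwind analysis is eventually released.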
5.9. Data Alignment and Layout
On x86-64 hardware, a memory reference to data that is not naturally aligned does not result in alignment faults. However, natural alignment is nonetheless generally more efficient and recommended on OpenVMS x86-64.
In addition, common blocks, dynamically allocated (heap) regions (for example from malloc), and global data items greater than 8 bytes should be aligned on a 16-byte boundary.
5.9.1. Scalars
Data Type |
Alignment Starting Position |
---|---|
8-bit character string |
Byte boundary |
16-bit integer |
Address that is a multiple of 2 (word alignment) |
32-bit integer |
Address that is a multiple of 4 (longword alignment) |
64-bit integer |
Address that is a multiple of 8 (quadword alignment) |
|
Address that is a multiple of 4 (longword) |
|
Address that is a multiple of 8 (quadword) |
|
Address that is a multiple of 8 (quadword) |
|
Address that is a multiple of 4 (longword) |
|
Address that is a multiple of 8 (quadword) |
|
Address that is a multiple of 16 (octaword) |
For aggregates such as strings, arrays, and records, the data type to be considered for purposes of alignment is not the aggregate itself, but rather the elements of which the aggregate is composed. The alignment requirement of an aggregate is that all elements of the aggregate be naturally aligned. For example, varying 8-bit character strings must start at addresses that are a multiple of at least 2 (word alignment) because of the 16-bit count at the beginning of the string; 32-bit integer arrays start at a longword boundary, irrespective of the extent of the array.
However, some languages allow definition of aggregate types with an alignment that is greater than that of any of its components, or provide predefined types with such an alignment (for example, the __m128, __m256, and __m512 types in C/C++ for x86-64). The alignment of such types becomes the natural alignment for elements of those types when included in a containing aggregate.
The rules for passing a record in an argument that is passed by immediate value (see Section 5.7, “Parameter and Return Value Passing”) always provide quadword alignment of the record value independent of the normal alignment requirement of the record. If deemed appropriate by an implementation, normal alignment can be established within the called procedure by making a copy of the record argument at a suitably aligned location.
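The rule that an aggregate takes its alignment from its elements can be checked with a C11 compiler. The sketch below models the two examples given earlier in this section: a varying 8-bit character string with a 16-bit count, and an array of 32-bit integers. The type names are hypothetical, and the asserted values are what a typical x86-64 C compiler produces; they are not guaranteed by the C language itself.

```c
#include <stddef.h>
#include <stdint.h>

/* Varying 8-bit character string: a 16-bit count followed by the
 * characters. The count forces word (2-byte) alignment. */
struct demo_varying_string {
    uint16_t length;
    char body[13];
};

/* Array of 32-bit integers: longword aligned whatever its extent. */
typedef int32_t demo_int_array[100];

_Static_assert(_Alignof(struct demo_varying_string) == 2,
               "varying string is word aligned");
_Static_assert(_Alignof(demo_int_array) == 4,
               "32-bit integer array is longword aligned");
```

The 13-character body leaves the structure at 15 bytes of data, so the compiler pads it to 16 bytes to keep its length a multiple of its alignment.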
5.9.2. Record Layout Conventions
The OpenVMS x86-64 calling standard rules for record layout are designed to provide good run-time performance on all implementations of the x86-64 architecture and to provide the required level of compatibility with conventional VAX, Alpha, and I64 operating environments.
Those optimized for optimal access characteristics (referred to as aligned record layouts)
Those compatible with conventions that are traditionally used by VAX languages (referred to as VAX compatible record layouts)
Only these record layouts may be used across standard interfaces or between languages. Languages can support other language-specific record layout conventions, but such layouts are nonstandard.
The aligned record layout conventions should be used unless interchange is required with conventional VAX applications that use the OpenVMS VAX compatible record layouts.
5.9.2.1. Aligned Record Layout
All components of a record or subrecord are naturally aligned.
Layout and alignment of record elements and subrecords are independent of any record or subrecord in which they are embedded.
Layout and alignment of a subrecord is the same as if it were a top-level record.
Declaration in high-level languages of standard records for interlanguage use is straightforward and obvious, and meets the requirements for source-level compatibility between OpenVMS x86-64 languages and OpenVMS I64, Alpha, and VAX languages.
The components of a record must be laid out in memory corresponding to the lexical order of their appearance in the high-level language declaration of the record.
The first bit of a record or subrecord must be directly addressable (byte aligned).
Records and subrecords must be aligned according to the largest natural alignment requirements of the contained elements and subrecords.
Bit fields (packed subranges of integers) are characterized by an underlying integer type that is a byte, word, longword, or quadword in size together with an allocation size in bits. A bit field is allocated at the next available bit boundary, provided that the resulting allocation does not cross an alignment boundary of the underlying type. Otherwise, the field is allocated at the next byte boundary that is aligned as required for the underlying type. (In the latter case, the space skipped over is left permanently unallocated.) In addition, if necessary, the alignment of the record as a whole is increased to that of the underlying integer type.
Unaligned bit strings, unaligned bit arrays, and elements of unaligned bit arrays must start at the next available bit in the record. No fill is ever supplied preceding an unaligned bit string, unaligned bit array, or unaligned bit array element.
All other components of a record must start at the next available naturally aligned address for the data type.
The length of a record must be a multiple of its alignment. (This includes the case when a record is a component of another record).
Strings and arrays must be aligned according to the natural alignment requirements of the data type of which the string or array is composed.
The length of an array element is a multiple of its alignment, even if this leaves unused space at its end. The length of the whole array is the sum of the lengths of its elements.
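The placement rules for ordinary (non-bit) components can be expressed as a short calculation. The following sketch is illustrative only and covers just the case of scalar components whose natural alignment equals their size (1, 2, 4, 8, or 16 bytes); bit fields, unaligned bit strings, and subrecords are not handled. The function names are hypothetical.

```c
#include <stddef.h>

/* Round offset up to the next multiple of align (a power of two). */
static size_t demo_align_up(size_t offset, size_t align)
{
    return (offset + align - 1) & ~(align - 1);
}

/* Compute aligned record layout for n scalar components, in lexical
 * order. sizes[i] is both the size and the natural alignment of
 * component i. Stores each component offset in offsets[], stores the
 * record alignment in *record_align, and returns the record length. */
static size_t demo_aligned_layout(const size_t *sizes, size_t n,
                                  size_t *offsets, size_t *record_align)
{
    size_t offset = 0;
    size_t max_align = 1;
    size_t i;

    for (i = 0; i < n; i++) {
        offset = demo_align_up(offset, sizes[i]); /* next aligned address */
        offsets[i] = offset;
        offset += sizes[i];
        if (sizes[i] > max_align)
            max_align = sizes[i];   /* record takes the largest alignment */
    }
    *record_align = max_align;
    return demo_align_up(offset, max_align); /* length is a multiple of alignment */
}
```

For components of 1, 4, 2, and 8 bytes, this places them at offsets 0, 4, 8, and 16 and yields a 24-byte record with quadword alignment.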
5.9.2.2. OpenVMS VAX Compatible Record Layout
The OpenVMS VAX compatible record layout is defined by the following conventions:
The components of a record must be laid out in memory corresponding to the lexical order of their appearance in the high-level language declaration of the record.
Unaligned bit strings, unaligned bit arrays, and elements of unaligned bit arrays must start at the next available bit in the record. No fill is ever supplied preceding an unaligned bit string, unaligned bit array, or unaligned bit array element.
All other components of a record must start at the next available byte in the record. Any unused bits following the last-used bit in the last-used byte of each component must be filled out to the next byte boundary so that any following data starts on a byte boundary.
Subrecords must be aligned according to the largest alignment of the contained elements and subrecords. A subrecord always starts at the next available byte unless it consists entirely of unaligned bit data and it immediately follows an unaligned bit string, unaligned bit array, or a subrecord consisting entirely of unaligned bit data.
Records must be aligned on byte boundaries.
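For comparison, the VAX compatible conventions place each non-bit component at the next available byte, with no padding for natural alignment. The sketch below is a simplified illustration that handles only byte-sized scalar components, not unaligned bit data or subrecords. The function name is hypothetical.

```c
#include <stddef.h>

/* Compute VAX compatible record layout for n scalar components, in
 * lexical order. Each component starts at the next available byte.
 * Stores each component offset in offsets[] and returns the record
 * length in bytes. */
static size_t demo_vax_layout(const size_t *sizes, size_t n, size_t *offsets)
{
    size_t offset = 0;
    size_t i;

    for (i = 0; i < n; i++) {
        offsets[i] = offset;  /* no fill is inserted before a component */
        offset += sizes[i];
    }
    return offset;
}
```

For components of 1, 4, 2, and 8 bytes, this yields offsets 0, 1, 5, and 7 and a 15-byte record, which is why records in this layout cannot be exchanged with code that expects the aligned layout.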
5.10. Addressing
Industry standard conventions for x86-64 Position Independent Code (PIC) generally make use of a Global Offset Table (GOT) to facilitate addressing code and data that is not known or assured to be within a 32-bit offset of the reference. The GOT is itself a data segment that is assured “near” the code so that PC-relative addressing with a 32-bit offset is sufficient to access that GOT. The GOT holds 64-bit addresses that allow access to any location in the system 64-bit address space.
5.10.1. Memory Models
The small code model: all code and data are within 2 GB.
The large code model: code and data are not limited to 2 GB.
The medium code model: code and data are assumed to be within 2 GB, except that data specifically marked as large model may lie outside that range.
OpenVMS compilers generate small model position-independent code using indirect addressing of all data to allow static data to be farther than 2 GB away from code. Because direct addressing is used only for entries in the Global Offset Table, OpenVMS compilers do not distinguish between the small and medium memory models. In effect, OpenVMS compilers support the medium data model for applications.
Foreign compilers and object modules may use any memory model. The OpenVMS linker and image activator support all memory models.
5.10.2. Inter-Segment Addressing
In industry standards for x86-64, shareable images may be loaded anywhere, but all segments within a shared library must have the same positions relative to each other that they were assigned by the linker. On OpenVMS x86-64, the image activator may map (logically load) segments of a shareable image independently of each other.
The independent loading of segments influences the way code addresses data. Industry standard x86-64 code uses PC-relative addressing to access not only the Global Offset Table, but also any other data that is known to be local to the image. Because segments may be mapped independently, this standard requires that code use indirect addressing to access all data except for the Global Offset Table. With this scheme, the code segment and the Global Offset Table (linkage) segment are the only segments whose relative positions have to be maintained.
In an image with multiple code segments, each code segment has its own Global Offset Table.
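The indirect addressing scheme can be pictured in portable C as a table of addresses that is filled in when the image is activated. This is a conceptual sketch only: real Global Offset Table entries are created by the linker and the image activator, not by program code, and all names below are hypothetical.

```c
#include <stdint.h>

/* Data that may be mapped far away from the code that uses it. */
static int64_t demo_counter = 41;

/* Stand-in for a Global Offset Table: each entry holds a full 64-bit
 * address. The table itself is assumed to stay near the code. */
static void *demo_got[1];

/* Stand-in for image activation: store the final address of the data. */
static void demo_activate(void)
{
    demo_got[0] = &demo_counter;
}

/* The code reaches the data only through its table entry, so the data
 * segment can be placed independently of the code segment. */
static int64_t demo_increment(void)
{
    int64_t *p = (int64_t *)demo_got[0];
    return ++*p;
}
```

Only the table needs to be reachable with a 32-bit PC-relative offset; each entry holds a full 64-bit address, so the data can be mapped anywhere.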
Non-VSI compilers and object modules may assume a small code model and use PC-relative data addressing exclusively. Both the linker and the image activator maintain the relative positions of code segments, Global Offset Tables, and other segments that are referenced in a PC-relative manner. In theory, the code could be adjusted with image relocations; in practice, the limited address range of the small code model (±2 GB) precludes this.
Chapter 6. Signature Information and Translated Images (Alpha and I64 Systems)
To support interoperation between images built from native OpenVMS Alpha code and images translated from OpenVMS VAX code, native Alpha compilers can optionally generate information that describes the parameters and result of a procedure. Similarly, for interoperation between images built from native OpenVMS I64 code and images translated from VAX or Alpha code, I64 compilers can also optionally generate information that describes the parameters and result of a procedure. This auxiliary information is called signature information.
Translated VAX code on Alpha and I64 systems uses VAX argument list and function return conventions as described in Section 2.4, “Argument List” and Section 2.5, “Function Value Returns”.
Translated Alpha code on I64 systems uses Alpha argument list and function return conventions as described in Chapter 3, OpenVMS Alpha Conventions.
The following sections describe the conventions for using signature information to control the passing of arguments and returning a function value when a native procedure passes control to a translated procedure and vice versa.
Mediates calls between native and translated code
Controls execution of translated code
Performs interpretation where necessary