VSI OpenVMS Calling Standard

Document Number: DO–DCALLST–01A
Publication Date: April 2024
Operating System and Version:
VSI OpenVMS x86-64 Version 9.2-1 or higher
VSI OpenVMS IA-64 Version 8.4-1H1 or higher
VSI OpenVMS Alpha Version 8.4-2L1 or higher

Preface

The VSI OpenVMS Calling Standard defines the requirements, mechanisms, and conventions that support procedure-to-procedure calls for OpenVMS VAX, OpenVMS Alpha, OpenVMS Industry Standard 64 (I64), and OpenVMS x86-64. The standard defines the run-time data structures, constants, algorithms, conventions, methods, and functional interfaces that enable a native user-mode procedure to operate correctly in a multilanguage environment on VAX, Alpha, Itanium®, and x86-64 systems. Properties of the run-time environment that must apply at various points during program execution are also defined.

The 32-bit user mode of OpenVMS Alpha provides a high degree of compatibility with programs written for OpenVMS VAX.

The 64-bit user mode of OpenVMS Alpha is a compatible superset of the OpenVMS Alpha 32-bit user mode.

The 32-bit and 64-bit user modes of OpenVMS I64 and x86-64 are highly compatible with OpenVMS Alpha.

The interfaces, methods, and conventions specified in this manual are primarily intended for use by implementers of compilers, debuggers, and other run-time tools, run-time libraries, and base operating systems. These specifications may or may not be appropriate for use by higher-level system software and applications.

This standard is under engineering change order (ECO) control. ECOs are approved by VSI's OpenVMS Calling Standard committee.

1. About VSI

VMS Software, Inc. (VSI) is an independent software company licensed by Hewlett Packard Enterprise to develop and support the OpenVMS operating system.

2. Intended Audience

This manual primarily defines requirements for developers of compilers and debuggers, but the information can apply to procedure calling for all programmers.

3. Document Structure

This manual contains the following chapters and appendixes:

Chapter 1 provides an overview of the standard, defines goals, and defines terms used in the text.

Chapter 2 describes the primary conventions in calling a procedure in an OpenVMS VAX environment. It defines register usage and addressing as well as vector and scalar processor synchronization.

Chapter 3 describes the fundamental concepts and conventions in calling a procedure in an OpenVMS Alpha environment. The chapter defines register usage and addressing, and focuses on aspects of the calling standard that pertain to procedure-to-procedure flow of control.

Chapter 4 describes the fundamental concepts and conventions in calling a procedure in an OpenVMS I64 environment. The chapter defines register usage and addressing, and focuses on aspects of the calling standard that pertain to procedure-to-procedure flow of control.

Chapter 5 describes the fundamental concepts and conventions in calling a procedure in an OpenVMS x86-64 environment. The chapter defines register usage and addressing, and focuses on aspects of the calling standard that pertain to procedure-to-procedure flow of control.

Chapter 6 describes signature information and its role in interfacing with translated OpenVMS VAX and Alpha images on Alpha and I64 systems.

Chapter 7 defines the argument-passing data types used in calling a procedure for all OpenVMS environments.

Chapter 8 defines the argument descriptors used in calling a procedure for all OpenVMS environments.

Chapter 9 describes the OpenVMS condition and exception handling requirements for all OpenVMS environments.

Appendix A describes stack unwinding and exception handling for OpenVMS I64 environments.

Appendix B describes stack unwinding and exception handling for OpenVMS x86-64 environments.

Appendix C contains a brief summary of the differences between this calling standard and the Intel Itanium and industry x86-64 software conventions.

4. Related Documents

The following manuals contain related information:
  • VAX Architecture Reference Manual

  • Alpha Architecture Reference Manual

  • OpenVMS Programming Interfaces: Calling a System Routine

  • Guide to POSIX Threads Library

  • VAX/VMS Internals and Data Structures

  • OpenVMS AXP Internals and Data Structures

  • Itanium® Software Conventions and Runtime Architecture Guide

  • Intel IA-64 Architecture Software Developer's Manual

  • Intel 64 and IA-32 Architectures Software Developer Manuals

  • System V Application Binary Interface, AMD64 Architecture Processor Supplement, Version 1.0

  • Linux Standard Base, Version 5.0

5. VSI Encourages Your Comments

You may send comments or suggestions regarding this manual or any VSI document by sending electronic mail to the following Internet address: . Users who have VSI OpenVMS support contracts through VSI can contact for help with this product.

6. OpenVMS Documentation

The full VSI OpenVMS documentation set can be found on the VMS Software Documentation webpage at https://docs.vmssoftware.com.

7. Typographical Conventions

The following conventions are used in this manual:

Convention        Meaning
Ctrl/x            A sequence such as Ctrl/x indicates that you must hold down the key labeled Ctrl while you press another key or a pointing device button.
PF1 x             A sequence such as PF1 x indicates that you must first press and release the key labeled PF1 and then press and release another key (x) or a pointing device button.
...               A horizontal ellipsis in examples indicates one of the following possibilities:
                    • Additional optional arguments in a statement have been omitted.
                    • The preceding item or items can be repeated one or more times.
                    • Additional parameters, values, or other information can be entered.
.
.
.                 A vertical ellipsis indicates the omission of items from a code example or command format; the items are omitted because they are not important to the topic being discussed.
( )               In command format descriptions, parentheses indicate that you must enclose choices in parentheses if you specify more than one.
[ ]               In command format descriptions, brackets indicate optional choices. You can choose one or more items or no items. Do not type the brackets on the command line. However, you must include the brackets in the syntax for directory specifications and for a substring specification in an assignment statement.
|                 In command format descriptions, vertical bars separate choices within brackets or braces. Within brackets, the choices are optional; within braces, at least one choice is required. Do not type the vertical bars on the command line.
{ }               In command format descriptions, braces indicate required choices; you must choose at least one of the items listed. Do not type the braces on the command line.
bold type         Bold type represents the name of an argument, an attribute, or a reason. Bold type also represents the introduction of a new term.
italic type       Italic type indicates important information, complete titles of manuals, or variables. Variables include information that varies in system output (Internal error number), in command lines (/PRODUCER=name), and in command parameters in text (where dd represents the predefined code for the device type).
UPPERCASE TYPE    Uppercase type indicates a command, the name of a routine, the name of a file, or the abbreviation for a system privilege.
Example           This typeface indicates code examples, command examples, and interactive screen displays. In text, this type also identifies website addresses, UNIX commands and pathnames, PC-based commands and folders, and certain elements of the C programming language.
-                 A hyphen at the end of a command format description, command line, or code line indicates that the command or statement continues on the following line.
numbers           All numbers in text are assumed to be decimal unless otherwise noted. Nondecimal radixes (binary, octal, or hexadecimal) are explicitly indicated.

Chapter 1. Introduction

This standard defines properties such as the run-time data structures, constants, algorithms, conventions, methods, and functional interfaces that enable a native user-mode procedure to operate correctly in a multilanguage and multithreaded environment on OpenVMS VAX, OpenVMS Alpha, OpenVMS I64, and OpenVMS x86-64 systems. These properties include the contents of key registers, format and contents of certain data structures, and actions that procedures must perform under certain circumstances.

This standard also defines properties of the run-time environment that must apply at various points during program execution. These properties vary in scope and applicability. Some properties apply at all points throughout the execution of standard-conforming user-mode code and must, therefore, be held constant at all times. Examples of such properties include those defined for the stack pointer and various properties of the call stack navigation mechanism. Other properties apply only at certain points, such as call conventions that apply only at the point of transfer of control to another procedure.

Furthermore, some properties are optional depending on circumstances. For example, compilers are not obligated to follow the argument list conventions when a procedure and all of its callers are in the same module, have been analyzed by an interprocedural analyzer, or have private interfaces (such as language-support routines).

Note

In many cases, significant performance gains can be realized by selective use of nonstandard calls when the safety of such calls is known. Developers of compilers and other tools are encouraged to make full use of such optimizations.

The procedure call mechanism depends on agreement between the calling and called procedures to interpret the argument list. The argument list does not fully describe itself. This standard requires language extensions to permit a calling program to generate some of the argument-passing mechanisms expected by called procedures.

This standard specifies the following attributes of the interfaces between modules:
  • Calling sequence—instructions at the call site, entry point, and returns

  • Argument list—structure of the list describing the arguments to the called procedure

  • Function value return—form and conventions for the return of the function value as a value or as a condition value to indicate success or failure

  • Register usage—which registers are preserved and who is responsible for preserving them

  • Stack usage—rules governing the use of the stack

  • Argument data types—data types of arguments that can be passed

  • Argument descriptor formats—how descriptors are passed for the more complex arguments

  • Condition handling—how exception conditions are signaled and how they are handled in a modular fashion

  • Stack unwinding—how the current thread of execution is aborted efficiently.

1.1. Applicability

This standard defines the rules and conventions that govern the native user-mode run-time environment on OpenVMS VAX, Alpha, I64, and x86-64 systems. It is applicable to all software that executes in OpenVMS native user mode.

Uses of this standard include:
  • All externally callable interfaces in OpenVMS-supported standard system software

  • All intermodule calls to major software components

  • All external procedure calls generated by OpenVMS language processors without interprocedural analysis or permanent private conventions (such as those used for language-support run-time library [RTL] routines).

1.2. Architectural Level

This standard defines an implementation-level run-time software architecture for OpenVMS operating systems.

The interfaces, methods, and conventions specified in this document are primarily intended for use by implementers of compilers, debuggers, and other run-time tools, run-time libraries, and base operating systems. These specifications may or may not be appropriate for use by higher-level system software and applications.

Compilers and run-time libraries may provide additional support of these capabilities via interfaces that are more suited for compiler and application use. This specification neither prohibits nor requires such additional interfaces.

1.3. Goals

Generally, this calling standard promotes the highest degree of performance, portability, efficiency, and consistency in the interface between called procedures of a common OpenVMS environment. Specifically, the calling standard:
  • Applies to all intermodule callable interfaces in the native software system. Specifically, the standard considers the requirements of important compiled languages including Ada, BASIC, BLISS, C, C++, COBOL, Fortran, Pascal, LISP, PL/I, and calls to the operating system and library procedures. The needs of other languages that the OpenVMS operating system may support in the future must be met by the standard or by compatible revisions to it.

  • Excludes capabilities for lower-level components (such as assembler routines) that cannot be invoked from the high-level languages.

  • Allows the calling program and called procedure to be written in different languages. The standard reduces the need for using language extensions in mixed-language programs.

  • Contributes to the writing of error-free, modular, and maintainable software, and promotes effective sharing and reuse of software modules.

  • Provides the programmer with control over fixing, reporting, and flow of control when various types of exception conditions occur.

  • Provides subsystem and application writers with the ability to override system messages toward a more suitable application-oriented interface.

  • Adds no space or time overhead to procedure calls and returns that do not establish exception handlers, and minimizes time overhead for establishing handlers at the cost of increased time overhead when exceptions occur.

The portion of this standard specific to OpenVMS Alpha:
  • Supports a 32-bit user-mode environment that provides a high degree of compatibility with the OpenVMS VAX environment.

  • Supports a 64-bit user-mode environment that is a compatible superset of the OpenVMS Alpha 32-bit environment.

  • Simplifies coexistence with OpenVMS VAX procedures that execute under the translated image environment.

  • Simplifies the compilation of OpenVMS VAX assembler source to native OpenVMS Alpha object code.

  • Supports a multilanguage, multithreaded execution environment, including efficient, effective support for the implementation of the multithreaded architecture.

  • Provides an efficient mechanism for calling lightweight procedures that do not need or cannot expend the overhead of setting up a stack call frame.

  • Provides for the use of a common calling sequence to invoke lightweight procedures that maintain only a register call frame and heavyweight procedures that maintain a stack call frame. This calling sequence allows a compiler to determine whether to use a stack frame based on the complexity of the procedure being compiled. A recompilation of a called routine that causes a change in stack frame usage does not require a recompilation of its callers.

  • Provides condition handling, traceback, and debugging for lightweight procedures that do not have a stack frame.

  • Makes efficient use of the Alpha architecture, including effectively using a larger number of registers than is contained in a conventional VAX processor.

  • Minimizes the cost of procedure calls.

The portion of this standard specific to OpenVMS I64:
  • Extends all of the goals listed above for the OpenVMS Alpha environment to the OpenVMS I64 environment.

  • Supports a 64-bit user-mode environment that is highly compatible with the OpenVMS Alpha 64-bit user-mode environment.

  • Makes efficient use of the Itanium architecture, including using a larger number of registers than is contained in a conventional Alpha processor, as well as additional I64 architecture features.

  • Follows conventions established for Intel Itanium processor software generally except where required to preserve compatibility with OpenVMS VAX and Alpha environments.

The portion of this standard specific to OpenVMS x86-64:
  • Extends all of the goals of the earlier OpenVMS environments to x86-64 compatible systems.

  • Follows industry conventions established for the Intel and AMD compatible x86-64 processor software generally except where required to preserve compatibility with OpenVMS for earlier environments.

The OpenVMS procedure calling mechanisms of this standard do not provide:
  • Checking of argument data types, data structures, and parameter access. The OpenVMS protection and memory management systems do not depend on correct interactions between user-level calling and called procedures. Such extended checking might be desirable in some circumstances, but system integrity does not depend on it.

  • Information for an interpretive OpenVMS Debugger. The definition of the debugger includes a debug symbol table (DST) that contains the required descriptive information.

1.4. Definitions

The following terms are used in this standard:
  • Address: On OpenVMS VAX systems, a 32-bit value used to denote a position in memory. On OpenVMS Alpha, OpenVMS I64, and OpenVMS x86-64 systems (collectively referred to as the 64-bit systems), a 64-bit value used to denote a position in memory. However, many 64-bit applications and user-mode facilities operate in such a manner that addresses are restricted to values that are representable in 32 bits. This often allows addresses on 64-bit systems to be stored and manipulated as 32-bit longword values. In such cases, the 32-bit address value is always implicitly or explicitly sign-extended to form a 64-bit address for use by the hardware.

  • Argument list: A vector of entries (longwords on OpenVMS VAX, quadwords on 64-bit systems) that represents a procedure parameter list and possibly a function value.

  • Asynchronous software interrupt: An asynchronous interruption of normal code flow caused by some software event. This interruption shares many of the properties of hardware exceptions, including forcing some out-of-line code to execute.

  • Bound procedure: A type of procedure that requires knowledge (at run-time) of a dynamically determined larger enclosing scope to function correctly.

  • Call frame: The body of information that a procedure must save to allow it to properly return to its caller. A call frame may exist on the stack or in registers. A call frame may optionally contain additional information required by the called procedure.

  • Condition handler: A procedure designed to handle conditions (exceptions) when they occur during the execution of a thread.

  • Condition value: A 32-bit value (sign-extended to a 64-bit value on 64-bit systems) used to uniquely identify an exception condition. A condition value can be returned to a calling program as a function value or it can be signaled using the OpenVMS signaling mechanism.

  • Descriptor: A mechanism for passing parameters where the address of a descriptor is an entry in the argument list. The descriptor contains the address of the parameter, data type, size, and additional information needed to describe fully the data passed.

  • Exception condition (or condition): An exceptional condition in the current hardware or software state that should be noted or fixed. Its existence causes an interruption in program flow and forces execution of out-of-line code. Such an event might be caused by an exceptional hardware state, such as arithmetic overflows, memory access control violations, and so on, or by actions performed by software, such as subscript range checking, assertion checking, or asynchronous notification of one thread by another.

    During the time the normal control flow is interrupted by an exception, that condition is termed active.

  • Function: A procedure that returns a single value in accordance with the standard conventions for value returning. Additional values may be returned by means of the argument list.

  • Function pointer: See Procedure value.

  • Function value: Depending on context, either 1) a value that is returned as a result of calling a procedure, or 2) a procedure value (see below).

  • Hardware exception: A category of exceptions that reflect an exceptional condition in the current hardware state that should be noted or fixed by the software. Hardware exceptions can occur synchronously or asynchronously with respect to the normal program flow.

  • IP (I64 platforms): Instruction pointer—a value that identifies a bundle of instructions in memory; the address of the first (lowest addressed) byte of an aligned 16-byte sequence that encodes three Itanium architecture instructions. See also PC.

  • IP (x86-64 platforms): Instruction pointer—an address that identifies an instruction in memory. See also PC.

  • Immediate value: A mechanism for passing input parameters where the actual value is provided in the argument list entry by the calling program.

  • Language-support procedure: A procedure called implicitly to implement high-level language constructs. Such procedures are not intended to be explicitly called from user programs.

  • Leaf procedure: A procedure that makes no outbound calls. Conversely, a non-leaf procedure is one that does make outbound calls.

  • Library procedure: A procedure explicitly called using the equivalent of a call statement or function reference. Such procedures are usually language independent.

  • Natural alignment: An attribute of certain data types that refers to the placement of the data so that the lowest addressed byte of the data has an address that is a multiple of the size of the data in bytes. Natural alignment of an aggregate data type generally refers to an alignment in which all members of the aggregate are naturally aligned.

    This standard defines five natural alignments:
    • Byte—Any byte address

    • Word—Any byte address that is a multiple of 2

    • Longword—Any byte address that is a multiple of 4

    • Quadword—Any byte address that is a multiple of 8

    • Octaword—Any byte address that is a multiple of 16

  • PC: A value that identifies an instruction in memory. On OpenVMS VAX, Alpha, and x86-64 systems, the address of the first (lowest addressed) byte of the sequence (unaligned on VAX and x86-64, longword aligned on Alpha) that holds the instruction. On OpenVMS I64, the IP (see above) of the bundle that contains the instruction added to the number of the slot (0, 1, or 2) for that instruction within the bundle. Sometimes used as a synonym or generic alternative to IP.

  • Procedure: A closed sequence of instructions that is entered from and returns control to the calling program.

  • Procedure value: An address value that represents a procedure. On OpenVMS VAX systems, a procedure value is the address of the entry mask that is interpreted by the CALLx instruction invoking the procedure. On OpenVMS Alpha systems, a procedure value is the address of the procedure descriptor for the procedure. On OpenVMS I64 systems, a procedure value is the address of a function descriptor for the procedure; it is also known as a function pointer. On OpenVMS x86-64 systems, a procedure value is a 32-bit address for either the entry point of a procedure or, if the entry point address is not representable in 32 bits, a 32-bit address for trampoline code that jumps to the actual entry point; the trampoline code may be created by the linker or be created dynamically in the case of a bound procedure value.

  • Process: An address space and at least one thread of execution. Selected security and quota checks are done on a per-process basis.

    This standard anticipates the possibility of the execution of multiple threads within a process. An operating system that provides only a single thread of execution per process is considered a special case of a multithreaded system where the maximum number of threads per process is one.

  • Reference: A mechanism for passing parameters where the address of the parameter is provided in the argument list by the calling program.

  • Routine: Synonym for procedure or function.

  • Signal: A POSIX-defined concept used to cause out-of-line execution of code. (This term should not be confused with the OpenVMS usage of the word, which more closely equates to exception as used in this document.)

  • Standard call: Any transfer of control to a procedure by any means that presents the called procedure with the environment defined by this document and does not place additional restrictions, not defined by this document, on the called procedure.

  • Standard-conforming procedure: A procedure that adheres to all the relevant rules set forth in this document.

  • Thread of execution (or thread): An entity scheduled for execution on a processor. In language terms, a thread is a computational entity used by a program unit. Such a program unit might be a task, procedure, loop, or some other unit of computation.

    All threads executing within the same process share the same address space and other process contexts, but they have a unique per-thread hardware context that includes program counter, processor status, stack pointer, and other machine registers.

    This standard applies only to threads that execute within the context of a user-mode process and are scheduled on one or more processors according to software priority. All subsequent uses of the term thread in this standard refer only to such user-mode process threads.

  • Thread-safe code: Code that is compiled in such a way as to ensure that it executes properly when run in a threaded environment. Thread-safe code usually adds extra instructions to do certain run-time checks and requires that thread local storage be accessed in a particular fashion.

  • Trampoline: A code fragment (often just one or a very few instructions) that forwards a jump or call.

  • Undefined: Referring to operations or behavior for which there is no directing algorithm used across all implementations that support this standard. Such operations may be well defined for a particular implementation, but they still remain undefined with reference to this standard. The actions of undefined operations may not be required by standard-conforming procedures.

  • Unpredictable: Referring to the results of an operation that cannot be guaranteed across all implementations of this standard. These results may be well defined for a particular implementation, but they remain unpredictable with reference to this standard. All results that are not specified in this standard, but are caused by operations defined in this standard, are considered unpredictable. A standard-conforming procedure cannot depend on unpredictable results.
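Two of the definitions above, the sign extension of 32-bit addresses on the 64-bit systems and the five natural alignments, can be illustrated with a short sketch. The following Python fragment is only a model written for this purpose; the function names and the alignment table are hypothetical and are not interfaces defined by this standard.

```python
# Illustrative model only; these names are not part of the calling standard.

def sign_extend_32(value: int) -> int:
    """Sign-extend a 32-bit longword to a 64-bit address (as an unsigned int)."""
    value &= 0xFFFFFFFF
    if value & 0x80000000:            # bit 31 set: replicate it into bits 63..32
        value |= 0xFFFFFFFF00000000
    return value

# Natural alignments named in the definition, in bytes.
NATURAL_ALIGNMENT = {"byte": 1, "word": 2, "longword": 4, "quadword": 8, "octaword": 16}

def is_naturally_aligned(address: int, kind: str) -> bool:
    """True if the address is a multiple of the size of the named data type."""
    return address % NATURAL_ALIGNMENT[kind] == 0

print(hex(sign_extend_32(0x7FFF0000)))          # 0x7fff0000
print(hex(sign_extend_32(0x80000000)))          # 0xffffffff80000000
print(is_naturally_aligned(0x1008, "quadword")) # True
print(is_naturally_aligned(0x1008, "octaword")) # False
```

A 32-bit value with bit 31 clear maps to the same numeric address, while a value with bit 31 set maps to the top of the 64-bit address space.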

Chapter 2. OpenVMS VAX Conventions

This chapter describes the primary conventions in calling a procedure in an OpenVMS VAX environment.

2.1. Register Usage

In the VAX architecture, there are fifteen 32-bit-wide, general-purpose hardware registers for use with scalar and vector program operations. This section defines the rules of scalar and vector register usage.

2.1.1. Scalar Register Usage

This standard defines several general-purpose VAX registers and their scalar use as listed in Table 2.1.
Table 2.1. VAX Register Usage
Register    Use
PC          Program counter.
SP          Stack pointer.
FP          Current stack frame pointer. This register must always point at the current frame. No modification is permitted within a procedure body.
AP          Argument pointer. When a call occurs, AP must point to a valid argument list. For a procedure without parameters, AP points to an argument list consisting of a single longword containing the value 0.
R1          Environment value. When a procedure that needs an environment value is called, the calling program must set R1 to the environment value. See bound procedure value in Section 7.3.
R0, R1      Function value return registers. These registers are not to be preserved by any called procedure. They are available as temporary registers to any called procedure.

Registers R2 through R11 are to be preserved across procedure calls. The called procedure can use these registers, provided it saves and restores them using the procedure entry mask mechanism. The entry mask mechanism must be used so that any stack unwinding done by the condition handling mechanism restores all registers correctly. In addition, PC, FP, and AP are always preserved in the stack frame (see Section 2.2) by the CALLS or CALLG instruction and restored by the RET instruction. However, a called procedure can use AP as a temporary register.

If JSB routines are used, they must not save or modify any preserved registers (R2 through R11) not already saved by the entry mask mechanism of the calling program.
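The entry mask mechanism referred to above can be made concrete with a small sketch. The following Python fragment builds the 16-bit procedure entry mask for a set of registers to be preserved. The bit assignments (bits 0 through 11 for R0 through R11, bit 14 for integer overflow enable, bit 15 for decimal overflow enable, bits 12 and 13 zero) are assumed from the VAX architecture's definition of the CALLG and CALLS instructions rather than from this section, and the function name is hypothetical.

```python
# Illustrative model only. Bit layout assumed from the VAX architecture;
# entry_mask() is a hypothetical helper, not an OpenVMS interface.

PRESERVED = range(2, 12)   # R2 through R11 are preserved across calls

def entry_mask(saved_registers, integer_overflow=False, decimal_overflow=False):
    """Return the entry mask word for the registers a procedure will modify."""
    mask = 0
    for reg in saved_registers:
        if reg not in PRESERVED:
            # R0 and R1 are scratch; AP, FP, SP, and PC are handled by CALLx/RET.
            raise ValueError(f"R{reg} is not saved through the entry mask")
        mask |= 1 << reg
    if integer_overflow:
        mask |= 1 << 14        # IV enable bit
    if decimal_overflow:
        mask |= 1 << 15        # DV enable bit
    return mask

print(hex(entry_mask([2, 3, 4])))       # 0x1c
print(hex(entry_mask(range(2, 12))))    # 0xffc
```

Rejecting R0 and R1 reflects the rule in Table 2.1 that the function value return registers are not preserved by a called procedure.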

2.1.2. Vector Register Usage

This calling standard does not specify conventions for preserved vector registers, vector argument registers, or vector function value return registers. All such conventions are by agreement between the calling and called procedures. In the absence of such an agreement, all vector registers, including V0 through V15, VLR, VCR, and VMR are scratch registers. Among cooperating procedures, a procedure that preserves or otherwise manipulates the vector registers by agreement with its callers must provide an exception handler to restore them during an unwind.

2.2. Stack Usage

Figure 2.1 shows the contents of the stack frame created for the called procedure by the CALLG or CALLS instruction.

Figure 2.1. Stack Frame Generated by CALLG or CALLS Instruction

FP always points to the call frame (the condition-handler longword) of the calling procedure. Other uses of FP within a procedure are prohibited. The bottom of the call stack is reached at the stack frame whose preserved FP is 0. Unless the procedure has a condition handler, the condition-handler longword contains all zeros. See Chapter 9 for more information on condition handlers.

The contents of the stack located at addresses higher than the mask/PSW longword belong to the calling program; they should not be read or written by the called procedure, except as specified in the argument list. The contents of the stack located at addresses lower than SP belong to interrupt and exception routines; they are modified continually and unpredictably.

The called procedure allocates local storage by subtracting the required number of bytes from the SP provided on entry. This local storage is freed automatically by the return instruction (RET).

Bit <28> of the mask/PSW longword is reserved to OpenVMS for future extensions to the stack frame.
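The frame contents described in this section can be modeled in a few lines. The following Python sketch decodes a call frame from a little-endian memory image and follows the chain of preserved FP values until it finds a preserved FP of 0. The field order (condition handler, mask/PSW, saved AP, saved FP, saved PC, then the saved registers) and the bit positions inside the mask/PSW longword are assumptions taken from the VAX architecture's description of CALLG and CALLS, not recovered from the figure; the function names and the toy memory image are hypothetical.

```python
import struct

# Illustrative model only. Layout assumed from the VAX architecture;
# all names and the sample memory image are hypothetical.

def decode_mask_psw(longword):
    """Split the mask/PSW longword of a call frame into its fields."""
    return {
        "spa": (longword >> 30) & 0x3,        # low bits of SP before CALLx aligned it
        "calls": bool((longword >> 29) & 1),  # True for CALLS, False for CALLG
        "bit_28": (longword >> 28) & 1,       # reserved to OpenVMS
        "saved_registers": [r for r in range(12) if (longword >> (16 + r)) & 1],
        "psw": longword & 0xFFE0,             # saved PSW bits <15:5>
    }

def read_frame(memory, fp):
    """Read the five fixed longwords of the frame that FP points to."""
    handler, mask_psw, saved_ap, saved_fp, saved_pc = struct.unpack_from("<5L", memory, fp)
    frame = decode_mask_psw(mask_psw)
    frame.update(handler=handler, saved_ap=saved_ap, saved_fp=saved_fp, saved_pc=saved_pc)
    return frame

def walk_call_stack(memory, fp):
    """Follow preserved FP values; a preserved FP of 0 marks the last frame."""
    frames = []
    while fp != 0:
        frame = read_frame(memory, fp)
        frames.append(frame)
        fp = frame["saved_fp"]
    return frames

# Toy image: a frame at 0x40 created by CALLS that saved R2 and R3,
# whose caller's frame at 0x60 is the last frame on the call stack.
memory = bytearray(256)
struct.pack_into("<5L", memory, 0x40, 0, (1 << 29) | (0b1100 << 16), 0x80, 0x60, 0x1234)
struct.pack_into("<5L", memory, 0x60, 0, 0, 0, 0, 0x5678)

for frame in walk_call_stack(memory, 0x40):
    print(hex(frame["saved_pc"]), frame["saved_registers"], frame["calls"])
```

In this model a condition-handler longword of 0 corresponds to a procedure that has not established a handler.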

2.3. Calling Sequence

At the option of the calling procedure, the called procedure is invoked using the CALLG or CALLS instruction, as follows:
     CALLG    arglst, proc
     CALLS    argcnt, proc
CALLS pushes the argument count argcnt onto the stack as a longword and sets the argument pointer, AP, to the top of the stack. The complete sequence using CALLS follows:
     push     argn
     .
     .
     .
     push     arg1
     CALLS    #n, proc

If the called procedure returns control to the calling procedure, control must return to the instruction immediately following the CALLG or CALLS instruction. Skip returns and GOTO returns are allowed only during stack unwind operations.

The called procedure returns control to the calling procedure by executing the RET instruction.
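The effect of the CALLS sequence on the argument portion of the stack can be sketched as follows. This Python fragment models only the pushes described above (arguments in reverse order, then the count longword, with AP left pointing at the count); it deliberately omits stack alignment and the call frame that the instruction also builds. The function name and the dictionary used as memory are hypothetical.

```python
# Illustrative model only; calls_sequence() is a hypothetical helper.

def calls_sequence(args, sp, memory):
    """Model 'push argn ... push arg1; CALLS #n, proc' and return the new AP."""
    if len(args) > 255:
        raise ValueError("the argument count must fit in one byte")
    for value in reversed(args):      # argn is pushed first, arg1 last
        sp -= 4                       # the stack grows toward lower addresses
        memory[sp] = value & 0xFFFFFFFF
    sp -= 4                           # CALLS pushes the count as a longword
    memory[sp] = len(args)
    return sp                         # AP is set to the top of the stack

memory = {}
ap = calls_sequence([10, 20, 30], 0x1000, memory)
print(hex(ap), memory[ap], [memory[ap + 4 * i] for i in (1, 2, 3)])
# 0xff0 3 [10, 20, 30]
```

Pushing the arguments in reverse order is what places the first argument at the lowest address after the count, immediately above AP.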

2.4. Argument List

The argument list is the primary means of passing information to and receiving results from a procedure.

2.4.1. Argument List Format

Figure 2.2 shows the argument list format.

Figure 2.2. Argument List Format

The first longword is always present and contains the argument count as an unsigned integer in the low byte. The 24 high-order bits are reserved and must be zero. To access the argument count, the called procedure must ignore the reserved bits and access the count as an unsigned byte (for example, MOVZBL, TSTB, or CMPB).
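As a small illustration of the access rule above, the following Python sketch reads the argument count as an unsigned byte, which is the effect of a MOVZBL on the first longword. The helper names are hypothetical, and the sample longword carries nonzero high-order bits only to show that they are ignored; a conforming caller would leave them zero.

```python
import struct

# Illustrative model only; helper names are hypothetical.

def argument_count(memory, ap):
    """Read the count as an unsigned byte at AP (VAX memory is little-endian)."""
    return memory[ap]

def argument_count_from_longword(longword):
    """Equivalent masking form when the whole longword is already in hand."""
    return longword & 0xFF

image = struct.pack("<L", 0xABCDEF03)   # reserved bits deliberately nonzero
print(argument_count(image, 0))                  # 3
print(argument_count_from_longword(0xABCDEF03))  # 3
```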

The remaining longwords can be one of the following:
  • An uninterpreted 32-bit value (by immediate value mechanism). If the called procedure expects fewer than 32 bits, it accesses the low-order bits and ignores the high-order bits.

  • An address (by reference mechanism). It is typically a pointer to a scalar data item, array, structure, record, or a procedure.

  • An address of a descriptor (by descriptor mechanism). See Chapter 8 for descriptor formats.

The standard permits programs to call by immediate value, by reference, by descriptor, or by combinations of these mechanisms. Interpretation of each argument list entry depends on agreement between the calling and called procedures. High-level languages use the reference or descriptor mechanisms for passing input parameters. OpenVMS system services and VAX BLISS, VAX C, VAX C++, or VAX MACRO programs use all three mechanisms.

A procedure with no arguments is called with a list consisting of a 0 argument count longword, as follows:
     CALLS    #0, proc 

A missing or null argument—for example, CALL SUB(A,,B)—is represented by an argument list entry consisting of a longword 0. Some procedures allow trailing null arguments to be omitted and others require all arguments. See each procedure's specification for details.

The argument list must be treated as read-only data by the called procedure and might be allocated in read-only memory at the option of the calling program.
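
As an illustration of the layout described in this section, the following Python sketch packs an argument list into little-endian longwords and reads the count back through the low byte only. The helper names are hypothetical, the addresses are made up, and every entry is treated as an opaque 32-bit value.

```python
import struct

def pack_arglist(entries):
    """Return the argument list as bytes: a count longword, then one longword per entry.

    A None entry stands for a missing (null) argument and is stored as longword 0.
    """
    if len(entries) > 255:
        raise ValueError("argument count must fit in the low byte")
    longwords = [len(entries)]                      # high 24 bits stay zero
    longwords += [0 if e is None else e & 0xFFFFFFFF for e in entries]
    return struct.pack("<%dL" % len(longwords), *longwords)

def argument_count(arglist):
    """Read the count as an unsigned byte, ignoring the reserved bits."""
    return arglist[0]

def entry(arglist, n):
    """Return entry n (1-based) as an unsigned 32-bit value."""
    return struct.unpack_from("<L", arglist, 4 * n)[0]

# CALL SUB(A,,B) with hypothetical addresses for A and B
arglist = pack_arglist([0x1000, None, 0x2000])
```

A procedure with no arguments produces a list that consists of the single count longword 0.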

2.4.2. Argument Lists and High-Level Languages

Functional notations for procedure calls in high-level languages are mapped into VAX argument lists according to the following rules:
  • Arguments are mapped from left to right to increasing argument list offsets. The leftmost (first) argument has an address of arglst+4, the next has an address of arglst+8, and so on. The only exception to this is when arglst+4 specifies where a function value is to be returned, in which case the first argument has an address of arglst+8, the second argument has an address of arglst+12, and so on. See Section 2.5 for more information.

  • Each argument position corresponds to a single VAX argument list entry. For the C and C++ languages, a floating-point argument or a record or struct that is larger than 32 bits may be passed by value using more than one VAX argument list entry. In this case, the argument count in the argument list reflects the actual number of argument list entries rather than the number of C or C++ language arguments.
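
The two mapping rules can be sketched as a small offset calculation. The Python below is illustrative only; the function names and the entries_per_argument parameter are assumptions made for this example.

```python
def entry_offsets(entries_per_argument, result_in_first_entry=False):
    """Return the byte offset from arglst of the first entry of each argument.

    entries_per_argument lists how many longword entries each source-level
    argument occupies (1 normally; more for a C or C++ value wider than 32 bits).
    """
    # arglst+4 is the first entry unless it is reserved for the function value.
    offset = 8 if result_in_first_entry else 4
    offsets = []
    for n_entries in entries_per_argument:
        offsets.append(offset)
        offset += 4 * n_entries
    return offsets

def entry_count(entries_per_argument):
    """Number of entries the source-level arguments occupy (not the argument count)."""
    return sum(entries_per_argument)
```

For example, three single-entry arguments land at offsets 4, 8, and 12, while a C call whose second argument occupies two entries yields offsets 4, 8, and 16 and an entry count of 4.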

2.4.2.1. Order of Argument Evaluation

Because most high-level languages do not specify the order of evaluation of arguments (with respect to side effects), those language processors can evaluate arguments in any convenient order.

In constructing an argument list on the stack, a language processor can evaluate arguments from right to left and push their values on the stack. If call-by-reference semantics are used, argument expressions can be evaluated from left to right, with pointers to the expression values or descriptors being pushed from right to left.

Note

The choice of argument evaluation order and code generation strategy is constrained only by the definition of the particular language. Do not write programs that depend on the order of evaluation of arguments.

2.4.2.2. Language Extensions for Argument Transmission

This calling standard permits arguments to be passed by immediate value, by reference, or by descriptor. By default, all language processors except VAX BLISS, VAX C, and VAX MACRO pass arguments by reference or by descriptor.

Language extensions are needed to reconcile the different argument-passing mechanisms. In addition to the default passing mechanism used, each language processor is required to give you explicit control, in the calling program, of the argument-passing mechanism for the data types supported by the language.

Table 2.2 lists various argument data-type groups. In the table, the value Yes means the language processor is responsible for providing the user with explicit control of that argument-passing mechanism group.
Table 2.2. Argument-Passing Mechanisms with User Explicit Control

Data Type Group      Section   Value   Reference   Descriptor
Atomic <= 32 bits    7.1       Yes     Yes         Yes
Atomic > 32 bits     7.1       No      Yes         Yes
String               7.2       No      Yes         Yes
Miscellaneous        7.3       No      No          No
Array                8         No      Yes         Yes

For example, VAX Fortran provides the following intrinsic compile-time functions:

%VAL(arg)

By immediate value mechanism. Corresponding argument list entry is the value of the argument arg as defined in the language.

%REF(arg)

By reference mechanism. Corresponding argument list entry contains the address of the value of the argument arg as defined in the language.

%DESCR(arg)

By descriptor mechanism. Corresponding argument list entry contains the address of a descriptor of the argument arg as defined in Chapter 8 and in the language.

Use these intrinsic functions in the syntax of a procedure call to control generation of the argument list. For example:
     CALL SUB1(%VAL(123), %REF(X), %DESCR(A))

For more information, see the VAX Fortran language documentation.

In other languages, you can achieve the same effect by specifying appropriate attributes in the declaration of SUB1 in the calling program. Thus, you might write the following after making the external declaration for SUB1:
     CALL SUB1 (123, X, A)

2.5. Function Value Returns

A function value is returned in register R0 if its data type can be represented in 32 bits, or in registers R0 and R1 if its data type can be represented in 64 bits, provided the data type is not a string data type (see Section 7.2).

If the data type requires fewer than 32 bits, then R1 and the high-order bits of R0 are undefined. If the data type requires 32 or more bits but fewer than 64 bits, then the high-order bits of R1 are undefined. Two separate 32-bit entities cannot be returned in R0 and R1 because high-level languages cannot process them.

In all other cases (the function value needs more than 64 bits, the data type is a string, the size of the value can vary from call to call, and so on), the actual argument list and the formal argument list are shifted one entry. The new first entry is reserved for the function value. In this case, one of the following mechanisms is used to return the function value:
  • If the maximum length of the function value is known (for example, octaword integer, H_floating, or fixed-length string), the calling program can allocate the required storage and pass the address of the storage or a descriptor for the storage as the first argument.

  • If the maximum length of a string function value is not known to the calling program, the calling program can allocate a dynamic string descriptor. The called procedure then allocates storage for the function value and updates the contents of the dynamic string descriptor using OpenVMS Run-Time Library procedures. For information about dynamic strings, see Section 8.3.

  • If the maximum length of a fixed-length string (see Section 8.2) or a varying string (see Section 8.8) function value is not known to the calling program, the calling program can indicate that it expects the string to be returned on top of the stack. For more information about the function value return, see Section 2.5.1.
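
The choice between a register return and a return through the first argument list entry can be summarized as a decision function. This Python sketch only restates the rules above; the function and parameter names are hypothetical, and it does not choose among the three first-entry mechanisms.

```python
def return_mechanism(size_bits, is_string=False, size_varies=False):
    """Say where a VAX function value is returned, following the rules above."""
    if not is_string and not size_varies:
        if size_bits <= 32:
            return "R0"
        if size_bits <= 64:
            return "R0 and R1"
    # More than 64 bits, a string, or a size that varies from call to call:
    # the argument lists are shifted and the first entry carries the result.
    return "first argument list entry"
```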

Some procedures, such as operating system calls and many library procedures, return a success or failure value as a longword function value in R0. Bit <0> of the value is set (Boolean true) for a success and clear (Boolean false) for a failure. The particular success or failure status is encoded in the remaining 31 bits, as described in Section 9.1.
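
A caller typically tests only bit <0> of the returned longword. The following Python sketch shows the test; the function names and the sample values are illustrative, and the layout of the remaining 31 bits is left to Section 9.1.

```python
def is_success(status):
    """True when bit <0> of the R0 status longword is set."""
    return (status & 1) == 1

def remaining_bits(status):
    """Return bits <31:1>, which encode the particular success or failure status."""
    return (status & 0xFFFFFFFF) >> 1

ok = is_success(0x00000001)       # low bit set: success
failed = is_success(0x0000000C)   # low bit clear: failure
```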

2.5.1. Returning a Function Value on Top of the Stack

If the maximum length of the function value is not known, the calling program can optionally allocate certain descriptors with the POINTER field set to 0, indicating that no space has been allocated for the value. If the called procedure finds POINTER 0, it fills in the POINTER, LENGTH, and other extent fields to describe the actual size and placement of the function value. This function value is copied to the top of the stack as control returns to the calling program.

This is an exception to the usual practice because the calling program regains control at the instruction following the CALLG or CALLS sequence with the contents of SP restored to a value different from the one it had at the beginning of its CALLG or CALLS calling sequence.

This technique applies only to the first argument in the argument list. Also, the called procedure cannot assume that the calling program expects the function value to be returned on the stack. Instead, the called procedure must check the CLASS field. If the descriptor is one that can be used to return a value on the stack, the called procedure checks the POINTER field. If POINTER is not 0, the called procedure returns the value using the semantics of the descriptor. If POINTER is 0, the called procedure fills in the POINTER and LENGTH fields and returns the value to the top of the stack.

Also, when POINTER is 0, the contents of R0 and R1 are unspecified by the called procedure. Once the called procedure fills in the POINTER field and other extent fields, the calling program may pass the descriptor as an argument to other procedures.

2.5.1.1. Returning a Fixed-Length or Varying String Function Value

If a called procedure can return its function value on the stack as a fixed-length (see Section 8.2) or varying string (see Section 8.8), the called procedure must also take the following actions (determined by the CLASS and POINTER fields of the first descriptor in the argument list):

CLASS    POINTER   Called Procedure's Action
S=1      Not 0     Copy the function value to the fixed-length area specified by the descriptor and space fill (hex 20 if ASCII) or truncate on the right. The entire area is always written according to Section 8.2.
S=1      0         Return the function value on top of the stack after filling in POINTER with the first address of the string and LENGTH with the length of the string to complete the descriptor according to Section 8.2.
VS=11    Not 0     Copy the function value to the varying area specified by the descriptor and fill in CURLEN and BODY according to Section 8.8.
VS=11    0         Return the function value on top of the stack after filling in POINTER with the address of CURLEN and MAXSTRLEN with the length of the string in bytes (same value as contents of CURLEN) according to Section 8.8.
Other              Error. A condition is signaled.

In both the fixed-length and varying string cases, the string is unaligned. Specifically, the function value is allocated on top of the stack with no unused bytes between the stack pointer value contained at the beginning of the CALLS or CALLG sequence and the last byte of the string.
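
The dispatch on CLASS and POINTER, and the unaligned placement of a fixed-length result, can be sketched as follows. This Python is a simplified model: the function names and returned strings are hypothetical, and the address calculation assumes the VAX stack grows toward lower addresses so that the last byte of the string sits immediately below the caller's original SP.

```python
CLASS_S = 1     # fixed-length string descriptor class (S=1 above)
CLASS_VS = 11   # varying string descriptor class (VS=11 above)

def called_procedure_action(desc_class, pointer):
    """Choose the action from the CLASS and POINTER fields of the first descriptor."""
    if desc_class not in (CLASS_S, CLASS_VS):
        return "signal a condition"
    if pointer != 0:
        return "copy into the storage described by the descriptor"
    return "return the value on top of the stack"

def fixed_string_start(sp_before_call, length):
    """First byte address of a fixed-length result returned on the stack.

    No unused bytes separate the string from the SP value the caller had at
    the start of its calling sequence, so the string ends at sp_before_call - 1.
    """
    return sp_before_call - length
```

Under this model, a 5-byte result returned to a caller whose SP was 0x7FFF0000 starts at 0x7FFEFFFB, which is not longword aligned.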

2.6. Vector and Scalar Processor Synchronization

There are two kinds of synchronization between a scalar and vector processor pair: memory synchronization and exception synchronization.

Memory synchronization with the caller of a procedure that uses the vector processor is required because scalar machine writes (to main memory) might still be pending at the time of entry to the called procedure. The various forms of write-cache strategies allowed by the VAX architecture combined with the possibly independent scalar and vector memory access paths imply that a scalar store followed by a CALLx followed by a vector load is not safe without an intervening MSYNC.

Within a procedure that uses the vector processor, proper memory and exception synchronization might require use of an MSYNC instruction, a SYNC instruction, or both, prior to calling or upon being called by another procedure. Further, for calls to other procedures, the requirements can vary from call to call, depending on details of actual vector usage.

An MSYNC instruction (without a SYNC) at procedure entry, at procedure exit, and prior to a call provides proper synchronization in most cases. A SYNC instruction without an MSYNC prior to a CALLx (or RET) is sometimes appropriate. The remaining two cases, where both or neither MSYNC and SYNC are needed, are rare.

Refer to the VAX MACRO and Instruction Set Reference Manual for the specific rules on which exceptions are guaranteed to be reported by MSYNC and other MFVP instructions.

2.6.1. Memory Synchronization

Every procedure is responsible for synchronization of memory operations with the calling procedure and with procedures it calls. If a procedure executes vector loads or stores, one of the following must occur:
  • An MSYNC instruction (a form of the MFVP instruction) must be executed before the first vector load or store to synchronize with memory operations issued by the caller. While an MSYNC instruction might typically occur in the entry code sequence of a procedure, exact placement might also depend on a variety of optimization considerations.

  • An MSYNC instruction must be executed after the last vector load or store to synchronize with memory operations issued after return. While an MSYNC instruction might typically occur in the return code sequence of a procedure, exact placement might also depend on a variety of optimization considerations.

  • An MSYNC instruction must be executed between each vector load or store and each standard call to other procedures to synchronize with memory operations issued by those procedures.

Any procedure that executes vector loads or stores is responsible for synchronizing with potentially conflicting memory operations in any other procedure. However, execution of an MSYNC instruction to ensure scalar and vector memory synchronization can be omitted when it can be determined for the current procedure that all possibly incomplete vector loads and stores operate only on memory not accessed by other procedures.

2.6.2. Exception Synchronization

Every procedure must ensure that no exception can be raised after the current frame is changed (as a result of a CALLx or RET). If a procedure executes any vector instruction that might raise an exception, then a SYNC instruction (a form of the MFVP instruction) must be executed prior to any subsequent CALLx or RET.

However, if the only exceptions that can occur are certain to be reported by an MSYNC instruction that is otherwise needed for memory synchronization, then the SYNC is redundant and can be omitted as an optimization.

Moreover, if the only exceptions that can occur are certain to be reported by one or more MFVP instructions that read the vector control registers, then the SYNC is redundant and can be omitted as an optimization.
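
The rules in Section 2.6.1 and Section 2.6.2 reduce to two yes-or-no decisions that a code generator might make before each CALLx or RET. The Python below is a sketch of that reasoning only; the function and parameter names are hypothetical, and the inputs stand for facts a compiler would have to establish by its own analysis.

```python
def msync_required(uses_vector_memory, only_private_memory=False):
    """MSYNC is needed for vector loads or stores unless every possibly
    incomplete one touches memory that no other procedure accesses."""
    if not uses_vector_memory:
        return False
    return not only_private_memory

def sync_required(may_raise_vector_exception,
                  reported_by_needed_msync=False,
                  reported_by_control_register_reads=False):
    """SYNC is needed before a CALLx or RET unless the possible exceptions are
    certain to be reported by an MSYNC that is needed anyway, or by MFVP
    reads of the vector control registers."""
    if not may_raise_vector_exception:
        return False
    return not (reported_by_needed_msync or reported_by_control_register_reads)
```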

Chapter 3. OpenVMS Alpha Conventions

This chapter describes the fundamental concepts and conventions for calling a procedure in an Alpha environment. The following sections identify register usage and addressing, and focus on aspects of the calling standard that pertain to procedure-to-procedure flow control.

3.1. Register Usage

The 64-bit-wide, general-purpose Alpha hardware registers divide into two groups:
  • Integer

  • Floating-point

The first 32 general-purpose registers support integer processing and the second 32 support floating-point operations.

3.1.1. Integer Registers

This standard defines the usage of the Alpha general-purpose integer registers as listed in Table 3.1.
Table 3.1. Alpha Integer Register Usage

Register

Usage

R0

Function value register. In a standard call that returns a nonfloating-point function result in a register, the result must be returned in this register. In a standard call, this register may be modified by the called procedure without being saved and restored. This register is not to be preserved by any called procedure.

R1

Conventional scratch register. In a standard call, this register may be modified by the called procedure without being saved and restored. This register is not to be preserved by any called procedure. In addition, R1 is the preferred and recommended register to use for passing the environment value when calling a bound procedure. (See Section 3.6.4 and Section 6.1.2).

R2—R15

Conventional saved registers. If a standard-conforming procedure modifies one of these registers, it must save and restore it.

R16—R21

Argument registers. In a standard call, up to six nonfloating-point items of the argument list are passed in these registers. In a standard call, these registers may be modified by the called procedure without being saved and restored.

R22—R24

Conventional scratch registers. In a standard call, these registers may be modified by the called procedure without being saved and restored.

R25

Argument information (AI) register. In a standard call, this register describes the argument list. (See Section 3.6.1 for a detailed description). In a standard call, this register may be modified by the called procedure without being saved and restored.

R26

Return address (RA) register. In a standard call, the return address must be passed in this register. In a standard call, this register may be modified by the called procedure without being saved and restored.

R27

Procedure value (PV) register. In a standard call, the procedure value of the procedure being called is passed in this register. In a standard call, this register may be modified by the called procedure without being saved and restored.

R28

Volatile scratch register. The contents of this register are always unpredictable after any external transfer of control either to or from a procedure. This applies to both standard and nonstandard calls. This register may be used by the operating system for external call fixup, autoloading, and exit sequences.

R29

Frame pointer (FP). The contents of this register define, among other things, which procedure is considered current. Details of usage and alignment are defined in Section 3.5.

R30

Stack pointer (SP). This register contains a pointer to the top of the current operating stack. Aspects of its usage and alignment are defined by the hardware architecture. Various software aspects of its usage and alignment are defined in Section 3.6.1.

R31

ReadAsZero/Sink (RZ). Hardware defines binary 0 as a source operand and sink (no effect) as a result operand.

3.1.2. Floating-Point Registers

This standard defines the usage of the Alpha general-purpose floating-point registers as listed in Table 3.2.
Table 3.2. Alpha Floating-Point Register Usage

Register

Usage

F0

Floating-point function value register. In a standard call that returns a floating-point result in a register, this register is used to return the real part of the result. In a standard call, this register may be modified by the called procedure without being saved and restored.

F1

Floating-point function value register. In a standard call that returns a complex floating-point result in registers, this register is used to return the imaginary part of the result. In a standard call, this register may be modified by the called procedure without being saved and restored.

F2—F9

Conventional saved registers. If a standard-conforming procedure modifies one of these registers, it must save and restore it.

F10—F15

Conventional scratch registers. In a standard call, these registers may be modified by the called procedure without being saved and restored.

F16—F21

Argument registers. In a standard call, up to six floating-point arguments may be passed by value in these registers. In a standard call, these registers may be modified by the called procedure without being saved and restored.

F22—F30

Conventional scratch registers. In a standard call, these registers may be modified by the called procedure without being saved and restored.

F31

ReadAsZero/Sink. Hardware defines binary 0 as a source operand and sink (no effect) as a result operand.
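
The save and restore obligations in Table 3.1 and Table 3.2 can be captured as register sets. The following Python sketch is one way to encode them; the names are hypothetical, and R29, R30, R31, and F31 are deliberately left out of both sets because they have dedicated roles.

```python
INTEGER_SAVED = set(range(2, 16))                 # R2-R15
INTEGER_SCRATCH = {0, 1} | set(range(16, 29))     # R0, R1, R16-R28
FLOAT_SAVED = set(range(2, 10))                   # F2-F9
FLOAT_SCRATCH = {0, 1} | set(range(10, 31))       # F0, F1, F10-F30

def callee_must_preserve(bank, number):
    """True if a standard-conforming procedure must save and restore the register.

    bank is "R" for integer registers or "F" for floating-point registers.
    """
    saved = INTEGER_SAVED if bank == "R" else FLOAT_SAVED
    return number in saved

def may_be_modified_by_callee(bank, number):
    """True if a called procedure may change the register without restoring it."""
    scratch = INTEGER_SCRATCH if bank == "R" else FLOAT_SCRATCH
    return number in scratch
```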

3.2. Address Representation

An address is a 64-bit value used to denote a position in memory. However, for compatibility with OpenVMS VAX, many Alpha applications and user-mode facilities operate in such a manner that addresses are restricted to values that are representable in 32 bits. This often allows Alpha addresses to be stored and manipulated as 32-bit longword values. In such cases, the 32-bit address value is always implicitly or explicitly sign-extended to form a 64-bit address for use by the Alpha hardware.
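
Sign extension of a stored 32-bit address can be shown with a few lines of arithmetic. The Python function below is an illustrative sketch with a hypothetical name; it returns the 64-bit address as an unsigned integer.

```python
def sign_extend_32(addr32):
    """Sign-extend a 32-bit longword address to a 64-bit address."""
    addr32 &= 0xFFFFFFFF
    if addr32 & 0x80000000:                 # bit 31 set: replicate it upward
        return addr32 | 0xFFFFFFFF00000000
    return addr32
```

A longword with bit 31 clear is unchanged, while 0x80000000 becomes 0xFFFFFFFF80000000.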

3.3. Procedure Representation

One distinguishing characteristic of any calling standard is how procedures are represented. The term used to denote the value that uniquely identifies a procedure is a procedure value. If the value identifies a bound procedure, it is called a bound procedure value.

In the Alpha portion of this calling standard, all procedure values are defined to be the address of the data structure (a procedure descriptor) that describes that procedure. So, any procedure can be invoked by calling the address stored at offset 8 from the address represented by the procedure value.

Note that a simple (unbound) procedure value is defined as the address of that procedure's descriptor (see Section 3.4). This provides slightly different conventions than would be used if the address of the procedure's code were used as it is in many calling standards.

A bound procedure value is defined as the address of a bound procedure descriptor that provides the necessary information for the bound procedure to be called (see Section 3.6.4).
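
The indirection from a procedure value to the code address can be modeled by reading a quadword at offset 8. In this Python sketch the memory image, the addresses, and the function name are all hypothetical; only the offset of 8 comes from the text above.

```python
import struct

ENTRY_OFFSET = 8  # the code address is stored at offset 8 from the procedure value

def code_address(memory, procedure_value):
    """Fetch the 64-bit code address stored 8 bytes past the procedure value."""
    return struct.unpack_from("<Q", memory, procedure_value + ENTRY_OFFSET)[0]

# Hypothetical image: a descriptor at address 0x40 whose code starts at 0x20000.
memory = bytearray(0x80)
struct.pack_into("<Q", memory, 0x40 + ENTRY_OFFSET, 0x20000)
target = code_address(memory, 0x40)
```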

3.4. Procedure Types

This standard defines the following basic types of procedures:
  • Stack frame procedure—Maintains its caller's context on the stack.

  • Register frame procedure—Maintains its caller's context in registers.

  • Null frame procedure—Does not establish a context and, therefore, executes in the context of its caller.

A compiler can choose which type of procedure to generate based on the requirements of the procedure in question. A calling procedure does not need to know what type of procedure it is calling.

Every procedure must have an associated structure that describes which type of procedure it is and other procedure characteristics. This structure, called a procedure descriptor, is a quadword-aligned data structure that provides basic information about a procedure. This data structure is used to interpret the call stack at any point in a thread's execution. It is typically built at compile time and usually is not accessed at run-time except to support exception processing or other rarely executed code.

Read access to procedure descriptors is done through a procedure interface described in Section 3.5.2. This allows for future compatible extensions to these structures.

The purpose of defining a procedure descriptor for a procedure and making that procedure descriptor accessible to the run-time system is twofold:
  • To make invocations of that procedure visible to and interpretable by facilities such as the debugger, exception handling system, and the unwinder.

  • To ensure that the context of the caller saved by the called procedure can be restored if an unwind occurs. (For a description of unwinding, see Section 9.7).

3.4.1. Stack Frame Procedures

The stack frame of a procedure consists of a fixed part (the size of which is known at compile time) and an optional variable part. Certain optimizations can be done if the optional variable part is not present. Compilers must also recognize unusual situations, such as the following, that can effectively cause a variable part of the stack to exist:
  • A called routine may use the stack as a means to return certain types of function values (see Section 3.7.7 for more information).

  • A called routine that allocates stack space may take an exception in its routine prologue before it becomes current. This situation must be considered because the stack expansion happens in the context of the caller (see Section 3.5 and Section 3.6.5 for more information).

    For this reason, a fixed-stack usage version of this procedure type cannot make standard calls.

The variable-stack usage version of this type of procedure is referred to as full function and can make standard calls to other procedures.

3.4.2. Procedure Descriptor for Procedures with a Stack Frame

A stack frame procedure descriptor (PDSC) built by a compiler provides information about a procedure with a stack frame. The minimum size of the descriptor is 32 bytes defined by constant C. An optional PDSC extension in 8-byte increments supports exception handling requirements.

The fields defined in the stack frame descriptor are illustrated in Figure 3.1 and described in Table 3.3.

Figure 3.1. Stack Frame Procedure Descriptor (PDSC)
Table 3.3. Contents of Stack Frame Procedure Descriptor (PDSC)
Field Name

Contents

PDSC$W_FLAGS

The PDSC descriptor flag bits <15:0> are defined as follows:

PDSC$V_KIND

A 4-bit field <3:0> that identifies the type of procedure descriptor. For a procedure with a stack frame, this field must specify a value 9 (defined by constant PDSC$K_KIND_FP_STACK).

PDSC$V_HANDLER_VALID

If set to 1, this descriptor has an extension for the stack handler (PDSC$Q_STACK_HANDLER) information.

PDSC$V_HANDLER_REINVOKABLE

If set to 1, the handler can be reinvoked, allowing an occurrence of another exception while the handler is already active. If this bit is set to 0, the exception handler cannot be reinvoked. Note that this bit must be 0 when PDSC$V_HANDLER_VALID is 0.

PDSC$V_HANDLER_DATA_VALID

If set to 1, the HANDLER_VALID bit must be 1, the PDSC extension STACK_HANDLER_DATA field contains valid data for the exception handler, and the address of PDSC$Q_STACK_HANDLER_DATA will be passed to the exception handler as defined in Section 9.2.

PDSC$V_BASE_REG_IS_FP

If this bit is set to 0, the SP is the base register to which PDSC$L_SIZE is added during an unwind. A fixed amount of storage is allocated in the procedure entry sequence, and SP is modified by this procedure only in the entry and exit code sequence. In this case, FP typically contains the address of the procedure descriptor for the procedure. A procedure for which this bit is 0 cannot make standard calls.

If this bit is set to 1, FP is the base address and the procedure has a minimum amount of stack storage specified by PDSC$L_SIZE. A variable amount of stack storage can be allocated by modifying SP in the entry and exit code of this procedure.

PDSC$V_REI_RETURN

If set to 1, the procedure expects the stack at entry to be set, so an REI instruction correctly returns from the procedure. Also, if set, the contents of the RSA$Q_SAVED_RETURN field in the register save area are unpredictable and the return address is found on the stack (see Figure 3.4).

Bit 9

Must be 0 (reserved).

PDSC$V_BASE_FRAME

For compiled code, this bit must be set to 0. If set to 1, this bit indicates the logical base frame of a stack that precedes all frames corresponding to user code. The interpretation and use of this frame and whether there are any predecessor frames is system software defined (and subject to change).

PDSC$V_TARGET_INVO

If set to 1, the exception handler for this procedure is invoked when this procedure is the target invocation of an unwind. Note that a procedure is the target invocation of an unwind if it is the procedure in which execution resumes following completion of the unwind. For more information, see Chapter 9.

If set to 0, the exception handler for this procedure is not invoked. Note that when PDSC$V_HANDLER_VALID is 0, this bit must be 0.

PDSC$V_NATIVE

For compiled code, this bit must be set to 1.

PDSC$V_NO_JACKET

For compiled code, this bit must be set to 1.

PDSC$V_TIE_FRAME

For compiled code, this bit must be 0. Reserved for use by system software.

Bit 15

Must be 0 (reserved).

PDSC$W_RSA_OFFSET

Signed offset in bytes between the stack frame base (SP or FP as indicated by PDSC$V_BASE_REG_IS_FP) and the register save area. This field must be a multiple of 8, so that PDSC$W_RSA_OFFSET added to the contents of SP or FP (PDSC$V_BASE_REG_IS_FP) yields a quadword-aligned address.

PDSC$V_FUNC_RETURN

A 4-bit field <11:8> that describes which registers are used for the function value return (if there is one) and what format is used for those registers.

Table 6.4 lists and describes the possible encoded values of PDSC$V_FUNC_RETURN.

PDSC$V_EXCEPTION_MODE

A 3-bit field <14:12> that encodes the caller's desired exception-reporting behavior when calling certain mathematically oriented library routines. These routines generally search up the call stack to find the desired exception behavior whenever an error is detected. This search is performed independent of the setting of the Alpha FPCR. The possible values for this field are defined as follows:

Value   Name                             Meaning
0       PDSC$K_EXC_MODE_SIGNAL           Raise exceptions for all error conditions except for underflows producing a 0 result. This is the default mode.
1       PDSC$K_EXC_MODE_SIGNAL_ALL       Raise exceptions for all error conditions (including underflow).
2       PDSC$K_EXC_MODE_SIGNAL_SILENT    Raise no exceptions. Create only finite values (no infinities, denormals, or NaNs). In this mode, either the function result or the C language errno variable must be examined for any error indication.
3       PDSC$K_EXC_MODE_FULL_IEEE        Raise no exceptions except as controlled by separate IEEE exception enable bits. Create infinities, denormals, or NaN values according to the IEEE floating-point standard.
4       PDSC$K_EXC_MODE_CALLER           Perform the exception-mode behavior specified by this procedure's caller.

PDSC$W_SIGNATURE_OFFSET

A 16-bit signed byte offset from the start of the procedure descriptor. This offset designates the start of the procedure signature block (if any). A 0 in this field indicates that no signature information is present. Note that in a bound procedure descriptor (as described in Section 3.6.4), signature information might be present in the related procedure descriptor. A 1 in this field indicates a standard default signature. An offset value of 1 is not otherwise a valid offset because both procedure descriptors and signature blocks must be quadword aligned.

PDSC$Q_ENTRY

Absolute address of the first instruction of the entry code sequence for the procedure.

PDSC$L_SIZE

Unsigned size, in bytes, of the fixed portion of the stack frame for this procedure. The size must be a multiple of 16 bytes to maintain the minimum stack alignment required by the Alpha hardware architecture and stack alignment during a call (defined in Section 3.6.1). PDSC$L_SIZE cannot be 0 for a stack-frame type procedure, because the stack frame must include space for the register save area.

The value of SP at entry to this procedure can be calculated by adding PDSC$L_SIZE to the value of SP or FP, as indicated by PDSC$V_BASE_REG_IS_FP.

PDSC$W_ENTRY_LENGTH

Unsigned offset, in bytes, from the entry point to the first instruction in the procedure code segment following the procedure prologue (that is, following the instruction that updates FP to establish this procedure as the current procedure).

PDSC$L_IREG_MASK

Bit vector (0-31) specifying the integer registers that are saved in the register save area on entry to the procedure. The least significant bit corresponds to register R0. Never set bits 31, 30, 28, 1, and 0 of this mask, because R31 is the integer read-as-zero register, R30 is the stack pointer, R28 is always assumed to be destroyed during a procedure call or return, and R1 and R0 are never preserved registers. In this calling standard, bit 29 (corresponding to the FP) must always be set.

PDSC$L_FREG_MASK

Bit vector (0-31) specifying the floating-point registers saved in the register save area on entry to the procedure. The least significant bit corresponds to register F0. Never set bit 31 of this mask, because it corresponds to the floating-point read-as-zero register.

PDSC$Q_STACK_HANDLER

Absolute address to the procedure descriptor for a run-time static exception handling procedure. This part of the procedure descriptor is optional. It must be supplied if either PDSC$V_HANDLER_VALID is 1 or PDSC$V_HANDLER_DATA_VALID is 1 (which requires that PDSC$V_HANDLER_VALID be 1).

If PDSC$V_HANDLER_VALID is 0, then the contents or existence of PDSC$Q_STACK_HANDLER is unpredictable.

PDSC$Q_STACK_HANDLER_DATA

Data (quadword) for the exception handler. This is an optional quadword and needs to be supplied only if PDSC$V_HANDLER_DATA_VALID is 1.

If PDSC$V_HANDLER_DATA_VALID is 0, then the contents or existence of PDSC$Q_STACK_HANDLER_DATA is unpredictable.
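
The constraints that this table places on PDSC$L_SIZE and on the two register save masks lend themselves to a mechanical check. The following C fragment is a minimal sketch of such a check and is not part of the standard; the helper and macro names are hypothetical, and only the bit rules themselves are taken from the field descriptions above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bits that must never be set in PDSC$L_IREG_MASK: R31, R30, R28, R1, R0. */
#define IREG_FORBIDDEN ((1u << 31) | (1u << 30) | (1u << 28) | (1u << 1) | (1u << 0))
/* Bit 29 (FP) must always be set in PDSC$L_IREG_MASK. */
#define IREG_REQUIRED  (1u << 29)
/* Bit 31 (floating-point read-as-zero register) must never be set in PDSC$L_FREG_MASK. */
#define FREG_FORBIDDEN (1u << 31)

/* Hypothetical helper: true if PDSC$L_SIZE is legal for a stack frame
   procedure (nonzero and a multiple of 16 bytes). */
static bool frame_size_is_valid(uint32_t size)
{
    return size != 0 && size % 16 == 0;
}

/* Hypothetical helper: true if both save masks obey the rules above. */
static bool save_masks_are_valid(uint32_t ireg_mask, uint32_t freg_mask)
{
    if (ireg_mask & IREG_FORBIDDEN)
        return false;               /* a never-saved integer register is marked */
    if (!(ireg_mask & IREG_REQUIRED))
        return false;               /* FP (R29) is missing */
    if (freg_mask & FREG_FORBIDDEN)
        return false;               /* F31 is marked */
    return true;
}
```

A compiler back end or an object-file checker might run a test of this kind on every stack frame procedure descriptor it emits or reads.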

3.4.3. Stack Frame Format

The stack frame of a stack frame procedure consists of a fixed part (the size of which is known at compile time) and an optional variable part. There are two basic types of stack frames:
  • Fixed size

  • Variable size

Even though the exact contents of a stack frame are determined by the compiler, all stack frames have common characteristics.

Various combinations of PDSC$V_BASE_REG_IS_FP and PDSC$L_SIZE can be used as follows:
  • When PDSC$V_BASE_REG_IS_FP is 0 and PDSC$L_SIZE is 0, then the procedure utilizes no stack storage and SP contains the value of SP at entry to the procedure. (Such a procedure must be a register frame procedure.)

  • When PDSC$V_BASE_REG_IS_FP is 0 and PDSC$L_SIZE is a nonzero value, then the procedure has a fixed amount of stack storage specified by PDSC$L_SIZE, all of which is allocated in the procedure entry sequence, and SP is modified by this procedure only in the entry and exit code sequences. (Such a procedure may not make standard calls.)

  • When PDSC$V_BASE_REG_IS_FP is 1 and PDSC$L_SIZE is a nonzero value, then the procedure has a fixed amount of stack storage specified by PDSC$L_SIZE, and may have a variable amount of stack storage allocated by modifying SP in the body of the procedure. (Such a procedure must be a stack frame procedure.)

  • The combination in which PDSC$V_BASE_REG_IS_FP is 1 and PDSC$L_SIZE is 0 is illegal because it violates the rules for R29 (FP) usage, which require R29 to be saved (on the stack) and restored.
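
The four combinations above can be summarized as a small decision function. This is an illustrative sketch only; the enumeration and function names are hypothetical and do not appear in the standard.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical labels for the four combinations listed above. */
typedef enum {
    FRAME_NO_STACK,        /* BASE_REG_IS_FP = 0, SIZE = 0: no stack storage      */
    FRAME_FIXED_ONLY,      /* BASE_REG_IS_FP = 0, SIZE != 0: fixed storage only   */
    FRAME_FIXED_VARIABLE,  /* BASE_REG_IS_FP = 1, SIZE != 0: fixed plus variable  */
    FRAME_ILLEGAL          /* BASE_REG_IS_FP = 1, SIZE = 0: violates FP rules     */
} frame_usage;

static frame_usage classify_frame(bool base_reg_is_fp, uint32_t size)
{
    if (!base_reg_is_fp)
        return size == 0 ? FRAME_NO_STACK : FRAME_FIXED_ONLY;
    return size == 0 ? FRAME_ILLEGAL : FRAME_FIXED_VARIABLE;
}
```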

3.4.3.1. Fixed-Size Stack Frame

Figure 3.2 illustrates the format of the stack frame for a procedure with a fixed amount of stack that uses the SP register as the stack base pointer (when PDSC$V_BASE_REG_IS_FP is 0). In this case, R29 (FP) typically contains the address of the procedure descriptor for the current procedure (see Section 3.5.1).

Some parts of the stack frame are optional and occur only as required by the particular procedure. In the figure, field names within brackets are optional fields. Use of the arguments passed in memory field, which is appended to the end of the stack frame, is described in Section 3.4.3.3 and Section 3.7.2.

For information describing the fixed temporary locations and register save area, see Section 3.4.3.3 and Section 3.4.3.4.

Figure 3.2. Fixed-Size Stack Frame Format

3.4.3.2. Variable-Size Stack Frame

Figure 3.3 illustrates the format of the stack frame for procedures with a varying amount of stack when PDSC$V_BASE_REG_IS_FP is 1. In this case, R29 (FP) contains the address that points to the base of the stack frame on the stack. This frame-base quadword location contains the address of the current procedure's descriptor.

Figure 3.3. Variable-Size Stack Frame Format

Some parts of the stack frame are optional and occur only as required by the particular procedure. In Figure 3.3, field names within brackets are optional fields. Use of the arguments passed in memory field, which is appended to the end of the stack frame, is described in Section 3.4.3.3 and Section 3.7.2.

For more information describing the fixed temporary locations and register save area, see Section 3.4.3.3 and Section 3.4.3.4.

A compiler can use the stack temporary area pointed to by the SP base register for fixed local variables, such as constant-sized data items and program state, as well as for dynamically sized local variables. The stack temporary area may also be used for dynamically sized items with a limited lifetime, for example, a dynamically sized function result or string concatenation that cannot be stored directly in a target variable. When a procedure uses this area, the compiler must keep track of its base and reset SP to the base to reclaim storage used by temporaries.

3.4.3.3. Fixed Temporary Locations for All Stack Frames

The fixed temporary locations are optional sections of any stack frame that contain language-specific locations required by the procedure context of some high-level languages. This may include, for example, register spill area, language-specific exception handling context (such as language-dynamic exception handling information), fixed temporaries, and so on.

The argument home area (if allocated by the compiler) can be found with the PDSC$L_SIZE offset in the last fixed temporary locations at the end of the stack frame. It is adjacent to the arguments passed in memory area to expedite the use of arguments passed (without copying). The argument home area is a region of memory used by the called procedure for the purpose of assembling in contiguous memory the arguments passed in registers, adjacent to the arguments passed in memory, so all arguments can be addressed as a contiguous array. This area can also be used to store arguments passed in registers if an address for such an argument must be generated. Generally, 6 * 8 bytes of stack storage is allocated for this purpose by the called procedure.

If a procedure needs to reference its arguments as a longword array or construct a structure that looks like an in-memory longword argument list, then it might allocate enough longwords in this area to hold all of the argument list and, optionally, an argument count. In that case, argument items passed in memory must be copied to this longword array.

The high-address end of the stack frame is defined by the value stored in PDSC$L_SIZE plus the contents of SP or FP, as indicated by PDSC$V_BASE_REG_IS_FP. The high-address end is used to determine the value of SP for the predecessor procedure in the calling chain.
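
As a minimal sketch of the calculation just described, the following C function computes the high-address end of a frame from values that a stack walker is assumed to have already read. The function name and parameter layout are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Returns the high-address end of the frame, which is also the value
   SP had at entry to the procedure (that is, the predecessor's SP).
   sp and fp are the procedure's current SP and FP values;
   base_reg_is_fp mirrors PDSC$V_BASE_REG_IS_FP; size mirrors PDSC$L_SIZE. */
static uint64_t frame_high_end(uint64_t sp, uint64_t fp,
                               bool base_reg_is_fp, uint32_t size)
{
    uint64_t base = base_reg_is_fp ? fp : sp;  /* select the frame base register */
    return base + size;
}
```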

3.4.3.4. Register Save Area for All Stack Frames

The register save area is a set of consecutive quadwords in which registers saved and restored by the current procedure are stored (see Figure 3.4). The register save area begins at the location pointed to by the offset PDSC$W_RSA_OFFSET from the frame base register (SP or FP as indicated by PDSC$V_BASE_REG_IS_FP), which must yield a quadword-aligned address. The set of registers saved in this area contains the return address followed by the registers specified in the procedure descriptor by PDSC$L_IREG_MASK and PDSC$L_FREG_MASK.

All registers saved in the register save area (other than the saved return address) must have the corresponding bit set in the appropriate procedure descriptor register save mask even if the register is not a member of the set of registers required to be saved across a standard call. Failure to do so will prevent the correct calculation of offsets within the save area.

Figure 3.4 illustrates the fields in the register save area (field names within brackets are optional fields). Quadword RSA$Q_SAVED_RETURN is the first field in the save area and it contains the contents of the return address register. The optional fields vary in size (8-byte increments) to preserve, as required, the contents of the integer and floating-point hardware registers used in the procedure.

Figure 3.4. Register Save Area (RSA) Layout
The algorithm for packing saved registers in the quadword-aligned register save area is:
  1. The return address is saved at the lowest address of the register save area (offset 0).

  2. All saved integer registers (as indicated by the corresponding bit in PDSC$L_IREG_MASK being set to 1) are stored, in register-number order, in consecutive quadwords, beginning at offset 8 of the register save area.

  3. All saved floating-point registers (as indicated by the corresponding bit in PDSC$L_FREG_MASK being set to 1) are stored, in register-number order, in consecutive quadwords, following the saved integer registers.

    Note

    Floating-point registers saved in the register save area are stored as a 64-bit exact image of the register (for example, no reordering of bits is done on the way to or from memory). Compilers must use an STT instruction to store the register regardless of floating-point type.

The preserved register set must always include R29 (FP), because it will always be used.

If the return address register is not to be preserved (as is the case for a standard call), then it must be stored at offset 0 in the register save area and the corresponding bit in the register save mask must not be set.

However, if a nonstandard call is made that requires the return address register to be saved and restored, then it must be stored in both the location at offset 0 in the register save area and at the appropriate location within the variable part of the save area. In addition, the appropriate bit of PDSC$L_IREG_MASK must be set to 1.

The example register save area shown in Figure 3.5 illustrates the register packing when registers R10, R11, R15, FP, F2, and F3 are being saved for a procedure called with a standard call.
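
The packing algorithm can also be expressed as an offset calculation. The following C sketch derives the byte offset of a saved register within the register save area directly from the two masks; the function names are hypothetical, and the offsets follow only from the three packing rules given above.

```c
#include <stdint.h>

/* Number of bits set in mask at positions below reg (reg may be 32). */
static unsigned bits_below(uint32_t mask, unsigned reg)
{
    unsigned n = 0;
    for (unsigned i = 0; i < reg; i++)
        if (mask & (1u << i))
            n++;
    return n;
}

/* Byte offset of saved integer register reg, or -1 if it is not saved.
   Offset 0 holds the return address, so integer registers start at 8. */
static int rsa_ireg_offset(uint32_t ireg_mask, unsigned reg)
{
    if (reg > 31 || !(ireg_mask & (1u << reg)))
        return -1;
    return 8 + 8 * (int)bits_below(ireg_mask, reg);
}

/* Byte offset of saved floating-point register reg, or -1 if not saved.
   Floating-point registers follow all saved integer registers. */
static int rsa_freg_offset(uint32_t ireg_mask, uint32_t freg_mask, unsigned reg)
{
    if (reg > 31 || !(freg_mask & (1u << reg)))
        return -1;
    return 8 + 8 * (int)(bits_below(ireg_mask, 32) + bits_below(freg_mask, reg));
}
```

For the example registers (integer mask 0x20008C00 for R10, R11, R15, and R29; floating-point mask 0x0000000C for F2 and F3), these helpers place R10 at offset 8, R29 at offset 32, and F3 at offset 48.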

Figure 3.5. Register Save Area (RSA) Example

3.4.4. Register Frame Procedure

A register frame procedure does not maintain a call frame on the stack and must, therefore, save its caller's context in registers. This type of procedure is sometimes referred to as a lightweight procedure, referring to the expedient way of saving the call context.

Such a procedure cannot save and restore nonscratch registers. Because a procedure without a stack frame must use scratch registers to maintain the caller's context, such a procedure cannot make a standard call to any other procedure.

A procedure with a register frame can have an exception handler and can handle exceptions in the normal way. Such a procedure can also allocate local stack storage in the normal way, although it might not necessarily do so.

Note

Lightweight procedures have more freedom than might be apparent. By using appropriate agreements with callers of the lightweight procedure, with procedures that the lightweight procedure calls, and by the use of unwind handlers, a lightweight procedure can modify nonscratch registers and can call other procedures.

Such agreements may be by convention (as in the case of language-support routines in the RTL) or by interprocedural analysis. However, calls employing such agreements are not standard calls and might not be fully supported by a debugger; for example, the debugger might not be able to find the contents of the preserved registers.

Because such agreements must be permanent (for upwards compatibility of object code), lightweight procedures should, in general, follow the normal restrictions.

3.4.5. Procedure Descriptor for Procedures with a Register Frame

A register frame procedure descriptor built by a compiler provides information about a procedure with a register frame. The minimum size of the descriptor is 24 bytes (defined by PDSC$K_MIN_REGISTER_SIZE). An optional PDSC extension in 8-byte increments supports exception handling requirements.

The fields defined in the register frame procedure descriptor are illustrated in Figure 3.6 and described in Table 3.4.

Figure 3.6. Register Frame Procedure Descriptor (PDSC)
Table 3.4. Contents of Register Frame Procedure Descriptor (PDSC)

Field Name

Contents

PDSC$W_FLAGS

The PDSC descriptor flag bits <15:0> are defined as follows:

PDSC$V_KIND

A 4-bit field <3:0> that identifies the type of procedure descriptor. For a procedure with a register frame, this field must specify a value 10 (defined by constant PDSC$K_KIND_FP_REGISTER).

PDSC$V_HANDLER_VALID

If set to 1, this descriptor has an extension for the stack handler (PDSC$Q_REG_HANDLER) information.

PDSC$V_HANDLER_REINVOKABLE

If set to 1, the handler can be reinvoked, allowing an occurrence of another exception while the handler is already active. If this bit is set to 0, the exception handler cannot be reinvoked. This bit must be 0 when PDSC$V_HANDLER_VALID is 0.

PDSC$V_HANDLER_DATA_VALID

If set to 1, the PDSC$V_HANDLER_VALID bit must also be 1. The PDSC extension field PDSC$Q_REG_HANDLER_DATA then contains valid data for the exception handler, and the address of PDSC$Q_REG_HANDLER_DATA is passed to the exception handler as defined in Section 9.2.

PDSC$V_BASE_REG_IS_FP

If this bit is set to 0, the SP is the base register to which PDSC$L_SIZE is added during an unwind. A fixed amount of storage is allocated in the procedure entry sequence, and SP is modified by this procedure only in the entry and exit code sequence. In this case, FP typically contains the address of the procedure descriptor for the procedure. Note that a procedure that sets this bit to 0 cannot make standard calls.

If this bit is set to 1, FP is the base address and the procedure has a fixed amount of stack storage specified by PDSC$L_SIZE. A variable amount of stack storage can be allocated by modifying SP in the entry and exit code of this procedure.

PDSC$V_REI_RETURN

If set to 1, the procedure expects the stack at entry to be set, so an REI instruction correctly returns from the procedure. Also, if set, the contents of the PDSC$B_SAVE_RA field are unpredictable and the return address is found on the stack.

Bit 9

Must be 0 (reserved).

PDSC$V_BASE_FRAME

For compiled code, this bit must be 0. If set to 1, this bit indicates the logical base frame of a stack that precedes all frames corresponding to user code. The interpretation and use of this frame and whether there are any predecessor frames is system software defined (and subject to change).

PDSC$V_TARGET_INVO

If set to 1, the exception handler for this procedure is invoked when this procedure is the target invocation of an unwind. Note that a procedure is the target invocation of an unwind if it is the procedure in which execution resumes following completion of the unwind. For more information, see Chapter 9.

If set to 0, the exception handler for this procedure is not invoked. Note that when PDSC$V_HANDLER_VALID is 0, this bit must be 0.

PDSC$V_NATIVE

For compiled code, this bit must be set to 1.

PDSC$V_NO_JACKET

For compiled code, this bit must be set to 1.

PDSC$V_TIE_FRAME

For compiled code, this bit must be 0. Reserved for use by system software.

Bit 15

Must be 0 (reserved).

PDSC$B_SAVE_FP

Specifies the number of the register that contains the saved value of the frame pointer (FP) register.

In a standard procedure, this field must specify a scratch register so as not to violate the rules for procedure entry code as specified in Section 3.6.5.

PDSC$B_SAVE_RA

Specifies the number of the register that contains the return address. If this procedure uses standard call conventions and does not modify R26, then this field can specify R26.

In a standard procedure, this field must specify a scratch register so as not to violate the rules for procedure entry code as specified in Section 3.6.5.

PDSC$V_FUNC_RETURN

A 4-bit field <11:8> that describes which registers are used for the function value return (if there is one) and what format is used for those registers.

Table 6.4 lists and describes the possible encoded values of PDSC$V_FUNC_RETURN.

PDSC$V_EXCEPTION_MODE

A 3-bit field <14:12> that encodes the caller's desired exception-reporting behavior when calling certain mathematically oriented library routines. These routines generally search up the call stack to find the desired exception behavior whenever an error is detected. This search is performed independent of the setting of the Alpha FPCR. The possible values for this field are defined as follows:

Value

Name

Meaning

0

PDSC$K_EXC_MODE_SIGNAL

Raise exceptions for all error conditions except for underflows producing a 0 result. This is the default mode.

1

PDSC$K_EXC_MODE_SIGNAL_ALL

Raise exceptions for all error conditions (including underflows).

2

PDSC$K_EXC_MODE_SIGNAL_SILENT

Raise no exceptions. Create only finite values (no infinities, denormals, or NaNs). In this mode, either the function result or the C language errno variable must be examined for any error indication.

3

PDSC$K_EXC_MODE_FULL_IEEE

Raise no exceptions except as controlled by separate IEEE exception enable bits. Create infinities, denormals, or NaN values according to the IEEE floating-point standard.

4

PDSC$K_EXC_MODE_CALLER

Perform the exception-mode behavior specified by this procedure's caller.

PDSC$W_SIGNATURE_OFFSET

A 16-bit signed byte offset from the start of the procedure descriptor. This offset designates the start of the procedure signature block (if any). A 0 in this field indicates no signature information is present. Note that in a bound procedure descriptor (as described in Section 3.6.4), signature information might be present in the related procedure descriptor. A 1 in this field indicates a standard default signature. An offset value of 1 is not otherwise a valid offset because both procedure descriptors and signature blocks must be quadword aligned.

PDSC$Q_ENTRY

Absolute address of the first instruction of the entry code sequence for the procedure.

PDSC$L_SIZE

Unsigned size in bytes of the fixed portion of the stack frame for this procedure. The size must be a multiple of 16 bytes to maintain the minimum stack alignment required by the Alpha hardware architecture and stack alignment during a call (defined in Section 3.6.1).

PDSC$W_ENTRY_LENGTH

Unsigned offset in bytes from the entry point to the first instruction in the procedure code segment following the procedure prologue (that is, following the instruction that updates FP to establish this procedure as the current procedure).

PDSC$Q_REG_HANDLER

Absolute address of the procedure descriptor for a run-time static exception handling procedure. This part of the procedure descriptor is optional. It must be supplied if either PDSC$V_HANDLER_VALID is 1 or PDSC$V_HANDLER_DATA_VALID is 1 (which requires that PDSC$V_HANDLER_VALID be 1).

If PDSC$V_HANDLER_VALID is 0, then the contents or existence of PDSC$Q_REG_HANDLER is unpredictable.

PDSC$Q_REG_HANDLER_DATA

Data (quadword) for the exception handler. This is an optional quadword and needs to be supplied only if PDSC$V_HANDLER_DATA_VALID is 1.

If PDSC$V_HANDLER_DATA_VALID is 0, then the contents or existence of PDSC$Q_REG_HANDLER_DATA is unpredictable.
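
Several of the flag descriptions in this table depend on one another. The following C sketch checks those dependencies for a register frame descriptor and derives the descriptor size. It rests on two assumptions that should be verified against Figure 3.6: that the flag bits occupy consecutive positions 4 through 15 in the order in which the table lists them, and that each optional handler quadword adds 8 bytes to the 24-byte minimum. The macro and function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit positions are ASSUMED from the order of the table rows. */
#define KIND_MASK              0x000Fu
#define KIND_FP_REGISTER       10u          /* PDSC$K_KIND_FP_REGISTER */
#define F_HANDLER_VALID        (1u << 4)
#define F_HANDLER_REINVOKABLE  (1u << 5)
#define F_HANDLER_DATA_VALID   (1u << 6)
#define F_TARGET_INVO          (1u << 11)
#define F_RESERVED             ((1u << 9) | (1u << 15))

/* True if the flags word satisfies the interdependencies stated above. */
static bool reg_frame_flags_consistent(uint16_t flags)
{
    if ((flags & KIND_MASK) != KIND_FP_REGISTER)
        return false;                       /* wrong descriptor kind */
    if (flags & F_RESERVED)
        return false;                       /* reserved bits must be 0 */
    /* These three bits are meaningful only when a handler is present. */
    if (!(flags & F_HANDLER_VALID) &&
        (flags & (F_HANDLER_REINVOKABLE | F_HANDLER_DATA_VALID | F_TARGET_INVO)))
        return false;
    return true;
}

/* Descriptor size in bytes: 24 plus 8 for each optional quadword present. */
static unsigned reg_frame_pdsc_size(uint16_t flags)
{
    unsigned size = 24;                     /* PDSC$K_MIN_REGISTER_SIZE */
    if (flags & F_HANDLER_VALID)
        size += 8;                          /* PDSC$Q_REG_HANDLER */
    if (flags & F_HANDLER_DATA_VALID)
        size += 8;                          /* PDSC$Q_REG_HANDLER_DATA */
    return size;
}
```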

3.4.6. Null Frame Procedures

A procedure may conform to this standard even if it does not establish its own context if, in all circumstances, invocations of that procedure do not need to be visible or debuggable. This is termed executing in the context of the caller and is similar in concept to a conventional VAX JSB procedure. For the purposes of stack tracing or unwinding, such a procedure is never considered to be current.

For example, if a procedure does not establish an exception handler, does not save and restore registers, and does not extend the stack, then that procedure might not need to establish a context. Likewise, if that procedure does extend the stack, it still might not need to establish a context if the immediate caller either cannot be the target of an unwind or is prepared to reset the stack if it is the target of an unwind.

The circumstances under which procedures can run in the context of the caller are complex and are not fully specified by this standard.

As with the other procedure types previously described, the choice of whether to establish a context belongs to the called procedure. By defining a null procedure descriptor format, the same invocation code sequence can be used by the caller for all procedure types.

3.4.7. Procedure Descriptor for Null Frame Procedures

The null frame procedure descriptor built by a compiler provides information about a procedure with no frame. The size of the descriptor is 16 bytes (defined by PDSC$K_NULL_SIZE).

The fields defined in the null frame descriptor are illustrated in Figure 3.7 and described in Table 3.5.

Figure 3.7. Null Frame Procedure Descriptor (PDSC) Format
Table 3.5. Contents of Null Frame Procedure Descriptor (PDSC)

Field Name

Contents

PDSC$W_FLAGS

The PDSC descriptor flag bits <15:0> are defined as follows:

PDSC$V_KIND

A 4-bit field <3:0> that identifies the type of procedure descriptor. For a null frame procedure, this field must specify a value 8 (defined by constant PDSC$K_KIND_NULL).

Bits 4-7

Must be 0.

PDSC$V_REI_RETURN

Bit 8. If set to 1, the procedure expects the stack at entry to be set, so an REI instruction correctly returns from the procedure. Also, if set, the contents of the PDSC$B_SAVE_RA field are unpredictable and the return address is found on the stack.

Bit 9

Must be 0 (reserved).

PDSC$V_BASE_FRAME

For compiled code, this bit must be 0. If set to 1, indicates the logical base frame of a stack that precedes all frames corresponding to user code. The interpretation and use of this frame and whether there are any predecessor frames is system software defined (and subject to change).

Bit 11

Must be 0 (reserved).

PDSC$V_NATIVE

For compiled code, this bit must be set to 1.

PDSC$V_NO_JACKET

For compiled code, this bit must be set to 1.

PDSC$V_TIE_FRAME

For compiled code, this bit must be 0. Reserved for use by system software.

Bit 15

Must be 0 (reserved).

PDSC$V_FUNC_RETURN

A 4-bit field <11:8> that describes which registers are used for the function value return (if there is one) and what format is used for those registers.

Table 6.4 lists and describes the possible encoded values of PDSC$V_FUNC_RETURN.

PDSC$W_SIGNATURE_OFFSET

A 16-bit signed byte offset from the start of the procedure descriptor. This offset designates the start of the procedure signature block (if any). A 0 in this field indicates that no signature information is present. Note that in a bound procedure descriptor (as described in Section 3.6.4), signature information might be present in the related procedure descriptor. A 1 in this field indicates a standard default signature. An offset value of 1 is not otherwise a valid offset because both procedure descriptors and signature blocks must be quadword aligned.

PDSC$Q_ENTRY

The absolute address of the first instruction of the entry code sequence for the procedure.
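
The three meanings of PDSC$W_SIGNATURE_OFFSET described in this table can be told apart with a short test. The following C sketch is illustrative only; the enumeration and function names are hypothetical. It relies on the stated quadword alignment of descriptors and signature blocks, from which it follows that a genuine offset must be a multiple of 8.

```c
#include <stdint.h>

/* Hypothetical labels for the possible interpretations of the field. */
typedef enum {
    SIG_NONE,      /* 0: no signature information present        */
    SIG_DEFAULT,   /* 1: standard default signature              */
    SIG_BLOCK,     /* quadword-aligned offset to signature block */
    SIG_INVALID    /* any other value cannot be a valid offset   */
} sig_kind;

static sig_kind classify_signature_offset(int16_t offset)
{
    if (offset == 0)
        return SIG_NONE;
    if (offset == 1)
        return SIG_DEFAULT;
    /* The field is signed, so the block may precede the descriptor. */
    if (offset % 8 == 0)
        return SIG_BLOCK;
    return SIG_INVALID;
}
```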

3.5. Procedure Call Stack

Except for null-frame procedures, a procedure is an active procedure while its body is executing, including while any procedure it calls is executing. When a procedure is active, it may handle an exception that is signaled during its execution.

Associated with each active procedure is an invocation context, which consists of the set of registers and space in memory that is allocated and that may be accessed during execution for a particular call of that procedure.

When a procedure begins to execute, it has no invocation context. The initial instructions that allocate and initialize its context, which may include saving information from the invocation context of its caller, are termed the procedure prologue. Once execution of the prologue is complete, the procedure is said to be active.

When a procedure is ready to return to its caller, the instructions that deallocate and discard the procedure's invocation context (which may include restoring state of the caller's invocation context that was saved during the prologue) are termed a procedure epilogue. A procedure ceases to be active when execution of its epilogue begins.

A procedure may have more than one prologue if there are multiple entry points. A procedure may also have more than one epilogue if there are multiple return points. One of each will be executed during any given invocation of the procedure.

Some procedures, notably null frame procedures (see Section 3.4.6), never have an invocation context of their own and are said to execute in the body of their caller. A null frame procedure has no prologue or epilogue, and consists solely of body instructions. Such a procedure never becomes current or active in the sense that its handler may be invoked.

A call stack (for a thread) consists of the stack of invocation contexts that exists at any point in time. New invocation contexts are pushed on that stack as procedures are called and invocations are popped from the call stack as procedures return.

The invocation context of a procedure that calls another procedure is said to precede or be previous to the invocation context of the called procedure.

3.5.1. Current Procedure

The current procedure is the active procedure whose execution began most recently; its invocation context is at the top of the call stack. Note that a procedure executing in its prologue or epilogue is not active, and hence cannot be the current procedure. Similarly, a null frame procedure cannot be the current procedure.

In this calling standard, R29 is the frame pointer (FP) register that defines the current procedure.

Therefore, the current procedure must always maintain in FP one of the following pointer values:
  • Pointer to the procedure descriptor for that procedure.

  • Pointer to a naturally aligned quadword containing the address of the procedure descriptor for that procedure. For purposes of finding a procedure's procedure descriptor, no assumptions may be made about the location of this quadword. As long as all other requirements of this standard are met, a compiler is free to use FP as a base register for any arbitrary storage, including a stack frame, provided that while the procedure is current, the quadword pointed to by the value in FP contains the address of that procedure's descriptor.

At any point in time, the FP value can be interpreted to find the procedure descriptor for the current procedure by examining the value at 0(FP) as follows:
  • If 0(FP)<2:0> = 0, then FP points to a quadword that contains a pointer to the procedure descriptor for the current procedure.

  • If 0(FP)<2:0> ≠ 0, then FP points to the procedure descriptor for the current procedure.

By examining the first quadword of the procedure descriptor, the procedure type can be determined from the PDSC$V_KIND field.

The following code is an example of how the current procedure descriptor and procedure type can be found:
        LDQ     R0,0(FP)        ;Fetch quadword at 0(FP)
        AND     R0,#7,R28       ;Isolate low-order 3 bits
        BNE     R28,20$         ;Nonzero: FP points to the descriptor itself
        LDQ     R0,0(R0)        ;Zero: quadword was a pointer; fetch descriptor
10$:    AND     R0,#7,R28       ;Sanity check on descriptor's first quadword
        BNE     R28,20$         ;Nonzero: all is well

        ;Error - Invalid FP

20$:    AND     R0,#15,R0       ;Extract PDSC$V_KIND bits <3:0>

        ;Procedure KIND is now in R0

If PDSC$V_KIND is equal to PDSC$K_KIND_FP_STACK, the current procedure has a stack frame.

If PDSC$V_KIND is equal to PDSC$K_KIND_FP_REGISTER, the current procedure is a register frame procedure.

Either type of procedure can use either type of mechanism to point to the procedure descriptor. Compilers may choose the appropriate mechanism to use based on the needs of the procedure involved.
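
For readers more comfortable with C than with Alpha assembly language, the same lookup might be sketched as follows. This is an illustration under stated assumptions, not code from the standard: a quadword is modeled as uint64_t, host pointers are assumed to fit in 64 bits, the function name is hypothetical, and the null-pointer check is an addition that the assembly sequence does not perform.

```c
#include <stdint.h>

/* Returns the PDSC$V_KIND value (bits <3:0> of the descriptor's first
   quadword), or -1 if FP does not lead to a plausible descriptor. */
static int current_procedure_kind(const uint64_t *fp)
{
    uint64_t q = fp[0];                     /* quadword at 0(FP) */

    if ((q & 7) == 0) {
        /* Low 3 bits clear: q is the address of the descriptor. */
        const uint64_t *pdsc = (const uint64_t *)(uintptr_t)q;
        if (pdsc == 0)
            return -1;                      /* added guard, not in the original */
        q = pdsc[0];                        /* first quadword of the descriptor */
        if ((q & 7) == 0)
            return -1;                      /* sanity check failed: invalid FP */
    }
    /* Low 3 bits set: q is the descriptor's first quadword. */
    return (int)(q & 15);
}
```

The test on the low-order 3 bits works because the descriptor kinds for stack frame and register frame procedures (9 and 10) are nonzero in bits <2:0>, whereas a quadword-aligned address is always 0 in those bits.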

3.5.2. Procedure Call Tracing

Mechanisms for each of the following functions are needed to support procedure call tracing:
  • To provide the context of a procedure invocation

  • To walk (navigate) the procedure call stack

  • To refer to a given procedure invocation

This section describes the data structure mechanisms. The routines that support these functions are described in Section 3.5.3.

3.5.2.1. Referring to a Procedure Invocation from a Data Structure

When referring to a specific procedure invocation at run time, an invocation context handle, shown in Figure 3.8, can be used. The structure, whose size is defined by the constant LIBICB$K_INVO_HANDLE_SIZE, consists of a single longword field called HANDLE, which holds the invocation handle of the procedure.

Figure 3.8. Invocation Context Handle Format
To encode an invocation context handle, follow these steps:
  1. If PDSC$V_BASE_REG_IS_FP is set to 1 in the corresponding procedure descriptor, then set INVO_HANDLE to the contents of the FP register in that invocation.

    If PDSC$V_BASE_REG_IS_FP is set to 0, set INVO_HANDLE to the contents of the SP register in that invocation. (That is, start with the base register value for the frame).

  2. Shift the INVO_HANDLE contents left one bit. Because this value is initially known to be octaword aligned (see Section 3.6.1), the result is a value whose 5 low-order bits are 0.

  3. If PDSC$V_KIND = PDSC$K_KIND_FP_STACK, perform a logical OR on the contents of INVO_HANDLE with the value 1F (hexadecimal), and then set INVO_HANDLE to the value that results.

    If PDSC$V_KIND = PDSC$K_KIND_FP_REGISTER, perform a logical OR on the contents of INVO_HANDLE with the contents of PDSC$B_SAVE_RA, and then set INVO_HANDLE to the value that results.

Note that an invocation context handle is not defined for a null frame procedure.

Note

The number of the register that holds the saved return address is included in the invocation handle of a register frame procedure so that its invocation can be distinguished when a register frame procedure calls another register frame procedure (where the called procedure uses no stack space and therefore has the same base register value as the caller). Similarly, the number 31 (decimal) is included in the invocation handle of a stack frame procedure to distinguish its invocation when a stack frame procedure calls a register frame procedure that uses no stack space.
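
The encoding steps above can be sketched in C as follows. The function name is hypothetical, the frame base is taken as a 32-bit value because the handle is a longword, and returning 0 for a procedure kind that has no defined handle is an arbitrary choice made for this illustration.

```c
#include <stdint.h>

#define PDSC_KIND_FP_STACK     9u   /* PDSC$K_KIND_FP_STACK    */
#define PDSC_KIND_FP_REGISTER 10u   /* PDSC$K_KIND_FP_REGISTER */

/* frame_base: contents of FP or SP for the invocation (step 1),
   known to be octaword aligned.
   kind:       PDSC$V_KIND of the procedure descriptor.
   save_ra:    PDSC$B_SAVE_RA (used only for register frame procedures). */
static uint32_t encode_invo_handle(uint32_t frame_base, unsigned kind,
                                   unsigned save_ra)
{
    uint32_t handle = frame_base << 1;      /* step 2: low 5 bits are now 0 */

    if (kind == PDSC_KIND_FP_STACK)
        handle |= 0x1Fu;                    /* step 3: 31 marks a stack frame */
    else if (kind == PDSC_KIND_FP_REGISTER)
        handle |= (save_ra & 0x1Fu);        /* step 3: register holding RA */
    else
        return 0;                           /* no handle defined (for example, null frame) */
    return handle;
}
```

Because the low 5 bits carry either 31 or a register number, a stack frame procedure and a register frame procedure that share the same base register value still receive different handles.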

3.5.2.2. Invocation Context Block

The context of a specific procedure invocation is provided through the use of a data structure called an invocation context block. The minimum size of the block is 528 bytes and is system defined using the constant LIBICB$K_INVO_CONTEXT_BLK_SIZE. The size of the last field (LIBICB$Q_SYSTEM_DEFINED[n]) defined by the host system determines the total size of the block.

The fields defined in the invocation context block are illustrated in Figure 3.9 and described in Table 3.6.

Figure 3.9. Invocation Context Block Format
Table 3.6. Contents of the Invocation Context Block

Field Name

Contents

LIBICB$L_CONTEXT_LENGTH

Unsigned count of the total length in bytes of the context block; this represents the sum of the lengths of the standard-defined portion and the system-defined section.

LIBICB$R_FRAME_FLAGS

The procedure frame flag bits <23:0> are defined as follows:

LIBICB$V_EXCEPTION_FRAME

Bit 0. If set to 1, the invocation context corresponds to an exception frame.

LIBICB$V_AST_FRAME

Bit 1. If set to 1, the invocation context corresponds to an asynchronous trap.

LIBICB$V_BOTTOM_OF_STACK

Bit 2. If set to 1, the invocation context corresponds to a frame that has no predecessor.

LIBICB$V_BASE_FRAME

Bit 3. If set to 1, the BASE_FRAME bit is set in the FLAGS field of the associated procedure descriptor.

LIBICB$B_BLOCK_VERSION

A byte that defines the version of the context block. Because this block is currently the first version, the value is set to 1.

LIBICB$PH_PROCEDURE_DESCRIPTOR

Address of the procedure descriptor for this context.

LIBICB$Q_PROGRAM_COUNTER

Quadword that contains the current value of the procedure's program counter. For interrupted procedures, this is the same as the continuation program counter; for active procedures, this is the return address back into that procedure.

LIBICB$Q_PROCESSOR_STATUS

Contains the current value of the processor status.

LIBICB$Q_IREG[n]

Quadword that contains the current value of the integer register in the procedure (where n is the number of the register).

LIBICB$Q_FREG[n]

Quadword that contains the current value of the floating-point register in the procedure (where n is the number of the register).

LIBICB$Q_SYSTEM_DEFINED[n]

A variable-sized area with locations defined in quadword increments by the host environment that contains procedure context information. These locations are not defined by this standard.

3.5.2.3. Getting a Procedure Invocation Context with a Routine

A thread can obtain its own context or the current context of any procedure invocation in the current call stack (given an invocation handle) by calling the run-time library functions defined in Section 3.5.3.

3.5.2.4. Walking the Call Stack

During the course of program execution, it is sometimes necessary to walk the call stack. Frame-based exception handling is one case where this is done. Call stack navigation is possible only in the reverse direction (in a latest-to-earliest or top-to-bottom sequence).

To walk the call stack, perform the following steps:
  1. Given a program state (which contains a register set), build an invocation context block.

    For the current routine, an initial invocation context block can be obtained by calling the LIB$GET_CURR_INVO_CONTEXT routine. See Section 3.5.3.2.

  2. Repeatedly call the LIB$GET_PREV_INVO_CONTEXT routine until the end of the chain has been reached (as signified by 0 being returned). See Section 3.5.3.3.

    The bottom of stack frame (end of the call chain) is indicated (LIBICB$V_BOTTOM_OF_STACK) when the target frame's saved FP value is 0.

Compilers are allowed to optimize high-level language procedure calls in such a way that they do not appear in the invocation chain. For example, inline procedures never appear in the invocation chain.

Make no assumptions about the relative positions of any memory used for procedure frame information. There is no guarantee that successive stack frames will always appear at higher addresses.
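The two steps above amount to a simple loop. The following Python sketch models that loop against a toy call chain; it does not call the OpenVMS run-time library. The functions `get_curr_invo_context` and `get_prev_invo_context` are hypothetical stand-ins for LIB$GET_CURR_INVO_CONTEXT and LIB$GET_PREV_INVO_CONTEXT, and the context is reduced to a small dictionary. Only the status values 1 and 0 are modeled; the stack-corruption status 3 is not.

```python
# Toy model of the walk; NOT the OpenVMS run-time library.
# Frames are listed latest-first; the last entry has no predecessor.
CALL_CHAIN = ["leaf", "middle", "main"]


def get_curr_invo_context():
    """Stand-in for LIB$GET_CURR_INVO_CONTEXT: context of the latest frame."""
    return {
        "index": 0,
        "name": CALL_CHAIN[0],
        "bottom_of_stack": len(CALL_CHAIN) == 1,
    }


def get_prev_invo_context(ctx):
    """Stand-in for LIB$GET_PREV_INVO_CONTEXT: update ctx in place.

    Returns 1 on success and 0 when ctx already represents the bottom
    of the call stack.
    """
    if ctx["bottom_of_stack"]:
        return 0
    ctx["index"] += 1
    ctx["name"] = CALL_CHAIN[ctx["index"]]
    ctx["bottom_of_stack"] = ctx["index"] == len(CALL_CHAIN) - 1
    return 1


def walk_call_stack():
    """Collect frame names from latest to earliest."""
    ctx = get_curr_invo_context()
    visited = [ctx["name"]]
    while get_prev_invo_context(ctx) == 1:
        visited.append(ctx["name"])
    return visited
```

Note that the same context block is updated in place on every step, which matches the modify access of the invo_context argument.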

3.5.3. Invocation Context Access Routines

A thread can manipulate the invocation context of any procedure in the thread's virtual address space by calling the following run-time library functions.

3.5.3.1. LIB$GET_INVO_CONTEXT

A thread can obtain the invocation context of any active procedure by using the following function format:
LIB$GET_INVO_CONTEXT(invo_handle, invo_context)

Argument      OpenVMS Usage     Type                 Access  Mechanism
invo_handle   invo_handle       longword (unsigned)  read    by value
invo_context  invo_context_blk  structure            write   by reference

Arguments:

invo_handle

Handle for the desired invocation.

invo_context

Address of an invocation context block into which the procedure context of the frame specified by invo_handle will be written.

Function Value Returned:

status

Status value. A value of 1 indicates success; a value of 0 indicates failure.

Note

If the invocation handle that was passed does not represent any procedure context in the active call stack, the value of the new contents of the context block is unpredictable.

3.5.3.2. LIB$GET_CURR_INVO_CONTEXT

A thread can obtain the invocation context of the current procedure by using the following function format:
LIB$GET_CURR_INVO_CONTEXT(invo_context)

Argument      OpenVMS Usage     Type       Access  Mechanism
invo_context  invo_context_blk  structure  write   by reference

Argument:

invo_context

Address of an invocation context block into which the procedure context of the caller will be written.

Function Value Returned:

Zero

A return value of zero facilitates use in the implementation of the C language unwind setjmp or longjmp function (only).

3.5.3.3. LIB$GET_PREV_INVO_CONTEXT

A thread can obtain the invocation context of the procedure invocation that precedes a given procedure context in the call chain by using the following function format:
LIB$GET_PREV_INVO_CONTEXT(invo_context)

Argument      OpenVMS Usage     Type       Access  Mechanism
invo_context  invo_context_blk  structure  modify  by reference

Argument:

invo_context

Address of an invocation context block. The given invocation context block is updated to represent the context of the previous (calling) frame. The LIBICB$V_BOTTOM_OF_STACK flag of the invocation context block is set if the target frame represents the end of the invocation call chain or if stack corruption is detected.

Function Value Returned:

status

Status value. A value of 1 indicates success. When the initial context represents the bottom of the call stack, a value of 0 is returned. If the current operation completed without error, but a stack corruption was detected at the next level down, a value of 3 is returned.

3.5.3.4. LIB$GET_INVO_HANDLE

A thread can obtain an invocation handle corresponding to any invocation context block by using the following function format:
LIB$GET_INVO_HANDLE(invo_context)

Argument      OpenVMS Usage     Type       Access  Mechanism
invo_context  invo_context_blk  structure  read    by reference

Argument:

invo_context

Address of an invocation context block. Here, only the frame pointer and stack pointer fields of an invocation context block must be defined.

Function Value Returned:

invo_handle

Invocation handle of the invocation context that was passed. If the returned value is LIB$K_INVO_HANDLE_NULL, the invocation context that was passed was invalid.

3.5.3.5. LIB$GET_PREV_INVO_HANDLE

A thread can obtain an invocation handle of the procedure context preceding that of a specified procedure context by using the following function format:
LIB$GET_PREV_INVO_HANDLE(invo_handle)

Argument     OpenVMS Usage  Type                 Access  Mechanism
invo_handle  invo_handle    longword (unsigned)  read    by value

Argument:

invo_handle

An invocation handle that represents a target invocation context.

Function Value Returned:

invo_handle

An invocation handle for the invocation context that is previous to that which was specified as the target.

3.5.3.6. LIB$PUT_INVO_REGISTERS

A given procedure invocation context's fields can be updated with new register contents by calling a system library function in the following format:
LIB$PUT_INVO_REGISTERS(invo_handle, invo_context, invo_mask)

Argument      OpenVMS Usage     Type                 Access  Mechanism
invo_handle   invo_handle       longword (unsigned)  read    by value
invo_context  invo_context_blk  structure            read    by reference
invo_mask     mask_quadword     quadword (unsigned)  read    by reference

Arguments:

invo_handle

Handle for the invocation to be updated.

invo_context

Address of an invocation context block that contains new register contents.

Each register that is set in the invo_mask parameter, except SP, is updated using the value found in the corresponding IREG or FREG field. The program counter and processor status can also be updated in this way. (The SP register cannot be updated using this routine.) No other fields of the invocation context block are used.

invo_mask

Address of a 64-bit bit vector, where each bit corresponds to a register field in the passed invo_context. Bits 0 through 30 correspond to IREG[0] through IREG[30], bit 31 corresponds to PROGRAM_COUNTER, bits 32 through 62 correspond to FREG[0] through FREG[30], and bit 63 corresponds to PROCESSOR_STATUS. (If bit 30, which corresponds to SP, is set, then no changes are made).

Function Value Returned:

status

Status value. A value of 1 indicates success. When the initial context represents the bottom of the call stack or when bit 30 of the invo_mask argument is set, a value of 0 is returned (and nothing is changed).

Caution

While this routine can be used to update the frame pointer (FP), great care must be taken to assure that a valid stack frame and execution environment result; otherwise, execution may become unpredictable.
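The bit assignments of the invo_mask argument can be illustrated with a short Python sketch that builds the 64-bit value. The function name `build_invo_mask` and its parameters are hypothetical; only the bit positions are taken from the argument description above. The sketch rejects bit 30 (SP) up front, because the routine makes no changes and returns 0 when that bit is set.

```python
PC_BIT = 31        # corresponds to PROGRAM_COUNTER
PS_BIT = 63        # corresponds to PROCESSOR_STATUS
FREG_BASE = 32     # bits 32..62 correspond to FREG[0]..FREG[30]
SP_REGISTER = 30   # IREG[30]; cannot be updated with this routine


def build_invo_mask(iregs=(), fregs=(), pc=False, ps=False) -> int:
    """Build the 64-bit register mask for the given register numbers."""
    mask = 0
    for n in iregs:
        if not 0 <= n <= 30:
            raise ValueError("integer register number must be 0..30")
        if n == SP_REGISTER:
            raise ValueError("SP (bit 30) cannot be updated")
        mask |= 1 << n
    for n in fregs:
        if not 0 <= n <= 30:
            raise ValueError("floating-point register number must be 0..30")
        mask |= 1 << (FREG_BASE + n)
    if pc:
        mask |= 1 << PC_BIT
    if ps:
        mask |= 1 << PS_BIT
    return mask
```

Because invo_mask is passed by reference, a real caller would store the resulting quadword in memory and pass its address.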

3.6. Transfer of Control

This standard states that a standard call (see Section 1.4) may be accomplished in any way that presents the called routine with the required environment. However, typically, most standard-conforming external calls are implemented with a common sequence of instructions and conventions. Because a common set of call conventions is so pervasive, these conventions are included for reference as part of this standard.

One important feature of the calling standard is that the same instruction sequence can be used to call each of the different types of procedure. Specifically, the caller does not have to know which type of procedure is being called.

3.6.1. Call Conventions

The call conventions describe the rules and methods used to communicate certain information between the caller and the called procedure during invocation and return. For a standard call, these conventions include the following:
  • Procedure value

    The calling procedure must pass to the called procedure its procedure value. This value can be a statically or dynamically bound procedure value. This is accomplished by loading R27 with the procedure value before control is transferred to the called procedure.

  • Return address

    The calling procedure must pass to the called procedure the address to which control must be returned during a normal return from the called procedure. In most cases, the return address is the address of the instruction following the one that transferred control to the called procedure. For a standard call, this address is passed in the return address register (R26).

  • Argument list

    The argument list is an ordered set of zero or more argument items that together constitute a logically contiguous structure known as an argument item sequence. This logically contiguous sequence is typically mapped to registers and memory in a way that produces a physically discontiguous argument list. In a standard call, the first six items are passed in registers R16—21 or registers F16—21. (See Section 3.7.2 for details of argument-to-register correspondence). The remaining items are collected in a memory argument list that is a naturally aligned array of quadwords. In a standard call, this list (if present) must be passed at 0(SP).

  • Argument information

    The calling procedure must pass to the called procedure information about the argument list. This information is passed in the argument information (AI) register (R25). Defined by AI$K_AI_SIZE, the structure is a quadword as shown in Figure 3.10 with the fields described in Table 3.7.

    Figure 3.10. Argument Information Register (R25) Format
    Table 3.7. Contents of the Argument Information Register (R25)

    Field Name

    Contents

    AI$B_ARG_COUNT

    Unsigned byte <7:0> that specifies the number of 64-bit argument items in the argument list (known as the argument count).

    AI$V_ARG_REG_INFO

    An 18-bit vector field <25:8> divided into six groups of 3 bits that correspond to the six arguments passed in registers. These groups describe how each of the first six arguments is passed in registers, with the first group <10:8> describing the first argument. The encoding for each group for the argument register usage follows:

    Value  Name         Meaning
    0      AI$K_AR_I64  64-bit argument, or 32-bit argument sign-extended to 64 bits, passed in an integer register (including addresses); or the argument is not present.
    1      AI$K_AR_FF   F_floating argument passed in a floating register.
    2      AI$K_AR_FD   D_floating argument passed in a floating register.
    3      AI$K_AR_FG   G_floating argument passed in a floating register.
    4      AI$K_AR_FS   S_floating argument passed in a floating register.
    5      AI$K_AR_FT   T_floating argument passed in a floating register.
    6, 7                Reserved.

    Bits 26—63

    Reserved and must be 0.

  • Function result

    If a standard-conforming procedure is a function and the function result is returned in a register, then the result is returned in R0, F0, or F0 and F1. Otherwise, the function result is returned via the first argument item or dynamically as defined in Section 3.7.7.

  • Stack usage

    Whenever control is transferred to another procedure, the stack pointer (SP) must be octaword aligned; at other times there is no stack alignment requirement. (A side effect of this is that the in-memory portion of the argument list will start on an octaword boundary). During a procedure invocation, the SP (R30) can never be set to a value higher than the SP at entry to that procedure invocation.

    The contents of the stack located above the portion of the argument list that is passed in memory (if any) belongs to the calling procedure and is, therefore, not to be read or written by the called procedure, except as specified by indirect arguments or language-controlled up-level references.

    Because SP is used by the hardware in raising exceptions and asynchronous interrupts, the contents of the next 2048 bytes below the current SP value are continually and unpredictably modified. Software that conforms to this standard must not depend on the contents of the 2048 bytes of stack below 0(SP).

    Note

    One implication of the stack alignment requirement is that low-level interrupt and exception-fielding software must be prepared to handle and correct the alignment before calling handler routines, in case the stack pointer is not octaword aligned at the time of an interrupt or exception.
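The layout of the argument information (AI) register described in Table 3.7 can be sketched as a pair of encode and decode helpers. This Python example is illustrative only: the field positions and encoding values come from the table, while the function names and the short kind names (the AI$K_AR_ constants without their prefix) are hypothetical.

```python
# Encoding values from Table 3.7, keyed by the suffix of each AI$K_AR_ name.
AR_CODES = {"I64": 0, "FF": 1, "FD": 2, "FG": 3, "FS": 4, "FT": 5}


def encode_arg_info(arg_count: int, reg_kinds=()) -> int:
    """Pack the argument count <7:0> and up to six 3-bit groups <25:8>."""
    if not 0 <= arg_count <= 255:
        raise ValueError("argument count must fit in one unsigned byte")
    if len(reg_kinds) > 6:
        raise ValueError("only the first six arguments are described")
    value = arg_count
    for position, kind in enumerate(reg_kinds):
        value |= AR_CODES[kind] << (8 + 3 * position)
    return value


def decode_arg_info(value: int):
    """Return (argument count, list of six 3-bit register-usage codes)."""
    if value >> 26:
        raise ValueError("bits 26 through 63 are reserved and must be 0")
    count = value & 0xFF
    codes = [(value >> (8 + 3 * i)) & 0b111 for i in range(6)]
    return count, codes
```

For a call with three arguments (an integer, a T_floating value, and a G_floating value), `encode_arg_info(3, ["I64", "FT", "FG"])` yields hexadecimal E803. Groups for arguments that are not present remain 0, which is the same encoding as AI$K_AR_I64.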

3.6.2. Linkage Section

Because Alpha instructions cannot contain full virtual addresses, the Alpha hardware architecture is sometimes referred to as a base register architecture. In a base register architecture, normal memory references within a limited range from a given address are expressed by using displacements relative to the contents of a register containing that address (base register). Base registers for external program segments, either data or code, are usually loaded indirectly through a program segment of address constants.

The fundamental program section containing address constants that a procedure uses to access other static storage, external procedures, and variables is termed a linkage section. Any register used to access the contents of the linkage section is termed a linkage pointer.

A procedure's linkage section includes the procedure descriptor for the procedure, addresses of all external variables and procedures referenced by the procedure, and other constants a compiler may choose to reference using a linkage pointer.

When a standard procedure is called, the caller must provide the procedure value for that procedure in R27. Static procedure values are defined to be the address of the procedure's descriptor. Because the procedure descriptor is part of the linkage section, calling this type of procedure value provides a pointer into the linkage section for that procedure in R27. This linkage pointer can then be used by the called procedure as a base register to address locations in its linkage section. For this reason, most compilers generate references to items in the linkage section as offsets from a pointer to the procedure's descriptor.

For bound procedures, compilers usually arrange for the environment setup code to load R27 with the address of the procedure's descriptor so that it can be used as a linkage pointer as previously described. For an example, see Section 3.6.4.

Although not required, linkages to external procedures are typically represented in the calling procedure's linkage section as a linkage pair. As shown in Figure 3.11 and described in Table 3.8, a linkage pair (LKP) block contains two fields, should be octaword aligned, and has a size of 16 bytes (defined by LKP$K_SIZE).

In general, an object module contains a procedure descriptor for each entry point in the module. The descriptors are allocated in a linkage section. For each external procedure Q that is referenced in a module, the module's linkage section also contains a linkage pair denoting Q (which is a pointer to Q's procedure descriptor and entry code address).

The following code example calls an external procedure Q as represented by a linkage pair. In this example, R4 is the register that currently contains the address of the current procedure's descriptor.
      LDQ  R26,Q_DESC-MY_DESC(R4)   ;Q's entry address into R26
      LDQ  R27,Q_DESC-MY_DESC+8(R4) ;Q's procedure value into R27
      MOV  #AI_LITERAL,R25          ;Load Argument Information register
      JSR  R26,(R26)                ;Call to Q. Return address in R26

Because Q's procedure descriptor (statically defined procedure value) is in Q's linkage section, Q can use the value in R27 as a base address for accessing data in its linkage section. Q accesses external procedures and data in other program sections through pointers in its linkage section. Therefore, R27 serves as the root pointer through which all data can be referenced.

3.6.3. Calling Computed Addresses

Most calls are made to a fixed address whose value is determined by the time the program starts execution. However, in some cases the exact address is not known until the code is executed. In these cases, the procedure value representing the procedure to be called is computed in a register.

The following code example illustrates a call to a computed procedure value (simple or bound) that is contained in R4:
      LDQ  R26,8(R4)       ;Entry address to scratch register
      MOV  R4,R27          ;Procedure value to R27
      MOV  #AI_LITERAL,R25 ;Load Argument Information register
      JSR  R26,(R26)       ;Call entry address.

For interoperation with translated images, see Chapter 6.

3.6.4. Simple and Bound Procedures

There are two distinct classes of procedures:
  • Simple procedure

  • Bound procedure

A simple procedure is a procedure that does not need direct access to the stack of its execution environment. A bound procedure is a procedure that does need direct access to the stack of its execution environment, typically to reference an up-level variable or to perform a nonlocal GOTO operation. Both a simple procedure and a bound procedure have an associated procedure descriptor, as described in previous sections.

When a bound procedure is called, the caller must pass some kind of pointer to the called code that allows it to reference its up-level environment. Typically, this pointer is the frame pointer for that environment, but many variations are possible. When the caller is executing its program within that outer environment, it can usually make such a call directly to the code for the nested procedure without recourse to any additional procedure descriptors. However, when a procedure value for the nested procedure must be passed outside of that environment to a call site that has no knowledge of the target procedure, a bound procedure descriptor is created so that the nested procedure can be called just like a simple procedure.

Bound procedure values, as defined by this standard, are designed for multilanguage use and utilize the properties of procedure descriptors to allow callers of procedures to use common code to call both bound and simple procedures.

3.6.4.1. Bound Procedure Descriptors

Bound procedure descriptors provide a mechanism to interpose special processing between a call and the called routine without modifying either. The descriptor may contain (or reference) data used as part of that processing. Between native and translated images, the OpenVMS Alpha operating system uses linker and image-activator created bound procedure descriptors to mediate the handling of parameter and result passing (see Section 6.2). Language processors on OpenVMS Alpha systems use bound procedure descriptors to implement bound procedure values (see Section 3.6.4.2). Other uses are possible.

The minimum size of the descriptor is 24 bytes (defined by PDSC$K_MIN_BOUND_SIZE). An optional PDSC extension in 8-byte increments provides the specific environment values as defined by the implementation.

The fields defined in the bound procedure descriptor are illustrated in Figure 3.12 and described in Table 3.9.

Figure 3.12. Bound Procedure Descriptor (PDSC)
Table 3.9. Contents of the Bound Procedure Descriptor (PDSC)

Field Name

Contents

PDSC$W_FLAGS

Vector of flag bits <15:0> that must be a copy of the flag bits (except for KIND bits) contained in the quadword pointed to by PDSC$Q_PROC_VALUE.

PDSC$V_KIND

A 4-bit field <3:0> that identifies the type of procedure descriptor. For a procedure with bound values, this field must specify a value of 0.

PDSC$V_FUNC_RETURN

A 4-bit field <11:8> that describes which registers are used for the function value return (if there is one) and what format is used for those registers.

PDSC$V_FUNC_RETURN in a bound procedure descriptor must be the same as the PDSC$V_FUNC_RETURN of the procedure descriptor for the procedure for which the environment is established.

Table 6.4 lists and describes the possible encoding values of PDSC$V_FUNC_RETURN.

Bits 12—15

Reserved and must be 0.

PDSC$W_SIGNATURE_OFFSET

A 16-bit signed byte offset from the start of the procedure descriptor. This offset designates the start of the procedure signature block (if any). In a bound procedure, a 0 in this field indicates the actual signature block must be sought in the procedure descriptor indicated by the PDSC$Q_PROC_VALUE field. A 1 in this field indicates a standard default signature. (An offset value of 1 is not a valid offset because both procedure descriptors and signature blocks must be quadword aligned. See Section 6.2 for details of the procedure signature block).

Note that a nonzero signature offset in a bound procedure value normally occurs only in the case of bound procedures used as part of the implementation of calls from native OpenVMS Alpha code to translated OpenVMS VAX images. In any case, if a nonzero offset is present, it takes precedence over signature information that might occur in any related procedure descriptor.

PDSC$Q_ENTRY

Address of the transfer code sequence.

PDSC$Q_PROC_VALUE

Value of the procedure to be called by the transfer code. The value can be either the address of a procedure descriptor for the procedure or possibly another bound procedure value.

PDSC$Q_ENVIRONMENT

An environment value to pass to the procedure. The choice of environment value is system implementation specific. For more information, see Section 3.6.4.2.

3.6.4.2. Bound Procedure Value

The procedure value for a bound procedure is a pointer to a bound procedure descriptor that, like all other procedure descriptors, contains at offset 8 the address to which the calling procedure must transfer control (see Figure 3.12). This transfer code is responsible for setting up the dynamic environment needed by the target nested procedure and then completing the transfer of control to the code for that procedure. The transfer code receives in R27 a pointer to its corresponding bound procedure descriptor and thus can fetch any required environment information from that descriptor. A bound procedure descriptor also contains a procedure value for the target procedure that is used to complete the transfer of control.

When the transfer code sequence addressed by PDSC$Q_ENTRY of a bound procedure descriptor is called (by a call sequence such as the one given in Section 3.6.3), the procedure value will be in R27, and the transfer code must finish setting up the environment for the target procedure. The preferred location for this transfer code is directly preceding the code for the target procedure. This saves a memory fetch and a branching instruction and optimizes instruction caches and paging.

The following is an example of such a transfer code sequence. It is an example of a target procedure Q that expects the environment value to be passed in R1 and a linkage pointer in R27.
Q_TRANSFER:
        LDQ     R1,24(R27)      ;Environment value to R1
        LDQ     R27,16(R27)     ;Procedure descriptor address to R27
Q_ENTRY::                       ;Normal procedure entry code starts here

After the transfer code has been executed and control is transferred to Q's entry address, R27 contains the address of Q's procedure descriptor, R26 (unmodified by transfer code) contains the return address, and R1 contains the environment value.

When a bound procedure value such as this is needed, the bound procedure descriptor is usually allocated on the parent procedure's stack.
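The way a single call sequence serves both simple and bound procedures can be modeled in a few lines of Python. This is a toy model, not Alpha code: memory is a dictionary of quadwords, the addresses 0x1000, 0x2000, and 0x7FF0 are arbitrary illustrative values, and the function `call` is hypothetical. The offsets 8, 16, and 24 are those used by the call sequence in Section 3.6.3 and by the transfer code above.

```python
# Toy memory image: address -> quadword. All addresses are illustrative.
SIMPLE_DESC = 0x1000   # procedure descriptor for Q
BOUND_DESC = 0x2000    # bound procedure descriptor that targets Q

memory = {
    SIMPLE_DESC + 8: "Q_ENTRY",      # entry address of Q
    BOUND_DESC + 8: "Q_TRANSFER",    # PDSC$Q_ENTRY: transfer code
    BOUND_DESC + 16: SIMPLE_DESC,    # PDSC$Q_PROC_VALUE
    BOUND_DESC + 24: 0x7FF0,         # PDSC$Q_ENVIRONMENT
}


def call(procedure_value):
    """Model the common call sequence followed by optional transfer code."""
    regs = {"R1": None, "R27": procedure_value}
    target = memory[procedure_value + 8]        # LDQ R26,8(R4); JSR R26,(R26)
    if target == "Q_TRANSFER":
        regs["R1"] = memory[regs["R27"] + 24]   # LDQ R1,24(R27)
        regs["R27"] = memory[regs["R27"] + 16]  # LDQ R27,16(R27)
        target = "Q_ENTRY"                      # falls through to Q's entry code
    return target, regs
```

In both cases control arrives at Q_ENTRY with R27 addressing Q's procedure descriptor; only the bound call also supplies an environment value in R1. The order of the two loads in the transfer code matters, because the second load overwrites the base register R27.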

3.6.5. Entry and Exit Code Sequences

To ensure that the stack can be interpreted at any point during thread execution, all procedures must adhere to certain conventions for entry and exit as defined in this section.

3.6.5.1. Entry Code Sequence

Because the value of FP defines the current procedure, all properties of the environment specified by a procedure's descriptor must be valid before the FP is modified to make that procedure current. In addition, none of the properties specified in the calling procedure's descriptor may be invalidated before the called procedure becomes current. So, until the FP has been modified to make the procedure current, all entry code must adhere to the following rules:
  • All registers specified by this standard as saved across a standard call must contain their original (at entry) contents.

  • No standard calls may be made.


Note

If an exception is raised or occurs in the entry code of a procedure, that procedure's exception handler (if any) will not be invoked because the procedure is not yet current. Therefore, if a procedure has an exception handler, compilers must not move code into the procedure prologue that might cause an exception that would be handled by that handler.

When a procedure is called, the code at the entry address must synchronize (as needed) any pending exceptions caused by instructions issued by the caller, must save the caller's context, and must make the called procedure current by modifying the value of FP as described in the following steps:
  1. If PDSC$L_SIZE is not 0, set register SP = SP − PDSC$L_SIZE.

  2. If PDSC$V_BASE_REG_IS_FP is 1, store the address of the procedure descriptor at 0(SP).

    If PDSC$V_KIND = PDSC$K_KIND_FP_REGISTER, copy the return address to the register specified by PDSC$B_SAVE_RA, if it is not already there, and copy the FP register to the register specified by PDSC$B_SAVE_FP.

    If PDSC$V_KIND = PDSC$K_KIND_FP_STACK, copy the return address to the quadword at the RSA$Q_SAVED_RETURN offset in the register save area denoted by PDSC$W_RSA_OFFSET, and store the registers specified by PDSC$L_IREG_MASK and PDSC$L_FREG_MASK in the register save area denoted by PDSC$W_RSA_OFFSET. (This step includes saving the value in FP).

    Execute TRAPB if required (see Section 9.5.3.2 for details).

  3. If PDSC$V_BASE_REG_IS_FP is 0, load register FP with the address of the procedure descriptor or the address of a quadword that contains the address of the procedure descriptor.

    If PDSC$V_BASE_REG_IS_FP is 1, copy register SP to register FP.

The ENTRY_LENGTH value in the procedure descriptor provides information that is redundant with the setting of a new frame pointer register value. That is, the value could be derived by starting at the entry address and scanning the instruction stream to find the one that updates FP. The ENTRY_LENGTH value included in the procedure descriptor supports the debugger or PCA facility so that such a scan is not required.

Entry Code Example for a Stack Frame Procedure
Example 3.1 is an entry code example for a stack frame. The example assumes that:
  • This is a stack frame procedure

  • Registers R2—4 and F2—3 are saved and restored

  • PDSC$W_RSA_OFFSET = 16

  • The procedure has a static exception handler that does not reraise arithmetic traps

  • The procedure uses a variable amount of stack

If the code sequence in Example 3.1 is interrupted by an asynchronous software interrupt, SP will have a different value than it did at entry, but the calling procedure will still be current.

After an interrupt, it would not be possible to determine the original value of SP by the register frame conventions. If actions by an exception handler result in a nonlocal GOTO call to a location in the immediate caller, then it will not be possible to restore SP to the correct value in that caller. Therefore, any procedure that contains a label that can be the target of a nonlocal GOTO by immediately called procedures must be prepared to reset or otherwise manage the SP at that label.
Example 3.1. Entry Code for a Stack Frame Procedure
      LDA     SP,-SIZE(SP)  ;Allocate space for new stack frame
      STQ     R27,(SP)      ;Set up address of procedure descriptor
      STQ     R26,16(SP)    ;Save return address
      STQ     R2,24(SP)     ;Save first integer register
      STQ     R3,32(SP)     ;Save next integer register
      STQ     R4,40(SP)     ;Save next integer register
      STQ     FP,48(SP)     ;Save caller's frame pointer
      STT     F2,56(SP)     ;Save first floating-point register
      STT     F3,64(SP)     ;Save last floating-point register
      TRAPB                 ;Force any pending hardware exceptions to
                            ; be raised
      MOV     SP,FP         ;Called procedure is now the current procedure
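The stack offsets used in Example 3.1 follow a regular pattern that can be reproduced with a short Python sketch. The ordering rule assumed here (return address first, then the saved integer registers in ascending order, then the saved floating-point registers in ascending order) is inferred from the example itself; the register save area layout is defined elsewhere in this standard, and this section does not restate it. The function name `rsa_layout` is hypothetical.

```python
def rsa_layout(rsa_offset: int, ireg_mask: int, freg_mask: int) -> dict:
    """Map each saved item to its byte offset from the frame base.

    Assumed ordering (inferred from Example 3.1): return address, then
    integer registers in ascending order, then floating-point registers.
    """
    layout = {"RA": rsa_offset}
    offset = rsa_offset + 8
    for n in range(32):
        if (ireg_mask >> n) & 1:
            layout[f"R{n}"] = offset
            offset += 8
    for n in range(32):
        if (freg_mask >> n) & 1:
            layout[f"F{n}"] = offset
            offset += 8
    return layout
```

With PDSC$W_RSA_OFFSET = 16, an integer mask selecting R2, R3, R4, and R29 (FP), and a floating-point mask selecting F2 and F3, the sketch yields offsets 16, 24, 32, 40, 48, 56, and 64, matching the STQ and STT instructions in Example 3.1.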
Entry Code Example for a Register Frame
Example 3.2 assumes that the called procedure has no static exception handler and utilizes no stack storage, PDSC$B_SAVE_RA specifies R26, PDSC$B_SAVE_FP specifies R22, and PDSC$V_BASE_REG_IS_FP is 0:
Example 3.2. Entry Code for a Register Frame Procedure
      MOV     FP,R22        ;Save caller's FP.
      MOV     R27,FP        ;Set FP to address of called procedure's
                            ; descriptor. Called procedure is now the
                            ; current procedure.

3.6.5.2. Exit Code Sequence

When a procedure returns, the exit code must restore the caller's context, synchronize any pending exceptions, and make the caller current by modifying the value of FP. The exit code sequence must perform the following steps:
  1. If PDSC$V_BASE_REG_IS_FP is 1, then copy FP to SP.

    If PDSC$V_KIND = PDSC$K_KIND_FP_STACK, and this procedure saves or restores any registers other than FP and SP, reload those registers from the register save area as specified by PDSC$W_RSA_OFFSET.

    If PDSC$V_KIND = PDSC$K_KIND_FP_STACK, load a scratch register with the return address from the register save area as specified by PDSC$W_RSA_OFFSET. (If PDSC$V_KIND = PDSC$K_KIND_FP_REGISTER, the return address is already in scratch register PDSC$B_SAVE_RA).

    Execute TRAPB if required (see Section 9.5.3.2 for details).

  2. If PDSC$V_KIND = PDSC$K_KIND_FP_REGISTER, copy the register specified by PDSC$B_SAVE_FP to register FP.

  3. If PDSC$V_KIND = PDSC$K_KIND_FP_STACK, reload FP from the saved FP in the register save area.

  4. If a function value is not being returned using the stack (PDSC$V_STACK_RETURN_VALUE is 0), then restore SP to the value it had at procedure entry by adding the value that was stored in PDSC$L_SIZE to SP. (In some cases, the returning procedure will leave SP pointing to a lower stack address than it had on entry to the procedure, as specified in Section 3.7.7).

  5. Jump to the return address (which is in a scratch register).

The called routine does not adjust the stack to remove any arguments passed in memory. This responsibility falls to the calling routine that may choose to defer their removal because of optimizations or other considerations.

Exit Code Example for a Stack Frame
Example 3.3 shows the return code sequence for the stack frame.
Example 3.3. Exit Code Sequence for a Stack Frame
      MOV     FP,SP       ;Chop the stack back
      LDQ     R28,16(FP)  ;Get return address
      LDQ     R2,24(FP)   ;Restore first integer register
      LDQ     R3,32(FP)   ;Restore next integer register
      LDQ     R4,40(FP)   ;Restore next integer register
      LDT     F2,56(FP)   ;Restore first floating-point register
      LDT     F3,64(FP)   ;Restore last floating-point register
      TRAPB               ;Force any pending hardware exceptions to
                          ; be raised
      LDQ     FP,48(FP)   ;Restore caller's frame pointer
      LDA     SP,SIZE(SP) ;Restore SP (SIZE is compiled into PDSC$L_SIZE)
      RET     R31,(R28)   ;Return to caller's code

Interruption of the code sequence in Example 3.3 by an asynchronous software interrupt can result in the calling procedure being the current procedure, but with SP not yet restored to its value in that procedure. The discussion of that situation in entry code sequences applies here as well.
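The pairing of Example 3.1 with Example 3.3 can be checked with a small simulation. The following Python sketch models registers and memory as dictionaries and is not Alpha code. The frame size of 80 bytes and all register values are arbitrary assumptions chosen for illustration; the save offsets are those used in the two examples.

```python
SIZE = 80  # hypothetical frame size: covers offsets 0..71, octaword multiple
SAVE_SLOTS = {"R26": 16, "R2": 24, "R3": 32, "R4": 40,
              "FP": 48, "F2": 56, "F3": 64}
RESTORED = ("R2", "R3", "R4", "F2", "F3")


def entry(regs, mem):
    """Model of Example 3.1: allocate the frame and save the context."""
    regs["SP"] -= SIZE                          # LDA SP,-SIZE(SP)
    mem[regs["SP"]] = regs["R27"]               # STQ R27,(SP)
    for name, offset in SAVE_SLOTS.items():     # STQ/STT saves
        mem[regs["SP"] + offset] = regs[name]
    regs["FP"] = regs["SP"]                     # MOV SP,FP


def exit_(regs, mem):
    """Model of Example 3.3: restore the context, return the target PC."""
    regs["SP"] = regs["FP"]                     # MOV FP,SP
    regs["R28"] = mem[regs["FP"] + SAVE_SLOTS["R26"]]
    for name in RESTORED:
        regs[name] = mem[regs["FP"] + SAVE_SLOTS[name]]
    regs["FP"] = mem[regs["FP"] + SAVE_SLOTS["FP"]]  # last: FP is the base
    regs["SP"] += SIZE                          # LDA SP,SIZE(SP)
    return regs["R28"]                          # RET R31,(R28)


def round_trip() -> bool:
    """Run entry, a register-clobbering body, and exit; check the result."""
    regs = {"SP": 0x8000, "FP": 0x8100, "R26": 0x4004, "R27": 0x1000,
            "R2": 2, "R3": 3, "R4": 4, "F2": 2.5, "F3": 3.5}
    before = dict(regs)
    mem = {}
    entry(regs, mem)
    for name in RESTORED:
        regs[name] = 0          # procedure body overwrites saved registers
    regs["SP"] -= 32            # body uses a variable amount of stack
    return_address = exit_(regs, mem)
    preserved = ("SP", "FP") + RESTORED
    return (return_address == before["R26"]
            and all(regs[k] == before[k] for k in preserved))
```

The simulation shows why the exit code first copies FP to SP when the procedure uses a variable amount of stack, and why the caller's FP is reloaded only after every other load that uses FP as its base register.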

Exit Code Example for a Register Frame
Example 3.4 contains the return code sequence for the register frame.
Example 3.4. Exit Code Sequence for a Register Frame
      MOV     R22,FP      ;Restore caller's FP value
                          ; Caller is once again the current procedure.
      RET     R31,(R26)   ;Return to caller's code

3.7. Data Passing

This section defines the OpenVMS Alpha calling standard conventions for passing data between procedures in a call stack. An argument item represents one unit of data being passed between procedures.

3.7.1. Argument Passing Mechanisms

The OpenVMS Alpha calling standard defines three classes of argument items according to the mechanism used to pass the argument:
  • Immediate value

  • Reference

  • Descriptor

Argument items are not self-defining; interpretation of each argument item depends on agreement between the calling and called procedures.

This standard does not dictate which passing mechanism must be used by a given language compiler. Language semantics and interoperability considerations might require different mechanisms in different situations.

Immediate value

An immediate value argument item contains the value of the data item. The argument item, or the value contained in it, is directly associated with the parameter.

Reference

A reference argument item contains the address of a data item such as a scalar, string, array, record, or procedure. This data item is associated with the parameter.

Descriptor

A descriptor argument item contains the address of a descriptor, which contains structural information about the argument's type (such as array bounds) and the address of a data item. This data item is associated with the parameter.

3.7.2. Argument List Structure

The argument list in an OpenVMS Alpha call is an ordered set of zero or more argument items, which together comprise a logically contiguous structure known as the argument item sequence. An argument item is specified using up to 64 bits.

A 64-bit argument item can be used to pass arguments by immediate value, by reference, and by descriptor. Any combination of these mechanisms in an argument list is permitted.

Although the argument items form a logically contiguous sequence, they are in practice mapped to integer and floating-point registers and to memory in a way that can produce a physically discontiguous argument list. Registers R16 through R21 and F16 through F21 are used to pass the first six items of the argument item sequence. Additional argument items must be passed in a memory argument list that must be located at 0(SP) at the time of the call.

Table 3.10 specifies the standard locations in which argument items can be passed.
Table 3.10. Argument Item Locations

Item          Integer Register   Floating-Point Register   Stack
-----------   ----------------   -----------------------   -------------------------
1             R16                F16
2             R17                F17
3             R18                F18
4             R19                F19
5             R20                F20
6             R21                F21
7 through n                                                0(SP) through (n-7)*8(SP)

The following list summarizes the general requirements that determine the location of any specific argument:
  • All argument items are passed in the integer registers or on the stack, except for argument items that are floating-point data passed by immediate value.

  • Floating-point data passed by immediate value is passed in the floating-point registers or on the stack.

  • Only one location (across an item row in Table 3.10) can be used by any given argument item in a list. For example, if argument item 3 is an integer passed by value, and argument item 4 is a single-precision floating-point number passed by value, then argument item 3 is assigned to R18 and argument item 4 is assigned to F19.

  • A single- or double-precision complex value is treated as two arguments for the purpose of argument-item sequence rules. In particular, the real part of a complex value might be passed as the sixth argument item in register F21, in which case the imaginary part is then passed as the seventh argument item in memory.

    An extended precision complex value is passed by reference using a single integer or stack argument item. (An extended precision complex value is not passed by immediate value because the component extended precision floating values are not passed by value. See also Section 3.7.5.1).

The argument list that includes both the in-memory portion and the portion passed in registers can be read from and written to by the called procedure. Therefore, the calling procedure must not make any assumptions about the validity of any part of the argument list after the completion of a call.
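
The location rules summarized above can be sketched in a few lines of Python. This is an illustrative model only, not part of the standard: the function name assign_locations and the "int"/"fp" labels are hypothetical, where "fp" stands for floating-point data passed by immediate value and "int" for every other argument item.

```python
# Hypothetical helper that maps an argument item sequence to the
# standard locations of Table 3.10.
def assign_locations(kinds):
    locations = []
    for index, kind in enumerate(kinds):
        if index < 6:
            # Items 1-6 use the register numbered for that item position;
            # the other register in the same table row stays unused.
            prefix = "F" if kind == "fp" else "R"
            locations.append(f"{prefix}{16 + index}")
        else:
            # Items 7 through n occupy consecutive quadwords from 0(SP).
            locations.append(f"{(index - 6) * 8}(SP)")
    return locations

# Item 3 is an integer by value, item 4 a floating-point value by value.
print(assign_locations(["int", "int", "int", "fp"]))
# -> ['R16', 'R17', 'R18', 'F19']
```

With five leading integer items followed by a complex value (two "fp" items), the same sketch places the real part in F21 and the imaginary part at 0(SP), matching the complex-value rule in the list above.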

3.7.3. Argument Lists and High-Level Languages

High-level language functional notations for procedure call arguments are mapped into argument item sequences according to the following requirements:
  • Arguments are mapped from left to right to increasing offsets in the argument item sequence. R16 or F16 is allocated to the first argument, and the last quadword of the memory argument list (if any) is allocated to the last argument.

  • Each source language argument corresponds to one or more contiguous Alpha calling standard argument items.

  • Each argument item consists of 64 bits.

  • A null or omitted argument—for example, CALL SUB(A,,B)—is represented by an argument item containing the value 0.

    Arguments passed by immediate value cannot be omitted unless a default value is supplied by the language. (This is to enable called procedures to distinguish an omitted immediate argument from an immediate value argument with the value 0).

    Trailing null or omitted arguments—for example, CALL SUB(A,,)—are passed by the same rules as for embedded null or omitted arguments.
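
As a minimal sketch of the omitted-argument rule, the following Python fragment builds an argument item sequence in which each omitted argument still occupies one item containing 0. The names OMITTED and to_argument_items are hypothetical, and the integers stand in for the addresses of A and B.

```python
OMITTED = None  # hypothetical marker for a null or omitted argument

def to_argument_items(args):
    # Arguments map left to right; an omitted argument keeps its position
    # in the sequence and is represented by an item containing 0.
    return [0 if arg is OMITTED else arg for arg in args]

# CALL SUB(A,,B) with A at 0x1000 and B at 0x2000 (illustrative addresses)
print(to_argument_items([0x1000, OMITTED, 0x2000]))  # -> [4096, 0, 8192]
```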

3.7.4. Unused Bits in Passed Data

Whenever data is passed by value between two procedures in registers (for the first six input arguments and return values), or in memory (for arguments after the first six), the bits not used by the data are sign-extended or zero-extended as appropriate.

Table 3.11 lists, for each data type, its size and the extension type that determines how unused bits are set or cleared in registers and in memory.
Table 3.11. Unused Bits in Passed Data

Data Type                             Type Designator   Data Size (bytes)   Register Extension Type   Memory Extension Type
-----------------------------------   ---------------   -----------------   -----------------------   ---------------------
Byte logical                          BU                1                   Zero64                    Zero64
Word logical                          WU                2                   Zero64                    Zero64
Longword logical                      LU                4                   Sign64                    Sign64
Quadword logical                      QU                8                   Data64                    Data64
Byte integer                          B                 1                   Sign64                    Sign64
Word integer                          W                 2                   Sign64                    Sign64
Longword integer                      L                 4                   Sign64                    Sign64
Quadword integer                      Q                 8                   Data64                    Data64
F_floating                            F                 4                   Hard                      Data32
D_floating                            D                 8                   Hard                      Data64
G_floating                            G                 8                   Hard                      Data64
F_floating complex                    FC                2 * 4               2*Hard                    2*Data32
D_floating complex                    DC                2 * 8               2*Hard                    2*Data64
G_floating complex                    GC                2 * 8               2*Hard                    2*Data64
S_floating                            FS                4                   Hard                      Data32
T_floating                            FT                8                   Hard                      Data64
X_floating                            FX                16                  N/A                       N/A
S_floating complex                    FSC               2 * 4               2*Hard                    2*Data32
T_floating complex                    FTC               2 * 8               2*Hard                    2*Data64
X_floating complex                    FXC               2 * 16              N/A                       N/A
Small structures of 8 bytes or less   N/A               ≤8                  Nostd                     Nostd
Small arrays of 8 bytes or less       N/A               ≤8                  Nostd                     Nostd
32-bit address                        N/A               4                   Sign64                    Sign64
64-bit address                        N/A               8                   Data64                    Data64

Table 3.12 contains the defined meanings for the extension type symbols used in Table 3.11.
Table 3.12. Extension Type Codes

Extension Type   Defined Function
--------------   ----------------------------------------------------------------
Sign64           Sign-extended to 64 bits.
Zero64           Zero-extended to 64 bits.
Data32           Data is 32 bits. The state of bits <63:32> is unpredictable.
2*Data32         Two single-precision parts of the complex value are stored in
                 memory as independent floating-point values (each handled as
                 Data32).
Data64           Data is 64 bits.
2*Data64         Two double-precision parts of the complex value are stored in
                 memory as independent floating-point values (each handled as
                 Data64).
Hard             Passed in the layout defined by the hardware SRM.
2*Hard           Two floating-point parts of the complex value are stored in a
                 pair of registers as independent floating-point values (each
                 handled as Hard).
Nostd            State of all high-order bits not occupied by the data is
                 unpredictable across a call or return.

Because of the varied rules for sign extension of data when passed as arguments, both calling and called routines must agree on the data type of each argument. No implicit data-type conversions can be assumed between the calling procedure and the called procedure.
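
The Sign64 and Zero64 extension types can be illustrated with a short Python sketch. The helper names zero64 and sign64 are hypothetical. The results are shown as unsigned 64-bit bit patterns, which is how an argument item would appear in a register or in a memory quadword.

```python
MASK64 = (1 << 64) - 1

def zero64(value, size_bytes):
    """Zero-extend the low size_bytes of value to a 64-bit item."""
    return value & ((1 << (8 * size_bytes)) - 1)

def sign64(value, size_bytes):
    """Sign-extend the low size_bytes of value to a 64-bit item."""
    bits = 8 * size_bytes
    low = value & ((1 << bits) - 1)
    if low & (1 << (bits - 1)):   # sign bit of the source field is set
        low -= 1 << bits          # make the value negative, then wrap
    return low & MASK64

print(hex(zero64(0xFF, 1)))        # byte logical (BU)     -> 0xff
print(hex(sign64(0xFF, 1)))        # byte integer (B)      -> 0xffffffffffffffff
print(hex(sign64(0xFFFFFFFF, 4)))  # longword logical (LU) -> 0xffffffffffffffff
```

The last line reflects the Table 3.11 entry for longword logical: although the type is unsigned, its extension type is Sign64, so a called procedure that assumed zero-extension would see a different 64-bit value. This is one reason both routines must agree on the data type of each argument.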

3.7.5. Sending Data

This section defines the OpenVMS Alpha calling standard requirements for mechanisms to send data and the order of argument evaluation.

3.7.5.1. Sending Mechanism

As previously defined, the argument-passing mechanisms allowed are immediate value, reference, and descriptor. Requirements for using these mechanisms follow:
  • By immediate value. An argument may be passed by immediate value only if the argument is one of the following:
    • One of the noncomplex scalar data types with a size known (at compile time) to be ≤ 64 bits

    • Either single or double precision complex

    • A record with a known size (at compile time)

    • A set, implemented as a bit vector, with a size known (at compile time) to be ≤ 64 bits

    No form of string or array data type may be passed by immediate value in a standard call.

    Unused high-order bits must be zero-extended or sign-extended, as appropriate depending on the data type, to fill all bits of each argument list item (as specified in Table 3.11).

    A single- or double-precision complex value is passed as two single- or double-precision floating-point values, respectively. Note that the argument count reflects that two argument positions are used rather than just one actual argument.

    A record value, which may be larger than 64 bits, is passed by immediate value as follows:
    • Allocate as many fully occupied argument item positions to the argument value as are needed to represent the argument.

    • The value of the unoccupied bits is undefined in a final, partially occupied argument item position, if any.

    • If an argument position is passed in one of the registers, it can only be passed in an integer register (never in a floating-point register).

    Other argument values that are larger than 64 bits can be passed by immediate value using nonstandard conventions, typically using a method similar to those for passing records. Thus, for example, a 26-byte string can be passed by value in four integer registers.

  • By reference. Nonparametric arguments (arguments for which associated information such as string size and array bounds are not required) can be passed by reference in a standard call. This includes extended precision floating and extended precision complex values.

  • By descriptor. Parametric arguments (arguments for which associated information such as string size and array bounds must be passed to the called procedure) are passed by a single descriptor in a standard call.

Note that extended floating values are not passed using the immediate value mechanism; rather, they are passed using the by-reference mechanism. (However, when by-value semantics are required, it may be necessary to make a copy of the actual parameter and pass a reference to that copy in order to avoid improper alias effects.)

Also note that when a record is passed by immediate value, the component types are not material to how the argument is aligned; the record will always be quadword aligned.
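
The number of argument item positions that a by-value record occupies follows directly from its size. The Python sketch below is illustrative only; the function names are hypothetical.

```python
def record_argument_items(size_bytes):
    """Number of 64-bit argument items needed for a by-value record."""
    return (size_bytes + 7) // 8

def undefined_trailing_bytes(size_bytes):
    """Bytes of the final item whose contents are undefined."""
    return record_argument_items(size_bytes) * 8 - size_bytes

print(record_argument_items(26))     # -> 4
print(undefined_trailing_bytes(26))  # -> 6
```

The 26-byte case matches the string example given earlier in this section: three fully occupied items plus a fourth that carries 2 bytes of data and 6 bytes of undefined content.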

3.7.5.2. Order of Argument Evaluation

Because most high-level languages do not specify the order of evaluation (with respect to side effects) of arguments, those language processors can evaluate arguments in any convenient order. The choice of argument evaluation order and code generation strategy is constrained only by the definition of the particular language. Programs should not depend on the order of evaluation of arguments.

3.7.6. Receiving Data

When it cannot be determined at compile time whether a given in-register argument item is passed in a floating-point register or an integer register, the argument information register can be interpreted at run time to establish where the argument was passed. (See Section 3.6.1 for details.)

3.7.7. Returning Data

A standard function must return its function value by one of the following mechanisms:
  • Immediate value

  • Reference

  • Descriptor

These mechanisms are the only standard means available for returning function values, and they support the important language-independent data types. Functions that return values by any mechanism other than those specified here are nonstandard, language-specific functions.

3.7.7.1. Function Value Return by Immediate Value

This standard defines the following two types of function returns by immediate value:
  • Nonfloating function value return

  • Floating function value return

Nonfloating Function Value Return by Immediate Value
A function value is returned by immediate value in register R0 only if the type of function value is one of the following:
  • Nonfloating-point scalar data type with size known to be ≤ 64 bits

  • Record with size known to be ≤ 64 bits

  • Set, implemented as a bit vector, with size known to be ≤ 64 bits

No form of string or array can be returned by immediate value, and two separate 32-bit entities cannot be combined and returned in R0.

A function value of less than 64 bits returned in R0 must be zero-extended or sign-extended as appropriate, depending on the data type (see Table 3.11), to a full quadword.

Floating Function Value Return by Immediate Value

A function value is returned by immediate value in register F0 only if it is a noncomplex single- or double-precision floating-point value (F, D, G, S, or T).

A function value is returned by immediate value in registers F0 and F1 only if it is a complex single or double-precision floating-point value (complex F, D, G, S, or T).

Note that extended floating-point and extended complex values are returned by reference as described next.
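
The rules for return by immediate value can be summarized as a small decision function. This Python sketch is a simplified model: the function name and the kind labels ("scalar", "record", "set", "float", "complex", and so on) are hypothetical, and "float" and "complex" are assumed to mean the F, D, G, S, or T types only.

```python
def immediate_return_registers(kind, size_bits=None):
    """Return the registers used for an immediate-value return, or None
    when the value must be returned by reference or by descriptor."""
    if kind in ("scalar", "record", "set"):
        # Nonfloating values use R0 only when the size is known
        # to be 64 bits or less.
        if size_bits is not None and size_bits <= 64:
            return ("R0",)
        return None
    if kind == "float":      # noncomplex F, D, G, S, or T
        return ("F0",)
    if kind == "complex":    # complex F, D, G, S, or T
        return ("F0", "F1")
    # Strings, arrays, and extended (X_floating) values are never
    # returned by immediate value.
    return None

print(immediate_return_registers("scalar", 32))  # -> ('R0',)
print(immediate_return_registers("complex"))     # -> ('F0', 'F1')
print(immediate_return_registers("string"))      # -> None
```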

3.7.7.2. Function Value Return by Reference

A function value is returned by reference only if the function value satisfies both of the following criteria:
  • Its size is known to both the calling procedure and the called procedure, but the value cannot be returned by immediate value (for example, because the function value requires more than 64 bits or because the data type is a string or an array type).

  • It can be returned in a contiguous region of storage.

The actual-argument list and the formal-argument list are shifted to the right by one argument item. The new, first argument item is reserved for the function value. This hidden first argument is included in the count and register usage information that is passed in the argument information register (see Section 3.6.1 for details).

The calling procedure must provide the required contiguous storage and pass the address of the storage as the first argument. This address must specify storage naturally aligned according to the data type of the function value.

The called function must write the function value to the storage described by the first argument.

The this Pointer

For C++, when the this pointer is passed as an implicit first parameter and a pointer to a return value buffer is also required, then the this pointer becomes the first parameter, the buffer pointer becomes the second parameter, and the remaining normal parameters are shifted two slots to make this possible.
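
The shifting of the argument list for a by-reference return, including the C++ case, can be sketched as follows. The function name with_return_buffer and the sample addresses are hypothetical; the sketch only models the ordering of argument items.

```python
def with_return_buffer(args, buffer_address, this_pointer=None):
    """Build the argument item sequence for a by-reference return."""
    if this_pointer is None:
        # The hidden buffer address becomes the first argument item.
        return [buffer_address, *args]
    # For C++, the this pointer stays first and the buffer address
    # becomes the second argument item.
    return [this_pointer, buffer_address, *args]

print(with_return_buffer([1, 2], 0x1000))          # -> [4096, 1, 2]
print(with_return_buffer([1, 2], 0x1000, 0x2000))  # -> [8192, 4096, 1, 2]
```

The length of the resulting sequence is the count that the argument information register must reflect, because the hidden argument is included in that count.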

3.7.7.3. Function Value Return by Descriptor

A function value is returned by descriptor only if the function value satisfies all of the following criteria:
  • It cannot be returned by immediate value (for example, because the function value requires more than 64 bits or because the data type is a string or an array type).

  • Its size is not known to either the calling procedure or the called procedure.

  • It can be returned in a contiguous region of storage.

Noncontiguous function values are language specific and cannot be returned as a standard-conforming return value.

Records, noncontiguous arrays, and arrays with more than one dimension cannot be returned by descriptor in a standard call.

Both 32-bit and 64-bit descriptor forms can be used for function values returned by descriptor. See Chapter 8 for details of the descriptor forms.

The use of descriptors for function value return divides into three major cases with return values involving:
  • Dynamic text—Heap-managed strings of arbitrary and dynamically changeable length

  • Return objects created by the calling routine—Function values that are to be returned in an object allocated by and having attributes (bounds, lengths, and so on) specified by the calling routine

  • Return objects created by the called routine—Function values that are returned in an object allocated by and having attributes (bounds, lengths, and so on) specified by the called routine

For correct results to be obtained from this type of function return, the calling and called routines must agree by prior arrangement which of these three major cases applies, and whether 64-bit descriptor forms may be used.

The following paragraphs describe the specialized requirements for each major case:

Dynamic Text

For dynamic text return by descriptor, the calling routine passes a valid (completely initialized) dynamic string descriptor (DSC$B_CLASS = DSC$K_CLASS_D). The called routine must assign a value to the variable represented by this descriptor using the same rules that apply to a dynamic text descriptor used as an ordinary parameter.

Return Object Created by Calling Routine

For a return object created by the calling routine, the calling routine passes a descriptor in which all fields are completely loaded.

The called routine must supply a return value that satisfies that description. In particular, the called routine must truncate or pad the returned value to satisfy the requirements of the descriptor according to the semantics of the language in which the called routine is written.

The calling and called routines must agree by prior arrangement on the DSC$B_CLASS and DSC$B_DTYPE of descriptor to be used.

Return Object Created by Called Routine
For a return object created by the called routine, the calling and called routines must agree by prior arrangement on the DSC$B_CLASS and DSC$B_DTYPE of descriptor to be used. The calling routine passes a descriptor in which:
  • DSC$A_POINTER field is set to 0.

  • DSC$B_CLASS field is loaded.

  • DSC$B_DTYPE field is loaded.

  • DSC$B_DIMCT field is loaded and the DSC$B_AFLAGS field is set to 0 if the descriptor is an array descriptor.

  • All other fields are unpredictable.

If the passed descriptor is an array descriptor, it must contain space for bounds information to be returned even though the DSC$B_AFLAGS field is set to 0.

The called routine must return the function value using stack return conventions and load the DSC$A_POINTER field to point to the returned data. Other descriptor information, such as origin, bounds (if supplied), and DSC$B_AFLAGS fields must be filled in appropriately to correspond to the returned data.

An important implication of a call that uses this kind of value return is that the stack pointer normally is not restored to its value prior to the call as part of the return from the called procedure. The returned value typically (but not necessarily) is left by the called routine somewhere on the stack. For that reason, this mechanism is sometimes known as the stack return mechanism.

After a return of this type, the calling routine must assume that the stack has been extended by some unknown amount (or possibly none) by the called procedure. In particular, the stack cannot be cut back until the returned value is no longer needed (which may be ensured by copying it to another location).

However, this type of return does not imply that the actual storage used by the called routine to hold the returned value must be at the address pointed to by the stack pointer; it need not even be on the stack. It could be in some read-only, static memory. (This latter case might arise when the returned value is constant or is obtained from some constant structure). For this reason, the calling routine must not assume that the data described by the return descriptor is writable.

3.8. Data Allocation

This section describes the standard static data requirements that define the Alpha alignment of data structures, record formats, and record layout. These conventions help to ensure proper data compatibility with all OpenVMS Alpha and VAX languages.

3.8.1. Data Alignment

In the Alpha environment, memory references to data that is not naturally aligned can result in alignment faults, which can severely degrade the performance of all procedures that reference the unaligned data.

To avoid such performance degradation, all data values on Alpha systems should be naturally aligned. Table 3.13 contains information on data alignment.
Table 3.13. Natural Alignment Requirements

Data Type                        Alignment Starting Position
------------------------------   ---------------------------------------------------
8-bit character string           Byte boundary
16-bit integer                   Address that is a multiple of 2 (word alignment)
32-bit integer                   Address that is a multiple of 4 (longword alignment)
64-bit integer                   Address that is a multiple of 8 (quadword alignment)
F_floating, F_floating complex   Address that is a multiple of 4 (longword)
D_floating, D_floating complex   Address that is a multiple of 8 (quadword)
G_floating, G_floating complex   Address that is a multiple of 8 (quadword)
S_floating, S_floating complex   Address that is a multiple of 4 (longword)
T_floating, T_floating complex   Address that is a multiple of 8 (quadword)
X_floating, X_floating complex   Address that is a multiple of 16 (octaword)

For aggregates such as strings, arrays, and records, the data type to be considered for purposes of alignment is not the aggregate itself, but rather the elements of which the aggregate is composed. The alignment requirement of an aggregate is that all elements of the aggregate be naturally aligned. For example, varying 8-bit character strings must start at addresses that are a multiple of at least 2 (word alignment) because of the 16-bit count at the beginning of the string; 32-bit integer arrays start at a longword boundary, irrespective of the extent of the array.

The rules for passing a record in an argument that is passed by immediate value (see Section 3.7.5.1) always provide quadword alignment of the record value independent of the normal alignment requirement of the record. If deemed appropriate by an implementation, normal alignment can be established within the called procedure by making a copy of the record argument at a suitably aligned location.

3.8.2. Record Layout Conventions

The OpenVMS Alpha calling standard rules for record layout are designed to provide good run-time performance on all implementations of the Alpha architecture and to provide the required level of compatibility with conventional VAX operating environments.

Therefore, this standard defines two record layout conventions:
  • Those that provide optimal access characteristics (referred to as aligned record layouts)

  • Those compatible with conventions that are traditionally used by VAX languages (referred to as VAX compatible record layouts)

Only these two record layouts may be used across standard interfaces or between languages. Languages can support other language-specific record layout conventions, but such layouts are nonstandard.

The aligned record layout conventions should be used unless interchange is required with conventional VAX applications that use the OpenVMS VAX compatible record layouts.

3.8.2.1. Aligned Record Layout

The aligned record layout conventions ensure that:
  • All components of a record or subrecord are naturally aligned.

  • Layout and alignment of record elements and subrecords are independent of any record or subrecord in which they are embedded.

  • Layout and alignment of a subrecord is the same as if it were a top-level record.

  • Declaration in high-level languages of standard records for interlanguage use is straightforward and obvious, and meets the requirements for source-level compatibility between Alpha and VAX languages.

The aligned record layout is defined by the following conventions:
  • The components of a record must be laid out in memory corresponding to the lexical order of their appearance in the high-level language declaration of the record.

  • The first bit of a record or subrecord must be directly addressable (byte aligned).

  • Records and subrecords must be aligned according to the largest natural alignment requirements of the contained elements and subrecords.

  • Bit fields (packed subranges of integers) are characterized by an underlying integer type that is a byte, word, longword, or quadword in size together with an allocation size in bits. A bit field is allocated at the next available bit boundary, provided that the resulting allocation does not cross an alignment boundary of the underlying type. Otherwise, the field is allocated at the next byte boundary that is aligned as required for the underlying type. (In the latter case, the space skipped over is left permanently unallocated.) In addition, if necessary, the alignment of the record as a whole is increased to that of the underlying integer type.

  • Unaligned bit strings, unaligned bit arrays, and elements of unaligned bit arrays must start at the next available bit in the record. No fill is ever supplied preceding an unaligned bit string, unaligned bit array, or unaligned bit array element.

  • All other components of a record must start at the next available naturally aligned address for the data type.

  • The length of a record must be a multiple of its alignment. (This includes the case when a record is a component of another record).

  • Strings and arrays must be aligned according to the natural alignment requirements of the data type of which the string or array is composed.

  • The length of an array element is a multiple of its alignment, even if this leaves unused space at its end. The length of the whole array is the sum of the lengths of its elements.
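
For records that contain only naturally aligned scalar components, the conventions above reduce to a simple offset calculation. The following Python sketch is a simplified model under that assumption: it does not handle bit fields, unaligned bit strings, or nested subrecords, and the function and field names are hypothetical.

```python
def aligned_layout(fields):
    """fields: list of (name, size, alignment) in declaration order.
    Returns (offsets, record_size, record_alignment), all in bytes."""
    offset = 0
    record_align = 1
    offsets = {}
    for name, size, align in fields:
        # Start at the next naturally aligned address for this type.
        offset = (offset + align - 1) // align * align
        offsets[name] = offset
        offset += size
        # The record takes the largest alignment of its components.
        record_align = max(record_align, align)
    # The record length must be a multiple of the record alignment.
    record_size = (offset + record_align - 1) // record_align * record_align
    return offsets, record_size, record_align

fields = [("flag", 1, 1), ("count", 4, 4), ("total", 8, 8), ("tag", 2, 2)]
print(aligned_layout(fields))
# -> ({'flag': 0, 'count': 4, 'total': 8, 'tag': 16}, 24, 8)
```

In this example, three bytes of fill follow flag so that count is longword aligned, and six bytes of fill follow tag so that the record length is a multiple of its quadword alignment.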

3.8.2.2. OpenVMS VAX Compatible Record Layout

The OpenVMS VAX compatible record layout is defined by the following conventions:
  • The components of a record must be laid out in memory corresponding to the lexical order of their appearance in the high-level language declaration of the record.

  • Unaligned bit strings, unaligned bit arrays, and elements of unaligned bit arrays must start at the next available bit in the record. No fill is ever supplied preceding an unaligned bit string, unaligned bit array, or unaligned bit array element.

  • All other components of a record must start at the next available byte in the record. Any unused bits following the last-used bit in the last-used byte of each component must be filled out to the next byte boundary so that any following data starts on a byte boundary.

  • Subrecords must be aligned according to the largest alignment of the contained elements and subrecords. A subrecord always starts at the next available byte unless it consists entirely of unaligned bit data and it immediately follows an unaligned bit string, unaligned bit array, or a subrecord consisting entirely of unaligned bit data.

  • Records must be aligned on byte boundaries.
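
By contrast, a record of byte-sized or larger scalar components in the VAX compatible layout is packed with no fill between components. The Python sketch below models only that case (no bit data and no subrecords); the function and field names are hypothetical.

```python
def vax_compatible_layout(fields):
    """fields: list of (name, size_in_bytes) in declaration order.
    Returns (offsets, record_size) in bytes."""
    offset = 0
    offsets = {}
    for name, size in fields:
        offsets[name] = offset  # next available byte, no alignment fill
        offset += size
    return offsets, offset

fields = [("flag", 1), ("count", 4), ("total", 8), ("tag", 2)]
print(vax_compatible_layout(fields))
# -> ({'flag': 0, 'count': 1, 'total': 5, 'tag': 13}, 15)
```

Here count and total start at odd byte offsets, so references to them on Alpha systems are not naturally aligned. This is why Section 3.8.2 recommends the aligned layout unless interchange with VAX applications is required.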

3.9. Multithreaded Execution Environments

This section defines the conventions to support the execution of multiple threads in a multilanguage Alpha environment. Specifically defined is how compiled code must perform stack limit checking. While this standard is compatible with a multithreaded execution environment, the detailed mechanisms, data structures, and procedures that support this capability are not specified in this manual.

For a multithread environment, the following characteristics are assumed:
  • There can be one or more threads executing within a single process.

  • The state of a thread is represented in a thread environment block (TEB).

  • The TEB of a thread contains information that determines a stack limit below which the stack pointer must not be decremented by the executing code (except for code that implements the multithread mechanism itself).

  • Exception handling is fully reentrant and multithreaded.

3.9.1. Stack Limit Checking

A program that is otherwise correct can fail because of stack overflow. Stack overflow occurs when extension of the stack (by decrementing the stack pointer, SP) allocates addresses not currently reserved for the current thread's stack.

Detection of a stack overflow situation is necessary because a thread, attempting to write into stack storage, could modify data allocated in that memory for some other purpose. This would most likely produce unpredictable and undesirable results or irreproducible application failures.

The requirements for procedures that can execute in a multithread environment include checking for stack overflow. This section defines the conventions for stack limit checking in a multithreaded program environment.

In the following sections, the term new stack region refers to the region of the stack from one less than the old value of SP to the new value of the SP.

Stack Guard Region

In a multithread environment, the memory beyond the limit of each thread's stack is protected by contiguous guard pages, which form the stack's guard region.

Stack Reserve Region

In some cases, it is desirable to maintain a stack reserve region, which is a minimum-sized region that is immediately above a thread's guard region. A reserve region may be desirable to ensure that exceptions or asynchronous system traps (ASTs) have stack space to execute on a thread's stack, or to ensure that the exception dispatcher and any exception handler that it might call have stack space to execute after detection of an invalid attempt to extend the stack.

This standard does not require a reserve region.

3.9.1.1. Methods for Stack Limit Checking

Because accessible memory may be available at addresses lower than those occupied by the guard region, compilers must generate code that never extends the stack past the guard pages into accessible memory that is not allocated to the thread's stack.

A general strategy is to access each page of memory down to and possibly including the page corresponding to the intended new value for the SP. If the stack is to be extended by an amount larger than the size of a memory page, then a series of accesses is required that works from higher to lower addressed pages. If any access results in a memory access violation, then the code has made an invalid attempt to extend the stack of the current thread.

Note

An access can be performed by using either a load or a store operation; however, be sure to use an instruction that is guaranteed to make an access to memory. For example, do not use an LDQ R31,* instruction, because the Alpha architecture does not allow any memory access, even a read access, whose result is discarded because of the R31 destination.
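
The general probing strategy can be modeled by computing which addresses would be accessed for a given stack extension. This Python sketch is illustrative only: it computes addresses and does not touch memory, the function name probe_addresses is hypothetical, and the 8192-byte page size is an assumption. In generated code, each address would be accessed by a load or store that is guaranteed to reference memory.

```python
PAGE_SIZE = 8192  # assumed page size in bytes

def probe_addresses(old_sp, new_sp, page_size=PAGE_SIZE):
    """Addresses to access, from higher to lower pages, when the stack
    is extended from old_sp down to new_sp."""
    if new_sp > old_sp:
        raise ValueError("the stack grows toward lower addresses")
    probes = []
    address = old_sp - page_size
    while address > new_sp:
        probes.append(address)   # one access in each intermediate page
        address -= page_size
    probes.append(new_sp)        # finally, the page holding the new SP
    return probes

# Extending the stack by 20000 bytes from 0x20000
print(probe_addresses(0x20000, 0x20000 - 20000))
# -> [122880, 114688, 111072]
```

Because consecutive probes are never more than one page apart, no page between the old and new values of SP is skipped, so an extension that crosses the guard region results in an access violation.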

This standard defines two methods for stack limit checking: implicit and explicit.

Implicit Stack Limit Checking
The following are two mutually exclusive strategies for implicit stack limit checking:
  • If the lowest addressed byte of the new stack region is guaranteed to be accessed prior to any further stack extension, then the stack can be extended by an increment that is equal in size to the guard region (without any further accesses).

  • If some byte (not necessarily the lowest) of the new stack region is guaranteed to be accessed prior to any further stack extension, then the stack can be extended by an increment that is equal in size to one-half the guard region (without any further accesses).

The stack frame format (see Section 3.4.3) and entry code rules (see Section 3.6.5) generally do not ensure access to the lowest address of a new stack region without introducing an extra access solely for that purpose. Consequently, this standard uses the second strategy. While the amount of implicit stack extension that can be achieved is smaller, the check is achieved at no additional cost.

This standard requires that the minimum guard region size be 8192 bytes, the size of the smallest memory protection granularity allowed by the Alpha architecture.

If the stack is being extended by an amount less than or equal to 4096 and a reserve region is not required, then explicit stack limit checking is not required. However, because asynchronous interrupts and calls to other procedures may also cause stack extension without explicit stack limit checking, stack extension with implicit limit checking must adhere to a strict set of conventions as follows:
  • Explicit stack limit checking must be performed unless the amount by which the SP is decremented is known to be less than or equal to 4096 and a reserve region is not required.

  • Some byte in the new stack region must be accessed before the SP can be decremented for a subsequent stack extension.

    This access can be performed either before or after the SP is decremented for this stack extension, but it must be done before the SP can be decremented again.

  • No standard procedure call can be made before some byte in the new stack region is accessed.

  • The system exception dispatcher ensures that the lowest addressed byte in the new stack region is accessed if any kind of asynchronous interrupt occurs after the SP is decremented, but before the access in the new stack region occurs.

These conventions ensure that the stack pointer is not decremented so that it points to accessible storage beyond the stack limit without this error being detected (either by the guard region being accessed by the thread or by an explicit stack limit check failure).

As a matter of practice, the system can provide multiple guard pages in the guard region. When a stack overflow is detected as a result of access to the guard region, one or more guard pages can be unprotected for use by the exception handling facility, and one or more guard pages can remain protected to provide implicit stack limit checking during exception processing. However, the size of the guard region and the number of guard pages is system defined and is not defined by this standard.

Explicit Stack Limit Checking

If the stack is being extended by an amount of unknown size or by a known size greater than the maximum implicit check size (4096), then a code sequence that follows the rules for implicit stack limit checking can be executed in a loop to access the new stack region incrementally in segments no larger than the minimum page size (8192 bytes). At least one access must occur in each such segment.

The first access must occur between SP and SP-4096 because, in the absence of more specific information, the previous guaranteed access relative to the current stack pointer may be as much as 4096 bytes greater than the current stack pointer address.

The last access must be within 4096 bytes of the intended new value of the stack pointer. These accesses must occur in order, starting with the highest addressed segment and working toward the lowest addressed segment.

A more optimal strategy is:
  1. Perform a read access using the intended new value of the stack pointer. This is nondestructive, even if the read is beyond the stack guard region, and may facilitate OS mapping of new stack pages, if appropriate, in a single operation.

  2. Proceed with sequential accesses as just described.


Note

A simple algorithm that is consistent with this requirement (but may perform up to twice the minimum number of accesses) is to perform a sequence of accesses in a loop, starting at the previous value of SP and decrementing by the minimum no-check extension size (4096) down to, but not including, the first value that is less than the new value for the stack pointer.
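The simple algorithm in this note can be sketched in C as follows. This is only an illustration: the function name, the output array, and the constant name are hypothetical and are not defined by this standard. The sketch records the addresses instead of dereferencing them, whereas a real prologue would perform a load or store at each address.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical name for the 4096-byte no-check extension size. */
#define NO_CHECK_EXTENSION 4096u

/*
 * Record the addresses that the simple loop would access when the stack is
 * extended from old_sp down to new_sp. Addresses are produced from highest
 * to lowest: old_sp, old_sp - 4096, ... down to, but not including, the
 * first value that is less than new_sp.
 *
 * Returns the number of addresses recorded, or 0 if new_sp is above old_sp
 * or the output array is too small.
 */
size_t stack_probe_addresses(uint64_t old_sp, uint64_t new_sp,
                             uint64_t *probes, size_t capacity)
{
    size_t count = 0;
    uint64_t addr = old_sp;

    if (new_sp > old_sp)
        return 0;

    for (;;) {
        if (count == capacity)
            return 0;
        probes[count++] = addr;

        /* Stop when the next candidate (addr - 4096) would be below new_sp. */
        if (addr - new_sp < NO_CHECK_EXTENSION)
            break;
        addr -= NO_CHECK_EXTENSION;
    }
    return count;
}
```

For an extension of 10000 bytes from address 65536, the sketch yields three addresses (65536, 61440, and 57344). The last of them lies within 4096 bytes of the intended new stack pointer value.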

The stack must not be extended incrementally in procedure prologues. A procedure prologue that needs to extend the stack by an amount of unknown size or by a known size greater than the maximum implicit check size (4096) must test new stack segments as just described in a loop that does not modify SP, and then update the stack with one instruction that copies the new stack pointer value into the SP.

Note

An explicit stack limit check can be performed either by inline code that is part of a prologue or by a run-time support routine that is tailored to be called from a procedure prologue.

Stack Reserve Region Checking

The size of the reserve region must be included in the increment size used for stack limit checks, after which it is not included in the amount by which the stack is actually extended. (Depending on the size of the reserve region, this may partially or even completely eliminate the ability to use implicit stack limit checking).
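The following C sketch illustrates the distinction between the size used for the check and the size actually allocated. All names are hypothetical. It assumes, following the parenthetical remark above, that implicit checking remains usable whenever the combined check size still fits within the 4096-byte implicit limit.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical name for the 4096-byte implicit check limit. */
#define MAX_IMPLICIT_CHECK 4096u

typedef struct {
    uint64_t check_size;   /* increment used for the stack limit check */
    uint64_t extend_size;  /* amount by which SP is actually decremented */
} stack_extension;

/* The reserve region counts toward the check but is not allocated. */
stack_extension plan_stack_extension(uint64_t frame_size, uint64_t reserve_size)
{
    stack_extension plan;
    plan.check_size = frame_size + reserve_size;
    plan.extend_size = frame_size;
    return plan;
}

/* Assumption: implicit checking is usable if the check size fits the limit. */
bool fits_implicit_limit(stack_extension plan)
{
    return plan.check_size <= MAX_IMPLICIT_CHECK;
}
```

With a 1024-byte frame and a 2048-byte reserve region, the check covers 3072 bytes while SP moves by only 1024 bytes. A 4096-byte reserve region leaves no room for implicit checking of that frame.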

3.9.1.2. Stack Overflow Handling

If a stack overflow is detected, one of the following results:
  • Exception SS$_ACCVIO may be raised.

  • The system may transparently extend the thread's stack, reset the TEB stack limit value appropriately, and continue execution of the thread.

Note that if a transparent stack extension is performed, a stack overflow that occurs in a called procedure might cause the stack to be extended. Therefore, the TEB stack limit value must be considered volatile and potentially modified by external procedure calls and by handling of exceptions.

Chapter 4. OpenVMS I64 Conventions

This chapter describes the fundamental concepts and conventions for calling a procedure in an OpenVMS I64 environment.

4.1. I64 Register Usage

This section describes the register conventions for OpenVMS I64. OpenVMS uses the following register types:
  • General

  • Floating-point

  • Predicate

  • Branch

  • Application

4.1.1. I64 Register Classes

Registers are partitioned into the following classes that define the way a register can be used within a procedure:
  • Scratch registers—may be modified by a procedure call; the caller must save these registers before a call if needed (caller save).

  • Preserved registers—must not be modified by a procedure call; the callee must save and restore these registers if used (callee save). A procedure using one of the preserved general registers must save and restore the caller's original contents, including the NaT bits associated with the registers, without generating a NaT consumption fault.

    One way to preserve a register is not to use it at all.

  • Automatic registers—saved and restored automatically by the hardware call/return mechanism.

  • Constant or Read-only registers—contain a fixed value that cannot be changed by the program.

  • Special registers—used in the calling standard call/return mechanism.

  • Global registers—shared across a set of cooperating routines as global static storage that happens to be allocated in a register. (Details regarding the dynamic lifetime of such storage are not addressed here).

OpenVMS further defines the way that static registers can be used between routines:
  • Special registers—used in the calling standard call/return mechanism. (These are the same as the set of special registers in the preceding list of registers used within a procedure).

  • Input registers—may be used to pass information into a procedure (in addition to the normal stacked input registers).

  • Output registers—may be used to pass information back from a called procedure to its caller (in addition to the normal return value registers).

  • Volatile registers—may be used as scratch registers within a procedure and are not preserved across a call; may not be used to pass information between procedures either as input or output.

4.1.2. I64 General Register Usage

This standard defines the usage of the OpenVMS general registers as listed in Table 4.1. General registers R0 through R31 are termed the static general registers. General registers R32 through R127 are termed the stacked general registers.
Table 4.1. I64 General Register Usage

Register

Class

Usage

R0

Constant

Always 0.

R1

Special

Global data pointer (GP). Designated to hold the address of the currently addressable global data segment. Its use is subject to the following conventions:
  1. On entry to a procedure, GP is guaranteed valid for that procedure.

  2. At any direct procedure call, GP must be valid (for the caller). This guarantees that an import stub (see Section 4.7.3) can access the caller's linkage table.

  3. Any procedure call (indirect or direct) may modify GP unless the call is known to be local to the image.

  4. At procedure return, GP must be valid (for the returning procedure). This allows the compiler to optimize calls known to be local (an exception to convention 3).

The effect of these rules is that GP must be treated as a scratch register at a point of call (that is, it must be saved by the caller), and it must be preserved from entry to exit.

R2

Volatile

May not be used to pass information between procedures, either as inputs or outputs. See also Section 4.1.9.

R3

Scratch

May be used within and between procedures in any mutually consistent combination of ways under explicit user control. See also Section 4.1.9.

R4—R7

Preserved

General-purpose preserved registers. Used for any value that needs to be preserved across a procedure call. May be used within and between procedures in any mutually consistent combination of ways under explicit user control. See also Section 4.1.9.

R8—R9

Scratch

Return Value. Can also be used as input (whether or not the procedure has a return value), but not in any additional ways. In addition, R9 is the preferred and recommended register to use when passing the environment value when calling a bound procedure. (See Section 4.7.7 and Section 6.1.2).

R10—R11

Scratch

May be used within and between procedures in any mutually consistent combination of ways under explicit user control. See also Section 4.1.9.

R12

Special

Memory stack pointer (SP). Holds the lowest address of the current stack frame. At a call, the stack pointer must point to a 0 mod 16 aligned area. The stack pointer is also used to access any memory arguments upon entry to a function. Except in the case of dynamic stack allocation, code can use the stack pointer to reference stack items without having to set up a frame pointer for this purpose.

R13

Special

Reserved as a thread pointer (TP).

R14—R18

Volatile

May not be used to pass information between procedures, either as inputs or outputs. See also Section 4.1.9.

R19—R24

Scratch

May be used within and between procedures in any mutually consistent combination of ways under explicit user control. See also Section 4.1.9.

R25

Special

Argument information (see Section 4.7.5.3).

R26—R31

Scratch

May be used within and between procedures in any mutually consistent combination of ways under explicit user control. See also Section 4.1.9.

IN0—IN7

Automatic

Stacked input registers. Code may allocate a register stack frame of up to 96 registers with the ALLOC instruction, and partition this frame into three regions: input registers (IN0, IN1, ...), local registers (LOC0, LOC1, ...), and output registers (OUT0, OUT1, ...). R32—R39 (IN0—IN7) are used as incoming argument registers. Arguments beyond these registers appear in memory, as explained in Section 4.7.4.

LOC0—LOC95

Automatic

Stacked local registers. Code may allocate a register stack frame of up to 96 registers with the ALLOC instruction, and partition this frame into three regions: input registers (IN0, IN1, ...), local registers (LOC0, LOC1, ...), and output registers (OUT0, OUT1, ...). LOC0-LOC95 are used for local storage. See Section 4.7.4 for more information.

OUT0—OUT7

Scratch

Stacked output registers. Code may allocate a register stack frame of up to 96 registers with the ALLOC instruction, and partition this frame into three regions: input registers (IN0, IN1, ...), local registers (LOC0, LOC1, ...), and output registers (OUT0, OUT1, ...). OUT0-OUT7 are used to pass the first eight arguments in calls. See Section 4.7.4 for more information.
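As an illustration of how the three regions of a register stack frame map onto the stacked general registers, the following C sketch computes the general register number behind each symbolic name. The type and function names are hypothetical. The limit of eight input and eight output registers follows the IN0 through IN7 and OUT0 through OUT7 entries of Table 4.1.

```c
#include <stdbool.h>

#define FIRST_STACKED_REG 32  /* R32 is the first stacked general register */
#define MAX_FRAME_REGS    96  /* R32 through R127 */
#define MAX_ARG_REGS       8  /* IN0..IN7 and OUT0..OUT7 */

/* Hypothetical description of a frame requested with ALLOC. */
typedef struct {
    unsigned inputs;
    unsigned locals;
    unsigned outputs;
} reg_frame;

bool reg_frame_is_valid(reg_frame f)
{
    return f.inputs <= MAX_ARG_REGS &&
           f.outputs <= MAX_ARG_REGS &&
           f.locals <= MAX_FRAME_REGS &&
           f.inputs + f.locals + f.outputs <= MAX_FRAME_REGS;
}

/* Each function returns the general register number, or -1 if out of range. */
int in_reg(reg_frame f, unsigned n)
{
    return n < f.inputs ? (int)(FIRST_STACKED_REG + n) : -1;
}

int loc_reg(reg_frame f, unsigned n)
{
    return n < f.locals ? (int)(FIRST_STACKED_REG + f.inputs + n) : -1;
}

int out_reg(reg_frame f, unsigned n)
{
    return n < f.outputs
        ? (int)(FIRST_STACKED_REG + f.inputs + f.locals + n) : -1;
}
```

For a frame with two inputs, three locals, and four outputs, IN0 is R32, LOC0 is R34, and OUT0 is R37.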

4.1.3. I64 Floating-Point Register Usage

This standard defines the usage of the OpenVMS floating-point registers as listed in Table 4.2. Floating-point registers F0 through F31 are termed the static floating-point registers. Floating-point registers F32 through F127 are termed the rotating floating-point registers.
Table 4.2. I64 Floating-Point Register Usage

Register    Class        Usage
F0          Constant     Always 0.0.
F1          Constant     Always 1.0.
F2-F5       Preserved    Can be used for any value that needs to be preserved across a procedure call. A procedure using one of the preserved floating-point registers must save and restore the caller's original contents without generating a NaT consumption fault.
F6—F7       Scratch      May be used within and between procedures in any mutually consistent combination of ways under explicit user control.
F8—F9       Scratch      Argument/Return values. See Section 4.7.4 and Section 4.7.6 for the OpenVMS specifications for use of these registers.
F10—F15     Scratch      Argument values. See Section 4.7.4 for the OpenVMS specifications for use of these registers.
F16—F31     Preserved    Can be used for any value that needs to be preserved across a procedure call. A procedure using one of the preserved floating-point registers must save and restore the caller's original contents without generating a NaT consumption fault.
F32—F127    Scratch      Rotating registers or scratch registers.


Note

VAX floating-point data is never loaded or manipulated in the Itanium floating-point registers. However, VAX floating-point values may be converted to IEEE floating-point values, which are then manipulated in the I64 floating-point registers.

4.1.4. I64 Predicate Register Usage

Predicate registers are single-bit-wide registers used for controlling the execution of predicated instructions. Predicate registers P0 through P15 are termed the static predicate registers. Predicate registers P16 through P63 are termed the rotating predicate registers. This standard defines the usage of the OpenVMS predicate registers as listed in Table 4.3.
Table 4.3. I64 Predicate Register Usage

Register    Class        Usage
P0          Constant     Always 1.
P1—P5       Preserved    Can be used for any predicate value that needs to be preserved across a procedure call. A procedure using one of the preserved predicate registers must save and restore the caller's original contents.
P6—P13      Scratch      Can be used within a procedure as a scratch register.
P14—P15     Volatile     May not be used to pass information between procedures, either as input or output. See also Section 4.1.9.
P16—P63     Preserved    Rotating registers.

4.1.5. I64 Branch Register Usage

Branch registers are used for making indirect branches. This standard defines the usage of the OpenVMS branch registers as listed in Table 4.4.
Table 4.4. I64 Branch Register Usage

Register    Class        Usage
B0          Scratch      Contains the return address on entry to a procedure; otherwise a scratch register.
B1—B5       Preserved    Can be used for branch target addresses that need to be preserved across a procedure call.
B6—B7       Volatile     May not be used to pass information between procedures, either as input or output. See also Section 4.1.9.

4.1.6. I64 Application Register Usage

Application registers are special-purpose registers designated for application use. This standard defines the usage of the OpenVMS application registers as listed in Table 4.5.
Table 4.5. I64 Application Register Usage

Register

Class

Usage

AR.FPSR

See Usage

Floating-point status register. This register is divided into the following fields:
  • Trap Disable Bits (bits 5–0)—Must be preserved by the callee, except for procedures whose documented purpose is to change these bits.

  • Status Field 0—Must be preserved by the callee, except for procedures whose documented purpose is to change these bits. The flag bits are the IEEE floating-point standard sticky bits and are part of the static state of the machine.

  • Status Field 1—Dedicated for use by divide and square root code, and must always be set to standard values at any procedure call boundary (including entry to exception handlers). These standard values are: trap disable set, round-to-nearest mode, 80-bit (extended) precision, widest range for exponent on, and flush-to-zero mode off. The flag bits are scratch.

  • Status Fields 2 and 3—At procedure calls and returns, the control bits in these status fields must agree with the control bits in status field 0 and the trap disable bits should always be set. The flag bits are always available for scratch use.

See Section 4.1.7 for further usage and initial value information.

AR.RNAT

Automatic

RSE NaT collection register. Holds the NaT bits for values stored by the register stack engine. These bits are saved automatically in the register stack backing store.

AR.UNAT

Preserved

User NaT collection register. Holds the NaT bits for values stored by the ST8.SPILL instruction. As a preserved register, it must be saved before a procedure can issue any ST8.SPILL instructions. The saved copy of AR.UNAT in a procedure's frame holds the NaT bits from the registers spilled by its caller; these NaT bits are thus associated with values local to the caller's caller.

AR.PFS

Special

Previous function state. Contains information that records the state of the caller's register stack frame and epilogue counter. It is overwritten on a procedure call; therefore, it must be saved before issuing any procedure calls, and restored prior to returning.

AR.BSP

Read-only

Backing store pointer. Contains the address in the backing store corresponding to the base of the current frame. This register may be modified only as a side effect of writing AR.BSPSTORE while the Register Stack Engine (RSE) is in enforced lazy mode.

AR.BSPSTORE

Special

Backing store pointer. Contains the address of the next RSE store operation. It may be read or written only while the RSE is in enforced lazy mode. Under normal operation, this register is managed by the RSE, and application code should not write to it, except when performing a stack switching operation.

AR.RSC

See Usage

RSE control; the register stack configuration register. This register is divided into the following fields:
  • Mode—Controls the RSE behavior, and has scratch behavior. On a return, this field may be set to a standard value.

  • Privilege level—Controls the privilege level at which the RSE operates, and may not be changed by non-privileged software.

  • Endian mode—Controls the byte ordering used by the RSE, and must never be changed by an application.

AR.LC

Preserved

Loop counter.

AR.EC

Automatic

Epilogue counter (preserved in AR.PFS).

AR.CCV

Scratch

Compare and exchange comparison value.

AR.ITC

Read-only

Interval time counter.

AR.K0—AR.K7

Read-only

Kernel registers.

AR.CSD

Scratch

Reserved for use as implicit operand registers in future extensions to the Itanium architecture. To ensure forward compatibility, OpenVMS considers these registers as part of the thread and process state.

AR.SSD

Scratch

Reserved for use as implicit operand registers in future extensions to the Itanium architecture. To ensure forward compatibility, OpenVMS considers these registers as part of the thread and process state.

4.1.7. Floating-Point Status

The floating-point status of a program consists of two parts:
  • The AR.FPSR hardware register

  • A supplementary software register (a quadword)

The floating-point status is generally managed using three OpenVMS system services:
  • SYS$IEEE_SET_FP_CONTROL

  • SYS$IEEE_SET_PRECISION_MODE

  • SYS$IEEE_SET_ROUNDING_MODE

The AR.FPSR hardware register is described in the Intel IA-64 Architecture Software Developer's Manual. The supplementary software register is internal to OpenVMS and is not documented for general use. This register holds information used by OpenVMS to implement the three system services and floating-point exception handling generally. It can only be accessed indirectly using the system services.

The floating-point status consists of two types of information:
  • Floating-point control status bits are those bits or flags that control the operation of floating-point arithmetic operations. These bits include the trap disable flags (traps.vd, .dd, .zd, .od, .ud, and .id) as well as the ftz, wre, pc, rc, and td fields in each of the status fields (sf0, sf1, sf2, and sf3) of the AR.FPSR hardware register.

  • Floating-point information status bits are those bits or flags that record summary information about the execution of previous floating-point arithmetic operations. These bits include the v, d, z, o, u, and i flags in each of the status fields (sf0, sf1, sf2, and sf3).


Note

The floating-point control status is sometimes informally also called the floating-point mode or IEEE mode.

Using a compiler or linker switch, you can associate a floating-point control status with the main procedure of a program to set the floating-point state prior to the beginning of program execution. If no control status is explicitly set, a default status appropriate for full IEEE computation is used.

Two floating-point control status settings are of particular interest:
  • Full IEEE-format floating-point control status—the default, unless the status is explicitly set to another value.

  • VAX-format floating-point control status—can be set for programs that use VAX-format floating-point processing.

Table 4.6 shows the values placed in the AR.FPSR hardware register when the Full IEEE-format floating-point control status is used.
Table 4.6. Full IEEE-Format Floating-Point Status Register

Status Field    Flags     td    rc    pc    wre    ftz
sf0             000000    0     00    11    0      0
sf1             000000    1     00    11    1      0
sf2 and sf3     000000    1     00    11    0      0

global trap disable bits: .id, .ud, .od, .zd, .dd, .vd    111111
inherit floating-point mode on thread creation            0

Table 4.7 shows the values placed in the AR.FPSR hardware register when the VAX-format floating-point control status is used.
Table 4.7. VAX-Format Floating-Point Status Register

Status Field    Flags     td    rc    pc    wre    ftz
sf0             000000    0     00    11    0      0
sf1             000000    1     00    11    1      0
sf2 and sf3     000000    1     00    11    0      0

global trap disable bits: .id, .ud, .od, .zd, .dd, .vd    110010
inherit floating-point mode on thread creation            0

For both IEEE-format and VAX-format floating-point processing, additional floating-point status settings may be available. See your compiler documentation for other optional settings.
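To show how the field values in Tables 4.6 and 4.7 combine into a single 64-bit AR.FPSR value, the following C sketch packs them. The bit positions used here are an assumption taken from the Intel Itanium architecture documentation, not from this standard: trap disable bits in bits 5–0, and four 13-bit status fields starting at bits 6, 19, 32, and 45, each with ftz in bit 0, wre in bit 1, pc in bits 3–2, rc in bits 5–4, and td in bit 6. All type and function names are hypothetical. The "inherit floating-point mode on thread creation" setting is not packed here.

```c
#include <stdint.h>

/* Control bits of one status field; the flag bits are zero in both tables. */
typedef struct {
    unsigned td, rc, pc, wre, ftz;
} sf_control;

/* Pack the control bits of one 13-bit status field (flag bits left clear). */
static uint64_t pack_sf(sf_control c)
{
    return (uint64_t)(c.ftz & 1u)
         | (uint64_t)(c.wre & 1u) << 1
         | (uint64_t)(c.pc  & 3u) << 2
         | (uint64_t)(c.rc  & 3u) << 4
         | (uint64_t)(c.td  & 1u) << 6;
}

/* traps is the 6-bit value as written in the tables (.id first, .vd last). */
uint64_t pack_fpsr(unsigned traps, sf_control sf0, sf_control sf1,
                   sf_control sf23)
{
    return (uint64_t)(traps & 0x3Fu)
         | pack_sf(sf0)  << 6
         | pack_sf(sf1)  << 19
         | pack_sf(sf23) << 32   /* sf2 */
         | pack_sf(sf23) << 45;  /* sf3 */
}

/* Table 4.6: all traps disabled (binary 111111). */
uint64_t fpsr_full_ieee(void)
{
    sf_control sf0  = {0, 0, 3, 0, 0};
    sf_control sf1  = {1, 0, 3, 1, 0};
    sf_control sf23 = {1, 0, 3, 0, 0};
    return pack_fpsr(0x3Fu, sf0, sf1, sf23);
}

/* Table 4.7: same status fields, trap disable bits binary 110010. */
uint64_t fpsr_vax(void)
{
    sf_control sf0  = {0, 0, 3, 0, 0};
    sf_control sf1  = {1, 0, 3, 1, 0};
    sf_control sf23 = {1, 0, 3, 0, 0};
    return pack_fpsr(0x32u, sf0, sf1, sf23);
}
```

Under these layout assumptions, the two settings differ only in the low six bits of the register.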

It is generally assumed that the initial floating-point control status will remain unchanged throughout execution of the whole program. However, a procedure (or cooperating group of procedures) may temporarily modify the floating-point control status provided the control status is restored to its value on entry. The control status can be restored by one of three methods: a normal return, resignalling, or unwinding for an exception. See Section 9.5.3.4 for additional information.

Because the floating-point control status can vary and can be changed dynamically (even if later restored), the state of the floating-point control status is generally indeterminate when a routine (especially a shared library routine) is called. Usually this is acceptable. For example, returning a NaN or raising an exception are both valid ways to handle exceptional conditions. However, if correct operation of a routine depends on a particular floating-point control setting, then the called routine must save the control status on entry, set the needed control status, perform its operation, and restore the control status when it exits. (Whether the informational status is similarly saved and restored is unspecified).

4.1.8. User Mask

The User Mask register contains five bits that may be modified by an application program, subject to the following conventions:
  • BE (Big Endian Memory Access Enable) — This bit must never be set on OpenVMS.

  • UP (User Performance Monitor Enable) — This bit is reserved.

  • AC (Alignment Check) — The application may set or clear this bit as desired. If the AC bit is clear, an unaligned memory reference may cause the system to deliver an exception to the application, or the system may emulate the unaligned reference. If the AC bit is set, an unaligned reference will always cause the system to deliver an exception to the application. At program start, the value of this bit on OpenVMS is clear.

  • MFL/MFH (Lower/Upper floating-point registers written) — The application should not clear either of these bits unless the values in the corresponding registers are no longer needed (for example, it may clear the MFH bit when returning from a procedure, because the upper set of floating-point registers is all scratch). Doing so otherwise may cause unpredictable behavior.

4.1.9. Additional Register Usage Information

As described in earlier sections, some registers are volatile and cannot be used to communicate information between routines (see Tables 4.1, 4.3, and 4.4). For example, B6 is used by OTS$JUMP_TO_BPV (see Section 4.7.7).

Of the volatile registers, the following registers are reserved for use by compiled code to communicate with specialized compiler support routines that require out of band information passing:
  • Static general registers R17—R18

  • Predicate register P15

  • Branch register B7

For example, R17 and R18 are used by OTS$CALL_PROC (see Section 6.1.2.3).

The following static general registers may be used within and between procedures in any mutually consistent combination of ways:
  • R3—R7

  • R10—R11

  • R19—R24

  • R26—R31

The normal or default use for these registers is shown in the Class column of Table 4.1. However, using suitable programming language features, it is valid for any of these registers to be used as preserved, scratch, input, output, global or not used. Of course, the unwind information (see Section A.4) for each procedure must accurately describe the actual usage.

Registers R8 and R9 may also be used as inputs (whether or not the procedure has a return value), but not in any additional ways.

General registers whose class is described as constant, special, volatile or automatic in Section 4.1.2 cannot be used in any other way.

Floating-point, predicate, branch, and application registers can be used only according to the class described in Sections 4.1.3 through 4.1.6.

4.2. Address Representation

An address is a 64-bit value used to denote a position in memory. However, for compatibility with OpenVMS VAX and Alpha, many OpenVMS applications and user-mode facilities operate in such a manner that addresses are restricted to values that are representable in 32 bits. This means that OpenVMS addresses can often be stored and manipulated as 32-bit longword values. In such cases, the 32-bit address value is always implicitly or explicitly sign-extended to form a 64-bit address for use by the Itanium hardware.
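The sign extension described above can be sketched in C as follows. The function name is hypothetical. The sketch builds the result with explicit masking so that it does not rely on implementation-defined signed conversions.

```c
#include <stdint.h>

/* Sign-extend a 32-bit longword address to a 64-bit address. */
uint64_t extend_address32(uint32_t addr32)
{
    if (addr32 & 0x80000000u)
        return 0xFFFFFFFF00000000ULL | (uint64_t)addr32;
    return (uint64_t)addr32;
}
```

A longword address with bit 31 clear, such as 0x7FFFFFFF, is unchanged, while 0x80000000 becomes 0xFFFFFFFF80000000.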

4.3. Procedure Representation

A procedure value, sometimes called a function pointer, is a value that uniquely identifies a procedure and can be used to call it.

For OpenVMS, a procedure value is the address of a function descriptor, which consists of at least two quadword fields: the address of the entry point and the GP value required by that procedure.

Every procedure whose address is taken, or might be taken, must have a unique official function descriptor. The address of this function descriptor is used for the procedure value that is passed as a parameter or when two procedure values are compared. For other purposes, additional local function descriptors may be used for efficiency (notably in images other than the image that contains the procedure).
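A minimal C sketch of the smallest form of function descriptor follows. The type names, field names, and field order (entry point first, then GP) are assumptions made for illustration; the descriptor kinds that carry additional fields are not modeled.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout of a two-quadword function descriptor. */
typedef struct {
    uint64_t entry;  /* address of the procedure's entry point */
    uint64_t gp;     /* GP value required by the procedure */
} function_descriptor;

/* A procedure value is the address of a function descriptor. */
typedef const function_descriptor *procedure_value;

/* Procedure values are compared by descriptor address, not by contents. */
int same_procedure(procedure_value a, procedure_value b)
{
    return a == b;
}
```

Because the comparison uses addresses, two descriptors with identical contents still compare as different procedure values. This is why the address of the single official function descriptor must be used wherever a procedure value is passed or compared.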

An official function descriptor for any procedure which might be callable from a VAX or Alpha translated image must include signature information. A local function descriptor used to call a procedure that might be part of a VAX or Alpha translated image must also include additional fields to facilitate the call. Both of these cases are described in Section 6.1.2.

A function descriptor for a bound procedure uses a special pseudo-GP value and includes an uplevel frame pointer. Such function descriptors are described in Section 4.7.7.

The several kinds of function descriptors are summarized in Table 4.8.
Table 4.8. Summary of Function Descriptor Kinds

Kinds and Roles                                                                         Size (Quadwords)
Local function descriptor without translated image support                              2
Local function descriptor with translated image support (jacket function descriptor)    4
Official function descriptor without translated image support                           3
Official function descriptor with translated image support                              3
Bound function descriptor                                                               6

Note that the different kinds of function descriptor are not self-identifying (that is, they do not contain any form of tag or kind field).

4.4. Procedure Types

This calling standard defines the following basic types of procedures:
  • Memory stack procedure—allocates a memory stack and may maintain part or all of its caller's context on that stack.

  • Register stack procedure—allocates only a register stack and maintains its caller's context in registers.

  • Null frame procedure—allocates neither a memory stack nor a register stack and therefore preserves no context of its caller.

    Note

    Unlike an Alpha null frame procedure (see Section 3.4 and Section 3.4.6), an I64 null frame procedure does not execute in the context of its caller because the Intel® Itanium® call instruction (br.call) changes the register set so that only the caller's output registers are accessible in the called routine. The caller's input and local registers cannot be accessed at all. The call instruction also changes the previous frame state (PFS) of the Itanium processor.

A compiler may choose which type of procedure to generate based on the requirements of the procedure in question. A calling procedure does not need to know what type of procedure it is calling.

Every memory stack procedure or register stack procedure must have an associated unwind description (see Appendix A) which describes what type of procedure it is and other procedure characteristics. A null frame procedure may also have an associated unwind description. (A default description applies if not). This data structure is used to interpret the call stack at any given point in a thread's execution. It is typically built at compile time and usually is not accessed at run-time except to support exception processing or other rarely executed code.

Read access to unwind descriptions is provided through the procedural interfaces described in Section 4.8 and Section A.6.

An unwind description for a procedure is provided for the following reasons:
  • To make invocations of that procedure visible to and interpretable by facilities such as the debugger, exception handling system, and the unwinder.

  • To ensure that the context of the caller saved by the called procedure can be restored if an unwind occurs. (For a description of unwinding, see Section 9.7).

4.5. Memory Stack

The memory stack is used for local dynamic storage, spilled registers, and parameter passing. It is organized as a stack of procedure frames, beginning with the main program's frame at the base of the stack, and continuing towards the top of the stack with nested procedure calls. At the top of the stack is the frame for the currently active procedure. (There may be some system-dependent frames at the base of the stack, prior to the main program's frame, but an application program may not make any assumptions about them).

The memory stack begins at an address determined by the operating system, and grows towards lower addresses in memory. The stack pointer register (SP) always points to the lowest address in the current, top-most, frame on the stack.

Each procedure creates its frame on entry by subtracting its frame size from the stack pointer, and removes its frame from the stack on exit by restoring the previous value of SP (usually by adding its frame size, but a procedure may save the original value of SP when its frame size may vary).

Because the register stack is also used for the same purposes as the memory stack, not all procedures need a memory stack frame. However, every non-leaf procedure must save at least its return link and the previous frame marker, either on the register stack or on the memory stack. This ensures that there is an invocation context for every non-leaf procedure on one or both of the stacks.

4.5.1. Procedure Frames

A memory stack procedure frame consists of five regions, as illustrated in Figure 4.1.

Figure 4.1. Procedure Frame
These regions are:
  • Scratch area. This 16-byte region is provided as scratch storage for procedures that are called by the current procedure. Leaf procedures need not allocate this region. A procedure may use the 16 bytes pointed to by the stack pointer (SP) as scratch memory, but the contents of this area are not preserved by a procedure call.

  • Outgoing parameters. Parameters in excess of those passed in registers are stored in this region of the stack frame. A procedure accesses its incoming parameters in the outgoing parameter region of its caller's stack frame.

  • Frame marker (optional). This region may contain information required for unwinding through the stack (for example, a copy of the previous stack pointer).

  • Dynamic allocation. This variable-sized region (initially zero length) can be created as needed.

  • Local storage. A procedure can store local variables, temporaries, and spilled registers in this region. For conventions affecting the layout of this area for spilled registers, see Section A.3.

Whenever control is transferred to another procedure, the stack pointer must be octaword-aligned; at other times there is no stack alignment requirement. (A side effect of this is that the in-memory portion of the argument list will start on an octaword boundary). During a procedure invocation, the SP can never be set to a value higher than the SP at entry to that procedure invocation.

Note

A stack pointer that is not octaword aligned is valid only in a variable-sized frame (see below) because the unwind descriptor (MEM_STACK_F, see Section A.4.1.3) for a fixed-size frame specifies the size in 16-byte units.

An application may not write to memory addresses lower than the stack pointer, because this memory area may be written to asynchronously (for example, as a result of exception processing).

Most procedures are expected to have a fixed-size frame, and the conventions are biased in favor of this. A procedure with a fixed-size frame may reference all regions of the frame with a compile-time constant offset relative to the stack pointer. Compilers should determine the total size required for each region, and pad the local storage area to make the total frame size a multiple of 16 bytes. The procedure can then create the frame by subtracting an immediate constant from the stack pointer in the prologue, and remove the frame by adding the same immediate constant to the stack pointer in the epilogue.
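
The padding rule for fixed-size frames can be sketched in C. This is an illustrative model only, not part of this standard: the structure `frame_regions` and the function names are hypothetical, and the dynamic allocation region is omitted because it is initially zero length.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical record of region sizes, in bytes, for a fixed-size frame. */
struct frame_regions {
    uint64_t scratch;       /* 16 for a non-leaf procedure, 0 for a leaf */
    uint64_t outgoing;      /* outgoing parameters not passed in registers */
    uint64_t frame_marker;  /* optional, may be 0 */
    uint64_t local;         /* locals, temporaries, spilled registers */
};

/* Round n up to the next multiple of 16 bytes (one octaword). */
uint64_t round_up_octaword(uint64_t n)
{
    return (n + 15u) & ~(uint64_t)15u;
}

/* Total frame size; any padding is notionally added to local storage. */
uint64_t fixed_frame_size(const struct frame_regions *r)
{
    uint64_t raw = r->scratch + r->outgoing + r->frame_marker + r->local;
    return round_up_octaword(raw);
}

/* SP after the prologue subtracts the frame size as a single constant. */
uint64_t sp_after_prologue(uint64_t sp_at_entry, const struct frame_regions *r)
{
    /* SP is octaword-aligned whenever control is transferred to a procedure. */
    assert((sp_at_entry & 15u) == 0);
    return sp_at_entry - fixed_frame_size(r);
}
```

For example, regions of 16, 24, 0, and 20 bytes total 60 bytes, so the frame is padded to 64 bytes and SP remains octaword-aligned for any calls the procedure makes.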

If a procedure has a variable-size frame (for example, a C routine that calls the alloca built-in), it should make a copy of SP to serve as a frame pointer before subtracting the initial frame size from the stack pointer. The procedure can then restore the previous value of the stack pointer in the epilogue without regard for how much dynamic storage has been allocated within the frame. It can also use the frame pointer to access the local storage region, because offsets from SP will vary.

A frame pointer, as described above, is not required if both of the following conditions are true:
  • The procedure uses an equivalent method of addressing the local storage region correctly before and after dynamic allocation.

  • The code satisfies the conditions imposed by the stack unwind mechanism.

To expand a stack frame dynamically, the scratch area, outgoing parameters, and frame marker regions (which are always located relative to the current stack pointer), must be relocated to the new top of stack. If the scratch area and outgoing parameter area are both clear of any live values, there is no actual work involved in relocating these areas. For procedures with dynamically-sized frames, it is recommended that the previous stack pointer value be stored in a local stacked general register instead of the frame marker, so that the frame marker is also empty. If the previous stack pointer is stored in the frame marker, the code must take care to ensure that the stack is always unwindable while the stack is being expanded (see Appendix A).

Other issues depend on the compiler and the code being compiled. The standard calling sequence does not define a maximum stack frame size, nor does it restrict how a language system uses any stack frame region beyond those purposes described here. For example, the outgoing parameter region can be used as scratch storage whenever it is not needed for passing parameters.

4.5.2. Stack Overflow Detection

This section defines the conventions that support the execution of multiple threads in a multilanguage OpenVMS environment. Specifically, it defines how compiled code must perform stack limit checking. While this standard is compatible with a multithreaded execution environment, the detailed mechanisms, data structures, and procedures that support this capability are not specified in this manual.

For a multithreaded environment, the following characteristics are assumed:
  • There can be one or more threads executing within a single process.

  • The state of a thread is represented in a thread environment block (TEB).

  • The TEB of a thread contains information that determines a stack limit below which the stack pointer must not be decremented by the executing code (except for code that implements the multithreaded mechanism itself).

  • Exception handling is fully reentrant and multithreaded.

4.5.2.1. Stack Limit Checking

A program that is otherwise correct can fail because of stack overflow. Stack overflow occurs when extension of the stack (by decrementing the stack pointer, SP) allocates addresses not currently reserved for the current thread's stack. This section defines the conventions for stack limit checking in a multithreaded environment.

In the following sections, the term new stack region refers to the region of the stack from one less than the old value of SP to the new value of SP.

Stack Guard Region

In a multithreaded environment, the address space beyond each thread's stack is protected by contiguous guard pages, which trap on any access. These pages form the stack guard region.

Stack Reserve Region

In some cases, it is useful to maintain a stack reserve region, which is a minimum-sized region that is between the current top of stack and the stack guard region. A stack reserve region can ensure that the following conditions exist:
  • Exceptions or asynchronous system traps (ASTs, analogous to asynchronous signals) have stack space to execute on a thread's stack.

  • The exception dispatcher and any exception handler that it might call have stack space to execute after detection of an invalid attempt to extend the stack.

This calling standard does not require a stack reserve region, but it does allow a language (for example, Ada) and its run-time system to implement one.

4.5.2.1.1. Methods for Stack Limit Checking

Because accessible memory may be available at addresses lower than those occupied by the stack guard region, compilers must generate code that never extends the stack past the stack guard region into accessible memory that is not allocated to the thread's stack.

A general strategy to prevent extending the stack past the stack guard region is to access each page of memory down to and possibly including the page corresponding to the intended new value of the SP. If the stack is to be extended by an amount larger than the size of a memory page, then a series of accesses is required that works from higher to lower addressed pages. If any access results in a memory access violation, then the code has made an invalid attempt to extend the stack of the current thread.

This calling standard defines two methods for stack limit checking, implicit and explicit, which are explained in the following sections.

Implicit Stack Limit Checking

If a byte (not necessarily the lowest) of the new stack region is guaranteed to be accessed prior to any further stack extension, then the stack can be extended by an increment that is up to one-half the stack guard region (without any additional accesses).

This standard requires that the minimum stack guard region size is 8192 bytes.

If the stack is being extended by 4096 bytes or less and the application does not use a stack reserve region, then explicit checking is not required. However, because asynchronous interrupts and calls to other procedures may also cause stack extension without explicit checking, stack extension with implicit checking must adhere to the following rules:
  • Explicit stack limit checking must be performed unless the amount by which the SP is decremented is known to be less than or equal to 4096 and the application does not use a stack reserve region.

  • Some byte in the new stack region must be accessed before the SP can be further decremented for a subsequent stack extension.

    This access can be performed either before or after the SP is decremented for this stack extension, but it must be done before the SP can be decremented again.

  • No standard procedure call can be made before some byte in the new stack region is accessed.

  • The system exception dispatcher ensures that the lowest addressed byte in the new stack region is accessed if any kind of asynchronous interrupt occurs both after the SP is decremented and before the access in the new stack region occurs.

These conventions ensure that the stack pointer is not decremented so that it points to accessible storage beyond the stack limit without this error being detected (either by the guard region being accessed by the thread or by an explicit stack limit check failure).
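
The first rule above can be expressed as a simple predicate. The following C sketch is illustrative only; the function and macro names are hypothetical, and the constants are the 8192-byte minimum guard region size and the 4096-byte limit stated in this section.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MIN_GUARD_REGION_SIZE   8192u  /* minimum guard region size */
#define MAX_IMPLICIT_CHECK_SIZE 4096u  /* one-half of that minimum */

/* Largest extension allowed without additional accesses for a given
   guard region size: up to one-half of the guard region. */
uint64_t max_unchecked_extension(uint64_t guard_region_size)
{
    assert(guard_region_size >= MIN_GUARD_REGION_SIZE);
    return guard_region_size / 2;
}

/* Compiled code can rely only on the minimum guard region size, so the
   rule is stated in terms of the fixed 4096-byte limit. */
bool explicit_check_required(uint64_t sp_decrement,
                             bool uses_stack_reserve_region)
{
    return uses_stack_reserve_region
        || sp_decrement > MAX_IMPLICIT_CHECK_SIZE;
}
```

The predicate covers only the first rule. The remaining rules (accessing a byte of the new stack region before any further extension or standard call) constrain the generated code sequence and are not modeled here.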

As a matter of practice, the system can provide multiple guard pages in the stack guard region. When a stack overflow is detected as a result of access to the stack guard region, one or more guard pages can be unprotected for use by the exception handling facility, as long as one or more guard pages remain protected to provide implicit stack limit checking during exception processing.

Explicit Stack Limit Checking

If the stack is being extended by an unknown amount or by a known amount that is greater than the maximum implicit check size (4096), then a code sequence that follows the rules for implicit stack limit checking can be executed in a loop to access the new stack region incrementally in segments that are less than or equal to the minimum stack guard region size (8192). At least one access must occur in each such segment.

The first access must occur between SP and SP-4096, because in the absence of more specific information, the previous guaranteed access relative to the current stack may be as much as 4096 bytes greater than the current stack pointer address.

The last access must be within 4096 of the intended new value of the stack pointer. These accesses must occur in order, starting with the highest addressed segment and working toward the lowest addressed segment.

A more efficient strategy is:
  1. Perform a read access using the intended new value of the stack pointer. This is nondestructive, even if the read is beyond the stack guard region, and may facilitate OS mapping of new stack pages, if appropriate, in a single operation.

  2. Proceed with sequential accesses as just described.


Note

A simple algorithm that is consistent with this requirement (but may perform up to twice the minimum number of accesses) is to perform a sequence of accesses in a loop starting with the previous value of SP, decrementing by the minimum no-check extension size (4096) to, but not including, the first value that is less than the new value for the stack pointer.

The stack must not be extended incrementally in procedure prologues. A procedure prologue that needs to extend the stack by an amount of unknown size or known size greater than the maximum implicit check size must test new stack segments as just described in a loop that does not modify SP, and then update the stack with one instruction that copies the new stack pointer value into the SP.
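
The simple probing loop described in the preceding note can be modeled in C. This sketch is illustrative only: the function name `plan_stack_probes` is hypothetical, and the function records the addresses that would be accessed rather than touching memory, so that it can run on any host. Real prologue code would perform a read at each address and then set SP once.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PROBE_STRIDE 4096u  /* 4096-byte step used by the simple algorithm */

/* Records the addresses probed when extending the stack from old_sp
   down to new_sp. SP itself is not modified by the loop.
   Returns the number of probes recorded. */
size_t plan_stack_probes(uint64_t old_sp, uint64_t new_sp,
                         uint64_t *probes, size_t capacity)
{
    assert(new_sp <= old_sp);  /* the stack grows toward lower addresses */

    size_t n = 0;
    /* Start at the previous SP and stop before the first value that is
       less than the new SP. */
    for (uint64_t addr = old_sp; addr >= new_sp; addr -= PROBE_STRIDE) {
        assert(n < capacity);
        probes[n++] = addr;
        if (addr < PROBE_STRIDE) {
            break;  /* avoid unsigned wraparound near address zero */
        }
    }
    return n;
}
```

For an extension of 10000 bytes from 0x20000, the recorded probes are 0x20000, 0x1F000, and 0x1E000. The last probe lies within 4096 bytes of the new SP, and the accesses proceed from higher to lower addresses as required.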

Note

An explicit stack limit check can be performed either by inline code that is part of a prologue or by a run-time support routine that is tailored to be called from a procedure prologue.

Stack Reserve Region Checking

The size of the stack reserve region must be included in the increment size used for stack limit checks, after which it is not included in the amount by which the stack is actually extended. (Depending on the size of the stack reserve region, this may partially or even completely eliminate the ability to use implicit stack limit checking).
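
The distinction between the checked increment and the actual extension can be sketched as follows. This is one reading of the rule, offered as an assumption rather than a definitive implementation; the structure and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical result of planning one stack extension. */
struct stack_extension {
    uint64_t new_sp;           /* value actually copied into SP */
    uint64_t check_limit;      /* lowest address the limit check covers */
    uint64_t check_increment;  /* increment size used for the check */
};

struct stack_extension plan_extension(uint64_t old_sp,
                                      uint64_t frame_size,
                                      uint64_t reserve_size)
{
    assert(frame_size + reserve_size <= old_sp);

    struct stack_extension e;
    /* The check covers the frame plus the reserve region ... */
    e.check_increment = frame_size + reserve_size;
    e.check_limit = old_sp - e.check_increment;
    /* ... but SP moves only by the frame size. */
    e.new_sp = old_sp - frame_size;
    return e;
}
```

With a 256-byte frame and an 8192-byte reserve region, the check increment is 8448 bytes, which already exceeds 4096, so implicit checking cannot be used even though SP moves by only 256 bytes.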

4.6. Register Stack

General registers R32 through R127 form a register stack that is automatically managed across procedure calls and returns. Each procedure frame on the register stack is divided into two dynamically-sized regions: one for input parameters and local variables, and one for output parameters.

On a procedure call, the registers are automatically renamed by the hardware so that the caller's output registers form the base of the register stack frame of the callee. On return, the registers are restored to the previous state, so that the input and local registers are preserved across the call.

The ALLOC instruction is used at the beginning of a procedure to allocate the input, local, and output regions; the sizes of these regions are supplied as immediate operands. A procedure is not required to issue an ALLOC instruction if it does not need to store any values in its register stack frame. It may write to the first N stacked registers, where N is the value of the argument count passed in the argument information (AI) register (see Section 4.7.5.3). It may not write to any other stack register without first issuing an ALLOC instruction.

Figure 4.2 illustrates the operation of the register stack across an example procedure call. In this example, the caller allocates eight input, twelve local, and four output registers; the callee allocates four input, six local, and five output registers with the following instruction:
  ALLOC R36=AR.PFS, 4, 6, 5, 0

The actual registers to which the stacking registers are physically mapped are not directly addressable by the application software.
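
The register numbering in this example can be worked through with a small C model. The sketch is illustrative only; the structure `rs_frame` and the function names are hypothetical, and the model describes virtual register names only, not the physical register file.

```c
#include <assert.h>

#define FIRST_STACKED_REG 32u  /* R32 */
#define MAX_STACKED_REGS  96u  /* R32 through R127 */

/* Region sizes of one register stack frame. */
struct rs_frame {
    unsigned inputs;
    unsigned locals;
    unsigned outputs;
};

unsigned first_local_reg(const struct rs_frame *f)
{
    return FIRST_STACKED_REG + f->inputs;
}

unsigned first_output_reg(const struct rs_frame *f)
{
    return FIRST_STACKED_REG + f->inputs + f->locals;
}

unsigned last_frame_reg(const struct rs_frame *f)
{
    unsigned size = f->inputs + f->locals + f->outputs;
    assert(size > 0 && size <= MAX_STACKED_REGS);
    return FIRST_STACKED_REG + size - 1;
}

/* Name, in the caller's frame, of a register that the callee sees as
   callee_reg. Only the caller's output registers are shared. */
unsigned caller_name_of(const struct rs_frame *caller, unsigned callee_reg)
{
    assert(callee_reg >= FIRST_STACKED_REG);
    assert(callee_reg - FIRST_STACKED_REG < caller->outputs);
    return first_output_reg(caller) + (callee_reg - FIRST_STACKED_REG);
}
```

For the caller (eight input, twelve local, four output), the outputs are R52 through R55. After the call they are renamed R32 through R35 in the callee, whose locals are R36 through R41 and whose outputs are R42 through R46.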

4.6.1. Input and Local Registers

The hardware makes no distinction between input and local registers. The caller's output registers automatically become the callee's register stack frame on a procedure call, with all registers initially allocated as output registers. An ALLOC instruction may increase or decrease the total size of the register stack frame, and may adjust the boundary between the input and local region and the output region.

The software conventions specify that up to eight general registers are used for parameter passing. Any registers in the input and local region beyond those eight may be allocated for use as preserved locals. Floating-point parameters may produce holes in the parameter list that is passed in the general registers; those unused input registers may also be used for preserved locals.

The caller's output registers do not need to be preserved for the caller. Once an input parameter is no longer needed, or has been copied elsewhere, that register may be reused for any other purpose within the procedure.

Figure 4.2. Operation of the Register Stack

4.6.2. Output Registers

Up to eight output registers are used for passing parameters. If a procedure call requires fewer than eight general registers for its parameters, the calling procedure does not need to allocate more than are needed. If the called procedure expects more parameters, it will allocate extra input registers; these registers will be uninitialized.

A procedure may also allocate more than eight registers in the output region. While the extra registers may not be used for passing parameters, they can be used as extra scratch registers. On a procedure call, they will show up in the called procedure's output area as excess registers, and may be modified by that procedure. The called procedure may also allocate few enough total registers in its stack frame that the top of the called procedure's frame is lower than the caller's top-of-frame, but those registers will become available again when control returns to the caller.

4.6.3. Rotating Registers

A subset of the registers in the procedure frame may be designated as rotating registers. The rotating register region always starts with R32, and may be any multiple of eight registers in number, up to a maximum of 96 rotating registers. The renaming is under control of the Register Rename Base (RRB).

If the rotating registers include any or all of the output registers, software must be careful when using the output registers for passing parameters, because a non-zero RRB will change the virtual register numbers that are part of the output region. In general, software should ensure either that the rotating region does not overlap the output region, or that the RRB is cleared to zero before setting output parameter registers.
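
The size constraint on the rotating region, and the overlap condition that software should avoid, can be checked as follows. This C sketch is illustrative only; the function names and the parameter names `sol`, `sof`, and `rotating` (number of input and local registers, total frame size, and number of rotating registers) are hypothetical labels chosen for this example.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_STACKED_REGS 96u

/* The rotating region is a multiple of eight registers, at most 96. */
bool valid_rotating_size(unsigned rotating)
{
    return rotating % 8u == 0 && rotating <= MAX_STACKED_REGS;
}

/* The rotating region starts at R32 and covers `rotating` registers.
   The output region starts at R32 + sol and ends at R32 + sof - 1.
   The two overlap when outputs exist and rotation extends past the
   input and local region. */
bool rotating_overlaps_outputs(unsigned sol, unsigned sof, unsigned rotating)
{
    assert(sol <= sof && sof <= MAX_STACKED_REGS);
    return sof > sol && rotating > sol;
}
```

When the overlap predicate is true, the text above recommends clearing the RRB to zero before output parameter registers are set.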

4.6.4. Frame Markers

The current application-visible state of the register stack is stored in an architecturally inaccessible register called the current frame marker. On a procedure call, this register is automatically saved by copying it to the previous function state application register (AR.PFS). The current frame marker is modified to describe a new stack frame whose input and local area is initially zero size, and whose output area is equal in size to the previous output area. On return, the previous function state register is used to restore the current frame marker to its earlier value, and the base of the register stack is adjusted accordingly.

It is the responsibility of a procedure to save the previous function state register before issuing any procedure calls of its own, and to restore it before returning.

4.6.5. Backing Store for Register Stack

When the depth of the procedure call stack exceeds the capacity of the physical register file, the hardware frees physical registers by saving them to a backing store in memory. This backing store is distinct from the memory stack described in Section 4.5.

As returns unwind the procedure call stack, the hardware also restores previously-saved physical registers from the backing store.

The operation of this register stack engine (RSE) is mostly transparent to application software. While the RSE is running, application software may not examine the contents of the backing store, and may not make any assumptions about how much of the register stack is still in physical registers or in the backing store. In order to examine previous stack frames, application software must synchronize the RSE with the FLUSHRS instruction. Synchronizing the RSE forces all stack frames up to, but not including, the current frame to be saved in backing store, allowing the software to examine the contents of the backing store without asynchronous operations modifying the memory. Modifications to the backing store require setting the RSE to enforced lazy mode after synchronizing it, which prevents the RSE from doing any operations other than those required by calls and returns. The procedure for synchronizing the RSE and setting the mode is described in the Itanium® Software Conventions and Runtime Architecture Guide.

The backing store grows towards higher addresses. The top of the stack, which corresponds to the top of the previous procedure frame, is available in the Backing Store Pointer (BSP) application register. The BSP must always point to a valid backing store address, because the operating system may need to start the RSE to process an exception.

Backing store overflow is automatically detected by the OpenVMS operating system, which will either extend the backing store to allow continued operation or will raise an exception. Unlike for the memory stack (see Section 4.5), there are no specific rules or requirements that must be satisfied to facilitate detection of backing store overflow.

A NaT collection register is stored into the backing store following each group of 63 physical registers. The NaT bit of each register stored is shifted into the collection register. When the BSP reaches the quadword just before a 64-quadword boundary, the RSE stores the collection register. Software can determine the position of the NaT collection registers in the backing store by examining the memory address. This process is described in greater detail in the Intel IA-64 Architecture Software Developer's Manual.
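
Determining the position of a NaT collection register from the memory address can be sketched in C. The sketch assumes that the quadword just before a 64-quadword boundary is the one whose address bits 8 through 3 are all ones; the function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True if the quadword-aligned backing store address holds a NaT
   collection register: it is the 64th quadword (index 63) of its
   naturally aligned 64-quadword (512-byte) block. */
bool is_nat_collection_slot(uint64_t addr)
{
    assert((addr & 7u) == 0);  /* backing store slots are quadwords */
    return ((addr >> 3) & 0x3Fu) == 0x3Fu;
}

/* Address of the NaT collection slot at or above addr, in the same
   512-byte block (the backing store grows toward higher addresses). */
uint64_t next_nat_collection_slot(uint64_t addr)
{
    assert((addr & 7u) == 0);
    return addr | 0x1F8u;
}
```

Under this assumption, each 512-byte block of the backing store holds 63 saved registers followed by one collection register, which matches the grouping described above.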

4.7. Procedure Linkage

This calling standard states that a standard call (see Section 1.4) can be accomplished in any way that presents the called routine with the required environment. In practice, however, most standard-conforming external calls are implemented with a common sequence of instructions and conventions. Because a common set of call conventions is so pervasive, these conventions are included for reference as part of this standard.

4.7.1. The GP Register

Every procedure that references statically-allocated data or calls another procedure requires a pointer to an associated short data segment in the GP register, so that it can access its static data and its linkage tables. Typically, an image has one such data segment, and the GP register must be set correctly prior to calling any entry point within that image. Optionally, an image may be partitioned into subcomponents called clusters in which case each cluster may have its own associated data segment (clusters may also share a common data segment). For further information on images and clusters, see the VSI OpenVMS Linker Utility Manual.

Throughout this chapter, rules regarding the use of the GP register are described in terms of images. However, these same rules apply between clusters within an image (keeping in mind that clusters within an image may share a common GP address and short data segment, while images cannot share a common GP address and short data segment).

The linkage conventions require that each image (or cluster) define exactly one GP value to refer to a location within its short data segment. This location should be chosen to maximize the usefulness of short-displacement immediate instructions for addressing scalars and linkage table entries. The image activator determines the absolute value of the GP register for each image after loading its data segment into memory.

Because the GP register remains unchanged for calls within an image, calls known to be local can be optimized accordingly. For calls between images, the GP register must be initialized with the correct GP value for the new image, and the calling function must ensure that its own GP value is saved and restored.

Note that there is a small set of compiler run-time support procedures that take a special pseudo-GP value as a kind of input parameter. See Section 4.7.7 for more information about support for bound function descriptors. See Section 6.1.2 for information about support for translated images.

4.7.2. Types of Calls

The following types of procedure calls are defined:
  • Direct local calls. Direct calls within the same image can be made directly to the entry point of the target procedure. In this case, the GP register does not need to be changed.

  • Direct non-local calls. Calls made outside the same image are routed through an import stub (which can be inlined at compile time if the call is known or suspected to be to another image). The import stub obtains the address of the main entry point and the GP register value from the linkage table. Although coded in source as a direct call, a dynamically-linked call therefore becomes indirect.

  • Indirect calls. A function pointer points to a descriptor that contains both the address of the function entry point and the GP register value for the target function. The compiler must generate code for an indirect call that sets the new GP value before transferring control to the target procedure.

  • Special calls. Other special calling conventions are allowed to the extent that the compiler and the run-time library agree on the conventions, and provided that the stack can be unwound through such a call. Such calls are outside the scope of this document. See Section A.3.1 for a discussion of stack unwind requirements.

4.7.3. Calling Sequence

Direct and indirect procedure calls are described in the following sections. Because the compiler is not required to know whether any given call is local or to a dynamically linked image, the two types of direct calls are described together in Section 4.7.3.1.

4.7.3.1. Direct Calls

Direct procedure calls follow the sequence of steps shown in the following figure. The following paragraphs describe these steps in detail.

Figure 4.3. Direct Procedure Calls
  • Caller: Prepare call. Values in scratch registers that must be kept live across the call must be saved. They can be saved by copying them into local stacked registers, or by saving them on the memory stack. If the NaT bits associated with any live scratch registers must be saved, the compiler should use ST8.SPILL or STF.SPILL instructions. The User NaT collection register itself is preserved by the call, so the NaT bits need no further treatment at this point.

    If the call is not known (at compile time) to be within the same image, the GP register must be saved.

    The parameters must be set up in registers and memory as described in Section 4.7.4.

  • Caller: Call. All direct calls are made with a BR.CALL instruction, specifying B0 for the return link.

    For direct local calls, the PC-relative displacement is computed at link time. Compilers may assume that the standard displacement field in the BR.CALL instruction is sufficiently wide to reach the target of the call. If the displacement is too large, the linker must supply a branch stub at some convenient point in the code; compilers must guarantee the existence of such a point by ensuring that code sections in the relocatable object files are no larger than the maximum reach of the BR.CALL instruction. With a 25-bit displacement, the maximum reach is 16 megabytes in either direction from the point of call.

    Because direct calls to other images cannot be statically bound at link time, the linker must supply an import stub for the target procedure; the import stub obtains the address of the target procedure from the linkage table. The BR.CALL instruction can then be statically bound to the import stub using the PC-relative displacement.

    The BR.CALL instruction performs the following actions:
    • Saves the return link in the return branch register

    • Saves the current frame marker in the AR.PFS register

    • Sets the base of the new register stack frame to the beginning of the output region of the old frame

  • Caller: Import stub (direct non-local calls only). The import stub is allocated in the image of the caller, so that the BR.CALL instruction can be statically bound to the address of the import stub. It must access the linkage table via the current GP (which means that GP must be valid at the point of call), and obtain the address of the target procedure's entry point and its GP value. The import stub then establishes the new GP value and branches to the target entry point.

    If the compiler knows or suspects that the target of a call is in a separate image, it can generate calling code that performs the functions of the import stub, which saves an extra branch.

    When the target of a call is in the same image, an import stub is not used (which also means that GP must be valid at the point of call).

  • Callee: Entry. The prologue code in the target procedure is responsible for allocating the register stack frame. It is also responsible for allocating a frame on the memory stack when necessary. It may use the 16 bytes at the top of its caller's stack frame as a scratch area.

    A non-leaf procedure must save the return branch register and previous function state, either in the memory stack frame or in a local stacked general register.

    The prologue must also save any preserved registers to be used in this procedure. The NaT bits for those registers must be preserved as well, by copying the NaT bits to local stacked general registers, or by using ST8.SPILL or STF.SPILL instructions. However, because ST8.SPILL writes NaT bits into the User NaT collection register (AR.UNAT), and that register is guaranteed to be preserved by the call, AR.UNAT must be saved first.

  • Callee: Exit. The epilogue code is responsible for restoring the return branch register and previous function state, if necessary, and any preserved registers that were saved. The NaT bits must be restored using the LD8.FILL or LDF.FILL instructions. The User NaT collection register must also be restored if it was saved.

    If a memory stack frame was allocated, the epilogue code must deallocate it.

    Finally, the procedure exits by branching through the return branch register with the BR.RET instruction.

  • Caller: After the call. Any saved values (including GP) should be restored.
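
The reach limit mentioned in the call step (a 25-bit displacement giving 16 megabytes in either direction) can be tested as sketched below. The sketch assumes that the displacement is a signed byte offset measured between 16-byte-aligned bundle addresses; the function name is hypothetical, and a linker would apply a test of this kind when deciding whether a branch stub is needed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define BR_CALL_REACH ((uint64_t)1 << 24)  /* 16 megabytes */

/* True if a PC-relative BR.CALL located at call_addr can reach
   target_addr with a 25-bit signed displacement. */
bool br_call_reaches(uint64_t call_addr, uint64_t target_addr)
{
    /* Assumption: both addresses are 16-byte bundle addresses. */
    assert((call_addr & 15u) == 0 && (target_addr & 15u) == 0);

    if (target_addr >= call_addr) {
        /* Forward: largest positive displacement is 2^24 - 16. */
        return target_addr - call_addr < BR_CALL_REACH;
    }
    /* Backward: most negative displacement is -2^24. */
    return call_addr - target_addr <= BR_CALL_REACH;
}
```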

4.7.3.2. Indirect Calls

Indirect procedure calls follow nearly the same sequence as direct calls (see Section 4.7.3.1), except that the branch target is established indirectly. This sequence is illustrated in Figure 4.4.

Figure 4.4. Indirect Procedure Calls
  • Caller: Function Pointer. A function pointer is always the address of a function descriptor for the target procedure (see Section 4.3). An indirect call loads the GP value into the GP register before branching to the entry point address.

    In order to guarantee the uniqueness of a function pointer, and because its value is determined at program invocation time, code must materialize function pointers only by loading a pointer from the data segment.

  • Caller: Prepare call. Indirect calls are made by first loading the function pointer into a general register, loading the entry point address and the new GP value, and using the Move to Branch Register operation to move the address of the procedure entry point into the branch register to be used for the call.

    Values in scratch registers that must be kept live across the call must be saved. They can be saved by copying them into local stacked registers, or by saving them on the memory stack. If the NaT bits associated with any live scratch registers must be saved, the compiler should use ST8.SPILL or STF.SPILL instructions. The User NaT collection register itself is preserved by the call, so the NaT bits need no further treatment at this point.

    Unless the call is known (at compile time) to be within the same image, the GP register must be saved before the new GP value is loaded.

    The parameters must be set up in registers and memory as described in Section 4.7.4.

  • Caller: Call. All indirect calls are made with the indirect form of the BR.CALL instruction, specifying B0 for the return link.

    The BR.CALL instruction saves the return link in the return branch register, saves the current frame marker in the AR.PFS register, and sets the base of the new register stack frame to the beginning of the output region of the old frame. Because the indirect call sequence obtains the entry point address and new GP value from the function descriptor, control flows directly to the target procedure, without the need for any intervening stubs.

  • Callee: Entry; Exit. The remainder of the calling sequence is the same as for direct calls (see Section 4.7.3.1).
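
The role of the function descriptor in an indirect call can be modeled in portable C. This is a simulation under stated assumptions, not the code a compiler would generate: the GP register is represented by the variable `simulated_gp`, the two-field descriptor layout shown here is illustrative (see Section 4.3 for the actual definition), and all names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the GP register (a real GP is a general register). */
uint64_t simulated_gp;

typedef int64_t (*entry_point)(int64_t);

/* Illustrative descriptor: entry point address plus target GP value. */
struct function_descriptor {
    entry_point entry;
    uint64_t gp;
};

/* Example callee that reads the GP established for it. */
int64_t sample_callee(int64_t x)
{
    return x + (int64_t)simulated_gp;
}

/* Indirect call: save the caller's GP, load the new GP and the entry
   point from the descriptor, call, then restore the caller's GP. */
int64_t call_indirect(const struct function_descriptor *fp, int64_t arg)
{
    assert(fp != NULL && fp->entry != NULL);

    uint64_t saved_gp = simulated_gp;  /* caller: prepare call */
    simulated_gp = fp->gp;             /* new GP from the descriptor */
    int64_t result = fp->entry(arg);   /* branch to the entry point */
    simulated_gp = saved_gp;           /* caller: after the call */
    return result;
}
```

Because the descriptor supplies both values, control can pass directly to the target procedure without an import stub, which is the property the indirect call sequence relies on.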

4.7.4. Parameter Passing

Parameters are passed in a combination of general registers, floating-point registers, and memory, as described below, and as illustrated in Figure 4.5.

The parameter list is formed by placing each individual parameter into fixed-size elements of the parameter list, referred to as parameter slots. Each parameter slot is 64 bits wide; parameters larger than 64 bits are placed in as many consecutive parameter slots as are needed to contain the entire parameter. The rules for allocation and alignment of parameter slots are described in Section 4.7.5.1.

The contents of the first eight parameter slots are always passed in registers, while the remaining parameters are always passed on the memory stack, beginning at the caller's stack pointer plus 16 bytes. The caller uses up to eight of the registers in the output region of its register stack for integer and VAX floating-point parameters, and up to eight floating-point registers for IEEE floating-point parameters. The maximum number of registers used is eight.

Figure 4.5. Parameter Passing in Registers and Memory

To accommodate variable argument lists in the C language, there is a fixed correspondence between parameter slots and the registers used to pass them: the first parameter slot is always in either the first general output register or the first floating-point register (never both), the second parameter slot is always in the second general output register or the second floating-point register (never both), and so on. This allows a procedure to spill its register parameters easily to memory to form the argument home area before stepping through the parameter list with a pointer. The Argument Information register (AI) makes this possible, as explained in Section 4.7.5.3.

A procedure can assume that the NaT bits on its incoming general register arguments are clear, and that the incoming floating-point register arguments are not NaTVals. A procedure making a call must ensure only that registers containing actual parameters are clear of NaT bits or NaTVals; registers not used for actual parameters are undefined.

4.7.5. Parameter Passing Mechanisms

This OpenVMS calling standard defines three classes of argument items according to the mechanism used to pass the argument:
  • Immediate value

  • Reference

  • Descriptor

Argument items are not self-defining; interpretation of each argument item depends on agreement between the calling and called procedures.

This standard does not dictate which passing mechanism must be used by a given language compiler. Language semantics and interoperability considerations might require different mechanisms in different situations.

Immediate value

An immediate value argument item contains the value of the data item. The argument item, or the value contained in it, is directly associated with the parameter.

Reference

A reference argument item contains the address of a data item such as a scalar, string, array, record, or procedure. This data item is associated with the parameter.

Descriptor

A descriptor argument item contains the address of a descriptor, which contains structural information about the argument's type (such as array bounds) and the address of a data item. This data item is associated with the parameter.

Requirements for using the argument passing mechanisms follow:
  • By immediate value. An argument may be passed by immediate value only if the argument is one of the following:
    • One of the noncomplex scalar data types with a size known (at compile time) to be ≤ 64 bits

    • Either single or double precision complex

    • A record with a known size (at compile time)

    • A set, implemented as a bit vector, with a size known (at compile time) to be ≤ 64 bits

    No form of string or array data type may be passed by immediate value in a standard call.

    Unused high-order bits must be zero or sign-extended, as appropriate depending on the data type, to fill all bits of each argument list item (as specified in Table 4.10).

    A single-precision or double-precision complex value is passed as two single- or double-precision floating-point values, respectively. Note that the argument count reflects that two argument positions are used rather than just one actual argument.

    A record value, which may be larger than 64 bits, is passed by immediate value as follows:
    • Allocate as many fully occupied argument item positions to the argument value as are needed to represent the argument.

    • If the final argument position is only partially occupied by the argument, the contents of the remaining bits are undefined.

    • If an argument position is passed in one of the registers, it can only be passed in an integer register (never in a floating-point register).

    Other argument values that are larger than 64 bits can be passed by immediate value using nonstandard conventions, typically using a method similar to those for passing records. Thus, for example, a 26-byte string can be passed by value in four integer registers.

  • By reference. Nonparametric arguments (arguments for which associated information such as string size and array bounds is not required) can be passed by reference in a standard call. This includes extended precision floating and extended precision complex values.

  • By descriptor. Parametric arguments (arguments for which associated information such as string size and array bounds must be passed to the called procedure) are passed by a single descriptor in a standard call.

Note that extended floating values are not passed using the immediate value mechanism; rather, they are passed using the by reference mechanism. (However, when by value semantics is required, it may be necessary to make a copy of the actual parameter and pass a reference to that copy in order to avoid improper alias effects).

Also note that when a record is passed by immediate value, the component types are not material to how the argument is aligned; the record will always be quadword aligned.

4.7.5.1. Allocation of Parameter Slots

Parameter slots are allocated for each parameter, based on the parameter passing mechanism, type, and size, treating each parameter in sequence, from left to right. The rules for allocating parameter slots and placing the contents within the slot are given in Table 4.9. The allocation column of the table indicates how parameter slots are allocated to each type of parameter.
Table 4.9. Rules for Allocating Parameter Slots

Type | Size (Bits) | Number of Slots
Integer, small set | 1-64 | 1
Address/pointer (including all types passed by reference or descriptor) | 64 | 1
IEEE single-precision floating-point (S_floating) | 32 | 1
IEEE single-precision floating-point complex (S_floating) | 64 | 2
IEEE double-precision floating-point (T_floating) | 64 | 1
IEEE double-precision floating-point complex (T_floating) | 128 | 2
IEEE quad-precision floating-point (X_floating) | 64 (by reference) | 1
IEEE quad-precision floating-point complex (X_floating) | 64 (by reference) | 1
Aggregates (noncomplex) | any | (size+63)/64
VAX single-precision floating-point (F_floating) | 32 | 1
VAX single-precision floating-point complex (F_floating) | 64 | 2
VAX double-precision floating-point (D_ & G_floating) | 64 | 1
VAX double-precision floating-point complex (D_ & G_floating) | 128 | 2

Note

These rules are applied based on the type of the parameter after any type-promotion rules specified by the language have been applied. For example, a short integer passed without a function prototype in C is promoted to the int type, and is then passed according to the rules for the int type.

OpenVMS does not support passing the Itanium double-precision extended floating-point type (__float80), although that type may be used from time to time in code generation sequences.

This placement policy does not ensure that parameters greater than 64 bits in size will fall on a natural alignment boundary if passed in memory. Such parameters may need to be copied by the called procedure into an aligned temporary prior to use, or accessed in a way that does not depend on natural alignment.
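The slot counts in Table 4.9 for by-value aggregates can be sketched in C as follows. This is an illustrative sketch only; the function name slots_for_size is hypothetical and is not defined by this standard.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: number of 64-bit parameter slots occupied by a
 * by-value aggregate of size_bits bits, using the (size+63)/64 rule
 * from Table 4.9 (integer division rounds the size up to whole slots). */
size_t slots_for_size(size_t size_bits)
{
    assert(size_bits > 0);
    return (size_bits + 63) / 64;
}
```

For example, a 12-byte structure (96 bits) occupies two slots, and an 80-byte structure (640 bits) occupies ten slots.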

4.7.5.2. Normal Register Parameters

The first eight parameter slots (64 bytes) are passed in registers, according to the rules in this section.
  • These eight argument slots are associated, one-to-one, with the stacked output general registers, as shown in Figure 4.5.

  • Integral scalar parameters (including addresses and pointers), VAX floating-point parameters, and aggregate parameters in these slots are passed only in the corresponding output general registers.

  • Aggregate parameters in these slots are passed by value only in the corresponding output general registers. The aggregate is treated as a sequence of 64-bit integral values, with each value allocated into the next available slot in aggregate memory address order. If the size of the aggregate is not an even multiple of 64 bits, then the unused bits in the last slot are undefined.

  • If an aggregate or VAX floating-point complex parameter straddles the boundary between slot 7 and slot 8, the part that lies within the first eight slots is passed in general registers, and the remainder is passed in memory, as described in Table 4.10.

    Complex values (other than IEEE quad-precision floating-point complex), in those languages that include complex types, are passed as a pair of floating-point values (either single-precision or double-precision as appropriate). It is possible for the first of the two floating-point values in a complex value to occupy the last output register slot; in this case, the second floating-point value is passed in memory. IEEE quad-precision floating-point complex values are passed by reference.

  • IEEE single-precision and double-precision floating-point scalar parameters are passed in the corresponding floating-point register slot. IEEE quad-precision floating-point scalar parameters are passed by reference in the corresponding output general registers.

When IEEE floating-point parameters are passed in floating-point registers, they are passed in the register format, rounded to the appropriate precision. They are never passed in the general registers unless part of an aggregate, in which case they are passed in the aggregate memory format. When VAX floating-point parameters are passed in general registers, they are passed in memory format.

Parameters allocated beyond the eighth parameter slot are never passed in registers.

Unsigned integral (except unsigned 32-bit), set, and VAX floating-point values passed in registers are zero-filled; signed integral values as well as unsigned 32-bit integral values are sign-extended to 64 bits. For all other types passed in the general registers, unused bits are undefined.

Note

Bit 31 is replicated in bits 32-63, even for unsigned 32-bit integers.
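The extension rules for integral values passed in general registers can be sketched in C as follows. This is a minimal sketch, not a definitive implementation; the function name extend_to_slot and its parameters are hypothetical, and the sketch covers only 8-, 16-, 32-, and 64-bit integral values.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: produce the 64-bit register image of an integral
 * value that is size_bits wide.  Signed values are sign-extended,
 * unsigned 8- and 16-bit values are zero-extended, and every 32-bit
 * value (signed or unsigned) has bit 31 replicated into bits 32-63. */
uint64_t extend_to_slot(uint64_t value, unsigned size_bits, int is_signed)
{
    assert(size_bits == 8 || size_bits == 16 ||
           size_bits == 32 || size_bits == 64);
    if (size_bits == 64)
        return value;                          /* all 64 bits are data */

    uint64_t mask = (UINT64_C(1) << size_bits) - 1;
    uint64_t low = value & mask;
    uint64_t sign_bit = UINT64_C(1) << (size_bits - 1);
    int replicate_sign = is_signed || size_bits == 32;

    if (replicate_sign && (low & sign_bit))
        return low | ~mask;                    /* fill high bits with ones */
    return low;                                /* high bits are zero */
}
```

Under these rules, the unsigned 32-bit value 0x80000000 is passed as 0xFFFFFFFF80000000, whereas the unsigned 16-bit value 0xFFFF is passed as 0x000000000000FFFF.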

The rules contained in this section are summarized in Tables 4.10 and 4.11.
Table 4.10. Unused Bits in Passed Data

Data Type (OpenVMS Names) | Type Designator | Data Size (bytes) | Register Extension Type | Memory Extension Type
Byte logical | DSC$K_DTYPE_BU | 1 | Zero64 | Zero64
Word logical | DSC$K_DTYPE_WU | 2 | Zero64 | Zero64
Longword logical | DSC$K_DTYPE_LU | 4 | Sign64 | Sign64
Quadword logical | DSC$K_DTYPE_QU | 8 | Data64 | Data64
Byte integer | DSC$K_DTYPE_B | 1 | Sign64 | Sign64
Word integer | DSC$K_DTYPE_W | 2 | Sign64 | Sign64
Longword integer | DSC$K_DTYPE_L | 4 | Sign64 | Sign64
Quadword integer | DSC$K_DTYPE_Q | 8 | Data64 | Data64
F_floating | DSC$K_DTYPE_F | 4 | VAXF64 | Data32
D_floating | DSC$K_DTYPE_D | 8 | VAXDG64 | Data64
G_floating | DSC$K_DTYPE_G | 8 | VAXDG64 | Data64
F_floating complex | DSC$K_DTYPE_FC | 2 * 4 | 2*VAXF64 | 2*Data32
D_floating complex | DSC$K_DTYPE_DC | 2 * 8 | 2*VAXDG64 | 2*Data64
G_floating complex | DSC$K_DTYPE_GC | 2 * 8 | 2*VAXDG64 | 2*Data64
S_floating | DSC$K_DTYPE_FS | 4 | Hard | Data32
T_floating | DSC$K_DTYPE_FT | 8 | Hard | Data64
X_floating | DSC$K_DTYPE_FX | 16 | N/A | N/A
S_floating complex | DSC$K_DTYPE_FSC | 2 * 4 | 2*Hard | 2*Data32
T_floating complex | DSC$K_DTYPE_FTC | 2 * 8 | 2*Hard | 2*Data64
X_floating complex | DSC$K_DTYPE_FXC | 2 * 16 | N/A | N/A
Small structures of 8 bytes or less | N/A | ≤8 | Nostd | Nostd
Small arrays of 8 bytes or less | N/A | ≤8 | Nostd | Nostd
32-bit address | N/A | 4 | Sign64 | Sign64
64-bit address | N/A | 8 | Data64 | Data64

Table 4.11 contains the defined meanings for the extension type symbols used in Table 4.10.
Table 4.11. Extension Type Codes

Sign Extension Type | Defined Function
Sign64 | Sign-extended to 64 bits.
Zero64 | Zero-extended to 64 bits.
Data32 | Data is 32 bits. The state of bits <63:32> is unpredictable.
2*Data32 | Two single-precision parts of the complex value are stored in memory as independent floating-point values (each handled as Data32).
Data64 | Data is 64 bits.
2*Data64 | Two double-precision parts of the complex value are stored in memory as independent floating-point values (each handled as Data64).
VAXF64 | Data is 64 bits. Low-order 32 bits are the same as the F_floating memory format and the high-order 32 bits are zero. (Used only in a general register, never in a floating-point register.)
VAXDG64 | Data is 64 bits. Uses the corresponding D_floating or G_floating memory format. (Used only in a general register, never in a floating-point register.)
2*VAXF64 | Two single-precision parts of the complex value are stored in memory as independent floating-point values (each handled as VAXF64).
2*VAXDG64 | Two double-precision parts of the complex value are stored in memory as independent floating-point values (each handled as VAXDG64).
Hard | Passed in the layout defined by the hardware SRM.
2*Hard | Two floating-point parts of the complex value are stored in a pair of registers as independent floating-point values (each handled as Hard).
Nostd | State of all high-order bits not occupied by the data is unpredictable across a call or return.

4.7.5.3. Argument Information (AI) Register

In addition to the normal parameters, an implicit argument information value is passed in register R25, the Argument Information (AI) register. This value is shown in Figure 4.6.

Figure 4.6. Argument Information Register Representation

Argument Count is an unsigned byte that specifies the number of 64-bit argument slots used for the argument list. (Note that single and double-precision complex values use two slots, which is reflected in this count).

Argument Register Information is a contiguous group of eight 3-bit fields that correspond to the eight arguments passed in registers. The first group, bits <10:8>, describes the first argument, the second group, bits <13:11>, describes the second argument, and so on. The encoding for each group is described in Table 4.12.
Table 4.12. Argument Information Register Codes

Value | OpenVMS Name | Meaning
0 | AI$K_AR_I64 | 64-bit argument, or 32-bit argument sign-extended to 64 bits, passed in an integer register (including addresses); or argument is not present.
1 | AI$K_AR_FF | F_floating (also known as VAX single-precision floating-point) argument passed in a general register.
2 | AI$K_AR_FD | D_floating (also known as VAX double-precision floating-point) argument passed in a general register.
3 | AI$K_AR_FG | G_floating (also known as VAX double-precision floating-point) argument passed in a general register.
4 | AI$K_AR_FS | S_floating (also known as IEEE single-precision floating-point) argument passed in a floating-point register.
5 | AI$K_AR_FT | T_floating (also known as IEEE double-precision floating-point) argument passed in a floating-point register.
6, 7 |  | Reserved.
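The layout of the AI register value can be sketched in C as follows. This is a hedged sketch: the identifiers (AI_AR_I64 and so on, ai_encode, ai_arg_count, ai_arg_code) are hypothetical C spellings chosen for illustration, and the placement of the Argument Count in bits <7:0> is an assumption inferred from the first 3-bit group starting at bit 8.

```c
#include <assert.h>
#include <stdint.h>

/* Codes from Table 4.12 (hypothetical C spellings of the AI$K_AR_* names). */
enum ai_code {
    AI_AR_I64 = 0, AI_AR_FF = 1, AI_AR_FD = 2,
    AI_AR_FG = 3, AI_AR_FS = 4, AI_AR_FT = 5
};

/* Build an AI value: the slot count in the low byte (assumed bits <7:0>)
 * and the code for register argument n (0-based) in bits <10+3n:8+3n>. */
uint64_t ai_encode(unsigned arg_count, const enum ai_code codes[8])
{
    assert(arg_count <= 255);
    uint64_t ai = arg_count;
    for (unsigned n = 0; n < 8; n++)
        ai |= (uint64_t)codes[n] << (8 + 3 * n);
    return ai;
}

/* Extract the argument slot count. */
unsigned ai_arg_count(uint64_t ai)
{
    return (unsigned)(ai & 0xFF);
}

/* Extract the 3-bit code describing register argument n (0-based). */
enum ai_code ai_arg_code(uint64_t ai, unsigned n)
{
    assert(n < 8);
    return (enum ai_code)((ai >> (8 + 3 * n)) & 7);
}
```

For a call with four arguments of types int, double, double, and int (IEEE floating-point), this sketch yields a count of 4 with the code AI_AR_FT in the second and third groups and AI_AR_I64 elsewhere.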

4.7.5.4. Memory Stack Parameters

The remainder of the parameter list, beginning with slot 8, is passed in the outgoing parameter area of the memory stack frame, as described in Section 4.5.1. Parameters are mapped directly to memory, with slot 8 placed at location SP+16, slot 9 placed at location SP+24, and so on. Each argument is stored in memory as a series of one or more 64-bit storage units, with unused bits in the last unit undefined.
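The mapping from slot number to location can be sketched in C as follows. The function names are hypothetical and serve only to illustrate the arithmetic described above.

```c
#include <assert.h>
#include <stddef.h>

/* Slots 0 through 7 are passed in registers; all later slots are in memory. */
int slot_in_register(unsigned slot)
{
    return slot < 8;
}

/* Byte offset from the caller's SP for a memory slot (slot 8 or higher):
 * slot 8 is at SP+16, and each later slot is 8 bytes further on. */
size_t slot_sp_offset(unsigned slot)
{
    assert(slot >= 8);
    return 16 + 8 * (size_t)(slot - 8);
}
```

For example, slot 10 is located at SP+32.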

4.7.5.5. Variable Argument Lists

The rules above support variable-argument list functions in both the K&R and the ANSI dialects of the C language. (Note that argument location is independent of whether a prototype is in scope).

The nth argument is in either Rn or Fn regardless of the type of parameter in the preceding register slot. Therefore, a function with variable arguments may assume that the variable arguments that lie within the first eight argument slots can be found in either the stacked input integer registers (IN0-IN7), or in the floating-point parameter registers (F8-F15). Using the information codes from the AI (Argument Information) register (see Table 4.12), the function can then store these registers to memory using the 16-byte scratch area for IN6/F14 and IN7/F15, and up to 48 bytes at the base of its own stack frame for IN0/F8-IN5/F13, as necessary. This arrangement places all of the variable parameters in one contiguous block of memory.
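The effect of this homing step can be modeled in C as follows. This is a simplified simulation, not code that a compiler would emit: the function name and parameters are hypothetical, the floating-point registers are modeled as double values, and the Argument Count is assumed to occupy bits <7:0> of the AI value.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

enum { AI_AR_FS = 4, AI_AR_FT = 5 };  /* IEEE codes from Table 4.12 */

/* Model of homing the register arguments: for each of the first eight
 * slots, take the value from the floating-point register if the AI code
 * says so, otherwise from the general register, and store it into one
 * contiguous block of 64-bit slots. */
void home_register_args(uint64_t ai, const uint64_t in[8],
                        const double fp[8], uint64_t home[8])
{
    assert(in && fp && home);
    unsigned count = (unsigned)(ai & 0xFF);
    if (count > 8)
        count = 8;                    /* later slots are already in memory */

    for (unsigned n = 0; n < count; n++) {
        unsigned code = (unsigned)((ai >> (8 + 3 * n)) & 7);
        if (code == AI_AR_FT) {
            memcpy(&home[n], &fp[n], sizeof(double));  /* 64-bit memory format */
        } else if (code == AI_AR_FS) {
            float single = (float)fp[n];
            uint32_t bits;
            memcpy(&bits, &single, sizeof bits);
            home[n] = bits;           /* high 32 bits are unpredictable; zero here */
        } else {
            home[n] = in[n];          /* integer and VAX floating-point values */
        }
    }
}
```

After the call, a variable-argument function can step through home[] with a single pointer, regardless of which register file each argument arrived in.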

4.7.5.6. Pointers to Formal Parameters

Whenever the address of a formal parameter that is passed in a register is taken, the compiler must store the parameter to the stack, as it would for a variable argument list.

4.7.5.7. Languages Other than C

The placement of arguments in general registers versus floating-point registers does not depend on any notion or concept of a prototype being in scope. It is therefore applicable to all languages at all times.

4.7.5.8. Rounding Floating-point Values

There must be no difference in behavior between a floating-point parameter passed directly in a register and a floating-point parameter that has been stored to memory and reloaded. In either case, the floating-point value must be the same. This implies that floating-point parameters passed in floating-point registers must be explicitly rounded to the proper precision by the caller.
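The requirement can be illustrated in C as follows. This is a sketch using host floating-point types to stand in for register and memory formats; the function names are hypothetical.

```c
#include <assert.h>

/* Caller side: round a value held at higher precision to single
 * precision before passing it as an S_floating parameter. */
double round_to_single(double value)
{
    volatile float narrowed = (float)value;  /* explicit rounding step */
    return (double)narrowed;
}

/* The property required above: storing the register value to memory as
 * a single-precision value and reloading it must not change the value. */
void check_store_reload(double register_value)
{
    float stored = (float)register_value;    /* store to memory */
    double reloaded = (double)stored;        /* reload; widening is exact */
    assert(reloaded == register_value);
}
```

A value such as 0.1 held in double precision does not satisfy check_store_reload until it has been passed through round_to_single, which is why the caller must perform the rounding.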

4.7.5.9. Order of Argument Evaluation

Because most high-level languages do not specify the order of evaluation (with respect to side effects) of arguments, those language processors can evaluate arguments in any convenient order. The choice of argument evaluation order and code generation strategy is constrained only by the definition of the particular language. Programs should not depend on the order of evaluation of arguments.

4.7.5.10. Examples

The following examples illustrate the parameter passing conventions. Floating-point types are IEEE floating-point representations.

Scalar Integers and Floats, With or Without Prototype
extern int func(int, double, double, int);
func(i, a, b, j);
The parameters are passed as follows:

Slot | Variable | Allocation | Argument Register Information
0 | i | OUT0 | AI$K_AR_I64
1 | a | F9 | AI$K_AR_FT
2 | b | F10 | AI$K_AR_FT
3 | j | OUT3 | AI$K_AR_I64

Aggregates Passed by Value
extern int func();
struct { int array[20]; } a;
func(i, a);
No padding is provided in the parameter list for the structure (independent of its external alignment). The parameters are passed as follows:

Slot | Variable | Allocation | Argument Register Information
0 | i | OUT0 | AI$K_AR_I64
1-7 | a.array[0-13] | OUT1-OUT7 | AI$K_AR_I64 (all 7 slots)
8-10 | a.array[14-19] | In memory, at SP+16 through SP+39 | Not applicable

extern int func();
struct { __float128 x; int array[20]; } a;
func(i, a);
The parameters are passed as follows:

Slot | Variable | Allocation | Argument Register Information
0 | i | OUT0 | AI$K_AR_I64
1-2 | a.x | OUT1-OUT2 | AI$K_AR_I64 (both slots)
3-7 | a.array[0-9] | OUT3-OUT7 | AI$K_AR_I64 (all 5 slots)
8-12 | a.array[10-19] | In memory, at SP+16 through SP+55 | Not applicable
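The way an aggregate is split between registers and memory in the two preceding examples can be checked with a small C sketch. The structure and function names are hypothetical, and the sketch assumes that the aggregate starts at the given slot and is passed by value with no padding in the parameter list.

```c
#include <assert.h>
#include <stddef.h>

struct placement {
    size_t slots;            /* total 64-bit slots occupied */
    size_t register_slots;   /* slots that fall within slots 0-7 */
    size_t memory_slots;     /* slots that fall at slot 8 or later */
    size_t first_sp_offset;  /* first byte in memory, relative to SP */
    size_t last_sp_offset;   /* last byte in memory, relative to SP */
};

/* Compute how an aggregate of size_bytes, starting at first_slot, is
 * divided between the general output registers and the memory stack.
 * The two offset fields are meaningful only when memory_slots > 0. */
struct placement place_aggregate(size_t first_slot, size_t size_bytes)
{
    struct placement p = { 0 };
    assert(size_bytes > 0);

    p.slots = (size_bytes + 7) / 8;
    size_t end_slot = first_slot + p.slots;      /* one past the last slot */

    if (first_slot >= 8)
        p.register_slots = 0;
    else if (end_slot > 8)
        p.register_slots = 8 - first_slot;
    else
        p.register_slots = p.slots;
    p.memory_slots = p.slots - p.register_slots;

    if (p.memory_slots > 0) {
        size_t first_mem_slot = first_slot > 8 ? first_slot : 8;
        p.first_sp_offset = 16 + 8 * (first_mem_slot - 8);
        p.last_sp_offset = 16 + 8 * (end_slot - 8) - 1;
    }
    return p;
}
```

With first_slot equal to 1, an 80-byte structure gives seven register slots and memory bytes SP+16 through SP+39, and a 96-byte structure gives seven register slots and memory bytes SP+16 through SP+55.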

Floating-Point Aggregates, With or Without Prototype
struct s { float a, b, c; } x;
extern int func();
func(x);
The parameters are passed as follows:

Slot | Variable | Allocation | Argument Register Information
0 | x.a & x.b | OUT0 | AI$K_AR_I64
1 | x.c | OUT1 | AI$K_AR_I64 (low 32 bits)

4.7.6. Return Values

Values up to 128 bits are returned directly in the registers, according to the rules in Table 4.13.

Integer, enumeration, record, and set values (bit vectors) smaller than 64 bits must be zero-filled (unsigned integers, enumerations, records, sets) or sign-extended (signed integrals) to a full 64 bits. However, for unsigned 32-bit integers, bit 31 is replicated in bits 32-63.

When floating-point values are returned in floating-point registers, they are returned in the register format, rounded to the appropriate precision. When they are returned in the general registers (for example, as part of a record), they are returned in their memory format.

OpenVMS does not support a general notion of homogeneous floating-point aggregates. However, the special case of two single-precision or double-precision floating-point values implementing a value of a complex type is handled in an analogous manner.
Table 4.13. Rules for Return Values

Type | Size (Bits) | Location of Return Value | Alignment
Integer/Pointer, small Record, Set | 1-64 | R8 | LSB
IEEE single-precision floating-point (S_floating) | 32 | F8 | N/A
IEEE double-precision floating-point (T_floating) | 64 | F8 | N/A
IEEE single-precision complex (S_floating) | 64 | F8, F9 | N/A
IEEE double-precision complex (T_floating) | 128 | F8, F9 | N/A
VAX single-precision floating-point (F_floating) | 32 | R8 | N/A
VAX double-precision floating-point (D_ and G_floating) | 64 | R8 | N/A
VAX single-precision floating-point complex (F_floating) | 64 | R8, R9 | N/A
VAX double-precision floating-point complex (D_ and G_floating) | 128 | R8, R9 | N/A

Note

X_floating and X_floating complex are not included in this table because they are returned using the hidden parameter method (see below).

The rules in Table 4.13 are expressed in more detail in Table 4.10. F_floating and F_floating complex values in the general registers are zero-extended (Zero64), because this most closely approximates the effect of using the Alpha register format.

Hidden Parameter

Return values other than those covered by Table 4.13 are returned in a buffer allocated by the caller. A pointer to the buffer is passed to the called procedure as a hidden first parameter, and all normal parameters are shifted one slot to make this possible. The return buffer must be aligned at a 16-byte boundary.
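At the source level, the hidden parameter convention can be pictured with the following C sketch. It is a conceptual illustration only: the type struct big and the function names are hypothetical, and the "lowered" form shows what a compiler does implicitly rather than something a programmer writes.

```c
#include <assert.h>
#include <stdint.h>

struct big { int64_t v[4]; };   /* 256 bits: not covered by Table 4.13 */

/* What the programmer writes. */
struct big make_big(int64_t seed)
{
    struct big result;
    for (int i = 0; i < 4; i++)
        result.v[i] = seed + i;
    return result;
}

/* Conceptual lowered form: the return buffer pointer becomes the first
 * parameter, and the declared parameter moves from slot 0 to slot 1. */
void make_big_lowered(struct big *result, int64_t seed)
{
    assert(((uintptr_t)result & 15) == 0);  /* buffer is 16-byte aligned */
    for (int i = 0; i < 4; i++)
        result->v[i] = seed + i;
}

/* Caller side: allocate an aligned buffer and pass its address first. */
int64_t call_make_big(int64_t seed)
{
    _Alignas(16) struct big buffer;
    make_big_lowered(&buffer, seed);
    return buffer.v[3];
}
```

The caller owns the buffer in both views; only the lowered form makes the extra pointer and the shifted parameter visible.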

4.7.7. Simple and Bound Procedures

There are two distinct classes of procedures:
  • Simple procedure

  • Bound procedure

A simple procedure is a procedure that does not need direct access to the stack of its execution environment. In order to call a simple procedure, a simple function descriptor is created, as shown in Figure 4.7, and described in Table 4.14.

Figure 4.7. Simple Function Descriptor
Table 4.14. Simple Function Descriptor

Field Name | Contents
FDSC$Q_ENTRY | Entry code address for the procedure to be called.
FDSC$Q_GP | GP value for the procedure to be called.

A bound procedure is a procedure that does need direct access to the stack of its execution environment, typically to reference an up-level variable or to perform a nonlocal GOTO operation.

When a bound procedure is called, the caller must pass some kind of pointer to the called code that allows it to reference its up-level environment. Typically, this pointer is a frame pointer for that environment, but many variations are possible. When the caller itself is executing within that outer environment, it can usually make such a call directly to the code for the nested procedure without recourse to any additional function descriptors. However, when a procedure value for the nested procedure must be passed outside of that environment to a call site that has no knowledge of the target procedure, a bound function descriptor is created so that the nested procedure can be called just like a simple procedure.

Bound procedure values, as defined by this standard, are designed for multilanguage use and utilize the properties of function descriptors to allow callers of procedures to use common code to call both bound and simple procedures.

A bound function descriptor is similar to a simple function descriptor, with several additional fields as shown in Figure 4.8 and described in Table 4.15.

Figure 4.8. Bound Function Descriptor
Table 4.15. Contents of Bound Function Descriptor

Field Name | Contents
FDSC$Q_OTS_ENTRY | Code address for a suitable library helper routine, for example, OTS$JUMP_TO_BPV
FDSC$Q_OTS_PSEUDO_GP | Address of this bound function descriptor
FDSC$Q_SIGNATURE | Signature information field (see Section 6.1.3)
FDSC$Q_TARGET_ENTRY | Entry code address for the procedure to be called
FDSC$Q_TARGET_GP | GP value for the procedure to be called
FDSC$Q_TARGET_ENVIR | Environment value for the procedure to be called

A bound procedure descriptor is inherently dynamic because the environment value must be determined at runtime by code executing within the bound procedure environment. Therefore, when a bound procedure descriptor such as this is needed, it is usually allocated on the creating procedure's stack.

When a procedure value that refers to a bound procedure descriptor is used to make a call, the routine designated in the OTS_ENTRY field (typically OTS$JUMP_TO_BPV) receives control with the GP register pointing to the bound procedure descriptor (instead of a global offset table). This routine performs the following steps:
  1. Load the "real" target entry address into a volatile branch register, for example, B6.

  2. Load the dynamic environment value into the appropriate uplevel-addressing register for the target function, for example, OTS$JUMP_TO_BPV uses R9.

  3. Load the "real" target GP address into the GP register.

  4. Transfer control (branch, not call) to the target entry address.

Control arrives at the real target procedure address with both the GP and environment register values established appropriately.

Support routine OTS$JUMP_TO_BPV is included as a standard library routine. The operation of OTS$JUMP_TO_BPV is logically equivalent to the following code:
OTS$JUMP_TO_BPV::
     add     gp=gp,24        ; Adjust GP to point to entry address
     ld8     r9=[gp],16      ; Load target entry address; advance GP to environment value
     mov     b6=r9           ; Move entry address to branch register
     ld8     r9=[gp],-8      ; Load target environment value; step GP back to target GP
     ld8     gp=[gp]         ; Load target GP
     br      b6              ; Transfer to target
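
The descriptor offsets used by this sequence can be cross-checked with a C model. This is a sketch under stated assumptions: the structure and function names are hypothetical, the field order follows Table 4.15 with each field assumed to be one quadword, and the function returns the values that the real routine would leave in B6, GP, and R9 instead of branching.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Model of a bound function descriptor (field order as in Table 4.15). */
struct bound_fdesc {
    uint64_t ots_entry;       /* FDSC$Q_OTS_ENTRY */
    uint64_t ots_pseudo_gp;   /* FDSC$Q_OTS_PSEUDO_GP */
    uint64_t signature;       /* FDSC$Q_SIGNATURE */
    uint64_t target_entry;    /* FDSC$Q_TARGET_ENTRY */
    uint64_t target_gp;       /* FDSC$Q_TARGET_GP */
    uint64_t target_envir;    /* FDSC$Q_TARGET_ENVIR */
};

/* The offsets that the assembly sequence relies on. */
_Static_assert(offsetof(struct bound_fdesc, target_entry) == 24, "entry at +24");
_Static_assert(offsetof(struct bound_fdesc, target_gp) == 32, "GP at +32");
_Static_assert(offsetof(struct bound_fdesc, target_envir) == 40, "envir at +40");

/* Register state at the moment of the final branch. */
struct transfer {
    uint64_t branch_target;   /* value placed in B6 */
    uint64_t gp;              /* value placed in GP */
    uint64_t envir;           /* value placed in R9 */
};

struct transfer jump_to_bpv(const struct bound_fdesc *desc)
{
    /* On entry, GP holds the address of the descriptor itself. */
    assert(desc->ots_pseudo_gp == (uint64_t)(uintptr_t)desc);

    struct transfer t;
    t.branch_target = desc->target_entry;   /* ld8 from gp+24 */
    t.envir = desc->target_envir;           /* ld8 from gp+40 */
    t.gp = desc->target_gp;                 /* ld8 from gp+32 */
    return t;
}
```

The static assertions confirm that adding 24 to the descriptor address reaches the target entry field, that the post-increment of 16 reaches the environment value, and that the post-decrement of 8 lands on the target GP.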

Because the address of a bound function descriptor is a valid function pointer, it may be passed to translated code which uses it to call back into native code; therefore, the value of the signature information field must be the same as that in the official function descriptor for the real target procedure (see Section 6.1.2).

Note that there can be multiple OTS$JUMP_TO_BPV-like support routines, corresponding to different target registers where the environment value should be placed. The code that creates the bound function descriptor is necessarily compiled by the same compiler that compiles the target procedure, and so can select the appropriate support routine.

4.8. Procedure Call Stack

A procedure is an active procedure while its body is executing, including while any procedure it calls is executing. When a procedure is active, its designated condition handler may handle an exception that is signaled during its execution.

Associated with each active procedure is an invocation context, informally called a frame, which consists of the set of registers and space in memory that is allocated and that may be accessed during execution for a particular call of that procedure.

When a procedure begins to execute, it has a limited invocation context that includes the output registers of its caller (which have been "shifted" to start at register R32). The initial instructions may allocate and initialize additional context, including possibly saving information from the invocation context of its caller. Such instructions, if any, are termed a procedure prologue. Once execution of the prologue is complete, the procedure is said to be active.

When a procedure is ready to return to its caller, the procedure ceases to be active after it begins to execute the instructions that deallocate and discard the procedure's invocation context (which may include restoring state of the caller's invocation context that was saved during the prologue). These instructions are termed a procedure epilogue.

A null frame procedure has no prologue and no epilogue, and consists solely of body instructions. Such a procedure becomes active immediately.

A procedure may have more than one prologue if there are multiple entry points. A procedure may also have more than one epilogue if there are multiple return points. One of each will be executed during any given invocation of the procedure.

A procedure call stack (for a thread) consists of the stack of invocation contexts that exists at any point in time. New invocation contexts are pushed on that stack as procedures are called and invocations are popped from the call stack as procedures return.

The invocation context of a procedure that calls another procedure is said to precede or be previous to the invocation context of the called procedure.

4.8.1. Current Procedure

The current procedure is the active procedure whose execution began most recently; its invocation context is at the top of the call stack. Note that a procedure executing in its prologue or epilogue is not active, and hence cannot be the current procedure.

For OpenVMS, the PC (instruction pointer) register in combination with associated unwind information determines what procedure is current (for exception handling purposes). See Section A.4 for a description of the unwind information data structures.

A procedure is current at a given PC (when OpenVMS semantics apply, see Section A.4.1) if either:
  • The PC is in a range described by any body region unwind descriptor but not in an epilogue.

  • The PC is in a range not described by any unwind descriptor, and therefore by default must be within a null frame procedure (see Section A.4.1).

4.8.2. Procedure Call Tracing

Mechanisms for each of the following functions are needed to support procedure call tracing:
  • To provide the context of a procedure invocation

  • To walk (navigate) the procedure call stack

  • To refer to a given procedure invocation

  • To examine or modify the register context of an active procedure

This section describes the data structure mechanisms. The run-time library functions that support these functions are described in Section 4.8.3.

4.8.2.1. Invocation Context Block

The context of a specific procedure invocation is provided through the use of a data structure called an invocation context block (ICB). Table 4.16 describes the contents of the OpenVMS I64 invocation context block.

Table 4.16. Contents of the Invocation Context Block

Field

Size

Description

LIBICB$L_CONTEXT_LENGTH

Longword

Unsigned total length in bytes of the invocation context block. See Section 4.8.3.1.

LIBICB$V_FRAME_FLAGS

3 Bytes

See Table 4.17.

LIBICB$B_BLOCK_VERSION

Byte

ICB version; initial value of 2 for OpenVMS I64 (1 is for OpenVMS Alpha). See Section 4.8.3.1.

LIBICB$IH_IREG

128 Quadwords

Array of general registers (only those allocated; unallocated registers are uninitialized).
  • LIBICB$IH_IREG[0] is reserved.
  • IREG[1], the global data pointer, can be referenced using the symbol LIBICB$IH_GP.
  • IREG[12], the memory stack pointer, can be referenced using the symbol LIBICB$IH_SP.
  • IREG[13], the thread pointer, can be referenced using the symbol LIBICB$IH_TP.
  • IREG[25], the argument information register, can be referenced using the symbol LIBICB$IH_AI.

LIBICB$IH_GRNAT

2 Quadwords

General register NaT collection.

LIBICB$FO_F2_F31

30 Octawords

Floating-point registers F2-F31. Array of floating-point register values in register format, as saved by a SPILL instruction.

LIBICB$PH_F32_F127

Quadword

Pointer to array of floating-point values in register format for registers F32-F127, as saved by SPILL instruction. A pointer value of 0 indicates that the contents of registers F32-F127 are not defined.

LIBICB$IH_BRANCH

8 Quadwords

Array of branch registers.

LIBICB$IH_RSC

Quadword

Register Stack Configuration register.

LIBICB$IH_BSP

Quadword

Backing store pointer.

LIBICB$IH_BSPSTORE

Quadword

Backing store write pointer.

LIBICB$IH_RNAT

Quadword

RSE NaT collection register.

LIBICB$IH_CCV

Quadword

Compare and Exchange Value register.

LIBICB$IH_UNAT

Quadword

User NaT collection register.

LIBICB$IH_PFS

Quadword

Previous function state.

LIBICB$IH_LC

Quadword

Loop count register.

LIBICB$IH_EC

Quadword

Epilogue Count register.

LIBICB$IH_CSD

Quadword

Copy of the AR.CSD.

LIBICB$IH_SSD

Quadword

Copy of the AR.SSD.

LIBICB$Q_PRED

Quadword

Predicate collection register, P0-P63. This field is a bit vector with bit 0 reserved.

LIBICB$IH_PC

Quadword

Current instruction pointer; the slot number overlays <1:0>.

LIBICB$IH_CFM

Quadword

Current Frame Marker.

LIBICB$IH_UM

Quadword

User mask bits from PSR.

LIBICB$O_GR_VALID

Octaword

General Register validity mask.

LIBICB$L_FR_VALID

Longword

Floating-Point Register validity mask for registers F2-F31.

LIBICB$Q_BR_VALID

Quadword

Branch Register validity mask.

LIBICB$Q_AR_VALID

Quadword

Application Register validity mask.

LIBICB$Q_OTHER_VALID

Quadword

PC and CFM validity mask.

LIBICB$Q_PR_VALID

Quadword

Predicate Register validity mask.

LIBICB$IH_ORIGINAL_SPILL_ADDR

Quadword

Original address of the general register spill area (normally &icb->LIBICB$IH_IREG[0]).

LIBICB$IH_PSP

Quadword

Previous stack pointer.

LIBICB$IH_RETURN_PC

Quadword

Return PC.

LIBICB$IH_PREV_BSP

Quadword

Previous BSP.

LIBICB$PH_CHFCTX_ADDR

Quadword

Pointer to condition handler facility context block.

LIBICB$IH_OSSD

Quadword

Copy of OSSD from Unwind Information Block.

LIBICB$IH_HANDLER_FV

Quadword

Condition Handler Function Value.

LIBICB$PH_LSDA

Quadword

Address of the Language Specific Data Area of the Unwind Information Block.

Beginning of User Override Parameters (offset LIBICB$R_UO_BASE)

LIBICB$Q_UO_FLAGS

Quadword

Operational flags: LIBICB$V_UO_FLAG_CACHE_UNWIND – Cache unwind information during a walk of the call stack. See Section 4.8.3.2.

LIBICB$IH_UO_IDENT

Quadword

User context variable; passed by value to the callback routines. See Section 4.8.5.

LIBICB$PH_UO_READ_MEM

Quadword

Pointer to user read memory routine. See Section 4.8.5.3.

LIBICB$PH_UO_GETUEINFO

Quadword

Pointer to user get unwind entry information routine. See Section 4.8.5.1.

LIBICB$PH_UO_GETCONTEXT

Quadword

Pointer to user get initial context routine. See Section 4.8.5.2.

LIBICB$PH_UO_WRITE_MEM

Quadword

Pointer to user write memory routine. See Section 4.8.5.4.

LIBICB$PH_UO_WRITE_REG

Quadword

Pointer to user write register routine. See Section 4.8.5.5.

LIBICB$PH_UO_MALLOC

Quadword

Pointer to user memory allocate routine. See Section 4.8.5.6.

LIBICB$PH_UO_FREE

Quadword

Pointer to user memory free routine. See Section 4.8.5.7.

End of user override parameters (length of LIBICB$K_UO_LENGTH)

LIBICB$L_ALERT_CODE

Longword

Stack walk detailed status. Alert codes are enumerated in the LIBICB include files. See Section 4.8.3.7.

LIBICB$IH_SYSTEM_DEFINED[n]

n Quadwords

Variable-sized area; unused and undefined at this time.

Table 4.17. Flags in LIBICB$V_FRAME_FLAGS Field of the Invocation Context Block

Flag

Description

LIBICB$V_BOTTOM_OF_STACK

Set to 1 if this is the bottom of the stack and there is absolutely no previous frame.

LIBICB$V_HANDLER_PRESENT

Set to 1 if this frame has a condition handler.

LIBICB$V_IN_PROLOGUE

Set to 1 if the PC is in a prologue region.

LIBICB$V_IN_EPILOGUE

Set to 1 if the PC is in an epilogue region.

LIBICB$V_HAS_MEM_STK_FRAME

Set to 1 if this frame has a memory stack.

LIBICB$V_HAS_REG_STK_FRAME

Set to 1 if this frame has a register stack.

Static scratch registers, unless saved and described in the unwind table information, are not realizable except for an invocation context preceding an exception or AST frame.
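
Table 4.16 notes that the slot number overlays bits <1:0> of LIBICB$IH_PC. The following is a minimal sketch of separating the slot number from the bundle address. It assumes the usual Itanium arrangement of 16-byte-aligned instruction bundles; the helper names icb_pc_slot and icb_pc_bundle are illustrative only and are not defined by the LIBICB include files.

```c
#include <stdint.h>

/* Hypothetical helpers for illustration; not part of the LIBICB include files. */

/* Slot number within the bundle, taken from bits <1:0> of the saved PC. */
unsigned icb_pc_slot(uint64_t pc)
{
    return (unsigned)(pc & 0x3u);
}

/* Bundle address, assuming 16-byte-aligned bundles: clear bits <3:0>. */
uint64_t icb_pc_bundle(uint64_t pc)
{
    return pc & ~(uint64_t)0xF;
}
```

A tool that prints a PC from an invocation context block would typically show the two parts separately.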

4.8.2.2. Invocation Context Handle

To refer to a specific procedure invocation at run-time, an invocation context handle (ICH) can be used. The invocation context handle is a quadword that uniquely identifies any one of the active frames on a call stack, even when one or more of the frames correspond to procedures that have no associated stack storage.

The characteristics of the caller are used to determine the invocation context handle. If the caller has a register frame, then the RSE Backing Store Pointer (BSP) is used as the handle; otherwise, the caller's Stack Pointer is used. (The caller's Stack Pointer is sometimes called Stack Pointer on Entry or Previous Stack Pointer (PSP).)
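
The selection rule can be sketched as follows. The caller_frame structure and the choose_invo_handle function are hypothetical stand-ins used only to illustrate the rule; in practice the handle is obtained from the routines in Section 4.8.3.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical summary of the caller's frame; not a structure defined by this standard. */
struct caller_frame {
    bool     has_register_frame; /* caller has a register stack frame */
    uint64_t bsp;                /* caller's RSE backing store pointer */
    uint64_t psp;                /* caller's stack pointer (SP on entry, PSP) */
};

/* BSP identifies the invocation when a register frame exists; otherwise PSP does. */
uint64_t choose_invo_handle(const struct caller_frame *caller)
{
    return caller->has_register_frame ? caller->bsp : caller->psp;
}
```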

4.8.3. Invocation Context Block Access Routines

A thread can manipulate the invocation context of any procedure in the thread's virtual address space by calling the run-time library functions described in this section.

Note

The OpenVMS I64 stack tracing routines use heap storage during the analysis of unwind descriptors. The default heap storage mechanism uses a LIBRTL implementation of the C RTL function malloc, the use of which may result in virtual memory being expanded using the $EXPREG system service. See Section 4.8.5 on how to override the defaults. See also Section 4.8.3.12.

4.8.3.1. Initializing the Invocation Context Block

When allocating a new invocation context block, the user must perform the following steps prior to calling any of the routines described in Section 4.8.3:
  • Allocate the block on an octaword (16-byte) boundary.

  • Clear (set to all zero bytes) the entire block.

  • Initialize the LIBICB$L_CONTEXT_LENGTH field to LIBICB$K_INVO_CONTEXT_BLK_SIZE and the LIBICB$B_BLOCK_VERSION field to LIBICB$K_INVO_CONTEXT_VERSION.

  • Set any required parameters in the user override portion of the invocation context block.

  • Set the LIBICB$V_UO_FLAG_CACHE_UNWIND flag if appropriate. See also Section 4.8.3.2 and Section 4.8.3.12 regarding subsequent use of LIB$I64_PREV_INVO_END.

Failure to do so will cause these routines to return an error status. Note that this is a change from Alpha, where initialization was not necessary.
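
The first three steps might look like the following sketch. It uses a stand-in structure (init_icb) that models only the first three fields of Table 4.16, and the constants INIT_ICB_SIZE and INIT_ICB_VERSION are placeholders for LIBICB$K_INVO_CONTEXT_BLK_SIZE and LIBICB$K_INVO_CONTEXT_VERSION. Real code would use the definitions from the LIBICB include files instead.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in for the real block. The size of the body is invented for this sketch. */
struct init_icb {
    uint32_t context_length;  /* models LIBICB$L_CONTEXT_LENGTH */
    uint8_t  frame_flags[3];  /* models LIBICB$V_FRAME_FLAGS */
    uint8_t  block_version;   /* models LIBICB$B_BLOCK_VERSION */
    uint8_t  rest[248];       /* placeholder for the remaining fields */
};

#define INIT_ICB_SIZE    ((uint32_t)sizeof(struct init_icb)) /* placeholder constant */
#define INIT_ICB_VERSION 2u                                  /* placeholder constant */

/* Returns an initialized block, or NULL on allocation failure. Release with free(). */
struct init_icb *init_icb_new(void)
{
    /* Step 1: allocate on an octaword (16-byte) boundary. */
    struct init_icb *icb = aligned_alloc(16, sizeof *icb);
    if (icb == NULL)
        return NULL;

    /* Step 2: clear the entire block. */
    memset(icb, 0, sizeof *icb);

    /* Step 3: record the block length and version. */
    icb->context_length = INIT_ICB_SIZE;
    icb->block_version  = INIT_ICB_VERSION;

    /* Steps 4 and 5 (user override parameters, cache unwind flag) would follow here. */
    return icb;
}
```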

To simplify the initialization process, two convenience routines are provided: LIB$I64_CREATE_INVO_CONTEXT (see Section 4.8.3.3) and LIB$I64_INIT_INVO_CONTEXT (see Section 4.8.3.5).

4.8.3.2. Walking the Call Stack

During the course of program execution, it is sometimes necessary to walk the call stack. Frame-based exception handling is one case where this is done. Call stack navigation is possible only in the reverse direction (in a latest-to-earliest or top-to-bottom sequence).

To walk the call stack, perform the following steps:
  1. Given a program state (which contains a register set), build an invocation context.

    For the current routine, an initial invocation context block can be obtained by calling the LIB$I64_GET_CURR_INVO_CONTEXT routine (see Section 4.8.3.7).

  2. Repeatedly call the LIB$I64_GET_PREV_INVO_CONTEXT routine (see Section 4.8.3.8) until the desired invocation context, or the end of the call chain, has been reached.

    LIB$I64_GET_PREV_INVO_CONTEXT indicates the end of the invocation call chain if either of the following conditions is true:
    • The OSSD$V_BOTTOM_OF_STACK flag is set for the target frame (see Table A.14).

    • The return address (IP) of the target frame is zero.

To make the stack walk more efficient, you can set the LIBICB$V_UO_FLAG_CACHE_UNWIND flag. This causes unwind information to be carried over from one call to LIB$I64_GET_PREV_INVO_CONTEXT to the next. At the conclusion of the stack walk, you must call LIB$I64_PREV_INVO_END to free any cached unwind information. This is the recommended practice, but not the default behavior.

Compilers are allowed to optimize high-level language procedure calls in such a way that they do not appear in the invocation chain. For example, inline procedures never appear in the invocation chain.

Make no assumptions about the relative positions of any memory used for procedure frame information. There is no guarantee that successive stack frames will always appear at higher addresses.
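
The overall shape of a stack walk can be sketched with mock routines. Everything in the block below is a stand-in: mock_get_curr, mock_get_prev, and mock_prev_invo_end only imitate the documented return values of LIB$I64_GET_CURR_INVO_CONTEXT, LIB$I64_GET_PREV_INVO_CONTEXT, and LIB$I64_PREV_INVO_END, and the mock_icb structure models the call chain as an array of return PCs ending in zero. The sketch shows the loop and the final cleanup call, not the real unwinding.

```c
#include <stddef.h>
#include <stdint.h>

/* Mock context: a chain of return PCs stands in for a real call stack. */
struct mock_icb {
    const uint64_t *return_pc;  /* return PC of each frame, newest first; 0 ends the chain */
    size_t          frame;      /* index of the frame currently described */
    int             bottom;     /* models LIBICB$V_BOTTOM_OF_STACK */
    int             cache_live; /* nonzero while mock unwind data is cached */
};

/* Mock of LIB$I64_GET_CURR_INVO_CONTEXT: describe the newest frame. */
void mock_get_curr(struct mock_icb *icb, const uint64_t *return_pc)
{
    icb->return_pc  = return_pc;
    icb->frame      = 0;
    icb->bottom     = (return_pc[0] == 0);
    icb->cache_live = 0;
}

/* Mock of LIB$I64_GET_PREV_INVO_CONTEXT: returns 1 after moving to the caller,
   or 0 if the context on entry already represented the bottom of the stack. */
int mock_get_prev(struct mock_icb *icb)
{
    if (icb->bottom)
        return 0;
    icb->frame++;
    icb->cache_live = 1;                  /* pretend unwind information was cached */
    if (icb->return_pc[icb->frame] == 0)  /* return address of the target frame is zero */
        icb->bottom = 1;
    return 1;
}

/* Mock of LIB$I64_PREV_INVO_END: release the cached unwind information. */
int mock_prev_invo_end(struct mock_icb *icb)
{
    icb->cache_live = 0;
    return 1;
}

/* Walk from the newest frame toward the oldest and count the callers visited. */
size_t count_callers(struct mock_icb *icb)
{
    size_t callers = 0;
    while (mock_get_prev(icb) == 1)
        callers++;
    mock_prev_invo_end(icb);  /* always clean up at the end of the walk */
    return callers;
}
```

Calling the cleanup routine unconditionally keeps the walker correct whether or not the cache unwind flag was set.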

4.8.3.3. LIB$I64_CREATE_INVO_CONTEXT

This convenience routine simplifies creating and properly initializing an invocation context block. The routine allocates an invocation context block from heap storage and initializes it according to the steps described in Section 4.8.3.1. Users of this routine should call LIB$I64_FREE_INVO_CONTEXT when the invocation context block is no longer required.

This routine sets the cache unwind flag LIBICB$V_UO_FLAG_CACHE_UNWIND in the invocation context block to speed the stack walk. Do not use this routine in conjunction with LIB$I64_INIT_INVO_CONTEXT, as the same initialization is performed by both routines.
LIB$I64_CREATE_INVO_CONTEXT ([malloc] [, free] [, ident])

Argument

OpenVMS Usage

Type

Access

Mechanism

malloc

function_value

procedure

read

by value

free

function_value

procedure

read

by value

ident

user_value

quadword

read

by value

Arguments:

malloc

A procedure reference for a user callback routine that allocates memory. See Section 4.8.5.6 for details of this routine. This is an optional argument. The default is to use an implementation of the C RTL routine malloc. If specified, this routine is used to allocate the invocation context block and is also placed in the invocation context block field LIBICB$PH_UO_MALLOC for use during the stack walk.

free

A procedure reference for a user callback routine that deallocates memory. This value is placed in the invocation context block field LIBICB$PH_UO_FREE. See Section 4.8.5.7 for details on this routine. This is an optional argument; however, it must be specified if malloc is specified. The default is to use an implementation of the C RTL routine free.

ident

Specifies a user ident value to be placed in the invocation context block LIBICB$IH_UO_IDENT field. In turn, this value is passed to the malloc and free routines, described in Section 4.8.5.6 and Section 4.8.5.7 respectively. This is an optional argument; the default value is zero.

Function Value Returned:

invo_context

A non-zero value represents the address of the invocation context block allocated. A value of 0 indicates failure.

4.8.3.4. LIB$I64_FREE_INVO_CONTEXT

Deallocates an invocation context block that was previously allocated using LIB$I64_CREATE_INVO_CONTEXT. This routine calls LIB$I64_PREV_INVO_END as a convenience.
LIB$I64_FREE_INVO_CONTEXT (invo_context)

Argument

OpenVMS Usage

Type

Access

Mechanism

invo_context

invo_context_blk

structure

modify

by reference

Argument:

invo_context

Address of an invocation context block.

Function Value Returned:

None.

4.8.3.5. LIB$I64_INIT_INVO_CONTEXT

Initializes an invocation context block that the user has already allocated (on the stack, or from heap, or other storage) in accordance with Section 4.8.3.1. Use this routine as an alternative to LIB$I64_CREATE_INVO_CONTEXT, which both allocates and initializes an invocation context block.
LIB$I64_INIT_INVO_CONTEXT
  (invo_context, invo_version [, cache_unwind_flag])

Argument

OpenVMS Usage

Type

Access

Mechanism

invo_context

invo_context_blk

structure

modify

by reference

invo_version

version_number

byte

read

by value

cache_unwind_flag

flag

longword

read

by value

Arguments:

invo_context

Address of an invocation context block.

invo_version

The value LIBICB$K_INVO_CONTEXT_VERSION. This is used to verify the operating environment.

cache_unwind_flag

A flag indicating if the cache unwind flag, LIBICB$V_UO_FLAG_CACHE_UNWIND, should be set in the invocation context block. A value of zero clears the flag; a value of one sets the flag. This is an optional argument. The default is zero.

Function Value Returned:

status

A value of 1 indicates success. A value of 0 indicates a version number mismatch.

4.8.3.6. LIB$I64_GET_INVO_CONTEXT

A thread can obtain the invocation context of any active procedure by using this function:
LIB$I64_GET_INVO_CONTEXT(invo_handle, invo_context)

Argument

OpenVMS Usage

Type

Access

Mechanism

invo_handle

invo_handle

quadword

read

by reference

invo_context

invo_context_blk

structure

modify

by reference

Arguments:

invo_handle

Address of the location that contains the handle for the desired invocation.

invo_context

Address of an invocation context block into which the procedure context of the frame specified by invo_handle will be written.

Note

The invocation context block must be properly initialized as described in Section 4.8.3.1 before calling this routine.

Function Value Returned:

status

Status value. A value of 1 indicates success; a value of 0 indicates failure.

Note

If the invocation handle that was passed does not represent any procedure context in the active call stack, the new contents of the context block are unpredictable.

4.8.3.7. LIB$I64_GET_CURR_INVO_CONTEXT

A thread can obtain the invocation context of the current procedure by using this function:
LIB$I64_GET_CURR_INVO_CONTEXT(invo_context)

Argument

OpenVMS Usage

Type

Access

Mechanism

invo_context

invo_context_blk

structure

modify

by reference

Argument:

invo_context

Address of an invocation context block into which the procedure context of the caller will be written.

Note

The invocation context block must be properly initialized as described in Section 4.8.3.1 before calling this routine.

Function Value Returned:

Zero

This facilitates use in the implementation of the C language unwind setjmp or longjmp function. Check the LIBICB$L_ALERT_CODE field of the invocation context block for further status indication.

4.8.3.8. LIB$I64_GET_PREV_INVO_CONTEXT

A thread can obtain the invocation context of the procedure context preceding any other procedure context by using this function:
LIB$I64_GET_PREV_INVO_CONTEXT(invo_context)

Argument

OpenVMS Usage

Type

Access

Mechanism

invo_context

invo_context_blk

structure

modify

by reference

Argument:

invo_context

Address of a valid invocation context block. The given invocation context block is updated to represent the context of the previous (calling) frame.

The LIBICB$V_BOTTOM_OF_STACK flag of the invocation context block is set if the target frame represents the end of the invocation call chain or if stack corruption is detected.

Function Value Returned:

status

Status value. A value of 1 indicates success. When the initial context represents the bottom of the call stack, a value of 0 is returned.

4.8.3.9. LIB$I64_GET_INVO_HANDLE

A thread can obtain an invocation handle corresponding to any invocation context block by using this function:
LIB$I64_GET_INVO_HANDLE(invo_context, invo_handle)

Argument

OpenVMS Usage

Type

Access

Mechanism

invo_context

invo_context_blk

structure

read

by reference

invo_handle

invo_handle

quadword

write

by reference

Arguments:

invo_context

Address of a valid invocation context block.

invo_handle

Address of the location into which the invocation context handle is to be written. If the call fails, the value of the invocation context handle is LIB$K_INVO_HANDLE_NULL.

Function Value Returned:

status

A value of 1 indicates success. A value of 0 indicates failure.

4.8.3.10. LIB$I64_GET_CURR_INVO_HANDLE

A thread can obtain the invocation handle for the current procedure by using this function:
LIB$I64_GET_CURR_INVO_HANDLE(invo_handle)

Argument

OpenVMS Usage

Type

Access

Mechanism

invo_handle

invo_handle

quadword

write

by reference

Arguments:

invo_handle

Address of a quadword into which the invocation handle of the caller will be written.

Function Value Returned:

status

A value of 1 indicates success. A value of 0 indicates failure.

4.8.3.11. LIB$I64_GET_PREV_INVO_HANDLE

A thread can obtain an invocation handle of the procedure context preceding that of a specified procedure context by using this function:
LIB$I64_GET_PREV_INVO_HANDLE (invo_handle_in, invo_handle_out)

Argument

OpenVMS Usage

Type

Access

Mechanism

invo_handle_in

invo_handle

quadword

read

by reference

invo_handle_out

invo_handle

quadword

write

by reference

Arguments:

invo_handle_in

The address of an invocation handle that represents a target invocation context.

invo_handle_out

Address of the location into which the invocation context handle of the previous context is to be written. If the call fails, the value of the previous invocation context handle is LIB$K_INVO_HANDLE_NULL.

Function Value Returned:

status

A value of 1 indicates success. A value of 0 indicates failure.

Note

Each call to this routine involves a stack walk from the top of the stack to find the procedure matching the input handle. Consequently, using this routine repeatedly is an inefficient way to walk the stack, compared to using LIB$I64_GET_PREV_INVO_CONTEXT.

4.8.3.12. LIB$I64_PREV_INVO_END

This routine should be called at the conclusion of call tracing operations to free the memory used to process unwind descriptors. The call tracing routines are LIB$I64_GET_INVO_CONTEXT, LIB$I64_GET_PREV_INVO_CONTEXT, and LIB$I64_GET_CURR_INVO_CONTEXT.

To provide efficient call tracing, some unwind information is tracked in heap storage from one call to the next. This heap storage should be freed before you release or reuse the invocation context block.

Calling this routine is necessary if the LIBICB$V_UO_FLAG_CACHE_UNWIND flag is set in the LIBICB$Q_UO_FLAGS field of the invocation context block. If this flag is not set, unwind information is released and recreated at each call, and calling this routine is not required.
LIB$I64_PREV_INVO_END (invo_context)

Argument

OpenVMS Usage

Type

Access

Mechanism

invo_context

invo_context_blk

structure

modify

by reference

Arguments:

invo_context

Address of a valid invocation context block previously used for call tracing.

Function Value Returned:

status

A value of 1 indicates success. A value of 0 indicates failure.

4.8.3.13. LIB$I64_PUT_INVO_REGISTERS

The fields of a given procedure invocation context can be updated with new register contents by using this function:
LIB$I64_PUT_INVO_REGISTERS
  (invo_handle, invo_context [,gr_mask] [,fr_mask] [,br_mask]
  [,pr_mask] [,misc_mask])
Note that if user override routines are specified in the invocation context block, then they are used to find and modify the invocation context.

Argument

OpenVMS Usage

Type

Access

Mechanism

invo_handle

invo_handle

quadword

read

by reference

invo_context

invo_context_blk

structure

read

by reference

gr_mask

mask_octaword

128-bit vector

read

by reference

fr_mask

mask_octaword

128-bit vector

read

by reference

br_mask

mask_byte

8-bit vector

read

by reference

pr_mask

mask_quadword

64-bit vector

read

by reference

misc_mask

mask_quadword

64-bit vector

read

by reference

Arguments:

invo_handle

Handle for the invocation to be updated.

invo_context

Address of a valid invocation context block that contains new register contents.

At least one of the following mask arguments (gr_mask, fr_mask, br_mask, pr_mask, or misc_mask) must be specified; otherwise an error status is returned. Each register that is set in the xx_mask argument (along with its NaT bit, if any) is updated using the value found in the corresponding IREG[n], FREG[n], BRANCH[n], or PRED[n] field. GP, TP, and AI can also be updated in this way. No other fields of the invocation context block are used.

gr_mask

Address of a 128-bit bit vector, where each bit corresponds to a register field in the invo_context argument. Bits 0 through 127 correspond to IREG[0] through IREG[127].

  • Bit 0 corresponds to R0, which cannot be written, and is ignored.
  • Bit 1 corresponds to the global data pointer (GP).
  • Bit 13 corresponds to the thread pointer (TP).
  • Bit 25 corresponds to the argument information register (AI).
  • If bit 12, which corresponds to SP, is set, then no changes are made.

fr_mask

Address of a 128-bit bit vector, where each bit corresponds to a register field in the passed invo_context. To update floating-point registers F32-F127, provide a pointer to an array of 96 octawords in LIBICB$PH_F32_F127. Bits 0 through 127 correspond to FREG[0] through FREG[127]. Bit 0 corresponds to F0, which cannot be written, and is ignored. Bit 1 corresponds to F1, which cannot be written, and is ignored.

br_mask

Address of an 8-bit bit vector, where each bit corresponds to a register field in the passed invo_context. Bits 0 through 7 correspond to BRANCH[0] through BRANCH[7].

pr_mask

Address of a 64-bit bit vector, where each bit corresponds to a register field in the passed invo_context. Bits 0 through 63 correspond to PRED[0] through PRED[63].

misc_mask

Address of a 64-bit bit vector, where each bit corresponds to a register field in the passed invo_context as follows:
  • Bit 0=PC.
  • Bits 1-63 are reserved.

Note that PC can only be updated when the invocation in question has been interrupted (either by an exception or by an interrupt) and is logically previous to an invocation with the OSSD$V_EXCEPTION_FRAME bit set.

Function Value Returned:

status

A value of 1 indicates success. A value of 0 is returned (and nothing is changed) in the following circumstances:
  • When the invocation handle does not represent an active invocation context.

  • When bit 12 of the gr_mask argument is set.

  • When a scratch register has not been saved, or a register's save location or status cannot be determined (valid bit clear).

Caution

Great care must be taken to assure that a valid stack frame and execution environment result; otherwise, execution may become unpredictable.
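
Building a gr_mask can be sketched as follows. The mask128 type and its helpers are hypothetical, and the sketch assumes that bit n of the vector is bit (n mod 64) of quadword (n div 64), counting from the least significant bit; confirm the layout against the LIBICB include files before relying on it. The gr_mask_acceptable check mirrors only the documented rule that a mask with bit 12 (SP) set is rejected.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical 128-bit mask: two quadwords, bit n held in q[n / 64]. */
struct mask128 {
    uint64_t q[2];
};

/* Set bit 'bit' of the mask; out-of-range bit numbers are ignored. */
void mask128_set(struct mask128 *m, unsigned bit)
{
    if (bit < 128)
        m->q[bit / 64] |= (uint64_t)1 << (bit % 64);
}

/* Test bit 'bit' of the mask; out-of-range bit numbers read as clear. */
bool mask128_test(const struct mask128 *m, unsigned bit)
{
    return bit < 128 && ((m->q[bit / 64] >> (bit % 64)) & 1u) != 0;
}

/* A mask with bit 12 (SP) set causes the update to fail, so check before calling. */
bool gr_mask_acceptable(const struct mask128 *m)
{
    return !mask128_test(m, 12);
}
```

Checking the mask before the call lets a tool report the SP restriction separately from the other failure causes, which all return the same status value of 0.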

4.8.4. Supplemental Invocation Context Access Routines

The routines described in this section can be used to perform some of the more common operations involving invocation contexts.

4.8.4.1. LIB$I64_GET_FR

Given an invocation context block and a floating-point register index such that 0 <= index < 128, copies the register value to fr_copy. For example, an index value of 4 fetches the value that represents the contents of F4 for the context.

LIB$I64_GET_FR returns failure status if the index represents a scratch register whose contents have not been realized.

LIB$I64_GET_FR (invo_context, index, fr_copy)

Argument

OpenVMS Usage

Type

Access

Mechanism

invo_context

invo_context_blk

structure

read

by reference

index

index

longword

read

by value

fr_copy

floating-point value

octaword

write

by reference

Arguments:

invo_context

Address of a valid invocation context block.

index

Floating-point register index.

fr_copy

Address of an octaword to receive the contents of the specified floating-point register.

Function Value Returned:

status

A value of 1 indicates success. A value of 0 indicates failure.

4.8.4.2. LIB$I64_SET_FR

Given an invocation context block, a floating-point register index, and a floating-point register value in fr_copy, writes the corresponding invocation context block FREG entry, and calls LIB$I64_PUT_INVO_REGISTERS to write the actual context. The invocation context block remains unchanged if the routine fails.

LIB$I64_SET_FR fails if LIB$I64_PUT_INVO_REGISTERS fails.

LIB$I64_SET_FR (invo_context, index, fr_copy)

Argument

OpenVMS Usage

Type

Access

Mechanism

invo_context

invo_context_blk

structure

modify

by reference

index

index

longword

read

by value

fr_copy

floating-point value

octaword

read

by reference

Arguments:

invo_context

Address of a valid invocation context block.

index

Index into the FREG array of the invocation context block.

fr_copy

Address of an octaword that contains the floating-point value to be written to the invocation context block.

Function Value Returned:

status

A value of 1 indicates success. A value of 0 indicates failure.

4.8.4.3. LIB$I64_GET_GR

Given an invocation context block and a general register index such that 0 <= index < 128, copies the register value to gr_copy. For example, an index value of 4 fetches the invocation context block IREG[4] value, which represents the contents of R4 for the context.

If the register represented by index has its corresponding NaT bit set, the read succeeds and the return status is set to 3. If the register represented by index lies beyond the allocated general registers, the read fails and gr_copy is unchanged. That is, the highest allowed index is 32 + ICB.CFM.SOF - 1.

LIB$I64_GET_GR fails if the index represents a scratch register whose contents have not been realized.

LIB$I64_GET_GR (invo_context, index, gr_copy)

Argument

OpenVMS Usage

Type

Access

Mechanism

invo_context

invo_context_blk

structure

read

by reference

index

index

longword

read

by value

gr_copy

integer value

quadword

write

by reference

Arguments:

invo_context

Address of a valid invocation context block.

index

Index into the IREG array of the invocation context block.

gr_copy

Address of a quadword to receive the value from the invocation context block.

Function Value Returned:

status

A value of 3 indicates success, and the NaT bit was set.

A value of 1 indicates success, and the NaT bit was clear.

A value of 0 indicates failure.
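
The index limit and the three status values can be sketched as follows. The helper names are illustrative. The sketch assumes that the size-of-frame field (SOF) occupies bits <6:0> of the Current Frame Marker, as in the Itanium architecture, and it treats any status other than 1 or 3 as failure, which goes beyond what this section states.

```c
#include <stdint.h>

/* Highest general register index readable for a frame: 32 + CFM.SOF - 1. */
unsigned max_gr_index(uint64_t cfm)
{
    unsigned sof = (unsigned)(cfm & 0x7F);  /* assumed position of SOF: bits <6:0> */
    return 32u + sof - 1u;
}

/* Hypothetical classification of the status returned by LIB$I64_GET_GR. */
enum gr_read { GR_FAILED, GR_VALUE, GR_NAT };

enum gr_read classify_get_gr_status(int status)
{
    if (status == 3)
        return GR_NAT;    /* read succeeded, NaT bit was set */
    if (status == 1)
        return GR_VALUE;  /* read succeeded, NaT bit was clear */
    return GR_FAILED;     /* 0, or any value not described in this section */
}
```

Because both success values are odd, code that only tests the low bit of the status cannot distinguish a NaT read from an ordinary one; comparing against 3 explicitly avoids that.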

4.8.4.4. LIB$I64_SET_GR

Given an invocation context block, a general register index such that 1 <= index < 128, and a quadword value gr_copy, writes the corresponding invocation context block general register, clears the corresponding NaT bit and uses LIB$I64_PUT_INVO_REGISTERS to write to the actual context. The invocation context block remains unchanged if the routine fails.

LIB$I64_SET_GR fails if LIB$I64_PUT_INVO_REGISTERS fails.

LIB$I64_SET_GR (invo_context, index, gr_copy)

Argument

OpenVMS Usage

Type

Access

Mechanism

invo_context

invo_context_blk

structure

modify

by reference

index

index

longword

read

by value

gr_copy

integer value

quadword

read

by reference

Arguments:

invo_context

Address of a valid invocation context block.

index

Index into the IREG array of the invocation context block.

gr_copy

Address of a quadword that contains the value to be written to the invocation context block.

Function Value Returned:

status

A value of 1 indicates success. A value of 0 indicates failure.

4.8.4.5. LIB$I64_SET_PC

Given an invocation context block and a quadword PC value in pc_copy, write the pc_copy value to the invocation context block PC and then use LIB$I64_PUT_INVO_REGISTERS to write to the actual context. The invocation context block remains unchanged if the routine fails.

LIB$I64_SET_PC fails if LIB$I64_PUT_INVO_REGISTERS fails.

LIB$I64_SET_PC (invo_context, pc_copy)

Argument

OpenVMS Usage

Type

Access

Mechanism

invo_context

invo_context_blk

structure

modify

by reference

pc_copy

PC value

quadword

read

by reference

Arguments:

invo_context

Address of a valid invocation context block.

pc_copy

Address of a quadword that contains the PC value to be written to the invocation context block.

Function Value Returned:

status

A value of 1 indicates success. A value of 0 indicates failure.

4.8.4.6. LIB$I64_GET_UNWIND_LSDA

Given a pc_value, find the address of the unwind information block language-specific data area (LSDA), if present, and write it to unwind_lsda_p. If not present, then write 0 to unwind_lsda_p.

LIB$I64_GET_UNWIND_LSDA (pc_value, unwind_lsda_p)

Argument

OpenVMS Usage

Type

Access

Mechanism

pc_value

PC value

quadword

read

by reference

unwind_lsda_p

address

quadword

write

by reference

Arguments:

pc_value

Address of a location that contains the PC value. pc_value is used to find the unwind information block and the unwind information block language-specific data area address.

unwind_lsda_p

Address of a quadword to receive the address of the language-specific data area, if there is one.

Function Value Returned:

status

A value of 1 indicates success. A value of 0 indicates failure.

4.8.4.7. LIB$I64_GET_UNWIND_OSSD

Given a pc_value, find the address of the unwind information block operating system-specific data area, if present, and write it to unwind_ossd_p. If not present, then write 0 to unwind_ossd_p.

LIB$I64_GET_UNWIND_OSSD (pc_value, unwind_ossd_p)

Argument

OpenVMS Usage

Type

Access

Mechanism

pc_value

PC value

quadword

read

by reference

unwind_ossd_p

address

quadword

write

by reference

Arguments:

pc_value

Address of a location that contains the PC value. pc_value is used to find the unwind information block and the unwind information block operating system-specific data area address.

unwind_ossd_p

Address of a quadword to receive the address of the operating system-specific data area.

Function Value Returned:

status

A value of 1 indicates success. A value of 0 indicates failure.

4.8.4.8. LIB$I64_GET_UNWIND_HANDLER_FV

Given a pc_value, find the function value (address of the procedure descriptor) for the condition handler, if present, and write it to handler_fv. If not present, then write 0 to handler_fv.

LIB$I64_GET_UNWIND_HANDLER_FV (pc_value, handler_fv)

Argument

OpenVMS Usage

Type

Access

Mechanism

pc_value

PC value

quadword

read

by reference

handler_fv

address

quadword

write

by reference

Arguments:

pc_value

Address of a location that contains the PC value. pc_value is used to find the unwind information block and the unwind information block condition handler pointer.

handler_fv

Address of a quadword to receive the function value of the procedure descriptor for the condition handler, if there is one.

Function Value Returned:

status

A value of 1 indicates success. A value of 0 indicates failure.

4.8.4.9. LIB$I64_IS_EXC_DISPATCH_FRAME

Used to determine whether a given PC value represents an exception dispatch frame.

LIB$I64_IS_EXC_DISPATCH_FRAME (pc_value)

Argument

OpenVMS Usage

Type

Access

Mechanism

pc_value

PC value

quadword

read

by reference

Arguments:

pc_value

Address of a quadword that contains the PC value. The pc_value is used to find the operating system-specific data area in the unwind information for this routine.

Function Value Returned:

status

Returns 1 if the operating system-specific data area is present and the EXCEPTION_FRAME flag is set.

Returns 0 if the operating system-specific data area is present and the EXCEPTION_FRAME flag is clear.

Returns 0 if the operating system-specific data area is not present.

4.8.4.10. LIB$I64_IS_AST_DISPATCH_FRAME

Used to determine whether a given PC value represents an AST dispatch frame.

LIB$I64_IS_AST_DISPATCH_FRAME (pc_value)

Argument

OpenVMS Usage

Type

Access

Mechanism

pc_value

PC value

quadword

read

by reference

Arguments:

pc_value

Address of a quadword that contains the PC value. The pc_value is used to find the operating system-specific data area in the unwind information block for this routine.

Function Value Returned:

status

Returns 1 if the operating system-specific data area is present and the AST_FRAME flag is set.

Returns 0 if the operating system-specific data area is present and the AST_FRAME flag is clear.

Returns 0 if the operating system-specific data area is not present.

4.8.5. Invocation Context Callback Routines

Advanced users can override the way the call stack is traced by providing custom callback routines. These routines can be used to perform the following functions:
  • Perform a call trace on a process other than the current process.

  • Override the heap storage mechanism used to allocate memory used during the analysis of unwind descriptors.

The user override callback mechanism provides a user ident value that is passed to each callback routine. The user ident value is stored in the LIBICB$IH_UO_IDENT field of the invocation context block.

The routines described in this section must be provided to override the call stack walk.

Note

The callback routines cannot be used with the following routines, which are not passed a context block:
  • LIB$I64_GET_CURR_INVO_HANDLE

  • LIB$I64_GET_PREV_INVO_HANDLE

4.8.5.1. The Get Unwind Information Routine

Place a function pointer for this routine in the LIBICB$PH_UO_GETUEINFO field of the invocation context block.
int (* getueinfo) (uint64 pc, void *get_ue_block, void *name, ...);

This routine should mimic SYS$GET_UNWIND_ENTRY_INFO for the target process. See Section A.7 for detailed argument descriptions and return status, with the following notes:

The name argument is not used and can be ignored. If a read memory callback has been specified, the contents of LIBICB$PH_UO_READ_MEM are passed as a fourth argument and the contents of LIBICB$PH_UO_IDENT are passed as a fifth argument; otherwise, the routine is called with three arguments.
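The optional fourth and fifth arguments described above can be retrieved with the standard C variable-argument macros. The following is a minimal sketch only: the names demo_getueinfo, demo_read_mem, and the demo_* globals are hypothetical, uint64_t stands in for the uint64 type used in the prototype, and the returned status is a placeholder (the real return values are those described in Section A.7). Because the callee cannot tell how many arguments it received, the sketch assumes the installing code records whether it also registered a read memory callback.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdint.h>

static_assert(sizeof(uint64_t) == 8, "the ident value is a quadword");

typedef int (*read_mem_fn)(void *dst, uint64_t src, size_t length, uint64_t ident);

/* Hypothetical bookkeeping, set by the code that installs the callbacks. */
int demo_read_mem_installed = 0;
read_mem_fn demo_last_read_mem = NULL;
uint64_t demo_last_ident = 0;

/* Hypothetical read memory callback; it only serves as a value to pass. */
int demo_read_mem(void *dst, uint64_t src, size_t length, uint64_t ident)
{
    (void)dst; (void)src; (void)length; (void)ident;
    return 0;
}

/* Sketch of a get unwind information callback. */
int demo_getueinfo(uint64_t pc, void *get_ue_block, void *name, ...)
{
    (void)name;                      /* not used, per the text above */
    if (get_ue_block == NULL)
        return 0;

    if (demo_read_mem_installed) {
        va_list ap;
        va_start(ap, name);
        demo_last_read_mem = va_arg(ap, read_mem_fn);  /* fourth argument */
        demo_last_ident = va_arg(ap, uint64_t);        /* fifth argument */
        va_end(ap);
    }

    /* A real routine would now look up the unwind entry for pc in the
       target process and fill in get_ue_block. */
    (void)pc;
    return 1;                        /* placeholder status */
}
```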

4.8.5.2. The Get Initial Context Routine

Place a function pointer for this routine in the LIBICB$PH_UO_GETCONTEXT field of the invocation context block.

The get initial context routine is used to seed the invocation context block from the target process. This routine should initialize the invocation context block structure with the preserved general, floating, branch, and predicate registers, as well as Application Registers such as AR.RSC, AR.BSP, and AR.PFS from the target process. This routine should set the valid bits corresponding to the saved registers in the VALID fields. This routine must store the original spill address corresponding to R0 in the ORIGINAL_SPILL_ADDR field. This callback routine is used by LIB$I64_GET_CURR_INVO_CONTEXT and should be followed by at least one call to LIB$I64_GET_PREV_INVO_CONTEXT to generate a working context.
int (* getcontext) (void *invo_context, uint64 ident);

Argument        OpenVMS Usage       Type         Access    Mechanism
invo_context    invo_context_blk    structure    modify    by reference
ident           user_value          quadword     read      by value

Arguments:

invo_context

The address of the invocation context block.

ident

Specifies a user ident value from the invocation context block.

Function Value Returned:

status

A value of 1 indicates success. A value of 0 indicates failure.

4.8.5.3. The Read Memory Routine

Place a function pointer for this routine in the LIBICB$PH_UO_READ_MEM field of the invocation context block.

The read memory routine is used to transfer data from the target process.
int (* read_mem) (void *dst, uint64 src, size_t length, uint64 ident);

Argument    OpenVMS Usage     Type          Access    Mechanism
dst         memory_access     byte_array    write     by reference
src         memory_address    quadword      read      by value
length      size_t            longword      read      by value
ident       user_value        quadword      read      by value

Arguments:

dst

A local memory address and the destination for the read operation.

src

An address in the target process to be read.

length

The length in bytes to be read.

ident

Specifies a user ident value from the invocation context block.

Function Value Returned:

status

A value of 1 indicates success. A value of 0 indicates failure.
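As an illustration, a read memory callback might serve requests from a previously captured copy of the target's memory. This is a sketch under stated assumptions: the target_snapshot structure and the snapshot_read_mem name are hypothetical, uint64_t stands in for uint64, and the user ident value is assumed to carry the address of the snapshot descriptor.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

static_assert(sizeof(uintptr_t) <= sizeof(uint64_t),
              "ident must be able to hold a local pointer");

/* Hypothetical snapshot of target memory: `bytes` holds `size` bytes
   that start at target address `base`. */
typedef struct {
    uint64_t base;
    size_t size;
    const unsigned char *bytes;
} target_snapshot;

/* Copies `length` bytes from target address `src` into local buffer
   `dst`. Returns 1 on success and 0 on failure. */
int snapshot_read_mem(void *dst, uint64_t src, size_t length, uint64_t ident)
{
    const target_snapshot *snap = (const target_snapshot *)(uintptr_t)ident;

    if (snap == NULL || dst == NULL || src < snap->base)
        return 0;

    uint64_t offset = src - snap->base;
    if (offset > snap->size || length > snap->size - offset)
        return 0;                    /* request runs past the snapshot */

    memcpy(dst, snap->bytes + offset, length);
    return 1;
}
```

Rejecting out-of-range requests with a status of 0, rather than copying a partial result, keeps the callback consistent with the success/failure contract described above.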

4.8.5.4. The Write Memory Routine

Place a function pointer for this routine in the LIBICB$PH_UO_WRITE_MEM field of the invocation context block.

The write memory routine is used to transfer data to the target process. It is used by LIB$I64_PUT_INVO_REGISTERS for a register that has been saved in memory.
int (* write_mem) (void *src, uint64 dst, size_t length, uint64 ident);

Argument    OpenVMS Usage     Type          Access    Mechanism
src         memory_access     byte_array    read      by reference
dst         memory_address    quadword      write     by value
length      size_t            longword      read      by value
ident       user_value        quadword      read      by value

Arguments:

src

A local memory address and the source for the write operation.

dst

An address in the target process to be written.

length

The length in bytes to be written.

ident

Specifies a user ident value from the invocation context block.

Function Value Returned:

status

A value of 1 indicates success. A value of 0 indicates failure.
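A write memory callback can be sketched in the same spirit. Here the "target" is a writable local buffer that models the target address range; the target_image structure and the image_write_mem name are hypothetical, uint64_t stands in for uint64, and the ident value is assumed to carry the address of the descriptor.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

static_assert(sizeof(uintptr_t) <= sizeof(uint64_t),
              "ident must be able to hold a local pointer");

/* Hypothetical writable model of target memory: `bytes` holds `size`
   bytes that start at target address `base`. */
typedef struct {
    uint64_t base;
    size_t size;
    unsigned char *bytes;
} target_image;

/* Copies `length` bytes from local buffer `src` to target address
   `dst`. Returns 1 on success and 0 on failure; nothing is written
   when the request does not fit. */
int image_write_mem(void *src, uint64_t dst, size_t length, uint64_t ident)
{
    target_image *img = (target_image *)(uintptr_t)ident;

    if (img == NULL || src == NULL || dst < img->base)
        return 0;

    uint64_t offset = dst - img->base;
    if (offset > img->size || length > img->size - offset)
        return 0;

    memcpy(img->bytes + offset, src, length);
    return 1;
}
```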

4.8.5.5. The Write Register Routine

Place a function pointer for this routine in the LIBICB$PH_UO_WRITE_REG field of the invocation context block.

The write register routine is used to write a register in the target process. It is used by LIB$I64_PUT_INVO_REGISTERS for a register that has not been saved in memory.

This routine is optional, and an implementation may support only a subset of the registers. LIB$I64_PUT_INVO_REGISTERS returns an error if this routine is not present or is unable to write the desired register.
int (* write_reg)
    (int whichReg, uint64 value_1, uint64 value_2, uint64 ident);

Argument    OpenVMS Usage     Type        Access    Mechanism
whichReg    enumeration       longword    read      by value
value_1     register_value    quadword    read      by value
value_2     register_value    quadword    read      by value
ident       user_value        quadword    read      by value

Arguments:

whichReg

Indicates the register to be written (see enum in libicb.h).

value_1

Specifies the register contents, or lower quadword for a FR fill operation.

value_2

Specifies the NaT bit for GRs, or upper quadword for a FR fill.

ident

Specifies a user ident value from the invocation context block.

Function Value Returned:

status

A value of 1 indicates success. A value of 0 indicates failure.

4.8.5.6. The Memory Allocation Routine

The memory allocation routine is used to allocate heap storage required during the analysis of unwind descriptors. This routine should mimic the behavior of the C RTL routine malloc.
void * (* malloc) (size_t length, uint64 ident);

Argument    OpenVMS Usage    Type        Access    Mechanism
length      size_t           longword    read      by value
ident       user_value       quadword    read      by value

Arguments:

length

The length in bytes of memory to be allocated. The returned memory block should be aligned on a 16-byte boundary.

ident

Specifies a user ident value from the invocation context block.

Function Value Returned:

ptr

Address of the memory block allocated, or 0 for failure.

In the case where local memory is being read, that is, you have not overridden the read memory routines, the malloc requests are reduced to:
  • One Unwind Context block of size LIBICB$K_CONTEXT_BLK_SIZE

  • One Unwind Descriptor block of size LIBICB$K_DESCRIPTOR_BLK_SIZE

  • Several Unwind region blocks of size LIBICB$K_REGION_BLK_SIZE

  • Several Unwind region label blocks of size LIBICB$K_REGIONLABEL_BLK_SIZE

The number of unwind region blocks and unwind region label blocks required depends on the complexity of the unwind descriptors for the procedure being traced.
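One possible shape for a user allocation routine, together with a matching deallocation routine, is sketched below. It is not the implementation used by the run-time library: demo_malloc, demo_free, and the alloc_stats structure are hypothetical names, uint64_t stands in for uint64, the ident value is assumed to carry the address of a statistics block, and the 16-byte alignment is obtained with the C11 aligned_alloc function.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

static_assert(sizeof(uintptr_t) <= sizeof(uint64_t),
              "ident must be able to hold a local pointer");

/* Hypothetical per-trace bookkeeping reached through the ident value. */
typedef struct {
    size_t live_blocks;
} alloc_stats;

/* Returns a 16-byte aligned block of at least `length` bytes, or NULL
   (0) on failure. */
void *demo_malloc(size_t length, uint64_t ident)
{
    alloc_stats *stats = (alloc_stats *)(uintptr_t)ident;

    if (length > SIZE_MAX - 15)
        return NULL;

    /* aligned_alloc expects a size that is a multiple of the alignment. */
    size_t rounded = (length + 15) & ~(size_t)15;
    if (rounded == 0)
        rounded = 16;

    void *block = aligned_alloc(16, rounded);
    if (block != NULL && stats != NULL)
        stats->live_blocks++;
    return block;
}

/* Releases a block obtained from demo_malloc. */
void demo_free(void *ptr, uint64_t ident)
{
    alloc_stats *stats = (alloc_stats *)(uintptr_t)ident;

    if (ptr != NULL && stats != NULL)
        stats->live_blocks--;
    free(ptr);
}
```

Counting live blocks through the ident value is only one way to use it; the value could equally identify a private heap or a target process.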

4.8.5.7. The Memory Deallocation Routine

The memory deallocation routine is used to free heap storage allocated by the memory allocation routine (see Section 4.8.5.6). This routine should mimic the behavior of the C RTL routine free.
void (* free) (void * ptr, uint64 ident);

Argument    OpenVMS Usage    Type        Access    Mechanism
ptr         address          quadword    read      by value
ident       user_value       quadword    read      by value

Arguments:

ptr

Address of a memory block previously allocated by a call to the user malloc routine.

ident

Specifies a user ident value from the invocation context block.

Function Value Returned:

None.

4.9. Data Allocation

In order to make the most effective use of the addressing modes available to Intel Itanium processors, each image's data is partitioned into one or two short data segments and some number of long data segments. The short data segments, addressed by the GP register in each image, contain the following areas:
  • A linkage table, containing pointers to imported data and functions, and to data in the code segments and long data segments. This area is generally protected by OpenVMS against being written after image activation is complete.

  • A read-only short data area, containing small initialized own data items. This area is generally protected by OpenVMS against being written after image activation is complete. (This area is optional.)

  • A read-write short data area, containing small initialized own data items.

  • A read-write short bss area, containing small uninitialized own data items.

The long data segments contain either or both of the following areas:
  • One or more long data areas, which contain large initialized data items, and initialized non-own data items of any size.

  • One or more long bss areas, which contain large uninitialized data items, and uninitialized non-own data items of any size.

Own data items are those that are either local to an image, or are such that all references to these items from the same image will always refer to these items. Because non-own variables cannot be referenced directly, there is no benefit to placing them in the short data area or bss area. Small own data items are placed in the short bss area or short data areas, and are guaranteed to be within 2 megabytes (in either direction) of the GP address; this allows compilers to use a short direct addressing sequence (using the add with 22-bit immediate instruction) to access any data item allocated in these areas.

The compiler should place all own data items that are 8 bytes or less in size (regardless of structure) in one of the short data areas or the short bss area. All other data items, including items that are larger than 8 bytes in size, must be placed in one of the long data areas or long bss areas. The compiler must address these items indirectly, using a linkage table entry. Linkage table entries are typically allocated by the linker in response to a relocation request generated by the compiler; an entry in the linkage table is either a pointer to a data item, or a function descriptor. A function descriptor placed in the linkage table is a local copy of an official function descriptor that is generally allocated by the linker or image activator.

This design allows for a maximum size of 4 megabytes for the short data segment, because everything must be addressable via the GP register using the 22-bit add immediate instruction. This allows for up to 256,000 individually-named variables and functions. If an image requires more than this, linker options may be used to divide the image into multiple clusters (see Section 4.7.1).
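The relationship between the 22-bit immediate and the 4-megabyte limit can be checked with a little arithmetic: a signed 22-bit value spans -2^21 through 2^21 - 1, which is 2 megabytes on either side of the GP address. The sketch below encodes that range; the names GPREL22_MIN, GPREL22_MAX, and fits_gprel22 are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Range of a signed 22-bit immediate. */
#define GPREL22_MIN (-(INT64_C(1) << 21))
#define GPREL22_MAX ((INT64_C(1) << 21) - 1)

static_assert(GPREL22_MAX - GPREL22_MIN + 1 == INT64_C(4) * 1024 * 1024,
              "the addressable window spans 4 megabytes");

/* Returns 1 if an item at the given offset from GP can be reached with
   a single add with 22-bit immediate, and 0 otherwise. */
int fits_gprel22(int64_t gp_relative_offset)
{
    return gp_relative_offset >= GPREL22_MIN &&
           gp_relative_offset <= GPREL22_MAX;
}
```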

4.9.1. Data Alignment

On Itanium hardware, memory references to data that is not naturally aligned can result in alignment faults, which can severely degrade the performance of all procedures that reference the unaligned data. To avoid such performance degradation, all data values should be naturally aligned, as shown in Table 4.18.

In addition, common blocks, dynamically allocated (heap) regions (for example from malloc), and global data items greater than 8 bytes must be aligned on a 16-byte boundary.
Table 4.18. Natural Alignment Requirements

Data Type                         Alignment Starting Position
8-bit character string            Byte boundary
16-bit integer                    Address that is a multiple of 2 (word alignment)
32-bit integer                    Address that is a multiple of 4 (longword alignment)
64-bit integer                    Address that is a multiple of 8 (quadword alignment)
F_floating, F_floating complex    Address that is a multiple of 4 (longword)
D_floating, D_floating complex    Address that is a multiple of 8 (quadword)
G_floating, G_floating complex    Address that is a multiple of 8 (quadword)
S_floating, S_floating complex    Address that is a multiple of 4 (longword)
T_floating, T_floating complex    Address that is a multiple of 8 (quadword)
X_floating, X_floating complex    Address that is a multiple of 16 (octaword)

For aggregates such as strings, arrays, and records, the data type to be considered for purposes of alignment is not the aggregate itself, but rather the elements of which the aggregate is composed. The alignment requirement of an aggregate is that all elements of the aggregate be naturally aligned. For example, varying 8-bit character strings must start at addresses that are a multiple of at least 2 (word alignment) because of the 16-bit count at the beginning of the string; 32-bit integer arrays start at a longword boundary, irrespective of the extent of the array.
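A natural-alignment test reduces to checking that the address is a multiple of the element's alignment. The helper below is a minimal sketch with a hypothetical name (is_naturally_aligned); it takes the alignment in bytes from Table 4.18 and, for an aggregate, would be applied to each element rather than to the aggregate as a whole.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Returns 1 if `address` is a multiple of `alignment`, 0 otherwise.
   `alignment` must be a power of two (1, 2, 4, 8, or 16 in Table 4.18). */
int is_naturally_aligned(uint64_t address, size_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (address & (uint64_t)(alignment - 1)) == 0;
}
```

For example, a varying 8-bit character string starts with a 16-bit count, so its starting address would be checked with an alignment of 2.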

The rules for passing a record in an argument that is passed by immediate value (see Section 4.7.4) always provide quadword alignment of the record value independent of the normal alignment requirement of the record. If deemed appropriate by an implementation, normal alignment can be established within the called procedure by making a copy of the record argument at a suitably aligned location.

4.9.2. Global Data

Access to global variables that are not known (at compile time) to be defined in the same image must be indirect. Each image has a linkage table in its data segment, pointed to by the GP register; code must load a pointer to the global variable from the linkage table, then access the global variable through the pointer. Access to global variables known to be defined in the same image or to static locals that are placed in the short data area may be made with a GP-relative offset.

4.9.3. Local Static Data

Access to short local static data can be made with a GP-relative offset; access to long local static data must be indirect.

4.9.4. Constants and Literals

Constants and literals may be placed in the text segment or in the data segment. If placed in the text segment, the access must be PC-relative or indirect using a linkage table entry. Literals placed in the data segment may be placed in the short initialized data area if they are 8 bytes or less in size. Larger literals must be placed in the long initialized data area or in the text segment. Literals in the long initialized data area require an indirect access using a linkage table entry.

4.9.5. Record Layout Conventions

The OpenVMS I64 calling standard rules for record layout are designed to provide good run-time performance on all implementations of the Itanium architecture and to provide the required level of compatibility with conventional VAX and Alpha operating environments.

Therefore, this standard defines the following record layout conventions:
  • Those optimized for optimal access characteristics (referred to as aligned record layouts)

  • Those compatible with conventions that are traditionally used by VAX languages (referred to as VAX compatible record layouts)

Only these record layouts may be used across standard interfaces or between languages. Languages can support other language-specific record layout conventions, but such layouts are nonstandard.

The aligned record layout conventions should be used unless interchange is required with conventional VAX applications that use the OpenVMS VAX compatible record layouts.

4.9.5.1. Aligned Record Layout

The aligned record layout conventions ensure that:
  • All components of a record or subrecord are naturally aligned.

  • Layout and alignment of record elements and subrecords are independent of any record or subrecord in which they are embedded.

  • Layout and alignment of a subrecord is the same as if it were a top-level record.

  • Declaration in high-level languages of standard records for interlanguage use is straightforward and obvious, and meets the requirements for source-level compatibility between OpenVMS I64 languages and OpenVMS Alpha and VAX languages.

The aligned record layout is defined by the following conventions:
  • The components of a record must be laid out in memory corresponding to the lexical order of their appearance in the high-level language declaration of the record.

  • The first bit of a record or subrecord must be directly addressable (byte aligned).

  • Records and subrecords must be aligned according to the largest natural alignment requirements of the contained elements and subrecords.

  • Bit fields (packed subranges of integers) are characterized by an underlying integer type that is a byte, word, longword, or quadword in size, together with an allocation size in bits. A bit field is allocated at the next available bit boundary, provided that the resulting allocation does not cross an alignment boundary of the underlying type. Otherwise, the field is allocated at the next byte boundary that is aligned as required for the underlying type. (In the latter case, the space skipped over is left permanently unallocated.) In addition, if necessary, the alignment of the record as a whole is increased to that of the underlying integer type.

  • Unaligned bit strings, unaligned bit arrays, and elements of unaligned bit arrays must start at the next available bit in the record. No fill is ever supplied preceding an unaligned bit string, unaligned bit array, or unaligned bit array element.

  • All other components of a record must start at the next available naturally aligned address for the data type.

  • The length of a record must be a multiple of its alignment. (This includes the case when a record is a component of another record).

  • Strings and arrays must be aligned according to the natural alignment requirements of the data type of which the string or array is composed.

  • The length of an array element is a multiple of its alignment, even if this leaves unused space at its end. The length of the whole array is the sum of the lengths of its elements.
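The conventions above for ordinary (non-bit) components can be sketched as a small layout calculator: each component starts at the next naturally aligned offset, the record takes the largest alignment of its components, and the record length is rounded up to a multiple of that alignment. The field_desc type and the aligned_layout name are hypothetical, and bit fields and unaligned bit strings are deliberately left out of this sketch.

```c
#include <assert.h>
#include <stddef.h>

/* Size and natural alignment of one component, in bytes. */
typedef struct {
    size_t size;
    size_t align;
} field_desc;

static size_t round_up(size_t value, size_t align)
{
    return (value + align - 1) / align * align;
}

/* Stores each component's offset in offsets[], stores the record's
   alignment in *record_align, and returns the record length. */
size_t aligned_layout(const field_desc *fields, size_t count,
                      size_t *offsets, size_t *record_align)
{
    size_t offset = 0;
    size_t max_align = 1;

    for (size_t i = 0; i < count; i++) {
        assert(fields[i].align != 0);
        offset = round_up(offset, fields[i].align);  /* next aligned address */
        offsets[i] = offset;
        offset += fields[i].size;
        if (fields[i].align > max_align)
            max_align = fields[i].align;
    }

    *record_align = max_align;
    return round_up(offset, max_align);  /* length is a multiple of alignment */
}
```

For a record declared as a byte, a longword, a byte, and a quadword, in that order, this yields offsets 0, 4, 8, and 16, an alignment of 8, and a length of 24 bytes.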

4.9.5.2. OpenVMS VAX Compatible Record Layout

The OpenVMS VAX compatible record layout is defined by the following conventions:
  • The components of a record must be laid out in memory corresponding to the lexical order of their appearance in the high-level language declaration of the record.

  • Unaligned bit strings, unaligned bit arrays, and elements of unaligned bit arrays must start at the next available bit in the record. No fill is ever supplied preceding an unaligned bit string, unaligned bit array, or unaligned bit array element.

  • All other components of a record must start at the next available byte in the record. Any unused bits following the last-used bit in the last-used byte of each component must be filled out to the next byte boundary so that any following data starts on a byte boundary.

  • Subrecords must be aligned according to the largest alignment of the contained elements and subrecords. A subrecord always starts at the next available byte unless it consists entirely of unaligned bit data and it immediately follows an unaligned bit string, unaligned bit array, or a subrecord consisting entirely of unaligned bit data.

  • Records must be aligned on byte boundaries.
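For comparison, the VAX compatible conventions place each ordinary component at the next available byte, with no fill between components. The sketch below covers only flat records whose components are whole bytes in size; the vax_compatible_layout name is hypothetical, and unaligned bit data and subrecords are not modeled.

```c
#include <assert.h>
#include <stddef.h>

/* Stores each component's byte offset in offsets[] and returns the
   record length. Components are packed on byte boundaries. */
size_t vax_compatible_layout(const size_t *sizes, size_t count,
                             size_t *offsets)
{
    size_t offset = 0;

    assert(count == 0 || (sizes != NULL && offsets != NULL));
    for (size_t i = 0; i < count; i++) {
        offsets[i] = offset;     /* next available byte */
        offset += sizes[i];
    }
    return offset;
}
```

The same byte, longword, byte, quadword record that an aligned layout would pad occupies offsets 0, 1, 5, and 6 here, for a length of 14 bytes.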

4.9.6. Sample Code Sequences

In the sample code sequences in this section, register names of the form t1, t2, and so on, are temporary registers, and may be assigned to any available scratch register. The code sequences show necessary cycle breaks, but no other scheduling considerations have been made. It is assumed that these code sequences will be scheduled with surrounding code to make best use of the processor resources.

4.9.6.1. Addressing Own Data in the Short Data Area

Own short data can be addressed with a simple direct reference relative to the GP register, as shown in the following example:
    addl   t1=@gprel(var),gp ;;   // calc. address of var
    ld8    loc0=[t1]              // load contents of var
Own long data can be addressed either via the linkage table, as shown in Section 4.9.6.2, or directly as shown in the following example:
    movl   t1=@gprel(var) ;;      // form gp-relative offset of var
    add    t2=t1,gp ;;            // calc. address of var
    ld8    loc0=[t2]              // load contents of var

4.9.6.2. Addressing External Data or Data in a Long Data Area

When data is not known to be defined in the current image (that is, it is not own), or if it is too large for the short data region, it must be accessed indirectly through the linkage table, as shown in the following example:
    addl   t1=@ltoff(var),gp ;;   // calc. address of LT entry
    ld8    t2=[t1] ;;             // load address of var
    ld8    loc0=[t2]              // load contents of var

4.9.6.3. Addressing Literals in the Text Segment

Literals in the text segment may be addressed either through the linkage table, as in Section 4.9.6.2, or with PC-relative addressing, as shown in the following example:
L1: mov    r3=ip ;;                 // get current IP
    addl   loc0=litbase-L1,r3 ;;    // calc. addr. of lit. area
    adds   t2=(lit-litbase),loc0 ;; // calc. address of lit.
    ld8    loc1=[t2]                // load value of literal

Note

The first two instructions can be moved towards the beginning of the procedure, and the base address of the literal area (in LOC0) can be shared by other literal references in the same procedure.

4.9.6.4. Materializing Function Pointers

Function pointers must always be obtained from the data segment, either as an initialized quadword or through the linkage table, as shown in the following examples:

Materializing function pointers through linkage table:
    addl   t1=@ltoff(@fptr(func)),gp ;; // calc address of LT entry
    ld8    loc0=[t1]                    // load function pointer
Materializing function pointers in data:
fptr:
    data8  @fptr(func)                  // initialize function ptr

4.9.6.5. Jump Tables

High-level language constructs such as case and switch statements, where there are several possible local targets of a branch, may use a number of different code generation strategies, ranging from sequential conditional branches to a direct-lookup branch table.

Two branch table methods are described: The first places the branch table in a read-only segment separate from the code segment. The second places the branch table in the code segment. The advantage of the first is that it allows the code segment to have execute-only access, while the second may require the code segment to allow read access as well. The advantage of the second is that it does not require addressing the branch table via the GP and hence may be slightly faster. Both methods avoid the need for relocation during image activation.

The branch table method descriptions that follow include examples that use 64-bit entries. It is also valid to use 32-bit, 16-bit, or even 8-bit entries, provided that the smaller entry size is known to be sufficient to represent the required displacement without overflow.
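Choosing an entry size amounts to finding the narrowest width that holds every displacement in the table. The following sketch assumes that entries are stored as signed values (the text does not state the signedness, so this is an assumption) and uses a hypothetical name, smallest_entry_bits.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Returns 8, 16, 32, or 64: the narrowest signed entry width that can
   represent every displacement without overflow. */
int smallest_entry_bits(const int64_t *displacements, size_t count)
{
    int bits = 8;

    assert(count == 0 || displacements != NULL);
    for (size_t i = 0; i < count; i++) {
        int64_t d = displacements[i];
        int need = (d >= INT8_MIN && d <= INT8_MAX) ? 8
                 : (d >= INT16_MIN && d <= INT16_MAX) ? 16
                 : (d >= INT32_MIN && d <= INT32_MAX) ? 32
                 : 64;
        if (need > bits)
            bits = need;
    }
    return bits;
}
```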

Preferred Method

If a branch table is placed in a data segment separate from the code, each entry should be a byte displacement from a dispatch address located in the code segment to the branch target for that entry.

The following is a sample branch table and its associated code segment:
   //
   // Assume case index in loc0
   //
         addl    loc1=@ltoff($DSPTBL1),gp ;; // addr of GOT entry
         ld8     loc2=[loc1] ;;              // load addr of dsp table
         shladd  loc3=loc0,3,loc2 ;;         // calc addr of dsp entry
         ld8     loc4=[loc3]                 // load dsp table entry
   $DA1: mov     loc5=ip ;;                  // get "dispatch address"
         add     loc6=loc5,loc4 ;;           // calc target address
         mov     b6=loc6 ;;                  // move address to B6
         br.cond b6                          // perform dispatch

   $L1:  {target for case 1}
         ...
   $L2:  {target for case 2}
         ...
    ...   etc

   // The dispatch table is in the linkage section. It consists
   // of only constants (no relocations involved)
   //
   $DSPTBL1:
          .data8  $L1-$DA1
          .data8  $L2-$DA1
            .
            .
            .
Alternative Method

If a branch table is placed in the same segment as the code, each table entry should be a 64-bit byte displacement from the base of the branch table to the branch target for that entry.

A sample indirect branch is shown below. The branch table is assumed to be an array of entries, each of which is an offset relative to the beginning of the branch table to the branch target. The branch table index is assumed to have been computed or loaded into register LOC0.
      addl loc1=@ltoff(brtab),gp      // calc. address of
      ;;                              // linkage table entry
      ld8 loc2=[loc1] ;;              // load addr. of br. table
      shladd loc3=loc0,3,loc2 ;;      // calc. address of branch
                                      // table entry
      ld8 loc4=[loc3] ;;              // load branch table entry
      add loc5=loc4,loc2 ;;           // calc. target address
      mov b6=loc5 ;;                  // move address to B6...
      br.cond b6 ;;                   // ...and branch

Chapter 5. OpenVMS x86-64 Conventions

This chapter describes the fundamental concepts and conventions for calling a procedure in an OpenVMS x86-64 environment. These conventions are based on industry standards with extensions to be compatible with other OpenVMS systems. See Section C.2 for additional information.

5.1. x86-64 Register Usage

This section describes the register conventions for OpenVMS x86-64. OpenVMS uses the following register types:
  • General-purpose

  • Floating-point and related control/status

  • Segment

  • Legacy pseudo-registers

5.1.1. x86-64 Register Classes

The x86-64 registers are partitioned into the following classes that define the way a register can be used within a procedure:
  • Scratch registers—may be modified by a procedure call; the caller must save these registers before a call if needed (caller save).

  • Preserved registers—must not be modified by a procedure call; the callee must save and restore these registers if used (callee save). A procedure using one of the preserved general-purpose registers must save and restore the original content of the caller.

    One way to preserve a register is not to use it at all.

  • Special registers—used in the calling standard call/return mechanism.

  • Volatile registers—may be used as scratch registers within a procedure and are not preserved across a call; may not be used to pass information between procedures either as input or output.

5.1.2. x86-64 General-Purpose Register Usage

This calling standard defines the usage of the OpenVMS x86-64 general-purpose registers as listed in Table 5.1.
Table 5.1. x86-64 General-Purpose Register Usage
Register                  Class        Usage
%rax %eax %ax %al %ah     Scratch      Pass the argument information. 1st return value register.
%rbx %ebx %bx %bl %bh     Preserved    Callee-saved registers.
%rcx %ecx %cx %cl %ch     Scratch      Pass the 4th argument to procedures.
%rdx %edx %dx %dl %dh     Scratch      Pass the 3rd argument to procedures. 2nd return value register.
%rsi %esi %si %sil        Scratch      Pass the 2nd argument to procedures.
%rdi %edi %di %dil        Scratch      Pass the 1st argument to procedures.
%rbp %ebp %bp %bpl        Preserved    Used as a frame pointer, if manifested in a register.
%rsp %esp %sp %spl        Special      Stack pointer.
%r8 %r8d %r8w %r8l        Scratch      Pass the 5th argument to procedures.
%r9 %r9d %r9w %r9l        Scratch      Pass the 6th argument to procedures.
%r10 %r10d %r10w %r10l    Scratch      Pass the environment value when calling a bound procedure.
%r11 %r11d %r11w %r11l    Volatile     Available for use in call stubs, trampolines, and other constructs.
%r12 %r12d %r12w %r12l    Preserved    Callee-saved registers.
%r13 %r13d %r13w %r13l
%r14 %r14d %r14w %r14l
%r15 %r15d %r15w %r15l
RFLAGS                    Preserved    The Direction Flag (DF) bit must be zero at procedure call and return.
                          Scratch      All other bits.
%rip                      Special      Instruction pointer, not directly addressable by software.
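The argument-passing rows of Table 5.1 can be summarized as a lookup from argument position to 64-bit register name. This is only an illustration of the table; the gpr_for_argument name is hypothetical, and the sketch returns a null pointer for positions beyond the sixth because this table does not assign them a register.

```c
#include <assert.h>
#include <stddef.h>

/* General-purpose argument registers in the order given by Table 5.1. */
static const char *const gpr_argument_registers[] = {
    "%rdi", "%rsi", "%rdx", "%rcx", "%r8", "%r9"
};

static_assert(sizeof gpr_argument_registers /
              sizeof gpr_argument_registers[0] == 6,
              "Table 5.1 lists six argument registers");

/* Returns the register that carries the argument at the given 1-based
   position, or NULL if the table assigns none. */
const char *gpr_for_argument(unsigned position)
{
    size_t count = sizeof gpr_argument_registers /
                   sizeof gpr_argument_registers[0];

    if (position == 0 || position > count)
        return NULL;
    return gpr_argument_registers[position - 1];
}
```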

5.1.3. x86-64 Floating-Point Register Usage (SSE)

The base x86-64 architecture provides 16 SSE floating-point registers, each 128 bits wide.

The Intel AVX (Advanced Vector Extensions) option provides 16 256-bit wide AVX registers (%ymm0 through %ymm15). The lower 128 bits of %ymm0 through %ymm15 are aliased to the respective 128-bit SSE registers (%xmm0 through %xmm15).

The Intel AVX-512 option provides 32 512-bit wide SIMD registers (%zmm0 through %zmm31). The lower 128 bits of %zmm0 through %zmm31 are aliased to the respective 128-bit SSE registers (%xmm0 through %xmm31). The lower 256 bits of %zmm0 through %zmm31 are aliased to the respective 256-bit AVX registers (%ymm0 through %ymm31).

In addition, Intel AVX-512 provides 8 vector mask registers (%k0 through %k7), each 64 bits wide.

For the purposes of parameter passing and function return, %xmmN, %ymmN, and %zmmN refer to the same register. Only one of them can be used at a time.

The term vector register refers to an SSE, AVX, or AVX-512 register (but not a vector mask register). This document often uses the name SSE to refer collectively to the SSE registers together with either the AVX or AVX-512 options.

This calling standard defines the usage of the OpenVMS x86-64 SSE floating-point registers as listed in Table 5.2.
Table 5.2. SSE (xmm, ymm, and zmm) Register Usage
Register                                    Class        Usage
%xmm0 %ymm0 %zmm0                           Scratch      Pass the 1st argument to procedures. 1st return value register.
%xmm1 %ymm1 %zmm1                           Scratch      Pass the 2nd argument to procedures. 2nd return value register.
%xmm2 %ymm2 %zmm2                           Scratch      Pass the 3rd argument to procedures.
%xmm3 %ymm3 %zmm3                           Scratch      Pass the 4th argument to procedures.
%xmm4 %ymm4 %zmm4                           Scratch      Pass the 5th argument to procedures.
%xmm5 %ymm5 %zmm5                           Scratch      Pass the 6th argument to procedures.
%xmm6 %ymm6 %zmm6                           Scratch      Pass the 7th argument to procedures.
%xmm7 %ymm7 %zmm7                           Scratch      Pass the 8th argument to procedures.
%xmm8-%xmm31, %ymm8-%ymm31, %zmm8-%zmm31    Scratch      Temporary registers.
MXCSR                                       Preserved    The control flags (bits 6-15) are preserved.
                                            Scratch      The other bits are scratch.
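The MXCSR row of Table 5.2 implies a simple check: a callee honors the convention if bits 6 through 15 hold the same values at return as at entry, whatever happens to the remaining bits. The sketch below expresses that as a mask comparison; the names MXCSR_PRESERVED_MASK and mxcsr_controls_preserved are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Bits 6 through 15 of MXCSR, the preserved control flags. */
#define MXCSR_PRESERVED_MASK UINT32_C(0xFFC0)

static_assert(MXCSR_PRESERVED_MASK ==
              ((UINT32_C(1) << 16) - (UINT32_C(1) << 6)),
              "mask covers bits 6 through 15");

/* Returns 1 if the preserved control flags are unchanged between the
   two MXCSR values, and 0 otherwise. */
int mxcsr_controls_preserved(uint32_t at_entry, uint32_t at_return)
{
    return ((at_entry ^ at_return) & MXCSR_PRESERVED_MASK) == 0;
}
```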
This calling standard defines the usage of the OpenVMS x86-64 vector mask register as listed in Table 5.3.
Table 5.3. Vector Mask Register Usage
Register    Class      Usage
%k0-%k7     Scratch    Temporary registers

5.1.4. x86-64 Floating-Point Register Usage (FPU)

OpenVMS x86-64 applications may use the x87 registers though there is little reason to do so. Packed, single- and double-precision floating-point operations are usually performed in the SSE registers, while the 80-bit extended-precision floating-point format is not supported by the OpenVMS compilers or run-times.

This calling standard defines the usage of the OpenVMS x86-64 FPU floating-point registers as listed in Table 5.4.
Table 5.4. x87 Register Usage
Register                Class        Usage
%st0                    Scratch      1st return value register.
%st1                    Scratch      2nd return value register.
%st2-%st7               Scratch      Temporary registers.
%mm0-%mm7               Scratch      The MMX registers. Overlay the x87 floating-point (%st0-%st7) registers.
Control Word            Preserved    Stores the value of the control word.
Status Word             Scratch      Stores the value of the status word.
Tag Word                             Not used by applications.
Operand Pointer
Instruction Pointer

The CPU should be in x87 mode, not MMX mode, on procedure entry and exit.

5.1.5. Floating-Point Status Management on OpenVMS

The floating-point status of a program consists of two parts:
  • The floating-point hardware registers

  • A supplementary software register (a quadword)

The floating-point status is normally managed by three OpenVMS system services:

  • SYS$IEEE_SET_FP_CONTROL

  • SYS$IEEE_SET_PRECISION_MODE

  • SYS$IEEE_SET_ROUNDING_MODE

The supplementary software register is internal to OpenVMS and is not documented for general use. This register holds information that is used by OpenVMS to implement the three system services and handle floating-point exceptions in general. It can only be accessed indirectly using the system services.

The floating-point status consists of two types of information:
  • Floating-point control status bits are bits or flags that control the floating-point arithmetic operations.

  • Floating-point information status bits are bits or flags that record summary information about the execution of previous floating-point arithmetic operations.


Note

The floating-point control status is sometimes informally called the floating-point mode or IEEE mode.

Two floating-point control status settings are of particular interest:
  • Full IEEE-format floating-point control status is the default, unless the status is explicitly set to another value.

  • VAX-format floating-point control status can be set for programs that use VAX-format floating-point processing.

At program startup, the SSE control/status register (MXCSR) is set as shown in Table 5.5.
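Table 5.5. MXCSR Values at Program Startup
Bit | Field | IEEE-format setting | VAX-format setting
0 | Invalid Operation flag | 0 | 0
1 | Denormal flag | 0 | 0
2 | Zero Divide flag | 0 | 0
3 | Overflow flag | 0 | 0
4 | Underflow flag | 0 | 0
5 | Inexact flag | 0 | 0
6 | Denormals are Zeros | 0 | 0
7 | Invalid Operation mask | 1 | 0
8 | Denormal mask | 1 | 1
9 | Zero Divide mask | 1 | 0
10 | Overflow mask | 1 | 0
11 | Underflow mask | 1 | 1
12 | Inexact mask | 1 | 1
14:13 | Rounding Control | 00 (nearest) | 00
15 | Flush to Zero | 0 | 0
31:16 | Reserved | 0 | 0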
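
As a cross-check on Table 5.5, the following sketch assembles the two MXCSR startup values from the bit positions in the table. The macro and function names are illustrative assumptions and are not part of any OpenVMS interface.

```c
#include <stdint.h>

/* Exception mask bits of MXCSR, at the bit positions listed in Table 5.5. */
#define MXCSR_MASK_INVALID   (1u << 7)
#define MXCSR_MASK_DENORMAL  (1u << 8)
#define MXCSR_MASK_ZERODIV   (1u << 9)
#define MXCSR_MASK_OVERFLOW  (1u << 10)
#define MXCSR_MASK_UNDERFLOW (1u << 11)
#define MXCSR_MASK_INEXACT   (1u << 12)

/* IEEE-format startup: every exception is masked; all other fields are 0. */
static uint32_t mxcsr_startup_ieee(void)
{
    return MXCSR_MASK_INVALID | MXCSR_MASK_DENORMAL | MXCSR_MASK_ZERODIV |
           MXCSR_MASK_OVERFLOW | MXCSR_MASK_UNDERFLOW | MXCSR_MASK_INEXACT;
}

/* VAX-format startup: only denormal, underflow, and inexact are masked. */
static uint32_t mxcsr_startup_vax(void)
{
    return MXCSR_MASK_DENORMAL | MXCSR_MASK_UNDERFLOW | MXCSR_MASK_INEXACT;
}
```

Under these assumptions, the IEEE-format value is 0x1F80 and the VAX-format value is 0x1900.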

Note

VAX floating-point data is never loaded or manipulated in the x86-64 floating-point registers. However, VAX floating-point values may be converted to IEEE floating-point values, which are then manipulated in the x86-64 floating-point registers.

At program startup, the x87 control word is set as shown in Table 5.6.
Table 5.6. x87 Control Word Values at Program Startup
Bit | Field | IEEE-format setting | VAX-format setting
0 | Invalid Operation mask | 1 | 0
1 | Denormal mask | 1 | 1
2 | Zero Divide mask | 1 | 0
3 | Overflow mask | 1 | 0
4 | Underflow mask | 1 | 1
5 | Inexact mask | 1 | 1
7:6 | Reserved | 0 | 0
9:8 | Precision Control | 11 | 11
11:10 | Rounding Control | 00 (nearest) | 00
15:13 | Reserved | 0 | 0
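
The same kind of cross-check can be applied to Table 5.6. The sketch below builds the two x87 control word startup values from the table, treating the reserved fields as zero exactly as the table lists them. All names are illustrative assumptions.

```c
#include <stdint.h>

/* x87 control word fields, at the bit positions listed in Table 5.6. */
#define X87_MASK_INVALID    (1u << 0)
#define X87_MASK_DENORMAL   (1u << 1)
#define X87_MASK_ZERODIV    (1u << 2)
#define X87_MASK_OVERFLOW   (1u << 3)
#define X87_MASK_UNDERFLOW  (1u << 4)
#define X87_MASK_INEXACT    (1u << 5)
#define X87_PRECISION_11    (3u << 8)   /* Precision Control = 11 */

/* IEEE-format startup: all exceptions masked, precision control 11. */
static uint16_t x87_cw_startup_ieee(void)
{
    return (uint16_t)(X87_MASK_INVALID | X87_MASK_DENORMAL | X87_MASK_ZERODIV |
                      X87_MASK_OVERFLOW | X87_MASK_UNDERFLOW | X87_MASK_INEXACT |
                      X87_PRECISION_11);
}

/* VAX-format startup: denormal, underflow, inexact masked, precision 11. */
static uint16_t x87_cw_startup_vax(void)
{
    return (uint16_t)(X87_MASK_DENORMAL | X87_MASK_UNDERFLOW | X87_MASK_INEXACT |
                      X87_PRECISION_11);
}
```

With the reserved bits taken as zero, the IEEE-format value is 0x033F and the VAX-format value is 0x0332.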

Using a compiler or linker switch, you can associate a floating-point control status with the main procedure of a program to set the floating-point state prior to the beginning of program execution. If no control status is explicitly set, a default status appropriate for full IEEE computation is used.

5.1.6. x86-64 Segment Register Usage

This calling standard defines the usage of the OpenVMS x86-64 segment registers as listed in Table 5.7.
Table 5.7. x86-64 Segment Register Usage
RegisterClassUsage
%cs %ds %ss %esManaged by OpenVMS and implicitly used by applications
%fsReserved to OpenVMS
%gsReserved to OpenVMS

5.1.7. x86-64 Bound Register Usage

Use of the x86-64 bound registers is deprecated on OpenVMS. The only support provided is to context switch the contents of the bound registers as part of the normal application context; they are otherwise unused and unsupported.

5.1.8. Legacy Pseudo-Registers

The OpenVMS MACRO compiler for x86-64 (XMACRO) generates code that uses a set of pseudo-registers to emulate the Alpha register set. The pseudo-register set consists of 32 64-bit registers (R0—R31). The contents of these pseudo-registers are well defined only at procedure calls and returns; otherwise, XMACRO uses pseudo-registers at its discretion. No special semantics are associated with the pseudo-registers, even for the registers that would otherwise be considered special or part of the Alpha hardware.

The pseudo-registers are invisible to high-level languages, except for BLISS and VSI C. BLISS linkage attributes and VSI C linkage pragmas may be used to access pseudo-registers on calls and returns. See Chapter 3 for more information regarding Alpha register conventions and usage.

Use of such registers for other than legacy applications from other OpenVMS environments is deprecated.

The pseudo-registers are stored as a per-thread vector of quadwords in memory.

alpha_reg_vector_t* LIB$GET_ALPHA_REG_VECTOR ();

Arguments:

None.

Function Value Returned:

ptr

Pointer to the Alpha pseudo-register vector for the current thread.

LIB$GET_ALPHA_REG_VECTOR preserves all registers other than the return value register %rax.

Any procedure that accesses the pseudo-registers must make its own call to LIB$GET_ALPHA_REG_VECTOR to obtain the array address. Passing the array address to another procedure by any means is an error that may result in undefined behavior.

5.2. Address and Pointer Representation

An address is a 64-bit value that is used to denote a position in memory. However, for compatibility with OpenVMS VAX and Alpha, many OpenVMS applications and user-mode facilities operate in such a manner that addresses are restricted to values that are representable in 32 bits. This means that OpenVMS addresses can often be stored and manipulated as 32-bit longword values. In such cases, the 32-bit address value is always implicitly or explicitly sign-extended to form a 64-bit address for use by the x86-64 hardware.

The OpenVMS run-time environment supports a mix of 32- and 64-bit pointers. For backward compatibility, the default pointer size is 32 bits. A 32-bit pointer is converted to a 64-bit pointer by sign-extending its value. A 64-bit pointer can be converted to a valid 32-bit pointer only if the high-order 33 bits are all zero or all one.
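
The two conversion rules can be sketched in C as follows. The function names are hypothetical, and pointers are modeled as plain integers so that the example does not depend on the host pointer size.

```c
#include <stdbool.h>
#include <stdint.h>

/* Widen a 32-bit pointer value to 64 bits by sign extension. */
static uint64_t widen_pointer(uint32_t p32)
{
    if (p32 & 0x80000000u)
        return 0xFFFFFFFF00000000ull | p32;   /* bit 31 set: fill with ones */
    return p32;                               /* bit 31 clear: fill with zeros */
}

/* True if the high-order 33 bits (bits 63:31) are all zero or all one,
 * that is, if the value survives narrowing to 32 bits and re-widening. */
static bool fits_in_32_bits(uint64_t p64)
{
    uint64_t high33 = p64 >> 31;
    return high33 == 0 || high33 == 0x1FFFFFFFFull;
}
```

For example, 0x7FFFFFFF and 0xFFFFFFFF80000000 both pass the check, while 0x0000000080000000 does not, because sign-extending its low 32 bits would produce a different address.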

5.3. Procedure Values

An x86-64 procedure value (a function pointer) is a pointer to code. To call through a procedure value, call through the value itself, not through a location in the memory pointed to by the value.

All procedure values must be representable in 32 bits. Because 32-bit addresses and pointers are always sign-extended before use (see Section 5.2), this means that the code they point to must reside in either the (hexadecimal) range 0..00000000 7FFFFFFF or FFFFFFFF 80000000..FFFFFFFF FFFFFFFF (see the VSI OpenVMS Programming Concepts Manual, Volume I for discussion of the structure of the OpenVMS address space). If the code is not in either of these regions, the linker creates a 32-bit-addressable trampoline for it. The trampoline code simply jumps to the procedure. The address of this trampoline becomes the value for that procedure.

Unbound procedures normally do not require an associated trampoline. They need a trampoline only if code in the same image takes the address of the procedure, or if it is a universal symbol.

Bound procedure values always point to trampolines. These trampolines are created by the containing procedure at the time it is called. When the bound procedure value trampolines pass control to the procedure, they pass an environment pointer (a pointer to the containing procedure stack frame) as an additional hidden parameter to the procedure. (See Section 5.6.5 regarding creation and deletion of bound procedure values).

5.4. Procedure Types

This calling standard defines the following basic types of procedure:
  • Variable-size stack procedure (sometimes known as a normal procedure in industry x86-64 documentation)—allocates a memory stack that is addressable using either %rbp (the frame pointer register) or %rsp (the stack pointer register). The size of the stack may vary during the procedure execution. The called procedure may maintain a part or the whole context of its caller on that stack.

  • Fixed-size stack procedure (sometimes known as a framepointerless procedure in industry x86-64 documentation)—allocates a memory stack that is addressable only using %rsp (the stack pointer register). The size of the stack is fixed during the procedure execution. The called procedure may maintain a part or the whole context of its caller on that stack.

  • Null frame procedure (sometimes known as a frameless procedure in industry x86-64 documentation)—allocates no memory stack (other than the implicit saving of the caller return address that is a part of the CALL instruction). No context of its caller is saved.

All types of procedures allow use of 128 bytes of temporary storage below the address given in the stack pointer. This so-called red zone is not preserved across procedure calls, but is preserved by signal and condition handlers. Outside of the kernel, procedures may use this for temporary storage. Because hardware interrupts do not preserve the red zone, kernel code cannot use it. The use of the red zone can be disabled with a compiler option or pragma.

The red zone is useful in frameless leaf procedures (that call no other procedures). It gives them 128 bytes of scratch storage without the performance overhead of setting up and taking down a stack frame.

A compiler chooses which type of procedure to generate based on the requirements of the procedure in question. A calling procedure does not need to know what type of procedure it is calling.

Every variable-size stack or fixed-size stack procedure must have an associated unwind description (see Appendix B) that provides information on the procedure type and its characteristics. A null frame procedure may also have an associated unwind description. (The default description applies if there is no unwind description). This data structure is used to interpret the call stack at any given point in a thread execution. It is built at compile time and usually is not accessed at run-time except to support exception processing or other rarely executed code.

5.4.1. Variable-Size Stack Procedures

Variable-size stack procedures allocate the stack that grows towards lower addresses. The stack pointer (SP) is contained in the %rsp register. The frame pointer (FP) is contained in the %rbp register. The stack pointer is normally 0mod16 aligned and must be 0mod16 aligned when making a call. Because the return address is pushed on the stack by the caller, the stack pointer is 8mod16 aligned on entry to a procedure. The %rbp register is saved immediately below the return address. The frame pointer points to the saved %rbp.

The resulting stack frame layout is illustrated in Figure 5.1.
Figure 5.1. Stack Frame for Variable-Size Stack Procedures

5.4.2. Fixed-Size Stack Procedures

Fixed-size stack procedures allocate the stack that grows towards lower addresses. The stack pointer (SP) is contained in the %rsp register. No frame pointer (FP) is used, so that the %rbp register is available as an additional preserved register. The stack pointer is normally 0mod16 aligned and must be 0mod16 aligned when making a call. Because the return address is pushed on the stack by the caller, the stack pointer is 8mod16 aligned on entry to a procedure.

The resulting stack frame layout is illustrated in Figure 5.2.
Figure 5.2. Stack Frame for Fixed-Size Stack Procedures

5.4.3. Null Frame Procedures

A null frame procedure is almost a special case of a fixed-size stack procedure. It is like a fixed-size stack procedure that has no local storage other than the return address that is pushed on the stack as a result of the call. Because no additional stack is allocated, it differs from a fixed-size stack procedure in that the stack pointer remains 8mod16 aligned (not 0mod16).

A null frame procedure is necessarily a leaf procedure because the stack pointer must be 0mod16 aligned in order to make a call.
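
The alignment arithmetic behind the three procedure types can be illustrated with a small sketch. It assumes a simplified prologue that allocates only local storage (no additional saved registers), and the function names are hypothetical.

```c
#include <stdint.h>

/* Round n up to the next multiple of 16. */
static uint64_t round_up_16(uint64_t n)
{
    return (n + 15) & ~(uint64_t)15;
}

/* Variable-size stack procedure: %rsp is 8mod16 on entry and 0mod16 after
 * "push %rbp", so the local allocation must be a multiple of 16. */
static uint64_t alloc_with_frame_pointer(uint64_t local_bytes)
{
    return round_up_16(local_bytes);
}

/* Fixed-size stack procedure: %rsp is 8mod16 on entry and nothing else is
 * pushed, so the allocation must be congruent to 8 modulo 16. */
static uint64_t alloc_without_frame_pointer(uint64_t local_bytes)
{
    return round_up_16(local_bytes + 8) - 8;
}
```

For 20 bytes of locals, this gives an allocation of 32 bytes with a frame pointer and 24 bytes without one. In both cases %rsp ends up 0mod16 aligned, as required before making a call.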

The resulting stack frame layout is illustrated in Figure 5.3.
Figure 5.3. Stack Frame for Null Frame Procedures

5.5. Stack Overflow Detection on OpenVMS x86-64

This section defines the conventions to support the execution of multiple threads in a multilanguage OpenVMS environment. Specifically defined is how compiled code must perform stack limit checking. While this standard is compatible with a multithreaded execution environment, the detailed mechanisms, data structures, and procedures that support this capability are not specified in this manual.

For a multithreaded environment, the following characteristics are assumed:
  • There can be one or more threads executing within a single process.

  • The state of a thread is represented in a thread environment block (TEB).

  • The TEB of a thread contains information that determines a stack limit below which the stack pointer must not be decremented by the executing code (except for code that implements the multithreaded mechanism itself).

  • Exception handling is fully reentrant and multithreaded.

5.5.1. Stack Limit Checking

A program that is otherwise correct can fail because of stack overflow. Stack overflow occurs when extension of the stack (by decrementing the stack pointer, SP) allocates addresses not currently reserved for the current thread's stack. This section defines the conventions for stack limit checking in a multithreaded environment.

In the following sections, the term new stack region refers to the region of the stack from one less than the old value of SP to the new value of SP.

Stack Guard Region

In a multithreaded environment, the address space beyond each thread's stack is protected by contiguous guard pages, which trap on any access. These pages form the stack guard region.

Stack Reserve Region

In some cases, it is useful to maintain a stack reserve region, which is a minimum-sized region that is between the current top of stack and the stack guard region. A stack reserve region can ensure that the following conditions exist:
  • Exceptions or asynchronous system traps (ASTs, analogous to asynchronous signals) have stack space to execute on a thread's stack.

  • The exception dispatcher and any exception handler that it might call have stack space to execute after detection of an invalid attempt to extend the stack.

This calling standard does not require a stack reserve region, but it does allow a language and its run-time system to implement one.

5.5.1.1. Methods for Stack Limit Checking

Because accessible memory may be available at addresses lower than those occupied by the stack guard region, compilers must generate code that never extends the stack past the stack guard region into accessible memory that is not allocated to the thread's stack.

A general strategy to prevent extending the stack past the stack guard region is to access each page of memory down to and possibly including the page corresponding to the intended new value of %rsp. If the stack is to be extended by an amount larger than the size of a memory page, then a series of accesses is required that works from higher to lower addressed pages. If any access results in a memory access violation, then the code has made an invalid attempt to extend the stack of the current thread.

For the purposes of this section, the amount by which the stack is to be extended must include the size of the red zone in addition to the size of the needed stack extension for the executing procedure.

This calling standard defines two methods for stack limit checking, implicit and explicit, which are explained in the following sections.

Implicit Stack Limit Checking

If a byte (not necessarily the lowest) of the new stack region is guaranteed to be accessed prior to any further stack extension, then the stack can be extended by an increment that is up to one-half the stack guard region (without any additional accesses).

This standard requires that the minimum stack guard region size is 8192 bytes.

If the stack is being extended by 4096 bytes or less and the application does not use a stack reserve region, then explicit checking is not required. However, because asynchronous interrupts and calls to other procedures may also cause stack extension without explicit checking, stack extension with implicit checking must adhere to the following rules:
  • Explicit stack limit checking must be performed unless the amount by which %rsp is decremented is known to be less than or equal to 4096 and the application does not use a stack reserve region.

  • Some byte in the new stack region must be accessed before %rsp can be further decremented for a subsequent stack extension.

  • This access can be performed either before or after %rsp is decremented for this stack extension, but it must be done before %rsp can be decremented again.

  • No standard procedure call can be made before some byte in the new stack region is accessed.

  • The system exception dispatcher ensures that the lowest addressed byte in the new stack region is accessed if any kind of asynchronous interrupt occurs both after %rsp is decremented and before the access in the new stack region occurs.

These conventions ensure that the stack pointer is not decremented so that it points to accessible storage beyond the stack limit without this error being detected (either by the guard region being accessed by the thread or by an explicit stack limit check failure).
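
The first rule in the list can be read as a simple predicate. The sketch below is one possible reading, not a definitive implementation: it assumes that the 128-byte red zone is added to the requested extension before comparing against the 4096-byte limit, following the earlier statement that the extension amount must include the red zone. All names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

#define RED_ZONE_BYTES          128u
#define MAX_IMPLICIT_CHECK_SIZE 4096u

/* Returns true if a prologue must use explicit stack limit checking.
 * size_known:          extension amount is a compile-time constant
 * extension_bytes:     stack extension needed by the procedure itself
 * uses_reserve_region: application maintains a stack reserve region */
static bool explicit_check_required(bool size_known,
                                    uint64_t extension_bytes,
                                    bool uses_reserve_region)
{
    if (!size_known || uses_reserve_region)
        return true;
    /* Assumption: the red zone counts toward the extension amount. */
    return extension_bytes + RED_ZONE_BYTES > MAX_IMPLICIT_CHECK_SIZE;
}
```

Under this reading, a known extension of 3968 bytes can still rely on implicit checking, while a known extension of 4096 bytes cannot.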

As a matter of practice, the system can provide multiple guard pages in the stack guard region. When a stack overflow is detected as a result of access to the stack guard region, one or more guard pages can be unprotected for use by the exception handling facility, as long as one or more guard pages remain protected to provide implicit stack limit checking during exception processing.

Explicit Stack Limit Checking

If the stack is being extended by an unknown amount or by a known amount that is greater than the maximum implicit check size (4096), then a code sequence that follows the rules for implicit stack limit checking can be executed in a loop to access the new stack region incrementally in segments that are less than or equal to the minimum stack guard region size (8192). At least one access must occur in each such segment.

The first access must occur between %rsp and %rsp-4096, because in the absence of more specific information, the previous guaranteed access relative to the current stack may be as much as 4096 bytes greater than the current stack pointer address.

The last access must be within 4096 of the intended new value of the stack pointer. These accesses must occur in order, starting with the highest addressed segment and working toward the lowest addressed segment.

A more optimal strategy is:
  1. Perform a read access using the intended new value of the stack pointer. This is nondestructive, even if the read is beyond the stack guard region, and may facilitate OS mapping of new stack pages, if appropriate, in a single operation.

  2. Proceed with sequential accesses as just described.


Note

A simple algorithm that is consistent with this requirement (but achieves up to twice the minimum number of accesses) is to perform a sequence of accesses in a loop starting with the previous value of %rsp, decrementing by the minimum no-check extension size (4096) to, but not including, the first value that is less than the new value for the stack pointer.

The stack must not be extended incrementally in procedure prologues. A procedure prologue that needs to extend the stack by an amount of unknown size or known size greater than the minimum implicit check size must test new stack segments as just described in a loop that does not modify %rsp, and then update the stack with one instruction that copies the new stack pointer value into %rsp.
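
The simple algorithm from the note above, applied in a way that leaves the stack pointer untouched until all probes are done, can be modeled as follows. This is a host-side model that only computes the addresses a prologue would touch; it does not access the stack itself. The function name and the probes array are illustrative assumptions.

```c
#include <stddef.h>
#include <stdint.h>

#define PROBE_STEP 4096u   /* minimum no-check extension size */

/* Compute the addresses to access, from old_sp downward in steps of
 * PROBE_STEP, stopping before the first value below new_sp.
 * Returns the number of probe addresses written to probes[]. */
static size_t plan_probes(uint64_t old_sp, uint64_t new_sp,
                          uint64_t *probes, size_t max_probes)
{
    size_t n = 0;
    for (uint64_t addr = old_sp; addr >= new_sp && n < max_probes;
         addr -= PROBE_STEP) {
        probes[n++] = addr;
        if (addr < PROBE_STEP)
            break;             /* avoid unsigned wraparound near zero */
    }
    return n;
}
```

Extending the stack by 10000 bytes from 0x20000 yields probes at 0x20000, 0x1F000, and 0x1E000. The last probe lies within 4096 bytes of the new stack pointer value, after which a single instruction would copy that value into %rsp.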

Note

An explicit stack limit check can be performed either by inline code that is part of a prologue or by a run-time support routine that is tailored to be called from a procedure prologue.

5.6. Procedure Call and Return

Calls may be direct, which are performed directly to the entry point of a target procedure, or indirect, which are performed through a procedure value. The target of a call may be either an unbound or a bound procedure. Returns are the same for all types of calls.

From the perspective of a compiler or assembly language programmer, all calls are local, that is, the call target is always assumed to be in the same segment as the caller. In case a call resolves to a procedure in a different segment or image, the linker creates a local code stub that forwards that call to the target.

5.6.1. Direct Local Calls to an Unbound Procedure

Within a single segment, direct local calls to an unbound procedure can be performed with a simple CALL instruction using a 32-bit PC-relative displacement. This is sufficient in the small and medium memory models (see Section 5.10.1).

If the code in a single segment grows beyond 2GB, the segment can be broken up into multiple segments.

5.6.2. Direct Local Calls to a Bound Procedure

Direct local calls to a bound procedure can come only from within the containing scope, so this type of call can be performed with the CALL instruction using a 32-bit PC-relative displacement. The only difference between direct local calls to a bound procedure and direct local calls to an unbound procedure is that a bound procedure requires an additional implicit parameter, the procedure’s environment pointer, to be passed in %r10.

5.6.3. Direct Local Calls to a Non-Local Procedure

Calls between images, or between segments in a single image, are performed via an entry in the Global Offset Table (GOT) that points to the target procedure. In most cases, compilers do not know whether a call target is local or external to the image or segment, and so generate a local call. The linker creates a trampoline and redirects this local call to it. The trampoline forwards the call to the target procedure via an indirect jump through the GOT entry. In cases where a compiler knows that a call target is external, it can generate an indirect call via a GOT entry itself.

5.6.4. Indirect Calls to an Unbound Procedure

Indirect calls to an unbound procedure transfer control to the address that is specified by a procedure value.

5.6.5. Indirect Calls to a Bound Procedure

There is no distinction between the unbound and bound procedure values, so the caller does not know whether the called procedure is bound or not. Therefore, the called side must make special arrangements to pass the environment pointer to the called procedure.

When code takes the address of a bound procedure, the value is not the address of the procedure itself, but a trampoline. This trampoline loads the environment pointer into %r10 and then jumps to the actual procedure.

The trampoline is created when the value of the environment pointer becomes known during run-time. Since a bound procedure value is specific to a particular activation of the containing scope, multiple recursive invocations create multiple trampolines. This means that the storage for the bound procedure trampolines must be dynamically allocated either on the stack or from the heap.

Allocating bound procedure trampolines on the stack is common industry practice on x86-64, but it is deprecated on OpenVMS because the stack is non-executable by default. To use this method on OpenVMS, an application must explicitly make stack memory executable, either with a flag in the object file (the .note.GNU-stack option) or with a run-time call.

The preferred method of creating and allocating bound procedure trampolines on OpenVMS is to call a run-time routine. This routine dynamically allocates and manages a linked list of executable memory pages where the trampolines reside. A second routine must be called to deallocate a bound procedure trampoline. This should be done when the containing procedure exits.

A procedure may create a bound procedure value using LIB$X86_ALLOC_BOUND_PROC_VALUE as follows:
void* LIB$X86_ALLOC_BOUND_PROC_VALUE (size)

Argument | OpenVMS Usage | Type | Access | Mechanism
size | integer | quadword | read | by value

Argument:

size

Number of bytes needed to hold a bound procedure value.

Function Value Returned:

Pointer to a block of memory of the given size

The returned memory must be initialized by the caller to complete the creation of the bound procedure value. Typically the contents will consist of an instruction to copy the appropriate invocation context (which might be saved in the same block) into %r10 followed by an instruction to transfer control to the entry point of the target procedure.
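
As an illustration of what such contents might look like, the sketch below encodes one possible 24-byte trampoline into a plain byte buffer: a move of a 64-bit environment pointer into %r10, followed by an indirect jump through a quadword stored immediately after the jump instruction. This particular instruction sequence is an assumption chosen for the example and is not necessarily what OpenVMS compilers emit. On OpenVMS, the block would come from LIB$X86_ALLOC_BOUND_PROC_VALUE; here it is an ordinary array that is never executed.

```c
#include <stddef.h>
#include <stdint.h>

#define BPV_TRAMPOLINE_BYTES 24u

/* Store a 64-bit value in little-endian byte order. */
static void put_le64(uint8_t *dst, uint64_t v)
{
    for (int i = 0; i < 8; i++)
        dst[i] = (uint8_t)(v >> (8 * i));
}

/* Encode:  movabs $env, %r10
 *          jmp    *0(%rip)
 *          .quad  target
 * Returns the number of bytes written (BPV_TRAMPOLINE_BYTES). */
static size_t encode_bpv_trampoline(uint8_t *block, uint64_t env,
                                    uint64_t target)
{
    size_t n = 0;
    block[n++] = 0x49;                 /* REX.W + REX.B */
    block[n++] = 0xBA;                 /* mov imm64 into %r10 */
    put_le64(block + n, env);
    n += 8;
    block[n++] = 0xFF;                 /* jmp r/m64 */
    block[n++] = 0x25;                 /* ModRM: RIP-relative operand */
    for (int i = 0; i < 4; i++)
        block[n++] = 0x00;             /* disp32 = 0: operand follows */
    put_le64(block + n, target);
    n += 8;
    return n;
}
```

Keeping the environment pointer and the target address inside the same block matches the remark that the invocation context might be saved in the block itself.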

Storage for bound procedure values is local to the thread in which they are created.

Bound procedure values logically form a stack on which any newly allocated value is added and one or more of the most recently added entries may be deleted (as a group).

When returning from a procedure in which a bound procedure was created, a procedure should call LIB$X86_FREE_BOUND_PROC_VALUE as follows:
LIB$X86_DELETE_BOUND_PROC_VALUE (bpv)

Argument | OpenVMS Usage | Type | Access | Mechanism
bpv | address | quadword | read | by value

Argument:

bpv

Pointer to a bound procedure value (created by LIB$X86_ALLOC_BOUND_PROC_VALUE).

Function Value Returned:

None.

The effect of calling LIB$X86_FREE_BOUND_PROC_VALUES is to delete an existing bound procedure value, as well as any additional bound procedure values that were created subsequent to it.
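
The stack-like behavior described above can be modeled with a toy data structure. The sketch below is only a model of the documented semantics (allocation adds to the top; freeing removes an entry together with everything allocated after it). It does not reflect how the run-time library is implemented, and every name in it is hypothetical.

```c
#include <stddef.h>

#define MODEL_MAX_BPV 16

/* Toy per-thread model: only the number of live entries matters. */
typedef struct {
    size_t count;
} bpv_model;

/* Allocate a new entry on top of the stack.
 * Returns its index, or -1 if the model is full. */
static int model_alloc(bpv_model *m)
{
    if (m->count >= MODEL_MAX_BPV)
        return -1;
    return (int)m->count++;
}

/* Free the entry at index and every entry allocated after it. */
static void model_free(bpv_model *m, int index)
{
    if (index >= 0 && (size_t)index < m->count)
        m->count = (size_t)index;
}
```

In this model, allocating three entries and then freeing the second one leaves only the first entry live, which mirrors a containing procedure releasing its own bound procedure value along with any created later.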

5.6.6. Returns

All calls push a 64-bit return address on the stack. When the called procedure returns, it uses the RET instruction to pop the return address from the stack and jump to that address.

5.7. Parameter and Return Value Passing

On OpenVMS x86-64, procedure parameters are passed in registers and/or on the stack. Procedures can return results in registers or in a memory location designated by the caller.

All calls use %rax as an argument information register as described in Section 5.7.4.

5.7.1. Scalar Argument Types

The following memory locations are used for passing scalar argument types to procedures:
  • the six general-purpose registers (%rdi, %rsi, %rdx, %rcx, %r8, and %r9)

  • the eight XMM registers (%xmm0—%xmm7)

  • the stack.


Table 5.8. Memory Locations Used for Passing Scalar Argument Types and Return Values
Nominal Type [OpenVMS Type Code] (prefix DSC$K_DTYPE_) | Argument Location | Return Value Location
Pointer [Q] | The next available general-purpose register. Otherwise, in the next argument slot on the stack. | General-purpose register %rax
Boolean [B, BU] | The next available general-purpose register. Otherwise, in the next argument slot on the stack. | General-purpose register %rax
Integers (size ≤ 64 bits) [B, W, L, Q, BU, WU, LU, QU] | The next available general-purpose register. Otherwise, in the next argument slot on the stack. | General-purpose register %rax
Integers (64 < size ≤ 128 bits) [O, OU] | The next two available general-purpose registers. Otherwise, in the next two argument slots on the stack. | General-purpose registers %rax (low half) and %rdx (high half)
VAX float (F_floating, D_floating, and G_floating) [F, D, G] | The next available general-purpose register. Otherwise, in the next argument slot on the stack. | General-purpose register %rax
IEEE single-precision float (S_floating) [FS] | Bits 31:0 of the next available XMM register. Otherwise, in the next argument slot on the stack. | Bits 31:0 of register %xmm0
IEEE double-precision float (T_floating) [FT] | Bits 63:0 of the next available XMM register. Otherwise, in the next argument slot on the stack. | Bits 63:0 of register %xmm0
IEEE quadruple-precision float (X_floating) [FX] | The next available XMM register. Otherwise, in the next two argument slots on the stack. | Register %xmm0
VAX complex single-precision float (F_floating) [FC] | The next available general-purpose register. Otherwise, in the next argument slot on the stack. | General-purpose register %rax
VAX complex double-precision float (D_floating and G_floating) [DC, GC] | The next two available general-purpose registers. Otherwise, in the next two argument slots on the stack. | Registers %rax (the real part of a value) and %rdx (the imaginary part of a value)
IEEE complex single-precision float [FSC] | In the next available XMM register, real part in bits 31:0, imaginary part in bits 63:32. Otherwise, in the next argument slot on the stack. | Register %xmm0, the real part of a value in bits 31:0, the imaginary part in bits 63:32
IEEE complex double-precision float [FTC] | In bits 63:0 of the next two available XMM registers. Otherwise, the next two argument slots on the stack. | Bits 63:0 of registers %xmm0 (the real part of a value) and %xmm1 (the imaginary part of a value)
IEEE complex quadruple-precision float [FXC] | In the next four available argument slots on the stack. | In a caller-allocated memory buffer whose address is passed as a hidden first argument

An argument that requires two registers is never split so that the first part is in a register and the second part is on the stack. Either both parts are in registers or both parts are on the stack.

For example, a procedure that takes ten integer scalar arguments will find the first six arguments in the general-purpose registers, and the last four on the stack. A procedure that takes ten IEEE double-precision floating-point scalars as arguments will find the first eight arguments in the XMM registers, and the last two on the stack. Finally, a procedure that takes six integer arguments and eight floating-point arguments, regardless of how the integer and floating-point arguments are intermixed, will find all 14 arguments in registers.
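
These three examples can be reproduced with a small model that counts where one-slot scalar arguments land. The sketch handles only arguments that occupy a single general-purpose or XMM register, and all names are hypothetical.

```c
#include <stddef.h>

/* Kinds of one-slot scalar arguments handled by this model. */
typedef enum { ARG_INTEGER, ARG_IEEE_FLOAT } arg_kind;

typedef struct {
    int in_gpr;     /* arguments placed in %rdi, %rsi, %rdx, %rcx, %r8, %r9 */
    int in_xmm;     /* arguments placed in %xmm0 through %xmm7 */
    int on_stack;   /* arguments placed in stack argument slots */
} arg_counts;

/* Walk the argument list left to right and count the placements. */
static arg_counts place_scalars(const arg_kind *args, size_t n)
{
    arg_counts c = { 0, 0, 0 };
    for (size_t i = 0; i < n; i++) {
        if (args[i] == ARG_INTEGER) {
            if (c.in_gpr < 6) c.in_gpr++; else c.on_stack++;
        } else {
            if (c.in_xmm < 8) c.in_xmm++; else c.on_stack++;
        }
    }
    return c;
}
```

Because the two register sequences are consumed independently, the order in which integer and floating-point arguments are intermixed does not change the counts.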

5.7.2. Aggregate Argument Types

This section describes how the aggregate argument types are passed to procedures.

First, each argument type is assigned to the appropriate class; registers are then allocated for passing the arguments.

The following classes are defined:
  • INTEGER class consists of integral types that fit in one of the general-purpose registers including pointers.

  • SSE class consists of types that fit in a floating-point register.

  • SSEUP class consists of types that fit into a floating-point register and can be passed and returned in the upper bytes of it.

  • X87, X87UP, COMPLEX_X87 classes consist of types that can be returned via the x87 FPU.

  • NO_CLASS is used as initializer in the algorithms. It is used for padding as well as empty structures and unions.

  • MEMORY class consists of types that are passed and returned in memory via the stack.

The size of each argument is rounded up to a multiple of a quadword (8 bytes). Therefore, the stack is always 8-byte aligned.

For purposes of the aggregate argument classification algorithm that follows below, the scalar components of an aggregate are classified as shown in Table 5.9.
Table 5.9. Classification of Scalar Components of Aggregate Types
Nominal Type [OpenVMS Type Code] (prefix DSC$K_DTYPE_) | Equivalent C/C++ Type(s) | Argument Passing Class
Pointer [Q] | * | INTEGER
Boolean [B, BU] | _Bool (bool) | INTEGER
Integers (size ≤ 64 bits) [B, W, L, Q, BU, WU, LU, QU] | char, short, int, long (signed and unsigned) | INTEGER
Integers (64 < size ≤ 128 bits) [O, OU] | __int128 (signed and unsigned) | Split into two 8-byte chunks. Both belong to class INTEGER.
VAX floating-point types (up to 64 bits) [F, D, G] |  | INTEGER
VAX floating-point complex (64 bits) [FC] |  | INTEGER
VAX floating-point complex (128 bits) [DC, GC] |  | Split into two 8-byte chunks. Both belong to class INTEGER.
IEEE binary floating-point types (up to 64 bits) [FS, FT] | float, double | SSE
IEEE extended binary floating-point type (128 bits) [FX] | __float128 | Split into two halves. The first (lower addressed) 64-bits belong to class SSE and the second half to class SSEUP.
IEEE binary floating-point complex (64 bits) [FSC] | complex float | Treat as two successive binary floating-point values, each treated as a scalar of half the size (see above).
IEEE binary floating-point complex (128 bits) [FTC] | complex double | Treat as two successive binary floating-point values, each treated as a scalar of half the size (see above).
IEEE binary floating-point complex (256 bits) [FXC] | complex long double | Treat as two successive binary floating-point values, each treated as a scalar of half the size (see above).

Aggregate (structures, records and arrays) and union types are classified as follows:
  1. If the size of an object is larger than eight quadwords (64 bytes), or it contains unaligned fields, it belongs to the MEMORY class.

  2. If a C++ object is non-trivial for the purpose of calls, as specified in the C++ ABI, it is passed by invisible reference; that is, the object is replaced in the parameter list by a pointer that has the INTEGER class.

  3. If the size of the aggregate exceeds a single quadword, each quadword is classified separately. Each quadword is initialized to the NO_CLASS class.

  4. Each field of an object is classified recursively so that always two fields are considered. The two fields are the containing quadword as a whole and the lowest level field components of the quadword, considered in order:
    1. If both classes are equal, this is the resulting class.

    2. If one of the classes is NO_CLASS, the resulting class is the other class.

    3. If one of the classes is MEMORY, the result is the MEMORY class.

    4. If one of the classes is INTEGER, the result is the INTEGER class.

    5. If one of the classes is X87, X87UP, or COMPLEX_X87, the result is the MEMORY class.

    6. Otherwise the result is the SSE class.

  5. Then a post merger cleanup is done:
    1. If one of the classes is MEMORY, the whole argument is passed in memory.

    2. If X87UP is not preceded by X87, the whole argument is passed in memory.

    3. If the size of the aggregate exceeds two quadwords and the first quadword is not SSE or any other quadword is not SSEUP, the whole argument is passed in memory.

    4. If SSEUP is not preceded by SSE or SSEUP, it is converted to SSE.
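
The pairwise merge rules in step 4 translate almost directly into code. The following sketch implements only that merge step and its application to the fields of a single quadword; the post-merger cleanup is not modeled. The enumeration and function names are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

typedef enum {
    CLS_NO_CLASS, CLS_INTEGER, CLS_SSE, CLS_SSEUP,
    CLS_X87, CLS_X87UP, CLS_COMPLEX_X87, CLS_MEMORY
} arg_class;

static bool is_x87_family(arg_class c)
{
    return c == CLS_X87 || c == CLS_X87UP || c == CLS_COMPLEX_X87;
}

/* Merge two classes, applying the rules of step 4 in order. */
static arg_class merge_classes(arg_class a, arg_class b)
{
    if (a == b) return a;                                        /* rule 1 */
    if (a == CLS_NO_CLASS) return b;                             /* rule 2 */
    if (b == CLS_NO_CLASS) return a;
    if (a == CLS_MEMORY || b == CLS_MEMORY) return CLS_MEMORY;   /* rule 3 */
    if (a == CLS_INTEGER || b == CLS_INTEGER) return CLS_INTEGER;/* rule 4 */
    if (is_x87_family(a) || is_x87_family(b)) return CLS_MEMORY; /* rule 5 */
    return CLS_SSE;                                              /* rule 6 */
}

/* Classify one quadword from the classes of the fields it contains,
 * starting from NO_CLASS as described in step 3. */
static arg_class classify_quadword(const arg_class *fields, size_t n)
{
    arg_class result = CLS_NO_CLASS;
    for (size_t i = 0; i < n; i++)
        result = merge_classes(result, fields[i]);
    return result;
}
```

For instance, a quadword that holds an int followed by a float merges INTEGER with SSE and is classified INTEGER, while a quadword that holds two float fields stays SSE.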

Once arguments are classified, the registers are assigned (in left-to-right order) for passing as follows:
  1. If the class is MEMORY, the argument is passed on the stack.

  2. If the class is INTEGER, the next available register of the sequence %rdi, %rsi, %rdx, %rcx, %r8, and %r9 is used.

  3. If the class is SSE, the argument is passed in the next available floating-point register. The registers are taken in order from %xmm0 to %xmm7.

  4. If the class is SSEUP, the quadword is passed in the next available 8-byte chunk of the last used floating-point register.

  5. If the class is X87, X87UP, or COMPLEX_X87, the argument is passed in memory.

When a value of a boolean type is returned or passed in a register or on the stack, bit 0 contains the truth value, bits 1 to 7 must be zero, and all other bits are left unspecified. A consumer of such values can rely on it being 0 or 1 only when truncated to the low byte.
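A consumer that follows this rule looks only at the low byte. The helper below is a small sketch; the name `bool_from_register` is hypothetical, and the 64-bit integer merely stands in for a register or stack slot image.

```c
#include <assert.h>
#include <stdint.h>

/* Read a boolean from a 64-bit register or stack-slot image.
 * Only the low byte is defined; bits 8..63 may hold anything. */
int bool_from_register(uint64_t reg)
{
    uint8_t low = (uint8_t)reg;   /* discard the unspecified upper bits */
    assert(low <= 1);             /* the producer must keep bits 1..7 zero */
    return low;
}
```

Testing the full 64-bit value against zero would be incorrect here, because the upper bits are not guaranteed to be clear.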

If there are no registers available for any quadword of an argument, the whole argument is passed on the stack. If registers have already been assigned for some quadwords of such an argument, the assignments are reverted.
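The all-or-nothing behavior described above can be sketched as a small allocator. This is an illustration under simplifying assumptions: the types `qw_class` and `reg_state` and the function `assign_argument` are hypothetical, only INTEGER, SSE, and MEMORY quadwords are modeled, and SSEUP and the X87 classes are left out.

```c
#include <assert.h>

typedef enum { CLS_INTEGER, CLS_SSE, CLS_MEMORY } qw_class;

typedef struct {
    int gpr_used;   /* how many of %rdi, %rsi, %rdx, %rcx, %r8, %r9 are taken */
    int xmm_used;   /* how many of %xmm0..%xmm7 are taken */
} reg_state;

enum { MAX_GPR = 6, MAX_XMM = 8 };

/* Try to place one argument made of n classified quadwords.
 * Returns 1 if it is passed in registers, 0 if the whole argument
 * is passed on the stack. */
int assign_argument(reg_state *st, const qw_class *qw, int n)
{
    reg_state trial = *st;   /* work on a copy so that failure changes nothing */
    assert(n > 0);
    for (int i = 0; i < n; i++) {
        if (qw[i] == CLS_MEMORY)
            return 0;
        if (qw[i] == CLS_INTEGER) {
            if (trial.gpr_used == MAX_GPR) return 0;
            trial.gpr_used++;
        } else {
            if (trial.xmm_used == MAX_XMM) return 0;
            trial.xmm_used++;
        }
    }
    *st = trial;             /* commit only when every quadword found a register */
    return 1;
}
```

Working on a copy of the register state is one simple way to obtain the "assignments are reverted" behavior: a partially assigned argument never leaves a register marked as used.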

Once registers are assigned, the arguments passed in memory are pushed on the stack in reversed (right-to-left) order.

Certain arrays of IEEE floating-point components are given special-case treatment to take advantage of SSE/AVX floating-point features. These arrays must have both a size and an alignment of 64, 128, 256, or 512 bits. Multiples of these sizes are also allowed. These are shown in Table 5.10.
Table 5.10. Classification of Special Floating-Point Array Components of Aggregate Types

| Nominal Type [OpenVMS Type Code] (prefix DSC$K_DTYPE_) | Equivalent C/C++ Type(s) | Argument Passing Class |
|---|---|---|
| IEEE binary floating-point vector (up to 64 bits) [M64] | __m64 | SSE |
| IEEE extended binary floating-point vector (128 bits) [M128] | __m128 | Split into two halves. The first (lower addressed) 64 bits belong to class SSE and the second half to class SSEUP. |
| IEEE binary floating-point vector (256 bits) [M256] | __m256 | Split into four 8-byte chunks. The first chunk belongs to class SSE and the rest to class SSEUP. |
| IEEE binary floating-point vector (512 bits) [M512] | __m512 | Split into eight 8-byte chunks. The first chunk belongs to class SSE and the rest to class SSEUP. |

When passing the __m256 or __m512 arguments to functions that use varargs or stdarg, function prototypes must be provided. Otherwise, the run-time behavior is undefined.

5.7.3. Unused Bits in Passed Data

Whenever data is passed by value between two procedures in registers or in memory, the bits not used by the data elements are sign-extended or zero-extended as appropriate to the type. Unsigned integral (except unsigned 32-bit), set, and VAX floating-point values passed in general-purpose registers are zero-extended, while signed integral values as well as unsigned 32-bit integral values are sign-extended to 64 bits. For all other types passed in the general-purpose registers, unused bits are undefined.

Note

Bit 31 is replicated in bits 32—63, even for unsigned 32-bit integers.

This rule applies to the argument types described in Section 5.7.1 as well as the individual elements of aggregate types passed in general-purpose registers as described in Section 5.7.2.
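The extension rules can be illustrated with a small helper that widens a 1-, 2-, or 4-byte value to a 64-bit register image. This is a sketch only; `ext_kind` and `extend_to_64` are hypothetical names, and the choice of extension kind for each data type is taken from the prose above.

```c
#include <assert.h>
#include <stdint.h>

typedef enum { EXT_ZERO64, EXT_SIGN64 } ext_kind;

/* Widen the low `size` bytes of `raw` to a 64-bit register image. */
uint64_t extend_to_64(uint64_t raw, unsigned size, ext_kind kind)
{
    assert(size == 1 || size == 2 || size == 4);
    unsigned bits = size * 8;
    uint64_t mask = (UINT64_C(1) << bits) - 1;
    uint64_t value = raw & mask;
    if (kind == EXT_SIGN64 && (value >> (bits - 1)) != 0)
        value |= ~mask;      /* replicate the top data bit into the upper bits */
    return value;
}
```

An unsigned 32-bit value such as 0x80000000 is extended with `EXT_SIGN64`, so the register holds 0xFFFFFFFF80000000; an unsigned 16-bit value uses `EXT_ZERO64` and keeps its upper bits clear.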

The rules contained in this section are summarized in Tables 5.11 and 5.12.
Table 5.11. Unused Bits in Passed Data

| Data Type (OpenVMS Names) | Type Designator | Data Size (bytes) | Register Extension Type | Memory Extension Type |
|---|---|---|---|---|
| Byte logical | DSC$K_DTYPE_BU | 1 | Zero64 | Zero64 |
| Word logical | DSC$K_DTYPE_WU | 2 | Zero64 | Zero64 |
| Longword logical | DSC$K_DTYPE_LU | 4 | Sign64 | Sign64 |
| Quadword logical | DSC$K_DTYPE_QU | 8 | Data64 | Data64 |
| Byte integer | DSC$K_DTYPE_B | 1 | Sign64 | Sign64 |
| Word integer | DSC$K_DTYPE_W | 2 | Sign64 | Sign64 |
| Longword integer | DSC$K_DTYPE_L | 4 | Sign64 | Sign64 |
| Quadword integer | DSC$K_DTYPE_Q | 8 | Data64 | Data64 |
| F_floating | DSC$K_DTYPE_F | 4 | VAXF64 | Data32 |
| D_floating | DSC$K_DTYPE_D | 8 | VAXDG64 | Data64 |
| G_floating | DSC$K_DTYPE_G | 8 | VAXDG64 | Data64 |
| F_floating complex | DSC$K_DTYPE_FC | 2 * 4 | 2*VAXF64 | 2*Data32 |
| D_floating complex | DSC$K_DTYPE_DC | 2 * 8 | 2*VAXDG64 | 2*Data64 |
| G_floating complex | DSC$K_DTYPE_GC | 2 * 8 | 2*VAXDG64 | 2*Data64 |
| S_floating | DSC$K_DTYPE_FS | 4 | Hard | Data32 |
| T_floating | DSC$K_DTYPE_FT | 8 | Hard | Data64 |
| X_floating | DSC$K_DTYPE_FX | 16 | N/A | N/A |
| S_floating complex | DSC$K_DTYPE_FSC | 2 * 4 | Hard | 2*Data32 |
| T_floating complex | DSC$K_DTYPE_FTC | 2 * 8 | 2*Hard | 2*Data64 |
| X_floating complex | DSC$K_DTYPE_FXC | 2 * 16 | N/A | N/A |
| Small structures of 8 bytes or less | N/A | ≤8 | Nostd | Nostd |
| Small arrays of 8 bytes or less | N/A | ≤8 | Nostd | Nostd |
| 32-bit address | N/A | 4 | Sign64 | Sign64 |
| 64-bit address | N/A | 8 | Data64 | Data64 |

Table 5.12 contains the defined meanings for the extension type symbols used in Table 5.11.
Table 5.12. Extension Type Codes

| Sign Extension Type | Defined Function |
|---|---|
| Sign64 | Sign-extended to 64 bits. |
| Zero64 | Zero-extended to 64 bits. |
| Data32 | Data is 32 bits. The state of bits <63:32> is unpredictable. |
| 2*Data32 | Two single-precision parts of the complex value are stored in memory as independent floating-point values (each handled as Data32). |
| Data64 | Data is 64 bits. |
| 2*Data64 | Two double-precision parts of the complex value are stored in memory as independent floating-point values (each handled as Data64). |
| VAXF64 | Data is 64 bits. Low-order 32 bits are the same as the F_floating memory format and the high-order 32 bits are zero. (Used only in a general register, never in a floating-point register). |
| VAXDG64 | Data is 64 bits. Uses the corresponding D_floating or G_floating memory format. (Used only in a general register, never in a floating-point register). |
| 2*VAXF64 | Two single-precision parts of the complex value are stored in memory as independent floating-point values (each handled as VAXF64). |
| 2*VAXDG64 | Two double-precision parts of the complex value are stored in memory as independent floating-point values (each handled as VAXDG64). |
| Hard | Passed in the layout defined by the hardware SRM. |
| 2*Hard | Two floating-point parts of the complex value are stored in a pair of registers as independent floating-point values (each handled as Hard). |
| Nostd | State of all high-order bits not occupied by the data is unpredictable across a call or return. |

5.7.4. Argument Information Register (AI)

On all standard calls, the caller must pass information about the number, the location, and (to a limited extent) the type of all arguments. The called procedure can use this information in various argument count and argument list built-ins. To support this, %rax is used as the AI register. It must contain the argument information that is presented in Table 5.13.
Table 5.13. Contents of the Argument Information Register (%rax)
| Bit | Contents |
|---|---|
| 7:0 (%al) | Upper bound on the number of XMM registers that are used to pass arguments |
| 15:8 (%ah) | Total number of passed argument slots |
| 47:16 | Argument Info Offset relative to the return address of the caller, or zero |
| 63:48 | Reserved and must be either 0x0000 or 0xFFFF |

If the Argument Info Offset field is non-zero, it contains a signed byte offset to an Argument Info Block (AIB). This byte offset is relative to the return address of the caller, that is, an offset from the location of the instruction after the call instruction. The Argument Info Block must be close enough to the call site for the offset to fit in 32 bits. If the AIB is in the same section as the code, this offset can be calculated at compile time.
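The field layout of the AI register can be sketched as a decoder. This is an illustration, not an OpenVMS interface: the `arg_info` structure and the functions `decode_ai` and `aib_address` are hypothetical, and the 64-bit integer argument stands in for the value that the caller placed in %rax.

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    unsigned xmm_bound;    /* bits 7:0  (%al): upper bound on XMM registers used */
    unsigned slot_count;   /* bits 15:8 (%ah): total number of argument slots */
    int32_t  info_offset;  /* bits 47:16: signed byte offset to the AIB, or 0 */
} arg_info;

/* Split a %rax image into its fields. */
arg_info decode_ai(uint64_t rax)
{
    arg_info ai;
    uint16_t reserved = (uint16_t)(rax >> 48);
    assert(reserved == 0x0000 || reserved == 0xFFFF);   /* bits 63:48 */
    ai.xmm_bound  = (unsigned)(rax & 0xFF);
    ai.slot_count = (unsigned)((rax >> 8) & 0xFF);
    /* Reinterpret the 32-bit field as signed (two's complement assumed). */
    ai.info_offset = (int32_t)(uint32_t)(rax >> 16);
    return ai;
}

/* Address of the Argument Info Block, or 0 when no block is present.
 * The offset is relative to the return address of the call. */
uint64_t aib_address(uint64_t return_address, arg_info ai)
{
    if (ai.info_offset == 0)
        return 0;
    return return_address + (uint64_t)(int64_t)ai.info_offset;
}
```

Because the offset is signed, the block may lie before or after the call site.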

Table 5.14 shows the format of an Argument Info Block.
Table 5.14. Argument Info Block Format
| Bit | Name | Usage |
|---|---|---|
| 7:0 | version | Format version. This format is version 1. |
| 15:8 | arg info count | Number of argument slots represented in this block. |
| 19:16 | 1st arg info | Information on the 1st argument slot. |
| 23:20 | 2nd arg info | Information on the 2nd argument slot. |
| | ... | Information on the nth argument slot. |

The arg info count may be less than, equal to, or greater than the actual number of passed arguments. If it is less, the missing argument information fields are assumed to be 0 (AI$K_AR_I64). If it is greater, the extra entries in this block are ignored.

If all the passed arguments are integers and pointers, there is no need to pass an Argument Info Block. Instead, the Argument Info Offset should be set to zero.

The values of the argument information fields are shown in Table 5.15.
Table 5.15. Argument Slot Information Values
| Value | Name | Meaning |
|---|---|---|
| 0 | AI$K_AR_I64 | Argument is passed in a general-purpose register, if one is available, otherwise on the stack; or argument is not present. |
| 1 | AI$K_AR_FF | F_floating argument is passed in a general-purpose register. |
| 2 | AI$K_AR_FD | D_floating argument is passed in a general-purpose register. |
| 3 | AI$K_AR_FG | G_floating argument is passed in a general-purpose register. |
| 4 | AI$K_AR_FS | Argument is passed in bits 31:0 of an XMM register. |
| 5 | AI$K_AR_FT | Argument is passed in bits 63:0 of an XMM register. |
| 6 | AI$K_AR_FXL | Low half of argument is passed in bits 63:0 of an XMM register. |
| 7 | AI$K_AR_FXH | High half of argument is passed in bits 127:64 of an XMM register. |
| 8 | AI$K_AR_MEM | Argument is pushed on the stack. |
| 9-15 | | Reserved. |
Note that the AI$K_AR_FXL and AI$K_AR_FXH argument fields always occur in pairs.
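Reading one 4-bit slot entry out of an Argument Info Block can be sketched as follows. The sketch assumes that the bit numbers in Table 5.14 map onto memory in little-endian order (bits 7:0 in the first byte, bits 19:16 in the low nibble of the third byte); that mapping is an assumption made here, not something the tables state. The `AI_K_AR_*` names are C stand-ins for the AI$K_AR_* symbols, and `aib_slot_info` is a hypothetical helper.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-ins for the AI$K_AR_* values in Table 5.15. */
enum {
    AI_K_AR_I64 = 0, AI_K_AR_FF = 1, AI_K_AR_FD = 2, AI_K_AR_FG = 3,
    AI_K_AR_FS  = 4, AI_K_AR_FT = 5, AI_K_AR_FXL = 6, AI_K_AR_FXH = 7,
    AI_K_AR_MEM = 8
};

/* Return the 4-bit information value for argument slot `slot` (0-based). */
unsigned aib_slot_info(const uint8_t *aib, unsigned slot)
{
    assert(aib[0] == 1);                 /* version field: format version 1 */
    unsigned count = aib[1];             /* arg info count */
    if (slot >= count)
        return AI_K_AR_I64;              /* missing entries are assumed to be 0 */
    uint8_t byte = aib[2 + slot / 2];    /* two slot entries per byte */
    return (slot % 2 == 0) ? (unsigned)(byte & 0x0F) : (unsigned)(byte >> 4);
}
```

Entries beyond the number of arguments actually passed are simply never queried, which matches the rule that extra entries are ignored.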

5.7.5. Variable Argument Lists

The x86-64 industry standards define how C-style variable argument lists (va_start, va_arg and so on) are implemented. OpenVMS also allows variable argument lists to be accessed as arrays. On prior OpenVMS architectures, a single common mechanism supports both. On OpenVMS x86-64, different mechanisms are implemented.

5.7.5.1. Standard Variable Arguments

The x86-64 standard mechanism uses the va_list structure and the register save area. The register save area structure is presented in Table 5.16.
Table 5.16. Register Save Area Structure
| Offset | Register | Usage |
|---|---|---|
| 0 | %rdi | 1st general-purpose argument register |
| 8 | %rsi | 2nd general-purpose argument register |
| 16 | %rdx | 3rd general-purpose argument register |
| 24 | %rcx | 4th general-purpose argument register |
| 32 | %r8 | 5th general-purpose argument register |
| 40 | %r9 | 6th general-purpose argument register |
| 48 | %xmm0 | 1st floating-point argument register |
| 64 | %xmm1 | 2nd floating-point argument register |
| 80 | %xmm2 | 3rd floating-point argument register |
| 96 | %xmm3 | 4th floating-point argument register |
| 112 | %xmm4 | 5th floating-point argument register |
| 128 | %xmm5 | 6th floating-point argument register |
| 144 | %xmm6 | 7th floating-point argument register |
| 160 | %xmm7 | 8th floating-point argument register |

The register save area is always allocated in the stack frame of the called function. Any function that contains an invocation of the va_start macro must save argument registers in the register save area. The six general-purpose registers are always saved. The number of floating-point registers to be saved depends on the value passed in the %al register. In theory, code should not save more registers than indicated in %al, but in practice, it either saves none (if %al is zero) or all the registers.

The standard requires the caller to pass a floating-point register argument count in the %al register whenever the called function uses the C variable arguments. This includes not only functions explicitly declared with the variable arguments, but all unprototyped functions as well.

Note that the OpenVMS “arginfo notused” linkage does not influence whether this value is passed in %al. The passed value does not need to be exact, but it should be at least an upper bound on the number of arguments passed in floating-point registers.

The x86-64 va_list structure contains the following fields that are described in Table 5.17.
Table 5.17. va_list Structure
| Offset | Field | Usage |
|---|---|---|
| 0 | gp_offset | Byte offset from the start of the register save area of the next available saved integer argument register |
| 4 | fp_offset | Byte offset from the start of the register save area of the next available saved floating-point argument register |
| 8 | overflow_arg_area | Pointer to the first available stack argument |
| 16 | reg_save_area | Pointer to the register save area |
The va_start macro initializes the va_list structure as follows:
  • gp_offset is the byte offset within the register save area of the first unused general-purpose register.

  • fp_offset is the byte offset within the register save area of the first unused floating-point register.

  • overflow_arg_area points to the first unused stack argument.

  • reg_save_area points to the register save area that is already initialized.

For example, for the printf(const char *fmt, ...) function, the va_list structure is initialized as follows:
  • gp_offset is set to +8, the offset of the second general-purpose argument; the first argument (fmt) is already used.

  • fp_offset is set to +48, the offset of the first floating-point argument.

  • overflow_arg_area is set to FP+16, the location of the first stack argument.

When the va_arg macro is invoked, it fetches the argument from a saved register or the stack and increments one field on the va_list structure accordingly. For example, if an integer argument is requested, the va_arg macro will compare the value of gp_offset against 48. If gp_offset is less than 48, the va_arg macro will return a saved integer register and increment gp_offset. Otherwise, it will return a stack argument and increment overflow_arg_area.
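The integer case just described can be sketched in C. This is an illustration of the decision only, not the compiler's va_arg expansion: `va_list_sketch` mirrors the fields of Table 5.17 on a 64-bit target, and `next_integer_arg` and `GP_SAVE_BYTES` are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Mirrors Table 5.17; the field names follow the text. */
typedef struct {
    uint32_t gp_offset;          /* offset 0  */
    uint32_t fp_offset;          /* offset 4  */
    void    *overflow_arg_area;  /* offset 8  */
    void    *reg_save_area;      /* offset 16 */
} va_list_sketch;

enum { GP_SAVE_BYTES = 48 };     /* six 8-byte general-purpose slots */

/* Fetch the next integer-class argument. */
uint64_t next_integer_arg(va_list_sketch *ap)
{
    uint64_t value;
    assert(ap->gp_offset % 8 == 0);
    if (ap->gp_offset < GP_SAVE_BYTES) {
        /* A saved general-purpose register is still available. */
        memcpy(&value, (char *)ap->reg_save_area + ap->gp_offset, sizeof value);
        ap->gp_offset += 8;
    } else {
        /* Registers are exhausted: take the next stack argument. */
        memcpy(&value, ap->overflow_arg_area, sizeof value);
        ap->overflow_arg_area = (char *)ap->overflow_arg_area + 8;
    }
    return value;
}
```

Only one field of the structure changes per fetch: `gp_offset` while saved registers remain, and `overflow_arg_area` afterwards.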

5.7.5.2. OpenVMS Variable Argument Lists

A number of OpenVMS languages allow a procedure to query the total number of arguments and to access arguments as a single array. The following language constructs allow this:
  • ARGPTR, ACTUALPARAMETER and ACTUALCOUNT in BLISS

  • [list], argument, and argument_list_length in VSI Pascal

  • va_count in VSI C

All rely on OpenVMS extensions to the standard calling conventions.

On OpenVMS standard calls, the caller passes argument information in the %rax register that specifies the total number of the used argument slots and location of each register argument. In theory, this information only needs to be passed if the called procedure uses one of the above mentioned language constructs, but since the caller is not able to determine this, the argument information is passed in %rax on all OpenVMS standard calls.

If a called procedure requests its argument count, the count is available in %ah. If a called procedure requests an argument list, it performs the following steps:
  1. Allocates the storage in its own stack frame for the entire arglist (8 * %ah).

  2. Copies all general-purpose registers, floating-point registers, and memory arguments to the arglist as indicated by the values in %rax.

Unlike the prior OpenVMS architectures, on OpenVMS x86-64 it is not possible to create a register “home” on the stack that is contiguous with the incoming memory arguments.

5.7.6. Procedure Return Values

Procedure return values are classified and returned to the appropriate locations depending on their classes as defined for arguments in Section 5.7.2.
  1. If the class is MEMORY, then the caller provides the space for the return value and passes the address of this storage in %rdi as if it were the first argument to the function. In effect, this address becomes a hidden first argument. This storage must not overlap any data visible to the callee through the other parameters in this argument list.

    On return %rax will contain the address that was passed in %rdi by the caller.

  2. If the class is INTEGER, the next available register of the sequence %rax, %rdx is used.

  3. If the class is SSE, the next available floating-point register of the sequence %xmm0, %xmm1 is used.

  4. If the class is SSEUP, the quadword is returned in the next available 8-byte chunk of the last used floating-point register.

  5. If the class is X87, the value is returned on the X87 stack in %st0 as an 80-bit x87 number.

  6. If the class is X87UP, the value is returned together with the previous X87 value in %st0.

  7. If the class is COMPLEX_X87, the real part of the value is returned in %st0 and the imaginary part in %st1.

As a result, scalar values and complex floating-point values are returned in registers %rax, %rax and %rdx, %xmm0, or %xmm0 and %xmm1. The exception is an IEEE complex quadruple-precision value, which is returned in a caller-provided temporary location.
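For return values of one or two quadwords whose classes are INTEGER or SSE, the register choice in rules 2 and 3 can be sketched as follows. The names `ret_class` and `return_registers` are hypothetical, and the MEMORY, SSEUP, and X87 cases are deliberately left out.

```c
#include <assert.h>
#include <string.h>

typedef enum { RET_INTEGER, RET_SSE } ret_class;

/* Fill out[i] with the register that carries quadword i of a return
 * value made of one or two INTEGER/SSE quadwords. */
void return_registers(const ret_class *qw, int n, const char **out)
{
    static const char *const gpr[] = { "%rax", "%rdx" };
    static const char *const xmm[] = { "%xmm0", "%xmm1" };
    int next_gpr = 0, next_xmm = 0;
    assert(n >= 1 && n <= 2);
    for (int i = 0; i < n; i++)
        out[i] = (qw[i] == RET_INTEGER) ? gpr[next_gpr++] : xmm[next_xmm++];
}
```

The two register sequences advance independently, so a structure classified as INTEGER followed by SSE is returned in %rax and %xmm0, while one classified as INTEGER followed by INTEGER is returned in %rax and %rdx.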

5.7.7. Parameter Passing and Return Result Examples

This section includes examples that illustrate the parameter passing and return result rules.

Example 1

As an example of the register passing conventions, consider the declarations and function call shown in Figure 5.4. The corresponding register allocation is given in Figure 5.5 where the stack frame offset given shows the frame before calling the function.

Figure 5.4. Parameter Passing Example 1
typedef struct {
    int a, b;
    double d;
} structparm;
structparm s;
int e, f, g, h, i, j, k;
long double ld;
double m, n;
__m256 y;
__m512 z;

extern void func (int e, 
                  int f,
                  structparm s, 
                  int g, 
                  int h,
                  long double ld, 
                  double m,
                  __m256 y,
                  __m512 z,
                  double n, 
                  int i, 
                  int j, 
                  int k);

func (e, f, s, g, h, ld, m, y, z, n, i, j, k);
Figure 5.5. Register Allocation Example 1

Example 2

This C example illustrates some subtle effects and differences that can result between several closely related sets of declarations as shown in Figure 5.6. Each part begins with a structure declaration that has three fields:
  1. An int (4 bytes) or a long (8 bytes) named a.

  2. A short (2 bytes) named b.

  3. A float (4 bytes) or a double (8 bytes) named c.

All four alternatives are included. This structure is followed by a declaration for a function that returns a value of that structure type and a function that has one parameter of that structure type.

Figure 5.6. Declarations Used in Example 2
// Part A Declarations: Fields of type int, short, double
typedef struct {
      int a;
      short b;
      double c;
      } structparm_isd;
structparm_isd s_isd;
extern structparm_isd set_isd();
extern void func_isd (structparm_isd p_isd);

// Part B Declarations: Fields of type long, short, double
typedef struct {
      long a;
      short b;
      double c;
      } structparm_lsd;
structparm_lsd s_lsd;
extern structparm_lsd set_lsd();
extern void func_lsd(structparm_lsd p_lsd);

// Part C Declarations: Fields of type int, short, float
typedef struct {
      int a;
      short b;
      float c;
      } structparm_isf;
structparm_isf s_isf;
extern structparm_isf set_isf();
extern void func_isf(structparm_isf p_isf);

// Part D Declarations: Fields of type long, short, float
typedef struct {
      long a;
      short b;
      float c;
      } structparm_lsf;
structparm_lsf s_lsf;
extern structparm_lsf set_lsf();
extern void func_lsf(structparm_lsf p_lsf);
Figure 5.7 illustrates the allocation and alignment of the fields in the respective structures.
Figure 5.7. Allocation and Alignment for Example Declarations
Table 5.18 illustrates how the fields of the respective structures are passed.
Table 5.18. Parameter Passing Locations for Example Declarations

| Call | Field a | Field b | Field c |
|---|---|---|---|
| func_isd(s_isd) | %rdi | %rdi | %xmm0 |
| func_lsd(s_lsd) | memory (stack) | memory (stack) | memory (stack) |
| func_isf(s_isf) | %rdi | %rdi | %xmm0 |
| func_lsf(s_lsf) | %rdi | %rsi | %rsi |
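Table 5.19 illustrates how the fields of the respective structures are returned as a function result.
Table 5.19. Function Return Locations for Example Declarations

| Call | Field a | Field b | Field c |
|---|---|---|---|
| set_isd(s_isd) | %rax | %rax | %xmm0 |
| set_lsd(s_lsd) | memory pointed to by %rax (passed in %rdi) | memory pointed to by %rax (passed in %rdi) | memory pointed to by %rax (passed in %rdi) |
| set_isf(s_isf) | %rax | %rax | %xmm0 |
| set_lsf(s_lsf) | %rax | %rdx | %rdx |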
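The locations in this example follow from where each field falls relative to quadword boundaries. The sketch below checks the field offsets for the four structures. It is an illustration under stated assumptions: natural alignment with no packing, and the fixed-width types `int32_t`, `int64_t`, and `int16_t` standing in for the int (4 bytes), long (8 bytes), and short (2 bytes) of the example. The names `layout_*`, `quadword_of`, and `check_example2_layouts` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct { int32_t a; int16_t b; double c; } layout_isd;  /* Part A */
typedef struct { int64_t a; int16_t b; double c; } layout_lsd;  /* Part B */
typedef struct { int32_t a; int16_t b; float  c; } layout_isf;  /* Part C */
typedef struct { int64_t a; int16_t b; float  c; } layout_lsf;  /* Part D */

/* Index of the quadword that contains the byte at offset `off`. */
size_t quadword_of(size_t off)
{
    return off / 8;
}

/* Verify the layout assumed in this sketch (natural alignment). */
void check_example2_layouts(void)
{
    assert(offsetof(layout_isd, c) == 8  && sizeof(layout_isd) == 16);
    assert(offsetof(layout_lsd, c) == 16 && sizeof(layout_lsd) == 24);
    assert(offsetof(layout_isf, c) == 8  && sizeof(layout_isf) == 12);
    assert(offsetof(layout_lsf, c) == 12 && sizeof(layout_lsf) == 16);
}
```

In Parts A and C, fields a and b share quadword 0 (INTEGER) and c occupies quadword 1 (SSE), giving one general-purpose register and one XMM register. In Part D, b and c share quadword 1, so the INTEGER class wins and the quadword travels in a second general-purpose register. Part B occupies three quadwords, none of which is SSE followed by SSEUP, so the post-merger cleanup sends it to memory.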

5.8. Procedure Call Stack

A procedure is an active procedure while its body is executing, including while any procedure it calls is executing. When a procedure is active, its designated condition handler may handle an exception that is signaled during its execution.

Associated with each active procedure is an invocation context, informally called a frame, which consists of the set of registers and space in memory that is allocated and that may be accessed during execution for a particular call of that procedure.

When a procedure begins to execute, it has a limited invocation context that includes the parameter passing registers of its caller. The initial instructions may allocate and initialize additional context, including possibly saving information from the invocation context of its caller. Such instructions, if any, are termed a procedure prologue. Once execution of the prologue is complete, the procedure is said to be active.

When a procedure is ready to return to its caller, the procedure ceases to be active after it begins to execute the instructions that deallocate and discard the procedure's invocation context (which may include restoring state of the caller's invocation context that was saved during the prologue). These instructions are termed a procedure epilogue.

A null frame procedure has no prologue and no epilogue, and consists solely of body instructions. Such a procedure becomes active immediately.

A procedure may have more than one prologue if there are multiple entry points. A procedure may also have more than one epilogue if there are multiple return points. One of each will be executed during any given invocation of the procedure.

A procedure call stack (for a thread) consists of the stack of invocation contexts that exists at any point in time. New invocation contexts are pushed on that stack as procedures are called and invocations are popped from the call stack as procedures return.

The invocation context of a procedure that calls another procedure is said to precede or be previous to the invocation context of the called procedure.

5.8.1. Current Procedure

The current procedure is the active procedure whose execution began most recently; its invocation context is at the top of the call stack. Note that a procedure executing in its prologue or epilogue is not active, and hence cannot be the current procedure.

For OpenVMS x86-64, the IP (instruction pointer) register in combination with associated unwind information determines what procedure is current (for exception handling purposes). See Section B.3 for a description of the unwind information data structures.

5.8.2. Procedure Call Tracing

Mechanisms for each of the following functions are needed to support procedure call tracing:

  • To provide the context of a procedure invocation

  • To walk (navigate) the procedure call stack

  • To refer to a given procedure invocation

  • To examine or modify the register context of an active procedure

This section describes the data structure mechanisms. The run-time library functions that support these functions are described in Section 5.8.3.

5.8.2.1. Invocation Context Block

The context of a specific procedure invocation is provided through the use of a data structure called an invocation context block (ICB). Table 5.20 describes the contents of the OpenVMS x86-64 invocation context block.

Table 5.20. Contents of the Invocation Context Block

| Field | Size | Description |
|---|---|---|
| LIBICB$L_CONTEXT_LENGTH | Longword | Unsigned total length in bytes of the invocation context block. See Section 5.8.3.1. |
| LIBICB$V_FRAME_FLAGS | 3 Bytes | See Table 5.21. |
| LIBICB$B_BLOCK_VERSION | Byte | ICB version; initial value of 3 for OpenVMS x86-64. (1 is for OpenVMS Alpha, 2 is for OpenVMS I64). See Section 5.8.3.1. |
| LIBICB$IH_UC_FLAGS, LIBICB$IH_UC_LINK | 2 Quadwords | Internal (opaque) unwind context data. |
| LIBICB$IH_IREG | 16 Quadwords | Array of general registers. IREG[0], the argument information register, can be referenced using the symbol LIBICB$IH_AI. IREG[6], the frame pointer, can be referenced using the symbol LIBICB$IH_BP. IREG[7], the stack pointer, can be referenced using the symbol LIBICB$IH_SP. |
| LIBICB$IH_IP | Quadword | Current instruction pointer (IP). |
| LIBICB$IH_PSEUDO_REGS | 32 Quadwords | Array of Alpha pseudo-registers. |
| LIBICB$IH_RFLAGS | Quadword | Processor RFLAGS register. |
| LIBICB$IH_FSGS | Quadword | Segment register %fs: LIBICB$W_FS. Segment register %gs: LIBICB$W_GS. |
| LIBICB$IH_XSAVE_STATE | Quadword | XSAVE state control register value indicating what information is contained in the XSAVE area. This is the state-component bit map needed by the XRSTOR to restore the floating-point state from the XSAVE area (0 if the XSAVE pointer is null). |
| LIBICB$PH_XSAVE | Quadword | Pointer to an XSAVE area (null if floating-point is not in use). |
| LIBICB$L_XSAVE_LENGTH | Longword | The number of bytes in the block pointed to by LIBICB$PH_XSAVE (0 if LIBICB$PH_XSAVE is null). |
| LIBICB$PH_CHFCTX_ADDR | Quadword | Pointer to condition handler facility context block. |
| LIBICB$IH_OSSD | Quadword | Copy of OSSD from unwind information. |
| LIBICB$IH_HANDLER_PV | Quadword | Condition Handler Procedure Value (if any). |
| LIBICB$PH_LSDA | Quadword | Address of the Language Specific Data Area (if any). |
| Beginning of User Override Parameters (offset LIBICB$R_UO_BASE) | | |
| LIBICB$Q_UO_FLAGS | Quadword | Operational flags: LIBICB$V_UO_FLAG_CACHE_UNWIND – Cache unwind information during a walk of the call stack. See Section 5.8.3.2. |
| LIBICB$IH_UO_IDENT | Quadword | |
| LIBICB$PH_UO_READ_MEM | Quadword | |
| LIBICB$PH_UO_GETUEINFO | Quadword | |
| LIBICB$PH_UO_GETCONTEXT | Quadword | |
| LIBICB$PH_UO_WRITE_MEM | Quadword | |
| LIBICB$PH_UO_WRITE_REG | Quadword | |
| LIBICB$PH_UO_MALLOC | Quadword | |
| LIBICB$PH_UO_FREE | Quadword | |
| End of user override parameters (length of LIBICB$K_UO_LENGTH) | | |
| LIBICB$L_ALERT_CODE | Longword | Stack walk detailed status. Alert codes are enumerated in the LIBICB include files (see Section 5.8.3.7). |
| LIBICB$IH_SYSTEM_DEFINED[n] | n Quadwords | Variable-sized area; unused and undefined at this time. |

Table 5.21. Flags in LIBICB$V_FRAME_FLAGS Field of the Invocation Context Block
| Flag | Description |
|---|---|
| LIBICB$V_EXCEPTION_FRAME | Set to 1 if this is an exception frame. |
| LIBICB$V_AST_FRAME | Set to 1 if this is an AST frame. |
| LIBICB$V_BOTTOM_OF_STACK | Set to 1 if this is the bottom of the stack and there is absolutely no previous frame. |
| LIBICB$V_HANDLER_PRESENT | Set to 1 if this frame has a condition handler. |
| LIBICB$V_IN_PROLOGUE | Set to 1 if the IP is in a prologue region. |
| LIBICB$V_IN_EPILOGUE | Set to 1 if the IP is in an epilogue region. |

Static scratch registers, unless saved and described in the unwind table information, are not realizable except for an invocation context preceding an exception or AST frame.

5.8.2.2. Invocation Context Handle

To refer to a specific procedure invocation at run-time, an invocation context handle (ICH) can be used. The invocation context handle is a quadword that uniquely identifies any one of the active frames on a call stack.

On OpenVMS x86-64, the invocation context handle for a frame is simply the stack pointer value at procedure entry (that is, the address of the caller’s return address on the stack).

5.8.3. Invocation Context Block Access Routines

A thread can manipulate the invocation context of any procedure in the thread's virtual address space by calling the run-time library functions described in this section.

Note

The OpenVMS x86-64 stack tracing routines use heap storage during the analysis of unwind descriptors. The default heap storage mechanism uses a LIBRTL implementation of the C RTL function malloc, the use of which may result in virtual memory being expanded using the $EXPREG system service. See Section 5.8.5 on how to override the defaults. See also Section 5.8.3.12.

5.8.3.1. Initializing the Invocation Context Block

When allocating a new invocation context block, the user must perform the following steps prior to calling any of the routines described in Section 5.8.3:
  • Allocate the block on an octaword (16-byte) boundary.

  • Clear (set to all zero bytes) the entire block.

  • Initialize the LIBICB$L_CONTEXT_LENGTH field to LIBICB$K_INVO_CONTEXT_BLK_SIZE and the LIBICB$B_BLOCK_VERSION field to LIBICB$K_INVO_CONTEXT_VERSION.

  • Set any required parameters in the user override portion of the invocation context block.

  • Set the LIBICB$V_UO_FLAG_CACHE_UNWIND flag if appropriate. See also Section 5.8.3.2 and Section 5.8.3.12 regarding subsequent use of LIB$X86_PREV_INVO_END.

Failure to do so will cause these routines to return an error status. Note that this is a change from Alpha, where initialization was not necessary.
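The first three steps (alignment, clearing, and setting the length and version fields) can be sketched in portable C. This is an illustration only: `icb_sketch` is a hypothetical stand-in that models just the leading fields of the block, `ICB_SKETCH_VERSION` uses the value 3 that Table 5.20 gives for OpenVMS x86-64, and real code would take the structure, LIBICB$K_INVO_CONTEXT_BLK_SIZE, and LIBICB$K_INVO_CONTEXT_VERSION from the LIBICB include files. The user override parameters and the cache unwind flag are not modeled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the invocation context block. */
typedef struct {
    uint32_t context_length;   /* models LIBICB$L_CONTEXT_LENGTH */
    uint8_t  frame_flags[3];   /* models LIBICB$V_FRAME_FLAGS */
    uint8_t  block_version;    /* models LIBICB$B_BLOCK_VERSION */
    uint64_t remainder[62];    /* placeholder for all remaining fields */
} icb_sketch;

enum { ICB_SKETCH_VERSION = 3 };

/* Carve an octaword-aligned, zeroed block out of caller-supplied storage.
 * Returns NULL if the storage is too small once alignment is applied. */
icb_sketch *icb_sketch_init(void *storage, size_t storage_size)
{
    uintptr_t raw = (uintptr_t)storage;
    uintptr_t aligned = (raw + 15u) & ~(uintptr_t)15u;   /* round up to 16 */
    size_t pad = (size_t)(aligned - raw);
    icb_sketch *icb;

    if (storage_size < pad + sizeof(icb_sketch))
        return NULL;
    icb = (icb_sketch *)aligned;
    memset(icb, 0, sizeof *icb);                         /* clear the entire block */
    icb->context_length = (uint32_t)sizeof *icb;
    icb->block_version = ICB_SKETCH_VERSION;
    assert(((uintptr_t)icb & 15u) == 0);
    return icb;
}
```

Over-allocating by 15 bytes and rounding the address up is one way to obtain octaword alignment from storage whose alignment is not known in advance.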

To simplify the initialization process, the convenience routines LIB$X86_CREATE_INVO_CONTEXT (see Section 5.8.3.3) and LIB$X86_INIT_INVO_CONTEXT (see Section 5.8.3.5) are provided.

5.8.3.2. Walking the Call Stack

During the course of program execution, it is sometimes necessary to walk the call stack. Frame-based exception handling is one case where this is done. Call stack navigation is possible only in the reverse direction (in a latest-to-earliest or top-to-bottom sequence).

To walk the call stack, perform the following steps:
  1. Given a program state (which contains a register set), build an invocation context.

    For the current routine, an initial invocation context block can be obtained by calling the LIB$X86_GET_CURR_INVO_CONTEXT routine (see Section 5.8.3.7).

  2. Repeatedly call the LIB$X86_GET_PREV_INVO_CONTEXT routine (see Section 5.8.3.8) until the desired invocation context, or the end of the call chain, has been reached.

    LIB$X86_GET_PREV_INVO_CONTEXT indicates the end of the invocation call chain if either of the following conditions is true:
    • The OSSD$V_BOTTOM_OF_STACK flag is set for the target frame (see Table A.14).

    • The return address (IP) of the target frame is zero.

To make the stack walk more efficient, you can set the LIBICB$V_UO_FLAG_CACHE_UNWIND flag. This causes unwind information to be carried over from one call to LIB$X86_GET_PREV_INVO_CONTEXT to the next. At the conclusion of the stack walk, you must call LIB$X86_PREV_INVO_END to free any cached unwind information. This is the recommended practice, but not the default behavior.

Compilers are allowed to optimize high-level language procedure calls in such a way that they do not appear in the invocation chain. For example, inline procedures never appear in the invocation chain.

Make no assumptions about the relative positions of any memory used for procedure frame information. There is no guarantee that successive stack frames will always appear at higher addresses.

5.8.3.3. LIB$X86_CREATE_INVO_CONTEXT

This convenience routine simplifies creating and properly initializing an invocation context block. The routine allocates an invocation context block from heap storage and initializes it according to the steps described in Section 5.8.3.1. Users of this routine should call LIB$X86_FREE_INVO_CONTEXT when the invocation context block is no longer required.

This routine sets the cache unwind flag LIBICB$V_UO_FLAG_CACHE_UNWIND in the invocation context block to speed the stack walk. Do not use this routine in conjunction with LIB$X86_INIT_INVO_CONTEXT, as the same initialization is performed by both routines.

LIB$X86_CREATE_INVO_CONTEXT ([malloc] [, free] [, ident])

| Argument | OpenVMS Usage | Type | Access | Mechanism |
|---|---|---|---|---|
| malloc | function_value | procedure | read | by value |
| free | function_value | procedure | read | by value |
| ident | user_value | quadword | read | by value |

Arguments:

malloc

A procedure value for a user callback routine that allocates memory. See Section 5.8.5.6 for details of this routine. This is an optional argument. The default is to use an implementation of the C RTL routine malloc. If specified, this routine is used to allocate the invocation context block and is also placed in the invocation context block field LIBICB$PH_UO_MALLOC for use during the stack walk.

free

A procedure value for a user callback routine that deallocates memory. This value is placed in the invocation context block field LIBICB$PH_UO_FREE. See Section 5.8.5.7 for details on this routine. This is an optional argument; however, it must be specified if malloc is specified. The default is to use an implementation of the C RTL routine free.

ident

Specifies a user ident value to be placed in the invocation context block LIBICB$IH_UO_IDENT field. In turn, this value is passed to the malloc and free routines, described in Section 5.8.5.6 and Section 5.8.5.7 respectively. This is an optional argument; the default value is zero.

Function Value Returned:

invo_context

A non-zero value represents the address of the invocation context block allocated. A value of 0 indicates failure.

5.8.3.4. LIB$X86_FREE_INVO_CONTEXT

Deallocates an invocation context block that was previously allocated using LIB$X86_CREATE_INVO_CONTEXT. This routine calls LIB$X86_PREV_INVO_END as a convenience.

LIB$X86_FREE_INVO_CONTEXT (invo_context)

Argument

OpenVMS Usage

Type

Access

Mechanism

invo_context

invo_context_blk

structure

modify

by reference

Argument:

invo_context

Address of an invocation context block.

Function Value Returned:

None.

 

5.8.3.5. LIB$X86_INIT_INVO_CONTEXT

Initializes an invocation context block that the user has already allocated (on the stack, or from heap, or other storage) in accordance with Section 5.8.3.1. Use this routine as an alternative to LIB$X86_CREATE_INVO_CONTEXT, which both allocates and initializes an invocation context block.

LIB$X86_INIT_INVO_CONTEXT
  (invo_context, invo_version [, cache_unwind_flag])

Argument

OpenVMS Usage

Type

Access

Mechanism

invo_context

invo_context_blk

structure

modify

by reference

invo_version

version_number

byte

read

by value

cache_unwind_flag

flag

longword

read

by value

Arguments:

invo_context

Address of an invocation context block.

invo_version

The value LIBICB$K_INVO_CONTEXT_VERSION. This is used to verify the operating environment.

cache_unwind_flag

A flag indicating if the cache unwind flag, LIBICB$V_UO_FLAG_CACHE_UNWIND, should be set in the invocation context block. A value of zero clears the flag; a value of one sets the flag. This is an optional argument. The default is zero.

Function Value Returned:

status

A value of 1 indicates success. A value of 0 indicates a version number mismatch.

5.8.3.6. LIB$X86_GET_INVO_CONTEXT

A thread can obtain the invocation context of any active procedure by using this function:
LIB$X86_GET_INVO_CONTEXT(invo_handle, invo_context)

Argument

OpenVMS Usage

Type

Access

Mechanism

invo_handle

invo_handle

quadword

read

by reference

invo_context

invo_context_blk

structure

modify

by reference

Arguments:

invo_handle

Address of the location that contains the handle for the desired invocation.

invo_context

Address of an invocation context block into which the procedure context of the frame specified by invo_handle will be written.

Note

The invocation context block must be properly initialized as described in Section 5.8.3.1 before calling this routine.

Function Value Returned:

status

Status value. A value of 1 indicates success; a value of 0 indicates failure.

Note

If the invocation handle that was passed does not represent any procedure context in the active call stack, the new contents of the context block are unpredictable.

5.8.3.7. LIB$X86_GET_CURR_INVO_CONTEXT

A thread can obtain the invocation context of the current procedure by using this function:
LIB$X86_GET_CURR_INVO_CONTEXT(invo_context)

Argument

OpenVMS Usage

Type

Access

Mechanism

invo_context

invo_context_blk

structure

modify

by reference

Argument:

invo_context

Address of an invocation context block into which the procedure context of the caller will be written.

Note

The invocation context block must be properly initialized as described in Section 5.8.3.1 before calling this routine.

Function Value Returned:

Zero

This facilitates use in the implementation of the C language unwind setjmp or longjmp function. Check the LIBICB$L_ALERT_CODE field of the invocation context block for further status indication.

5.8.3.8. LIB$X86_GET_PREV_INVO_CONTEXT

A thread can obtain the invocation context of the procedure context preceding any other procedure context by using this function:
LIB$X86_GET_PREV_INVO_CONTEXT(invo_context)

Argument

OpenVMS Usage

Type

Access

Mechanism

invo_context

invo_context_blk

structure

modify

by reference

Argument:

invo_context

Address of a valid invocation context block. The given invocation context block is updated to represent the context of the previous (calling) frame.

The LIBICB$V_BOTTOM_OF_STACK flag of the invocation context block is set if the target frame represents the end of the invocation call chain or if stack corruption is detected.

Function Value Returned:

status

Status value. A value of 1 indicates success. When the initial context represents the bottom of the call stack, a value of 0 is returned.
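The loop structure implied by this return value can be sketched as follows. Because the real routine and the invocation context block layout exist only on OpenVMS x86-64, this sketch substitutes a hypothetical mock_icb structure and a hypothetical mock_get_prev_invo_context function. Both are stand-ins invented for illustration and are not part of this standard.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an invocation context block.  The real
 * layout is defined by the calling standard, not by this sketch. */
typedef struct {
    int depth;            /* 0 = starting frame; grows toward the callers */
    int max_depth;        /* depth of the outermost frame in the mock chain */
    int bottom_of_stack;  /* models the LIBICB$V_BOTTOM_OF_STACK flag */
} mock_icb;

/* Mock of LIB$X86_GET_PREV_INVO_CONTEXT: updates the block in place to
 * describe the calling frame.  Returns 1 on success and 0 when the
 * initial context already represents the bottom of the call stack. */
static int mock_get_prev_invo_context(mock_icb *icb)
{
    if (icb->bottom_of_stack)
        return 0;
    icb->depth += 1;
    if (icb->depth >= icb->max_depth)
        icb->bottom_of_stack = 1;
    return 1;
}

/* Walk from the starting frame to the bottom of the stack and return
 * the number of caller frames visited. */
static int count_caller_frames(mock_icb *icb)
{
    int frames = 0;
    while (mock_get_prev_invo_context(icb))
        frames++;
    return frames;
}
```

A real walk would presumably obtain its starting context from LIB$X86_GET_CURR_INVO_CONTEXT and inspect the context block inside the loop body; the sketch only models the termination condition.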

5.8.3.9. LIB$X86_GET_INVO_HANDLE

A thread can obtain an invocation handle corresponding to any invocation context block by using this function:
LIB$X86_GET_INVO_HANDLE(invo_context, invo_handle)

Argument

OpenVMS Usage

Type

Access

Mechanism

invo_context

invo_context_blk

structure

read

by reference

invo_handle

invo_handle

quadword

write

by reference

Arguments:

invo_context

Address of a valid invocation context block.

invo_handle

Address of the location into which the invocation context handle is to be written. If the call fails, the value of the invocation context handle is LIB$K_INVO_HANDLE_NULL.

Function Value Returned:

status

A value of 1 indicates success. A value of 0 indicates failure.

5.8.3.10. LIB$X86_GET_CURR_INVO_HANDLE

A thread can obtain the invocation handle for the current procedure by using this function:
LIB$X86_GET_CURR_INVO_HANDLE(invo_handle)

Argument

OpenVMS Usage

Type

Access

Mechanism

invo_handle

invo_handle

quadword

write

by reference

Arguments:

invo_handle

Address of a quadword into which the invocation handle of the caller will be written.

Function Value Returned:

status

A value of 1 indicates success. A value of 0 indicates failure.

5.8.3.11. LIB$X86_GET_PREV_INVO_HANDLE

A thread can obtain an invocation handle of the procedure context preceding that of a specified procedure context by using this function:
LIB$X86_GET_PREV_INVO_HANDLE(invo_handle_in, invo_handle_out)

Argument

OpenVMS Usage

Type

Access

Mechanism

invo_handle_in

invo_handle

quadword

read

by reference

invo_handle_out

invo_handle

quadword

write

by reference

Arguments:

invo_handle_in

The address of an invocation handle that represents a target invocation context.

invo_handle_out

Address of the location into which the invocation context handle of the previous context is to be written. If the call fails, the value of the previous invocation context handle is LIB$K_INVO_HANDLE_NULL.

Function Value Returned:

status

A value of 1 indicates success. A value of 0 indicates failure.

Note

Each call to this routine involves a stack walk from the top of the stack to find the procedure matching the input handle. Consequently, using this routine repeatedly is an inefficient way to walk the stack, compared to using LIB$X86_GET_PREV_INVO_CONTEXT.

5.8.3.12. LIB$X86_PREV_INVO_END

This routine should be called at the conclusion of call tracing operations to free the memory used to process unwind descriptors. The call tracing routines are LIB$X86_GET_INVO_CONTEXT, LIB$X86_GET_PREV_INVO_CONTEXT, and LIB$X86_GET_CURR_INVO_CONTEXT.

To provide efficient call tracing, some unwind information is tracked in heap storage from one call to the next. This heap storage should be freed before you release or reuse the invocation context block.

Calling this routine is necessary if the LIBICB$V_UO_FLAG_CACHE_UNWIND flag is set in the LIBICB$Q_UO_FLAGS field of the invocation context block. If this flag is not set, unwind information is released and recreated at each call, and calling this routine is not required.

LIB$X86_PREV_INVO_END (invo_context)

Argument

OpenVMS Usage

Type

Access

Mechanism

invo_context

invo_context_blk

structure

modify

by reference

Arguments:

invo_context

Address of a valid invocation context block previously used for call tracing.

Function Value Returned:

status

A value of 1 indicates success. A value of 0 indicates failure.

5.8.3.13. LIB$X86_PUT_INVO_REGISTERS

The fields of a given procedure invocation context can be updated with new register contents by using this function:
LIB$X86_PUT_INVO_REGISTERS
  (invo_handle, invo_context [,gr_mask] [,xmm_mask]
  [,ymm_mask] [,zmm_mask] [,apr_mask] [,misc_mask])
Note that if user override routines are specified in the invocation context block, then they are used to find and modify the invocation context.

Argument

OpenVMS Usage

Type

Access

Mechanism

invo_handle

invo_handle

quadword

read

by reference

invo_context

invo_context_blk

structure

read

by reference

gr_mask

mask_word

16-bit vector

read

by reference

xmm_mask

mask_word

16-bit vector

read

by reference

ymm_mask

mask_word

16-bit vector

read

by reference

zmm_mask

mask_longword

32-bit vector

read

by reference

apr_mask

mask_longword

32-bit vector

read

by reference

misc_mask

mask_quadword

64-bit vector

read

by reference

Arguments:

invo_handle

Handle for the invocation to be updated.

invo_context

Address of a valid invocation context block that contains new register contents.

At least one of the following register masks must be specified and contain a non-zero value. Each register that is set in the xx_mask argument is updated using the value found in the corresponding ICB field. For example, bit n set in gr_mask corresponds to IREG[n].

gr_mask

Address of a 16-bit bit vector, where each bit corresponds to a register field in the invo_context argument.

Bits 0 through 15 correspond to IREG[0] through IREG[15].

Bit 0 corresponds to the argument information register (AI).

If bit 7, which corresponds to SP, is set, then no changes are made.
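As a minimal sketch of the gr_mask rules just described, a caller might build the mask with a helper that refuses the SP bit before the call is ever made. The helper name gr_mask_add and the two macros are hypothetical and are introduced only for this example.

```c
#include <assert.h>
#include <stdint.h>

#define GR_INDEX_AI 0u   /* bit 0: argument information register (AI) */
#define GR_INDEX_SP 7u   /* bit 7: SP; if set, no changes are made */

/* Add IREG[index] to a 16-bit gr_mask.  Returns 1 on success, or 0 if
 * the index is out of range or names SP, leaving *mask unchanged. */
static int gr_mask_add(uint16_t *mask, unsigned index)
{
    if (index > 15u || index == GR_INDEX_SP)
        return 0;
    *mask |= (uint16_t)(1u << index);
    return 1;
}
```

The resulting mask would then be passed by reference as the gr_mask argument.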

xmm_mask

Address of a 16-bit bit vector, where each bit corresponds to an SSE XMM register field in the XSAVE area, pointed to from the passed invo_context. Bit 7 corresponds to XMM7.

ymm_mask

Address of a 16-bit bit vector, where each bit corresponds to an AVX YMM register field in the XSAVE area, pointed to from the passed invo_context. Bit 14 corresponds to YMM14.

zmm_mask

Address of a 32-bit bit vector, where each bit corresponds to an AVX-512 ZMM register field in the XSAVE area, pointed to from the passed invo_context. Bit 21 corresponds to ZMM21.

Note that if the same bit position is set in more than one of the xmm_mask, ymm_mask, and zmm_mask, the result is undefined.
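Since overlapping bits across the three vector masks give an undefined result, a caller may want to check for overlap first. The following is one possible check; the function name vector_masks_disjoint is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Returns 1 if no bit position is set in more than one of the three
 * masks, 0 otherwise.  The 16-bit masks are widened to 32 bits so
 * that bit n lines up across all three. */
static int vector_masks_disjoint(uint16_t xmm_mask, uint16_t ymm_mask,
                                 uint32_t zmm_mask)
{
    uint32_t x = xmm_mask, y = ymm_mask, z = zmm_mask;
    return ((x & y) | (x & z) | (y & z)) == 0;
}
```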

apr_mask

Address of a 32-bit bit vector, where each bit corresponds to a register field in the pointed-to Alpha pseudo-register area. Bits 0 through 31 correspond to Alpha registers R0 through R31. If bit 30, which corresponds to SP, or bit 31, which corresponds to RZ, is set, then no changes are made.

misc_mask

Address of a 64-bit bit vector, where each bit corresponds to a register field in the passed invo_context as follows:
  • Bit 0=IP
  • Bit 1=RFLAGS register
  • Bit 2=FS register
  • Bit 3=GS register
  • Bit 4=MXCSR register
  • Bit 5=FCW register
  • Bit 6=FSW register
  • Bits 7 through 63 are reserved

Note that IP can only be updated when the invocation in question has been interrupted (either by exception or by an interrupt) and is logically previous to an invocation with the OSSD$V_EXCEPTION_FRAME bit set.

Note that MXCSR, FCW, and FSW can only be updated when there is a valid address and an XSAVE area in the invo_context.
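The misc_mask bit assignments and the XSAVE restriction can be captured in a small pre-check, sketched below. The enumerator names, the macros, and misc_mask_ok are hypothetical; only the bit positions come from the list above. The additional restriction on updating IP is not modeled here.

```c
#include <assert.h>
#include <stdint.h>

/* Bit positions in misc_mask, as listed in the argument description.
 * The enumerator names are illustrative only. */
enum misc_bit {
    MISC_BIT_IP     = 0,
    MISC_BIT_RFLAGS = 1,
    MISC_BIT_FS     = 2,
    MISC_BIT_GS     = 3,
    MISC_BIT_MXCSR  = 4,
    MISC_BIT_FCW    = 5,
    MISC_BIT_FSW    = 6
};

#define MISC_MASK_DEFINED UINT64_C(0x7F)   /* bits 0 through 6 */
#define MISC_MASK_XSAVE ((UINT64_C(1) << MISC_BIT_MXCSR) | \
                         (UINT64_C(1) << MISC_BIT_FCW)   | \
                         (UINT64_C(1) << MISC_BIT_FSW))

/* Returns 1 if the mask uses only defined bits and requests MXCSR,
 * FCW, or FSW only when an XSAVE area is available; 0 otherwise. */
static int misc_mask_ok(uint64_t mask, int has_xsave)
{
    if (mask & ~MISC_MASK_DEFINED)
        return 0;                      /* reserved bits 7 through 63 */
    if ((mask & MISC_MASK_XSAVE) && !has_xsave)
        return 0;
    return 1;
}
```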

Function Value Returned:

status

A value of 1 indicates success. A value of 0 is returned (and nothing is changed) in the following circumstances:
  • When the invocation handle does not represent an active invocation context.

  • When bit 7 of the gr_mask argument is set.

  • When a scratch register has not been saved, or a register's save location or status cannot be determined.

Caution

Great care must be taken to assure that a valid stack frame and execution environment result; otherwise, execution may become unpredictable.

5.8.4. Supplemental Invocation Context Access Routines

The routines described in this section can be used to perform some of the more common operations involving invocation contexts.

5.8.4.1. LIB$X86_GET_GR

Given an invocation context block and a general-purpose register index such that 0 <= index < 16, copies the register value to gr_copy. For example, index 4 fetches the invocation context block IREG[4] value, which represents the contents of %rsi for the context.

LIB$X86_GET_GR fails if the index represents a scratch register whose contents have not been realized.

LIB$X86_GET_GR (invo_context, index, gr_copy)

Argument

OpenVMS Usage

Type

Access

Mechanism

invo_context

invo_context_blk

structure

read

by reference

index

index

longword

read

by value

gr_copy

integer value

quadword

write

by reference

Arguments:

invo_context

Address of a valid invocation context block.

index

Index into the IREG array of the invocation context block.

gr_copy

Address of a quadword to receive the value from the invocation context block.

Function Value Returned:

status

A value of 1 indicates success. A value of 0 indicates failure.

5.8.4.2. LIB$X86_SET_GR

Given an invocation context block, a general-purpose register index such that 1 <= index < 16, and a quadword value gr_copy, writes the corresponding invocation context block general register and uses LIB$X86_PUT_INVO_REGISTERS to write to the actual context. The invocation context block remains unchanged if the routine fails.

LIB$X86_SET_GR fails if LIB$X86_PUT_INVO_REGISTERS fails.
LIB$X86_SET_GR (invo_context, index, gr_copy)

Argument

OpenVMS Usage

Type

Access

Mechanism

invo_context

invo_context_blk

structure

modify

by reference

index

index

longword

read

by value

gr_copy

integer value

quadword

read

by reference

Arguments:

invo_context

Address of a valid invocation context block.

index

Index into the IREG array of the invocation context block.

gr_copy

Address of a quadword that contains the value to be written to the invocation context block.

5.8.4.3. LIB$X86_GET_XMM

Given an invocation context block and a register index that is 0 <= index < 16 for SSE (Streaming SIMD Extensions) or 0 <= index < 32 for AVX-512 (512-bit Advanced Vector Extensions), copy the register value to xmm_copy. For example, an index value of 4 fetches the value, which represents the contents of xmm4.

LIB$X86_GET_XMM returns failure status if there is no corresponding XSAVE area in the invo_context or if the index represents a register or register set not saved in the XSAVE area.

LIB$X86_GET_XMM (invo_context, index, xmm_copy)

Argument

OpenVMS Usage

Type

Access

Mechanism

invo_context

invo_context_blk

structure

read

by reference

index

index

longword

read

by value

xmm_copy

register contents

16 bytes

write

by reference

Arguments:

invo_context

Address of a valid invocation context block.

index

Index into the virtual array of XMM registers constructed from the XSAVE area. The XSAVE area is pointed to from the invocation context block.

Note

In case of CPUs implementing the AVX-512 or AVX10 Advanced Vector Extensions, the additional XMM/YMM registers are part of the ZMM registers. For more information on Advanced Vector Extensions, refer to the official documentation on the Intel website.

xmm_copy

Address of a 16-byte buffer to receive the contents of the specified register.

Function Value Returned:

status

A value of 1 indicates success. A value of 0 indicates failure.

5.8.4.4. LIB$X86_SET_XMM

Given an invocation context block, a register index that is 0 <= index < 16 for SSE (Streaming SIMD Extensions) or 0 <= index < 32 for AVX-512 (512-bit Advanced Vector Extensions), and a register value in xmm_copy, writes the corresponding entry in the XSAVE area pointed to from the invocation context block, and calls LIB$X86_PUT_INVO_REGISTERS to write the actual context. The XSAVE area remains unchanged if the routine fails.

LIB$X86_SET_XMM fails if LIB$X86_PUT_INVO_REGISTERS fails.

LIB$X86_SET_XMM (invo_context, index, xmm_copy)

Argument

OpenVMS Usage

Type

Access

Mechanism

invo_context

invo_context_blk

structure

modify

by reference

index

index

longword

read

by value

xmm_copy

register contents

16 bytes

read

by reference

Arguments:

invo_context

Address of a valid invocation context block.

index

Index into the virtual array of XMM registers constructed from the XSAVE area. The XSAVE area is pointed to from the invocation context block.

Note

In case of CPUs implementing the AVX-512 or AVX10 Advanced Vector Extensions, the additional XMM/YMM registers are part of the ZMM registers. For more information on Advanced Vector Extensions, refer to the official documentation on the Intel website.

xmm_copy

Address of a 16-byte buffer that contains the value to be written to the invocation context.

Function Value Returned:

status

A value of 1 indicates success. A value of 0 indicates failure.

5.8.4.5. LIB$X86_GET_YMM

Given an invocation context block and a register index that is 0 <= index < 16 for AVX (Advanced Vector Extensions) or 0 <= index < 32 for AVX-512 (512-bit Advanced Vector Extensions), copy the register value to ymm_copy. For example, an index value of 4 fetches the value, which represents the contents of ymm4.

LIB$X86_GET_YMM returns failure status if there is no corresponding XSAVE area in the invo_context or if the index represents a register or register set not saved in the XSAVE area.

LIB$X86_GET_YMM (invo_context, index, ymm_copy)

Argument

OpenVMS Usage

Type

Access

Mechanism

invo_context

invo_context_blk

structure

read

by reference

index

index

longword

read

by value

ymm_copy

register contents

32 bytes

write

by reference

Arguments:

invo_context

Address of a valid invocation context block.

index

Index into the virtual array of YMM registers constructed from the XSAVE area. The XSAVE area is pointed to from the invocation context block.

Note

In case of CPUs implementing the AVX-512 or AVX10 Advanced Vector Extensions, the additional XMM/YMM registers are part of the ZMM registers. For more information on Advanced Vector Extensions, refer to the official documentation on the Intel website.

ymm_copy

Address of a 32-byte buffer to receive the contents of the specified register.

Function Value Returned:

status

A value of 1 indicates success. A value of 0 indicates failure.

5.8.4.6. LIB$X86_SET_YMM

Given an invocation context block, a register index that is 0 <= index < 16 for AVX (Advanced Vector Extensions) or 0 <= index < 32 for AVX-512 (512-bit Advanced Vector Extensions), and a register value in ymm_copy, writes the corresponding entry in the XSAVE area pointed to from the invocation context block, and calls LIB$X86_PUT_INVO_REGISTERS to write the actual context. The XSAVE area remains unchanged if the routine fails.

LIB$X86_SET_YMM fails if LIB$X86_PUT_INVO_REGISTERS fails.

LIB$X86_SET_YMM (invo_context, index, ymm_copy)

Argument

OpenVMS Usage

Type

Access

Mechanism

invo_context

invo_context_blk

structure

modify

by reference

index

index

longword

read

by value

ymm_copy

register contents

32 bytes

read

by reference

Arguments:

invo_context

Address of a valid invocation context block.

index

Index into the virtual array of YMM registers constructed from the XSAVE area. The XSAVE area is pointed to from the invocation context block.

Note

In case of CPUs implementing the AVX-512 or AVX10 Advanced Vector Extensions, the additional XMM/YMM registers are part of the ZMM registers. For more information on Advanced Vector Extensions, refer to the official documentation on the Intel website.

ymm_copy

Address of a 32-byte buffer that contains the value to be written to the invocation context.

Function Value Returned:

status

A value of 1 indicates success. A value of 0 indicates failure.

5.8.4.7. LIB$X86_GET_ZMM

Given an invocation context block and a register index that is 0 <= index < 32 for AVX-512 (512-bit Advanced Vector Extensions), copy the register value to zmm_copy. For example, an index value of 4 fetches the value, which represents the contents of zmm4.

LIB$X86_GET_ZMM returns failure status if there is no corresponding XSAVE area in the invo_context or if the index represents a register or register set not saved in the XSAVE area.

LIB$X86_GET_ZMM (invo_context, index, zmm_copy)

Argument

OpenVMS Usage

Type

Access

Mechanism

invo_context

invo_context_blk

structure

read

by reference

index

index

longword

read

by value

zmm_copy

register contents

64 bytes

write

by reference

Arguments:

invo_context

Address of a valid invocation context block.

index

Index into the virtual array of ZMM registers constructed from the XSAVE area. The XSAVE area is pointed to from the invocation context block.

zmm_copy

Address of a 64-byte buffer to receive the contents of the specified register.

Function Value Returned:

status

A value of 1 indicates success. A value of 0 indicates failure.

5.8.4.8. LIB$X86_SET_ZMM

Given an invocation context block, a register index that is 0 <= index < 32 for AVX-512 (512-bit Advanced Vector Extensions), and a register value in zmm_copy, writes the corresponding entry in the XSAVE area pointed to from the invocation context block, and calls LIB$X86_PUT_INVO_REGISTERS to write the actual context. The XSAVE area remains unchanged if the routine fails.

LIB$X86_SET_ZMM fails if LIB$X86_PUT_INVO_REGISTERS fails.

LIB$X86_SET_ZMM (invo_context, index, zmm_copy)

Argument

OpenVMS Usage

Type

Access

Mechanism

invo_context

invo_context_blk

structure

modify

by reference

index

index

longword

read

by value

zmm_copy

register contents

64 bytes

read

by reference

Arguments:

invo_context

Address of a valid invocation context block.

index

Index into the virtual array of ZMM registers constructed from the XSAVE area. The XSAVE area is pointed to from the invocation context block.

zmm_copy

Address of a 64-byte buffer that contains the value to be written to the invocation context.

Function Value Returned:

status

A value of 1 indicates success. A value of 0 indicates failure.
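The buffer sizes and index ranges stated for the XMM, YMM, and ZMM access routines can be summarized in a small caller-side pre-check, sketched here. The names vec_kind, vec_copy_size, and vec_index_in_range are hypothetical, and has_avx512 is a caller-supplied assumption; the real routines decide success from the contents of the XSAVE area, which this sketch does not inspect.

```c
#include <assert.h>
#include <stddef.h>

enum vec_kind { VEC_XMM, VEC_YMM, VEC_ZMM };

/* Size in bytes of the buffer that the xmm_copy, ymm_copy, or
 * zmm_copy argument must point to. */
static size_t vec_copy_size(enum vec_kind kind)
{
    switch (kind) {
    case VEC_XMM: return 16;
    case VEC_YMM: return 32;
    case VEC_ZMM: return 64;
    }
    return 0;
}

/* Returns 1 if index is within the documented range for the register
 * kind, 0 otherwise.  ZMM registers are treated as unavailable when
 * has_avx512 is zero. */
static int vec_index_in_range(enum vec_kind kind, int index, int has_avx512)
{
    int limit;
    if (kind == VEC_ZMM)
        limit = has_avx512 ? 32 : 0;
    else
        limit = has_avx512 ? 32 : 16;
    return index >= 0 && index < limit;
}
```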

5.8.4.9. LIB$X86_SET_IP

Given an invocation context block and a quadword IP value in ip_copy, write the ip_copy value to the invocation context block IP and then use LIB$X86_PUT_INVO_REGISTERS to write to the actual context. The invocation context block remains unchanged if the routine fails.

LIB$X86_SET_IP fails if LIB$X86_PUT_INVO_REGISTERS fails.

LIB$X86_SET_IP (invo_context, ip_copy)

Argument

OpenVMS Usage

Type

Access

Mechanism

invo_context

invo_context_blk

structure

modify

by reference

ip_copy

integer value

quadword

read

by reference

Arguments:

invo_context

Address of a valid invocation context block.

ip_copy

Address of a quadword that contains the IP value to be written to the invocation context block.

Function Value Returned:

status

A value of 1 indicates success. A value of 0 indicates failure.

5.8.4.10. LIB$X86_GET_UNWIND_LSDA

Given an ip_value, find the address of the unwind information block language specific data area (LSDA), and write it to unwind_lsda_p. If not present, then write 0 to unwind_lsda_p.

LIB$X86_GET_UNWIND_LSDA (ip_value, unwind_lsda_p)

Argument

OpenVMS Usage

Type

Access

Mechanism

ip_value

IP value

quadword

read

by reference

unwind_lsda_p

address

quadword

write

by reference

Arguments:

ip_value

Address of a location that contains the IP value. ip_value is used to find the unwind information and language-specific data area address.

unwind_lsda_p

Address of a quadword to receive the address of the language-specific data area, if there is one.

Function Value Returned:

status

A value of 1 indicates success. A value of 0 indicates failure.

5.8.4.11. LIB$X86_GET_UNWIND_OSSD

Given an ip_value, find the address of the unwind information block operating system-specific data area, if present, and write it to unwind_ossd_p. If not present, then write 0 to unwind_ossd_p.
LIB$X86_GET_UNWIND_OSSD (ip_value, unwind_ossd_p)

Argument

OpenVMS Usage

Type

Access

Mechanism

ip_value

IP value

quadword

read

by reference

unwind_ossd_p

address

quadword

write

by reference

Arguments:

ip_value

Address of a location that contains the IP value. ip_value is used to find the unwind information block and the unwind information block operating system-specific data area address.

unwind_ossd_p

Address of a quadword to receive the address of the operating system-specific data area.

Note that the OSSD value is contained in the FDE unwind information (see Section B.3.2.3) and is therefore not writable.

Function Value Returned:

status

A value of 1 indicates success. A value of 0 indicates failure.

5.8.4.12. LIB$X86_GET_UNWIND_HANDLER_PV

Given an ip_value, find the procedure value for the condition handler, if present, and write it to handler_pv. If not present, then write 0 to handler_pv.

LIB$X86_GET_UNWIND_HANDLER_PV (ip_value, handler_pv)

Argument

OpenVMS Usage

Type

Access

Mechanism

ip_value

IP value

quadword

read

by reference

handler_pv

address

quadword

write

by reference

Arguments:

ip_value

Address of a location that contains the IP value. ip_value is used to find the unwind information and the unwind condition handler pointer.

handler_pv

Address of a quadword to receive the procedure value for the condition handler, if there is one.

Function Value Returned:

status

A value of 1 indicates success. A value of 0 indicates failure.

5.8.4.13. LIB$X86_IS_EXC_DISPATCH_FRAME

Used to determine whether a given IP value represents an exception dispatch frame.

LIB$X86_IS_EXC_DISPATCH_FRAME (ip_value)

Argument

OpenVMS Usage

Type

Access

Mechanism

ip_value

IP value

quadword

read

by reference

Arguments:

ip_value

Address of a quadword that contains the IP value. The ip_value is used to find the operating system-specific data area in the unwind information for this routine.

Function Value Returned:

status

Returns 1 if the operating system-specific data area is present and the EXCEPTION_FRAME flag is set.

Returns 0 if the operating system-specific data area is present and the EXCEPTION_FRAME flag is clear.

Returns 0 if the operating system-specific data area is not present.
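The three return cases can be restated as a small decision function. The sketch below is a mock: the name mock_is_exc_dispatch_frame, the use of a NULL pointer to mean that no operating system-specific data area is present, and the bit position of the flag are all assumptions made for illustration. The real routine takes the address of an IP value and locates the data area itself.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative bit position only; the real EXCEPTION_FRAME flag is
 * defined by the OSSD layout, which this sketch does not reproduce. */
#define MOCK_OSSD_EXCEPTION_FRAME UINT64_C(0x1)

/* ossd_flags is NULL when no operating system-specific data area is
 * present for the IP value. */
static int mock_is_exc_dispatch_frame(const uint64_t *ossd_flags)
{
    if (ossd_flags == NULL)
        return 0;                                   /* area not present */
    return (*ossd_flags & MOCK_OSSD_EXCEPTION_FRAME) ? 1 : 0;
}
```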

5.8.4.14. LIB$X86_IS_AST_DISPATCH_FRAME

Used to determine whether a given IP value represents an AST dispatch frame.

LIB$X86_IS_AST_DISPATCH_FRAME (ip_value)

Argument

OpenVMS Usage

Type

Access

Mechanism

ip_value

IP value

quadword

read

by reference

Arguments:

ip_value

Address of a quadword that contains the IP value. The ip_value is used to find the operating system-specific data area in the unwind information block for this routine.

Function Value Returned:

status

Returns 1 if the operating system-specific data area is present and the AST_FRAME flag is set.

Returns 0 if the operating system-specific data area is present and the AST_FRAME flag is clear.

Returns 0 if the operating system-specific data area is not present.

5.8.5. Invocation Context Callback Routines

Advanced users can override the way the call stack is traced by providing custom callback routines. These routines can be used to perform the following functions:
  • Perform a call trace on a process other than the current process.

  • Override the heap storage mechanism used to allocate memory used during the analysis of unwind descriptors.

The user override callback mechanism provides a user ident value that is passed to each callback routine. The user ident value is stored in the LIBICB$IH_UO_IDENT field of the invocation context block.

The routines described in this section must be provided to override the call stack walk.

Note

The callback routines cannot be used with the following routines, which are not passed a context block:
  • LIB$X86_GET_CURR_INVO_HANDLE

  • LIB$X86_GET_PREV_INVO_HANDLE

5.8.5.1. The Get Unwind Information Routine

Place a procedure value for this routine in the LIBICB$PH_UO_GETUEINFO field of the invocation context block.

int (* getueinfo) (uint64 ip, void *get_ue_block, void *name, ...);

This routine should mimic SYS$GET_UNWIND_ENTRY_INFO for the target process. See Section B.5 for detailed argument descriptions and return status, with the following notes:

The name argument is not used, and can be ignored. If a read memory callback has been specified, the contents of LIBICB$PH_UO_READ_MEM are passed as a fourth argument, and the contents of LIBICB$IH_UO_IDENT are passed as a fifth argument; otherwise, the routine is called with three arguments.

5.8.5.2. The Get Initial Context Routine

Place a function pointer for this routine in the LIBICB$PH_UO_GETCONTEXT field of the invocation context block.

The get initial context routine is used to seed the invocation context block from the target process. This routine should initialize the invocation context block structure with the preserved registers, as well as applicable control and status registers, from the target process. This callback routine is used by LIB$X86_GET_CURR_INVO_CONTEXT and should be followed by at least one call to LIB$X86_GET_PREV_INVO_CONTEXT to generate a working context.

int (* getcontext) (void *invo_context, uint64 ident);

Argument

OpenVMS Usage

Type

Access

Mechanism

invo_context

invo_context_blk

structure

modify

by reference

ident

user_value

quadword

read

by value

Arguments:

invo_context

The address of the invocation context block.

ident

Specifies a user ident value from the invocation context block.

Function Value Returned:

status

A value of 1 indicates success. A value of 0 indicates failure.

5.8.5.3. The Read Memory Routine

Place a function pointer for this routine in the LIBICB$PH_UO_READ_MEM field of the invocation context block.

The read memory routine is used to transfer data from the target process.

int (* read_mem) (void *dst, uint64 src, size_t length, uint64 ident);

Argument

OpenVMS Usage

Type

Access

Mechanism

dst

memory_access

byte_array

write

by reference

src

memory_address

quadword

read

by value

length

size_t

longword

read

by value

ident

user_value

quadword

read

by value

Arguments:

dst

A local memory address and the destination for the read operation.

src

An address in the target process to be read.

length

The length in bytes to be read.

ident

Specifies a user ident value from the invocation context block.

Function Value Returned:

status

A value of 1 indicates success. A value of 0 indicates failure.
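A read memory callback with this signature might look like the following sketch, which serves reads from a local snapshot of the target's memory. The target_image structure, the function name image_read_mem, and the convention of passing a pointer to the snapshot through ident are assumptions made for this example; uint64_t stands in for the uint64 type used in the prototype above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical snapshot of part of the target's address space. */
typedef struct {
    uint64_t base;              /* target address of data[0] */
    size_t size;                /* number of valid bytes in data */
    const unsigned char *data;  /* local copy of the target bytes */
} target_image;

/* Read memory callback: copies length bytes from target address src
 * into the local buffer dst.  ident carries a pointer to a
 * target_image.  Returns 1 on success, 0 if the requested range is
 * not covered by the snapshot. */
static int image_read_mem(void *dst, uint64_t src, size_t length,
                          uint64_t ident)
{
    const target_image *img = (const target_image *)(uintptr_t)ident;
    if (img == NULL || src < img->base)
        return 0;
    uint64_t offset = src - img->base;
    if (offset > img->size || length > img->size - offset)
        return 0;
    memcpy(dst, img->data + offset, length);
    return 1;
}
```

A debugger would more likely forward the request to whatever mechanism it uses to reach the target process; the bounds check is the part that carries over.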

5.8.5.4. The Write Memory Routine

Place a procedure value for this routine in the LIBICB$PH_UO_WRITE_MEM field of the invocation context block.

The write memory routine is used to transfer data to the target process. It is used by LIB$X86_PUT_INVO_REGISTERS for a register that has been saved in memory.

int (* write_mem) (void *src, uint64 dst, size_t length, uint64 ident);

Argument

OpenVMS Usage

Type

Access

Mechanism

src

memory_access

byte_array

read

by reference

dst

memory_address

quadword

write

by value

length

size_t

longword

read

by value

ident

user_value

quadword

read

by value

Arguments:

src

A local memory address and the source for the write operation.

dst

An address in the target process to be written.

length

The length in bytes to be written.

ident

Specifies a user ident value from the invocation context block.

Function Value Returned:

status

A value of 1 indicates success. A value of 0 indicates failure.
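The write direction can be sketched in the same spirit: the callback below copies from a local source buffer into a writable local stand-in for target memory. The target_buffer structure, the name buffer_write_mem, and the use of ident to carry a pointer are hypothetical choices for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical writable stand-in for a region of target memory. */
typedef struct {
    uint64_t base;        /* target address of data[0] */
    size_t size;          /* number of bytes in data */
    unsigned char *data;  /* local storage backing the region */
} target_buffer;

/* Write memory callback: copies length bytes from the local buffer
 * src to target address dst.  Returns 1 on success, 0 if the range
 * falls outside the region. */
static int buffer_write_mem(void *src, uint64_t dst, size_t length,
                            uint64_t ident)
{
    target_buffer *buf = (target_buffer *)(uintptr_t)ident;
    if (buf == NULL || dst < buf->base)
        return 0;
    uint64_t offset = dst - buf->base;
    if (offset > buf->size || length > buf->size - offset)
        return 0;
    memcpy(buf->data + offset, src, length);
    return 1;
}
```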

5.8.5.5. The Write Register Routine

Place a procedure value for this routine in the LIBICB$PH_UO_WRITE_REG field of the invocation context block.

The write register routine is used to write a register in the target process. It is used by LIB$X86_PUT_INVO_REGISTERS for a register that has not been saved in memory.

This routine is optional, and an implementation may support only a subset of registers. LIB$X86_PUT_INVO_REGISTERS returns an error if this routine is not present or is unable to write the desired register.

int (* write_reg)
    (int whichReg, uint64 value_1, uint64 value_2, uint64 ident);

Argument

OpenVMS Usage

Type

Access

Mechanism

whichReg

enumeration

longword

read

by value

value_p

address

quadword

read

by value

ident

user_value

quadword

read

by value

Arguments:

whichReg

Indicates the register to be written (see enum in libicb.h).

value_p

Specifies the address of the register contents to be written. The number of bytes written is determined by the size of the register.

ident

Specifies a user ident value from the invocation context block.

Function Value Returned:

status

A value of 1 indicates success. A value of 0 indicates failure.

5.8.5.6. The Memory Allocation Routine

The memory allocation routine is used to allocate heap storage required during the analysis of unwind descriptors. This routine should mimic the behavior of the C RTL routine malloc.

void * (* malloc) (size_t size, uint64 ident);

Argument   OpenVMS Usage   Type       Access   Mechanism
length     size_t          longword   read     by value
ident      user_value      quadword   read     by value

Arguments:

length

The length in bytes of memory to be allocated. The returned memory block should be aligned on a 16-byte boundary.

ident

Specifies a user ident value from the invocation context block.

Function Value Returned:

ptr

Address of the memory block allocated, or 0 for failure.

In the case where local memory is being read, that is, you have not overridden the read memory routines, the malloc requests are reduced to:
  • One Unwind Context block of size LIBICB$K_CONTEXT_BLK_SIZE

  • One Unwind Descriptor block of size LIBICB$K_DESCRIPTOR_BLK_SIZE

  • Several Unwind region blocks of size LIBICB$K_REGION_BLK_SIZE

  • Several Unwind region label blocks of size LIBICB$K_REGIONLABEL_BLK_SIZE

How many blocks of the last two kinds are required depends on the complexity of the unwind descriptors for the procedure being traced.

5.8.5.7. The Memory Deallocation Routine

The memory deallocation routine is used to free heap storage allocated by the memory allocation routine (see Section 5.8.5.6). This routine should mimic the behavior of the C RTL routine free.

void (* free) (void * ptr, uint64 ident);

Argument   OpenVMS Usage   Type       Access   Mechanism
ptr        address         quadword   read     by value
ident      user_value      quadword   read     by value

Arguments:

ptr

Address of a memory block previously allocated by a call to the user malloc routine.

ident

Specifies a user ident value from the invocation context block.

Function Value Returned:

None.
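
The allocation and deallocation routines described in Section 5.8.5.6 and Section 5.8.5.7 could be sketched as a matched pair. The names user_malloc and user_free and the uint64 typedef are hypothetical, and the sketch assumes a C11 library that provides aligned_alloc; an implementation without it would need another way to obtain 16-byte aligned storage.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Assumption: stand-in for the uint64 type used in the prototypes above. */
typedef uint64_t uint64;

/* Hypothetical allocation routine: returns a 16-byte aligned block, or 0 on failure. */
void *user_malloc(size_t size, uint64 ident)
{
    (void)ident;                                  /* not needed in this sketch */

    if (size == 0 || size > SIZE_MAX - 15)
        return NULL;                              /* reject empty or overflowing requests */

    /* aligned_alloc expects the size to be a multiple of the alignment,
     * so round the request up to the next multiple of 16. */
    size_t rounded = (size + 15) & ~(size_t)15;
    return aligned_alloc(16, rounded);
}

/* Hypothetical deallocation routine: releases a block obtained from user_malloc. */
void user_free(void *ptr, uint64 ident)
{
    (void)ident;
    free(ptr);                                    /* free accepts a null pointer */
}
```

Pairing the two routines matters: storage returned by one allocator must be released by the matching deallocator.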

5.9. Data Alignment and Layout

On x86-64 hardware, a memory reference to data that is not naturally aligned does not result in alignment faults. However, natural alignment is nonetheless generally more efficient and recommended on OpenVMS x86-64.

In addition, common blocks, dynamically allocated (heap) regions (for example, from malloc), and global data items greater than 8 bytes should be aligned on a 16-byte boundary.

5.9.1. Scalars

For scalar data, natural alignment is achieved as shown in Table 5.22.
Table 5.22. Natural Alignment Recommendations

Data Type                         Alignment Starting Position
8-bit character string            Byte boundary
16-bit integer                    Address that is a multiple of 2 (word alignment)
32-bit integer                    Address that is a multiple of 4 (longword alignment)
64-bit integer                    Address that is a multiple of 8 (quadword alignment)
F_floating, F_floating complex    Address that is a multiple of 4 (longword)
D_floating, D_floating complex    Address that is a multiple of 8 (quadword)
G_floating, G_floating complex    Address that is a multiple of 8 (quadword)
S_floating, S_floating complex    Address that is a multiple of 4 (longword)
T_floating, T_floating complex    Address that is a multiple of 8 (quadword)
X_floating, X_floating complex    Address that is a multiple of 16 (octaword)

For aggregates such as strings, arrays, and records, the data type to be considered for purposes of alignment is not the aggregate itself, but rather the elements of which the aggregate is composed. The alignment requirement of an aggregate is that all elements of the aggregate be naturally aligned. For example, varying 8-bit character strings must start at addresses that are a multiple of at least 2 (word alignment) because of the 16-bit count at the beginning of the string; 32-bit integer arrays start at a longword boundary, irrespective of the extent of the array.

However, some languages allow definition of aggregate types with an alignment that is greater than that of any of its components, or provide predefined types with such an alignment (for example, the __m128, __m256, and __m512 types in C/C++ for x86-64). The alignment of such types becomes the natural alignment for elements of those types when included in a containing aggregate.

The rules for passing a record in an argument that is passed by immediate value (see Section 5.7) always provide quadword alignment of the record value independent of the normal alignment requirement of the record. If deemed appropriate by an implementation, normal alignment can be established within the called procedure by making a copy of the record argument at a suitably aligned location.
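
The alignment rules in this section reduce to two small checks, sketched below under stated assumptions. The function names are hypothetical, and alignments are assumed to be powers of 2, as they are for every entry in Table 5.22.

```c
#include <stddef.h>
#include <stdint.h>

/* Nonzero when addr is a multiple of alignment (alignment must be a power of 2). */
int is_naturally_aligned(uint64_t addr, size_t alignment)
{
    return (addr & (uint64_t)(alignment - 1)) == 0;
}

/* Alignment of an aggregate: the largest natural alignment among its elements. */
size_t aggregate_alignment(const size_t *element_alignments, size_t count)
{
    size_t max = 1;                          /* byte alignment is the minimum */
    for (size_t i = 0; i < count; i++) {
        if (element_alignments[i] > max)
            max = element_alignments[i];
    }
    return max;
}
```

For a varying 8-bit character string, the element alignments are 2 (the 16-bit count) and 1 (the characters), so aggregate_alignment yields 2, which matches the word alignment stated above.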

5.9.2. Record Layout Conventions

The OpenVMS x86-64 calling standard rules for record layout are designed to provide good run-time performance on all implementations of the x86-64 architecture and to provide the required level of compatibility with conventional VAX, Alpha, and I64 operating environments.

Therefore, this standard defines the following record layout conventions:
  • Those designed for optimal access characteristics (referred to as aligned record layouts)

  • Those compatible with conventions that are traditionally used by VAX languages (referred to as VAX compatible record layouts)

Only these record layouts may be used across standard interfaces or between languages. Languages can support other language-specific record layout conventions, but such layouts are nonstandard.

The aligned record layout conventions should be used unless interchange is required with conventional VAX applications that use the OpenVMS VAX compatible record layouts.

5.9.2.1. Aligned Record Layout

The aligned record layout conventions ensure that:
  • All components of a record or subrecord are naturally aligned.

  • Layout and alignment of record elements and subrecords are independent of any record or subrecord in which they are embedded.

  • Layout and alignment of a subrecord is the same as if it were a top-level record.

  • Declaration in high-level languages of standard records for interlanguage use is straightforward and obvious, and meets the requirements for source-level compatibility between OpenVMS x86-64 languages and OpenVMS I64, Alpha, and VAX languages.

The aligned record layout is defined by the following conventions:
  • The components of a record must be laid out in memory corresponding to the lexical order of their appearance in the high-level language declaration of the record.

  • The first bit of a record or subrecord must be directly addressable (byte aligned).

  • Records and subrecords must be aligned according to the largest natural alignment requirements of the contained elements and subrecords.

  • Bit fields (packed subranges of integers) are characterized by an underlying integer type that is a byte, word, longword, or quadword in size together with an allocation size in bits. A bit field is allocated at the next available bit boundary, provided that the resulting allocation does not cross an alignment boundary of the underlying type. Otherwise, the field is allocated at the next byte boundary that is aligned as required for the underlying type. (In the latter case, the space skipped over is left permanently unallocated.) In addition, if necessary, the alignment of the record as a whole is increased to that of the underlying integer type.

  • Unaligned bit strings, unaligned bit arrays, and elements of unaligned bit arrays must start at the next available bit in the record. No fill is ever supplied preceding an unaligned bit string, unaligned bit array, or unaligned bit array element.

  • All other components of a record must start at the next available naturally aligned address for the data type.

  • The length of a record must be a multiple of its alignment. (This includes the case when a record is a component of another record.)

  • Strings and arrays must be aligned according to the natural alignment requirements of the data type of which the string or array is composed.

  • The length of an array element is a multiple of its alignment, even if this leaves unused space at its end. The length of the whole array is the sum of the lengths of its elements.
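
The conventions above for components other than bit fields can be sketched as a small layout calculation. This is an illustration only: struct component, align_up, and layout_aligned are hypothetical names, and the comparison with struct sample assumes a C compiler that uses natural alignment for its default structure layout, as common x86-64 C compilers do.

```c
#include <stddef.h>
#include <stdint.h>

/* Round offset up to the next multiple of alignment (a power of 2). */
size_t align_up(size_t offset, size_t alignment)
{
    return (offset + alignment - 1) & ~(alignment - 1);
}

/* Hypothetical description of one record component that is not a bit field. */
struct component {
    size_t size;        /* length in bytes */
    size_t alignment;   /* natural alignment in bytes */
};

/* Assign offsets in declaration order and return the record length.
 * The record alignment is the largest alignment of any component, and
 * the length is padded to a multiple of that alignment. */
size_t layout_aligned(const struct component *c, size_t n,
                      size_t *offsets, size_t *record_alignment)
{
    size_t pos = 0;
    size_t max_align = 1;

    for (size_t i = 0; i < n; i++) {
        pos = align_up(pos, c[i].alignment);   /* next naturally aligned address */
        offsets[i] = pos;
        pos += c[i].size;
        if (c[i].alignment > max_align)
            max_align = c[i].alignment;
    }
    *record_alignment = max_align;
    return align_up(pos, max_align);
}

/* The same record declared in C: a byte, a quadword, and a word. */
struct sample {
    char    c;
    int64_t q;
    int16_t w;
};
```

For components of size 1, 8, and 2 bytes, the calculation places them at offsets 0, 8, and 16 and pads the record to 24 bytes, so the record alignment is 8.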

5.9.2.2. OpenVMS VAX Compatible Record Layout

The OpenVMS VAX compatible record layout is defined by the following conventions:

  • The components of a record must be laid out in memory corresponding to the lexical order of their appearance in the high-level language declaration of the record.

  • Unaligned bit strings, unaligned bit arrays, and elements of unaligned bit arrays must start at the next available bit in the record. No fill is ever supplied preceding an unaligned bit string, unaligned bit array, or unaligned bit array element.

  • All other components of a record must start at the next available byte in the record. Any unused bits following the last-used bit in the last-used byte of each component must be filled out to the next byte boundary so that any following data starts on a byte boundary.

  • Subrecords must be aligned according to the largest alignment of the contained elements and subrecords. A subrecord always starts at the next available byte unless it consists entirely of unaligned bit data and it immediately follows an unaligned bit string, unaligned bit array, or a subrecord consisting entirely of unaligned bit data.

  • Records must be aligned on byte boundaries.
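
For components that are not bit data, the VAX compatible conventions amount to byte packing, which the following sketch illustrates. The name layout_packed is hypothetical, and #pragma pack is used here only as a widely available compiler extension (GCC, Clang, and others) that produces a byte-packed structure for comparison; it is not presented as the mechanism an OpenVMS compiler uses.

```c
#include <stddef.h>
#include <stdint.h>

/* Assign offsets in declaration order with no fill between components.
 * Each component starts at the next available byte; returns the record length. */
size_t layout_packed(const size_t *sizes, size_t n, size_t *offsets)
{
    size_t pos = 0;
    for (size_t i = 0; i < n; i++) {
        offsets[i] = pos;
        pos += sizes[i];
    }
    return pos;                 /* records are byte aligned, so no tail padding */
}

/* A byte-packed C record for comparison (compiler extension). */
#pragma pack(push, 1)
struct packed_sample {
    char    c;
    int64_t q;
    int16_t w;
};
#pragma pack(pop)
```

For components of size 1, 8, and 2 bytes, the offsets are 0, 1, and 9 and the record length is 11 bytes, compared with 24 bytes under the aligned record layout.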

5.10. Addressing

Industry standard conventions for x86-64 Position Independent Code (PIC) generally make use of a Global Offset Table (GOT) to facilitate addressing code and data that is not known or assured to be within a 32-bit offset of the reference. The GOT is itself a data segment that is assured to be “near” the code so that PC-relative addressing with a 32-bit offset is sufficient to access that GOT. The GOT holds 64-bit addresses that allow access to any location in the system 64-bit address space.

5.10.1. Memory Models

Almost all x86-64 memory instructions have the size of the displacement field limited to 32 bits. This means that a single instruction can directly address only ±2 GB of memory. This limitation gives rise to three memory models:
  • The small code model: all code and data are within 2 GB.

  • The large code model: code and data are not limited to being within 2 GB.

  • The medium code model: code and data are assumed to be within 2 GB, except for data specifically marked as large model data, which may lie outside that range.

OpenVMS compilers generate small model position-independent code using indirect addressing of all data to allow static data to be farther than 2 GB away from code. Because direct addressing is used only for entries in the Global Offset Table, OpenVMS compilers do not distinguish between the small and medium memory models. In effect, OpenVMS compilers support the medium data model for applications.

Foreign compilers and object modules may use any memory model. The OpenVMS linker and image activator support all memory models.
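
The ±2 GB limit that separates these memory models comes from the signed 32-bit displacement. A minimal sketch of the reachability test follows; the name fits_disp32 is hypothetical, and the sketch assumes the displacement is measured from the address of the next instruction, as it is for x86-64 RIP-relative addressing.

```c
#include <stdint.h>

/* Nonzero when target can be reached from next_ip with a signed 32-bit
 * displacement. The subtraction is done on unsigned values so that it
 * wraps cleanly; the conversion to int64_t assumes two's complement. */
int fits_disp32(uint64_t next_ip, uint64_t target)
{
    int64_t disp = (int64_t)(target - next_ip);
    return disp >= INT32_MIN && disp <= INT32_MAX;
}
```

Data for which this test fails cannot be addressed directly by a single instruction and must be reached indirectly, for example through a 64-bit address held in the Global Offset Table.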

5.10.2. Inter-Segment Addressing

In industry standards for x86-64, shareable images may be loaded anywhere, but all segments within a shared library must have the same positions relative to each other that they were assigned by the linker. On OpenVMS x86-64, the image activator may map (logically load) segments of a shareable image independently of each other.

The independent loading of segments influences the way code addresses data. Industry standard x86-64 code uses PC-relative addressing to access not only the Global Offset Table, but also any other data that is known to be local to the image. Because segments may be mapped independently, this standard requires that code use indirect addressing to access all data except for the Global Offset Table. With this scheme, the code segment and the Global Offset Table (linkage) segment are the only segments whose relative positions have to be maintained.

In an image with multiple code segments, each code segment has its own Global Offset Table.

Non-VSI compilers and object modules may assume a small code model and use PC-relative data addressing exclusively. Both the linker and the image activator maintain the relative positions of code segments, Global Offset Tables, and other segments that are referenced in a PC-relative manner. In theory, the code could be adjusted with image relocations; in practice, the limited address range of the small code model (±2 GB) precludes this.

Chapter 6. Signature Information and Translated Images (Alpha and I64 Systems)

To support interoperation between images built from native OpenVMS Alpha code and images translated from OpenVMS VAX code, native Alpha compilers can optionally generate information that describes the parameters and result of a procedure. Similarly, for interoperation between images built from native OpenVMS I64 code and images translated from VAX or Alpha code, I64 compilers can also optionally generate information that describes the parameters and result of a procedure. This auxiliary information is called signature information.

Translated VAX code on Alpha and I64 systems uses VAX argument list and function return conventions as described in Section 2.4 and Section 2.5.

Translated Alpha code on I64 systems uses Alpha argument list and function return conventions as described in Chapter 3.

The following sections describe the conventions for using signature information to control the passing of arguments and returning a function value when a native procedure passes control to a translated procedure and vice versa.

The Translated Image Executive (TIE) is the user-mode support facility (itself a shareable image) that performs the following functions:
  • Mediates calls between native and translated code

  • Controls execution of translated code

  • Performs interpretation where necessary

6.1. Overview

OpenVMS compilers for Alpha and I64 provide a compilation option that causes signature information to be included in the resulting object file. To support interoperation between OpenVMS native and translated code, the native code must contain signature information.

With one exception related to indirect calls (see Section 6.1.1.3 and Section 6.1.2.3), code generation is not affected by the presence or absence of translated code support.

The operation of translated images on OpenVMS Alpha and I64 systems is very similar, though different in certain details.

6.1.1. Translated VAX Images on Alpha Systems

When a VAX image is translated to an Alpha image, the VAX registers R0 through R15 are represented using the lower half of the corresponding Alpha registers R0 through R15 at call interface boundaries. No type conversion is performed in making parameters from either native or translated code available to each other.

6.1.1.1. Direct Calls From Translated to Native Code

When the TIE encounters a call in translated code that passes control to native Alpha code, it obtains signature information for the target procedure using the PDSC$W_SIGNATURE_OFFSET field of the target procedure descriptor (see Section 3.4.1).

If the value in the PDSC$W_SIGNATURE_OFFSET field is zero, then no signature information is available, the call cannot be performed, and the TIE signals an error.

Otherwise, the TIE uses the signature information to create an appropriate Alpha argument list (in the integer registers and stack as appropriate), then calls the native procedure. When control returns, the TIE obtains the returned result (if any), makes it available to translated code, and resumes translated code execution.

6.1.1.2. Direct Calls From Native to Translated Code

Calls from native Alpha code to a routine in a translated image depend on special linker and image activator support. If the linker can confirm that the target of the call is also in native code (because the target is local to the same image), then the call is resolved normally. Otherwise, the linker passes the compiler-generated signature information for use by the image activator.

If the image activator can determine that the target of the call is also in native code, then the call is resolved normally. Otherwise, the image activator creates a bound procedure descriptor (see Section 3.6.4) and resolves the procedure value to that descriptor. This descriptor is set up to pass control to a special TIE entry point that obtains the target VAX procedure value and signature information from that same descriptor.

6.1.1.3. Indirect Calls From Native to Translated Code

If interoperation with translated images is not required, then an indirect call is made as described in Section 3.6.3. If interoperation with translated images must be considered, the procedure value (in R4 in the following example) might be the address of a VAX entry point or the address of an Alpha procedure descriptor.

A VAX entry point can be dynamically distinguished from an Alpha procedure descriptor by examining bits 12 and 13 of a VAX entry call mask, which are required to be 0 by the VAX architecture. For an Alpha procedure, bit 12 corresponds to the PDSC$V_NATIVE flag, which is required to be set in all Alpha procedure descriptors. Bit 13 corresponds to the PDSC$V_NO_JACKET flag, which is currently required to be set in all Alpha procedure descriptors but is reserved for enhancements to this standard.

If the procedure value is determined to correspond to an Alpha procedure, then the call can be completed as discussed. If the procedure value is determined to correspond to a VAX procedure, then the call must be completed using system TIE facilities that will effect the transition into and out of the code of the translated image.
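
The decision just described can be summarized in C as a rough sketch. The enumeration and function names are hypothetical, the flag names are spelled with underscores because the dollar sign is not portable in C identifiers, and the bit positions are the ones given in the text above.

```c
#include <stdint.h>

/* Bit positions as stated in the text above. */
#define PDSC_V_NATIVE     12
#define PDSC_V_NO_JACKET  13

/* Hypothetical classification of the possible outcomes. */
enum call_kind {
    CALL_DIRECT,                  /* Alpha procedure: call the entry address */
    CALL_TRANSLATED_JACKET,       /* VAX entry point: call through the TIE jacket */
    CALL_NATIVE_JACKET_RESERVED   /* native jacketing, reserved for future use */
};

/* Classify a procedure value from the first longword it addresses: either
 * the flags field of an Alpha procedure descriptor or a VAX entry call mask. */
enum call_kind classify_procedure_value(uint32_t first_longword)
{
    if ((first_longword >> PDSC_V_NO_JACKET) & 1)
        return CALL_DIRECT;               /* jacket flag set: no jacket needed */

    if (((first_longword >> PDSC_V_NATIVE) & 1) == 0)
        return CALL_TRANSLATED_JACKET;    /* both bits clear: VAX entry mask */

    return CALL_NATIVE_JACKET_RESERVED;
}
```

The order of the tests puts the common case first: for a native Alpha target, a single test of the jacket flag decides the call.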

Example 6.1 illustrates a code sequence for examining the procedure value.
Example 6.1. Code for Examining the Procedure Value
    LDL     R28,0(R4)             ;Load the flags field of the target PDSC 
    MOV     #AI_LITERAL,R25       ;Load Argument Information register 
    SRL     R28,#PDSC$V_NO_JACKET,R26;Position jacket flag 
    BLBC    R26,CALL_JACKET       ;If clear then jacket needed 
    LDQ     R26,8(R4)             ;Entry address to scratch register 
    MOV     R4,R27                ;Procedure value to R27 
    JSR     R26,(R26)             ;Call entry address. 
back_in_line: 
    ...                           ;Rest of procedure code goes here 

TRANSLATED:                       ;Generated out of line, R2 contains a 
    LDQ     R26,N_TO_T_LKP(R2)    ;Entry address to scratch register 
    LDQ     R27,N_TO_T_LKP+8(R2)  ;Load procedure value 
    MOV     R4,R23                ;Address of routine to call to R23 
    JSR     R26,(R26)             ;Call jacket routine 
    BR      back_in_line          ;Return to normal code path 

CALL_JACKET:                      ; 
    SRL     R28,#PDSC$V_NATIVE,R28;Jacketing for translated or native? 
    LDA     R24,PSIG_OUT(R2)      ;Pass address of our argument 
                                  ; signature information in R24 
    BLBC    R28,TRANSLATED        ;If clear, then translated jacketing 
    (Native Jacketing Reserved for Future Use) 
    BR      back_in_line          ;Return to normal code path
In Example 6.1, TIE jacketing functionality is provided by the SYS$NATIVE_TO_TRANSLATED routine. This system procedure is called with the actual arguments for the target procedure in their normal locations (as though the target procedure were an Alpha procedure) and with two additional, nonstandard arguments:
  • R23 contains the procedure value for the target VAX procedure.

  • R24 contains the address of a signature information block for the call, as described in Section 6.2. There are two special address values:
    • The value zero (null) indicates that no signature information is available. As a result, if the call is to a translated image, then the call will fail.

    • The value one indicates a default signature applies, based on information in the argument information register (see Section 6.2.5).

The conventions just described are normally accomplished using the special service routine OTS$CALL_PROC. The actual parameters to the target function are passed to OTS$CALL_PROC as though the target routine is native code that is being invoked directly. In addition, OTS$CALL_PROC receives two additional parameters in registers R23 and R24 as described above for SYS$NATIVE_TO_TRANSLATED.

6.1.2. Translated Images on I64 Systems

When a VAX or Alpha image is translated to an I64 image, the VAX or Alpha registers become associated with I64 registers for the purpose of making a call according to the following mapping:

VAX/Alpha Register   I64 Register
R0                   R8
R1                   R9

In the case of a VAX image, the lower half of the corresponding I64 register is used.

For example, at the time of a call from an Alpha to an I64 image, the contents of the Alpha R1 register become the initial contents of the I64 R9 register when native execution begins. Similarly, at the time of a call from an I64 image to a VAX image, the contents of the lower half of the I64 R8 register become the initial contents of the VAX R0 register.

For calls between a translated VAX and a translated Alpha image on I64 systems, the rules for calls between translated VAX and native Alpha images apply and make use of signature information in the translated Alpha image.

OpenVMS I64 implements a static mapping that:
  • Allows an address corresponding to a translated image to be identified

  • Specifies whether it is an Alpha or VAX translated image

However, the means for creating and accessing this mapping is not part of this calling standard.

It is not possible for dynamically generated non-native code to be reflected in this mapping. As a result, OpenVMS does not support translated images that dynamically generate non-native code and call the in-memory result.

6.1.2.1. Calls From Translated to Native I64 Code

When the TIE encounters a call in translated code that passes control to native I64 code, it obtains signature information for the target routine from the function descriptor for that routine.

If the value in the signature information field is zero, then no signature information is available, the call cannot be performed, and the TIE signals an exception.

Otherwise, the TIE uses the signature information to create an appropriate I64 argument list (in the stacked registers and memory stack as appropriate), then calls the target native function. When control returns, the TIE obtains the returned result (if any), makes it available to the translated code, and resumes translated code execution.

To assure that any routine that can potentially be called from translated code has either signature information or a zero indicating the lack of signature information, it is necessary that every official function descriptor be allocated with room for the signature information field.

6.1.2.2. Direct Calls From Native I64 Code to Translated Code

Calls from native I64 code to a routine in a translated image depend on special linker and image activator support. If the linker can confirm that the target of a call is also in native code (because the target is local to the same image), then the call is resolved normally. Otherwise, the linker creates an import stub and an associated local function descriptor in the linkage table in the normal way. However, in this case the local function descriptor must be a jacket function descriptor, as described in the following paragraphs.

The linker also passes through the compiler-generated signature information for use by the image activator. If the image activator can determine that the target of a call is also in native code, then the jacket function descriptor is initialized as for a simple function descriptor (the extra space in the jacket descriptor is unused). Otherwise, the image activator initializes the jacket function descriptor so that the call using that descriptor will transfer control into the TIE.

A jacket function descriptor is similar to a bound function descriptor (see Section 4.7.7) except that it initially transfers control to an entry point in the TIE. The TIE uses the signature information field together with other information in the descriptor to construct an appropriate parameter list for the translated code and effects the transfer of control into that code. When the call completes, control returns to the TIE, which sets up the return value for the native code and returns to normal execution.

A jacket function descriptor consists of the following fields:
  • Entry (code) address of the TIE entry point that handles transfers of control into translated code

  • Pseudo-GP value, which is the address of the jacket function descriptor

  • Signature information for the call (see Section 6.1.3)

  • Function pointer to the official function descriptor for the entry point in the translated image (or other unique identification that can be interpreted by the TIE)

More complete details are beyond the scope of this standard.

Calls made by translated code to other entry points in translated code are not visible to the OpenVMS I64 calling standard. From the outside, a call from native I64 code to translated code looks like a single call to the TIE entry point, regardless of how many calls are made within the translated image.

6.1.2.3. Indirect Calls From Native to Translated Code

When translated code support is not requested, the code generated for calling a dynamic function value follows the I64 conventions. In particular, the target code address and target global pointer value are obtained from the function pointer and used in the standard way (see Section 4.7.3.2).

When translated code support is requested, the compiled code must instead call a special service routine, OTS$CALL_PROC. The actual parameters to the target function are passed to OTS$CALL_PROC as though the target routine is native code that is being invoked directly. In addition, OTS$CALL_PROC receives two additional parameters in special registers:
  • R17 contains the address of a signature information block for the call (see Section 6.1.3).

  • R18 contains the function pointer for the target of the call.

OTS$CALL_PROC first uses the static mapping mentioned earlier to determine whether the target routine is part of a translated image.

If the target is in native code, then OTS$CALL_PROC completes the call in a way that makes its mediation transparent (that is, control need not pass back through it for the return). The native parameters are used without modification.

If the target is in translated code, then OTS$CALL_PROC passes control to the TIE which handles the call as described in Section 6.1.2.2.

6.1.3. Signature Information Fields in Function Descriptors

The signature information field of the function descriptor is encoded using the low three bits of the field as a tag that specifies the interpretation of the rest of the field. Table 6.1 contains the meaning of the values specified by the tag value.
Table 6.1. Signature Information Field Tag Values

Tag Value
(low 3 bits)

Meaning

0

The signature information field as a whole (including the tag bits) is the address of a signature information block (see Section 6.2). However, if the address is null, no signature information is available.

1

Default signature information applies, which is based on the information in the argument information register (see Section 6.2.5). In this case the rest of the field must be zero.

2

The field as a whole is a signature information block (see Section 6.2) that is immediately contained in the function descriptor. This can only be used for a signature information block