diff options
Diffstat (limited to 'docs/firmware-design.rst')
-rw-r--r-- | docs/firmware-design.rst | 2443 |
1 files changed, 2443 insertions, 0 deletions
diff --git a/docs/firmware-design.rst b/docs/firmware-design.rst new file mode 100644 index 000000000..cc2fe118d --- /dev/null +++ b/docs/firmware-design.rst @@ -0,0 +1,2443 @@ +ARM Trusted Firmware Design +=========================== + + +.. section-numbering:: + :suffix: . + +.. contents:: + +The ARM Trusted Firmware implements a subset of the Trusted Board Boot +Requirements (TBBR) Platform Design Document (PDD) [1] for ARM reference +platforms. The TBB sequence starts when the platform is powered on and runs up +to the stage where it hands-off control to firmware running in the normal +world in DRAM. This is the cold boot path. + +The ARM Trusted Firmware also implements the Power State Coordination Interface +PDD [2] as a runtime service. PSCI is the interface from normal world software +to firmware implementing power management use-cases (for example, secondary CPU +boot, hotplug and idle). Normal world software can access ARM Trusted Firmware +runtime services via the ARM SMC (Secure Monitor Call) instruction. The SMC +instruction must be used as mandated by the SMC Calling Convention [3]. + +The ARM Trusted Firmware implements a framework for configuring and managing +interrupts generated in either security state. The details of the interrupt +management framework and its design can be found in ARM Trusted Firmware +Interrupt Management Design guide [4]. + +The ARM Trusted Firmware can be built to support either AArch64 or AArch32 +execution state. + +Cold boot +--------- + +The cold boot path starts when the platform is physically turned on. If +``COLD_BOOT_SINGLE_CPU=0``, one of the CPUs released from reset is chosen as the +primary CPU, and the remaining CPUs are considered secondary CPUs. The primary +CPU is chosen through platform-specific means. The cold boot path is mainly +executed by the primary CPU, other than essential CPU initialization executed by +all CPUs. The secondary CPUs are kept in a safe platform-specific state until +the primary CPU has performed enough initialization to boot them. + +Refer to the `Reset Design`_ for more information on the effect of the +``COLD_BOOT_SINGLE_CPU`` platform build option. + +The cold boot path in this implementation of the ARM Trusted Firmware, +depends on the execution state. +For AArch64, it is divided into five steps (in order of execution): + +- Boot Loader stage 1 (BL1) *AP Trusted ROM* +- Boot Loader stage 2 (BL2) *Trusted Boot Firmware* +- Boot Loader stage 3-1 (BL31) *EL3 Runtime Software* +- Boot Loader stage 3-2 (BL32) *Secure-EL1 Payload* (optional) +- Boot Loader stage 3-3 (BL33) *Non-trusted Firmware* + +For AArch32, it is divided into four steps (in order of execution): + +- Boot Loader stage 1 (BL1) *AP Trusted ROM* +- Boot Loader stage 2 (BL2) *Trusted Boot Firmware* +- Boot Loader stage 3-2 (BL32) *EL3 Runtime Software* +- Boot Loader stage 3-3 (BL33) *Non-trusted Firmware* + +ARM development platforms (Fixed Virtual Platforms (FVPs) and Juno) implement a +combination of the following types of memory regions. Each bootloader stage uses +one or more of these memory regions. + +- Regions accessible from both non-secure and secure states. For example, + non-trusted SRAM, ROM and DRAM. +- Regions accessible from only the secure state. For example, trusted SRAM and + ROM. The FVPs also implement the trusted DRAM which is statically + configured. Additionally, the Base FVPs and Juno development platform + configure the TrustZone Controller (TZC) to create a region in the DRAM + which is accessible only from the secure state. + +The sections below provide the following details: + +- initialization and execution of the first three stages during cold boot +- specification of the EL3 Runtime Software (BL31 for AArch64 and BL32 for + AArch32) entrypoint requirements for use by alternative Trusted Boot + Firmware in place of the provided BL1 and BL2 + +BL1 +~~~ + +This stage begins execution from the platform's reset vector at EL3. The reset +address is platform dependent but it is usually located in a Trusted ROM area. +The BL1 data section is copied to trusted SRAM at runtime. + +On the ARM development platforms, BL1 code starts execution from the reset +vector defined by the constant ``BL1_RO_BASE``. The BL1 data section is copied +to the top of trusted SRAM as defined by the constant ``BL1_RW_BASE``. + +The functionality implemented by this stage is as follows. + +Determination of boot path +^^^^^^^^^^^^^^^^^^^^^^^^^^ + +Whenever a CPU is released from reset, BL1 needs to distinguish between a warm +boot and a cold boot. This is done using platform-specific mechanisms (see the +``plat_get_my_entrypoint()`` function in the `Porting Guide`_). In the case of a +warm boot, a CPU is expected to continue execution from a separate +entrypoint. In the case of a cold boot, the secondary CPUs are placed in a safe +platform-specific state (see the ``plat_secondary_cold_boot_setup()`` function in +the `Porting Guide`_) while the primary CPU executes the remaining cold boot path +as described in the following sections. + +This step only applies when ``PROGRAMMABLE_RESET_ADDRESS=0``. Refer to the +`Reset Design`_ for more information on the effect of the +``PROGRAMMABLE_RESET_ADDRESS`` platform build option. + +Architectural initialization +^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +BL1 performs minimal architectural initialization as follows. + +- Exception vectors + + BL1 sets up simple exception vectors for both synchronous and asynchronous + exceptions. The default behavior upon receiving an exception is to populate + a status code in the general purpose register ``X0/R0`` and call the + ``plat_report_exception()`` function (see the `Porting Guide`_). The status + code is one of: + + For AArch64: + + :: + + 0x0 : Synchronous exception from Current EL with SP_EL0 + 0x1 : IRQ exception from Current EL with SP_EL0 + 0x2 : FIQ exception from Current EL with SP_EL0 + 0x3 : System Error exception from Current EL with SP_EL0 + 0x4 : Synchronous exception from Current EL with SP_ELx + 0x5 : IRQ exception from Current EL with SP_ELx + 0x6 : FIQ exception from Current EL with SP_ELx + 0x7 : System Error exception from Current EL with SP_ELx + 0x8 : Synchronous exception from Lower EL using aarch64 + 0x9 : IRQ exception from Lower EL using aarch64 + 0xa : FIQ exception from Lower EL using aarch64 + 0xb : System Error exception from Lower EL using aarch64 + 0xc : Synchronous exception from Lower EL using aarch32 + 0xd : IRQ exception from Lower EL using aarch32 + 0xe : FIQ exception from Lower EL using aarch32 + 0xf : System Error exception from Lower EL using aarch32 + + For AArch32: + + :: + + 0x10 : User mode + 0x11 : FIQ mode + 0x12 : IRQ mode + 0x13 : SVC mode + 0x16 : Monitor mode + 0x17 : Abort mode + 0x1a : Hypervisor mode + 0x1b : Undefined mode + 0x1f : System mode + + The ``plat_report_exception()`` implementation on the ARM FVP port programs + the Versatile Express System LED register in the following format to + indicate the occurence of an unexpected exception: + + :: + + SYS_LED[0] - Security state (Secure=0/Non-Secure=1) + SYS_LED[2:1] - Exception Level (EL3=0x3, EL2=0x2, EL1=0x1, EL0=0x0) + For AArch32 it is always 0x0 + SYS_LED[7:3] - Exception Class (Sync/Async & origin). This is the value + of the status code + + A write to the LED register reflects in the System LEDs (S6LED0..7) in the + CLCD window of the FVP. + + BL1 does not expect to receive any exceptions other than the SMC exception. + For the latter, BL1 installs a simple stub. The stub expects to receive a + limited set of SMC types (determined by their function IDs in the general + purpose register ``X0/R0``): + + - ``BL1_SMC_RUN_IMAGE``: This SMC is raised by BL2 to make BL1 pass control + to EL3 Runtime Software. + - All SMCs listed in section "BL1 SMC Interface" in the `Firmware Update`_ + Design Guide are supported for AArch64 only. These SMCs are currently + not supported when BL1 is built for AArch32. + + Any other SMC leads to an assertion failure. + +- CPU initialization + + BL1 calls the ``reset_handler()`` function which in turn calls the CPU + specific reset handler function (see the section: "CPU specific operations + framework"). + +- Control register setup (for AArch64) + + - ``SCTLR_EL3``. Instruction cache is enabled by setting the ``SCTLR_EL3.I`` + bit. Alignment and stack alignment checking is enabled by setting the + ``SCTLR_EL3.A`` and ``SCTLR_EL3.SA`` bits. Exception endianness is set to + little-endian by clearing the ``SCTLR_EL3.EE`` bit. + + - ``SCR_EL3``. The register width of the next lower exception level is set + to AArch64 by setting the ``SCR.RW`` bit. The ``SCR.EA`` bit is set to trap + both External Aborts and SError Interrupts in EL3. The ``SCR.SIF`` bit is + also set to disable instruction fetches from Non-secure memory when in + secure state. + + - ``CPTR_EL3``. Accesses to the ``CPACR_EL1`` register from EL1 or EL2, or the + ``CPTR_EL2`` register from EL2 are configured to not trap to EL3 by + clearing the ``CPTR_EL3.TCPAC`` bit. Access to the trace functionality is + configured not to trap to EL3 by clearing the ``CPTR_EL3.TTA`` bit. + Instructions that access the registers associated with Floating Point + and Advanced SIMD execution are configured to not trap to EL3 by + clearing the ``CPTR_EL3.TFP`` bit. + + - ``DAIF``. The SError interrupt is enabled by clearing the SError interrupt + mask bit. + + - ``MDCR_EL3``. The trap controls, ``MDCR_EL3.TDOSA``, ``MDCR_EL3.TDA`` and + ``MDCR_EL3.TPM``, are set so that accesses to the registers they control + do not trap to EL3. AArch64 Secure self-hosted debug is disabled by + setting the ``MDCR_EL3.SDD`` bit. Also ``MDCR_EL3.SPD32`` is set to + disable AArch32 Secure self-hosted privileged debug from S-EL1. + +- Control register setup (for AArch32) + + - ``SCTLR``. Instruction cache is enabled by setting the ``SCTLR.I`` bit. + Alignment checking is enabled by setting the ``SCTLR.A`` bit. + Exception endianness is set to little-endian by clearing the + ``SCTLR.EE`` bit. + + - ``SCR``. The ``SCR.SIF`` bit is set to disable instruction fetches from + Non-secure memory when in secure state. + + - ``CPACR``. Allow execution of Advanced SIMD instructions at PL0 and PL1, + by clearing the ``CPACR.ASEDIS`` bit. Access to the trace functionality + is configured not to trap to undefined mode by clearing the + ``CPACR.TRCDIS`` bit. + + - ``NSACR``. Enable non-secure access to Advanced SIMD functionality and + system register access to implemented trace registers. + + - ``FPEXC``. Enable access to the Advanced SIMD and floating-point + functionality from all Exception levels. + + - ``CPSR.A``. The Asynchronous data abort interrupt is enabled by clearing + the Asynchronous data abort interrupt mask bit. + + - ``SDCR``. The ``SDCR.SPD`` field is set to disable AArch32 Secure + self-hosted privileged debug. + +Platform initialization +^^^^^^^^^^^^^^^^^^^^^^^ + +On ARM platforms, BL1 performs the following platform initializations: + +- Enable the Trusted Watchdog. +- Initialize the console. +- Configure the Interconnect to enable hardware coherency. +- Enable the MMU and map the memory it needs to access. +- Configure any required platform storage to load the next bootloader image + (BL2). + +Firmware Update detection and execution +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +After performing platform setup, BL1 common code calls +``bl1_plat_get_next_image_id()`` to determine if `Firmware Update`_ is required or +to proceed with the normal boot process. If the platform code returns +``BL2_IMAGE_ID`` then the normal boot sequence is executed as described in the +next section, else BL1 assumes that `Firmware Update`_ is required and execution +passes to the first image in the `Firmware Update`_ process. In either case, BL1 +retrieves a descriptor of the next image by calling ``bl1_plat_get_image_desc()``. +The image descriptor contains an ``entry_point_info_t`` structure, which BL1 +uses to initialize the execution state of the next image. + +BL2 image load and execution +^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +In the normal boot flow, BL1 execution continues as follows: + +#. BL1 prints the following string from the primary CPU to indicate successful + execution of the BL1 stage: + + :: + + "Booting Trusted Firmware" + +#. BL1 determines the amount of free trusted SRAM memory available by + calculating the extent of its own data section, which also resides in + trusted SRAM. BL1 loads a BL2 raw binary image from platform storage, at a + platform-specific base address. If the BL2 image file is not present or if + there is not enough free trusted SRAM the following error message is + printed: + + :: + + "Failed to load BL2 firmware." + + BL1 calculates the amount of Trusted SRAM that can be used by the BL2 + image. The exact load location of the image is provided as a base address + in the platform header. Further description of the memory layout can be + found later in this document. + +#. BL1 passes control to the BL2 image at Secure EL1 (for AArch64) or at + Secure SVC mode (for AArch32), starting from its load address. + +#. BL1 also passes information about the amount of trusted SRAM used and + available for use. This information is populated at a platform-specific + memory address. + +BL2 +~~~ + +BL1 loads and passes control to BL2 at Secure-EL1 (for AArch64) or at Secure +SVC mode (for AArch32) . BL2 is linked against and loaded at a platform-specific +base address (more information can be found later in this document). +The functionality implemented by BL2 is as follows. + +Architectural initialization +^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +For AArch64, BL2 performs the minimal architectural initialization required +for subsequent stages of the ARM Trusted Firmware and normal world software. +EL1 and EL0 are given access to Floating Point and Advanced SIMD registers +by clearing the ``CPACR.FPEN`` bits. + +For AArch32, the minimal architectural initialization required for subsequent +stages of the ARM Trusted Firmware and normal world software is taken care of +in BL1 as both BL1 and BL2 execute at PL1. + +Platform initialization +^^^^^^^^^^^^^^^^^^^^^^^ + +On ARM platforms, BL2 performs the following platform initializations: + +- Initialize the console. +- Configure any required platform storage to allow loading further bootloader + images. +- Enable the MMU and map the memory it needs to access. +- Perform platform security setup to allow access to controlled components. +- Reserve some memory for passing information to the next bootloader image + EL3 Runtime Software and populate it. +- Define the extents of memory available for loading each subsequent + bootloader image. + +Image loading in BL2 +^^^^^^^^^^^^^^^^^^^^ + +Image loading scheme in BL2 depends on ``LOAD_IMAGE_V2`` build option. If the +flag is disabled, the BLxx images are loaded, by calling the respective +load\_blxx() function from BL2 generic code. If the flag is enabled, the BL2 +generic code loads the images based on the list of loadable images provided +by the platform. BL2 passes the list of executable images provided by the +platform to the next handover BL image. By default, this flag is disabled for +AArch64 and the AArch32 build is supported only if this flag is enabled. + +SCP\_BL2 (System Control Processor Firmware) image load +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +Some systems have a separate System Control Processor (SCP) for power, clock, +reset and system control. BL2 loads the optional SCP\_BL2 image from platform +storage into a platform-specific region of secure memory. The subsequent +handling of SCP\_BL2 is platform specific. For example, on the Juno ARM +development platform port the image is transferred into SCP's internal memory +using the Boot Over MHU (BOM) protocol after being loaded in the trusted SRAM +memory. The SCP executes SCP\_BL2 and signals to the Application Processor (AP) +for BL2 execution to continue. + +EL3 Runtime Software image load +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +BL2 loads the EL3 Runtime Software image from platform storage into a platform- +specific address in trusted SRAM. If there is not enough memory to load the +image or image is missing it leads to an assertion failure. If ``LOAD_IMAGE_V2`` +is disabled and if image loads successfully, BL2 updates the amount of trusted +SRAM used and available for use by EL3 Runtime Software. This information is +populated at a platform-specific memory address. + +AArch64 BL32 (Secure-EL1 Payload) image load +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +BL2 loads the optional BL32 image from platform storage into a platform- +specific region of secure memory. The image executes in the secure world. BL2 +relies on BL31 to pass control to the BL32 image, if present. Hence, BL2 +populates a platform-specific area of memory with the entrypoint/load-address +of the BL32 image. The value of the Saved Processor Status Register (``SPSR``) +for entry into BL32 is not determined by BL2, it is initialized by the +Secure-EL1 Payload Dispatcher (see later) within BL31, which is responsible for +managing interaction with BL32. This information is passed to BL31. + +BL33 (Non-trusted Firmware) image load +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +BL2 loads the BL33 image (e.g. UEFI or other test or boot software) from +platform storage into non-secure memory as defined by the platform. + +BL2 relies on EL3 Runtime Software to pass control to BL33 once secure state +initialization is complete. Hence, BL2 populates a platform-specific area of +memory with the entrypoint and Saved Program Status Register (``SPSR``) of the +normal world software image. The entrypoint is the load address of the BL33 +image. The ``SPSR`` is determined as specified in Section 5.13 of the +`PSCI PDD`_. This information is passed to the EL3 Runtime Software. + +AArch64 BL31 (EL3 Runtime Software) execution +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +BL2 execution continues as follows: + +#. BL2 passes control back to BL1 by raising an SMC, providing BL1 with the + BL31 entrypoint. The exception is handled by the SMC exception handler + installed by BL1. + +#. BL1 turns off the MMU and flushes the caches. It clears the + ``SCTLR_EL3.M/I/C`` bits, flushes the data cache to the point of coherency + and invalidates the TLBs. + +#. BL1 passes control to BL31 at the specified entrypoint at EL3. + +AArch64 BL31 +~~~~~~~~~~~~ + +The image for this stage is loaded by BL2 and BL1 passes control to BL31 at +EL3. BL31 executes solely in trusted SRAM. BL31 is linked against and +loaded at a platform-specific base address (more information can be found later +in this document). The functionality implemented by BL31 is as follows. + +Architectural initialization +^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +Currently, BL31 performs a similar architectural initialization to BL1 as +far as system register settings are concerned. Since BL1 code resides in ROM, +architectural initialization in BL31 allows override of any previous +initialization done by BL1. + +BL31 initializes the per-CPU data framework, which provides a cache of +frequently accessed per-CPU data optimised for fast, concurrent manipulation +on different CPUs. This buffer includes pointers to per-CPU contexts, crash +buffer, CPU reset and power down operations, PSCI data, platform data and so on. + +It then replaces the exception vectors populated by BL1 with its own. BL31 +exception vectors implement more elaborate support for handling SMCs since this +is the only mechanism to access the runtime services implemented by BL31 (PSCI +for example). BL31 checks each SMC for validity as specified by the +`SMC calling convention PDD`_ before passing control to the required SMC +handler routine. + +BL31 programs the ``CNTFRQ_EL0`` register with the clock frequency of the system +counter, which is provided by the platform. + +Platform initialization +^^^^^^^^^^^^^^^^^^^^^^^ + +BL31 performs detailed platform initialization, which enables normal world +software to function correctly. + +On ARM platforms, this consists of the following: + +- Initialize the console. +- Configure the Interconnect to enable hardware coherency. +- Enable the MMU and map the memory it needs to access. +- Initialize the generic interrupt controller. +- Initialize the power controller device. +- Detect the system topology. + +Runtime services initialization +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +BL31 is responsible for initializing the runtime services. One of them is PSCI. + +As part of the PSCI initializations, BL31 detects the system topology. It also +initializes the data structures that implement the state machine used to track +the state of power domain nodes. The state can be one of ``OFF``, ``RUN`` or +``RETENTION``. All secondary CPUs are initially in the ``OFF`` state. The cluster +that the primary CPU belongs to is ``ON``; any other cluster is ``OFF``. It also +initializes the locks that protect them. BL31 accesses the state of a CPU or +cluster immediately after reset and before the data cache is enabled in the +warm boot path. It is not currently possible to use 'exclusive' based spinlocks, +therefore BL31 uses locks based on Lamport's Bakery algorithm instead. + +The runtime service framework and its initialization is described in more +detail in the "EL3 runtime services framework" section below. + +Details about the status of the PSCI implementation are provided in the +"Power State Coordination Interface" section below. + +AArch64 BL32 (Secure-EL1 Payload) image initialization +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +If a BL32 image is present then there must be a matching Secure-EL1 Payload +Dispatcher (SPD) service (see later for details). During initialization +that service must register a function to carry out initialization of BL32 +once the runtime services are fully initialized. BL31 invokes such a +registered function to initialize BL32 before running BL33. This initialization +is not necessary for AArch32 SPs. + +Details on BL32 initialization and the SPD's role are described in the +"Secure-EL1 Payloads and Dispatchers" section below. + +BL33 (Non-trusted Firmware) execution +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +EL3 Runtime Software initializes the EL2 or EL1 processor context for normal- +world cold boot, ensuring that no secure state information finds its way into +the non-secure execution state. EL3 Runtime Software uses the entrypoint +information provided by BL2 to jump to the Non-trusted firmware image (BL33) +at the highest available Exception Level (EL2 if available, otherwise EL1). + +Using alternative Trusted Boot Firmware in place of BL1 & BL2 (AArch64 only) +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +Some platforms have existing implementations of Trusted Boot Firmware that +would like to use ARM Trusted Firmware BL31 for the EL3 Runtime Software. To +enable this firmware architecture it is important to provide a fully documented +and stable interface between the Trusted Boot Firmware and BL31. + +Future changes to the BL31 interface will be done in a backwards compatible +way, and this enables these firmware components to be independently enhanced/ +updated to develop and exploit new functionality. + +Required CPU state when calling ``bl31_entrypoint()`` during cold boot +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +This function must only be called by the primary CPU. + +On entry to this function the calling primary CPU must be executing in AArch64 +EL3, little-endian data access, and all interrupt sources masked: + +:: + + PSTATE.EL = 3 + PSTATE.RW = 1 + PSTATE.DAIF = 0xf + SCTLR_EL3.EE = 0 + +X0 and X1 can be used to pass information from the Trusted Boot Firmware to the +platform code in BL31: + +:: + + X0 : Reserved for common Trusted Firmware information + X1 : Platform specific information + +BL31 zero-init sections (e.g. ``.bss``) should not contain valid data on entry, +these will be zero filled prior to invoking platform setup code. + +Use of the X0 and X1 parameters +''''''''''''''''''''''''''''''' + +The parameters are platform specific and passed from ``bl31_entrypoint()`` to +``bl31_early_platform_setup()``. The value of these parameters is never directly +used by the common BL31 code. + +The convention is that ``X0`` conveys information regarding the BL31, BL32 and +BL33 images from the Trusted Boot firmware and ``X1`` can be used for other +platform specific purpose. This convention allows platforms which use ARM +Trusted Firmware's BL1 and BL2 images to transfer additional platform specific +information from Secure Boot without conflicting with future evolution of the +Trusted Firmware using ``X0`` to pass a ``bl31_params`` structure. + +BL31 common and SPD initialization code depends on image and entrypoint +information about BL33 and BL32, which is provided via BL31 platform APIs. +This information is required until the start of execution of BL33. This +information can be provided in a platform defined manner, e.g. compiled into +the platform code in BL31, or provided in a platform defined memory location +by the Trusted Boot firmware, or passed from the Trusted Boot Firmware via the +Cold boot Initialization parameters. This data may need to be cleaned out of +the CPU caches if it is provided by an earlier boot stage and then accessed by +BL31 platform code before the caches are enabled. + +ARM Trusted Firmware's BL2 implementation passes a ``bl31_params`` structure in +``X0`` and the ARM development platforms interpret this in the BL31 platform +code. + +MMU, Data caches & Coherency +'''''''''''''''''''''''''''' + +BL31 does not depend on the enabled state of the MMU, data caches or +interconnect coherency on entry to ``bl31_entrypoint()``. If these are disabled +on entry, these should be enabled during ``bl31_plat_arch_setup()``. + +Data structures used in the BL31 cold boot interface +'''''''''''''''''''''''''''''''''''''''''''''''''''' + +These structures are designed to support compatibility and independent +evolution of the structures and the firmware images. For example, a version of +BL31 that can interpret the BL3x image information from different versions of +BL2, a platform that uses an extended entry\_point\_info structure to convey +additional register information to BL31, or a ELF image loader that can convey +more details about the firmware images. + +To support these scenarios the structures are versioned and sized, which enables +BL31 to detect which information is present and respond appropriately. The +``param_header`` is defined to capture this information: + +.. code:: c + + typedef struct param_header { + uint8_t type; /* type of the structure */ + uint8_t version; /* version of this structure */ + uint16_t size; /* size of this structure in bytes */ + uint32_t attr; /* attributes: unused bits SBZ */ + } param_header_t; + +The structures using this format are ``entry_point_info``, ``image_info`` and +``bl31_params``. The code that allocates and populates these structures must set +the header fields appropriately, and the ``SET_PARAM_HEAD()`` a macro is defined +to simplify this action. + +Required CPU state for BL31 Warm boot initialization +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +When requesting a CPU power-on, or suspending a running CPU, ARM Trusted +Firmware provides the platform power management code with a Warm boot +initialization entry-point, to be invoked by the CPU immediately after the +reset handler. On entry to the Warm boot initialization function the calling +CPU must be in AArch64 EL3, little-endian data access and all interrupt sources +masked: + +:: + + PSTATE.EL = 3 + PSTATE.RW = 1 + PSTATE.DAIF = 0xf + SCTLR_EL3.EE = 0 + +The PSCI implementation will initialize the processor state and ensure that the +platform power management code is then invoked as required to initialize all +necessary system, cluster and CPU resources. + +AArch32 EL3 Runtime Software entrypoint interface +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +To enable this firmware architecture it is important to provide a fully +documented and stable interface between the Trusted Boot Firmware and the +AArch32 EL3 Runtime Software. + +Future changes to the entrypoint interface will be done in a backwards +compatible way, and this enables these firmware components to be independently +enhanced/updated to develop and exploit new functionality. + +Required CPU state when entering during cold boot +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +This function must only be called by the primary CPU. + +On entry to this function the calling primary CPU must be executing in AArch32 +EL3, little-endian data access, and all interrupt sources masked: + +:: + + PSTATE.AIF = 0x7 + SCTLR.EE = 0 + +R0 and R1 are used to pass information from the Trusted Boot Firmware to the +platform code in AArch32 EL3 Runtime Software: + +:: + + R0 : Reserved for common Trusted Firmware information + R1 : Platform specific information + +Use of the R0 and R1 parameters +''''''''''''''''''''''''''''''' + +The parameters are platform specific and the convention is that ``R0`` conveys +information regarding the BL3x images from the Trusted Boot firmware and ``R1`` +can be used for other platform specific purpose. This convention allows +platforms which use ARM Trusted Firmware's BL1 and BL2 images to transfer +additional platform specific information from Secure Boot without conflicting +with future evolution of the Trusted Firmware using ``R0`` to pass a ``bl_params`` +structure. + +The AArch32 EL3 Runtime Software is responsible for entry into BL33. This +information can be obtained in a platform defined manner, e.g. compiled into +the AArch32 EL3 Runtime Software, or provided in a platform defined memory +location by the Trusted Boot firmware, or passed from the Trusted Boot Firmware +via the Cold boot Initialization parameters. This data may need to be cleaned +out of the CPU caches if it is provided by an earlier boot stage and then +accessed by AArch32 EL3 Runtime Software before the caches are enabled. + +When using AArch32 EL3 Runtime Software, the ARM development platforms pass a +``bl_params`` structure in ``R0`` from BL2 to be interpreted by AArch32 EL3 Runtime +Software platform code. + +MMU, Data caches & Coherency +'''''''''''''''''''''''''''' + +AArch32 EL3 Runtime Software must not depend on the enabled state of the MMU, +data caches or interconnect coherency in its entrypoint. They must be explicitly +enabled if required. + +Data structures used in cold boot interface +''''''''''''''''''''''''''''''''''''''''''' + +The AArch32 EL3 Runtime Software cold boot interface uses ``bl_params`` instead +of ``bl31_params``. The ``bl_params`` structure is based on the convention +described in AArch64 BL31 cold boot interface section. + +Required CPU state for warm boot initialization +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +When requesting a CPU power-on, or suspending a running CPU, AArch32 EL3 +Runtime Software must ensure execution of a warm boot initialization entrypoint. +If ARM Trusted Firmware BL1 is used and the PROGRAMMABLE\_RESET\_ADDRESS build +flag is false, then AArch32 EL3 Runtime Software must ensure that BL1 branches +to the warm boot entrypoint by arranging for the BL1 platform function, +plat\_get\_my\_entrypoint(), to return a non-zero value. + +In this case, the warm boot entrypoint must be in AArch32 EL3, little-endian +data access and all interrupt sources masked: + +:: + + PSTATE.AIF = 0x7 + SCTLR.EE = 0 + +The warm boot entrypoint may be implemented by using the ARM Trusted Firmware +``psci_warmboot_entrypoint()`` function. In that case, the platform must fulfil +the pre-requisites mentioned in the `PSCI Library integration guide`_. + +EL3 runtime services framework +------------------------------ + +Software executing in the non-secure state and in the secure state at exception +levels lower than EL3 will request runtime services using the Secure Monitor +Call (SMC) instruction. These requests will follow the convention described in +the SMC Calling Convention PDD (`SMCCC`_). The `SMCCC`_ assigns function +identifiers to each SMC request and describes how arguments are passed and +returned. + +The EL3 runtime services framework enables the development of services by +different providers that can be easily integrated into final product firmware. +The following sections describe the framework which facilitates the +registration, initialization and use of runtime services in EL3 Runtime +Software (BL31). + +The design of the runtime services depends heavily on the concepts and +definitions described in the `SMCCC`_, in particular SMC Function IDs, Owning +Entity Numbers (OEN), Fast and Yielding calls, and the SMC32 and SMC64 calling +conventions. Please refer to that document for more detailed explanation of +these terms. + +The following runtime services are expected to be implemented first. They have +not all been instantiated in the current implementation. + +#. Standard service calls + + This service is for management of the entire system. The Power State + Coordination Interface (`PSCI`_) is the first set of standard service calls + defined by ARM (see PSCI section later). + +#. Secure-EL1 Payload Dispatcher service + + If a system runs a Trusted OS or other Secure-EL1 Payload (SP) then + it also requires a *Secure Monitor* at EL3 to switch the EL1 processor + context between the normal world (EL1/EL2) and trusted world (Secure-EL1). + The Secure Monitor will make these world switches in response to SMCs. The + `SMCCC`_ provides for such SMCs with the Trusted OS Call and Trusted + Application Call OEN ranges. + + The interface between the EL3 Runtime Software and the Secure-EL1 Payload is + not defined by the `SMCCC`_ or any other standard. As a result, each + Secure-EL1 Payload requires a specific Secure Monitor that runs as a runtime + service - within ARM Trusted Firmware this service is referred to as the + Secure-EL1 Payload Dispatcher (SPD). + + ARM Trusted Firmware provides a Test Secure-EL1 Payload (TSP) and its + associated Dispatcher (TSPD). Details of SPD design and TSP/TSPD operation + are described in the "Secure-EL1 Payloads and Dispatchers" section below. + +#. CPU implementation service + + This service will provide an interface to CPU implementation specific + services for a given platform e.g. access to processor errata workarounds. + This service is currently unimplemented. + +Additional services for ARM Architecture, SiP and OEM calls can be implemented. +Each implemented service handles a range of SMC function identifiers as +described in the `SMCCC`_. + +Registration +~~~~~~~~~~~~ + +A runtime service is registered using the ``DECLARE_RT_SVC()`` macro, specifying +the name of the service, the range of OENs covered, the type of service and +initialization and call handler functions. This macro instantiates a ``const struct rt_svc_desc`` for the service with these details (see ``runtime_svc.h``). +This structure is allocated in a special ELF section ``rt_svc_descs``, enabling +the framework to find all service descriptors included into BL31. + +The specific service for a SMC Function is selected based on the OEN and call +type of the Function ID, and the framework uses that information in the service +descriptor to identify the handler for the SMC Call. + +The service descriptors do not include information to identify the precise set +of SMC function identifiers supported by this service implementation, the +security state from which such calls are valid nor the capability to support +64-bit and/or 32-bit callers (using SMC32 or SMC64). Responding appropriately +to these aspects of a SMC call is the responsibility of the service +implementation, the framework is focused on integration of services from +different providers and minimizing the time taken by the framework before the +service handler is invoked. + +Details of the parameters, requirements and behavior of the initialization and +call handling functions are provided in the following sections. + +Initialization +~~~~~~~~~~~~~~ + +``runtime_svc_init()`` in ``runtime_svc.c`` initializes the runtime services +framework running on the primary CPU during cold boot as part of the BL31 +initialization. This happens prior to initializing a Trusted OS and running +Normal world boot firmware that might in turn use these services. +Initialization involves validating each of the declared runtime service +descriptors, calling the service initialization function and populating the +index used for runtime lookup of the service. + +The BL31 linker script collects all of the declared service descriptors into a +single array and defines symbols that allow the framework to locate and traverse +the array, and determine its size. + +The framework does basic validation of each descriptor to halt firmware +initialization if service declaration errors are detected. The framework does +not check descriptors for the following error conditions, and may behave in an +unpredictable manner under such scenarios: + +#. Overlapping OEN ranges +#. Multiple descriptors for the same range of OENs and ``call_type`` +#. Incorrect range of owning entity numbers for a given ``call_type`` + +Once validated, the service ``init()`` callback is invoked. This function carries +out any essential EL3 initialization before servicing requests. The ``init()`` +function is only invoked on the primary CPU during cold boot. If the service +uses per-CPU data this must either be initialized for all CPUs during this call, +or be done lazily when a CPU first issues an SMC call to that service. If +``init()`` returns anything other than ``0``, this is treated as an initialization +error and the service is ignored: this does not cause the firmware to halt. + +The OEN and call type fields present in the SMC Function ID cover a total of +128 distinct services, but in practice a single descriptor can cover a range of +OENs, e.g. SMCs to call a Trusted OS function. To optimize the lookup of a +service handler, the framework uses an array of 128 indices that map every +distinct OEN/call-type combination either to one of the declared services or to +indicate the service is not handled. This ``rt_svc_descs_indices[]`` array is +populated for all of the OENs covered by a service after the service ``init()`` +function has reported success. So a service that fails to initialize will never +have it's ``handle()`` function invoked. + +The following figure shows how the ``rt_svc_descs_indices[]`` index maps the SMC +Function ID call type and OEN onto a specific service handler in the +``rt_svc_descs[]`` array. + +|Image 1| + +Handling an SMC +~~~~~~~~~~~~~~~ + +When the EL3 runtime services framework receives a Secure Monitor Call, the SMC +Function ID is passed in W0 from the lower exception level (as per the +`SMCCC`_). If the calling register width is AArch32, it is invalid to invoke an +SMC Function which indicates the SMC64 calling convention: such calls are +ignored and return the Unknown SMC Function Identifier result code ``0xFFFFFFFF`` +in R0/X0. + +Bit[31] (fast/yielding call) and bits[29:24] (owning entity number) of the SMC +Function ID are combined to index into the ``rt_svc_descs_indices[]`` array. The +resulting value might indicate a service that has no handler, in this case the +framework will also report an Unknown SMC Function ID. Otherwise, the value is +used as a further index into the ``rt_svc_descs[]`` array to locate the required +service and handler. + +The service's ``handle()`` callback is provided with five of the SMC parameters +directly, the others are saved into memory for retrieval (if needed) by the +handler. The handler is also provided with an opaque ``handle`` for use with the +supporting library for parameter retrieval, setting return values and context +manipulation; and with ``flags`` indicating the security state of the caller. The +framework finally sets up the execution stack for the handler, and invokes the +services ``handle()`` function. + +On return from the handler the result registers are populated in X0-X3 before +restoring the stack and CPU state and returning from the original SMC. + +Power State Coordination Interface +---------------------------------- + +TODO: Provide design walkthrough of PSCI implementation. + +The PSCI v1.0 specification categorizes APIs as optional and mandatory. All the +mandatory APIs in PSCI v1.0 and all the APIs in PSCI v0.2 draft specification +`Power State Coordination Interface PDD`_ are implemented. The table lists +the PSCI v1.0 APIs and their support in generic code. + +An API implementation might have a dependency on platform code e.g. CPU\_SUSPEND +requires the platform to export a part of the implementation. Hence the level +of support of the mandatory APIs depends upon the support exported by the +platform port as well. The Juno and FVP (all variants) platforms export all the +required support. + ++-----------------------------+-------------+-------------------------------+ +| PSCI v1.0 API | Supported | Comments | ++=============================+=============+===============================+ +| ``PSCI_VERSION`` | Yes | The version returned is 1.0 | ++-----------------------------+-------------+-------------------------------+ +| ``CPU_SUSPEND`` | Yes\* | | ++-----------------------------+-------------+-------------------------------+ +| ``CPU_OFF`` | Yes\* | | ++-----------------------------+-------------+-------------------------------+ +| ``CPU_ON`` | Yes\* | | ++-----------------------------+-------------+-------------------------------+ +| ``AFFINITY_INFO`` | Yes | | ++-----------------------------+-------------+-------------------------------+ +| ``MIGRATE`` | Yes\*\* | | ++-----------------------------+-------------+-------------------------------+ +| ``MIGRATE_INFO_TYPE`` | Yes\*\* | | ++-----------------------------+-------------+-------------------------------+ +| ``MIGRATE_INFO_CPU`` | Yes\*\* | | ++-----------------------------+-------------+-------------------------------+ +| ``SYSTEM_OFF`` | Yes\* | | ++-----------------------------+-------------+-------------------------------+ +| ``SYSTEM_RESET`` | Yes\* | | ++-----------------------------+-------------+-------------------------------+ +| ``PSCI_FEATURES`` | Yes | | ++-----------------------------+-------------+-------------------------------+ +| ``CPU_FREEZE`` | No | | ++-----------------------------+-------------+-------------------------------+ +| ``CPU_DEFAULT_SUSPEND`` | No | | ++-----------------------------+-------------+-------------------------------+ +| ``NODE_HW_STATE`` | Yes\* | | ++-----------------------------+-------------+-------------------------------+ +| ``SYSTEM_SUSPEND`` | Yes\* | | ++-----------------------------+-------------+-------------------------------+ +| ``PSCI_SET_SUSPEND_MODE`` | No | | ++-----------------------------+-------------+-------------------------------+ +| ``PSCI_STAT_RESIDENCY`` | Yes\* | | ++-----------------------------+-------------+-------------------------------+ +| ``PSCI_STAT_COUNT`` | Yes\* | | ++-----------------------------+-------------+-------------------------------+ + +\*Note : These PSCI APIs require platform power management hooks to be +registered with the generic PSCI code to be supported. + +\*\*Note : These PSCI APIs require appropriate Secure Payload Dispatcher +hooks to be registered with the generic PSCI code to be supported. + +The PSCI implementation in ARM Trusted Firmware is a library which can be +integrated with AArch64 or AArch32 EL3 Runtime Software for ARMv8-A systems. +A guide to integrating PSCI library with AArch32 EL3 Runtime Software +can be found `here`_. + +Secure-EL1 Payloads and Dispatchers +----------------------------------- + +On a production system that includes a Trusted OS running in Secure-EL1/EL0, +the Trusted OS is coupled with a companion runtime service in the BL31 +firmware. This service is responsible for the initialisation of the Trusted +OS and all communications with it. The Trusted OS is the BL32 stage of the +boot flow in ARM Trusted Firmware. The firmware will attempt to locate, load +and execute a BL32 image. + +ARM Trusted Firmware uses a more general term for the BL32 software that runs +at Secure-EL1 - the *Secure-EL1 Payload* - as it is not always a Trusted OS. + +The ARM Trusted Firmware provides a Test Secure-EL1 Payload (TSP) and a Test +Secure-EL1 Payload Dispatcher (TSPD) service as an example of how a Trusted OS +is supported on a production system using the Runtime Services Framework. On +such a system, the Test BL32 image and service are replaced by the Trusted OS +and its dispatcher service. The ARM Trusted Firmware build system expects that +the dispatcher will define the build flag ``NEED_BL32`` to enable it to include +the BL32 in the build either as a binary or to compile from source depending +on whether the ``BL32`` build option is specified or not. + +The TSP runs in Secure-EL1. It is designed to demonstrate synchronous +communication with the normal-world software running in EL1/EL2. Communication +is initiated by the normal-world software + +- either directly through a Fast SMC (as defined in the `SMCCC`_) + +- or indirectly through a `PSCI`_ SMC. The `PSCI`_ implementation in turn + informs the TSPD about the requested power management operation. This allows + the TSP to prepare for or respond to the power state change + +The TSPD service is responsible for. + +- Initializing the TSP + +- Routing requests and responses between the secure and the non-secure + states during the two types of communications just described + +Initializing a BL32 Image +~~~~~~~~~~~~~~~~~~~~~~~~~ + +The Secure-EL1 Payload Dispatcher (SPD) service is responsible for initializing +the BL32 image. It needs access to the information passed by BL2 to BL31 to do +so. This is provided by: + +.. code:: c + + entry_point_info_t *bl31_plat_get_next_image_ep_info(uint32_t); + +which returns a reference to the ``entry_point_info`` structure corresponding to +the image which will be run in the specified security state. The SPD uses this +API to get entry point information for the SECURE image, BL32. + +In the absence of a BL32 image, BL31 passes control to the normal world +bootloader image (BL33). When the BL32 image is present, it is typical +that the SPD wants control to be passed to BL32 first and then later to BL33. + +To do this the SPD has to register a BL32 initialization function during +initialization of the SPD service. The BL32 initialization function has this +prototype: + +.. code:: c + + int32_t init(void); + +and is registered using the ``bl31_register_bl32_init()`` function. + +Trusted Firmware supports two approaches for the SPD to pass control to BL32 +before returning through EL3 and running the non-trusted firmware (BL33): + +#. In the BL32 setup function, use ``bl31_set_next_image_type()`` to + request that the exit from ``bl31_main()`` is to the BL32 entrypoint in + Secure-EL1. BL31 will exit to BL32 using the asynchronous method by + calling ``bl31_prepare_next_image_entry()`` and ``el3_exit()``. + + When the BL32 has completed initialization at Secure-EL1, it returns to + BL31 by issuing an SMC, using a Function ID allocated to the SPD. On + receipt of this SMC, the SPD service handler should switch the CPU context + from trusted to normal world and use the ``bl31_set_next_image_type()`` and + ``bl31_prepare_next_image_entry()`` functions to set up the initial return to + the normal world firmware BL33. On return from the handler the framework + will exit to EL2 and run BL33. + +#. The BL32 setup function registers an initialization function using + ``bl31_register_bl32_init()`` which provides a SPD-defined mechanism to + invoke a 'world-switch synchronous call' to Secure-EL1 to run the BL32 + entrypoint. + NOTE: The Test SPD service included with the Trusted Firmware provides one + implementation of such a mechanism. + + On completion BL32 returns control to BL31 via a SMC, and on receipt the + SPD service handler invokes the synchronous call return mechanism to return + to the BL32 initialization function. On return from this function, + ``bl31_main()`` will set up the return to the normal world firmware BL33 and + continue the boot process in the normal world. + +#. .. rubric:: Crash Reporting in BL31 + :name: crash-reporting-in-bl31 + +BL31 implements a scheme for reporting the processor state when an unhandled +exception is encountered. The reporting mechanism attempts to preserve all the +register contents and report it via a dedicated UART (PL011 console). BL31 +reports the general purpose, EL3, Secure EL1 and some EL2 state registers. + +A dedicated per-CPU crash stack is maintained by BL31 and this is retrieved via +the per-CPU pointer cache. The implementation attempts to minimise the memory +required for this feature. The file ``crash_reporting.S`` contains the +implementation for crash reporting. + +The sample crash output is shown below. + +:: + + x0 :0x000000004F00007C + x1 :0x0000000007FFFFFF + x2 :0x0000000004014D50 + x3 :0x0000000000000000 + x4 :0x0000000088007998 + x5 :0x00000000001343AC + x6 :0x0000000000000016 + x7 :0x00000000000B8A38 + x8 :0x00000000001343AC + x9 :0x00000000000101A8 + x10 :0x0000000000000002 + x11 :0x000000000000011C + x12 :0x00000000FEFDC644 + x13 :0x00000000FED93FFC + x14 :0x0000000000247950 + x15 :0x00000000000007A2 + x16 :0x00000000000007A4 + x17 :0x0000000000247950 + x18 :0x0000000000000000 + x19 :0x00000000FFFFFFFF + x20 :0x0000000004014D50 + x21 :0x000000000400A38C + x22 :0x0000000000247950 + x23 :0x0000000000000010 + x24 :0x0000000000000024 + x25 :0x00000000FEFDC868 + x26 :0x00000000FEFDC86A + x27 :0x00000000019EDEDC + x28 :0x000000000A7CFDAA + x29 :0x0000000004010780 + x30 :0x000000000400F004 + scr_el3 :0x0000000000000D3D + sctlr_el3 :0x0000000000C8181F + cptr_el3 :0x0000000000000000 + tcr_el3 :0x0000000080803520 + daif :0x00000000000003C0 + mair_el3 :0x00000000000004FF + spsr_el3 :0x00000000800003CC + elr_el3 :0x000000000400C0CC + ttbr0_el3 :0x00000000040172A0 + esr_el3 :0x0000000096000210 + sp_el3 :0x0000000004014D50 + far_el3 :0x000000004F00007C + spsr_el1 :0x0000000000000000 + elr_el1 :0x0000000000000000 + spsr_abt :0x0000000000000000 + spsr_und :0x0000000000000000 + spsr_irq :0x0000000000000000 + spsr_fiq :0x0000000000000000 + sctlr_el1 :0x0000000030C81807 + actlr_el1 :0x0000000000000000 + cpacr_el1 :0x0000000000300000 + csselr_el1 :0x0000000000000002 + sp_el1 :0x0000000004028800 + esr_el1 :0x0000000000000000 + ttbr0_el1 :0x000000000402C200 + ttbr1_el1 :0x0000000000000000 + mair_el1 :0x00000000000004FF + amair_el1 :0x0000000000000000 + tcr_el1 :0x0000000000003520 + tpidr_el1 :0x0000000000000000 + tpidr_el0 :0x0000000000000000 + tpidrro_el0 :0x0000000000000000 + dacr32_el2 :0x0000000000000000 + ifsr32_el2 :0x0000000000000000 + par_el1 :0x0000000000000000 + far_el1 :0x0000000000000000 + afsr0_el1 :0x0000000000000000 + afsr1_el1 :0x0000000000000000 + contextidr_el1 :0x0000000000000000 + vbar_el1 :0x0000000004027000 + cntp_ctl_el0 :0x0000000000000000 + cntp_cval_el0 :0x0000000000000000 + cntv_ctl_el0 :0x0000000000000000 + cntv_cval_el0 :0x0000000000000000 + cntkctl_el1 :0x0000000000000000 + fpexc32_el2 :0x0000000004000700 + sp_el0 :0x0000000004010780 + +Guidelines for Reset Handlers +----------------------------- + +Trusted Firmware implements a framework that allows CPU and platform ports to +perform actions very early after a CPU is released from reset in both the cold +and warm boot paths. This is done by calling the ``reset_handler()`` function in +both the BL1 and BL31 images. It in turn calls the platform and CPU specific +reset handling functions. + +Details for implementing a CPU specific reset handler can be found in +Section 8. Details for implementing a platform specific reset handler can be +found in the `Porting Guide`_ (see the ``plat_reset_handler()`` function). + +When adding functionality to a reset handler, keep in mind that if a different +reset handling behavior is required between the first and the subsequent +invocations of the reset handling code, this should be detected at runtime. +In other words, the reset handler should be able to detect whether an action has +already been performed and act as appropriate. Possible courses of actions are, +e.g. skip the action the second time, or undo/redo it. + +CPU specific operations framework +--------------------------------- + +Certain aspects of the ARMv8 architecture are implementation defined, +that is, certain behaviours are not architecturally defined, but must be defined +and documented by individual processor implementations. The ARM Trusted +Firmware implements a framework which categorises the common implementation +defined behaviours and allows a processor to export its implementation of that +behaviour. The categories are: + +#. Processor specific reset sequence. + +#. Processor specific power down sequences. + +#. Processor specific register dumping as a part of crash reporting. + +#. Errata status reporting. + +Each of the above categories fulfils a different requirement. + +#. allows any processor specific initialization before the caches and MMU + are turned on, like implementation of errata workarounds, entry into + the intra-cluster coherency domain etc. + +#. allows each processor to implement the power down sequence mandated in + its Technical Reference Manual (TRM). + +#. allows a processor to provide additional information to the developer + in the event of a crash, for example Cortex-A53 has registers which + can expose the data cache contents. + +#. allows a processor to define a function that inspects and reports the status + of all errata workarounds on that processor. + +Please note that only 2. is mandated by the TRM. + +The CPU specific operations framework scales to accommodate a large number of +different CPUs during power down and reset handling. The platform can specify +any CPU optimization it wants to enable for each CPU. It can also specify +the CPU errata workarounds to be applied for each CPU type during reset +handling by defining CPU errata compile time macros. Details on these macros +can be found in the `cpu-specific-build-macros.rst`_ file. + +The CPU specific operations framework depends on the ``cpu_ops`` structure which +needs to be exported for each type of CPU in the platform. It is defined in +``include/lib/cpus/aarch64/cpu_macros.S`` and has the following fields : ``midr``, +``reset_func()``, ``cpu_pwr_down_ops`` (array of power down functions) and +``cpu_reg_dump()``. + +The CPU specific files in ``lib/cpus`` export a ``cpu_ops`` data structure with +suitable handlers for that CPU. For example, ``lib/cpus/aarch64/cortex_a53.S`` +exports the ``cpu_ops`` for Cortex-A53 CPU. According to the platform +configuration, these CPU specific files must be included in the build by +the platform makefile. The generic CPU specific operations framework code exists +in ``lib/cpus/aarch64/cpu_helpers.S``. + +CPU specific Reset Handling +~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +After a reset, the state of the CPU when it calls generic reset handler is: +MMU turned off, both instruction and data caches turned off and not part +of any coherency domain. + +The BL entrypoint code first invokes the ``plat_reset_handler()`` to allow +the platform to perform any system initialization required and any system +errata workarounds that needs to be applied. The ``get_cpu_ops_ptr()`` reads +the current CPU midr, finds the matching ``cpu_ops`` entry in the ``cpu_ops`` +array and returns it. Note that only the part number and implementer fields +in midr are used to find the matching ``cpu_ops`` entry. The ``reset_func()`` in +the returned ``cpu_ops`` is then invoked which executes the required reset +handling for that CPU and also any errata workarounds enabled by the platform. +This function must preserve the values of general purpose registers x20 to x29. + +Refer to Section "Guidelines for Reset Handlers" for general guidelines +regarding placement of code in a reset handler. + +CPU specific power down sequence +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +During the BL31 initialization sequence, the pointer to the matching ``cpu_ops`` +entry is stored in per-CPU data by ``init_cpu_ops()`` so that it can be quickly +retrieved during power down sequences. + +Various CPU drivers register handlers to perform power down at certain power +levels for that specific CPU. The PSCI service, upon receiving a power down +request, determines the highest power level at which to execute power down +sequence for a particular CPU. It uses the ``prepare_cpu_pwr_dwn()`` function to +pick the right power down handler for the requested level. The function +retrieves ``cpu_ops`` pointer member of per-CPU data, and from that, further +retrieves ``cpu_pwr_down_ops`` array, and indexes into the required level. If the +requested power level is higher than what a CPU driver supports, the handler +registered for highest level is invoked. + +At runtime the platform hooks for power down are invoked by the PSCI service to +perform platform specific operations during a power down sequence, for example +turning off CCI coherency during a cluster power down. + +CPU specific register reporting during crash +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +If the crash reporting is enabled in BL31, when a crash occurs, the crash +reporting framework calls ``do_cpu_reg_dump`` which retrieves the matching +``cpu_ops`` using ``get_cpu_ops_ptr()`` function. The ``cpu_reg_dump()`` in +``cpu_ops`` is invoked, which then returns the CPU specific register values to +be reported and a pointer to the ASCII list of register names in a format +expected by the crash reporting framework. + +CPU errata status reporting +~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +Errata workarounds for CPUs supported in ARM Trusted Firmware are applied during +both cold and warm boots, shortly after reset. Individual Errata workarounds are +enabled as build options. Some errata workarounds have potential run-time +implications; therefore some are enabled by default, others not. Platform ports +shall override build options to enable or disable errata as appropriate. The CPU +drivers take care of applying errata workarounds that are enabled and applicable +to a given CPU. Refer to the section titled *CPU Errata Workarounds* in `CPUBM`_ +for more information. + +Functions in CPU drivers that apply errata workaround must follow the +conventions listed below. + +The errata workaround must be authored as two separate functions: + +- One that checks for errata. This function must determine whether that errata + applies to the current CPU. Typically this involves matching the current + CPUs revision and variant against a value that's known to be affected by the + errata. If the function determines that the errata applies to this CPU, it + must return ``ERRATA_APPLIES``; otherwise, it must return + ``ERRATA_NOT_APPLIES``. The utility functions ``cpu_get_rev_var`` and + ``cpu_rev_var_ls`` functions may come in handy for this purpose. + +For an errata identified as ``E``, the check function must be named +``check_errata_E``. + +This function will be invoked at different times, both from assembly and from +C run time. Therefore it must follow AAPCS, and must not use stack. + +- Another one that applies the errata workaround. This function would call the + check function described above, and applies errata workaround if required. + +CPU drivers that apply errata workaround can optionally implement an assembly +function that report the status of errata workarounds pertaining to that CPU. +For a driver that registers the CPU, for example, ``cpux`` via. ``declare_cpu_ops`` +macro, the errata reporting function, if it exists, must be named +``cpux_errata_report``. This function will always be called with MMU enabled; it +must follow AAPCS and may use stack. + +In a debug build of ARM Trusted Firmware, on a CPU that comes out of reset, both +BL1 and the run time firmware (BL31 in AArch64, and BL32 in AArch32) will invoke +errata status reporting function, if one exists, for that type of CPU. + +To report the status of each errata workaround, the function shall use the +assembler macro ``report_errata``, passing it: + +- The build option that enables the errata; + +- The name of the CPU: this must be the same identifier that CPU driver + registered itself with, using ``declare_cpu_ops``; + +- And the errata identifier: the identifier must match what's used in the + errata's check function described above. + +The errata status reporting function will be called once per CPU type/errata +combination during the software's active life time. + +It's expected that whenever an errata workaround is submitted to ARM Trusted +Firmware, the errata reporting function is appropriately extended to report its +status as well. + +Reporting the status of errata workaround is for informational purpose only; it +has no functional significance. + +Memory layout of BL images +-------------------------- + +Each bootloader image can be divided in 2 parts: + +- the static contents of the image. These are data actually stored in the + binary on the disk. In the ELF terminology, they are called ``PROGBITS`` + sections; + +- the run-time contents of the image. These are data that don't occupy any + space in the binary on the disk. The ELF binary just contains some + metadata indicating where these data will be stored at run-time and the + corresponding sections need to be allocated and initialized at run-time. + In the ELF terminology, they are called ``NOBITS`` sections. + +All PROGBITS sections are grouped together at the beginning of the image, +followed by all NOBITS sections. This is true for all Trusted Firmware images +and it is governed by the linker scripts. This ensures that the raw binary +images are as small as possible. If a NOBITS section was inserted in between +PROGBITS sections then the resulting binary file would contain zero bytes in +place of this NOBITS section, making the image unnecessarily bigger. Smaller +images allow faster loading from the FIP to the main memory. + +Linker scripts and symbols +~~~~~~~~~~~~~~~~~~~~~~~~~~ + +Each bootloader stage image layout is described by its own linker script. The +linker scripts export some symbols into the program symbol table. Their values +correspond to particular addresses. The trusted firmware code can refer to these +symbols to figure out the image memory layout. + +Linker symbols follow the following naming convention in the trusted firmware. + +- ``__<SECTION>_START__`` + + Start address of a given section named ``<SECTION>``. + +- ``__<SECTION>_END__`` + + End address of a given section named ``<SECTION>``. If there is an alignment + constraint on the section's end address then ``__<SECTION>_END__`` corresponds + to the end address of the section's actual contents, rounded up to the right + boundary. Refer to the value of ``__<SECTION>_UNALIGNED_END__`` to know the + actual end address of the section's contents. + +- ``__<SECTION>_UNALIGNED_END__`` + + End address of a given section named ``<SECTION>`` without any padding or + rounding up due to some alignment constraint. + +- ``__<SECTION>_SIZE__`` + + Size (in bytes) of a given section named ``<SECTION>``. If there is an + alignment constraint on the section's end address then ``__<SECTION>_SIZE__`` + corresponds to the size of the section's actual contents, rounded up to the + right boundary. In other words, ``__<SECTION>_SIZE__ = __<SECTION>_END__ - _<SECTION>_START__``. Refer to the value of ``__<SECTION>_UNALIGNED_SIZE__`` + to know the actual size of the section's contents. + +- ``__<SECTION>_UNALIGNED_SIZE__`` + + Size (in bytes) of a given section named ``<SECTION>`` without any padding or + rounding up due to some alignment constraint. In other words, + ``__<SECTION>_UNALIGNED_SIZE__ = __<SECTION>_UNALIGNED_END__ - __<SECTION>_START__``. + +Some of the linker symbols are mandatory as the trusted firmware code relies on +them to be defined. They are listed in the following subsections. Some of them +must be provided for each bootloader stage and some are specific to a given +bootloader stage. + +The linker scripts define some extra, optional symbols. They are not actually +used by any code but they help in understanding the bootloader images' memory +layout as they are easy to spot in the link map files. + +Common linker symbols +^^^^^^^^^^^^^^^^^^^^^ + +All BL images share the following requirements: + +- The BSS section must be zero-initialised before executing any C code. +- The coherent memory section (if enabled) must be zero-initialised as well. +- The MMU setup code needs to know the extents of the coherent and read-only + memory regions to set the right memory attributes. When + ``SEPARATE_CODE_AND_RODATA=1``, it needs to know more specifically how the + read-only memory region is divided between code and data. + +The following linker symbols are defined for this purpose: + +- ``__BSS_START__`` +- ``__BSS_SIZE__`` +- ``__COHERENT_RAM_START__`` Must be aligned on a page-size boundary. +- ``__COHERENT_RAM_END__`` Must be aligned on a page-size boundary. +- ``__COHERENT_RAM_UNALIGNED_SIZE__`` +- ``__RO_START__`` +- ``__RO_END__`` +- ``__TEXT_START__`` +- ``__TEXT_END__`` +- ``__RODATA_START__`` +- ``__RODATA_END__`` + +BL1's linker symbols +^^^^^^^^^^^^^^^^^^^^ + +BL1 being the ROM image, it has additional requirements. BL1 resides in ROM and +it is entirely executed in place but it needs some read-write memory for its +mutable data. Its ``.data`` section (i.e. its allocated read-write data) must be +relocated from ROM to RAM before executing any C code. + +The following additional linker symbols are defined for BL1: + +- ``__BL1_ROM_END__`` End address of BL1's ROM contents, covering its code + and ``.data`` section in ROM. +- ``__DATA_ROM_START__`` Start address of the ``.data`` section in ROM. Must be + aligned on a 16-byte boundary. +- ``__DATA_RAM_START__`` Address in RAM where the ``.data`` section should be + copied over. Must be aligned on a 16-byte boundary. +- ``__DATA_SIZE__`` Size of the ``.data`` section (in ROM or RAM). +- ``__BL1_RAM_START__`` Start address of BL1 read-write data. +- ``__BL1_RAM_END__`` End address of BL1 read-write data. + +How to choose the right base addresses for each bootloader stage image +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +There is currently no support for dynamic image loading in the Trusted Firmware. +This means that all bootloader images need to be linked against their ultimate +runtime locations and the base addresses of each image must be chosen carefully +such that images don't overlap each other in an undesired way. As the code +grows, the base addresses might need adjustments to cope with the new memory +layout. + +The memory layout is completely specific to the platform and so there is no +general recipe for choosing the right base addresses for each bootloader image. +However, there are tools to aid in understanding the memory layout. These are +the link map files: ``build/<platform>/<build-type>/bl<x>/bl<x>.map``, with ``<x>`` +being the stage bootloader. They provide a detailed view of the memory usage of +each image. Among other useful information, they provide the end address of +each image. + +- ``bl1.map`` link map file provides ``__BL1_RAM_END__`` address. +- ``bl2.map`` link map file provides ``__BL2_END__`` address. +- ``bl31.map`` link map file provides ``__BL31_END__`` address. +- ``bl32.map`` link map file provides ``__BL32_END__`` address. + +For each bootloader image, the platform code must provide its start address +as well as a limit address that it must not overstep. The latter is used in the +linker scripts to check that the image doesn't grow past that address. If that +happens, the linker will issue a message similar to the following: + +:: + + aarch64-none-elf-ld: BLx has exceeded its limit. + +Additionally, if the platform memory layout implies some image overlaying like +on FVP, BL31 and TSP need to know the limit address that their PROGBITS +sections must not overstep. The platform code must provide those. + +When LOAD\_IMAGE\_V2 is disabled, Trusted Firmware provides a mechanism to +verify at boot time that the memory to load a new image is free to prevent +overwriting a previously loaded image. For this mechanism to work, the platform +must specify the memory available in the system as regions, where each region +consists of base address, total size and the free area within it (as defined +in the ``meminfo_t`` structure). Trusted Firmware retrieves these memory regions +by calling the corresponding platform API: + +- ``meminfo_t *bl1_plat_sec_mem_layout(void)`` +- ``meminfo_t *bl2_plat_sec_mem_layout(void)`` +- ``void bl2_plat_get_scp_bl2_meminfo(meminfo_t *scp_bl2_meminfo)`` +- ``void bl2_plat_get_bl32_meminfo(meminfo_t *bl32_meminfo)`` +- ``void bl2_plat_get_bl33_meminfo(meminfo_t *bl33_meminfo)`` + +For example, in the case of BL1 loading BL2, ``bl1_plat_sec_mem_layout()`` will +return the region defined by the platform where BL1 intends to load BL2. The +``load_image()`` function will check that the memory where BL2 will be loaded is +within the specified region and marked as free. + +The actual number of regions and their base addresses and sizes is platform +specific. The platform may return the same region or define a different one for +each API. However, the overlap verification mechanism applies only to a single +region. Hence, it is the platform responsibility to guarantee that different +regions do not overlap, or that if they do, the overlapping images are not +accessed at the same time. This could be used, for example, to load temporary +images (e.g. certificates) or firmware images prior to being transfered to its +corresponding processor (e.g. the SCP BL2 image). + +To reduce fragmentation and simplify the tracking of free memory, all the free +memory within a region is always located in one single buffer defined by its +base address and size. Trusted Firmware implements a top/bottom load approach: +after a new image is loaded, it checks how much memory remains free above and +below the image. The smallest area is marked as unavailable, while the larger +area becomes the new free memory buffer. Platforms should take this behaviour +into account when defining the base address for each of the images. For example, +if an image is loaded near the middle of the region, small changes in image size +could cause a flip between a top load and a bottom load, which may result in an +unexpected memory layout. + +The following diagram is an example of an image loaded in the bottom part of +the memory region. The region is initially free (nothing has been loaded yet): + +:: + + Memory region + +----------+ + | | + | | <<<<<<<<<<<<< Free + | | + |----------| +------------+ + | image | <<<<<<<<<<<<< | image | + |----------| +------------+ + | xxxxxxxx | <<<<<<<<<<<<< Marked as unavailable + +----------+ + +And the following diagram is an example of an image loaded in the top part: + +:: + + Memory region + +----------+ + | xxxxxxxx | <<<<<<<<<<<<< Marked as unavailable + |----------| +------------+ + | image | <<<<<<<<<<<<< | image | + |----------| +------------+ + | | + | | <<<<<<<<<<<<< Free + | | + +----------+ + +When LOAD\_IMAGE\_V2 is enabled, Trusted Firmware does not provide any mechanism +to verify at boot time that the memory to load a new image is free to prevent +overwriting a previously loaded image. The platform must specify the memory +available in the system for all the relevant BL images to be loaded. + +For example, in the case of BL1 loading BL2, ``bl1_plat_sec_mem_layout()`` will +return the region defined by the platform where BL1 intends to load BL2. The +``load_image()`` function performs bounds check for the image size based on the +base and maximum image size provided by the platforms. Platforms must take +this behaviour into account when defining the base/size for each of the images. + +Memory layout on ARM development platforms +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +The following list describes the memory layout on the ARM development platforms: + +- A 4KB page of shared memory is used for communication between Trusted + Firmware and the platform's power controller. This is located at the base of + Trusted SRAM. The amount of Trusted SRAM available to load the bootloader + images is reduced by the size of the shared memory. + + The shared memory is used to store the CPUs' entrypoint mailbox. On Juno, + this is also used for the MHU payload when passing messages to and from the + SCP. + +- On FVP, BL1 is originally sitting in the Trusted ROM at address ``0x0``. On + Juno, BL1 resides in flash memory at address ``0x0BEC0000``. BL1 read-write + data are relocated to the top of Trusted SRAM at runtime. + +- EL3 Runtime Software, BL31 for AArch64 and BL32 for AArch32 (e.g. SP\_MIN), + is loaded at the top of the Trusted SRAM, such that its NOBITS sections will + overwrite BL1 R/W data. This implies that BL1 global variables remain valid + only until execution reaches the EL3 Runtime Software entry point during a + cold boot. + +- BL2 is loaded below EL3 Runtime Software. + +- On Juno, SCP\_BL2 is loaded temporarily into the EL3 Runtime Software memory + region and transfered to the SCP before being overwritten by EL3 Runtime + Software. + +- BL32 (for AArch64) can be loaded in one of the following locations: + + - Trusted SRAM + - Trusted DRAM (FVP only) + - Secure region of DRAM (top 16MB of DRAM configured by the TrustZone + controller) + + When BL32 (for AArch64) is loaded into Trusted SRAM, its NOBITS sections + are allowed to overlay BL2. This memory layout is designed to give the + BL32 image as much memory as possible when it is loaded into Trusted SRAM. + +When LOAD\_IMAGE\_V2 is disabled the memory regions for the overlap detection +mechanism at boot time are defined as follows (shown per API): + +- ``meminfo_t *bl1_plat_sec_mem_layout(void)`` + + This region corresponds to the whole Trusted SRAM except for the shared + memory at the base. This region is initially free. At boot time, BL1 will + mark the BL1(rw) section within this region as occupied. The BL1(rw) section + is placed at the top of Trusted SRAM. + +- ``meminfo_t *bl2_plat_sec_mem_layout(void)`` + + This region corresponds to the whole Trusted SRAM as defined by + ``bl1_plat_sec_mem_layout()``, but with the BL1(rw) section marked as + occupied. This memory region is used to check that BL2 and BL31 do not + overlap with each other. BL2\_BASE and BL1\_RW\_BASE are carefully chosen so + that the memory for BL31 is top loaded above BL2. + +- ``void bl2_plat_get_scp_bl2_meminfo(meminfo_t *scp_bl2_meminfo)`` + + This region is an exact copy of the region defined by + ``bl2_plat_sec_mem_layout()``. Being a disconnected copy means that all the + changes made to this region by the Trusted Firmware will not be propagated. + This approach is valid because the SCP BL2 image is loaded temporarily + while it is being transferred to the SCP, so this memory is reused + afterwards. + +- ``void bl2_plat_get_bl32_meminfo(meminfo_t *bl32_meminfo)`` + + This region depends on the location of the BL32 image. Currently, ARM + platforms support three different locations (detailed below): Trusted SRAM, + Trusted DRAM and the TZC-Secured DRAM. + +- ``void bl2_plat_get_bl33_meminfo(meminfo_t *bl33_meminfo)`` + + This region corresponds to the Non-Secure DDR-DRAM, excluding the + TZC-Secured area. + +The location of the BL32 image will result in different memory maps. This is +illustrated for both FVP and Juno in the following diagrams, using the TSP as +an example. + +Note: Loading the BL32 image in TZC secured DRAM doesn't change the memory +layout of the other images in Trusted SRAM. + +**FVP with TSP in Trusted SRAM (default option):** +(These diagrams only cover the AArch64 case) + +:: + + Trusted SRAM + 0x04040000 +----------+ loaded by BL2 ------------------ + | BL1 (rw) | <<<<<<<<<<<<< | BL31 NOBITS | + |----------| <<<<<<<<<<<<< |----------------| + | | <<<<<<<<<<<<< | BL31 PROGBITS | + |----------| ------------------ + | BL2 | <<<<<<<<<<<<< | BL32 NOBITS | + |----------| <<<<<<<<<<<<< |----------------| + | | <<<<<<<<<<<<< | BL32 PROGBITS | + 0x04001000 +----------+ ------------------ + | Shared | + 0x04000000 +----------+ + + Trusted ROM + 0x04000000 +----------+ + | BL1 (ro) | + 0x00000000 +----------+ + +**FVP with TSP in Trusted DRAM:** + +:: + + Trusted DRAM + 0x08000000 +----------+ + | BL32 | + 0x06000000 +----------+ + + Trusted SRAM + 0x04040000 +----------+ loaded by BL2 ------------------ + | BL1 (rw) | <<<<<<<<<<<<< | BL31 NOBITS | + |----------| <<<<<<<<<<<<< |----------------| + | | <<<<<<<<<<<<< | BL31 PROGBITS | + |----------| ------------------ + | BL2 | + |----------| + | | + 0x04001000 +----------+ + | Shared | + 0x04000000 +----------+ + + Trusted ROM + 0x04000000 +----------+ + | BL1 (ro) | + 0x00000000 +----------+ + +**FVP with TSP in TZC-Secured DRAM:** + +:: + + DRAM + 0xffffffff +----------+ + | BL32 | (secure) + 0xff000000 +----------+ + | | + : : (non-secure) + | | + 0x80000000 +----------+ + + Trusted SRAM + 0x04040000 +----------+ loaded by BL2 ------------------ + | BL1 (rw) | <<<<<<<<<<<<< | BL31 NOBITS | + |----------| <<<<<<<<<<<<< |----------------| + | | <<<<<<<<<<<<< | BL31 PROGBITS | + |----------| ------------------ + | BL2 | + |----------| + | | + 0x04001000 +----------+ + | Shared | + 0x04000000 +----------+ + + Trusted ROM + 0x04000000 +----------+ + | BL1 (ro) | + 0x00000000 +----------+ + +**Juno with BL32 in Trusted SRAM (default option):** + +:: + + Flash0 + 0x0C000000 +----------+ + : : + 0x0BED0000 |----------| + | BL1 (ro) | + 0x0BEC0000 |----------| + : : + 0x08000000 +----------+ BL31 is loaded + after SCP_BL2 has + Trusted SRAM been sent to SCP + 0x04040000 +----------+ loaded by BL2 ------------------ + | BL1 (rw) | <<<<<<<<<<<<< | BL31 NOBITS | + |----------| <<<<<<<<<<<<< |----------------| + | SCP_BL2 | <<<<<<<<<<<<< | BL31 PROGBITS | + |----------| ------------------ + | BL2 | <<<<<<<<<<<<< | BL32 NOBITS | + |----------| <<<<<<<<<<<<< |----------------| + | | <<<<<<<<<<<<< | BL32 PROGBITS | + 0x04001000 +----------+ ------------------ + | MHU | + 0x04000000 +----------+ + +**Juno with BL32 in TZC-secured DRAM:** + +:: + + DRAM + 0xFFE00000 +----------+ + | BL32 | (secure) + 0xFF000000 |----------| + | | + : : (non-secure) + | | + 0x80000000 +----------+ + + Flash0 + 0x0C000000 +----------+ + : : + 0x0BED0000 |----------| + | BL1 (ro) | + 0x0BEC0000 |----------| + : : + 0x08000000 +----------+ BL31 is loaded + after SCP_BL2 has + Trusted SRAM been sent to SCP + 0x04040000 +----------+ loaded by BL2 ------------------ + | BL1 (rw) | <<<<<<<<<<<<< | BL31 NOBITS | + |----------| <<<<<<<<<<<<< |----------------| + | SCP_BL2 | <<<<<<<<<<<<< | BL31 PROGBITS | + |----------| ------------------ + | BL2 | + |----------| + | | + 0x04001000 +----------+ + | MHU | + 0x04000000 +----------+ + +Firmware Image Package (FIP) +---------------------------- + +Using a Firmware Image Package (FIP) allows for packing bootloader images (and +potentially other payloads) into a single archive that can be loaded by the ARM +Trusted Firmware from non-volatile platform storage. A driver to load images +from a FIP has been added to the storage layer and allows a package to be read +from supported platform storage. A tool to create Firmware Image Packages is +also provided and described below. + +Firmware Image Package layout +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +The FIP layout consists of a table of contents (ToC) followed by payload data. +The ToC itself has a header followed by one or more table entries. The ToC is +terminated by an end marker entry. All ToC entries describe some payload data +that has been appended to the end of the binary package. With the information +provided in the ToC entry the corresponding payload data can be retrieved. + +:: + + ------------------ + | ToC Header | + |----------------| + | ToC Entry 0 | + |----------------| + | ToC Entry 1 | + |----------------| + | ToC End Marker | + |----------------| + | | + | Data 0 | + | | + |----------------| + | | + | Data 1 | + | | + ------------------ + +The ToC header and entry formats are described in the header file +``include/tools_share/firmware_image_package.h``. This file is used by both the +tool and the ARM Trusted firmware. + +The ToC header has the following fields: + +:: + + `name`: The name of the ToC. This is currently used to validate the header. + `serial_number`: A non-zero number provided by the creation tool + `flags`: Flags associated with this data. + Bits 0-31: Reserved + Bits 32-47: Platform defined + Bits 48-63: Reserved + +A ToC entry has the following fields: + +:: + + `uuid`: All files are referred to by a pre-defined Universally Unique + IDentifier [UUID] . The UUIDs are defined in + `include/tools_share/firmware_image_package.h`. The platform translates + the requested image name into the corresponding UUID when accessing the + package. + `offset_address`: The offset address at which the corresponding payload data + can be found. The offset is calculated from the ToC base address. + `size`: The size of the corresponding payload data in bytes. + `flags`: Flags associated with this entry. Non are yet defined. + +Firmware Image Package creation tool +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +The FIP creation tool can be used to pack specified images into a binary package +that can be loaded by the ARM Trusted Firmware from platform storage. The tool +currently only supports packing bootloader images. Additional image definitions +can be added to the tool as required. + +The tool can be found in ``tools/fiptool``. + +Loading from a Firmware Image Package (FIP) +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +The Firmware Image Package (FIP) driver can load images from a binary package on +non-volatile platform storage. For the ARM development platforms, this is +currently NOR FLASH. + +Bootloader images are loaded according to the platform policy as specified by +the function ``plat_get_image_source()``. For the ARM development platforms, this +means the platform will attempt to load images from a Firmware Image Package +located at the start of NOR FLASH0. + +The ARM development platforms' policy is to only allow loading of a known set of +images. The platform policy can be modified to allow additional images. + +Use of coherent memory in Trusted Firmware +------------------------------------------ + +There might be loss of coherency when physical memory with mismatched +shareability, cacheability and memory attributes is accessed by multiple CPUs +(refer to section B2.9 of `ARM ARM`_ for more details). This possibility occurs +in Trusted Firmware during power up/down sequences when coherency, MMU and +caches are turned on/off incrementally. + +Trusted Firmware defines coherent memory as a region of memory with Device +nGnRE attributes in the translation tables. The translation granule size in +Trusted Firmware is 4KB. This is the smallest possible size of the coherent +memory region. + +By default, all data structures which are susceptible to accesses with +mismatched attributes from various CPUs are allocated in a coherent memory +region (refer to section 2.1 of `Porting Guide`_). The coherent memory region +accesses are Outer Shareable, non-cacheable and they can be accessed +with the Device nGnRE attributes when the MMU is turned on. Hence, at the +expense of at least an extra page of memory, Trusted Firmware is able to work +around coherency issues due to mismatched memory attributes. + +The alternative to the above approach is to allocate the susceptible data +structures in Normal WriteBack WriteAllocate Inner shareable memory. This +approach requires the data structures to be designed so that it is possible to +work around the issue of mismatched memory attributes by performing software +cache maintenance on them. + +Disabling the use of coherent memory in Trusted Firmware +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +It might be desirable to avoid the cost of allocating coherent memory on +platforms which are memory constrained. Trusted Firmware enables inclusion of +coherent memory in firmware images through the build flag ``USE_COHERENT_MEM``. +This flag is enabled by default. It can be disabled to choose the second +approach described above. + +The below sections analyze the data structures allocated in the coherent memory +region and the changes required to allocate them in normal memory. + +Coherent memory usage in PSCI implementation +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +The ``psci_non_cpu_pd_nodes`` data structure stores the platform's power domain +tree information for state management of power domains. By default, this data +structure is allocated in the coherent memory region in the Trusted Firmware +because it can be accessed by multple CPUs, either with caches enabled or +disabled. + +.. code:: c + + typedef struct non_cpu_pwr_domain_node { + /* + * Index of the first CPU power domain node level 0 which has this node + * as its parent. + */ + unsigned int cpu_start_idx; + + /* + * Number of CPU power domains which are siblings of the domain indexed + * by 'cpu_start_idx' i.e. all the domains in the range 'cpu_start_idx + * -> cpu_start_idx + ncpus' have this node as their parent. + */ + unsigned int ncpus; + + /* + * Index of the parent power domain node. + * TODO: Figure out whether to whether using pointer is more efficient. + */ + unsigned int parent_node; + + plat_local_state_t local_state; + + unsigned char level; + + /* For indexing the psci_lock array*/ + unsigned char lock_index; + } non_cpu_pd_node_t; + +In order to move this data structure to normal memory, the use of each of its +fields must be analyzed. Fields like ``cpu_start_idx``, ``ncpus``, ``parent_node`` +``level`` and ``lock_index`` are only written once during cold boot. Hence removing +them from coherent memory involves only doing a clean and invalidate of the +cache lines after these fields are written. + +The field ``local_state`` can be concurrently accessed by multiple CPUs in +different cache states. A Lamport's Bakery lock ``psci_locks`` is used to ensure +mutual exlusion to this field and a clean and invalidate is needed after it +is written. + +Bakery lock data +~~~~~~~~~~~~~~~~ + +The bakery lock data structure ``bakery_lock_t`` is allocated in coherent memory +and is accessed by multiple CPUs with mismatched attributes. ``bakery_lock_t`` is +defined as follows: + +.. code:: c + + typedef struct bakery_lock { + /* + * The lock_data is a bit-field of 2 members: + * Bit[0] : choosing. This field is set when the CPU is + * choosing its bakery number. + * Bits[1 - 15] : number. This is the bakery number allocated. + */ + volatile uint16_t lock_data[BAKERY_LOCK_MAX_CPUS]; + } bakery_lock_t; + +It is a characteristic of Lamport's Bakery algorithm that the volatile per-CPU +fields can be read by all CPUs but only written to by the owning CPU. + +Depending upon the data cache line size, the per-CPU fields of the +``bakery_lock_t`` structure for multiple CPUs may exist on a single cache line. +These per-CPU fields can be read and written during lock contention by multiple +CPUs with mismatched memory attributes. Since these fields are a part of the +lock implementation, they do not have access to any other locking primitive to +safeguard against the resulting coherency issues. As a result, simple software +cache maintenance is not enough to allocate them in coherent memory. Consider +the following example. + +CPU0 updates its per-CPU field with data cache enabled. This write updates a +local cache line which contains a copy of the fields for other CPUs as well. Now +CPU1 updates its per-CPU field of the ``bakery_lock_t`` structure with data cache +disabled. CPU1 then issues a DCIVAC operation to invalidate any stale copies of +its field in any other cache line in the system. This operation will invalidate +the update made by CPU0 as well. + +To use bakery locks when ``USE_COHERENT_MEM`` is disabled, the lock data structure +has been redesigned. The changes utilise the characteristic of Lamport's Bakery +algorithm mentioned earlier. The bakery\_lock structure only allocates the memory +for a single CPU. The macro ``DEFINE_BAKERY_LOCK`` allocates all the bakery locks +needed for a CPU into a section ``bakery_lock``. The linker allocates the memory +for other cores by using the total size allocated for the bakery\_lock section +and multiplying it with (PLATFORM\_CORE\_COUNT - 1). This enables software to +perform software cache maintenance on the lock data structure without running +into coherency issues associated with mismatched attributes. + +The bakery lock data structure ``bakery_info_t`` is defined for use when +``USE_COHERENT_MEM`` is disabled as follows: + +.. code:: c + + typedef struct bakery_info { + /* + * The lock_data is a bit-field of 2 members: + * Bit[0] : choosing. This field is set when the CPU is + * choosing its bakery number. + * Bits[1 - 15] : number. This is the bakery number allocated. + */ + volatile uint16_t lock_data; + } bakery_info_t; + +The ``bakery_info_t`` represents a single per-CPU field of one lock and +the combination of corresponding ``bakery_info_t`` structures for all CPUs in the +system represents the complete bakery lock. The view in memory for a system +with n bakery locks are: + +:: + + bakery_lock section start + |----------------| + | `bakery_info_t`| <-- Lock_0 per-CPU field + | Lock_0 | for CPU0 + |----------------| + | `bakery_info_t`| <-- Lock_1 per-CPU field + | Lock_1 | for CPU0 + |----------------| + | .... | + |----------------| + | `bakery_info_t`| <-- Lock_N per-CPU field + | Lock_N | for CPU0 + ------------------ + | XXXXX | + | Padding to | + | next Cache WB | <--- Calculate PERCPU_BAKERY_LOCK_SIZE, allocate + | Granule | continuous memory for remaining CPUs. + ------------------ + | `bakery_info_t`| <-- Lock_0 per-CPU field + | Lock_0 | for CPU1 + |----------------| + | `bakery_info_t`| <-- Lock_1 per-CPU field + | Lock_1 | for CPU1 + |----------------| + | .... | + |----------------| + | `bakery_info_t`| <-- Lock_N per-CPU field + | Lock_N | for CPU1 + ------------------ + | XXXXX | + | Padding to | + | next Cache WB | + | Granule | + ------------------ + +Consider a system of 2 CPUs with 'N' bakery locks as shown above. For an +operation on Lock\_N, the corresponding ``bakery_info_t`` in both CPU0 and CPU1 +``bakery_lock`` section need to be fetched and appropriate cache operations need +to be performed for each access. + +On ARM Platforms, bakery locks are used in psci (``psci_locks``) and power controller +driver (``arm_lock``). + +Non Functional Impact of removing coherent memory +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +Removal of the coherent memory region leads to the additional software overhead +of performing cache maintenance for the affected data structures. However, since +the memory where the data structures are allocated is cacheable, the overhead is +mostly mitigated by an increase in performance. + +There is however a performance impact for bakery locks, due to: + +- Additional cache maintenance operations, and +- Multiple cache line reads for each lock operation, since the bakery locks + for each CPU are distributed across different cache lines. + +The implementation has been optimized to minimize this additional overhead. +Measurements indicate that when bakery locks are allocated in Normal memory, the +minimum latency of acquiring a lock is on an average 3-4 micro seconds whereas +in Device memory the same is 2 micro seconds. The measurements were done on the +Juno ARM development platform. + +As mentioned earlier, almost a page of memory can be saved by disabling +``USE_COHERENT_MEM``. Each platform needs to consider these trade-offs to decide +whether coherent memory should be used. If a platform disables +``USE_COHERENT_MEM`` and needs to use bakery locks in the porting layer, it can +optionally define macro ``PLAT_PERCPU_BAKERY_LOCK_SIZE`` (see the +`Porting Guide`_). Refer to the reference platform code for examples. + +Isolating code and read-only data on separate memory pages +---------------------------------------------------------- + +In the ARMv8 VMSA, translation table entries include fields that define the +properties of the target memory region, such as its access permissions. The +smallest unit of memory that can be addressed by a translation table entry is +a memory page. Therefore, if software needs to set different permissions on two +memory regions then it needs to map them using different memory pages. + +The default memory layout for each BL image is as follows: + +:: + + | ... | + +-------------------+ + | Read-write data | + +-------------------+ Page boundary + | <Padding> | + +-------------------+ + | Exception vectors | + +-------------------+ 2 KB boundary + | <Padding> | + +-------------------+ + | Read-only data | + +-------------------+ + | Code | + +-------------------+ BLx_BASE + +Note: The 2KB alignment for the exception vectors is an architectural +requirement. + +The read-write data start on a new memory page so that they can be mapped with +read-write permissions, whereas the code and read-only data below are configured +as read-only. + +However, the read-only data are not aligned on a page boundary. They are +contiguous to the code. Therefore, the end of the code section and the beginning +of the read-only data one might share a memory page. This forces both to be +mapped with the same memory attributes. As the code needs to be executable, this +means that the read-only data stored on the same memory page as the code are +executable as well. This could potentially be exploited as part of a security +attack. + +TF provides the build flag ``SEPARATE_CODE_AND_RODATA`` to isolate the code and +read-only data on separate memory pages. This in turn allows independent control +of the access permissions for the code and read-only data. In this case, +platform code gets a finer-grained view of the image layout and can +appropriately map the code region as executable and the read-only data as +execute-never. + +This has an impact on memory footprint, as padding bytes need to be introduced +between the code and read-only data to ensure the segragation of the two. To +limit the memory cost, this flag also changes the memory layout such that the +code and exception vectors are now contiguous, like so: + +:: + + | ... | + +-------------------+ + | Read-write data | + +-------------------+ Page boundary + | <Padding> | + +-------------------+ + | Read-only data | + +-------------------+ Page boundary + | <Padding> | + +-------------------+ + | Exception vectors | + +-------------------+ 2 KB boundary + | <Padding> | + +-------------------+ + | Code | + +-------------------+ BLx_BASE + +With this more condensed memory layout, the separation of read-only data will +add zero or one page to the memory footprint of each BL image. Each platform +should consider the trade-off between memory footprint and security. + +This build flag is disabled by default, minimising memory footprint. On ARM +platforms, it is enabled. + +Performance Measurement Framework +--------------------------------- + +The Performance Measurement Framework (PMF) facilitates collection of +timestamps by registered services and provides interfaces to retrieve +them from within the ARM Trusted Firmware. A platform can choose to +expose appropriate SMCs to retrieve these collected timestamps. + +By default, the global physical counter is used for the timestamp +value and is read via ``CNTPCT_EL0``. The framework allows to retrieve +timestamps captured by other CPUs. + +Timestamp identifier format +~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +A PMF timestamp is uniquely identified across the system via the +timestamp ID or ``tid``. The ``tid`` is composed as follows: + +:: + + Bits 0-7: The local timestamp identifier. + Bits 8-9: Reserved. + Bits 10-15: The service identifier. + Bits 16-31: Reserved. + +#. The service identifier. Each PMF service is identified by a + service name and a service identifier. Both the service name and + identifier are unique within the system as a whole. + +#. The local timestamp identifier. This identifier is unique within a given + service. + +Registering a PMF service +~~~~~~~~~~~~~~~~~~~~~~~~~ + +To register a PMF service, the ``PMF_REGISTER_SERVICE()`` macro from ``pmf.h`` +is used. The arguments required are the service name, the service ID, +the total number of local timestamps to be captured and a set of flags. + +The ``flags`` field can be specified as a bitwise-OR of the following values: + +:: + + PMF_STORE_ENABLE: The timestamp is stored in memory for later retrieval. + PMF_DUMP_ENABLE: The timestamp is dumped on the serial console. + +The ``PMF_REGISTER_SERVICE()`` reserves memory to store captured +timestamps in a PMF specific linker section at build time. +Additionally, it defines necessary functions to capture and +retrieve a particular timestamp for the given service at runtime. + +The macro ``PMF_REGISTER_SERVICE()`` only enables capturing PMF +timestamps from within ARM Trusted Firmware. In order to retrieve +timestamps from outside of ARM Trusted Firmware, the +``PMF_REGISTER_SERVICE_SMC()`` macro must be used instead. This macro +accepts the same set of arguments as the ``PMF_REGISTER_SERVICE()`` +macro but additionally supports retrieving timestamps using SMCs. + +Capturing a timestamp +~~~~~~~~~~~~~~~~~~~~~ + +PMF timestamps are stored in a per-service timestamp region. On a +system with multiple CPUs, each timestamp is captured and stored +in a per-CPU cache line aligned memory region. + +Having registered the service, the ``PMF_CAPTURE_TIMESTAMP()`` macro can be +used to capture a timestamp at the location where it is used. The macro +takes the service name, a local timestamp identifier and a flag as arguments. + +The ``flags`` field argument can be zero, or ``PMF_CACHE_MAINT`` which +instructs PMF to do cache maintenance following the capture. Cache +maintenance is required if any of the service's timestamps are captured +with data cache disabled. + +To capture a timestamp in assembly code, the caller should use +``pmf_calc_timestamp_addr`` macro (defined in ``pmf_asm_macros.S``) to +calculate the address of where the timestamp would be stored. The +caller should then read ``CNTPCT_EL0`` register to obtain the timestamp +and store it at the determined address for later retrieval. + +Retrieving a timestamp +~~~~~~~~~~~~~~~~~~~~~~ + +From within ARM Trusted Firmware, timestamps for individual CPUs can +be retrieved using either ``PMF_GET_TIMESTAMP_BY_MPIDR()`` or +``PMF_GET_TIMESTAMP_BY_INDEX()`` macros. These macros accept the CPU's MPIDR +value, or its ordinal position, respectively. + +From outside ARM Trusted Firmware, timestamps for individual CPUs can be +retrieved by calling into ``pmf_smc_handler()``. + +.. code:: c + + Interface : pmf_smc_handler() + Argument : unsigned int smc_fid, u_register_t x1, + u_register_t x2, u_register_t x3, + u_register_t x4, void *cookie, + void *handle, u_register_t flags + Return : uintptr_t + + smc_fid: Holds the SMC identifier which is either `PMF_SMC_GET_TIMESTAMP_32` + when the caller of the SMC is running in AArch32 mode + or `PMF_SMC_GET_TIMESTAMP_64` when the caller is running in AArch64 mode. + x1: Timestamp identifier. + x2: The `mpidr` of the CPU for which the timestamp has to be retrieved. + This can be the `mpidr` of a different core to the one initiating + the SMC. In that case, service specific cache maintenance may be + required to ensure the updated copy of the timestamp is returned. + x3: A flags value that is either 0 or `PMF_CACHE_MAINT`. If + `PMF_CACHE_MAINT` is passed, then the PMF code will perform a + cache invalidate before reading the timestamp. This ensures + an updated copy is returned. + +The remaining arguments, ``x4``, ``cookie``, ``handle`` and ``flags`` are unused +in this implementation. + +PMF code structure +~~~~~~~~~~~~~~~~~~ + +#. ``pmf_main.c`` consists of core functions that implement service registration, + initialization, storing, dumping and retrieving timestamps. + +#. ``pmf_smc.c`` contains the SMC handling for registered PMF services. + +#. ``pmf.h`` contains the public interface to Performance Measurement Framework. + +#. ``pmf_asm_macros.S`` consists of macros to facilitate capturing timestamps in + assembly code. + +#. ``pmf_helpers.h`` is an internal header used by ``pmf.h``. + +#. .. rubric:: ARMv8 Architecture Extensions + :name: armv8-architecture-extensions + +ARM Trusted Firmware makes use of ARMv8 Architecture Extensions where +applicable. This section lists the usage of Architecture Extensions, and build +flags controlling them. + +In general, and unless individually mentioned, the build options +``ARM_ARCH_MAJOR`` and ``ARM_ARCH_MINOR`` selects the Architecture Extension to +target when building ARM Trusted Firmware. Subsequent ARM Architecture +Extensions are backward compatible with previous versions. + +The build system only requires that ``ARM_ARCH_MAJOR`` and ``ARM_ARCH_MINOR`` have a +valid numeric value. These build options only control whether or not +Architecture Extension-specific code is included in the build. Otherwise, ARM +Trusted Firmware targets the base ARMv8.0 architecture; i.e. as if +``ARM_ARCH_MAJOR`` == 8 and ``ARM_ARCH_MINOR`` == 0, which are also their respective +default values. + +See also the *Summary of build options* in `User Guide`_. + +For details on the Architecture Extension and available features, please refer +to the respective Architecture Extension Supplement. + +ARMv8.1 +~~~~~~~ + +This Architecture Extension is targeted when ``ARM_ARCH_MAJOR`` >= 8, or when +``ARM_ARCH_MAJOR`` == 8 and ``ARM_ARCH_MINOR`` >= 1. + +- The Compare and Swap instruction is used to implement spinlocks. Otherwise, + the load-/store-exclusive instruction pair is used. + +Code Structure +-------------- + +Trusted Firmware code is logically divided between the three boot loader +stages mentioned in the previous sections. The code is also divided into the +following categories (present as directories in the source code): + +- **Platform specific.** Choice of architecture specific code depends upon + the platform. +- **Common code.** This is platform and architecture agnostic code. +- **Library code.** This code comprises of functionality commonly used by all + other code. The PSCI implementation and other EL3 runtime frameworks reside + as Library components. +- **Stage specific.** Code specific to a boot stage. +- **Drivers.** +- **Services.** EL3 runtime services (eg: SPD). Specific SPD services + reside in the ``services/spd`` directory (e.g. ``services/spd/tspd``). + +Each boot loader stage uses code from one or more of the above mentioned +categories. Based upon the above, the code layout looks like this: + +:: + + Directory Used by BL1? Used by BL2? Used by BL31? + bl1 Yes No No + bl2 No Yes No + bl31 No No Yes + plat Yes Yes Yes + drivers Yes No Yes + common Yes Yes Yes + lib Yes Yes Yes + services No No Yes + +The build system provides a non configurable build option IMAGE\_BLx for each +boot loader stage (where x = BL stage). e.g. for BL1 , IMAGE\_BL1 will be +defined by the build system. This enables the Trusted Firmware to compile +certain code only for specific boot loader stages + +All assembler files have the ``.S`` extension. The linker source files for each +boot stage have the extension ``.ld.S``. These are processed by GCC to create the +linker scripts which have the extension ``.ld``. + +FDTs provide a description of the hardware platform and are used by the Linux +kernel at boot time. These can be found in the ``fdts`` directory. + +References +---------- + +#. Trusted Board Boot Requirements CLIENT PDD (ARM DEN 0006B-5). Available + under NDA through your ARM account representative. + +#. `Power State Coordination Interface PDD`_ + +#. `SMC Calling Convention PDD`_ + +#. `ARM Trusted Firmware Interrupt Management Design guide`_. + +-------------- + +*Copyright (c) 2013-2016, ARM Limited and Contributors. All rights reserved.* + +.. _Reset Design: ./reset-design.rst +.. _Porting Guide: ./porting-guide.rst +.. _Firmware Update: ./firmware-update.rst +.. _PSCI PDD: http://infocenter.arm.com/help/topic/com.arm.doc.den0022d/Power_State_Coordination_Interface_PDD_v1_1_DEN0022D.pdf +.. _SMC calling convention PDD: http://infocenter.arm.com/help/topic/com.arm.doc.den0028b/ARM_DEN0028B_SMC_Calling_Convention.pdf +.. _PSCI Library integration guide: ./psci-lib-integration-guide.rst +.. _SMCCC: http://infocenter.arm.com/help/topic/com.arm.doc.den0028b/ARM_DEN0028B_SMC_Calling_Convention.pdf +.. _PSCI: http://infocenter.arm.com/help/topic/com.arm.doc.den0022d/Power_State_Coordination_Interface_PDD_v1_1_DEN0022D.pdf +.. _Power State Coordination Interface PDD: http://infocenter.arm.com/help/topic/com.arm.doc.den0022d/Power_State_Coordination_Interface_PDD_v1_1_DEN0022D.pdf +.. _here: ./psci-lib-integration-guide.rst +.. _cpu-specific-build-macros.rst: ./cpu-specific-build-macros.rst +.. _CPUBM: ./cpu-specific-build-macros.rst +.. _ARM ARM: http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.ddi0487a.e/index.html +.. _User Guide: ./user-guide.rst +.. _SMC Calling Convention PDD: http://infocenter.arm.com/help/topic/com.arm.doc.den0028b/ARM_DEN0028B_SMC_Calling_Convention.pdf +.. _ARM Trusted Firmware Interrupt Management Design guide: ./interrupt-framework-design.rst + +.. |Image 1| image:: diagrams/rt-svc-descs-layout.png?raw=true |