diff options
Diffstat (limited to 'Documentation/x86')
-rw-r--r-- | Documentation/x86/boot.rst | 4 | ||||
-rw-r--r-- | Documentation/x86/buslock.rst | 126 | ||||
-rw-r--r-- | Documentation/x86/index.rst | 1 | ||||
-rw-r--r-- | Documentation/x86/mtrr.rst | 2 | ||||
-rw-r--r-- | Documentation/x86/x86_64/boot-options.rst | 31 |
5 files changed, 131 insertions, 33 deletions
diff --git a/Documentation/x86/boot.rst b/Documentation/x86/boot.rst index fc844913dece..894a19897005 100644 --- a/Documentation/x86/boot.rst +++ b/Documentation/x86/boot.rst @@ -1343,7 +1343,7 @@ follow:: In addition to read/modify/write the setup header of the struct boot_params as that of 16-bit boot protocol, the boot loader should also fill the additional fields of the struct boot_params as -described in chapter :doc:`zero-page`. +described in chapter Documentation/x86/zero-page.rst. After setting up the struct boot_params, the boot loader can load the 32/64-bit kernel in the same way as that of 16-bit boot protocol. @@ -1379,7 +1379,7 @@ can be calculated as follows:: In addition to read/modify/write the setup header of the struct boot_params as that of 16-bit boot protocol, the boot loader should also fill the additional fields of the struct boot_params as described -in chapter :doc:`zero-page`. +in chapter Documentation/x86/zero-page.rst. After setting up the struct boot_params, the boot loader can load 64-bit kernel in the same way as that of 16-bit boot protocol, but diff --git a/Documentation/x86/buslock.rst b/Documentation/x86/buslock.rst new file mode 100644 index 000000000000..7c051e714943 --- /dev/null +++ b/Documentation/x86/buslock.rst @@ -0,0 +1,126 @@ +.. SPDX-License-Identifier: GPL-2.0 + +.. include:: <isonum.txt> + +=============================== +Bus lock detection and handling +=============================== + +:Copyright: |copy| 2021 Intel Corporation +:Authors: - Fenghua Yu <fenghua.yu@intel.com> + - Tony Luck <tony.luck@intel.com> + +Problem +======= + +A split lock is any atomic operation whose operand crosses two cache lines. +Since the operand spans two cache lines and the operation must be atomic, +the system locks the bus while the CPU accesses the two cache lines. + +A bus lock is acquired through either split locked access to writeback (WB) +memory or any locked access to non-WB memory. This is typically thousands of +cycles slower than an atomic operation within a cache line. It also disrupts +performance on other cores and brings the whole system to its knees. + +Detection +========= + +Intel processors may support either or both of the following hardware +mechanisms to detect split locks and bus locks. + +#AC exception for split lock detection +-------------------------------------- + +Beginning with the Tremont Atom CPU split lock operations may raise an +Alignment Check (#AC) exception when a split lock operation is attemped. + +#DB exception for bus lock detection +------------------------------------ + +Some CPUs have the ability to notify the kernel by an #DB trap after a user +instruction acquires a bus lock and is executed. This allows the kernel to +terminate the application or to enforce throttling. + +Software handling +================= + +The kernel #AC and #DB handlers handle bus lock based on the kernel +parameter "split_lock_detect". Here is a summary of different options: + ++------------------+----------------------------+-----------------------+ +|split_lock_detect=|#AC for split lock |#DB for bus lock | ++------------------+----------------------------+-----------------------+ +|off |Do nothing |Do nothing | ++------------------+----------------------------+-----------------------+ +|warn |Kernel OOPs |Warn once per task and | +|(default) |Warn once per task and |and continues to run. | +| |disable future checking | | +| |When both features are | | +| |supported, warn in #AC | | ++------------------+----------------------------+-----------------------+ +|fatal |Kernel OOPs |Send SIGBUS to user. | +| |Send SIGBUS to user | | +| |When both features are | | +| |supported, fatal in #AC | | ++------------------+----------------------------+-----------------------+ +|ratelimit:N |Do nothing |Limit bus lock rate to | +|(0 < N <= 1000) | |N bus locks per second | +| | |system wide and warn on| +| | |bus locks. | ++------------------+----------------------------+-----------------------+ + +Usages +====== + +Detecting and handling bus lock may find usages in various areas: + +It is critical for real time system designers who build consolidated real +time systems. These systems run hard real time code on some cores and run +"untrusted" user processes on other cores. The hard real time cannot afford +to have any bus lock from the untrusted processes to hurt real time +performance. To date the designers have been unable to deploy these +solutions as they have no way to prevent the "untrusted" user code from +generating split lock and bus lock to block the hard real time code to +access memory during bus locking. + +It's also useful for general computing to prevent guests or user +applications from slowing down the overall system by executing instructions +with bus lock. + + +Guidance +======== +off +--- + +Disable checking for split lock and bus lock. This option can be useful if +there are legacy applications that trigger these events at a low rate so +that mitigation is not needed. + +warn +---- + +A warning is emitted when a bus lock is detected which allows to identify +the offending application. This is the default behavior. + +fatal +----- + +In this case, the bus lock is not tolerated and the process is killed. + +ratelimit +--------- + +A system wide bus lock rate limit N is specified where 0 < N <= 1000. This +allows a bus lock rate up to N bus locks per second. When the bus lock rate +is exceeded then any task which is caught via the buslock #DB exception is +throttled by enforced sleeps until the rate goes under the limit again. + +This is an effective mitigation in cases where a minimal impact can be +tolerated, but an eventual Denial of Service attack has to be prevented. It +allows to identify the offending processes and analyze whether they are +malicious or just badly written. + +Selecting a rate limit of 1000 allows the bus to be locked for up to about +seven million cycles each second (assuming 7000 cycles for each bus +lock). On a 2 GHz processor that would be about 0.35% system slowdown. diff --git a/Documentation/x86/index.rst b/Documentation/x86/index.rst index d58614d5cde6..383048396336 100644 --- a/Documentation/x86/index.rst +++ b/Documentation/x86/index.rst @@ -29,6 +29,7 @@ x86-specific Documentation microcode resctrl tsx_async_abort + buslock usb-legacy-support i386/index x86_64/index diff --git a/Documentation/x86/mtrr.rst b/Documentation/x86/mtrr.rst index c5b695d75349..9f0b1851771a 100644 --- a/Documentation/x86/mtrr.rst +++ b/Documentation/x86/mtrr.rst @@ -28,7 +28,7 @@ are aligned with platform MTRR setup. If MTRRs are only set up by the platform firmware code though and the OS does not make any specific MTRR mapping requests mtrr_type_lookup() should always return MTRR_TYPE_INVALID. -For details refer to :doc:`pat`. +For details refer to Documentation/x86/pat.rst. .. tip:: On Intel P6 family processors (Pentium Pro, Pentium II and later) diff --git a/Documentation/x86/x86_64/boot-options.rst b/Documentation/x86/x86_64/boot-options.rst index 324cefff92e7..5f62b3b86357 100644 --- a/Documentation/x86/x86_64/boot-options.rst +++ b/Documentation/x86/x86_64/boot-options.rst @@ -247,16 +247,11 @@ Multiple x86-64 PCI-DMA mapping implementations exist, for example: Kernel boot message: "PCI-DMA: Using software bounce buffering for IO (SWIOTLB)" - 4. <arch/x86_64/pci-calgary.c> : IBM Calgary hardware IOMMU. Used in IBM - pSeries and xSeries servers. This hardware IOMMU supports DMA address - mapping with memory protection, etc. - Kernel boot message: "PCI-DMA: Using Calgary IOMMU" - :: iommu=[<size>][,noagp][,off][,force][,noforce] [,memaper[=<order>]][,merge][,fullflush][,nomerge] - [,noaperture][,calgary] + [,noaperture] General iommu options: @@ -295,8 +290,6 @@ iommu options only relevant to the AMD GART hardware IOMMU: Don't initialize the AGP driver and use full aperture. panic Always panic when IOMMU overflows. - calgary - Use the Calgary IOMMU if it is available iommu options only relevant to the software bounce buffering (SWIOTLB) IOMMU implementation: @@ -307,28 +300,6 @@ implementation: force Force all IO through the software TLB. -Settings for the IBM Calgary hardware IOMMU currently found in IBM -pSeries and xSeries machines - - calgary=[64k,128k,256k,512k,1M,2M,4M,8M] - Set the size of each PCI slot's translation table when using the - Calgary IOMMU. This is the size of the translation table itself - in main memory. The smallest table, 64k, covers an IO space of - 32MB; the largest, 8MB table, can cover an IO space of 4GB. - Normally the kernel will make the right choice by itself. - calgary=[translate_empty_slots] - Enable translation even on slots that have no devices attached to - them, in case a device will be hotplugged in the future. - calgary=[disable=<PCI bus number>] - Disable translation on a given PHB. For - example, the built-in graphics adapter resides on the first bridge - (PCI bus number 0); if translation (isolation) is enabled on this - bridge, X servers that access the hardware directly from user - space might stop working. Use this option if you have devices that - are accessed from userspace directly on some PCI host bridge. - panic - Always panic when IOMMU overflows - Miscellaneous ============= |