summaryrefslogtreecommitdiff
path: root/openmp/docs
diff options
context:
space:
mode:
authorgregrodgers <Gregory.Rodgers@amd.com>2023-04-19 16:14:40 -0500
committerJP Lehr <JanPatrick.Lehr@amd.com>2023-05-04 06:01:14 -0400
commitf238a98e844752b955dcf3d7b95b9c76c75a0017 (patch)
tree1be2db77947855a21aa4bd05c387175cadf6335d /openmp/docs
parentf3dcd3ad992c82be4f652fd2aac6b0ef414566a2 (diff)
downloadllvm-f238a98e844752b955dcf3d7b95b9c76c75a0017.tar.gz
[OpenMP][libomptarget][AMDGPU] Enable active HSA wait state
Adds HSA timeout hint of 2 seconds to the AMDGPU nextgen-plugin to improve performance of small kernels. The HSA runtime may stay in HSA_WAIT_STATE_ACTIVE for up to the timeout value before switching to HSA_WAIT_STATE_BLOCKED. This can improve latency from which small kernels can benefit. The value was determined via experimentation w/ different benchmarks. The timeout value can be overriden using the environment variable LIBOMPTARGET_AMDGPU_STREAM_BUSYWAIT with a value in microseconds. Original author: Greg Rodgers <Gregory.Rodgers@amd.com> Contributions from: JP Lehr <JanPatrick.Lehr@amd.com> Differential Revision: https://reviews.llvm.org/D148808
Diffstat (limited to 'openmp/docs')
-rw-r--r--openmp/docs/design/Runtimes.rst9
1 files changed, 9 insertions, 0 deletions
diff --git a/openmp/docs/design/Runtimes.rst b/openmp/docs/design/Runtimes.rst
index 98f47bc1c632..1402192581d3 100644
--- a/openmp/docs/design/Runtimes.rst
+++ b/openmp/docs/design/Runtimes.rst
@@ -1160,6 +1160,7 @@ There are several environment variables to change the behavior of the plugins:
* ``LIBOMPTARGET_AMDGPU_TEAMS_PER_CU``
* ``LIBOMPTARGET_AMDGPU_MAX_ASYNC_COPY_BYTES``
* ``LIBOMPTARGET_AMDGPU_NUM_INITIAL_HSA_SIGNALS``
+* ``LIBOMPTARGET_AMDGPU_STREAM_BUSYWAIT``
The environment variables ``LIBOMPTARGET_SHARED_MEMORY_SIZE``,
``LIBOMPTARGET_STACK_SIZE`` and ``LIBOMPTARGET_HEAP_SIZE`` are described in
@@ -1238,6 +1239,14 @@ managing several pre-created signals. These signals are mainly used by AMDGPU
streams. More HSA signals will be created dynamically throughout the execution
if needed. The default value is ``64``.
+LIBOMPTARGET_AMDGPU_STREAM_BUSYWAIT
+"""""""""""""""""""""""""""""""""""
+
+This environment variable controls the timeout hint in microseconds for the
+HSA wait state within the AMDGPU plugin. For the duration of this value
+the HSA runtime may busy wait. This can reduce overall latency.
+The default value is ``2000000``.
+
.. _remote_offloading_plugin:
Remote Offloading Plugin: