path: root/openmp
Commit message | Author | Age | Files | Lines
* [OpenMP] Fix trivial build failure in MacOS | Mats Petersson | 2023-05-17 | 1 | -0/+2
| The MacOS build of LLVM with OpenMP enabled fails with an error that std::abs is not declared. Fix by including <cmath> so that the relevant function declaration is visible.
| No functional change intended.
| Reviewed By: tianshilei1992
| Differential Revision: https://reviews.llvm.org/D150687
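The failure class described above can be illustrated with a minimal sketch. It is not the patched file; `abs_diff` is a hypothetical helper. The point is only that the floating-point overloads of `std::abs` are declared in `<cmath>`, and relying on another header to pull it in transitively may work with one standard library and fail with another.

```cpp
#include <cmath>   // declares the floating-point overloads of std::abs
#include <cstdlib> // declares the integer overloads of std::abs

// Hypothetical helper, not taken from the patch: absolute difference of two
// doubles. Without <cmath>, this may or may not compile depending on which
// headers the standard library happens to include transitively.
double abs_diff(double a, double b) {
  return std::abs(a - b);
}
```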
* Revert "Reland "[CMake] Bumps minimum version to 3.20.0."" | Nico Weber | 2023-05-17 | 8 | -10/+19
| This reverts commit 65429b9af6a2c99d340ab2dcddd41dab201f399c.
| Broke several projects, see https://reviews.llvm.org/D144509#4347562 onwards.
| Also reverts follow-up commit "[OpenMP] Compile assembly files as ASM, not C". This reverts commit 4072c8aee4c89c4457f4f30d01dc9bb4dfa52559.
| Also reverts fix attempt "[cmake] Set CMP0091 to fix Windows builds after the cmake_minimum_required bump". This reverts commit 7d47dac5f828efd1d378ba44a97559114f00fb64.
* [OpenMP] Compile assembly files as ASM, not C | Martin Storsjö | 2023-05-16 | 2 | -3/+1
| Since CMake 3.20, CMake explicitly passes "-x c" (or equivalent) when compiling a file which has been set as having the language C. This behaviour change only takes place if "cmake_minimum_required" is set to 3.20 or newer, or if the policy CMP0119 is set to new.
| Attempting to compile assembly files with "-x c" fails. This is worked around in many cases, because OpenMP overrides it with "-x assembler-with-cpp", but that override is only added for non-Windows targets.
| Thus, after increasing cmake_minimum_required to 3.20, this breaks compiling the GNU assembly for Windows targets; the GNU assembly is used for ARM and AArch64 Windows targets when building with Clang. This patch unbreaks that.
| Differential Revision: https://reviews.llvm.org/D150532
* [OpenMP] Use CMAKE_CXX_STANDARD for setting the C++ version | Martin Storsjö | 2023-05-16 | 3 | -6/+12
| Previously, we tried to check whether the -std=c++17 option was supported and manually add the flag. That doesn't work for compilers that do support C++17 but use a different option syntax, like clang-cl.
| OpenMP itself probably doesn't specifically require C++17, therefore CXX_STANDARD_REQUIRED is left off, but in some cases, we may have code that only works in C++17 mode.
| In particular, 46262cab24312c71717ca70a9d0700481aa59152 made a refactoring that works when built with Clang in C++17 mode, but not in C++14 mode. MSVC accepts the construct in both language modes.
| For libomptarget, we've had specific checks that require C++17 (or the -std=c++17 option) to be supported. It's doubtful that libomptarget has got any code which more specifically requires C++17; this seems to be a remnant from when libomptarget was added originally in 2467df6e4f04e3d0e8e78d662473ba1b87c0a885 / D14031. At that point, the rest of OpenMP didn't require C++11, while libomptarget did require it. Now, it's unlikely that anyone attempts building it with a toolchain that doesn't support C++11.
| At this point, we could also probably just set CXX_STANDARD_REQUIRED to true, requiring C++17 as baseline for all the OpenMP libraries.
| This fixes building OpenMP with clang-cl after 46262cab24312c71717ca70a9d0700481aa59152.
| Differential Revision: https://reviews.llvm.org/D149726
* [OpenMP] Implement task record and replay mechanism | Chenle Yu | 2023-05-15 | 15 | -7/+917
| This patch implements the "task record and replay" mechanism. The idea is to be able to store tasks and their dependencies in the runtime so that we do not pay the cost of task creation and dependency resolution for future executions. The objective is to improve fine-grained task performance, both for those from "omp task" and "taskloop".
| The entry point of the recording phase is __kmpc_start_record_task, and the end of record is triggered by __kmpc_end_record_task.
| Tasks encapsulated between a record start and a record end are saved, meaning that the runtime stores their dependencies and structures, referred to as TDG, in order to replay them in subsequent executions. In these TDG replays, we start the execution by scheduling all root tasks (tasks that do not have input dependencies); no hash table is involved in tracking the dependencies, and tasks do not need to be created again.
| At the beginning of __kmpc_start_record_task, we must check if a TDG has already been recorded. If yes, the function returns 0 and starts to replay the TDG by calling __kmp_exec_tdg; if not, we start to record, and the function returns 1.
| An integer uniquely identifies TDGs. Currently, this identifier needs to be incremented manually in the source code. Still, depending on how this feature would eventually be used in the library, the caller function must do it; also, the caller function needs to implement a mechanism to skip the associated region, according to the return value of __kmpc_start_record_task.
| Reviewed By: tianshilei1992
| Differential Revision: https://reviews.llvm.org/D146642
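The record/replay contract described above (return 1 and record on the first pass, return 0 and replay afterwards, with the caller skipping the region) can be sketched with a toy model. Everything below is an assumption made for illustration: the `toy_*` functions, `ToyTdg`, and `run_region` are hypothetical stand-ins, and the real `__kmpc_start_record_task` and `__kmpc_end_record_task` signatures are deliberately not reproduced. The toy also replays tasks in creation order, whereas the commit message describes scheduling root tasks first and following the recorded dependencies.

```cpp
#include <functional>
#include <map>
#include <vector>

// Hypothetical stand-in for the runtime's TDG storage.
struct ToyTdg {
  std::vector<std::function<void()>> tasks; // recorded tasks, in creation order
  bool complete = false;                    // set once recording has ended
};

static std::map<int, ToyTdg> recorded; // keyed by the integer TDG identifier
static int recording_id = -1;          // TDG being recorded, or -1 if none

// Mirrors the described contract: returns 1 when the caller must run the
// region (recording starts), 0 when a recorded TDG was replayed instead.
int toy_start_record_task(int tdg_id) {
  auto it = recorded.find(tdg_id);
  if (it != recorded.end() && it->second.complete) {
    for (auto &task : it->second.tasks)
      task(); // replay: no task creation, no dependency resolution
    return 0;
  }
  recording_id = tdg_id;
  recorded[tdg_id] = ToyTdg{};
  return 1;
}

// Called by the region for each task; stores the task while recording.
void toy_task(std::function<void()> body) {
  if (recording_id >= 0)
    recorded[recording_id].tasks.push_back(body);
  body(); // the recording pass still executes the task
}

void toy_end_record_task() {
  if (recording_id >= 0)
    recorded[recording_id].complete = true;
  recording_id = -1;
}

// Caller-side pattern: skip the region when the start call returns 0.
void run_region(int tdg_id, int *counter) {
  if (toy_start_record_task(tdg_id)) {
    toy_task([counter] { *counter += 1; });
    toy_task([counter] { *counter += 10; });
    toy_end_record_task();
  }
}
```

Calling `run_region` twice with the same identifier executes the two tasks both times, but only the first call goes through `toy_task`; the second call is served entirely from the stored graph.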
* Reland "[CMake] Bumps minimum version to 3.20.0." | Mark de Wever | 2023-05-13 | 7 | -16/+9
| The owner of the last two failing buildbots updated CMake.
| This reverts commit e8e8707b4aa6e4cc04c0cffb2de01d2de71165fc.
* [OpenMP] remove an erroneous assert on the location argument | Vadim Paretsky | 2023-05-12 | 1 | -1/+0
| The 'loc' argument is optional, and some compilers (e.g. MSVC) do not supply it.
| Differential Revision: https://reviews.llvm.org/D148393
* [OpenMP] Fix GCC build issues and restore "Additional APIs used by the MSVC compiler for loop collapse (rectangular and non-rectangular loops)" | Vadim Paretsky | 2023-05-12 | 4 | -0/+1727
| Fixes a GCC build issue (an instance of a disallowed typename keyword) in, and restores, the earlier commit https://reviews.llvm.org/D148393. Also reworks memory allocation to avoid the use of C++ library based primitives.
| Differential Revision: https://reviews.llvm.org/D149010
* [Libomptarget] Fix AMDGPU Note handling after D150022 | Joseph Huber | 2023-05-10 | 2 | -4/+4
| Summary: The changes in https://reviews.llvm.org/D150022 changed the API for this function that we query. Simply pass in the alignment from the associated header to fix.
* [OpenMP][libomptarget] Init device when printing device info | Kevin Sala | 2023-05-09 | 1 | -0/+6
| This patch fixes the printing of device information. Each device is now initialized before its information is printed.
| Fixes #61392
| Differential Revision: https://reviews.llvm.org/D146081
* [OpenMP][libomptarget] Improve device info printing in NextGen plugins | Kevin Sala | 2023-05-09 | 5 | -146/+464
| This patch improves the device info printing in the NextGen plugins. The device info properties are composed of keys, values and units (if necessary). These properties are pushed into a queue by each vendor-specific plugin, and later processed and printed by the common Plugin Interface. The printing format is common across the different plugins.
| Differential Revision: https://reviews.llvm.org/D148178
* [OpenMP] Fix incorrect interop type for number of dependencies | Joseph Huber | 2023-05-08 | 1 | -1/+1
| The interop types use the number of dependencies in the function interface. Every other function uses an `i32` to count the number of dependencies except for the initialization function. This leads to codegen issues when the rest of the compiler passes in an `i32` that then creates an invalid call. Fix this to be consistent with the other uses.
| Reviewed By: tianshilei1992
| Differential Revision: https://reviews.llvm.org/D150156
* [OpenMP] Make `libomptarget` link against `libomp` | Shilei Tian | 2023-05-06 | 2 | -2/+13
| In `libomptarget` we use a couple of functions from `libomp`, but we didn't link `libomptarget` against `libomp`. That will not work on some platforms such as macOS. A linker error will be encountered because those symbols are not resolved at link time when building `libomptarget`.
| This patch simply makes `libomptarget` link against `libomp`, making it a "user" of `libomp`. I think this will not break the policies between `libomp` and `libomptarget`.
| Reviewed By: jdoerfert
| Differential Revision: https://reviews.llvm.org/D149617
* [NFC][OpenMP] Remove trailing whitespaces in `openmp/runtime/src/CMakeLists.txt` | Shilei Tian | 2023-05-06 | 1 | -3/+3
|
* Revert "Reland "[CMake] Bumps minimum version to 3.20.0."" | Mark de Wever | 2023-05-06 | 7 | -9/+16
| Unfortunately not all buildbots are updated.
| This reverts commit ffb807ab5375b3f78df198dc5d4302b3b552242f.
* Reland "[CMake] Bumps minimum version to 3.20.0." | Mark de Wever | 2023-05-06 | 7 | -16/+9
| All build bots should be updated now.
| This reverts commit 44d38022ab29a3156349602733b3459df5beef93.
* [OpenMP] [OMPT] [amdgpu] [4/8] Implemented callback registration in nextgen plugins | Dhruva Chakrabarti | 2023-05-05 | 12 | -13/+258
| The purpose of this patch is to implement registration of callback functions in the generic plugin by looking up corresponding callbacks in libomptarget. The overall design document is https://rice.app.box.com/s/pf3gix2hs4d4o1aatwir1set05xmjljc
| Defined an object of type OmptDeviceCallbacksTy in the amdgpu plugin for holding the tool-provided callback functions. Implemented a global constructor in the plugin that creates a connector object to connect with libomptarget. The callbacks that are already registered with libomptarget are looked up and registered with the plugin.
| Combined with an internal patch from Dhruva Chakrabarti, which fixes the OMPT initialization ordering. Achieved through removal of the constructor attribute from ompt_init.
| Patch from John Mellor-Crummey <johnmc@rice.edu>
| With contributions from: Dhruva Chakrabarti <Dhruva.Chakrabarti@amd.com>, Michael Halkenhaeuser <MichaelGerald.Halkenhauser@amd.com>
| Reviewed By: dhruvachak, tianshilei1992
| Differential Revision: https://reviews.llvm.org/D124070
* [docs] Hide collaboration and include graphs in doxygen docs | Timm Bäder | 2023-05-04 | 2 | -6/+6
| They don't convey any useful information and make the documentation unnecessarily hard to read.
| Differential Revision: https://reviews.llvm.org/D149641
* [OpenMP][libomptarget][AMDGPU] Enable active HSA wait state | gregrodgers | 2023-05-04 | 2 | -6/+36
| Adds an HSA timeout hint of 2 seconds to the AMDGPU nextgen-plugin to improve performance of small kernels. The HSA runtime may stay in HSA_WAIT_STATE_ACTIVE for up to the timeout value before switching to HSA_WAIT_STATE_BLOCKED. This can improve latency, from which small kernels can benefit.
| The value was determined via experimentation with different benchmarks.
| The timeout value can be overridden using the environment variable LIBOMPTARGET_AMDGPU_STREAM_BUSYWAIT with a value in microseconds.
| Original author: Greg Rodgers <Gregory.Rodgers@amd.com>
| Contributions from: JP Lehr <JanPatrick.Lehr@amd.com>
| Differential Revision: https://reviews.llvm.org/D148808
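How such an override might be read can be sketched as follows. This is an assumption-laden illustration, not the plugin's code: `busy_wait_us_from` and `busy_wait_us` are hypothetical names, and the fallback behaviour for malformed values is a choice made for this sketch. Only the environment variable name, the microsecond unit, and the 2 second default come from the commit message.

```cpp
#include <cstdint>
#include <cstdlib>

// Default taken from the commit message: 2 seconds, in microseconds.
constexpr std::uint64_t kDefaultBusyWaitUs = 2000000;

// Hypothetical parser: returns the override when the text is a non-empty
// decimal number, otherwise falls back to the default.
std::uint64_t busy_wait_us_from(const char *text) {
  if (text == nullptr || *text == '\0')
    return kDefaultBusyWaitUs;
  char *end = nullptr;
  unsigned long long value = std::strtoull(text, &end, 10);
  if (*end != '\0')
    return kDefaultBusyWaitUs; // trailing characters: ignore the override
  return value;
}

// Reads the variable named in the commit message from the environment.
std::uint64_t busy_wait_us() {
  return busy_wait_us_from(std::getenv("LIBOMPTARGET_AMDGPU_STREAM_BUSYWAIT"));
}
```

Splitting the parser from the `std::getenv` call keeps the parsing logic testable without touching the process environment.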
* Revert "[openmp] [test] Set __COMPAT_LAYER=RunAsInvoker when running tests on Windows" | Martin Storsjö | 2023-05-03 | 1 | -8/+0
| This reverts commit 63f0fdc2628dfb2f52ed5a92e579f99261d946ed.
| Since f1431bbfb17cd7167adda9fc8521bb6eec52c300, this environment variable is always set up by lit itself, so individual test suites don't need to set it.
| Differential Revision: https://reviews.llvm.org/D149356
* [OpenMP] Fix libomptarget test mapping/ompx_hold/struct.c | Joel E. Denny | 2023-05-02 | 1 | -7/+14
| For me, the test fails for nvptx64 offload. The problem was introduced by D146838, which landed as 747af2415519. It tries to copy a string constant's address from device to host and then print the string. This patch copies the contents of the string instead.
| Reviewed By: jdoerfert
| Differential Revision: https://reviews.llvm.org/D149623
* Revert "[OpenMP] [OMPT] [amdgpu] [4/8] Implemented callback registration in nextgen plugins" | Shilei Tian | 2023-05-02 | 12 | -245/+13
| This reverts commit 8cd1f0d8885fd69c452c6bf3fb04514d06c899b0.
| It causes issues when OMPT is disabled explicitly and dependences are not set correctly.
* [OpenMP] [OMPT] [amdgpu] [4/8] Implemented callback registration in nextgen plugins | Dhruva Chakrabarti | 2023-05-02 | 12 | -13/+245
| The purpose of this patch is to implement registration of callback functions in the generic plugin by looking up corresponding callbacks in libomptarget. The overall design document is https://rice.app.box.com/s/pf3gix2hs4d4o1aatwir1set05xmjljc
| Defined an object of type OmptDeviceCallbacksTy in the amdgpu plugin for holding the tool-provided callback functions. Implemented a global constructor in the plugin that creates a connector object to connect with libomptarget. The callbacks that are already registered with libomptarget are looked up and registered with the plugin.
| Combined with an internal patch from Dhruva Chakrabarti, which fixes the OMPT initialization ordering. Achieved through removal of the constructor attribute from ompt_init.
| Patch from John Mellor-Crummey <johnmc@rice.edu>
| With contributions from: Dhruva Chakrabarti <Dhruva.Chakrabarti@amd.com>, Michael Halkenhaeuser <MichaelGerald.Halkenhauser@amd.com>
| Differential Revision: https://reviews.llvm.org/D124070
* [OpenMP] In libomptarget, assume alignment at powers of two | Joel E. Denny | 2023-05-02 | 2 | -2/+90
| This patch fixes a bug introduced by D142586, which landed as 434992c96ed1. The fix was to only look for alignments that are powers of 2. See the new test case for details.
| Reviewed By: jdoerfert, jhuber6
| Differential Revision: https://reviews.llvm.org/D149490
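The idea of restricting the search to power-of-two alignments can be sketched as below. This is not the code from the patch; `pow2_alignment` and its `max_align` cap are hypothetical names chosen for the illustration. The sketch isolates the lowest set bit of an address, which is the largest power of two that divides it.

```cpp
#include <cstdint>

// Largest power of two dividing `addr`, capped at `max_align`.
// `max_align` is assumed to be a power of two itself.
std::uint64_t pow2_alignment(std::uint64_t addr, std::uint64_t max_align) {
  if (addr == 0)
    return max_align; // zero is divisible by every power of two
  std::uint64_t lowest_bit = addr & (~addr + 1); // isolate the lowest set bit
  return lowest_bit < max_align ? lowest_bit : max_align;
}
```

For example, the address 24 is divisible by 12 and 24, but the function reports 8, since 8 is the largest power of two that divides it.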
* Revert "[OpenMP] Make `libomptarget` link against `libomp`" | Shilei Tian | 2023-05-02 | 1 | -4/+0
| This reverts commit dc049a4ea681b1d0a4880bae3e19ae0ef40f6e80.
| It causes an issue with the export target.
* [OpenMP] Make `libomptarget` link against `libomp` | Shilei Tian | 2023-05-01 | 1 | -0/+4
| In `libomptarget` we use a couple of functions from `libomp`, but we didn't link `libomptarget` against `libomp`. That will not work on some platforms such as macOS. A linker error will be encountered because those symbols are not resolved at link time when building `libomptarget`.
| This patch simply makes `libomptarget` link against `libomp`, making it a "user" of `libomp`. I think this will not break the policies between `libomp` and `libomptarget`.
| Reviewed By: jdoerfert
| Differential Revision: https://reviews.llvm.org/D149617
* [OpenMP] Handle function calls from `libomp` to `libomptarget` correctly | Shilei Tian | 2023-05-01 | 3 | -3/+26
| D132005 introduced function calls from `libomp` to `libomptarget` if offloading is enabled. However, the external function declaration may not always work. For example, it causes a link error on macOS. Currently it is guarded properly by a macro, but in order to get OpenMP target offloading working on macOS, it has to be handled correctly.
| This patch applies the same idea of how we support target memory extension by using function pointer indirect call for that function.
| Reviewed By: jdoerfert
| Differential Revision: https://reviews.llvm.org/D149557
* Revert "[OpenMP] Handle function calls from `libomp` to `libomptarget` correctly" | Shilei Tian | 2023-05-01 | 3 | -26/+3
| This reverts commit 479e335fc37c06767654141358ea076ac066de11.
| The assertion at `kmp_tasking.cpp(29)` is triggered.
* [OpenMP] Handle function calls from `libomp` to `libomptarget` correctly | Shilei Tian | 2023-05-01 | 3 | -3/+26
| D132005 introduced function calls from `libomp` to `libomptarget` if offloading is enabled. However, the external function declaration may not always work. For example, it causes a link error on macOS. Currently it is guarded properly by a macro, but in order to get OpenMP target offloading working on macOS, it has to be handled correctly.
| This patch applies the same idea of how we support target memory extension by using function pointer indirect call for that function.
| Reviewed By: jdoerfert
| Differential Revision: https://reviews.llvm.org/D149557
* Emit info message when use_device_address variable does not have a device counterpart. | Doru Bercea | 2023-05-01 | 4 | -1/+43
|
* [OpenMP] Only enable version script if supported | Shilei Tian | 2023-04-30 | 4 | -6/+21
| The linker flag `--version-script` may not be supported by all linkers, such as macOS's linker. `libomp` is already capable of detecting whether the linker supports it and appending the linker flag accordingly. Since we currently assume `libomptarget` only works on Linux, it does not perform that check. This patch simply adds the check before adding the flag to the linker flags. This will be the first patch to make OpenMP target offloading work on macOS.
| Note that CMake files in `plugins` are not touched because they are going to be removed pretty soon anyway.
| Reviewed By: jhuber6
| Differential Revision: https://reviews.llvm.org/D149555
* [OpenMP] Add missing -L to libomptarget tests | Joel E. Denny | 2023-04-28 | 1 | -1/+2
| Without this patch, if an incompatible libomptarget.so is present in a system directory, such as /usr/lib64, check-openmp fails many libomptarget tests with linking errors. The problem appears to have started at D129875, which landed as dc52712a0632. This patch extends the libomptarget test suite config with a -L for the current build directory of libomptarget.so.
| Reviewed By: jhuber6, JonChesterfield
| Differential Revision: https://reviews.llvm.org/D149391
* [OpenMP] Add LIT test on task depend clause | Animesh Kumar | 2023-04-28 | 1 | -0/+86
| The behavior of the depend clause with an iterator modifier can be tested correctly by means of execution tests, not at the LLVM IR level. These tests are imported from, or inspired by, the SOLLVE tests.
| SOLLVE repo: https://github.com/SOLLVE/sollve_vv
| Differential Revision: https://reviews.llvm.org/D146706
* Disable private mapping test for AMD GPU due to intermittent fails. | Doru Bercea | 2023-04-25 | 1 | -0/+1
|
* Revert "[OpenMP] Fix GCC build issues and restore "Additional APIs used by the" | Joseph Huber | 2023-04-24 | 4 | -1714/+0
| This patch caused failures on the OpenMP buildbots as discussed in https://reviews.llvm.org/D149010. We will need to investigate why we are seeing unresolved references to the standard C++ library.
| This reverts commit 5a15ca7f10bcba55a2f51281b1562cf5095ae015.
* [OpenMP] Fix GCC build issues and restore "Additional APIs used by the | Natalia Glagoleva | 2023-04-24 | 4 | -0/+1714
| MSVC compiler for loop collapse (rectangular and non-rectangular loops)"
| Fixes a GCC build issue (disallowed typename keyword use) in, and restores, https://reviews.llvm.org/D148393
| Differential Revision: https://reviews.llvm.org/D149010
* Revert "[OpenMP] Introduce kernel environment" | Shilei Tian | 2023-04-22 | 17 | -153/+79
| This reverts commit 35cfadfbe2decd9633560b3046fa6c17523b2fa9.
| It makes a couple of buildbots unhappy because of the following test failures:
| - `Transforms/OpenMP/add_attributes.ll`
| - `mapping/declare_mapper_target_data.cpp` on AMDGPU
* [OpenMP] Introduce kernel environment | Shilei Tian | 2023-04-22 | 17 | -79/+153
| This patch introduces a per-kernel environment. Previously, flags such as execution mode were set through global variables with names like `__kernel_name_exec_mode`. They are accessible on the host by reading the corresponding global variable, but not from the device. Besides, some assumptions, such as no nested parallelism, are not on a per-kernel basis, preventing us from applying per-kernel optimization in the device runtime.
| This is a combination and refinement of patch series D116908, D116909, and D116910.
| Reviewed By: jdoerfert
| Differential Revision: https://reviews.llvm.org/D142569
* Revert "[OpenMP] Additional APIs used by the MSVC compiler for loop collapse" | Slava Zakharin | 2023-04-21 | 4 | -1714/+0
| This reverts commit 7aa815fc782c5e39aefeec10d9c7c4cea7231975.
| Buildbots are failing, e.g.:
| https://lab.llvm.org/buildbot/#/builders/84/builds/36964
| https://lab.llvm.org/buildbot/#/builders/193/builds/30096
* [OpenMP] Additional APIs used by the MSVC compiler for loop collapse | Natalia Glagoleva | 2023-04-21 | 4 | -0/+1714
| (rectangular and non-rectangular loops)
| Submitting on behalf of Natalia Glagoleva <natgla@microsoft.com>
| Differential Revision: https://reviews.llvm.org/D148393
* [OpenMP] Enable the IDE support for the device runtime | Shilei Tian | 2023-04-21 | 1 | -0/+18
| Currently the device runtime is built as a custom target, which will not be included in the compile commands. Those language servers using compile commands cannot handle device runtime correctly.
| In this patch, when `CMAKE_EXPORT_COMPILE_COMMANDS` is turned on, dummy targets that will be excluded from all will be added. Those targets will not be built or installed if we just simply do `make` or `make install`, but their compilation will be included in the compile commands.
| Reviewed By: jhuber6
| Differential Revision: https://reviews.llvm.org/D148870
* Fix an issue with th_task_state_memo_stack and proxy/helper tasks | Alex Duran | 2023-04-21 | 1 | -0/+46
| When proxy or helper tasks were used in inactive parallel regions, no memo of the th_task_state was stored in the stack, so th_task_state became invalid. This change inserts an item in the memo stack to track these th_task_states.
| Patch by Alex Duran.
| Differential Revision: https://reviews.llvm.org/D145736
* [OpenMP] Replace libomp_check_linker_flag with llvm_check_compiler_linker_flag | Nikita Popov | 2023-04-21 | 2 | -81/+9
| Replace the custom libomp_check_linker_flag() implementation with llvm_check_compiler_linker_flag() from the common cmake utils.
| Due to the way the custom implementation is implemented (capturing output from an entire nested cmake invocation) it can easily end up incorrectly detecting flags as unavailable, e.g. because "error", "unknown" or similar occurs inside compiler flags, the directory name, etc.
| Fixes https://github.com/llvm/llvm-project/issues/62240.
| Differential Revision: https://reviews.llvm.org/D148798
* Modify test to explicitly use the size of the mapped array. | Doru Bercea | 2023-04-20 | 1 | -1/+1
| Review: https://reviews.llvm.org/D148832
* [OpenMP][libomptarget][NFC] Remove error data member from AsyncInfoWrapperTy | Kevin Sala | 2023-04-18 | 2 | -43/+35
| This patch removes the Err data member from the AsyncInfoWrapperTy class. Now the error is stored externally, on the caller side, and it is explicitly passed to the AsyncInfoWrapperTy::finalize() function as a reference.
| Differential Revision: https://reviews.llvm.org/D148027
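The ownership change described above can be sketched with toy types. The sketch assumes nothing about the real classes beyond what the commit message states: `ToyError`, `ToyAsyncInfoWrapper`, and `launch` are hypothetical stand-ins, and a plain string replaces `llvm::Error`. The error lives in the caller and is handed to `finalize()` by reference, so the wrapper holds no error state of its own.

```cpp
#include <string>

// Hypothetical stand-in for an error value; empty message means success.
struct ToyError {
  std::string message;
  bool ok() const { return message.empty(); }
};

class ToyAsyncInfoWrapper {
public:
  explicit ToyAsyncInfoWrapper(bool sync_fails) : sync_fails_(sync_fails) {}

  // No error data member: the caller owns the error and passes it in.
  void finalize(ToyError &err) {
    if (!err.ok())
      return; // keep the caller's earlier error untouched
    if (sync_fails_)
      err.message = "synchronization failed";
  }

private:
  bool sync_fails_; // simulates a failing synchronization step
};

// Caller-side pattern: the error is declared here and handed to finalize().
ToyError launch(bool op_fails, bool sync_fails) {
  ToyAsyncInfoWrapper wrapper(sync_fails);
  ToyError err;
  if (op_fails)
    err.message = "operation failed";
  wrapper.finalize(err);
  return err;
}
```

Keeping the error on the caller side makes the data flow explicit: the first failure wins, and finalization can only add an error when none has been reported yet.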
* [OpenMP][NFC] Silence warning | Johannes Doerfert | 2023-04-17 | 1 | -1/+1
|
* [OpenMP] Ensure memory fences are created with barriers for AMDGPUs | Johannes Doerfert | 2023-04-17 | 5 | -48/+154
| It turns out that the __builtin_amdgcn_s_barrier() alone does not emit a fence. We somehow got away with this and assumed it would work as it (hopefully) is correct on the NVIDIA path where we just emit a __syncthreads. After talking to @arsenm we now (mostly) align with the OpenCL barrier implementation [1] and emit explicit fences for AMDGPUs.
| It seems this was the underlying cause for #59759, but I am not 100% certain. There is a chance this simply hides the problem.
| Fixes: https://github.com/llvm/llvm-project/issues/59759
| [1] https://github.com/RadeonOpenCompute/ROCm-Device-Libs/blob/07b347366eb2c6ebc3414af323c623cbbbafc854/opencl/src/workgroup/wgbarrier.cl#L21
* Revert "Revert "Revert "[CMake] Bumps minimum version to 3.20.0.""" | Mark de Wever | 2023-04-15 | 8 | -10/+17
| This reverts commit 1ef4c3c859728008cf707cad8d67f45ae5070ae1.
| Two buildbots still haven't been updated.
* Revert "Revert "[CMake] Bumps minimum version to 3.20.0."" | Mark de Wever | 2023-04-15 | 8 | -17/+10
| This reverts commit 92523a35a827539db8557bbc3ecab7f9ea3f6ade.
| Reland to see whether CIs are updated.
* [OpenMP] Remove duplicates from the list if using 'auto' | Joseph Huber | 2023-04-14 | 1 | -1/+3
| Summary: We can detect the user's GPUs via the `auto` option. But if the user has multiple GPUs installed or sets the list incorrectly, we need to remove the duplicates.