diff options
author | Martin Jambor <mjambor@suse.cz> | 2020-08-03 18:13:00 +0200 |
---|---|---|
committer | Martin Jambor <mjambor@suse.cz> | 2020-08-03 18:13:00 +0200 |
commit | c56684fd61223abd45854270cd1e83ab2f07148c (patch) | |
tree | f1ebfbe4dfd00b87f9522b84f78f02ee512e83a8 /libgomp | |
parent | 9623f61b142174b87760c81f78928dd14af7cbc6 (diff) | |
download | gcc-c56684fd61223abd45854270cd1e83ab2f07148c.tar.gz |
Removal of HSA offloading from gcc and libgomp
This patch removes the generation of HSAIL from the compiler, the HSA
offloading plugin from libgomp and the associated testsuite tests and
infrastructure bits from the respective testsuites.
Apart from removal of the obvious files, I removed bits that I found
by searching for HSA related terms and by re-tracing my steps and
looking at the patches that introduced HSA in the first place. I did
not remove everything these patches brought in, for example:
- the mechanism to pass offload-target specific info from the application to
the offloading plugin - but the same mechanism is also used to
communicate number of teams and the thread limit to all offload targets.
- run_func hook in gomp_device_descr stays too, although now it is
not used. If some future offload target would like the ability to
refuse to offload some functions, it can use it. It is easy to
remove as a follow-up if it is considered clutter, though.
- configure options --with-hsa-runtime=PATH, -with-hsa-runtime-include=PATH
and --with-hsa-runtime-lib=PATH rmeain because GCN uses them too.
- Surprisingly, GOMP_TARGET_ARG_HSA_KERNEL_ATTRIBUTES (a constant
from gomp-constants.h) appears in the source of the amdgcn libgomp
plugin, although I tend to think that code path is not ever used
and this patch certainly removes it from the compiler.
Nevertheless, it seems it has potential value beyond HSAIL and so
I've kept it, it can of course always be easily removed in the
future of GCN folk abandon it too.
- I assume constants OFFLOAD_TARGET_TYPE_HSA and GOMP_DEVICE_HSA
need to stay indefinitely too just so that no future offload
target picks that number.
- I have kept dg-require-effective-target
offload_device_nonshared_as requirement of thests which have it.
It is quite probable I missed some small HSA artifacts but those
should be easy to remove later as we find them.
include/ChangeLog:
2020-07-24 Martin Jambor <mjambor@suse.cz>
* gomp-constants.h (GOMP_VERSION_HSA): Remove.
gcc/ChangeLog:
2020-07-24 Martin Jambor <mjambor@suse.cz>
* hsa-brig-format.h: Moved to brig/brigfrontend.
* hsa-brig.c: Removed.
* hsa-builtins.def: Likewise.
* hsa-common.c: Likewise.
* hsa-common.h: Likewise.
* hsa-dump.c: Likewise.
* hsa-gen.c: Likewise.
* hsa-regalloc.c: Likewise.
* ipa-hsa.c: Likewise.
* omp-grid.c: Likewise.
* omp-grid.h: Likewise.
* Makefile.in (BUILTINS_DEF): Remove hsa-builtins.def.
(OBJS): Remove hsa-common.o, hsa-gen.o, hsa-regalloc.o, hsa-brig.o,
hsa-dump.o, ipa-hsa.c and omp-grid.o.
(GTFILES): Removed hsa-common.c and omp-expand.c.
* builtins.def: Remove processing of hsa-builtins.def.
(DEF_HSA_BUILTIN): Remove.
* common.opt (flag_disable_hsa): Remove.
(-Whsa): Ignore.
* config.in (ENABLE_HSA): Removed.
* configure.ac: Removed handling configuration for hsa offloading.
(ENABLE_HSA): Removed.
* configure: Regenerated.
* doc/install.texi (--enable-offload-targets): Remove hsa from the
example.
(--with-hsa-runtime): Reword to reference any HSA run-time, not
specifically HSA offloading.
* doc/invoke.texi (Option Summary): Remove -Whsa.
(Warning Options): Likewise.
(Optimize Options): Remove hsa-gen-debug-stores.
* doc/passes.texi (Regular IPA passes): Remove section on IPA HSA
pass.
* gimple-low.c (lower_stmt): Remove GIMPLE_OMP_GRID_BODY case.
* gimple-pretty-print.c (dump_gimple_omp_for): Likewise.
(dump_gimple_omp_block): Likewise.
(pp_gimple_stmt_1): Likewise.
* gimple-walk.c (walk_gimple_stmt): Likewise.
* gimple.c (gimple_build_omp_grid_body): Removed function.
(gimple_copy): Remove GIMPLE_OMP_GRID_BODY case.
* gimple.def (GIMPLE_OMP_GRID_BODY): Removed.
* gimple.h (gf_mask): Removed GF_OMP_PARALLEL_GRID_PHONY,
OMP_FOR_KIND_GRID_LOOP, GF_OMP_FOR_GRID_PHONY,
GF_OMP_FOR_GRID_INTRA_GROUP, GF_OMP_FOR_GRID_GROUP_ITER and
GF_OMP_TEAMS_GRID_PHONY. Renumbered GF_OMP_FOR_KIND_SIMD and
GF_OMP_TEAMS_HOST.
(gimple_build_omp_grid_body): Removed declaration.
(gimple_has_substatements): Remove GIMPLE_OMP_GRID_BODY case.
(gimple_omp_for_grid_phony): Removed.
(gimple_omp_for_set_grid_phony): Likewise.
(gimple_omp_for_grid_intra_group): Likewise.
(gimple_omp_for_grid_intra_group): Likewise.
(gimple_omp_for_grid_group_iter): Likewise.
(gimple_omp_for_set_grid_group_iter): Likewise.
(gimple_omp_parallel_grid_phony): Likewise.
(gimple_omp_parallel_set_grid_phony): Likewise.
(gimple_omp_teams_grid_phony): Likewise.
(gimple_omp_teams_set_grid_phony): Likewise.
(CASE_GIMPLE_OMP): Remove GIMPLE_OMP_GRID_BODY case.
* lto-section-in.c (lto_section_name): Removed hsa.
* lto-streamer.h (lto_section_type): Removed LTO_section_ipa_hsa.
* lto-wrapper.c (compile_images_for_offload_targets): Remove special
handling of hsa.
* omp-expand.c: Do not include hsa-common.h and gt-omp-expand.h.
(parallel_needs_hsa_kernel_p): Removed.
(grid_launch_attributes_trees): Likewise.
(grid_launch_attributes_trees): Likewise.
(grid_create_kernel_launch_attr_types): Likewise.
(grid_insert_store_range_dim): Likewise.
(grid_get_kernel_launch_attributes): Likewise.
(get_target_arguments): Remove code passing HSA grid sizes.
(grid_expand_omp_for_loop): Remove.
(grid_arg_decl_map): Likewise.
(grid_remap_kernel_arg_accesses): Likewise.
(grid_expand_target_grid_body): Likewise.
(expand_omp): Remove call to grid_expand_target_grid_body.
(omp_make_gimple_edges): Remove GIMPLE_OMP_GRID_BODY case.
* omp-general.c: Do not include hsa-common.h.
(omp_maybe_offloaded): Do not check for HSA offloading.
(omp_context_selector_matches): Likewise.
* omp-low.c: Do not include hsa-common.h and omp-grid.h.
(build_outer_var_ref): Remove handling of GIMPLE_OMP_GRID_BODY.
(scan_sharing_clauses): Remove handling of OMP_CLAUSE__GRIDDIM_.
(scan_omp_parallel): Remove handling of the phoney variant.
(check_omp_nesting_restrictions): Remove handling of
GIMPLE_OMP_GRID_BODY and GF_OMP_FOR_KIND_GRID_LOOP.
(scan_omp_1_stmt): Remove handling of GIMPLE_OMP_GRID_BODY.
(lower_omp_for_lastprivate): Remove handling of gridified loops.
(lower_omp_for): Remove phony loop handling.
(lower_omp_taskreg): Remove phony construct handling.
(lower_omp_teams): Likewise.
(lower_omp_grid_body): Removed.
(lower_omp_1): Remove GIMPLE_OMP_GRID_BODY case.
(execute_lower_omp): Do not call omp_grid_gridify_all_targets.
* opts.c (common_handle_option): Do not handle hsa when processing
OPT_foffload_.
* params.opt (hsa-gen-debug-stores): Remove.
* passes.def: Remove pass_ipa_hsa and pass_gen_hsail.
* timevar.def: Remove TV_IPA_HSA.
* toplev.c: Do not include hsa-common.h.
(compile_file): Do not call hsa_output_brig.
* tree-core.h (enum omp_clause_code): Remove OMP_CLAUSE__GRIDDIM_.
(tree_omp_clause): Remove union field dimension.
* tree-nested.c (convert_nonlocal_omp_clauses): Remove the
OMP_CLAUSE__GRIDDIM_ case.
(convert_local_omp_clauses): Likewise.
* tree-pass.h (make_pass_gen_hsail): Remove declaration.
(make_pass_ipa_hsa): Likewise.
* tree-pretty-print.c (dump_omp_clause): Remove GIMPLE_OMP_GRID_BODY
case.
* tree.c (omp_clause_num_ops): Remove the element corresponding to
OMP_CLAUSE__GRIDDIM_.
(omp_clause_code_name): Likewise.
(walk_tree_1): Remove GIMPLE_OMP_GRID_BODY case.
* tree.h (OMP_CLAUSE__GRIDDIM__DIMENSION): Remove.
(OMP_CLAUSE__GRIDDIM__SIZE): Likewise.
(OMP_CLAUSE__GRIDDIM__GROUP): Likewise.
gcc/fortran/ChangeLog:
2020-07-24 Martin Jambor <mjambor@suse.cz>
* f95-lang.c (gfc_init_builtin_functions): Remove processing of
hsa-builtins.def.
gcc/brig/ChangeLog:
2020-07-24 Martin Jambor <mjambor@suse.cz>
* brigfrontend/brig-util.h (hsa_type_packed_p): Declared.
* brigfrontend/brig-util.cc (hsa_type_packed_p): Moved here from
removed gcc/hsa-common.c.
libgomp/ChangeLog:
2020-07-24 Martin Jambor <mjambor@suse.cz>
* plugin/Makefrag.am: Remove configuration of HSA plugin.
* aclocal.m4: Regenerated.
* Makefile.in: Regenerated.
* config.h.in: Regenerated.
* configure: Regenerated.
* plugin/configfrag.ac: Likewise.
* plugin/hsa_ext_finalize.h: Removed.
* plugin/plugin-hsa.c: Likewise.
* testsuite/Makefile.in: Regenerated.
* testsuite/lib/libgomp.exp
(offload_target_to_openacc_device_type): Remove hsa case.
(check_effective_target_hsa_offloading_selected_nocache): Removed
(check_effective_target_hsa_offloading_selected): Likewise.
(libgomp_init): Do not add -Wno-hsa to additional_flags.
* testsuite/libgomp.hsa.c/alloca-1.c: Removed test.
* testsuite/libgomp.hsa.c/bitfield-1.c: Likewise.
* testsuite/libgomp.hsa.c/bits-insns.c: Likewise.
* testsuite/libgomp.hsa.c/builtins-1.c: Likewise.
* testsuite/libgomp.hsa.c/c.exp: Likewise.
* testsuite/libgomp.hsa.c/complex-1.c: Likewise.
* testsuite/libgomp.hsa.c/complex-align-2.c: Likewise.
* testsuite/libgomp.hsa.c/formal-actual-args-1.c: Likewise.
* testsuite/libgomp.hsa.c/function-call-1.c: Likewise.
* testsuite/libgomp.hsa.c/get-level-1.c: Likewise.
* testsuite/libgomp.hsa.c/gridify-1.c: Likewise.
* testsuite/libgomp.hsa.c/gridify-2.c: Likewise.
* testsuite/libgomp.hsa.c/gridify-3.c: Likewise.
* testsuite/libgomp.hsa.c/gridify-4.c: Likewise.
* testsuite/libgomp.hsa.c/memory-operations-1.c: Likewise.
* testsuite/libgomp.hsa.c/pr69568.c: Likewise.
* testsuite/libgomp.hsa.c/pr82416.c: Likewise.
* testsuite/libgomp.hsa.c/rotate-1.c: Likewise.
* testsuite/libgomp.hsa.c/staticvar.c: Likewise.
* testsuite/libgomp.hsa.c/switch-1.c: Likewise.
* testsuite/libgomp.hsa.c/switch-branch-1.c: Likewise.
* testsuite/libgomp.hsa.c/switch-sbr-2.c: Likewise.
* testsuite/libgomp.hsa.c/tiling-1.c: Likewise.
* testsuite/libgomp.hsa.c/tiling-2.c: Likewise.
gcc/testsuite/ChangeLog:
2020-07-24 Martin Jambor <mjambor@suse.cz>
* lib/target-supports.exp (check_effective_target_offload_hsa):
Removed.
* c-c++-common/gomp/gridify-1.c: Removed test.
* c-c++-common/gomp/gridify-2.c: Likewise.
* c-c++-common/gomp/gridify-3.c: Likewise.
* c-c++-common/gomp/hsa-indirect-call-1.c: Likewise.
* gfortran.dg/gomp/gridify-1.f90: Likewise.
* gcc.dg/gomp/gomp.exp: Do not pass -Wno-hsa to tests.
* g++.dg/gomp/gomp.exp: Likewise.
* gfortran.dg/gomp/gomp.exp: Likewise.
Diffstat (limited to 'libgomp')
34 files changed, 371 insertions, 4469 deletions
diff --git a/libgomp/Makefile.in b/libgomp/Makefile.in index 8418af977d4..2dc2168bce7 100644 --- a/libgomp/Makefile.in +++ b/libgomp/Makefile.in @@ -1,7 +1,7 @@ -# Makefile.in generated by automake 1.15.1 from Makefile.am. +# Makefile.in generated by automake 1.16.1 from Makefile.am. # @configure_input@ -# Copyright (C) 1994-2017 Free Software Foundation, Inc. +# Copyright (C) 1994-2018 Free Software Foundation, Inc. # This Makefile.in is free software; the Free Software Foundation # gives unlimited permission to copy and/or distribute it, @@ -119,9 +119,8 @@ build_triplet = @build@ host_triplet = @host@ target_triplet = @target@ @PLUGIN_NVPTX_TRUE@am__append_1 = libgomp-plugin-nvptx.la -@PLUGIN_HSA_TRUE@am__append_2 = libgomp-plugin-hsa.la -@PLUGIN_GCN_TRUE@am__append_3 = libgomp-plugin-gcn.la -@USE_FORTRAN_TRUE@am__append_4 = openacc.f90 +@PLUGIN_GCN_TRUE@am__append_2 = libgomp-plugin-gcn.la +@USE_FORTRAN_TRUE@am__append_3 = openacc.f90 subdir = . ACLOCAL_M4 = $(top_srcdir)/aclocal.m4 am__aclocal_m4_deps = $(top_srcdir)/../config/acx.m4 \ @@ -198,17 +197,6 @@ libgomp_plugin_gcn_la_LINK = $(LIBTOOL) $(AM_V_lt) --tag=CC \ $(libgomp_plugin_gcn_la_LDFLAGS) $(LDFLAGS) -o $@ @PLUGIN_GCN_TRUE@am_libgomp_plugin_gcn_la_rpath = -rpath \ @PLUGIN_GCN_TRUE@ $(toolexeclibdir) -@PLUGIN_HSA_TRUE@libgomp_plugin_hsa_la_DEPENDENCIES = libgomp.la \ -@PLUGIN_HSA_TRUE@ $(am__DEPENDENCIES_1) -@PLUGIN_HSA_TRUE@am_libgomp_plugin_hsa_la_OBJECTS = \ -@PLUGIN_HSA_TRUE@ libgomp_plugin_hsa_la-plugin-hsa.lo -libgomp_plugin_hsa_la_OBJECTS = $(am_libgomp_plugin_hsa_la_OBJECTS) -libgomp_plugin_hsa_la_LINK = $(LIBTOOL) $(AM_V_lt) --tag=CC \ - $(libgomp_plugin_hsa_la_LIBTOOLFLAGS) $(LIBTOOLFLAGS) \ - --mode=link $(CCLD) $(AM_CFLAGS) $(CFLAGS) \ - $(libgomp_plugin_hsa_la_LDFLAGS) $(LDFLAGS) -o $@ -@PLUGIN_HSA_TRUE@am_libgomp_plugin_hsa_la_rpath = -rpath \ -@PLUGIN_HSA_TRUE@ $(toolexeclibdir) @PLUGIN_NVPTX_TRUE@libgomp_plugin_nvptx_la_DEPENDENCIES = libgomp.la \ @PLUGIN_NVPTX_TRUE@ $(am__DEPENDENCIES_1) @PLUGIN_NVPTX_TRUE@am_libgomp_plugin_nvptx_la_OBJECTS = \ @@ -248,7 +236,32 @@ am__v_at_0 = @ am__v_at_1 = DEFAULT_INCLUDES = -I.@am__isrc@ depcomp = $(SHELL) $(top_srcdir)/../depcomp -am__depfiles_maybe = depfiles +am__maybe_remake_depfiles = depfiles +am__depfiles_remade = ./$(DEPDIR)/affinity-fmt.Plo \ + ./$(DEPDIR)/affinity.Plo ./$(DEPDIR)/alloc.Plo \ + ./$(DEPDIR)/allocator.Plo ./$(DEPDIR)/atomic.Plo \ + ./$(DEPDIR)/bar.Plo ./$(DEPDIR)/barrier.Plo \ + ./$(DEPDIR)/critical.Plo ./$(DEPDIR)/env.Plo \ + ./$(DEPDIR)/error.Plo ./$(DEPDIR)/fortran.Plo \ + ./$(DEPDIR)/icv-device.Plo ./$(DEPDIR)/icv.Plo \ + ./$(DEPDIR)/iter.Plo ./$(DEPDIR)/iter_ull.Plo \ + ./$(DEPDIR)/libgomp-plugin.Plo \ + ./$(DEPDIR)/libgomp_plugin_gcn_la-plugin-gcn.Plo \ + ./$(DEPDIR)/libgomp_plugin_nvptx_la-plugin-nvptx.Plo \ + ./$(DEPDIR)/lock.Plo ./$(DEPDIR)/loop.Plo \ + ./$(DEPDIR)/loop_ull.Plo ./$(DEPDIR)/mutex.Plo \ + ./$(DEPDIR)/oacc-async.Plo ./$(DEPDIR)/oacc-cuda.Plo \ + ./$(DEPDIR)/oacc-host.Plo ./$(DEPDIR)/oacc-init.Plo \ + ./$(DEPDIR)/oacc-mem.Plo ./$(DEPDIR)/oacc-parallel.Plo \ + ./$(DEPDIR)/oacc-plugin.Plo ./$(DEPDIR)/oacc-profiling.Plo \ + ./$(DEPDIR)/oacc-target.Plo ./$(DEPDIR)/ordered.Plo \ + ./$(DEPDIR)/parallel.Plo ./$(DEPDIR)/priority_queue.Plo \ + ./$(DEPDIR)/proc.Plo ./$(DEPDIR)/ptrlock.Plo \ + ./$(DEPDIR)/sections.Plo ./$(DEPDIR)/sem.Plo \ + ./$(DEPDIR)/single.Plo ./$(DEPDIR)/splay-tree.Plo \ + ./$(DEPDIR)/target.Plo ./$(DEPDIR)/task.Plo \ + ./$(DEPDIR)/team.Plo ./$(DEPDIR)/teams.Plo \ + ./$(DEPDIR)/time.Plo ./$(DEPDIR)/work.Plo am__mv = mv -f COMPILE = $(CC) $(DEFS) $(DEFAULT_INCLUDES) $(INCLUDES) $(AM_CPPFLAGS) \ $(CPPFLAGS) $(AM_CFLAGS) $(CFLAGS) @@ -281,7 +294,6 @@ am__v_FCLD_ = $(am__v_FCLD_@AM_DEFAULT_V@) am__v_FCLD_0 = @echo " FCLD " $@; am__v_FCLD_1 = SOURCES = $(libgomp_plugin_gcn_la_SOURCES) \ - $(libgomp_plugin_hsa_la_SOURCES) \ $(libgomp_plugin_nvptx_la_SOURCES) $(libgomp_la_SOURCES) AM_V_DVIPS = $(am__v_DVIPS_@AM_V@) am__v_DVIPS_ = $(am__v_DVIPS_@AM_DEFAULT_V@) @@ -450,10 +462,6 @@ PLUGIN_GCN = @PLUGIN_GCN@ PLUGIN_GCN_CPPFLAGS = @PLUGIN_GCN_CPPFLAGS@ PLUGIN_GCN_LDFLAGS = @PLUGIN_GCN_LDFLAGS@ PLUGIN_GCN_LIBS = @PLUGIN_GCN_LIBS@ -PLUGIN_HSA = @PLUGIN_HSA@ -PLUGIN_HSA_CPPFLAGS = @PLUGIN_HSA_CPPFLAGS@ -PLUGIN_HSA_LDFLAGS = @PLUGIN_HSA_LDFLAGS@ -PLUGIN_HSA_LIBS = @PLUGIN_HSA_LIBS@ PLUGIN_NVPTX = @PLUGIN_NVPTX@ PLUGIN_NVPTX_CPPFLAGS = @PLUGIN_NVPTX_CPPFLAGS@ PLUGIN_NVPTX_LDFLAGS = @PLUGIN_NVPTX_LDFLAGS@ @@ -550,8 +558,7 @@ libsubincludedir = $(libdir)/gcc/$(target_alias)/$(gcc_version)/include AM_CPPFLAGS = $(addprefix -I, $(search_path)) AM_CFLAGS = $(XCFLAGS) AM_LDFLAGS = $(XLDFLAGS) $(SECTION_LDFLAGS) $(OPT_LDFLAGS) -toolexeclib_LTLIBRARIES = libgomp.la $(am__append_1) $(am__append_2) \ - $(am__append_3) +toolexeclib_LTLIBRARIES = libgomp.la $(am__append_1) $(am__append_2) nodist_toolexeclib_HEADERS = libgomp.spec # -Wc is only a libtool option. @@ -577,7 +584,7 @@ libgomp_la_SOURCES = alloc.c atomic.c barrier.c critical.c env.c \ oacc-parallel.c oacc-host.c oacc-init.c oacc-mem.c \ oacc-async.c oacc-plugin.c oacc-cuda.c priority_queue.c \ affinity-fmt.c teams.c allocator.c oacc-profiling.c \ - oacc-target.c $(am__append_4) + oacc-target.c $(am__append_3) # Nvidia PTX OpenACC plugin. @PLUGIN_NVPTX_TRUE@libgomp_plugin_nvptx_version_info = -version-info $(libtool_VERSION) @@ -589,18 +596,6 @@ libgomp_la_SOURCES = alloc.c atomic.c barrier.c critical.c env.c \ @PLUGIN_NVPTX_TRUE@libgomp_plugin_nvptx_la_LIBADD = libgomp.la $(PLUGIN_NVPTX_LIBS) @PLUGIN_NVPTX_TRUE@libgomp_plugin_nvptx_la_LIBTOOLFLAGS = --tag=disable-static -# Heterogenous Systems Architecture plugin -@PLUGIN_HSA_TRUE@libgomp_plugin_hsa_version_info = -version-info $(libtool_VERSION) -@PLUGIN_HSA_TRUE@libgomp_plugin_hsa_la_SOURCES = plugin/plugin-hsa.c -@PLUGIN_HSA_TRUE@libgomp_plugin_hsa_la_CPPFLAGS = $(AM_CPPFLAGS) $(PLUGIN_HSA_CPPFLAGS) \ -@PLUGIN_HSA_TRUE@ -D_GNU_SOURCE - -@PLUGIN_HSA_TRUE@libgomp_plugin_hsa_la_LDFLAGS = \ -@PLUGIN_HSA_TRUE@ $(libgomp_plugin_hsa_version_info) \ -@PLUGIN_HSA_TRUE@ $(lt_host_flags) $(PLUGIN_HSA_LDFLAGS) -@PLUGIN_HSA_TRUE@libgomp_plugin_hsa_la_LIBADD = libgomp.la $(PLUGIN_HSA_LIBS) -@PLUGIN_HSA_TRUE@libgomp_plugin_hsa_la_LIBTOOLFLAGS = --tag=disable-static - # AMD GCN plugin @PLUGIN_GCN_TRUE@libgomp_plugin_gcn_version_info = -version-info $(libtool_VERSION) @PLUGIN_GCN_TRUE@libgomp_plugin_gcn_la_SOURCES = plugin/plugin-gcn.c @@ -674,8 +669,8 @@ Makefile: $(srcdir)/Makefile.in $(top_builddir)/config.status echo ' $(SHELL) ./config.status'; \ $(SHELL) ./config.status;; \ *) \ - echo ' cd $(top_builddir) && $(SHELL) ./config.status $@ $(am__depfiles_maybe)'; \ - cd $(top_builddir) && $(SHELL) ./config.status $@ $(am__depfiles_maybe);; \ + echo ' cd $(top_builddir) && $(SHELL) ./config.status $@ $(am__maybe_remake_depfiles)'; \ + cd $(top_builddir) && $(SHELL) ./config.status $@ $(am__maybe_remake_depfiles);; \ esac; $(top_srcdir)/plugin/Makefrag.am $(top_srcdir)/../multilib.am $(am__empty): @@ -751,9 +746,6 @@ clean-toolexeclibLTLIBRARIES: libgomp-plugin-gcn.la: $(libgomp_plugin_gcn_la_OBJECTS) $(libgomp_plugin_gcn_la_DEPENDENCIES) $(EXTRA_libgomp_plugin_gcn_la_DEPENDENCIES) $(AM_V_CCLD)$(libgomp_plugin_gcn_la_LINK) $(am_libgomp_plugin_gcn_la_rpath) $(libgomp_plugin_gcn_la_OBJECTS) $(libgomp_plugin_gcn_la_LIBADD) $(LIBS) -libgomp-plugin-hsa.la: $(libgomp_plugin_hsa_la_OBJECTS) $(libgomp_plugin_hsa_la_DEPENDENCIES) $(EXTRA_libgomp_plugin_hsa_la_DEPENDENCIES) - $(AM_V_CCLD)$(libgomp_plugin_hsa_la_LINK) $(am_libgomp_plugin_hsa_la_rpath) $(libgomp_plugin_hsa_la_OBJECTS) $(libgomp_plugin_hsa_la_LIBADD) $(LIBS) - libgomp-plugin-nvptx.la: $(libgomp_plugin_nvptx_la_OBJECTS) $(libgomp_plugin_nvptx_la_DEPENDENCIES) $(EXTRA_libgomp_plugin_nvptx_la_DEPENDENCIES) $(AM_V_CCLD)$(libgomp_plugin_nvptx_la_LINK) $(am_libgomp_plugin_nvptx_la_rpath) $(libgomp_plugin_nvptx_la_OBJECTS) $(libgomp_plugin_nvptx_la_LIBADD) $(LIBS) @@ -766,53 +758,58 @@ mostlyclean-compile: distclean-compile: -rm -f *.tab.c -@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/affinity-fmt.Plo@am__quote@ -@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/affinity.Plo@am__quote@ -@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/alloc.Plo@am__quote@ -@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/allocator.Plo@am__quote@ -@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/atomic.Plo@am__quote@ -@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/bar.Plo@am__quote@ -@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/barrier.Plo@am__quote@ -@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/critical.Plo@am__quote@ -@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/env.Plo@am__quote@ -@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/error.Plo@am__quote@ -@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/fortran.Plo@am__quote@ -@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/icv-device.Plo@am__quote@ -@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/icv.Plo@am__quote@ -@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/iter.Plo@am__quote@ -@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/iter_ull.Plo@am__quote@ -@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/libgomp-plugin.Plo@am__quote@ -@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/libgomp_plugin_gcn_la-plugin-gcn.Plo@am__quote@ -@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/libgomp_plugin_hsa_la-plugin-hsa.Plo@am__quote@ -@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/libgomp_plugin_nvptx_la-plugin-nvptx.Plo@am__quote@ -@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/lock.Plo@am__quote@ -@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/loop.Plo@am__quote@ -@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/loop_ull.Plo@am__quote@ -@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/mutex.Plo@am__quote@ -@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/oacc-async.Plo@am__quote@ -@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/oacc-cuda.Plo@am__quote@ -@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/oacc-host.Plo@am__quote@ -@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/oacc-init.Plo@am__quote@ -@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/oacc-mem.Plo@am__quote@ -@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/oacc-parallel.Plo@am__quote@ -@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/oacc-plugin.Plo@am__quote@ -@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/oacc-profiling.Plo@am__quote@ -@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/oacc-target.Plo@am__quote@ -@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/ordered.Plo@am__quote@ -@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/parallel.Plo@am__quote@ -@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/priority_queue.Plo@am__quote@ -@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/proc.Plo@am__quote@ -@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/ptrlock.Plo@am__quote@ -@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/sections.Plo@am__quote@ -@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/sem.Plo@am__quote@ -@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/single.Plo@am__quote@ -@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/splay-tree.Plo@am__quote@ -@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/target.Plo@am__quote@ -@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/task.Plo@am__quote@ -@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/team.Plo@am__quote@ -@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/teams.Plo@am__quote@ -@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/time.Plo@am__quote@ -@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/work.Plo@am__quote@ +@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/affinity-fmt.Plo@am__quote@ # am--include-marker +@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/affinity.Plo@am__quote@ # am--include-marker +@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/alloc.Plo@am__quote@ # am--include-marker +@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/allocator.Plo@am__quote@ # am--include-marker +@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/atomic.Plo@am__quote@ # am--include-marker +@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/bar.Plo@am__quote@ # am--include-marker +@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/barrier.Plo@am__quote@ # am--include-marker +@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/critical.Plo@am__quote@ # am--include-marker +@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/env.Plo@am__quote@ # am--include-marker +@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/error.Plo@am__quote@ # am--include-marker +@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/fortran.Plo@am__quote@ # am--include-marker +@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/icv-device.Plo@am__quote@ # am--include-marker +@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/icv.Plo@am__quote@ # am--include-marker +@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/iter.Plo@am__quote@ # am--include-marker +@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/iter_ull.Plo@am__quote@ # am--include-marker +@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/libgomp-plugin.Plo@am__quote@ # am--include-marker +@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/libgomp_plugin_gcn_la-plugin-gcn.Plo@am__quote@ # am--include-marker +@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/libgomp_plugin_nvptx_la-plugin-nvptx.Plo@am__quote@ # am--include-marker +@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/lock.Plo@am__quote@ # am--include-marker +@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/loop.Plo@am__quote@ # am--include-marker +@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/loop_ull.Plo@am__quote@ # am--include-marker +@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/mutex.Plo@am__quote@ # am--include-marker +@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/oacc-async.Plo@am__quote@ # am--include-marker +@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/oacc-cuda.Plo@am__quote@ # am--include-marker +@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/oacc-host.Plo@am__quote@ # am--include-marker +@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/oacc-init.Plo@am__quote@ # am--include-marker +@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/oacc-mem.Plo@am__quote@ # am--include-marker +@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/oacc-parallel.Plo@am__quote@ # am--include-marker +@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/oacc-plugin.Plo@am__quote@ # am--include-marker +@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/oacc-profiling.Plo@am__quote@ # am--include-marker +@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/oacc-target.Plo@am__quote@ # am--include-marker +@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/ordered.Plo@am__quote@ # am--include-marker +@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/parallel.Plo@am__quote@ # am--include-marker +@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/priority_queue.Plo@am__quote@ # am--include-marker +@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/proc.Plo@am__quote@ # am--include-marker +@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/ptrlock.Plo@am__quote@ # am--include-marker +@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/sections.Plo@am__quote@ # am--include-marker +@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/sem.Plo@am__quote@ # am--include-marker +@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/single.Plo@am__quote@ # am--include-marker +@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/splay-tree.Plo@am__quote@ # am--include-marker +@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/target.Plo@am__quote@ # am--include-marker +@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/task.Plo@am__quote@ # am--include-marker +@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/team.Plo@am__quote@ # am--include-marker +@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/teams.Plo@am__quote@ # am--include-marker +@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/time.Plo@am__quote@ # am--include-marker +@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/work.Plo@am__quote@ # am--include-marker + +$(am__depfiles_remade): + @$(MKDIR_P) $(@D) + @echo '# dummy' >$@-t && $(am__mv) $@-t $@ + +am--depfiles: $(am__depfiles_remade) .c.o: @am__fastdepCC_TRUE@ $(AM_V_CC)$(COMPILE) -MT $@ -MD -MP -MF $(DEPDIR)/$*.Tpo -c -o $@ $< @@ -842,13 +839,6 @@ libgomp_plugin_gcn_la-plugin-gcn.lo: plugin/plugin-gcn.c @AMDEP_TRUE@@am__fastdepCC_FALSE@ DEPDIR=$(DEPDIR) $(CCDEPMODE) $(depcomp) @AMDEPBACKSLASH@ @am__fastdepCC_FALSE@ $(AM_V_CC@am__nodep@)$(LIBTOOL) $(AM_V_lt) --tag=CC $(libgomp_plugin_gcn_la_LIBTOOLFLAGS) $(LIBTOOLFLAGS) --mode=compile $(CC) $(DEFS) $(DEFAULT_INCLUDES) $(INCLUDES) $(libgomp_plugin_gcn_la_CPPFLAGS) $(CPPFLAGS) $(AM_CFLAGS) $(CFLAGS) -c -o libgomp_plugin_gcn_la-plugin-gcn.lo `test -f 'plugin/plugin-gcn.c' || echo '$(srcdir)/'`plugin/plugin-gcn.c -libgomp_plugin_hsa_la-plugin-hsa.lo: plugin/plugin-hsa.c -@am__fastdepCC_TRUE@ $(AM_V_CC)$(LIBTOOL) $(AM_V_lt) --tag=CC $(libgomp_plugin_hsa_la_LIBTOOLFLAGS) $(LIBTOOLFLAGS) --mode=compile $(CC) $(DEFS) $(DEFAULT_INCLUDES) $(INCLUDES) $(libgomp_plugin_hsa_la_CPPFLAGS) $(CPPFLAGS) $(AM_CFLAGS) $(CFLAGS) -MT libgomp_plugin_hsa_la-plugin-hsa.lo -MD -MP -MF $(DEPDIR)/libgomp_plugin_hsa_la-plugin-hsa.Tpo -c -o libgomp_plugin_hsa_la-plugin-hsa.lo `test -f 'plugin/plugin-hsa.c' || echo '$(srcdir)/'`plugin/plugin-hsa.c -@am__fastdepCC_TRUE@ $(AM_V_at)$(am__mv) $(DEPDIR)/libgomp_plugin_hsa_la-plugin-hsa.Tpo $(DEPDIR)/libgomp_plugin_hsa_la-plugin-hsa.Plo -@AMDEP_TRUE@@am__fastdepCC_FALSE@ $(AM_V_CC)source='plugin/plugin-hsa.c' object='libgomp_plugin_hsa_la-plugin-hsa.lo' libtool=yes @AMDEPBACKSLASH@ -@AMDEP_TRUE@@am__fastdepCC_FALSE@ DEPDIR=$(DEPDIR) $(CCDEPMODE) $(depcomp) @AMDEPBACKSLASH@ -@am__fastdepCC_FALSE@ $(AM_V_CC@am__nodep@)$(LIBTOOL) $(AM_V_lt) --tag=CC $(libgomp_plugin_hsa_la_LIBTOOLFLAGS) $(LIBTOOLFLAGS) --mode=compile $(CC) $(DEFS) $(DEFAULT_INCLUDES) $(INCLUDES) $(libgomp_plugin_hsa_la_CPPFLAGS) $(CPPFLAGS) $(AM_CFLAGS) $(CFLAGS) -c -o libgomp_plugin_hsa_la-plugin-hsa.lo `test -f 'plugin/plugin-hsa.c' || echo '$(srcdir)/'`plugin/plugin-hsa.c - libgomp_plugin_nvptx_la-plugin-nvptx.lo: plugin/plugin-nvptx.c @am__fastdepCC_TRUE@ $(AM_V_CC)$(LIBTOOL) $(AM_V_lt) --tag=CC $(libgomp_plugin_nvptx_la_LIBTOOLFLAGS) $(LIBTOOLFLAGS) --mode=compile $(CC) $(DEFS) $(DEFAULT_INCLUDES) $(INCLUDES) $(libgomp_plugin_nvptx_la_CPPFLAGS) $(CPPFLAGS) $(AM_CFLAGS) $(CFLAGS) -MT libgomp_plugin_nvptx_la-plugin-nvptx.lo -MD -MP -MF $(DEPDIR)/libgomp_plugin_nvptx_la-plugin-nvptx.Tpo -c -o libgomp_plugin_nvptx_la-plugin-nvptx.lo `test -f 'plugin/plugin-nvptx.c' || echo '$(srcdir)/'`plugin/plugin-nvptx.c @am__fastdepCC_TRUE@ $(AM_V_at)$(am__mv) $(DEPDIR)/libgomp_plugin_nvptx_la-plugin-nvptx.Tpo $(DEPDIR)/libgomp_plugin_nvptx_la-plugin-nvptx.Plo @@ -1205,7 +1195,52 @@ clean-am: clean-aminfo clean-generic clean-libtool clean-local \ distclean: distclean-recursive -rm -f $(am__CONFIG_DISTCLEAN_FILES) - -rm -rf ./$(DEPDIR) + -rm -f ./$(DEPDIR)/affinity-fmt.Plo + -rm -f ./$(DEPDIR)/affinity.Plo + -rm -f ./$(DEPDIR)/alloc.Plo + -rm -f ./$(DEPDIR)/allocator.Plo + -rm -f ./$(DEPDIR)/atomic.Plo + -rm -f ./$(DEPDIR)/bar.Plo + -rm -f ./$(DEPDIR)/barrier.Plo + -rm -f ./$(DEPDIR)/critical.Plo + -rm -f ./$(DEPDIR)/env.Plo + -rm -f ./$(DEPDIR)/error.Plo + -rm -f ./$(DEPDIR)/fortran.Plo + -rm -f ./$(DEPDIR)/icv-device.Plo + -rm -f ./$(DEPDIR)/icv.Plo + -rm -f ./$(DEPDIR)/iter.Plo + -rm -f ./$(DEPDIR)/iter_ull.Plo + -rm -f ./$(DEPDIR)/libgomp-plugin.Plo + -rm -f ./$(DEPDIR)/libgomp_plugin_gcn_la-plugin-gcn.Plo + -rm -f ./$(DEPDIR)/libgomp_plugin_nvptx_la-plugin-nvptx.Plo + -rm -f ./$(DEPDIR)/lock.Plo + -rm -f ./$(DEPDIR)/loop.Plo + -rm -f ./$(DEPDIR)/loop_ull.Plo + -rm -f ./$(DEPDIR)/mutex.Plo + -rm -f ./$(DEPDIR)/oacc-async.Plo + -rm -f ./$(DEPDIR)/oacc-cuda.Plo + -rm -f ./$(DEPDIR)/oacc-host.Plo + -rm -f ./$(DEPDIR)/oacc-init.Plo + -rm -f ./$(DEPDIR)/oacc-mem.Plo + -rm -f ./$(DEPDIR)/oacc-parallel.Plo + -rm -f ./$(DEPDIR)/oacc-plugin.Plo + -rm -f ./$(DEPDIR)/oacc-profiling.Plo + -rm -f ./$(DEPDIR)/oacc-target.Plo + -rm -f ./$(DEPDIR)/ordered.Plo + -rm -f ./$(DEPDIR)/parallel.Plo + -rm -f ./$(DEPDIR)/priority_queue.Plo + -rm -f ./$(DEPDIR)/proc.Plo + -rm -f ./$(DEPDIR)/ptrlock.Plo + -rm -f ./$(DEPDIR)/sections.Plo + -rm -f ./$(DEPDIR)/sem.Plo + -rm -f ./$(DEPDIR)/single.Plo + -rm -f ./$(DEPDIR)/splay-tree.Plo + -rm -f ./$(DEPDIR)/target.Plo + -rm -f ./$(DEPDIR)/task.Plo + -rm -f ./$(DEPDIR)/team.Plo + -rm -f ./$(DEPDIR)/teams.Plo + -rm -f ./$(DEPDIR)/time.Plo + -rm -f ./$(DEPDIR)/work.Plo -rm -f Makefile distclean-am: clean-am distclean-compile distclean-generic \ distclean-hdr distclean-libtool distclean-local distclean-tags @@ -1346,7 +1381,52 @@ installcheck-am: maintainer-clean: maintainer-clean-recursive -rm -f $(am__CONFIG_DISTCLEAN_FILES) -rm -rf $(top_srcdir)/autom4te.cache - -rm -rf ./$(DEPDIR) + -rm -f ./$(DEPDIR)/affinity-fmt.Plo + -rm -f ./$(DEPDIR)/affinity.Plo + -rm -f ./$(DEPDIR)/alloc.Plo + -rm -f ./$(DEPDIR)/allocator.Plo + -rm -f ./$(DEPDIR)/atomic.Plo + -rm -f ./$(DEPDIR)/bar.Plo + -rm -f ./$(DEPDIR)/barrier.Plo + -rm -f ./$(DEPDIR)/critical.Plo + -rm -f ./$(DEPDIR)/env.Plo + -rm -f ./$(DEPDIR)/error.Plo + -rm -f ./$(DEPDIR)/fortran.Plo + -rm -f ./$(DEPDIR)/icv-device.Plo + -rm -f ./$(DEPDIR)/icv.Plo + -rm -f ./$(DEPDIR)/iter.Plo + -rm -f ./$(DEPDIR)/iter_ull.Plo + -rm -f ./$(DEPDIR)/libgomp-plugin.Plo + -rm -f ./$(DEPDIR)/libgomp_plugin_gcn_la-plugin-gcn.Plo + -rm -f ./$(DEPDIR)/libgomp_plugin_nvptx_la-plugin-nvptx.Plo + -rm -f ./$(DEPDIR)/lock.Plo + -rm -f ./$(DEPDIR)/loop.Plo + -rm -f ./$(DEPDIR)/loop_ull.Plo + -rm -f ./$(DEPDIR)/mutex.Plo + -rm -f ./$(DEPDIR)/oacc-async.Plo + -rm -f ./$(DEPDIR)/oacc-cuda.Plo + -rm -f ./$(DEPDIR)/oacc-host.Plo + -rm -f ./$(DEPDIR)/oacc-init.Plo + -rm -f ./$(DEPDIR)/oacc-mem.Plo + -rm -f ./$(DEPDIR)/oacc-parallel.Plo + -rm -f ./$(DEPDIR)/oacc-plugin.Plo + -rm -f ./$(DEPDIR)/oacc-profiling.Plo + -rm -f ./$(DEPDIR)/oacc-target.Plo + -rm -f ./$(DEPDIR)/ordered.Plo + -rm -f ./$(DEPDIR)/parallel.Plo + -rm -f ./$(DEPDIR)/priority_queue.Plo + -rm -f ./$(DEPDIR)/proc.Plo + -rm -f ./$(DEPDIR)/ptrlock.Plo + -rm -f ./$(DEPDIR)/sections.Plo + -rm -f ./$(DEPDIR)/sem.Plo + -rm -f ./$(DEPDIR)/single.Plo + -rm -f ./$(DEPDIR)/splay-tree.Plo + -rm -f ./$(DEPDIR)/target.Plo + -rm -f ./$(DEPDIR)/task.Plo + -rm -f ./$(DEPDIR)/team.Plo + -rm -f ./$(DEPDIR)/teams.Plo + -rm -f ./$(DEPDIR)/time.Plo + -rm -f ./$(DEPDIR)/work.Plo -rm -f Makefile maintainer-clean-am: distclean-am maintainer-clean-aminfo \ maintainer-clean-generic maintainer-clean-local @@ -1373,8 +1453,8 @@ uninstall-am: uninstall-dvi-am uninstall-html-am uninstall-info-am \ .MAKE: $(am__recursive_targets) all install-am install-strip .PHONY: $(am__recursive_targets) CTAGS GTAGS TAGS all all-am all-local \ - am--refresh check check-am clean clean-aminfo clean-cscope \ - clean-generic clean-libtool clean-local \ + am--depfiles am--refresh check check-am clean clean-aminfo \ + clean-cscope clean-generic clean-libtool clean-local \ clean-toolexeclibLTLIBRARIES cscope cscopelist-am ctags \ ctags-am dist-info distclean distclean-compile \ distclean-generic distclean-hdr distclean-libtool \ diff --git a/libgomp/aclocal.m4 b/libgomp/aclocal.m4 index 55d9d71895a..471963bf607 100644 --- a/libgomp/aclocal.m4 +++ b/libgomp/aclocal.m4 @@ -1,6 +1,6 @@ -# generated automatically by aclocal 1.15.1 -*- Autoconf -*- +# generated automatically by aclocal 1.16.1 -*- Autoconf -*- -# Copyright (C) 1996-2017 Free Software Foundation, Inc. +# Copyright (C) 1996-2018 Free Software Foundation, Inc. # This file is free software; the Free Software Foundation # gives unlimited permission to copy and/or distribute it, @@ -20,7 +20,7 @@ You have another version of autoconf. It may work, but is not guaranteed to. If you have problems, you may need to regenerate the build system entirely. To do so, use the procedure documented by the package, typically 'autoreconf'.])]) -# Copyright (C) 2002-2017 Free Software Foundation, Inc. +# Copyright (C) 2002-2018 Free Software Foundation, Inc. # # This file is free software; the Free Software Foundation # gives unlimited permission to copy and/or distribute it, @@ -32,10 +32,10 @@ To do so, use the procedure documented by the package, typically 'autoreconf'.]) # generated from the m4 files accompanying Automake X.Y. # (This private macro should not be called outside this file.) AC_DEFUN([AM_AUTOMAKE_VERSION], -[am__api_version='1.15' +[am__api_version='1.16' dnl Some users find AM_AUTOMAKE_VERSION and mistake it for a way to dnl require some minimum version. Point them to the right macro. -m4_if([$1], [1.15.1], [], +m4_if([$1], [1.16.1], [], [AC_FATAL([Do not call $0, use AM_INIT_AUTOMAKE([$1]).])])dnl ]) @@ -51,14 +51,14 @@ m4_define([_AM_AUTOCONF_VERSION], []) # Call AM_AUTOMAKE_VERSION and AM_AUTOMAKE_VERSION so they can be traced. # This function is AC_REQUIREd by AM_INIT_AUTOMAKE. AC_DEFUN([AM_SET_CURRENT_AUTOMAKE_VERSION], -[AM_AUTOMAKE_VERSION([1.15.1])dnl +[AM_AUTOMAKE_VERSION([1.16.1])dnl m4_ifndef([AC_AUTOCONF_VERSION], [m4_copy([m4_PACKAGE_VERSION], [AC_AUTOCONF_VERSION])])dnl _AM_AUTOCONF_VERSION(m4_defn([AC_AUTOCONF_VERSION]))]) # AM_AUX_DIR_EXPAND -*- Autoconf -*- -# Copyright (C) 2001-2017 Free Software Foundation, Inc. +# Copyright (C) 2001-2018 Free Software Foundation, Inc. # # This file is free software; the Free Software Foundation # gives unlimited permission to copy and/or distribute it, @@ -110,7 +110,7 @@ am_aux_dir=`cd "$ac_aux_dir" && pwd` # AM_CONDITIONAL -*- Autoconf -*- -# Copyright (C) 1997-2017 Free Software Foundation, Inc. +# Copyright (C) 1997-2018 Free Software Foundation, Inc. # # This file is free software; the Free Software Foundation # gives unlimited permission to copy and/or distribute it, @@ -141,7 +141,7 @@ AC_CONFIG_COMMANDS_PRE( Usually this means the macro was only invoked conditionally.]]) fi])]) -# Copyright (C) 1999-2017 Free Software Foundation, Inc. +# Copyright (C) 1999-2018 Free Software Foundation, Inc. # # This file is free software; the Free Software Foundation # gives unlimited permission to copy and/or distribute it, @@ -332,13 +332,12 @@ _AM_SUBST_NOTMAKE([am__nodep])dnl # Generate code to set up dependency tracking. -*- Autoconf -*- -# Copyright (C) 1999-2017 Free Software Foundation, Inc. +# Copyright (C) 1999-2018 Free Software Foundation, Inc. # # This file is free software; the Free Software Foundation # gives unlimited permission to copy and/or distribute it, # with or without modifications, as long as this notice is preserved. - # _AM_OUTPUT_DEPENDENCY_COMMANDS # ------------------------------ AC_DEFUN([_AM_OUTPUT_DEPENDENCY_COMMANDS], @@ -346,49 +345,41 @@ AC_DEFUN([_AM_OUTPUT_DEPENDENCY_COMMANDS], # Older Autoconf quotes --file arguments for eval, but not when files # are listed without --file. Let's play safe and only enable the eval # if we detect the quoting. - case $CONFIG_FILES in - *\'*) eval set x "$CONFIG_FILES" ;; - *) set x $CONFIG_FILES ;; - esac + # TODO: see whether this extra hack can be removed once we start + # requiring Autoconf 2.70 or later. + AS_CASE([$CONFIG_FILES], + [*\'*], [eval set x "$CONFIG_FILES"], + [*], [set x $CONFIG_FILES]) shift - for mf + # Used to flag and report bootstrapping failures. + am_rc=0 + for am_mf do # Strip MF so we end up with the name of the file. - mf=`echo "$mf" | sed -e 's/:.*$//'` - # Check whether this is an Automake generated Makefile or not. - # We used to match only the files named 'Makefile.in', but - # some people rename them; so instead we look at the file content. - # Grep'ing the first line is not enough: some people post-process - # each Makefile.in and add a new line on top of each file to say so. - # Grep'ing the whole file is not good either: AIX grep has a line + am_mf=`AS_ECHO(["$am_mf"]) | sed -e 's/:.*$//'` + # Check whether this is an Automake generated Makefile which includes + # dependency-tracking related rules and includes. + # Grep'ing the whole file directly is not great: AIX grep has a line # limit of 2048, but all sed's we know have understand at least 4000. - if sed -n 's,^#.*generated by automake.*,X,p' "$mf" | grep X >/dev/null 2>&1; then - dirpart=`AS_DIRNAME("$mf")` - else - continue - fi - # Extract the definition of DEPDIR, am__include, and am__quote - # from the Makefile without running 'make'. - DEPDIR=`sed -n 's/^DEPDIR = //p' < "$mf"` - test -z "$DEPDIR" && continue - am__include=`sed -n 's/^am__include = //p' < "$mf"` - test -z "$am__include" && continue - am__quote=`sed -n 's/^am__quote = //p' < "$mf"` - # Find all dependency output files, they are included files with - # $(DEPDIR) in their names. We invoke sed twice because it is the - # simplest approach to changing $(DEPDIR) to its actual value in the - # expansion. - for file in `sed -n " - s/^$am__include $am__quote\(.*(DEPDIR).*\)$am__quote"'$/\1/p' <"$mf" | \ - sed -e 's/\$(DEPDIR)/'"$DEPDIR"'/g'`; do - # Make sure the directory exists. - test -f "$dirpart/$file" && continue - fdir=`AS_DIRNAME(["$file"])` - AS_MKDIR_P([$dirpart/$fdir]) - # echo "creating $dirpart/$file" - echo '# dummy' > "$dirpart/$file" - done + sed -n 's,^am--depfiles:.*,X,p' "$am_mf" | grep X >/dev/null 2>&1 \ + || continue + am_dirpart=`AS_DIRNAME(["$am_mf"])` + am_filepart=`AS_BASENAME(["$am_mf"])` + AM_RUN_LOG([cd "$am_dirpart" \ + && sed -e '/# am--include-marker/d' "$am_filepart" \ + | $MAKE -f - am--depfiles]) || am_rc=$? done + if test $am_rc -ne 0; then + AC_MSG_FAILURE([Something went wrong bootstrapping makefile fragments + for automatic dependency tracking. Try re-running configure with the + '--disable-dependency-tracking' option to at least be able to build + the package (albeit without support for automatic dependency tracking).]) + fi + AS_UNSET([am_dirpart]) + AS_UNSET([am_filepart]) + AS_UNSET([am_mf]) + AS_UNSET([am_rc]) + rm -f conftest-deps.mk } ])# _AM_OUTPUT_DEPENDENCY_COMMANDS @@ -397,18 +388,17 @@ AC_DEFUN([_AM_OUTPUT_DEPENDENCY_COMMANDS], # ----------------------------- # This macro should only be invoked once -- use via AC_REQUIRE. # -# This code is only required when automatic dependency tracking -# is enabled. FIXME. This creates each '.P' file that we will -# need in order to bootstrap the dependency handling code. +# This code is only required when automatic dependency tracking is enabled. +# This creates each '.Po' and '.Plo' makefile fragment that we'll need in +# order to bootstrap the dependency handling code. AC_DEFUN([AM_OUTPUT_DEPENDENCY_COMMANDS], [AC_CONFIG_COMMANDS([depfiles], [test x"$AMDEP_TRUE" != x"" || _AM_OUTPUT_DEPENDENCY_COMMANDS], - [AMDEP_TRUE="$AMDEP_TRUE" ac_aux_dir="$ac_aux_dir"]) -]) + [AMDEP_TRUE="$AMDEP_TRUE" MAKE="${MAKE-make}"])]) # Do all the work for Automake. -*- Autoconf -*- -# Copyright (C) 1996-2017 Free Software Foundation, Inc. +# Copyright (C) 1996-2018 Free Software Foundation, Inc. # # This file is free software; the Free Software Foundation # gives unlimited permission to copy and/or distribute it, @@ -495,8 +485,8 @@ AC_REQUIRE([AM_PROG_INSTALL_STRIP])dnl AC_REQUIRE([AC_PROG_MKDIR_P])dnl # For better backward compatibility. To be removed once Automake 1.9.x # dies out for good. For more background, see: -# <http://lists.gnu.org/archive/html/automake/2012-07/msg00001.html> -# <http://lists.gnu.org/archive/html/automake/2012-07/msg00014.html> +# <https://lists.gnu.org/archive/html/automake/2012-07/msg00001.html> +# <https://lists.gnu.org/archive/html/automake/2012-07/msg00014.html> AC_SUBST([mkdir_p], ['$(MKDIR_P)']) # We need awk for the "check" target (and possibly the TAP driver). The # system "awk" is bad on some platforms. @@ -563,7 +553,7 @@ END Aborting the configuration process, to ensure you take notice of the issue. You can download and install GNU coreutils to get an 'rm' implementation -that behaves properly: <http://www.gnu.org/software/coreutils/>. +that behaves properly: <https://www.gnu.org/software/coreutils/>. If you want to complete the configuration process using your problematic 'rm' anyway, export the environment variable ACCEPT_INFERIOR_RM_PROGRAM @@ -605,7 +595,7 @@ for _am_header in $config_headers :; do done echo "timestamp for $_am_arg" >`AS_DIRNAME(["$_am_arg"])`/stamp-h[]$_am_stamp_count]) -# Copyright (C) 2001-2017 Free Software Foundation, Inc. +# Copyright (C) 2001-2018 Free Software Foundation, Inc. # # This file is free software; the Free Software Foundation # gives unlimited permission to copy and/or distribute it, @@ -629,7 +619,7 @@ AC_SUBST([install_sh])]) # Add --enable-maintainer-mode option to configure. -*- Autoconf -*- # From Jim Meyering -# Copyright (C) 1996-2017 Free Software Foundation, Inc. +# Copyright (C) 1996-2018 Free Software Foundation, Inc. # # This file is free software; the Free Software Foundation # gives unlimited permission to copy and/or distribute it, @@ -664,7 +654,7 @@ AC_MSG_CHECKING([whether to enable maintainer-specific portions of Makefiles]) # Check to see how 'make' treats includes. -*- Autoconf -*- -# Copyright (C) 2001-2017 Free Software Foundation, Inc. +# Copyright (C) 2001-2018 Free Software Foundation, Inc. # # This file is free software; the Free Software Foundation # gives unlimited permission to copy and/or distribute it, @@ -672,49 +662,42 @@ AC_MSG_CHECKING([whether to enable maintainer-specific portions of Makefiles]) # AM_MAKE_INCLUDE() # ----------------- -# Check to see how make treats includes. +# Check whether make has an 'include' directive that can support all +# the idioms we need for our automatic dependency tracking code. AC_DEFUN([AM_MAKE_INCLUDE], -[am_make=${MAKE-make} -cat > confinc << 'END' +[AC_MSG_CHECKING([whether ${MAKE-make} supports the include directive]) +cat > confinc.mk << 'END' am__doit: - @echo this is the am__doit target + @echo this is the am__doit target >confinc.out .PHONY: am__doit END -# If we don't find an include directive, just comment out the code. -AC_MSG_CHECKING([for style of include used by $am_make]) am__include="#" am__quote= -_am_result=none -# First try GNU make style include. -echo "include confinc" > confmf -# Ignore all kinds of additional output from 'make'. -case `$am_make -s -f confmf 2> /dev/null` in #( -*the\ am__doit\ target*) - am__include=include - am__quote= - _am_result=GNU - ;; -esac -# Now try BSD make style include. -if test "$am__include" = "#"; then - echo '.include "confinc"' > confmf - case `$am_make -s -f confmf 2> /dev/null` in #( - *the\ am__doit\ target*) - am__include=.include - am__quote="\"" - _am_result=BSD - ;; - esac -fi -AC_SUBST([am__include]) -AC_SUBST([am__quote]) -AC_MSG_RESULT([$_am_result]) -rm -f confinc confmf -]) +# BSD make does it like this. +echo '.include "confinc.mk" # ignored' > confmf.BSD +# Other make implementations (GNU, Solaris 10, AIX) do it like this. +echo 'include confinc.mk # ignored' > confmf.GNU +_am_result=no +for s in GNU BSD; do + AM_RUN_LOG([${MAKE-make} -f confmf.$s && cat confinc.out]) + AS_CASE([$?:`cat confinc.out 2>/dev/null`], + ['0:this is the am__doit target'], + [AS_CASE([$s], + [BSD], [am__include='.include' am__quote='"'], + [am__include='include' am__quote=''])]) + if test "$am__include" != "#"; then + _am_result="yes ($s style)" + break + fi +done +rm -f confinc.* confmf.* +AC_MSG_RESULT([${_am_result}]) +AC_SUBST([am__include])]) +AC_SUBST([am__quote])]) # Fake the existence of programs that GNU maintainers use. -*- Autoconf -*- -# Copyright (C) 1997-2017 Free Software Foundation, Inc. +# Copyright (C) 1997-2018 Free Software Foundation, Inc. # # This file is free software; the Free Software Foundation # gives unlimited permission to copy and/or distribute it, @@ -753,7 +736,7 @@ fi # Helper functions for option handling. -*- Autoconf -*- -# Copyright (C) 2001-2017 Free Software Foundation, Inc. +# Copyright (C) 2001-2018 Free Software Foundation, Inc. # # This file is free software; the Free Software Foundation # gives unlimited permission to copy and/or distribute it, @@ -782,7 +765,7 @@ AC_DEFUN([_AM_SET_OPTIONS], AC_DEFUN([_AM_IF_OPTION], [m4_ifset(_AM_MANGLE_OPTION([$1]), [$2], [$3])]) -# Copyright (C) 1999-2017 Free Software Foundation, Inc. +# Copyright (C) 1999-2018 Free Software Foundation, Inc. # # This file is free software; the Free Software Foundation # gives unlimited permission to copy and/or distribute it, @@ -829,7 +812,7 @@ AC_LANG_POP([C])]) # For backward compatibility. AC_DEFUN_ONCE([AM_PROG_CC_C_O], [AC_REQUIRE([AC_PROG_CC])]) -# Copyright (C) 2001-2017 Free Software Foundation, Inc. +# Copyright (C) 2001-2018 Free Software Foundation, Inc. # # This file is free software; the Free Software Foundation # gives unlimited permission to copy and/or distribute it, @@ -848,7 +831,7 @@ AC_DEFUN([AM_RUN_LOG], # Check to make sure that the build environment is sane. -*- Autoconf -*- -# Copyright (C) 1996-2017 Free Software Foundation, Inc. +# Copyright (C) 1996-2018 Free Software Foundation, Inc. # # This file is free software; the Free Software Foundation # gives unlimited permission to copy and/or distribute it, @@ -929,7 +912,7 @@ AC_CONFIG_COMMANDS_PRE( rm -f conftest.file ]) -# Copyright (C) 2009-2017 Free Software Foundation, Inc. +# Copyright (C) 2009-2018 Free Software Foundation, Inc. # # This file is free software; the Free Software Foundation # gives unlimited permission to copy and/or distribute it, @@ -989,7 +972,7 @@ AC_SUBST([AM_BACKSLASH])dnl _AM_SUBST_NOTMAKE([AM_BACKSLASH])dnl ]) -# Copyright (C) 2001-2017 Free Software Foundation, Inc. +# Copyright (C) 2001-2018 Free Software Foundation, Inc. # # This file is free software; the Free Software Foundation # gives unlimited permission to copy and/or distribute it, @@ -1017,7 +1000,7 @@ fi INSTALL_STRIP_PROGRAM="\$(install_sh) -c -s" AC_SUBST([INSTALL_STRIP_PROGRAM])]) -# Copyright (C) 2006-2017 Free Software Foundation, Inc. +# Copyright (C) 2006-2018 Free Software Foundation, Inc. # # This file is free software; the Free Software Foundation # gives unlimited permission to copy and/or distribute it, @@ -1036,7 +1019,7 @@ AC_DEFUN([AM_SUBST_NOTMAKE], [_AM_SUBST_NOTMAKE($@)]) # Check how to create a tarball. -*- Autoconf -*- -# Copyright (C) 2004-2017 Free Software Foundation, Inc. +# Copyright (C) 2004-2018 Free Software Foundation, Inc. # # This file is free software; the Free Software Foundation # gives unlimited permission to copy and/or distribute it, diff --git a/libgomp/config.h.in b/libgomp/config.h.in index 2d50fcd5c1a..dd8a0a06d45 100644 --- a/libgomp/config.h.in +++ b/libgomp/config.h.in @@ -173,9 +173,6 @@ /* Define to 1 if the GCN plugin is built, 0 if not. */ #undef PLUGIN_GCN -/* Define to 1 if the HSA plugin is built, 0 if not. */ -#undef PLUGIN_HSA - /* Define to 1 if the NVIDIA plugin is built, 0 if not. */ #undef PLUGIN_NVPTX diff --git a/libgomp/configure b/libgomp/configure index 937d5d6a591..534f7357786 100755 --- a/libgomp/configure +++ b/libgomp/configure @@ -667,8 +667,6 @@ OPT_LDFLAGS SECTION_LDFLAGS PLUGIN_GCN_FALSE PLUGIN_GCN_TRUE -PLUGIN_HSA_FALSE -PLUGIN_HSA_TRUE PLUGIN_NVPTX_FALSE PLUGIN_NVPTX_TRUE offload_additional_lib_paths @@ -679,10 +677,6 @@ PLUGIN_GCN_LIBS PLUGIN_GCN_LDFLAGS PLUGIN_GCN_CPPFLAGS PLUGIN_GCN -PLUGIN_HSA_LIBS -PLUGIN_HSA_LDFLAGS -PLUGIN_HSA_CPPFLAGS -PLUGIN_HSA HSA_RUNTIME_LIB HSA_RUNTIME_INCLUDE PLUGIN_NVPTX_LIBS @@ -730,7 +724,6 @@ am__nodep AMDEPBACKSLASH AMDEP_FALSE AMDEP_TRUE -am__quote am__include DEPDIR OBJEXT @@ -821,7 +814,8 @@ PACKAGE_VERSION PACKAGE_TARNAME PACKAGE_NAME PATH_SEPARATOR -SHELL' +SHELL +am__quote' ac_subst_files='' ac_user_opts=' enable_option_checking @@ -2891,7 +2885,7 @@ target_alias=${target_alias-$host_alias} # -Wall: turns on all automake warnings... # -Wno-portability: ...except this one, since GNU make is required. # -Wno-override: ... and this one, since we do want this in testsuite. -am__api_version='1.15' +am__api_version='1.16' # Find a good install program. We prefer a C program (faster), # so one script is as good as another. But avoid the broken or @@ -3407,8 +3401,8 @@ MAKEINFO=${MAKEINFO-"${am_missing_run}makeinfo"} # For better backward compatibility. To be removed once Automake 1.9.x # dies out for good. For more background, see: -# <http://lists.gnu.org/archive/html/automake/2012-07/msg00001.html> -# <http://lists.gnu.org/archive/html/automake/2012-07/msg00014.html> +# <https://lists.gnu.org/archive/html/automake/2012-07/msg00001.html> +# <https://lists.gnu.org/archive/html/automake/2012-07/msg00014.html> mkdir_p='$(MKDIR_P)' # We need awk for the "check" target (and possibly the TAP driver). The @@ -3459,7 +3453,7 @@ END Aborting the configuration process, to ensure you take notice of the issue. You can download and install GNU coreutils to get an 'rm' implementation -that behaves properly: <http://www.gnu.org/software/coreutils/>. +that behaves properly: <https://www.gnu.org/software/coreutils/>. If you want to complete the configuration process using your problematic 'rm' anyway, export the environment variable ACCEPT_INFERIOR_RM_PROGRAM @@ -4420,45 +4414,45 @@ DEPDIR="${am__leading_dot}deps" ac_config_commands="$ac_config_commands depfiles" - -am_make=${MAKE-make} -cat > confinc << 'END' +{ $as_echo "$as_me:${as_lineno-$LINENO}: checking whether ${MAKE-make} supports the include directive" >&5 +$as_echo_n "checking whether ${MAKE-make} supports the include directive... " >&6; } +cat > confinc.mk << 'END' am__doit: - @echo this is the am__doit target + @echo this is the am__doit target >confinc.out .PHONY: am__doit END -# If we don't find an include directive, just comment out the code. -{ $as_echo "$as_me:${as_lineno-$LINENO}: checking for style of include used by $am_make" >&5 -$as_echo_n "checking for style of include used by $am_make... " >&6; } am__include="#" am__quote= -_am_result=none -# First try GNU make style include. -echo "include confinc" > confmf -# Ignore all kinds of additional output from 'make'. -case `$am_make -s -f confmf 2> /dev/null` in #( -*the\ am__doit\ target*) - am__include=include - am__quote= - _am_result=GNU - ;; -esac -# Now try BSD make style include. -if test "$am__include" = "#"; then - echo '.include "confinc"' > confmf - case `$am_make -s -f confmf 2> /dev/null` in #( - *the\ am__doit\ target*) - am__include=.include - am__quote="\"" - _am_result=BSD +# BSD make does it like this. +echo '.include "confinc.mk" # ignored' > confmf.BSD +# Other make implementations (GNU, Solaris 10, AIX) do it like this. +echo 'include confinc.mk # ignored' > confmf.GNU +_am_result=no +for s in GNU BSD; do + { echo "$as_me:$LINENO: ${MAKE-make} -f confmf.$s && cat confinc.out" >&5 + (${MAKE-make} -f confmf.$s && cat confinc.out) >&5 2>&5 + ac_status=$? + echo "$as_me:$LINENO: \$? = $ac_status" >&5 + (exit $ac_status); } + case $?:`cat confinc.out 2>/dev/null` in #( + '0:this is the am__doit target') : + case $s in #( + BSD) : + am__include='.include' am__quote='"' ;; #( + *) : + am__include='include' am__quote='' ;; +esac ;; #( + *) : ;; - esac -fi - - -{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $_am_result" >&5 -$as_echo "$_am_result" >&6; } -rm -f confinc confmf +esac + if test "$am__include" != "#"; then + _am_result="yes ($s style)" + break + fi +done +rm -f confinc.* confmf.* +{ $as_echo "$as_me:${as_lineno-$LINENO}: result: ${_am_result}" >&5 +$as_echo "${_am_result}" >&6; } # Check whether --enable-dependency-tracking was given. if test "${enable_dependency_tracking+set}" = set; then : @@ -11435,7 +11429,7 @@ else lt_dlunknown=0; lt_dlno_uscore=1; lt_dlneed_uscore=2 lt_status=$lt_dlunknown cat > conftest.$ac_ext <<_LT_EOF -#line 11438 "configure" +#line 11432 "configure" #include "confdefs.h" #if HAVE_DLFCN_H @@ -11541,7 +11535,7 @@ else lt_dlunknown=0; lt_dlno_uscore=1; lt_dlneed_uscore=2 lt_status=$lt_dlunknown cat > conftest.$ac_ext <<_LT_EOF -#line 11544 "configure" +#line 11538 "configure" #include "confdefs.h" #if HAVE_DLFCN_H @@ -15256,15 +15250,6 @@ if test "x$HSA_RUNTIME_LIB" != x; then HSA_RUNTIME_LDFLAGS=-L$HSA_RUNTIME_LIB fi -PLUGIN_HSA=0 -PLUGIN_HSA_CPPFLAGS= -PLUGIN_HSA_LDFLAGS= -PLUGIN_HSA_LIBS= - - - - - PLUGIN_GCN=0 PLUGIN_GCN_CPPFLAGS= PLUGIN_GCN_LDFLAGS= @@ -15346,45 +15331,6 @@ rm -f core conftest.err conftest.$ac_objext \ ;; esac ;; - hsa*) - case "${target}" in - x86_64-*-*) - case " ${CC} ${CFLAGS} " in - *" -m32 "*|*" -mx32 "*) - PLUGIN_HSA=0 - ;; - *) - tgt_plugin=hsa - PLUGIN_HSA=$tgt - PLUGIN_HSA_CPPFLAGS=$HSA_RUNTIME_CPPFLAGS - PLUGIN_HSA_LDFLAGS="$HSA_RUNTIME_LDFLAGS" - PLUGIN_HSA_LIBS="-ldl" - - PLUGIN_HSA_save_CPPFLAGS=$CPPFLAGS - CPPFLAGS="$PLUGIN_HSA_CPPFLAGS $CPPFLAGS" - PLUGIN_HSA_save_LDFLAGS=$LDFLAGS - LDFLAGS="$PLUGIN_HSA_LDFLAGS $LDFLAGS" - PLUGIN_HSA_save_LIBS=$LIBS - LIBS="$PLUGIN_HSA_LIBS $LIBS" - - PLUGIN_HSA=1 - CPPFLAGS=$PLUGIN_HSA_save_CPPFLAGS - LDFLAGS=$PLUGIN_HSA_save_LDFLAGS - LIBS=$PLUGIN_HSA_save_LIBS - case $PLUGIN_HSA in - hsa*) - HSA_PLUGIN=0 - as_fn_error $? "HSA run-time package required for HSA support" "$LINENO" 5 - ;; - esac - ;; - esac - ;; - *-*-*) - PLUGIN_HSA=0 - ;; - esac - ;; amdgcn*) case "${target}" in @@ -15424,10 +15370,7 @@ rm -f core conftest.err conftest.$ac_objext \ offload_targets=$offload_targets,$tgt fi # Configure additional search paths. - if test "$tgt_plugin" = hsa; then - # Offloading compilation is all handled by the target compiler. - : - elif test x"$tgt_dir" != x; then + if test x"$tgt_dir" != x; then offload_additional_options="$offload_additional_options -B$tgt_dir/libexec/gcc/\$(target_alias)/\$(gcc_version) -B$tgt_dir/bin" offload_additional_lib_paths="$offload_additional_lib_paths:$tgt_dir/lib64:$tgt_dir/lib:$tgt_dir/lib32" else @@ -15459,19 +15402,6 @@ cat >>confdefs.h <<_ACEOF #define PLUGIN_NVPTX_DYNAMIC $PLUGIN_NVPTX_DYNAMIC _ACEOF - if test $PLUGIN_HSA = 1; then - PLUGIN_HSA_TRUE= - PLUGIN_HSA_FALSE='#' -else - PLUGIN_HSA_TRUE='#' - PLUGIN_HSA_FALSE= -fi - - -cat >>confdefs.h <<_ACEOF -#define PLUGIN_HSA $PLUGIN_HSA -_ACEOF - if test $PLUGIN_GCN = 1; then PLUGIN_GCN_TRUE= PLUGIN_GCN_FALSE='#' @@ -16756,7 +16686,7 @@ case "$host" in case "$enable_cet" in auto) # Check if target supports multi-byte NOPs - # and if assembler supports CET insn. + # and if compiler and assembler support CET insn. cet_save_CFLAGS="$CFLAGS" CFLAGS="$CFLAGS -fcf-protection" cat confdefs.h - <<_ACEOF >conftest.$ac_ext @@ -17247,10 +17177,6 @@ if test -z "${PLUGIN_NVPTX_TRUE}" && test -z "${PLUGIN_NVPTX_FALSE}"; then as_fn_error $? "conditional \"PLUGIN_NVPTX\" was never defined. Usually this means the macro was only invoked conditionally." "$LINENO" 5 fi -if test -z "${PLUGIN_HSA_TRUE}" && test -z "${PLUGIN_HSA_FALSE}"; then - as_fn_error $? "conditional \"PLUGIN_HSA\" was never defined. -Usually this means the macro was only invoked conditionally." "$LINENO" 5 -fi if test -z "${PLUGIN_GCN_TRUE}" && test -z "${PLUGIN_GCN_FALSE}"; then as_fn_error $? "conditional \"PLUGIN_GCN\" was never defined. Usually this means the macro was only invoked conditionally." "$LINENO" 5 @@ -17869,7 +17795,7 @@ CC="$CC" CXX="$CXX" GFORTRAN="$GFORTRAN" GDC="$GDC" -AMDEP_TRUE="$AMDEP_TRUE" ac_aux_dir="$ac_aux_dir" +AMDEP_TRUE="$AMDEP_TRUE" MAKE="${MAKE-make}" # The HP-UX ksh and POSIX shell print the target directory to stdout @@ -18859,29 +18785,35 @@ esac ;; # Older Autoconf quotes --file arguments for eval, but not when files # are listed without --file. Let's play safe and only enable the eval # if we detect the quoting. - case $CONFIG_FILES in - *\'*) eval set x "$CONFIG_FILES" ;; - *) set x $CONFIG_FILES ;; - esac + # TODO: see whether this extra hack can be removed once we start + # requiring Autoconf 2.70 or later. + case $CONFIG_FILES in #( + *\'*) : + eval set x "$CONFIG_FILES" ;; #( + *) : + set x $CONFIG_FILES ;; #( + *) : + ;; +esac shift - for mf + # Used to flag and report bootstrapping failures. + am_rc=0 + for am_mf do # Strip MF so we end up with the name of the file. - mf=`echo "$mf" | sed -e 's/:.*$//'` - # Check whether this is an Automake generated Makefile or not. - # We used to match only the files named 'Makefile.in', but - # some people rename them; so instead we look at the file content. - # Grep'ing the first line is not enough: some people post-process - # each Makefile.in and add a new line on top of each file to say so. - # Grep'ing the whole file is not good either: AIX grep has a line + am_mf=`$as_echo "$am_mf" | sed -e 's/:.*$//'` + # Check whether this is an Automake generated Makefile which includes + # dependency-tracking related rules and includes. + # Grep'ing the whole file directly is not great: AIX grep has a line # limit of 2048, but all sed's we know have understand at least 4000. - if sed -n 's,^#.*generated by automake.*,X,p' "$mf" | grep X >/dev/null 2>&1; then - dirpart=`$as_dirname -- "$mf" || -$as_expr X"$mf" : 'X\(.*[^/]\)//*[^/][^/]*/*$' \| \ - X"$mf" : 'X\(//\)[^/]' \| \ - X"$mf" : 'X\(//\)$' \| \ - X"$mf" : 'X\(/\)' \| . 2>/dev/null || -$as_echo X"$mf" | + sed -n 's,^am--depfiles:.*,X,p' "$am_mf" | grep X >/dev/null 2>&1 \ + || continue + am_dirpart=`$as_dirname -- "$am_mf" || +$as_expr X"$am_mf" : 'X\(.*[^/]\)//*[^/][^/]*/*$' \| \ + X"$am_mf" : 'X\(//\)[^/]' \| \ + X"$am_mf" : 'X\(//\)$' \| \ + X"$am_mf" : 'X\(/\)' \| . 2>/dev/null || +$as_echo X"$am_mf" | sed '/^X\(.*[^/]\)\/\/*[^/][^/]*\/*$/{ s//\1/ q @@ -18899,53 +18831,48 @@ $as_echo X"$mf" | q } s/.*/./; q'` - else - continue - fi - # Extract the definition of DEPDIR, am__include, and am__quote - # from the Makefile without running 'make'. - DEPDIR=`sed -n 's/^DEPDIR = //p' < "$mf"` - test -z "$DEPDIR" && continue - am__include=`sed -n 's/^am__include = //p' < "$mf"` - test -z "$am__include" && continue - am__quote=`sed -n 's/^am__quote = //p' < "$mf"` - # Find all dependency output files, they are included files with - # $(DEPDIR) in their names. We invoke sed twice because it is the - # simplest approach to changing $(DEPDIR) to its actual value in the - # expansion. - for file in `sed -n " - s/^$am__include $am__quote\(.*(DEPDIR).*\)$am__quote"'$/\1/p' <"$mf" | \ - sed -e 's/\$(DEPDIR)/'"$DEPDIR"'/g'`; do - # Make sure the directory exists. - test -f "$dirpart/$file" && continue - fdir=`$as_dirname -- "$file" || -$as_expr X"$file" : 'X\(.*[^/]\)//*[^/][^/]*/*$' \| \ - X"$file" : 'X\(//\)[^/]' \| \ - X"$file" : 'X\(//\)$' \| \ - X"$file" : 'X\(/\)' \| . 2>/dev/null || -$as_echo X"$file" | - sed '/^X\(.*[^/]\)\/\/*[^/][^/]*\/*$/{ - s//\1/ - q - } - /^X\(\/\/\)[^/].*/{ + am_filepart=`$as_basename -- "$am_mf" || +$as_expr X/"$am_mf" : '.*/\([^/][^/]*\)/*$' \| \ + X"$am_mf" : 'X\(//\)$' \| \ + X"$am_mf" : 'X\(/\)' \| . 2>/dev/null || +$as_echo X/"$am_mf" | + sed '/^.*\/\([^/][^/]*\)\/*$/{ s//\1/ q } - /^X\(\/\/\)$/{ + /^X\/\(\/\/\)$/{ s//\1/ q } - /^X\(\/\).*/{ + /^X\/\(\/\).*/{ s//\1/ q } s/.*/./; q'` - as_dir=$dirpart/$fdir; as_fn_mkdir_p - # echo "creating $dirpart/$file" - echo '# dummy' > "$dirpart/$file" - done + { echo "$as_me:$LINENO: cd "$am_dirpart" \ + && sed -e '/# am--include-marker/d' "$am_filepart" \ + | $MAKE -f - am--depfiles" >&5 + (cd "$am_dirpart" \ + && sed -e '/# am--include-marker/d' "$am_filepart" \ + | $MAKE -f - am--depfiles) >&5 2>&5 + ac_status=$? + echo "$as_me:$LINENO: \$? = $ac_status" >&5 + (exit $ac_status); } || am_rc=$? done + if test $am_rc -ne 0; then + { { $as_echo "$as_me:${as_lineno-$LINENO}: error: in \`$ac_pwd':" >&5 +$as_echo "$as_me: error: in \`$ac_pwd':" >&2;} +as_fn_error $? "Something went wrong bootstrapping makefile fragments + for automatic dependency tracking. Try re-running configure with the + '--disable-dependency-tracking' option to at least be able to build + the package (albeit without support for automatic dependency tracking). +See \`config.log' for more details" "$LINENO" 5; } + fi + { am_dirpart=; unset am_dirpart;} + { am_filepart=; unset am_filepart;} + { am_mf=; unset am_mf;} + { am_rc=; unset am_rc;} + rm -f conftest-deps.mk } ;; "libtool":C) diff --git a/libgomp/plugin/Makefrag.am b/libgomp/plugin/Makefrag.am index 7924959ec78..54301f817c2 100644 --- a/libgomp/plugin/Makefrag.am +++ b/libgomp/plugin/Makefrag.am @@ -39,20 +39,6 @@ libgomp_plugin_nvptx_la_LIBADD = libgomp.la $(PLUGIN_NVPTX_LIBS) libgomp_plugin_nvptx_la_LIBTOOLFLAGS = --tag=disable-static endif -if PLUGIN_HSA -# Heterogenous Systems Architecture plugin -libgomp_plugin_hsa_version_info = -version-info $(libtool_VERSION) -toolexeclib_LTLIBRARIES += libgomp-plugin-hsa.la -libgomp_plugin_hsa_la_SOURCES = plugin/plugin-hsa.c -libgomp_plugin_hsa_la_CPPFLAGS = $(AM_CPPFLAGS) $(PLUGIN_HSA_CPPFLAGS) \ - -D_GNU_SOURCE -libgomp_plugin_hsa_la_LDFLAGS = $(libgomp_plugin_hsa_version_info) \ - $(lt_host_flags) -libgomp_plugin_hsa_la_LDFLAGS += $(PLUGIN_HSA_LDFLAGS) -libgomp_plugin_hsa_la_LIBADD = libgomp.la $(PLUGIN_HSA_LIBS) -libgomp_plugin_hsa_la_LIBTOOLFLAGS = --tag=disable-static -endif - if PLUGIN_GCN # AMD GCN plugin libgomp_plugin_gcn_version_info = -version-info $(libtool_VERSION) diff --git a/libgomp/plugin/configfrag.ac b/libgomp/plugin/configfrag.ac index fc91702a434..14030082ea8 100644 --- a/libgomp/plugin/configfrag.ac +++ b/libgomp/plugin/configfrag.ac @@ -128,15 +128,6 @@ if test "x$HSA_RUNTIME_LIB" != x; then HSA_RUNTIME_LDFLAGS=-L$HSA_RUNTIME_LIB fi -PLUGIN_HSA=0 -PLUGIN_HSA_CPPFLAGS= -PLUGIN_HSA_LDFLAGS= -PLUGIN_HSA_LIBS= -AC_SUBST(PLUGIN_HSA) -AC_SUBST(PLUGIN_HSA_CPPFLAGS) -AC_SUBST(PLUGIN_HSA_LDFLAGS) -AC_SUBST(PLUGIN_HSA_LIBS) - PLUGIN_GCN=0 PLUGIN_GCN_CPPFLAGS= PLUGIN_GCN_LDFLAGS= @@ -207,45 +198,6 @@ if test x"$enable_offload_targets" != x; then ;; esac ;; - hsa*) - case "${target}" in - x86_64-*-*) - case " ${CC} ${CFLAGS} " in - *" -m32 "*|*" -mx32 "*) - PLUGIN_HSA=0 - ;; - *) - tgt_plugin=hsa - PLUGIN_HSA=$tgt - PLUGIN_HSA_CPPFLAGS=$HSA_RUNTIME_CPPFLAGS - PLUGIN_HSA_LDFLAGS="$HSA_RUNTIME_LDFLAGS" - PLUGIN_HSA_LIBS="-ldl" - - PLUGIN_HSA_save_CPPFLAGS=$CPPFLAGS - CPPFLAGS="$PLUGIN_HSA_CPPFLAGS $CPPFLAGS" - PLUGIN_HSA_save_LDFLAGS=$LDFLAGS - LDFLAGS="$PLUGIN_HSA_LDFLAGS $LDFLAGS" - PLUGIN_HSA_save_LIBS=$LIBS - LIBS="$PLUGIN_HSA_LIBS $LIBS" - - PLUGIN_HSA=1 - CPPFLAGS=$PLUGIN_HSA_save_CPPFLAGS - LDFLAGS=$PLUGIN_HSA_save_LDFLAGS - LIBS=$PLUGIN_HSA_save_LIBS - case $PLUGIN_HSA in - hsa*) - HSA_PLUGIN=0 - AC_MSG_ERROR([HSA run-time package required for HSA support]) - ;; - esac - ;; - esac - ;; - *-*-*) - PLUGIN_HSA=0 - ;; - esac - ;; amdgcn*) case "${target}" in @@ -285,10 +237,7 @@ if test x"$enable_offload_targets" != x; then offload_targets=$offload_targets,$tgt fi # Configure additional search paths. - if test "$tgt_plugin" = hsa; then - # Offloading compilation is all handled by the target compiler. - : - elif test x"$tgt_dir" != x; then + if test x"$tgt_dir" != x; then offload_additional_options="$offload_additional_options -B$tgt_dir/libexec/gcc/\$(target_alias)/\$(gcc_version) -B$tgt_dir/bin" offload_additional_lib_paths="$offload_additional_lib_paths:$tgt_dir/lib64:$tgt_dir/lib:$tgt_dir/lib32" else @@ -304,9 +253,6 @@ AC_DEFINE_UNQUOTED([PLUGIN_NVPTX], [$PLUGIN_NVPTX], [Define to 1 if the NVIDIA plugin is built, 0 if not.]) AC_DEFINE_UNQUOTED([PLUGIN_NVPTX_DYNAMIC], [$PLUGIN_NVPTX_DYNAMIC], [Define to 1 if the NVIDIA plugin should dlopen libcuda.so.1, 0 if it should be linked against it.]) -AM_CONDITIONAL([PLUGIN_HSA], [test $PLUGIN_HSA = 1]) -AC_DEFINE_UNQUOTED([PLUGIN_HSA], [$PLUGIN_HSA], - [Define to 1 if the HSA plugin is built, 0 if not.]) AM_CONDITIONAL([PLUGIN_GCN], [test $PLUGIN_GCN = 1]) AC_DEFINE_UNQUOTED([PLUGIN_GCN], [$PLUGIN_GCN], [Define to 1 if the GCN plugin is built, 0 if not.]) diff --git a/libgomp/plugin/hsa_ext_finalize.h b/libgomp/plugin/hsa_ext_finalize.h deleted file mode 100644 index 39284bdf5f7..00000000000 --- a/libgomp/plugin/hsa_ext_finalize.h +++ /dev/null @@ -1,270 +0,0 @@ -/* HSA Extensions API 1.0.1 representation description. - Copyright (C) 2016-2020 Free Software Foundation, Inc. - -This file is part of GCC. - -GCC is free software; you can redistribute it and/or modify -it under the terms of the GNU General Public License as published by -the Free Software Foundation; either version 3, or (at your option) -any later version. - -GCC is distributed in the hope that it will be useful, -but WITHOUT ANY WARRANTY; without even the implied warranty of -MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the -GNU General Public License for more details. - -Under Section 7 of GPL version 3, you are granted additional -permissions described in the GCC Runtime Library Exception, version -3.1, as published by the Free Software Foundation. - -You should have received a copy of the GNU General Public License and -a copy of the GCC Runtime Library Exception along with this program; -see the files COPYING3 and COPYING.RUNTIME respectively. If not, see -<http://www.gnu.org/licenses/>. - -The contents of the file was created by extracting data structures, enum, -typedef and other definitions from HSA Runtime Programmer’s Reference Manual -Version 1.0 (http://www.hsafoundation.com/standards/). - -HTML version is provided on the following link: -http://www.hsafoundation.com/html/Content/Runtime/Topics/Runtime_title_page.htm -*/ - - -#ifndef _HSA_EXT_FINALIZE_H -#define _HSA_EXT_FINALIZE_H 1 - -struct BrigModuleHeader; -typedef struct BrigModuleHeader *BrigModule_t; - -typedef enum { - HSA_EXT_IMAGE_GEOMETRY_1D = 0, - HSA_EXT_IMAGE_GEOMETRY_2D = 1, - HSA_EXT_IMAGE_GEOMETRY_3D = 2, - HSA_EXT_IMAGE_GEOMETRY_1DA = 3, - HSA_EXT_IMAGE_GEOMETRY_2DA = 4, - HSA_EXT_IMAGE_GEOMETRY_1DB = 5, - HSA_EXT_IMAGE_GEOMETRY_2DDEPTH = 6, - HSA_EXT_IMAGE_GEOMETRY_2DADEPTH = 7 -} hsa_ext_image_geometry_t; - -typedef enum { - HSA_EXT_IMAGE_CHANNEL_TYPE_SNORM_INT8 = 0, - HSA_EXT_IMAGE_CHANNEL_TYPE_SNORM_INT16 = 1, - HSA_EXT_IMAGE_CHANNEL_TYPE_UNORM_INT8 = 2, - HSA_EXT_IMAGE_CHANNEL_TYPE_UNORM_INT16 = 3, - HSA_EXT_IMAGE_CHANNEL_TYPE_UNORM_INT24 = 4, - HSA_EXT_IMAGE_CHANNEL_TYPE_UNORM_SHORT_555 = 5, - HSA_EXT_IMAGE_CHANNEL_TYPE_UNORM_SHORT_565 = 6, - HSA_EXT_IMAGE_CHANNEL_TYPE_UNORM_SHORT_101010 = 7, - HSA_EXT_IMAGE_CHANNEL_TYPE_SIGNED_INT8 = 8, - HSA_EXT_IMAGE_CHANNEL_TYPE_SIGNED_INT16 = 9, - HSA_EXT_IMAGE_CHANNEL_TYPE_SIGNED_INT32 = 10, - HSA_EXT_IMAGE_CHANNEL_TYPE_UNSIGNED_INT8 = 11, - HSA_EXT_IMAGE_CHANNEL_TYPE_UNSIGNED_INT16 = 12, - HSA_EXT_IMAGE_CHANNEL_TYPE_UNSIGNED_INT32 = 13, - HSA_EXT_IMAGE_CHANNEL_TYPE_HALF_FLOAT = 14, - HSA_EXT_IMAGE_CHANNEL_TYPE_FLOAT = 15 -} hsa_ext_image_channel_type_t; - -typedef enum { - HSA_EXT_IMAGE_CHANNEL_ORDER_A = 0, - HSA_EXT_IMAGE_CHANNEL_ORDER_R = 1, - HSA_EXT_IMAGE_CHANNEL_ORDER_RX = 2, - HSA_EXT_IMAGE_CHANNEL_ORDER_RG = 3, - HSA_EXT_IMAGE_CHANNEL_ORDER_RGX = 4, - HSA_EXT_IMAGE_CHANNEL_ORDER_RA = 5, - HSA_EXT_IMAGE_CHANNEL_ORDER_RGB = 6, - HSA_EXT_IMAGE_CHANNEL_ORDER_RGBX = 7, - HSA_EXT_IMAGE_CHANNEL_ORDER_RGBA = 8, - HSA_EXT_IMAGE_CHANNEL_ORDER_BGRA = 9, - HSA_EXT_IMAGE_CHANNEL_ORDER_ARGB = 10, - HSA_EXT_IMAGE_CHANNEL_ORDER_ABGR = 11, - HSA_EXT_IMAGE_CHANNEL_ORDER_SRGB = 12, - HSA_EXT_IMAGE_CHANNEL_ORDER_SRGBX = 13, - HSA_EXT_IMAGE_CHANNEL_ORDER_SRGBA = 14, - HSA_EXT_IMAGE_CHANNEL_ORDER_SBGRA = 15, - HSA_EXT_IMAGE_CHANNEL_ORDER_INTENSITY = 16, - HSA_EXT_IMAGE_CHANNEL_ORDER_LUMINANCE = 17, - HSA_EXT_IMAGE_CHANNEL_ORDER_DEPTH = 18, - HSA_EXT_IMAGE_CHANNEL_ORDER_DEPTH_STENCIL = 19 -} hsa_ext_image_channel_order_t; - -typedef struct hsa_ext_image_format_s -{ - hsa_ext_image_channel_type_t channel_type; - hsa_ext_image_channel_order_t channel_order; -} hsa_ext_image_format_t; - -typedef struct hsa_ext_sampler_s -{ - uint64_t handle; -} hsa_ext_sampler_t; -typedef struct hsa_ext_image_data_info_s -{ - size_t size; - size_t alignment; -} hsa_ext_image_data_info_t; -typedef enum { - HSA_EXT_SAMPLER_ADDRESSING_MODE_UNDEFINED = 0, - HSA_EXT_SAMPLER_ADDRESSING_MODE_CLAMP_TO_EDGE = 1, - HSA_EXT_SAMPLER_ADDRESSING_MODE_CLAMP_TO_BORDER = 2, - HSA_EXT_SAMPLER_ADDRESSING_MODE_REPEAT = 3, - HSA_EXT_SAMPLER_ADDRESSING_MODE_MIRRORED_REPEAT = 4 -} hsa_ext_sampler_addressing_mode_t; -typedef struct hsa_ext_image_s -{ - uint64_t handle; -} hsa_ext_image_t; -typedef enum { - HSA_EXT_IMAGE_CAPABILITY_NOT_SUPPORTED = 0x0, - HSA_EXT_IMAGE_CAPABILITY_READ_ONLY = 0x1, - HSA_EXT_IMAGE_CAPABILITY_WRITE_ONLY = 0x2, - HSA_EXT_IMAGE_CAPABILITY_READ_WRITE = 0x4, - HSA_EXT_IMAGE_CAPABILITY_READ_MODIFY_WRITE = 0x8, - HSA_EXT_IMAGE_CAPABILITY_ACCESS_INVARIANT_DATA_LAYOUT = 0x10 -} hsa_ext_image_capability_t; -typedef struct hsa_ext_control_directives_s -{ - uint64_t control_directives_mask; - uint16_t break_exceptions_mask; - uint16_t detect_exceptions_mask; - uint32_t max_dynamic_group_size; - uint64_t max_flat_grid_size; - uint32_t max_flat_workgroup_size; - uint32_t reserved1; - uint64_t required_grid_size[3]; - hsa_dim3_t required_workgroup_size; - uint8_t required_dim; - uint8_t reserved2[75]; -} hsa_ext_control_directives_t; -typedef enum { - HSA_EXT_SAMPLER_FILTER_MODE_NEAREST = 0, - HSA_EXT_SAMPLER_FILTER_MODE_LINEAR = 1 -} hsa_ext_sampler_filter_mode_t; - -typedef enum { - HSA_EXT_SAMPLER_COORDINATE_MODE_UNNORMALIZED = 0, - HSA_EXT_SAMPLER_COORDINATE_MODE_NORMALIZED = 1 -} hsa_ext_sampler_coordinate_mode_t; -typedef enum { - HSA_EXT_FINALIZER_CALL_CONVENTION_AUTO = -1 -} hsa_ext_finalizer_call_convention_t; -typedef struct hsa_ext_program_s -{ - uint64_t handle; -} hsa_ext_program_t; -typedef struct hsa_ext_image_descriptor_s -{ - hsa_ext_image_geometry_t geometry; - size_t width; - size_t height; - size_t depth; - size_t array_size; - hsa_ext_image_format_t format; -} hsa_ext_image_descriptor_t; -typedef enum { - HSA_EXT_PROGRAM_INFO_MACHINE_MODEL = 0, - HSA_EXT_PROGRAM_INFO_PROFILE = 1, - HSA_EXT_PROGRAM_INFO_DEFAULT_FLOAT_ROUNDING_MODE = 2 -} hsa_ext_program_info_t; -typedef BrigModule_t hsa_ext_module_t; -typedef struct hsa_ext_sampler_descriptor_s -{ - hsa_ext_sampler_coordinate_mode_t coordinate_mode; - hsa_ext_sampler_filter_mode_t filter_mode; - hsa_ext_sampler_addressing_mode_t address_mode; -} hsa_ext_sampler_descriptor_t; - -typedef struct hsa_ext_image_region_s -{ - hsa_dim3_t offset; - hsa_dim3_t range; -} hsa_ext_image_region_t; -hsa_status_t hsa_ext_image_export (hsa_agent_t agent, hsa_ext_image_t src_image, - void *dst_memory, size_t dst_row_pitch, - size_t dst_slice_pitch, - const hsa_ext_image_region_t *image_region); -hsa_status_t hsa_ext_program_add_module (hsa_ext_program_t program, - hsa_ext_module_t module); -hsa_status_t hsa_ext_program_iterate_modules ( - hsa_ext_program_t program, - hsa_status_t (*callback) (hsa_ext_program_t program, hsa_ext_module_t module, - void *data), - void *data); -hsa_status_t hsa_ext_program_create ( - hsa_machine_model_t machine_model, hsa_profile_t profile, - hsa_default_float_rounding_mode_t default_float_rounding_mode, - const char *options, hsa_ext_program_t *program); -hsa_status_t -hsa_ext_image_data_get_info (hsa_agent_t agent, - const hsa_ext_image_descriptor_t *image_descriptor, - hsa_access_permission_t access_permission, - hsa_ext_image_data_info_t *image_data_info); - -hsa_status_t hsa_ext_image_import (hsa_agent_t agent, const void *src_memory, - size_t src_row_pitch, size_t src_slice_pitch, - hsa_ext_image_t dst_image, - const hsa_ext_image_region_t *image_region); -hsa_status_t hsa_ext_program_get_info (hsa_ext_program_t program, - hsa_ext_program_info_t attribute, - void *value); -enum -{ - HSA_EXT_STATUS_ERROR_IMAGE_FORMAT_UNSUPPORTED = 0x3000, - HSA_EXT_STATUS_ERROR_IMAGE_SIZE_UNSUPPORTED = 0x3001 -}; -hsa_status_t hsa_ext_image_destroy (hsa_agent_t agent, hsa_ext_image_t image); -hsa_status_t hsa_ext_image_get_capability ( - hsa_agent_t agent, hsa_ext_image_geometry_t geometry, - const hsa_ext_image_format_t *image_format, uint32_t *capability_mask); -enum -{ - HSA_EXT_STATUS_ERROR_INVALID_PROGRAM = 0x2000, - HSA_EXT_STATUS_ERROR_INVALID_MODULE = 0x2001, - HSA_EXT_STATUS_ERROR_INCOMPATIBLE_MODULE = 0x2002, - HSA_EXT_STATUS_ERROR_MODULE_ALREADY_INCLUDED = 0x2003, - HSA_EXT_STATUS_ERROR_SYMBOL_MISMATCH = 0x2004, - HSA_EXT_STATUS_ERROR_FINALIZATION_FAILED = 0x2005, - HSA_EXT_STATUS_ERROR_DIRECTIVE_MISMATCH = 0x2006 -}; -hsa_status_t hsa_ext_sampler_destroy (hsa_agent_t agent, - hsa_ext_sampler_t sampler); -hsa_status_t hsa_ext_program_finalize ( - hsa_ext_program_t program, hsa_isa_t isa, int32_t call_convention, - hsa_ext_control_directives_t control_directives, const char *options, - hsa_code_object_type_t code_object_type, hsa_code_object_t *code_object); -hsa_status_t hsa_ext_image_create ( - hsa_agent_t agent, const hsa_ext_image_descriptor_t *image_descriptor, - const void *image_data, hsa_access_permission_t access_permission, - hsa_ext_image_t *image); -hsa_status_t hsa_ext_program_destroy (hsa_ext_program_t program); -hsa_status_t hsa_ext_image_copy (hsa_agent_t agent, hsa_ext_image_t src_image, - const hsa_dim3_t *src_offset, - hsa_ext_image_t dst_image, - const hsa_dim3_t *dst_offset, - const hsa_dim3_t *range); -hsa_status_t hsa_ext_image_clear (hsa_agent_t agent, hsa_ext_image_t image, - const void *data, - const hsa_ext_image_region_t *image_region); -enum -{ - HSA_EXT_AGENT_INFO_IMAGE_1D_MAX_ELEMENTS = 0x3000, - HSA_EXT_AGENT_INFO_IMAGE_1DA_MAX_ELEMENTS = 0x3001, - HSA_EXT_AGENT_INFO_IMAGE_1DB_MAX_ELEMENTS = 0x3002, - HSA_EXT_AGENT_INFO_IMAGE_2D_MAX_ELEMENTS = 0x3003, - HSA_EXT_AGENT_INFO_IMAGE_2DA_MAX_ELEMENTS = 0x3004, - HSA_EXT_AGENT_INFO_IMAGE_2DDEPTH_MAX_ELEMENTS = 0x3005, - HSA_EXT_AGENT_INFO_IMAGE_2DADEPTH_MAX_ELEMENTS = 0x3006, - HSA_EXT_AGENT_INFO_IMAGE_3D_MAX_ELEMENTS = 0x3007, - HSA_EXT_AGENT_INFO_IMAGE_ARRAY_MAX_LAYERS = 0x3008, - HSA_EXT_AGENT_INFO_MAX_IMAGE_RD_HANDLES = 0x3009, - HSA_EXT_AGENT_INFO_MAX_IMAGE_RORW_HANDLES = 0x300A, - HSA_EXT_AGENT_INFO_MAX_SAMPLER_HANDLERS = 0x300B -}; -hsa_status_t -hsa_ext_sampler_create (hsa_agent_t agent, - const hsa_ext_sampler_descriptor_t *sampler_descriptor, - hsa_ext_sampler_t *sampler); - -#endif /* _HSA_EXT_FINALIZE_H */ diff --git a/libgomp/plugin/plugin-hsa.c b/libgomp/plugin/plugin-hsa.c deleted file mode 100644 index abd3bc64163..00000000000 --- a/libgomp/plugin/plugin-hsa.c +++ /dev/null @@ -1,1871 +0,0 @@ -/* Plugin for HSAIL execution. - - Copyright (C) 2013-2020 Free Software Foundation, Inc. - - Contributed by Martin Jambor <mjambor@suse.cz> and - Martin Liska <mliska@suse.cz>. - - This file is part of the GNU Offloading and Multi Processing Library - (libgomp). - - Libgomp is free software; you can redistribute it and/or modify it - under the terms of the GNU General Public License as published by - the Free Software Foundation; either version 3, or (at your option) - any later version. - - Libgomp is distributed in the hope that it will be useful, but WITHOUT ANY - WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS - FOR A PARTICULAR PURPOSE. See the GNU General Public License for - more details. - - Under Section 7 of GPL version 3, you are granted additional - permissions described in the GCC Runtime Library Exception, version - 3.1, as published by the Free Software Foundation. - - You should have received a copy of the GNU General Public License and - a copy of the GCC Runtime Library Exception along with this program; - see the files COPYING3 and COPYING.RUNTIME respectively. If not, see - <http://www.gnu.org/licenses/>. */ - -#include "config.h" -#include <stdint.h> -#include <stdio.h> -#include <stdlib.h> -#include <string.h> -#include <pthread.h> -#ifdef HAVE_INTTYPES_H -#include <inttypes.h> -#endif -#include <stdbool.h> -#include <hsa.h> -#include <plugin/hsa_ext_finalize.h> -#include <dlfcn.h> -#include "libgomp-plugin.h" -#include "gomp-constants.h" -#include "secure_getenv.h" - -#ifdef HAVE_INTTYPES_H -typedef uint64_t print_uint64_t; -#else -#define PRIu64 "lu" -typedef unsigned long print_uint64_t; -#endif - -/* As an HSA runtime is dlopened, following structure defines function - pointers utilized by the HSA plug-in. */ - -struct hsa_runtime_fn_info -{ - /* HSA runtime. */ - hsa_status_t (*hsa_status_string_fn) (hsa_status_t status, - const char **status_string); - hsa_status_t (*hsa_agent_get_info_fn) (hsa_agent_t agent, - hsa_agent_info_t attribute, - void *value); - hsa_status_t (*hsa_init_fn) (void); - hsa_status_t (*hsa_iterate_agents_fn) - (hsa_status_t (*callback)(hsa_agent_t agent, void *data), void *data); - hsa_status_t (*hsa_region_get_info_fn) (hsa_region_t region, - hsa_region_info_t attribute, - void *value); - hsa_status_t (*hsa_queue_create_fn) - (hsa_agent_t agent, uint32_t size, hsa_queue_type_t type, - void (*callback)(hsa_status_t status, hsa_queue_t *source, void *data), - void *data, uint32_t private_segment_size, - uint32_t group_segment_size, hsa_queue_t **queue); - hsa_status_t (*hsa_agent_iterate_regions_fn) - (hsa_agent_t agent, - hsa_status_t (*callback)(hsa_region_t region, void *data), void *data); - hsa_status_t (*hsa_executable_destroy_fn) (hsa_executable_t executable); - hsa_status_t (*hsa_executable_create_fn) - (hsa_profile_t profile, hsa_executable_state_t executable_state, - const char *options, hsa_executable_t *executable); - hsa_status_t (*hsa_executable_global_variable_define_fn) - (hsa_executable_t executable, const char *variable_name, void *address); - hsa_status_t (*hsa_executable_load_code_object_fn) - (hsa_executable_t executable, hsa_agent_t agent, - hsa_code_object_t code_object, const char *options); - hsa_status_t (*hsa_executable_freeze_fn)(hsa_executable_t executable, - const char *options); - hsa_status_t (*hsa_signal_create_fn) (hsa_signal_value_t initial_value, - uint32_t num_consumers, - const hsa_agent_t *consumers, - hsa_signal_t *signal); - hsa_status_t (*hsa_memory_allocate_fn) (hsa_region_t region, size_t size, - void **ptr); - hsa_status_t (*hsa_memory_free_fn) (void *ptr); - hsa_status_t (*hsa_signal_destroy_fn) (hsa_signal_t signal); - hsa_status_t (*hsa_executable_get_symbol_fn) - (hsa_executable_t executable, const char *module_name, - const char *symbol_name, hsa_agent_t agent, int32_t call_convention, - hsa_executable_symbol_t *symbol); - hsa_status_t (*hsa_executable_symbol_get_info_fn) - (hsa_executable_symbol_t executable_symbol, - hsa_executable_symbol_info_t attribute, void *value); - uint64_t (*hsa_queue_add_write_index_release_fn) (const hsa_queue_t *queue, - uint64_t value); - uint64_t (*hsa_queue_load_read_index_acquire_fn) (const hsa_queue_t *queue); - void (*hsa_signal_store_relaxed_fn) (hsa_signal_t signal, - hsa_signal_value_t value); - void (*hsa_signal_store_release_fn) (hsa_signal_t signal, - hsa_signal_value_t value); - hsa_signal_value_t (*hsa_signal_wait_acquire_fn) - (hsa_signal_t signal, hsa_signal_condition_t condition, - hsa_signal_value_t compare_value, uint64_t timeout_hint, - hsa_wait_state_t wait_state_hint); - hsa_signal_value_t (*hsa_signal_load_acquire_fn) (hsa_signal_t signal); - hsa_status_t (*hsa_queue_destroy_fn) (hsa_queue_t *queue); - - /* HSA finalizer. */ - hsa_status_t (*hsa_ext_program_add_module_fn) (hsa_ext_program_t program, - hsa_ext_module_t module); - hsa_status_t (*hsa_ext_program_create_fn) - (hsa_machine_model_t machine_model, hsa_profile_t profile, - hsa_default_float_rounding_mode_t default_float_rounding_mode, - const char *options, hsa_ext_program_t *program); - hsa_status_t (*hsa_ext_program_destroy_fn) (hsa_ext_program_t program); - hsa_status_t (*hsa_ext_program_finalize_fn) - (hsa_ext_program_t program,hsa_isa_t isa, - int32_t call_convention, hsa_ext_control_directives_t control_directives, - const char *options, hsa_code_object_type_t code_object_type, - hsa_code_object_t *code_object); -}; - -/* HSA runtime functions that are initialized in init_hsa_context. */ - -static struct hsa_runtime_fn_info hsa_fns; - -/* Keep the following GOMP prefixed structures in sync with respective parts of - the compiler. */ - -/* Structure describing the run-time and grid properties of an HSA kernel - lauch. */ - -struct GOMP_kernel_launch_attributes -{ - /* Number of dimensions the workload has. Maximum number is 3. */ - uint32_t ndim; - /* Size of the grid in the three respective dimensions. */ - uint32_t gdims[3]; - /* Size of work-groups in the respective dimensions. */ - uint32_t wdims[3]; -}; - -/* Collection of information needed for a dispatch of a kernel from a - kernel. */ - -struct GOMP_hsa_kernel_dispatch -{ - /* Pointer to a command queue associated with a kernel dispatch agent. */ - void *queue; - /* Pointer to reserved memory for OMP data struct copying. */ - void *omp_data_memory; - /* Pointer to a memory space used for kernel arguments passing. */ - void *kernarg_address; - /* Kernel object. */ - uint64_t object; - /* Synchronization signal used for dispatch synchronization. */ - uint64_t signal; - /* Private segment size. */ - uint32_t private_segment_size; - /* Group segment size. */ - uint32_t group_segment_size; - /* Number of children kernel dispatches. */ - uint64_t kernel_dispatch_count; - /* Debug purpose argument. */ - uint64_t debug; - /* Levels-var ICV. */ - uint64_t omp_level; - /* Kernel dispatch structures created for children kernel dispatches. */ - struct GOMP_hsa_kernel_dispatch **children_dispatches; - /* Number of threads. */ - uint32_t omp_num_threads; -}; - -/* Part of the libgomp plugin interface. Return the name of the accelerator, - which is "hsa". */ - -const char * -GOMP_OFFLOAD_get_name (void) -{ - return "hsa"; -} - -/* Part of the libgomp plugin interface. Return the specific capabilities the - HSA accelerator have. */ - -unsigned int -GOMP_OFFLOAD_get_caps (void) -{ - return GOMP_OFFLOAD_CAP_SHARED_MEM | GOMP_OFFLOAD_CAP_OPENMP_400; -} - -/* Part of the libgomp plugin interface. Identify as HSA accelerator. */ - -int -GOMP_OFFLOAD_get_type (void) -{ - return OFFLOAD_TARGET_TYPE_HSA; -} - -/* Return the libgomp version number we're compatible with. There is - no requirement for cross-version compatibility. */ - -unsigned -GOMP_OFFLOAD_version (void) -{ - return GOMP_VERSION; -} - -/* Flag to decide whether print to stderr information about what is going on. - Set in init_debug depending on environment variables. */ - -static bool debug; - -/* Flag to decide if the runtime should suppress a possible fallback to host - execution. */ - -static bool suppress_host_fallback; - -/* Flag to locate HSA runtime shared library that is dlopened - by this plug-in. */ - -static const char *hsa_runtime_lib; - -/* Flag to decide if the runtime should support also CPU devices (can be - a simulator). */ - -static bool support_cpu_devices; - -/* Initialize debug and suppress_host_fallback according to the environment. */ - -static void -init_enviroment_variables (void) -{ - if (secure_getenv ("HSA_DEBUG")) - debug = true; - else - debug = false; - - if (secure_getenv ("HSA_SUPPRESS_HOST_FALLBACK")) - suppress_host_fallback = true; - else - suppress_host_fallback = false; - - hsa_runtime_lib = secure_getenv ("HSA_RUNTIME_LIB"); - if (hsa_runtime_lib == NULL) - hsa_runtime_lib = HSA_RUNTIME_LIB "libhsa-runtime64.so"; - - support_cpu_devices = secure_getenv ("HSA_SUPPORT_CPU_DEVICES"); -} - -/* Print a logging message with PREFIX to stderr if HSA_DEBUG value - is set to true. */ - -#define HSA_LOG(prefix, ...) \ - do \ - { \ - if (debug) \ - { \ - fprintf (stderr, prefix); \ - fprintf (stderr, __VA_ARGS__); \ - } \ - } \ - while (false) - -/* Print a debugging message to stderr. */ - -#define HSA_DEBUG(...) HSA_LOG ("HSA debug: ", __VA_ARGS__) - -/* Print a warning message to stderr. */ - -#define HSA_WARNING(...) HSA_LOG ("HSA warning: ", __VA_ARGS__) - -/* Print HSA warning STR with an HSA STATUS code. */ - -static void -hsa_warn (const char *str, hsa_status_t status) -{ - if (!debug) - return; - - const char *hsa_error_msg = "[unknown]"; - hsa_fns.hsa_status_string_fn (status, &hsa_error_msg); - - fprintf (stderr, "HSA warning: %s\nRuntime message: %s", str, hsa_error_msg); -} - -/* Report a fatal error STR together with the HSA error corresponding to STATUS - and terminate execution of the current process. */ - -static void -hsa_fatal (const char *str, hsa_status_t status) -{ - const char *hsa_error_msg = "[unknown]"; - hsa_fns.hsa_status_string_fn (status, &hsa_error_msg); - GOMP_PLUGIN_fatal ("HSA fatal error: %s\nRuntime message: %s", str, - hsa_error_msg); -} - -/* Like hsa_fatal, except only report error message, and return FALSE - for propagating error processing to outside of plugin. */ - -static bool -hsa_error (const char *str, hsa_status_t status) -{ - const char *hsa_error_msg = "[unknown]"; - hsa_fns.hsa_status_string_fn (status, &hsa_error_msg); - GOMP_PLUGIN_error ("HSA fatal error: %s\nRuntime message: %s", str, - hsa_error_msg); - return false; -} - -struct hsa_kernel_description -{ - const char *name; - unsigned omp_data_size; - bool gridified_kernel_p; - unsigned kernel_dependencies_count; - const char **kernel_dependencies; -}; - -struct global_var_info -{ - const char *name; - void *address; -}; - -/* Data passed by the static initializer of a compilation unit containing BRIG - to GOMP_offload_register. */ - -struct brig_image_desc -{ - hsa_ext_module_t brig_module; - const unsigned kernel_count; - struct hsa_kernel_description *kernel_infos; - const unsigned global_variable_count; - struct global_var_info *global_variables; -}; - -struct agent_info; - -/* Information required to identify, finalize and run any given kernel. */ - -struct kernel_info -{ - /* Name of the kernel, required to locate it within the brig module. */ - const char *name; - /* Size of memory space for OMP data. */ - unsigned omp_data_size; - /* The specific agent the kernel has been or will be finalized for and run - on. */ - struct agent_info *agent; - /* The specific module where the kernel takes place. */ - struct module_info *module; - /* Mutex enforcing that at most once thread ever initializes a kernel for - use. A thread should have locked agent->modules_rwlock for reading before - acquiring it. */ - pthread_mutex_t init_mutex; - /* Flag indicating whether the kernel has been initialized and all fields - below it contain valid data. */ - bool initialized; - /* Flag indicating that the kernel has a problem that blocks an execution. */ - bool initialization_failed; - /* The object to be put into the dispatch queue. */ - uint64_t object; - /* Required size of kernel arguments. */ - uint32_t kernarg_segment_size; - /* Required size of group segment. */ - uint32_t group_segment_size; - /* Required size of private segment. */ - uint32_t private_segment_size; - /* List of all kernel dependencies. */ - const char **dependencies; - /* Number of dependencies. */ - unsigned dependencies_count; - /* Maximum OMP data size necessary for kernel from kernel dispatches. */ - unsigned max_omp_data_size; - /* True if the kernel is gridified. */ - bool gridified_kernel_p; -}; - -/* Information about a particular brig module, its image and kernels. */ - -struct module_info -{ - /* The next and previous module in the linked list of modules of an agent. */ - struct module_info *next, *prev; - /* The description with which the program has registered the image. */ - struct brig_image_desc *image_desc; - - /* Number of kernels in this module. */ - int kernel_count; - /* An array of kernel_info structures describing each kernel in this - module. */ - struct kernel_info kernels[]; -}; - -/* Information about shared brig library. */ - -struct brig_library_info -{ - char *file_name; - hsa_ext_module_t image; -}; - -/* Description of an HSA GPU agent and the program associated with it. */ - -struct agent_info -{ - /* The HSA ID of the agent. Assigned when hsa_context is initialized. */ - hsa_agent_t id; - /* Whether the agent has been initialized. The fields below are usable only - if it has been. */ - bool initialized; - /* The HSA ISA of this agent. */ - hsa_isa_t isa; - /* Command queue of the agent. */ - hsa_queue_t *command_q; - /* Kernel from kernel dispatch command queue. */ - hsa_queue_t *kernel_dispatch_command_q; - /* The HSA memory region from which to allocate kernel arguments. */ - hsa_region_t kernarg_region; - - /* Read-write lock that protects kernels which are running or about to be run - from interference with loading and unloading of images. Needs to be - locked for reading while a kernel is being run, and for writing if the - list of modules is manipulated (and thus the HSA program invalidated). */ - pthread_rwlock_t modules_rwlock; - /* The first module in a linked list of modules associated with this - kernel. */ - struct module_info *first_module; - - /* Mutex enforcing that only one thread will finalize the HSA program. A - thread should have locked agent->modules_rwlock for reading before - acquiring it. */ - pthread_mutex_t prog_mutex; - /* Flag whether the HSA program that consists of all the modules has been - finalized. */ - bool prog_finalized; - /* Flag whether the program was finalized but with a failure. */ - bool prog_finalized_error; - /* HSA executable - the finalized program that is used to locate kernels. */ - hsa_executable_t executable; - /* List of BRIG libraries. */ - struct brig_library_info **brig_libraries; - /* Number of loaded shared BRIG libraries. */ - unsigned brig_libraries_count; -}; - -/* Information about the whole HSA environment and all of its agents. */ - -struct hsa_context_info -{ - /* Whether the structure has been initialized. */ - bool initialized; - /* Number of usable GPU HSA agents in the system. */ - int agent_count; - /* Array of agent_info structures describing the individual HSA agents. */ - struct agent_info *agents; -}; - -/* Information about the whole HSA environment and all of its agents. */ - -static struct hsa_context_info hsa_context; - -#define DLSYM_FN(function) \ - hsa_fns.function##_fn = dlsym (handle, #function); \ - if (hsa_fns.function##_fn == NULL) \ - goto dl_fail; - -static bool -init_hsa_runtime_functions (void) -{ - void *handle = dlopen (hsa_runtime_lib, RTLD_LAZY); - if (handle == NULL) - goto dl_fail; - - DLSYM_FN (hsa_status_string) - DLSYM_FN (hsa_agent_get_info) - DLSYM_FN (hsa_init) - DLSYM_FN (hsa_iterate_agents) - DLSYM_FN (hsa_region_get_info) - DLSYM_FN (hsa_queue_create) - DLSYM_FN (hsa_agent_iterate_regions) - DLSYM_FN (hsa_executable_destroy) - DLSYM_FN (hsa_executable_create) - DLSYM_FN (hsa_executable_global_variable_define) - DLSYM_FN (hsa_executable_load_code_object) - DLSYM_FN (hsa_executable_freeze) - DLSYM_FN (hsa_signal_create) - DLSYM_FN (hsa_memory_allocate) - DLSYM_FN (hsa_memory_free) - DLSYM_FN (hsa_signal_destroy) - DLSYM_FN (hsa_executable_get_symbol) - DLSYM_FN (hsa_executable_symbol_get_info) - DLSYM_FN (hsa_queue_add_write_index_release) - DLSYM_FN (hsa_queue_load_read_index_acquire) - DLSYM_FN (hsa_signal_wait_acquire) - DLSYM_FN (hsa_signal_store_relaxed) - DLSYM_FN (hsa_signal_store_release) - DLSYM_FN (hsa_signal_load_acquire) - DLSYM_FN (hsa_queue_destroy) - DLSYM_FN (hsa_ext_program_add_module) - DLSYM_FN (hsa_ext_program_create) - DLSYM_FN (hsa_ext_program_destroy) - DLSYM_FN (hsa_ext_program_finalize) - return true; - - dl_fail: - HSA_DEBUG ("while loading %s: %s\n", hsa_runtime_lib, dlerror ()); - return false; -} - -/* Find kernel for an AGENT by name provided in KERNEL_NAME. */ - -static struct kernel_info * -get_kernel_for_agent (struct agent_info *agent, const char *kernel_name) -{ - struct module_info *module = agent->first_module; - - while (module) - { - for (unsigned i = 0; i < module->kernel_count; i++) - if (strcmp (module->kernels[i].name, kernel_name) == 0) - return &module->kernels[i]; - - module = module->next; - } - - return NULL; -} - -/* Return true if the agent is a GPU and acceptable of concurrent submissions - from different threads. */ - -static bool -suitable_hsa_agent_p (hsa_agent_t agent) -{ - hsa_device_type_t device_type; - hsa_status_t status - = hsa_fns.hsa_agent_get_info_fn (agent, HSA_AGENT_INFO_DEVICE, - &device_type); - if (status != HSA_STATUS_SUCCESS) - return false; - - switch (device_type) - { - case HSA_DEVICE_TYPE_GPU: - break; - case HSA_DEVICE_TYPE_CPU: - if (!support_cpu_devices) - return false; - break; - default: - return false; - } - - uint32_t features = 0; - status = hsa_fns.hsa_agent_get_info_fn (agent, HSA_AGENT_INFO_FEATURE, - &features); - if (status != HSA_STATUS_SUCCESS - || !(features & HSA_AGENT_FEATURE_KERNEL_DISPATCH)) - return false; - hsa_queue_type_t queue_type; - status = hsa_fns.hsa_agent_get_info_fn (agent, HSA_AGENT_INFO_QUEUE_TYPE, - &queue_type); - if (status != HSA_STATUS_SUCCESS - || (queue_type != HSA_QUEUE_TYPE_MULTI)) - return false; - - return true; -} - -/* Callback of hsa_iterate_agents, if AGENT is a GPU device, increment - agent_count in hsa_context. */ - -static hsa_status_t -count_gpu_agents (hsa_agent_t agent, void *data __attribute__ ((unused))) -{ - if (suitable_hsa_agent_p (agent)) - hsa_context.agent_count++; - return HSA_STATUS_SUCCESS; -} - -/* Callback of hsa_iterate_agents, if AGENT is a GPU device, assign the agent - id to the describing structure in the hsa context. The index of the - structure is pointed to by DATA, increment it afterwards. */ - -static hsa_status_t -assign_agent_ids (hsa_agent_t agent, void *data) -{ - if (suitable_hsa_agent_p (agent)) - { - int *agent_index = (int *) data; - hsa_context.agents[*agent_index].id = agent; - ++*agent_index; - } - return HSA_STATUS_SUCCESS; -} - -/* Initialize hsa_context if it has not already been done. - Return TRUE on success. */ - -static bool -init_hsa_context (void) -{ - hsa_status_t status; - int agent_index = 0; - - if (hsa_context.initialized) - return true; - init_enviroment_variables (); - if (!init_hsa_runtime_functions ()) - { - HSA_DEBUG ("Run-time could not be dynamically opened\n"); - return false; - } - status = hsa_fns.hsa_init_fn (); - if (status != HSA_STATUS_SUCCESS) - return hsa_error ("Run-time could not be initialized", status); - HSA_DEBUG ("HSA run-time initialized\n"); - status = hsa_fns.hsa_iterate_agents_fn (count_gpu_agents, NULL); - if (status != HSA_STATUS_SUCCESS) - return hsa_error ("HSA GPU devices could not be enumerated", status); - HSA_DEBUG ("There are %i HSA GPU devices.\n", hsa_context.agent_count); - - hsa_context.agents - = GOMP_PLUGIN_malloc_cleared (hsa_context.agent_count - * sizeof (struct agent_info)); - status = hsa_fns.hsa_iterate_agents_fn (assign_agent_ids, &agent_index); - if (agent_index != hsa_context.agent_count) - { - GOMP_PLUGIN_error ("Failed to assign IDs to all HSA agents"); - return false; - } - hsa_context.initialized = true; - return true; -} - -/* Callback of dispatch queues to report errors. */ - -static void -queue_callback (hsa_status_t status, - hsa_queue_t *queue __attribute__ ((unused)), - void *data __attribute__ ((unused))) -{ - hsa_fatal ("Asynchronous queue error", status); -} - -/* Callback of hsa_agent_iterate_regions. Determine if a memory REGION can be - used for kernarg allocations and if so write it to the memory pointed to by - DATA and break the query. */ - -static hsa_status_t -get_kernarg_memory_region (hsa_region_t region, void *data) -{ - hsa_status_t status; - hsa_region_segment_t segment; - - status = hsa_fns.hsa_region_get_info_fn (region, HSA_REGION_INFO_SEGMENT, - &segment); - if (status != HSA_STATUS_SUCCESS) - return status; - if (segment != HSA_REGION_SEGMENT_GLOBAL) - return HSA_STATUS_SUCCESS; - - uint32_t flags; - status = hsa_fns.hsa_region_get_info_fn (region, HSA_REGION_INFO_GLOBAL_FLAGS, - &flags); - if (status != HSA_STATUS_SUCCESS) - return status; - if (flags & HSA_REGION_GLOBAL_FLAG_KERNARG) - { - hsa_region_t *ret = (hsa_region_t *) data; - *ret = region; - return HSA_STATUS_INFO_BREAK; - } - return HSA_STATUS_SUCCESS; -} - -/* Part of the libgomp plugin interface. Return the number of HSA devices on - the system. */ - -int -GOMP_OFFLOAD_get_num_devices (void) -{ - if (!init_hsa_context ()) - return 0; - return hsa_context.agent_count; -} - -/* Part of the libgomp plugin interface. Initialize agent number N so that it - can be used for computation. Return TRUE on success. */ - -bool -GOMP_OFFLOAD_init_device (int n) -{ - if (!init_hsa_context ()) - return false; - if (n >= hsa_context.agent_count) - { - GOMP_PLUGIN_error ("Request to initialize non-existing HSA device %i", n); - return false; - } - struct agent_info *agent = &hsa_context.agents[n]; - - if (agent->initialized) - return true; - - if (pthread_rwlock_init (&agent->modules_rwlock, NULL)) - { - GOMP_PLUGIN_error ("Failed to initialize an HSA agent rwlock"); - return false; - } - if (pthread_mutex_init (&agent->prog_mutex, NULL)) - { - GOMP_PLUGIN_error ("Failed to initialize an HSA agent program mutex"); - return false; - } - - uint32_t queue_size; - hsa_status_t status; - status = hsa_fns.hsa_agent_get_info_fn (agent->id, - HSA_AGENT_INFO_QUEUE_MAX_SIZE, - &queue_size); - if (status != HSA_STATUS_SUCCESS) - return hsa_error ("Error requesting maximum queue size of the HSA agent", - status); - status = hsa_fns.hsa_agent_get_info_fn (agent->id, HSA_AGENT_INFO_ISA, - &agent->isa); - if (status != HSA_STATUS_SUCCESS) - return hsa_error ("Error querying the ISA of the agent", status); - status = hsa_fns.hsa_queue_create_fn (agent->id, queue_size, - HSA_QUEUE_TYPE_MULTI, - queue_callback, NULL, UINT32_MAX, - UINT32_MAX, - &agent->command_q); - if (status != HSA_STATUS_SUCCESS) - return hsa_error ("Error creating command queue", status); - - status = hsa_fns.hsa_queue_create_fn (agent->id, queue_size, - HSA_QUEUE_TYPE_MULTI, - queue_callback, NULL, UINT32_MAX, - UINT32_MAX, - &agent->kernel_dispatch_command_q); - if (status != HSA_STATUS_SUCCESS) - return hsa_error ("Error creating kernel dispatch command queue", status); - - agent->kernarg_region.handle = (uint64_t) -1; - status = hsa_fns.hsa_agent_iterate_regions_fn (agent->id, - get_kernarg_memory_region, - &agent->kernarg_region); - if (agent->kernarg_region.handle == (uint64_t) -1) - { - GOMP_PLUGIN_error ("Could not find suitable memory region for kernel " - "arguments"); - return false; - } - HSA_DEBUG ("HSA agent initialized, queue has id %llu\n", - (long long unsigned) agent->command_q->id); - HSA_DEBUG ("HSA agent initialized, kernel dispatch queue has id %llu\n", - (long long unsigned) agent->kernel_dispatch_command_q->id); - agent->initialized = true; - return true; -} - -/* Verify that hsa_context has already been initialized and return the - agent_info structure describing device number N. Return NULL on error. */ - -static struct agent_info * -get_agent_info (int n) -{ - if (!hsa_context.initialized) - { - GOMP_PLUGIN_error ("Attempt to use uninitialized HSA context."); - return NULL; - } - if (n >= hsa_context.agent_count) - { - GOMP_PLUGIN_error ("Request to operate on anon-existing HSA device %i", n); - return NULL; - } - if (!hsa_context.agents[n].initialized) - { - GOMP_PLUGIN_error ("Attempt to use an uninitialized HSA agent."); - return NULL; - } - return &hsa_context.agents[n]; -} - -/* Insert MODULE to the linked list of modules of AGENT. */ - -static void -add_module_to_agent (struct agent_info *agent, struct module_info *module) -{ - if (agent->first_module) - agent->first_module->prev = module; - module->next = agent->first_module; - module->prev = NULL; - agent->first_module = module; -} - -/* Remove MODULE from the linked list of modules of AGENT. */ - -static void -remove_module_from_agent (struct agent_info *agent, struct module_info *module) -{ - if (agent->first_module == module) - agent->first_module = module->next; - if (module->prev) - module->prev->next = module->next; - if (module->next) - module->next->prev = module->prev; -} - -/* Free the HSA program in agent and everything associated with it and set - agent->prog_finalized and the initialized flags of all kernels to false. - Return TRUE on success. */ - -static bool -destroy_hsa_program (struct agent_info *agent) -{ - if (!agent->prog_finalized || agent->prog_finalized_error) - return true; - - hsa_status_t status; - - HSA_DEBUG ("Destroying the current HSA program.\n"); - - status = hsa_fns.hsa_executable_destroy_fn (agent->executable); - if (status != HSA_STATUS_SUCCESS) - return hsa_error ("Could not destroy HSA executable", status); - - struct module_info *module; - for (module = agent->first_module; module; module = module->next) - { - int i; - for (i = 0; i < module->kernel_count; i++) - module->kernels[i].initialized = false; - } - agent->prog_finalized = false; - return true; -} - -/* Initialize KERNEL from D and other parameters. Return true on success. */ - -static bool -init_basic_kernel_info (struct kernel_info *kernel, - struct hsa_kernel_description *d, - struct agent_info *agent, - struct module_info *module) -{ - kernel->agent = agent; - kernel->module = module; - kernel->name = d->name; - kernel->omp_data_size = d->omp_data_size; - kernel->gridified_kernel_p = d->gridified_kernel_p; - kernel->dependencies_count = d->kernel_dependencies_count; - kernel->dependencies = d->kernel_dependencies; - if (pthread_mutex_init (&kernel->init_mutex, NULL)) - { - GOMP_PLUGIN_error ("Failed to initialize an HSA kernel mutex"); - return false; - } - return true; -} - -/* Part of the libgomp plugin interface. Load BRIG module described by struct - brig_image_desc in TARGET_DATA and return references to kernel descriptors - in TARGET_TABLE. */ - -int -GOMP_OFFLOAD_load_image (int ord, unsigned version, const void *target_data, - struct addr_pair **target_table) -{ - if (GOMP_VERSION_DEV (version) > GOMP_VERSION_HSA) - { - GOMP_PLUGIN_error ("Offload data incompatible with HSA plugin" - " (expected %u, received %u)", - GOMP_VERSION_HSA, GOMP_VERSION_DEV (version)); - return -1; - } - - struct brig_image_desc *image_desc = (struct brig_image_desc *) target_data; - struct agent_info *agent; - struct addr_pair *pair; - struct module_info *module; - struct kernel_info *kernel; - int kernel_count = image_desc->kernel_count; - - agent = get_agent_info (ord); - if (!agent) - return -1; - - if (pthread_rwlock_wrlock (&agent->modules_rwlock)) - { - GOMP_PLUGIN_error ("Unable to write-lock an HSA agent rwlock"); - return -1; - } - if (agent->prog_finalized - && !destroy_hsa_program (agent)) - return -1; - - HSA_DEBUG ("Encountered %d kernels in an image\n", kernel_count); - pair = GOMP_PLUGIN_malloc (kernel_count * sizeof (struct addr_pair)); - *target_table = pair; - module = (struct module_info *) - GOMP_PLUGIN_malloc_cleared (sizeof (struct module_info) - + kernel_count * sizeof (struct kernel_info)); - module->image_desc = image_desc; - module->kernel_count = kernel_count; - - kernel = &module->kernels[0]; - - /* Allocate memory for kernel dependencies. */ - for (unsigned i = 0; i < kernel_count; i++) - { - pair->start = (uintptr_t) kernel; - pair->end = (uintptr_t) (kernel + 1); - - struct hsa_kernel_description *d = &image_desc->kernel_infos[i]; - if (!init_basic_kernel_info (kernel, d, agent, module)) - return -1; - kernel++; - pair++; - } - - add_module_to_agent (agent, module); - if (pthread_rwlock_unlock (&agent->modules_rwlock)) - { - GOMP_PLUGIN_error ("Unable to unlock an HSA agent rwlock"); - return -1; - } - return kernel_count; -} - -/* Add a shared BRIG library from a FILE_NAME to an AGENT. */ - -static struct brig_library_info * -add_shared_library (const char *file_name, struct agent_info *agent) -{ - struct brig_library_info *library = NULL; - - void *f = dlopen (file_name, RTLD_NOW); - void *start = dlsym (f, "__brig_start"); - void *end = dlsym (f, "__brig_end"); - - if (start == NULL || end == NULL) - return NULL; - - unsigned size = end - start; - char *buf = (char *) GOMP_PLUGIN_malloc (size); - memcpy (buf, start, size); - - library = GOMP_PLUGIN_malloc (sizeof (struct agent_info)); - library->file_name = (char *) GOMP_PLUGIN_malloc - ((strlen (file_name) + 1)); - strcpy (library->file_name, file_name); - library->image = (hsa_ext_module_t) buf; - - return library; -} - -/* Release memory used for BRIG shared libraries that correspond - to an AGENT. */ - -static void -release_agent_shared_libraries (struct agent_info *agent) -{ - for (unsigned i = 0; i < agent->brig_libraries_count; i++) - if (agent->brig_libraries[i]) - { - free (agent->brig_libraries[i]->file_name); - free (agent->brig_libraries[i]->image); - free (agent->brig_libraries[i]); - } - - free (agent->brig_libraries); -} - -/* Create and finalize the program consisting of all loaded modules. */ - -static void -create_and_finalize_hsa_program (struct agent_info *agent) -{ - hsa_status_t status; - hsa_ext_program_t prog_handle; - int mi = 0; - - if (pthread_mutex_lock (&agent->prog_mutex)) - GOMP_PLUGIN_fatal ("Could not lock an HSA agent program mutex"); - if (agent->prog_finalized) - goto final; - - status = hsa_fns.hsa_ext_program_create_fn - (HSA_MACHINE_MODEL_LARGE, HSA_PROFILE_FULL, - HSA_DEFAULT_FLOAT_ROUNDING_MODE_DEFAULT, - NULL, &prog_handle); - if (status != HSA_STATUS_SUCCESS) - hsa_fatal ("Could not create an HSA program", status); - - HSA_DEBUG ("Created a finalized program\n"); - - struct module_info *module = agent->first_module; - while (module) - { - status = hsa_fns.hsa_ext_program_add_module_fn - (prog_handle, module->image_desc->brig_module); - if (status != HSA_STATUS_SUCCESS) - hsa_fatal ("Could not add a module to the HSA program", status); - module = module->next; - mi++; - } - - /* Load all shared libraries. */ - const char *libraries[] = { "libhsamath.so", "libhsastd.so" }; - const unsigned libraries_count = sizeof (libraries) / sizeof (const char *); - - agent->brig_libraries_count = libraries_count; - agent->brig_libraries = GOMP_PLUGIN_malloc_cleared - (sizeof (struct brig_library_info) * libraries_count); - - for (unsigned i = 0; i < libraries_count; i++) - { - struct brig_library_info *library = add_shared_library (libraries[i], - agent); - if (library == NULL) - { - HSA_WARNING ("Could not open a shared BRIG library: %s\n", - libraries[i]); - continue; - } - - status = hsa_fns.hsa_ext_program_add_module_fn (prog_handle, - library->image); - if (status != HSA_STATUS_SUCCESS) - hsa_warn ("Could not add a shared BRIG library the HSA program", - status); - else - HSA_DEBUG ("a shared BRIG library has been added to a program: %s\n", - libraries[i]); - } - - hsa_ext_control_directives_t control_directives; - memset (&control_directives, 0, sizeof (control_directives)); - hsa_code_object_t code_object; - status = hsa_fns.hsa_ext_program_finalize_fn - (prog_handle, agent->isa,HSA_EXT_FINALIZER_CALL_CONVENTION_AUTO, - control_directives, "", HSA_CODE_OBJECT_TYPE_PROGRAM, &code_object); - if (status != HSA_STATUS_SUCCESS) - { - hsa_warn ("Finalization of the HSA program failed", status); - goto failure; - } - - HSA_DEBUG ("Finalization done\n"); - hsa_fns.hsa_ext_program_destroy_fn (prog_handle); - - status - = hsa_fns.hsa_executable_create_fn (HSA_PROFILE_FULL, - HSA_EXECUTABLE_STATE_UNFROZEN, - "", &agent->executable); - if (status != HSA_STATUS_SUCCESS) - hsa_fatal ("Could not create HSA executable", status); - - module = agent->first_module; - while (module) - { - /* Initialize all global variables declared in the module. */ - for (unsigned i = 0; i < module->image_desc->global_variable_count; i++) - { - struct global_var_info *var; - var = &module->image_desc->global_variables[i]; - status = hsa_fns.hsa_executable_global_variable_define_fn - (agent->executable, var->name, var->address); - - HSA_DEBUG ("Defining global variable: %s, address: %p\n", var->name, - var->address); - - if (status != HSA_STATUS_SUCCESS) - hsa_fatal ("Could not define a global variable in the HSA program", - status); - } - - module = module->next; - } - - status = hsa_fns.hsa_executable_load_code_object_fn (agent->executable, - agent->id, - code_object, ""); - if (status != HSA_STATUS_SUCCESS) - hsa_fatal ("Could not add a code object to the HSA executable", status); - status = hsa_fns.hsa_executable_freeze_fn (agent->executable, ""); - if (status != HSA_STATUS_SUCCESS) - hsa_fatal ("Could not freeze the HSA executable", status); - - HSA_DEBUG ("Froze HSA executable with the finalized code object\n"); - - /* If all goes good, jump to final. */ - goto final; - -failure: - agent->prog_finalized_error = true; - -final: - agent->prog_finalized = true; - - if (pthread_mutex_unlock (&agent->prog_mutex)) - GOMP_PLUGIN_fatal ("Could not unlock an HSA agent program mutex"); -} - -/* Create kernel dispatch data structure for given KERNEL. */ - -static struct GOMP_hsa_kernel_dispatch * -create_single_kernel_dispatch (struct kernel_info *kernel, - unsigned omp_data_size) -{ - struct agent_info *agent = kernel->agent; - struct GOMP_hsa_kernel_dispatch *shadow - = GOMP_PLUGIN_malloc_cleared (sizeof (struct GOMP_hsa_kernel_dispatch)); - - shadow->queue = agent->command_q; - shadow->omp_data_memory - = omp_data_size > 0 ? GOMP_PLUGIN_malloc (omp_data_size) : NULL; - unsigned dispatch_count = kernel->dependencies_count; - shadow->kernel_dispatch_count = dispatch_count; - - shadow->children_dispatches - = GOMP_PLUGIN_malloc (dispatch_count * sizeof (shadow)); - - shadow->object = kernel->object; - - hsa_signal_t sync_signal; - hsa_status_t status = hsa_fns.hsa_signal_create_fn (1, 0, NULL, &sync_signal); - if (status != HSA_STATUS_SUCCESS) - hsa_fatal ("Error creating the HSA sync signal", status); - - shadow->signal = sync_signal.handle; - shadow->private_segment_size = kernel->private_segment_size; - shadow->group_segment_size = kernel->group_segment_size; - - status - = hsa_fns.hsa_memory_allocate_fn (agent->kernarg_region, - kernel->kernarg_segment_size, - &shadow->kernarg_address); - if (status != HSA_STATUS_SUCCESS) - hsa_fatal ("Could not allocate memory for HSA kernel arguments", status); - - return shadow; -} - -/* Release data structure created for a kernel dispatch in SHADOW argument. */ - -static void -release_kernel_dispatch (struct GOMP_hsa_kernel_dispatch *shadow) -{ - HSA_DEBUG ("Released kernel dispatch: %p has value: %" PRIu64 " (%p)\n", - shadow, (print_uint64_t) shadow->debug, - (void *) (uintptr_t) shadow->debug); - - hsa_fns.hsa_memory_free_fn (shadow->kernarg_address); - - hsa_signal_t s; - s.handle = shadow->signal; - hsa_fns.hsa_signal_destroy_fn (s); - - free (shadow->omp_data_memory); - - for (unsigned i = 0; i < shadow->kernel_dispatch_count; i++) - release_kernel_dispatch (shadow->children_dispatches[i]); - - free (shadow->children_dispatches); - free (shadow); -} - -/* Initialize a KERNEL without its dependencies. MAX_OMP_DATA_SIZE is used - to calculate maximum necessary memory for OMP data allocation. */ - -static void -init_single_kernel (struct kernel_info *kernel, unsigned *max_omp_data_size) -{ - hsa_status_t status; - struct agent_info *agent = kernel->agent; - hsa_executable_symbol_t kernel_symbol; - status = hsa_fns.hsa_executable_get_symbol_fn (agent->executable, NULL, - kernel->name, agent->id, - 0, &kernel_symbol); - if (status != HSA_STATUS_SUCCESS) - { - hsa_warn ("Could not find symbol for kernel in the code object", status); - goto failure; - } - HSA_DEBUG ("Located kernel %s\n", kernel->name); - status = hsa_fns.hsa_executable_symbol_get_info_fn - (kernel_symbol, HSA_EXECUTABLE_SYMBOL_INFO_KERNEL_OBJECT, &kernel->object); - if (status != HSA_STATUS_SUCCESS) - hsa_fatal ("Could not extract a kernel object from its symbol", status); - status = hsa_fns.hsa_executable_symbol_get_info_fn - (kernel_symbol, HSA_EXECUTABLE_SYMBOL_INFO_KERNEL_KERNARG_SEGMENT_SIZE, - &kernel->kernarg_segment_size); - if (status != HSA_STATUS_SUCCESS) - hsa_fatal ("Could not get info about kernel argument size", status); - status = hsa_fns.hsa_executable_symbol_get_info_fn - (kernel_symbol, HSA_EXECUTABLE_SYMBOL_INFO_KERNEL_GROUP_SEGMENT_SIZE, - &kernel->group_segment_size); - if (status != HSA_STATUS_SUCCESS) - hsa_fatal ("Could not get info about kernel group segment size", status); - status = hsa_fns.hsa_executable_symbol_get_info_fn - (kernel_symbol, HSA_EXECUTABLE_SYMBOL_INFO_KERNEL_PRIVATE_SEGMENT_SIZE, - &kernel->private_segment_size); - if (status != HSA_STATUS_SUCCESS) - hsa_fatal ("Could not get info about kernel private segment size", - status); - - HSA_DEBUG ("Kernel structure for %s fully initialized with " - "following segment sizes: \n", kernel->name); - HSA_DEBUG (" group_segment_size: %u\n", - (unsigned) kernel->group_segment_size); - HSA_DEBUG (" private_segment_size: %u\n", - (unsigned) kernel->private_segment_size); - HSA_DEBUG (" kernarg_segment_size: %u\n", - (unsigned) kernel->kernarg_segment_size); - HSA_DEBUG (" omp_data_size: %u\n", kernel->omp_data_size); - HSA_DEBUG (" gridified_kernel_p: %u\n", kernel->gridified_kernel_p); - - if (kernel->omp_data_size > *max_omp_data_size) - *max_omp_data_size = kernel->omp_data_size; - - for (unsigned i = 0; i < kernel->dependencies_count; i++) - { - struct kernel_info *dependency - = get_kernel_for_agent (agent, kernel->dependencies[i]); - - if (dependency == NULL) - { - HSA_DEBUG ("Could not find a dependency for a kernel: %s, " - "dependency name: %s\n", kernel->name, - kernel->dependencies[i]); - goto failure; - } - - if (dependency->dependencies_count > 0) - { - HSA_DEBUG ("HSA does not allow kernel dispatching code with " - "a depth bigger than one\n"); - goto failure; - } - - init_single_kernel (dependency, max_omp_data_size); - } - - return; - -failure: - kernel->initialization_failed = true; -} - -/* Indent stream F by INDENT spaces. */ - -static void -indent_stream (FILE *f, unsigned indent) -{ - fprintf (f, "%*s", indent, ""); -} - -/* Dump kernel DISPATCH data structure and indent it by INDENT spaces. */ - -static void -print_kernel_dispatch (struct GOMP_hsa_kernel_dispatch *dispatch, unsigned indent) -{ - indent_stream (stderr, indent); - fprintf (stderr, "this: %p\n", dispatch); - indent_stream (stderr, indent); - fprintf (stderr, "queue: %p\n", dispatch->queue); - indent_stream (stderr, indent); - fprintf (stderr, "omp_data_memory: %p\n", dispatch->omp_data_memory); - indent_stream (stderr, indent); - fprintf (stderr, "kernarg_address: %p\n", dispatch->kernarg_address); - indent_stream (stderr, indent); - fprintf (stderr, "object: %" PRIu64 "\n", (print_uint64_t) dispatch->object); - indent_stream (stderr, indent); - fprintf (stderr, "signal: %" PRIu64 "\n", (print_uint64_t) dispatch->signal); - indent_stream (stderr, indent); - fprintf (stderr, "private_segment_size: %u\n", - dispatch->private_segment_size); - indent_stream (stderr, indent); - fprintf (stderr, "group_segment_size: %u\n", - dispatch->group_segment_size); - indent_stream (stderr, indent); - fprintf (stderr, "children dispatches: %" PRIu64 "\n", - (print_uint64_t) dispatch->kernel_dispatch_count); - indent_stream (stderr, indent); - fprintf (stderr, "omp_num_threads: %u\n", - dispatch->omp_num_threads); - fprintf (stderr, "\n"); - - for (unsigned i = 0; i < dispatch->kernel_dispatch_count; i++) - print_kernel_dispatch (dispatch->children_dispatches[i], indent + 2); -} - -/* Create kernel dispatch data structure for a KERNEL and all its - dependencies. */ - -static struct GOMP_hsa_kernel_dispatch * -create_kernel_dispatch (struct kernel_info *kernel, unsigned omp_data_size) -{ - struct GOMP_hsa_kernel_dispatch *shadow - = create_single_kernel_dispatch (kernel, omp_data_size); - shadow->omp_num_threads = 64; - shadow->debug = 0; - shadow->omp_level = kernel->gridified_kernel_p ? 1 : 0; - - /* Create kernel dispatch data structures. We do not allow to have - a kernel dispatch with depth bigger than one. */ - for (unsigned i = 0; i < kernel->dependencies_count; i++) - { - struct kernel_info *dependency - = get_kernel_for_agent (kernel->agent, kernel->dependencies[i]); - shadow->children_dispatches[i] - = create_single_kernel_dispatch (dependency, omp_data_size); - shadow->children_dispatches[i]->queue - = kernel->agent->kernel_dispatch_command_q; - shadow->children_dispatches[i]->omp_level = 1; - } - - return shadow; -} - -/* Do all the work that is necessary before running KERNEL for the first time. - The function assumes the program has been created, finalized and frozen by - create_and_finalize_hsa_program. */ - -static void -init_kernel (struct kernel_info *kernel) -{ - if (pthread_mutex_lock (&kernel->init_mutex)) - GOMP_PLUGIN_fatal ("Could not lock an HSA kernel initialization mutex"); - if (kernel->initialized) - { - if (pthread_mutex_unlock (&kernel->init_mutex)) - GOMP_PLUGIN_fatal ("Could not unlock an HSA kernel initialization " - "mutex"); - - return; - } - - /* Precomputed maximum size of OMP data necessary for a kernel from kernel - dispatch operation. */ - init_single_kernel (kernel, &kernel->max_omp_data_size); - - if (!kernel->initialization_failed) - HSA_DEBUG ("\n"); - - kernel->initialized = true; - if (pthread_mutex_unlock (&kernel->init_mutex)) - GOMP_PLUGIN_fatal ("Could not unlock an HSA kernel initialization " - "mutex"); -} - -/* Parse the target attributes INPUT provided by the compiler and return true - if we should run anything all. If INPUT is NULL, fill DEF with default - values, then store INPUT or DEF into *RESULT. */ - -static bool -parse_target_attributes (void **input, - struct GOMP_kernel_launch_attributes *def, - struct GOMP_kernel_launch_attributes **result) -{ - if (!input) - GOMP_PLUGIN_fatal ("No target arguments provided"); - - bool attrs_found = false; - while (*input) - { - uintptr_t id = (uintptr_t) *input; - if ((id & GOMP_TARGET_ARG_DEVICE_MASK) == GOMP_DEVICE_HSA - && ((id & GOMP_TARGET_ARG_ID_MASK) - == GOMP_TARGET_ARG_HSA_KERNEL_ATTRIBUTES)) - { - input++; - attrs_found = true; - break; - } - - if (id & GOMP_TARGET_ARG_SUBSEQUENT_PARAM) - input++; - input++; - } - - if (!attrs_found) - { - def->ndim = 1; - def->gdims[0] = 1; - def->gdims[1] = 1; - def->gdims[2] = 1; - def->wdims[0] = 1; - def->wdims[1] = 1; - def->wdims[2] = 1; - *result = def; - HSA_DEBUG ("GOMP_OFFLOAD_run called with no launch attributes\n"); - return true; - } - - struct GOMP_kernel_launch_attributes *kla; - kla = (struct GOMP_kernel_launch_attributes *) *input; - *result = kla; - if (kla->ndim == 0 || kla->ndim > 3) - GOMP_PLUGIN_fatal ("Invalid number of dimensions (%u)", kla->ndim); - - HSA_DEBUG ("GOMP_OFFLOAD_run called with %u dimensions:\n", kla->ndim); - unsigned i; - for (i = 0; i < kla->ndim; i++) - { - HSA_DEBUG (" Dimension %u: grid size %u and group size %u\n", i, - kla->gdims[i], kla->wdims[i]); - if (kla->gdims[i] == 0) - return false; - } - return true; -} - -/* Return the group size given the requested GROUP size, GRID size and number - of grid dimensions NDIM. */ - -static uint32_t -get_group_size (uint32_t ndim, uint32_t grid, uint32_t group) -{ - if (group == 0) - { - /* TODO: Provide a default via environment or device characteristics. */ - if (ndim == 1) - group = 64; - else if (ndim == 2) - group = 8; - else - group = 4; - } - - if (group > grid) - group = grid; - return group; -} - -/* Return true if the HSA runtime can run function FN_PTR. */ - -bool -GOMP_OFFLOAD_can_run (void *fn_ptr) -{ - struct kernel_info *kernel = (struct kernel_info *) fn_ptr; - struct agent_info *agent = kernel->agent; - create_and_finalize_hsa_program (agent); - - if (agent->prog_finalized_error) - goto failure; - - init_kernel (kernel); - if (kernel->initialization_failed) - goto failure; - - return true; - -failure: - if (suppress_host_fallback) - GOMP_PLUGIN_fatal ("HSA host fallback has been suppressed"); - HSA_DEBUG ("HSA target cannot be launched, doing a host fallback\n"); - return false; -} - -/* Atomically store pair of uint16_t values (HEADER and REST) to a PACKET. */ - -void -packet_store_release (uint32_t* packet, uint16_t header, uint16_t rest) -{ - __atomic_store_n (packet, header | (rest << 16), __ATOMIC_RELEASE); -} - -/* Run KERNEL on its agent, pass VARS to it as arguments and take - launchattributes from KLA. */ - -void -run_kernel (struct kernel_info *kernel, void *vars, - struct GOMP_kernel_launch_attributes *kla) -{ - struct agent_info *agent = kernel->agent; - if (pthread_rwlock_rdlock (&agent->modules_rwlock)) - GOMP_PLUGIN_fatal ("Unable to read-lock an HSA agent rwlock"); - - if (!agent->initialized) - GOMP_PLUGIN_fatal ("Agent must be initialized"); - - if (!kernel->initialized) - GOMP_PLUGIN_fatal ("Called kernel must be initialized"); - - struct GOMP_hsa_kernel_dispatch *shadow - = create_kernel_dispatch (kernel, kernel->max_omp_data_size); - - if (debug) - { - fprintf (stderr, "\nKernel has following dependencies:\n"); - print_kernel_dispatch (shadow, 2); - } - - uint64_t index - = hsa_fns.hsa_queue_add_write_index_release_fn (agent->command_q, 1); - HSA_DEBUG ("Got AQL index %llu\n", (long long int) index); - - /* Wait until the queue is not full before writing the packet. */ - while (index - hsa_fns.hsa_queue_load_read_index_acquire_fn (agent->command_q) - >= agent->command_q->size) - ; - - hsa_kernel_dispatch_packet_t *packet; - packet = ((hsa_kernel_dispatch_packet_t *) agent->command_q->base_address) - + index % agent->command_q->size; - - memset (((uint8_t *) packet) + 4, 0, sizeof (*packet) - 4); - packet->grid_size_x = kla->gdims[0]; - packet->workgroup_size_x = get_group_size (kla->ndim, kla->gdims[0], - kla->wdims[0]); - - if (kla->ndim >= 2) - { - packet->grid_size_y = kla->gdims[1]; - packet->workgroup_size_y = get_group_size (kla->ndim, kla->gdims[1], - kla->wdims[1]); - } - else - { - packet->grid_size_y = 1; - packet->workgroup_size_y = 1; - } - - if (kla->ndim == 3) - { - packet->grid_size_z = kla->gdims[2]; - packet->workgroup_size_z = get_group_size (kla->ndim, kla->gdims[2], - kla->wdims[2]); - } - else - { - packet->grid_size_z = 1; - packet->workgroup_size_z = 1; - } - - packet->private_segment_size = kernel->private_segment_size; - packet->group_segment_size = kernel->group_segment_size; - packet->kernel_object = kernel->object; - packet->kernarg_address = shadow->kernarg_address; - hsa_signal_t s; - s.handle = shadow->signal; - packet->completion_signal = s; - hsa_fns.hsa_signal_store_relaxed_fn (s, 1); - memcpy (shadow->kernarg_address, &vars, sizeof (vars)); - - /* PR hsa/70337. */ - size_t vars_size = sizeof (vars); - if (kernel->kernarg_segment_size > vars_size) - { - if (kernel->kernarg_segment_size != vars_size - + sizeof (struct hsa_kernel_runtime *)) - GOMP_PLUGIN_fatal ("Kernel segment size has an unexpected value"); - memcpy (packet->kernarg_address + vars_size, &shadow, - sizeof (struct hsa_kernel_runtime *)); - } - - HSA_DEBUG ("Copying kernel runtime pointer to kernarg_address\n"); - - uint16_t header; - header = HSA_PACKET_TYPE_KERNEL_DISPATCH << HSA_PACKET_HEADER_TYPE; - header |= HSA_FENCE_SCOPE_SYSTEM << HSA_PACKET_HEADER_ACQUIRE_FENCE_SCOPE; - header |= HSA_FENCE_SCOPE_SYSTEM << HSA_PACKET_HEADER_RELEASE_FENCE_SCOPE; - - HSA_DEBUG ("Going to dispatch kernel %s\n", kernel->name); - - packet_store_release ((uint32_t *) packet, header, - (uint16_t) kla->ndim << HSA_KERNEL_DISPATCH_PACKET_SETUP_DIMENSIONS); - - hsa_fns.hsa_signal_store_release_fn (agent->command_q->doorbell_signal, - index); - - /* TODO: GPU agents in Carrizo APUs cannot properly update L2 cache for - signal wait and signal load operations on their own and we need to - periodically call the hsa_signal_load_acquire on completion signals of - children kernels in the CPU to make that happen. As soon the - limitation will be resolved, this workaround can be removed. */ - - HSA_DEBUG ("Kernel dispatched, waiting for completion\n"); - - /* Root signal waits with 1ms timeout. */ - while (hsa_fns.hsa_signal_wait_acquire_fn (s, HSA_SIGNAL_CONDITION_LT, 1, - 1000 * 1000, - HSA_WAIT_STATE_BLOCKED) != 0) - for (unsigned i = 0; i < shadow->kernel_dispatch_count; i++) - { - hsa_signal_t child_s; - child_s.handle = shadow->children_dispatches[i]->signal; - - HSA_DEBUG ("Waiting for children completion signal: %" PRIu64 "\n", - (print_uint64_t) shadow->children_dispatches[i]->signal); - hsa_fns.hsa_signal_load_acquire_fn (child_s); - } - - release_kernel_dispatch (shadow); - - if (pthread_rwlock_unlock (&agent->modules_rwlock)) - GOMP_PLUGIN_fatal ("Unable to unlock an HSA agent rwlock"); -} - -/* Part of the libgomp plugin interface. Run a kernel on device N (the number - is actually ignored, we assume the FN_PTR has been mapped using the correct - device) and pass it an array of pointers in VARS as a parameter. The kernel - is identified by FN_PTR which must point to a kernel_info structure. */ - -void -GOMP_OFFLOAD_run (int n __attribute__((unused)), - void *fn_ptr, void *vars, void **args) -{ - struct kernel_info *kernel = (struct kernel_info *) fn_ptr; - struct GOMP_kernel_launch_attributes def; - struct GOMP_kernel_launch_attributes *kla; - if (!parse_target_attributes (args, &def, &kla)) - { - HSA_DEBUG ("Will not run HSA kernel because the grid size is zero\n"); - return; - } - run_kernel (kernel, vars, kla); -} - -/* Information to be passed to a thread running a kernel asycnronously. */ - -struct async_run_info -{ - int device; - void *tgt_fn; - void *tgt_vars; - void **args; - void *async_data; -}; - -/* Thread routine to run a kernel asynchronously. */ - -static void * -run_kernel_asynchronously (void *thread_arg) -{ - struct async_run_info *info = (struct async_run_info *) thread_arg; - int device = info->device; - void *tgt_fn = info->tgt_fn; - void *tgt_vars = info->tgt_vars; - void **args = info->args; - void *async_data = info->async_data; - - free (info); - GOMP_OFFLOAD_run (device, tgt_fn, tgt_vars, args); - GOMP_PLUGIN_target_task_completion (async_data); - return NULL; -} - -/* Part of the libgomp plugin interface. Run a kernel like GOMP_OFFLOAD_run - does, but asynchronously and call GOMP_PLUGIN_target_task_completion when it - has finished. */ - -void -GOMP_OFFLOAD_async_run (int device, void *tgt_fn, void *tgt_vars, - void **args, void *async_data) -{ - pthread_t pt; - struct async_run_info *info; - HSA_DEBUG ("GOMP_OFFLOAD_async_run invoked\n"); - info = GOMP_PLUGIN_malloc (sizeof (struct async_run_info)); - - info->device = device; - info->tgt_fn = tgt_fn; - info->tgt_vars = tgt_vars; - info->args = args; - info->async_data = async_data; - - int err = pthread_create (&pt, NULL, &run_kernel_asynchronously, info); - if (err != 0) - GOMP_PLUGIN_fatal ("HSA asynchronous thread creation failed: %s", - strerror (err)); - err = pthread_detach (pt); - if (err != 0) - GOMP_PLUGIN_fatal ("Failed to detach a thread to run HSA kernel " - "asynchronously: %s", strerror (err)); -} - -/* Deinitialize all information associated with MODULE and kernels within - it. Return TRUE on success. */ - -static bool -destroy_module (struct module_info *module) -{ - int i; - for (i = 0; i < module->kernel_count; i++) - if (pthread_mutex_destroy (&module->kernels[i].init_mutex)) - { - GOMP_PLUGIN_error ("Failed to destroy an HSA kernel initialization " - "mutex"); - return false; - } - return true; -} - -/* Part of the libgomp plugin interface. Unload BRIG module described by - struct brig_image_desc in TARGET_DATA from agent number N. Return - TRUE on success. */ - -bool -GOMP_OFFLOAD_unload_image (int n, unsigned version, const void *target_data) -{ - if (GOMP_VERSION_DEV (version) > GOMP_VERSION_HSA) - { - GOMP_PLUGIN_error ("Offload data incompatible with HSA plugin" - " (expected %u, received %u)", - GOMP_VERSION_HSA, GOMP_VERSION_DEV (version)); - return false; - } - - struct agent_info *agent; - agent = get_agent_info (n); - if (!agent) - return false; - - if (pthread_rwlock_wrlock (&agent->modules_rwlock)) - { - GOMP_PLUGIN_error ("Unable to write-lock an HSA agent rwlock"); - return false; - } - struct module_info *module = agent->first_module; - while (module) - { - if (module->image_desc == target_data) - break; - module = module->next; - } - if (!module) - { - GOMP_PLUGIN_error ("Attempt to unload an image that has never been " - "loaded before"); - return false; - } - - remove_module_from_agent (agent, module); - if (!destroy_module (module)) - return false; - free (module); - if (!destroy_hsa_program (agent)) - return false; - if (pthread_rwlock_unlock (&agent->modules_rwlock)) - { - GOMP_PLUGIN_error ("Unable to unlock an HSA agent rwlock"); - return false; - } - return true; -} - -/* Part of the libgomp plugin interface. Deinitialize all information and - status associated with agent number N. We do not attempt any - synchronization, assuming the user and libgomp will not attempt - deinitialization of a device that is in any way being used at the same - time. Return TRUE on success. */ - -bool -GOMP_OFFLOAD_fini_device (int n) -{ - struct agent_info *agent = get_agent_info (n); - if (!agent) - return false; - - if (!agent->initialized) - return true; - - struct module_info *next_module = agent->first_module; - while (next_module) - { - struct module_info *module = next_module; - next_module = module->next; - if (!destroy_module (module)) - return false; - free (module); - } - agent->first_module = NULL; - if (!destroy_hsa_program (agent)) - return false; - - release_agent_shared_libraries (agent); - - hsa_status_t status = hsa_fns.hsa_queue_destroy_fn (agent->command_q); - if (status != HSA_STATUS_SUCCESS) - return hsa_error ("Error destroying command queue", status); - status = hsa_fns.hsa_queue_destroy_fn (agent->kernel_dispatch_command_q); - if (status != HSA_STATUS_SUCCESS) - return hsa_error ("Error destroying kernel dispatch command queue", status); - if (pthread_mutex_destroy (&agent->prog_mutex)) - { - GOMP_PLUGIN_error ("Failed to destroy an HSA agent program mutex"); - return false; - } - if (pthread_rwlock_destroy (&agent->modules_rwlock)) - { - GOMP_PLUGIN_error ("Failed to destroy an HSA agent rwlock"); - return false; - } - agent->initialized = false; - return true; -} - -/* Part of the libgomp plugin interface. Not implemented as it is not required - for HSA. */ - -void * -GOMP_OFFLOAD_alloc (int ord, size_t size) -{ - GOMP_PLUGIN_error ("HSA GOMP_OFFLOAD_alloc is not implemented because " - "it should never be called"); - return NULL; -} - -/* Part of the libgomp plugin interface. Not implemented as it is not required - for HSA. */ - -bool -GOMP_OFFLOAD_free (int ord, void *ptr) -{ - GOMP_PLUGIN_error ("HSA GOMP_OFFLOAD_free is not implemented because " - "it should never be called"); - return false; -} - -/* Part of the libgomp plugin interface. Not implemented as it is not required - for HSA. */ - -bool -GOMP_OFFLOAD_dev2host (int ord, void *dst, const void *src, size_t n) -{ - GOMP_PLUGIN_error ("HSA GOMP_OFFLOAD_dev2host is not implemented because " - "it should never be called"); - return false; -} - -/* Part of the libgomp plugin interface. Not implemented as it is not required - for HSA. */ - -bool -GOMP_OFFLOAD_host2dev (int ord, void *dst, const void *src, size_t n) -{ - GOMP_PLUGIN_error ("HSA GOMP_OFFLOAD_host2dev is not implemented because " - "it should never be called"); - return false; -} - -/* Part of the libgomp plugin interface. Not implemented as it is not required - for HSA. */ - -bool -GOMP_OFFLOAD_dev2dev (int ord, void *dst, const void *src, size_t n) -{ - GOMP_PLUGIN_error ("HSA GOMP_OFFLOAD_dev2dev is not implemented because " - "it should never be called"); - return false; -} diff --git a/libgomp/testsuite/Makefile.in b/libgomp/testsuite/Makefile.in index 58ba1f319b0..26e925b4d34 100644 --- a/libgomp/testsuite/Makefile.in +++ b/libgomp/testsuite/Makefile.in @@ -1,7 +1,7 @@ -# Makefile.in generated by automake 1.15.1 from Makefile.am. +# Makefile.in generated by automake 1.16.1 from Makefile.am. # @configure_input@ -# Copyright (C) 1994-2017 Free Software Foundation, Inc. +# Copyright (C) 1994-2018 Free Software Foundation, Inc. # This Makefile.in is free software; the Free Software Foundation # gives unlimited permission to copy and/or distribute it, @@ -215,10 +215,6 @@ PLUGIN_GCN = @PLUGIN_GCN@ PLUGIN_GCN_CPPFLAGS = @PLUGIN_GCN_CPPFLAGS@ PLUGIN_GCN_LDFLAGS = @PLUGIN_GCN_LDFLAGS@ PLUGIN_GCN_LIBS = @PLUGIN_GCN_LIBS@ -PLUGIN_HSA = @PLUGIN_HSA@ -PLUGIN_HSA_CPPFLAGS = @PLUGIN_HSA_CPPFLAGS@ -PLUGIN_HSA_LDFLAGS = @PLUGIN_HSA_LDFLAGS@ -PLUGIN_HSA_LIBS = @PLUGIN_HSA_LIBS@ PLUGIN_NVPTX = @PLUGIN_NVPTX@ PLUGIN_NVPTX_CPPFLAGS = @PLUGIN_NVPTX_CPPFLAGS@ PLUGIN_NVPTX_LDFLAGS = @PLUGIN_NVPTX_LDFLAGS@ @@ -335,8 +331,8 @@ Makefile: $(srcdir)/Makefile.in $(top_builddir)/config.status *config.status*) \ cd $(top_builddir) && $(MAKE) $(AM_MAKEFLAGS) am--refresh;; \ *) \ - echo ' cd $(top_builddir) && $(SHELL) ./config.status $(subdir)/$@ $(am__depfiles_maybe)'; \ - cd $(top_builddir) && $(SHELL) ./config.status $(subdir)/$@ $(am__depfiles_maybe);; \ + echo ' cd $(top_builddir) && $(SHELL) ./config.status $(subdir)/$@ $(am__maybe_remake_depfiles)'; \ + cd $(top_builddir) && $(SHELL) ./config.status $(subdir)/$@ $(am__maybe_remake_depfiles);; \ esac; $(top_builddir)/config.status: $(top_srcdir)/configure $(CONFIG_STATUS_DEPENDENCIES) diff --git a/libgomp/testsuite/lib/libgomp.exp b/libgomp/testsuite/lib/libgomp.exp index 8ccb78f42c1..5d86e2ac095 100644 --- a/libgomp/testsuite/lib/libgomp.exp +++ b/libgomp/testsuite/lib/libgomp.exp @@ -233,9 +233,6 @@ proc libgomp_init { args } { # Disable caret lappend ALWAYS_CFLAGS "additional_flags=-fno-diagnostics-show-caret" - # Disable HSA warnings by default. - lappend ALWAYS_CFLAGS "additional_flags=-Wno-hsa" - # Disable color diagnostics lappend ALWAYS_CFLAGS "additional_flags=-fdiagnostics-color=never" @@ -325,9 +322,6 @@ proc offload_target_to_openacc_device_type { offload_target } { disable { return "host" } - hsa* { - return "" - } *-intelmic* { return "" } @@ -430,60 +424,6 @@ proc check_effective_target_openacc_host_selected { } { return [string match "host" $openacc_device_type] } -# Return 1 if the selected OMP device is actually a HSA device - -proc check_effective_target_hsa_offloading_selected_nocache {} { - global tool - - set src { - int main () { - int v = 1; - #pragma omp target map(from:v) - v = 0; - return v; - } - } - - set result [check_compile hsa_offloading_src executable $src] - set lines [lindex $result 0] - set exe [lindex $result 1] - - set ok 0 - if { [string match "" $lines] } { - # No error messages, let us switch on HSA debugging output and run it - set prev_HSA_DEBUG [getenv HSA_DEBUG] - setenv HSA_DEBUG "1" - set result [remote_load target "./$exe"] - if { [string match "" $prev_HSA_DEBUG] } { - unsetenv HSA_DEBUG - } else { - setenv HSA_DEBUG $prev_HSA_DEBUG - } - set status [lindex $result 0] - if { $status != "pass" } { - remote_file build delete $exe - verbose "HSA availability test failed" - return 0 - } - set output [lindex $result 1] - if { [string match "*HSA debug: Going to dispatch kernel*" $output] } { - verbose "HSA availability detected" - set ok 1 - } - } - remote_file build delete $exe - return $ok -} - -# Return 1 if the selected OMP device is actually a HSA device and -# cache the result - -proc check_effective_target_hsa_offloading_selected {} { - return [check_cached_effective_target hsa_offloading_selected { - check_effective_target_hsa_offloading_selected_nocache - }] -} - # Return 1 if at least one AMD GPU is accessible. proc check_effective_target_openacc_radeon_accel_present { } { diff --git a/libgomp/testsuite/libgomp.hsa.c/alloca-1.c b/libgomp/testsuite/libgomp.hsa.c/alloca-1.c deleted file mode 100644 index 48dca94a47f..00000000000 --- a/libgomp/testsuite/libgomp.hsa.c/alloca-1.c +++ /dev/null @@ -1,25 +0,0 @@ -#define size 10 -int i, j, k; - -int -main () -{ - char *s = __builtin_malloc (size + 1); - -#pragma omp target teams - { -#pragma omp distribute parallel for default(none) private(i) shared(s) - for (i = 0; i < size; ++i) - { - char *buffer = __builtin_alloca (10); - buffer[5] = 97 + i; - s[i] = buffer[5]; - } - } - - for (i = 0; i < size; ++i) - if (s[i] != 97 + i) - __builtin_abort (); - - return 0; -} diff --git a/libgomp/testsuite/libgomp.hsa.c/bitfield-1.c b/libgomp/testsuite/libgomp.hsa.c/bitfield-1.c deleted file mode 100644 index 4dbf3481733..00000000000 --- a/libgomp/testsuite/libgomp.hsa.c/bitfield-1.c +++ /dev/null @@ -1,160 +0,0 @@ -#include <assert.h> - -#define ASSIGN_SX(N) \ - s##N.a1 = 1; \ - s##N.a2 = 2; \ - s##N.a3 = 3; \ - s##N.a4 = 4; \ - s##N.a5 = 5; \ - s##N.a6 = 6; \ - s##N.a7 = 7; \ - s##N.a8 = 8; \ - s##N.a9 = 9; \ - s##N.a10 = 10; - -#define ASSERT_SX(N) \ - assert (s##N.a1 == 1); \ - assert (s##N.a2 == 2); \ - assert (s##N.a3 == 3); \ - assert (s##N.a4 == 4); \ - assert (s##N.a5 == 5); \ - assert (s##N.a6 == 6); \ - assert (s##N.a7 == 7); \ - assert (s##N.a8 == 8); \ - assert (s##N.a9 == 9); \ - assert (s##N.a10 == 10); - -struct S1 -{ - unsigned a : 10; - unsigned b : 20; -}; - -struct S2 -{ - unsigned a1 : 10; - unsigned a2 : 10; - unsigned a3 : 10; - unsigned a4 : 10; - unsigned a5 : 10; - unsigned a6 : 10; - unsigned a7 : 10; - unsigned a8 : 10; - unsigned a9 : 10; - unsigned a10 : 10; -}; - -struct S3 -{ - unsigned a1 : 10; - unsigned a2 : 9; - unsigned a3 : 8; - unsigned a4 : 7; - unsigned a5 : 6; - unsigned a6 : 5; - unsigned a7 : 6; - unsigned a8 : 7; - unsigned a9 : 8; - unsigned a10 : 9; -}; - -struct S4 -{ - unsigned a1 : 10; - int a2 : 9; - unsigned a3 : 8; - int a4 : 7; - unsigned a5 : 6; - int a6 : 5; - unsigned a7 : 6; - int a8 : 7; - unsigned a9 : 8; - int a10 : 9; -}; - -struct S5 -{ - unsigned a1 : 31; - int a2 : 9; - unsigned a3 : 17; - int a4 : 7; - unsigned a5 : 6; - int a6 : 5; - unsigned long a7 : 55; - int a8 : 7; - unsigned a9 : 8; - int a10 : 9; -}; - -int -main () -{ - struct S1 s1; - -#pragma omp target map(to: s1) - { - s1.a = 2; - s1.b = 3; - } - - assert (s1.a == 2); - assert (s1.b == 3); - - struct S2 s2; - -#pragma omp target map(to: s2) - { - ASSIGN_SX (2) - } - - ASSERT_SX (2) - - struct S3 s3; - -#pragma omp target map(to: s3) - { - ASSIGN_SX (3) - } - - ASSERT_SX (3) - - struct S4 s4; - -#pragma omp target map(to: s4) - { - ASSIGN_SX (4) - } - - ASSERT_SX (4) - - struct S4 s5; - - s5.a1 = 0; - s5.a2 = 1; - s5.a3 = 2; - s5.a4 = 3; - s5.a5 = 4; - s5.a6 = 5; - s5.a7 = 6; - s5.a8 = 7; - s5.a9 = 8; - s5.a10 = 9; - -#pragma omp target map(to: s5) - { - s5.a1++; - s5.a2++; - s5.a3++; - s5.a4++; - s5.a5++; - s5.a6++; - s5.a7++; - s5.a8++; - s5.a9++; - s5.a10++; - } - - ASSERT_SX (5) - - return 0; -} diff --git a/libgomp/testsuite/libgomp.hsa.c/bits-insns.c b/libgomp/testsuite/libgomp.hsa.c/bits-insns.c deleted file mode 100644 index 21cac72dac4..00000000000 --- a/libgomp/testsuite/libgomp.hsa.c/bits-insns.c +++ /dev/null @@ -1,73 +0,0 @@ -#include <math.h> - -#define N 12 - -int main() -{ - unsigned int arguments[N] = {0u, 1u, 2u, 3u, 111u, 333u, 444u, 0x80000000u, 0x0000ffffu, 0xf0000000u, 0xff000000u, 0xffffffffu}; - int clrsb[N] = {}; - int clz[N] = {}; - int ctz[N] = {}; - int ffs[N] = {}; - int parity[N] = {}; - int popcount[N] = {}; - - int ref_clrsb[N] = {}; - int ref_clz[N] = {}; - int ref_ctz[N] = {}; - int ref_ffs[N] = {}; - int ref_parity[N] = {}; - int ref_popcount[N] = {}; - - for (unsigned i = 0; i < N; i++) - { - ref_clrsb[i] = __builtin_clrsb (arguments[i]); - ref_clz[i] = __builtin_clz (arguments[i]); - ref_ctz[i] = __builtin_ctz (arguments[i]); - ref_ffs[i] = __builtin_ffs (arguments[i]); - ref_parity[i] = __builtin_parity (arguments[i]); - ref_popcount[i] = __builtin_popcount (arguments[i]); - } - - #pragma omp target map(from:clz, ctz, ffs, parity, popcount) - { - for (unsigned i = 0; i < N; i++) - { - clrsb[i] = __builtin_clrsb (arguments[i]); - clz[i] = __builtin_clz (arguments[i]); - ctz[i] = __builtin_ctz (arguments[i]); - ffs[i] = __builtin_ffs (arguments[i]); - parity[i] = __builtin_parity (arguments[i]); - popcount[i] = __builtin_popcount (arguments[i]); - } - } - - for (unsigned i = 0; i < N; i++) - if (ref_clrsb[i] != clrsb[i]) - __builtin_abort (); - - /* CLZ of zero is undefined for zero. */ - for (unsigned i = 1; i < N; i++) - if (ref_clz[i] != clz[i]) - __builtin_abort (); - - /* Likewise for ctz */ - for (unsigned i = 1; i < N; i++) - if (ref_ctz[i] != ctz[i]) - __builtin_abort (); - - for (unsigned i = 0; i < N; i++) - if (ref_ffs[i] != ffs[i]) - __builtin_abort (); - - for (unsigned i = 0; i < N; i++) - if (ref_parity[i] != parity[i]) - __builtin_abort (); - - for (unsigned i = 0; i < N; i++) - if (ref_popcount[i] != popcount[i]) - __builtin_abort (); - - return 0; -} - diff --git a/libgomp/testsuite/libgomp.hsa.c/builtins-1.c b/libgomp/testsuite/libgomp.hsa.c/builtins-1.c deleted file mode 100644 index e603c21afcd..00000000000 --- a/libgomp/testsuite/libgomp.hsa.c/builtins-1.c +++ /dev/null @@ -1,97 +0,0 @@ -/* { dg-additional-options "-ffast-math" } */ - -#include <assert.h> -#include <math.h> - -#define N 10 -#define N2 14 - -#define c1 1.2345f -#define c2 1.2345 - -#define DELTA 0.001 - -#define TEST_BIT_BUILTINS(T, S, S2) \ - { \ - T arguments[N2] \ - = {0##S, 1##S, 2##S, 3##S, \ - 111##S, 333##S, 444##S, 0x80000000##S, \ - 0x0000ffff##S, 0xf0000000##S, 0xff000000##S, 0xffffffff##S}; \ - int clrsb[N2] = {}; \ - int clz[N2] = {}; \ - int ctz[N2] = {}; \ - int ffs[N2] = {}; \ - int parity[N2] = {}; \ - int popcount[N2] = {}; \ - \ - _Pragma ("omp target map(to:clz[:N2], ctz[:N2], ffs[:N2], parity[:N2], popcount[:N2])") \ - { \ - for (unsigned i = 0; i < N2; i++) \ - { \ - clrsb[i] = __builtin_clrsb##S2 (arguments[i]); \ - clz[i] = __builtin_clz##S2 (arguments[i]); \ - ctz[i] = __builtin_ctz##S2 (arguments[i]); \ - ffs[i] = __builtin_ffs##S2 (arguments[i]); \ - parity[i] = __builtin_parity##S2 (arguments[i]); \ - popcount[i] = __builtin_popcount##S2 (arguments[i]); \ - } \ - } \ - \ - for (unsigned i = 0; i < N2; i++) \ - { \ - assert (clrsb[i] == __builtin_clrsb##S2 (arguments[i])); \ - if (arguments[0] != 0) \ - { \ - assert (clz[i] == __builtin_clz##S2 (arguments[i])); \ - assert (ctz[i] == __builtin_ctz##S2 (arguments[i])); \ - } \ - assert (ffs[i] == __builtin_ffs##S2 (arguments[i])); \ - assert (parity[i] == __builtin_parity##S2 (arguments[i])); \ - assert (popcount[i] == __builtin_popcount##S2 (arguments[i])); \ - } \ - } - -#define ASSERT(v1, v2) assert (fabs (v1 - v2) < DELTA) - -int -main () -{ - float f[N] = {}; - float d[N] = {}; - -/* 1) test direct mapping to HSA insns. */ - -#pragma omp target map(to: f[ : N], d[ : N]) - { - f[0] = sinf (c1); - f[1] = cosf (c1); - f[2] = exp2f (c1); - f[3] = log2f (c1); - f[4] = truncf (c1); - f[5] = sqrtf (c1); - - d[0] = trunc (c2); - d[1] = sqrt (c2); - } - - ASSERT (f[0], sinf (c1)); - ASSERT (f[1], cosf (c1)); - ASSERT (f[2], exp2f (c1)); - ASSERT (f[3], log2f (c1)); - ASSERT (f[4], truncf (c1)); - ASSERT (f[5], sqrtf (c1)); - - ASSERT (d[0], trunc (c2)); - ASSERT (d[1], sqrt (c2)); - - /* 2) test bit builtins for unsigned int. */ - TEST_BIT_BUILTINS (int, , ); - - /* 3) test bit builtins for unsigned long int. */ - TEST_BIT_BUILTINS (long, l, l); - - /* 4) test bit builtins for unsigned long long int. */ - TEST_BIT_BUILTINS (long long, ll, ll); - - return 0; -} diff --git a/libgomp/testsuite/libgomp.hsa.c/c.exp b/libgomp/testsuite/libgomp.hsa.c/c.exp deleted file mode 100644 index 4614192320a..00000000000 --- a/libgomp/testsuite/libgomp.hsa.c/c.exp +++ /dev/null @@ -1,42 +0,0 @@ -if [info exists lang_library_path] then { - unset lang_library_path - unset lang_link_flags -} -if [info exists lang_test_file] then { - unset lang_test_file -} -if [info exists lang_include_flags] then { - unset lang_include_flags -} - -load_lib libgomp-dg.exp -load_gcc_lib gcc-dg.exp - -# Initialize dg. -dg-init - -# Turn on OpenMP. -lappend ALWAYS_CFLAGS "additional_flags=-fopenmp" - -set ld_library_path $always_ld_library_path -append ld_library_path [gcc-set-multilib-library-path $GCC_UNDER_TEST] -set_ld_library_path_env_vars - -global DEFAULT_CFLAGS -if [info exists DEFAULT_CFLAGS] then { - set CFLAGS_list [list "-O0" $DEFAULT_CFLAGS] -} else { - set CFLAGS_list [list "-O0" "-O2"] -} - -if [check_effective_target_hsa_offloading_selected] { - foreach USE_CFLAGS $CFLAGS_list { - # Gather a list of all tests. - set tests [lsort [find $srcdir/$subdir *.c]] - # Main loop. - dg-runtest $tests "" [concat $USE_CFLAGS "-Whsa"] - } -} - -# All done. -dg-finish diff --git a/libgomp/testsuite/libgomp.hsa.c/complex-1.c b/libgomp/testsuite/libgomp.hsa.c/complex-1.c deleted file mode 100644 index 438c64a1593..00000000000 --- a/libgomp/testsuite/libgomp.hsa.c/complex-1.c +++ /dev/null @@ -1,65 +0,0 @@ -#include <assert.h> -#include <complex.h> -#include <math.h> - -#define uchar unsigned char -#define C 123 - -#define TEST(type) \ - type foo_##type (void) \ - { \ - _Complex type a = C + 45I; \ - return __real__ a; \ - } - -#pragma omp declare target -TEST (char) -TEST (uchar) -TEST (short) -TEST (int) - -float -bar (float a, float b) -{ - _Complex float c = a + b * I; - - c += 11.f + 12.f * I; - - _Complex float d = 2.f + 4.44f * I; - - return __real__(crealf (c + d) + cimag (d) * I); -} - -#pragma omp end declare target - -int -main (void) -{ - int v = 0; - float v2 = 0.0f; - -#pragma omp target map(to: v) - v = foo_char (); - - assert (v == C); - -#pragma omp target map(to: v) - v = foo_uchar (); - - assert (v == C); - -#pragma omp target map(to: v) - v = foo_short (); - - assert (v == C); - -#pragma omp target map(to: v) - v = foo_int (); - - assert (v == C); - -#pragma omp target map(to: v2) - v2 = bar (1.12f, 4.44f); - - assert (fabs (v2 - 14.12) < 0.0001f); -} diff --git a/libgomp/testsuite/libgomp.hsa.c/complex-align-2.c b/libgomp/testsuite/libgomp.hsa.c/complex-align-2.c deleted file mode 100644 index b2d7acff443..00000000000 --- a/libgomp/testsuite/libgomp.hsa.c/complex-align-2.c +++ /dev/null @@ -1,27 +0,0 @@ -#pragma omp declare target - _Complex int *g; -#pragma omp end declare target - - - -_Complex float f(void); - -int -main () -{ - _Complex int y; -#pragma omp target map(from:y) - { - _Complex int x; - g = &x; - __imag__ x = 1; - __real__ x = 2; - y = x; - } - - if ((__imag__ y != 1) - || (__real__ y != 2)) - __builtin_abort (); - return 0; -} - diff --git a/libgomp/testsuite/libgomp.hsa.c/formal-actual-args-1.c b/libgomp/testsuite/libgomp.hsa.c/formal-actual-args-1.c deleted file mode 100644 index 058a036d371..00000000000 --- a/libgomp/testsuite/libgomp.hsa.c/formal-actual-args-1.c +++ /dev/null @@ -1,83 +0,0 @@ -#include <assert.h> - -struct Cube -{ - int x; - int y; - int z; -}; - -#pragma omp declare target -int -foo (short a) -{ - switch (a) - { - case 1: - return 11; - break; - case 33: - return 333; - break; - case 55: - return 55; - break; - default: - return -1; - } -} - -int -bar (int a) -{ - int *ptr = &a; - - *ptr = 100; - return a + *ptr; -} - -struct Cube -baz (struct Cube c) -{ - c.x = 11; - return c; -} - -#pragma omp end declare target - -#define s 100 - -int -main (int argc) -{ - /* Test 1: argument types: char to short. */ - - int array[s]; -#pragma omp target map(tofrom : array[ : s]) - { - for (char i = 0; i < s; i++) - array[i] = foo (i); - } - - for (int i = 0; i < s; i++) - assert (array[i] == foo (i)); - - /* Test 2: argument address is taken. */ - int v = 2; - -#pragma omp target map(tofrom : v) - v = bar (v); - - assert (v == 200); - - /* Test 3: passing a structure as a function argument. */ - struct Cube r; - struct Cube c = {.x = 1, .y = 2, .z = 3}; - -#pragma omp target map(to : r) map(from : c) - r = baz (c); - - assert (r.x == 11); - assert (r.y == c.y); - assert (r.z == c.z); -} diff --git a/libgomp/testsuite/libgomp.hsa.c/function-call-1.c b/libgomp/testsuite/libgomp.hsa.c/function-call-1.c deleted file mode 100644 index 7f15dff96b9..00000000000 --- a/libgomp/testsuite/libgomp.hsa.c/function-call-1.c +++ /dev/null @@ -1,50 +0,0 @@ -#define size 8 - -#pragma omp declare target -int -identity (int x) -{ - return x; -} - -int -expx (int x, int n) -{ - for (int i = 0; i < n - 1; i++) - x *= x; - - return x; -} - -float -init (int x, int y) -{ - int x1 = identity (identity (identity (identity (x)))); - int y1 = identity (identity (identity (identity (y)))); - - int x2 = expx (x1, 2); - int y2 = expx (y1, 2); - - return (x2 + y2); -} -#pragma omp end declare target - -int -main () -{ - int i, j; - int a[size][size]; - -#pragma omp target teams map(to:a[:size][:size]) -#pragma omp distribute parallel for default(none) private(i, j) shared(a) - for (i = 0; i < size; ++i) - for (j = 0; j < size; ++j) - a[i][j] = init (i, j); - - for (i = 0; i < size; ++i) - for (j = 0; j < size; ++j) - if (i * i + j * j != a[i][j]) - __builtin_abort (); - - return 0; -} diff --git a/libgomp/testsuite/libgomp.hsa.c/get-level-1.c b/libgomp/testsuite/libgomp.hsa.c/get-level-1.c deleted file mode 100644 index 81c9df00276..00000000000 --- a/libgomp/testsuite/libgomp.hsa.c/get-level-1.c +++ /dev/null @@ -1,26 +0,0 @@ -#include <omp.h> - -int -main () -{ - int i; - int level = -1; - -#pragma omp target map(tofrom : level) - { - level = omp_get_level (); - } - - if (level != 0) - __builtin_abort (); - -#pragma omp target teams map(tofrom : level) -#pragma omp distribute parallel for default(none) private(i) shared(level) - for (i = 0; i < 1; ++i) - level += omp_get_level (); - - if (level != 1) - __builtin_abort (); - - return 0; -} diff --git a/libgomp/testsuite/libgomp.hsa.c/gridify-1.c b/libgomp/testsuite/libgomp.hsa.c/gridify-1.c deleted file mode 100644 index b670b9b654c..00000000000 --- a/libgomp/testsuite/libgomp.hsa.c/gridify-1.c +++ /dev/null @@ -1,26 +0,0 @@ -void __attribute__((noinline, noclone)) -foo (int n, int *a, int workgroup_size) -{ - int i; -#pragma omp target -#pragma omp teams thread_limit(workgroup_size) -#pragma omp distribute parallel for shared(a) firstprivate(n) private(i) - for (i = 0; i < n; i++) - a[i]++; -} - -int main (int argc, char **argv) -{ - int n = 32; - int *a = __builtin_malloc (sizeof (int) * n); - int i; - - __builtin_memset (a, 0, sizeof (int) * n); - foo (n, a, 32); - for (i = 0; i < n; i ++) - { - if (a[i] != 1) - __builtin_abort (); - } - return 0; -} diff --git a/libgomp/testsuite/libgomp.hsa.c/gridify-2.c b/libgomp/testsuite/libgomp.hsa.c/gridify-2.c deleted file mode 100644 index 3692eb0d11c..00000000000 --- a/libgomp/testsuite/libgomp.hsa.c/gridify-2.c +++ /dev/null @@ -1,26 +0,0 @@ -void __attribute__((noinline, noclone)) -foo (int j, int n, int *a) -{ - int i; -#pragma omp target -#pragma omp teams -#pragma omp distribute parallel for shared(a) firstprivate(n) private(i) firstprivate(j) - for (i = j + 1; i < n; i++) - a[i] = i; -} - -int main (int argc, char **argv) -{ - int n = 32; - int *a = __builtin_malloc (sizeof (int) * n); - int i, j = 4; - - __builtin_memset (a, 0, sizeof (int) * n); - foo (j, n, a); - for (i = j + 1; i < n; i ++) - { - if (a[i] != i) - __builtin_abort (); - } - return 0; -} diff --git a/libgomp/testsuite/libgomp.hsa.c/gridify-3.c b/libgomp/testsuite/libgomp.hsa.c/gridify-3.c deleted file mode 100644 index f881d81e18e..00000000000 --- a/libgomp/testsuite/libgomp.hsa.c/gridify-3.c +++ /dev/null @@ -1,39 +0,0 @@ -#define THE_LOOP \ - for (i = j + 1; i < n; i += 3) \ - a[i] = i - -void __attribute__((noinline, noclone)) -foo (int j, int n, int *a) -{ - int i; -#pragma omp target -#pragma omp teams -#pragma omp distribute parallel for shared(a) firstprivate(n) private(i) firstprivate(j) - THE_LOOP; -} - -void __attribute__((noinline, noclone)) -bar (int j, int n, int *a) -{ - int i; - THE_LOOP; -} - -int main (int argc, char **argv) -{ - int n = 32; - int *a = __builtin_malloc (sizeof (int) * n); - int *ref = __builtin_malloc (sizeof (int) * n); - int i, j = 4; - - __builtin_memset (a, 0, sizeof (int) * n); - __builtin_memset (ref, 0, sizeof (int) * n); - bar (j, n, ref); - foo (j, n, a); - for (i = 0; i < n; i ++) - { - if (a[i] != ref[i]) - __builtin_abort (); - } - return 0; -} diff --git a/libgomp/testsuite/libgomp.hsa.c/gridify-4.c b/libgomp/testsuite/libgomp.hsa.c/gridify-4.c deleted file mode 100644 index c3fbdbf55d4..00000000000 --- a/libgomp/testsuite/libgomp.hsa.c/gridify-4.c +++ /dev/null @@ -1,45 +0,0 @@ -#define THE_LOOP \ - for (i = j + 1; i < n; i += 3) \ - a[i] = i - -void __attribute__((noinline, noclone)) -foo (int j, int n, int *a) -{ -#pragma omp parallel - { - #pragma omp single - { - int i; -#pragma omp target -#pragma omp teams -#pragma omp distribute parallel for shared(a) firstprivate(n) private(i) firstprivate(j) - THE_LOOP; - } - } -} - -void __attribute__((noinline, noclone)) -bar (int j, int n, int *a) -{ - int i; - THE_LOOP; -} - -int main (int argc, char **argv) -{ - int n = 32; - int *a = __builtin_malloc (sizeof (int) * n); - int *ref = __builtin_malloc (sizeof (int) * n); - int i, j = 4; - - __builtin_memset (a, 0, sizeof (int) * n); - __builtin_memset (ref, 0, sizeof (int) * n); - bar (j, n, ref); - foo (j, n, a); - for (i = 0; i < n; i ++) - { - if (a[i] != ref[i]) - __builtin_abort (); - } - return 0; -} diff --git a/libgomp/testsuite/libgomp.hsa.c/memory-operations-1.c b/libgomp/testsuite/libgomp.hsa.c/memory-operations-1.c deleted file mode 100644 index a17be932111..00000000000 --- a/libgomp/testsuite/libgomp.hsa.c/memory-operations-1.c +++ /dev/null @@ -1,92 +0,0 @@ -#include <assert.h> - -#define C 55 - -int i, j, k; - -static void -test_bzero (unsigned size) -{ - unsigned bsize = size * sizeof (int); - int *x = __builtin_malloc (bsize); - __builtin_memset (x, C, bsize); - -#pragma omp target map(tofrom: x[:size]) map(from: bsize) - { - __builtin_bzero (x, bsize); - } - - char *buffer = (char *) x; - for (unsigned i = 0; i < bsize; ++i) - assert (buffer[i] == 0); -} - -static void -test_memcpy (unsigned size) -{ - unsigned bsize = size * sizeof (int); - int *x = __builtin_malloc (bsize); - __builtin_memset (x, C, bsize); - int *y = __builtin_malloc (bsize); - -#pragma omp target map(tofrom: x[:size], y[:size]) map(from: bsize) - { - __builtin_memcpy (y, x, bsize); - } - - char *buffer = (char *) y; - for (unsigned i = 0; i < bsize; ++i) - assert (buffer[i] == C); -} - -static void -test_mempcpy (unsigned size) -{ - unsigned bsize = size * sizeof (int); - int *x = __builtin_malloc (bsize); - __builtin_memset (x, C, bsize); - int *y = __builtin_malloc (bsize); - int *ptr = 0; - -#pragma omp target map(tofrom :x[:size], y[:size], ptr) map(from: bsize) - { - ptr = __builtin_mempcpy (y, x, bsize); - } - - char *buffer = (char *) y; - for (unsigned i = 0; i < bsize; ++i) - assert (buffer[i] == C); - - assert (ptr == y + size); -} - -static void -test_memset (unsigned size) -{ - unsigned bsize = size * sizeof (int); - int *x = __builtin_malloc (bsize); - __builtin_bzero (x, bsize); - -#pragma omp target map(tofrom : x[:size]) map(from: bsize) - { - __builtin_memset (x, C, bsize); - } - - char *buffer = (char *) x; - for (unsigned i = 0; i < bsize; ++i) - assert (buffer[i] == C); -} - -int -main (void) -{ - unsigned tests[] = {1, 2, 3, 4, 5, 8, 15, 17, 23, 33, 0}; - - for (unsigned i = 0; tests[i]; i++) - { - test_bzero (tests[i]); - test_memset (tests[i]); - test_memcpy (tests[i]); - test_mempcpy (tests[i]); - } -} diff --git a/libgomp/testsuite/libgomp.hsa.c/pr69568.c b/libgomp/testsuite/libgomp.hsa.c/pr69568.c deleted file mode 100644 index 6262eee3e71..00000000000 --- a/libgomp/testsuite/libgomp.hsa.c/pr69568.c +++ /dev/null @@ -1,41 +0,0 @@ -/* PR hsa/69568 */ - -typedef float float2 __attribute__ ((vector_size (8))); -float2 *output; - -void __attribute__((noinline, noclone)) -foo (int n, float2 *a, int workgroup_size) -{ - int i; -#pragma omp target map(from:a[:n]) firstprivate(n, workgroup_size) -#pragma omp teams thread_limit(workgroup_size) -#pragma omp distribute parallel for shared(a) firstprivate(n) private(i) - for (i = 0; i < n; i++) - { float2 v; - v[0] = i; - v[1] = 1+i; - a[i] = v; - } -} - -int main (int argc, char **argv) -{ - int n = 32; - float2 *a = __builtin_malloc (sizeof (float2) * n); - int i; - - __builtin_memset (a, 0, sizeof (float2) * n); - foo (n, a, 32); - for (i = 0; i < n; i++) - { - float2 v = a[i]; - if (__builtin_abs (v[0] - i) > 0.1 - || __builtin_abs (v[1] - i - 1) > 0.1) - { - __builtin_abort (); - return 1; - } - } - return 0; -} - diff --git a/libgomp/testsuite/libgomp.hsa.c/pr82416.c b/libgomp/testsuite/libgomp.hsa.c/pr82416.c deleted file mode 100644 index 40378ab12a5..00000000000 --- a/libgomp/testsuite/libgomp.hsa.c/pr82416.c +++ /dev/null @@ -1,43 +0,0 @@ -char __attribute__ ((noipa)) -toup (char X) -{ - if (X >= 97 && X <= 122) - return X - 32; - else - return X; -} - -char -target_toup_1 (char X) -{ - char r; -#pragma omp target map(to:X) map(from:r) - { - if (X >= 97 && X <= 122) - r = X - 32; - else - r = X; - } - return r; -} - -char __attribute__ ((noipa)) -target_toup (char X) -{ - return target_toup_1 (X); -} - -int main (int argc, char **argv) -{ - char a = 'a'; - if (toup (a) != target_toup (a)) - __builtin_abort (); - a = 'Z'; - if (toup (a) != target_toup (a)) - __builtin_abort (); - a = 5; - if (toup (a) != target_toup (a)) - __builtin_abort (); - - return 0; -} diff --git a/libgomp/testsuite/libgomp.hsa.c/rotate-1.c b/libgomp/testsuite/libgomp.hsa.c/rotate-1.c deleted file mode 100644 index 494388bd10c..00000000000 --- a/libgomp/testsuite/libgomp.hsa.c/rotate-1.c +++ /dev/null @@ -1,39 +0,0 @@ -#include <assert.h> -#include <limits.h> - -#define T unsigned int -#define BITSIZE CHAR_BIT * sizeof (T) - -#define C1 123u - -#pragma omp declare target -T -rotate (T value, T shift) -{ - T r = (value << shift) | (value >> (BITSIZE - shift)); - return (r >> shift) | (r << (BITSIZE - shift)); -} -#pragma omp end declare target - -int -main (int argc) -{ - T v1, v2, v3, v4, v5; - -#pragma omp target map(to: v1, v2, v3, v4, v5) - { - v1 = rotate (C1, 10); - v2 = rotate (C1, 2); - v3 = rotate (C1, 5); - v4 = rotate (C1, 16); - v5 = rotate (C1, 32); - } - - assert (v1 == C1); - assert (v2 == C1); - assert (v3 == C1); - assert (v4 == C1); - assert (v5 == C1); - - return 0; -} diff --git a/libgomp/testsuite/libgomp.hsa.c/staticvar.c b/libgomp/testsuite/libgomp.hsa.c/staticvar.c deleted file mode 100644 index 6d20c9aa328..00000000000 --- a/libgomp/testsuite/libgomp.hsa.c/staticvar.c +++ /dev/null @@ -1,23 +0,0 @@ -extern void abort (void); - -#pragma omp declare target -int -foo (void) -{ - static int s; - return ++s; -} -#pragma omp end declare target - -int -main () -{ - int r; - #pragma omp target map(from:r) - { - r = foo (); - } - if (r != 1) - abort (); - return 0; -} diff --git a/libgomp/testsuite/libgomp.hsa.c/switch-1.c b/libgomp/testsuite/libgomp.hsa.c/switch-1.c deleted file mode 100644 index a180cf6cb7b..00000000000 --- a/libgomp/testsuite/libgomp.hsa.c/switch-1.c +++ /dev/null @@ -1,145 +0,0 @@ -#include <assert.h> - -#define s 100 - -#pragma omp declare target -int -switch1 (int a) -{ - switch (a) - { - case 1: - return 11; - case 33: - return 333; - case 55: - return 55; - default: - return -1; - } -} - -int -switch2 (int a) -{ - switch (a) - { - case 1 ... 11: - return 11; - break; - case 33: - return 333; - break; - case 55: - return 55; - break; - default: - return -1; - } -} - -int -switch3 (int a) -{ - switch (a) - { - case 1 ... 11: - return 11; - case 12 ... 22: - return 22; - case 23 ... 33: - return 33; - case 34 ... 44: - return 44; - default: - return 44; - } -} - -int -switch4 (int a, int b) -{ - switch (a) - { - case 1 ... 11: - return a; - case 12 ... 22: - return b; - case 23 ... 33: - return a; - case 34 ... 44: - return b; - default: - return 12345; - } -} - -int -switch5 (int a, int b) -{ - switch (a) - { - case 1 ... 2: - return 1; - case 3 ... 4: - return 2; - case 5 ... 6: - return 3; - case 7 ... 11: - return 4; - } - - return -1; -} -#pragma omp end declare target - -int -main (int argc) -{ - int array[s]; - -#pragma omp target map(tofrom : array[:s]) - { - for (int i = 0; i < s; i++) - array[i] = switch1 (i); - } - - for (int i = 0; i < s; i++) - assert (array[i] == switch1 (i)); - -#pragma omp target map(tofrom : array[:s]) - { - for (int i = 0; i < s; i++) - array[i] = switch2 (i); - } - - for (int i = 0; i < s; i++) - assert (array[i] == switch2 (i)); - -#pragma omp target map(tofrom : array[:s]) - { - for (int i = 0; i < s; i++) - array[i] = switch3 (i); - } - - for (int i = 0; i < s; i++) - assert (array[i] == switch3 (i)); - -#pragma omp target map(tofrom : array[:s]) - { - for (int i = 0; i < s; i++) - array[i] = switch4 (i, i + 1); - } - - for (int i = 0; i < s; i++) - assert (array[i] == switch4 (i, i + 1)); - -#pragma omp target map(tofrom : array[:s]) - { - for (int i = 0; i < s; i++) - array[i] = switch5 (i, i + 1); - } - - for (int i = 0; i < s; i++) - assert (array[i] == switch5 (i, i + 1)); -} diff --git a/libgomp/testsuite/libgomp.hsa.c/switch-branch-1.c b/libgomp/testsuite/libgomp.hsa.c/switch-branch-1.c deleted file mode 100644 index 9af1d6d0762..00000000000 --- a/libgomp/testsuite/libgomp.hsa.c/switch-branch-1.c +++ /dev/null @@ -1,116 +0,0 @@ -#include <assert.h> - -#define s 100 - -#pragma omp declare target -int -switch1 (unsigned a) -{ - switch (a) - { - case 1 ... 11: - return 11; - case 12 ... 13: - return 22; - default: - return 44; - } -} - -int -switch2 (unsigned a) -{ - switch (a) - { - case 1 ... 5: - return 1; - case 9 ... 11: - return a + 3; - case 12 ... 13: - return a + 3; - default: - return 44; - } -} - -#define OFFSET 12 - -int -switch3 (unsigned a) -{ - switch (a) - { - case (OFFSET + 0): - return 1; - case (OFFSET + 1)...(OFFSET + 11): - return 11; - case (OFFSET + 12)...(OFFSET + 13): - return (OFFSET + 22); - default: - return (OFFSET + 44); - } -} - -int -switch4 (unsigned a) -{ - switch (a) - { - case -2: - return 1; - case -1: - return a + 3; - case 3: - return a + 3; - default: - return 44; - } -} -#pragma omp end declare target - -#define low -33 -#define high 55 - -int -main (int argc) -{ - int array[s]; - -#pragma omp target map(tofrom : array[:s]) - { - for (int i = low; i < high; i++) - array[i - low] = switch1 (i); - } - - for (int i = low; i < high; i++) - assert (array[i - low] == switch1 (i)); - -#pragma omp target map(tofrom : array[:s]) - { - for (int i = low; i < high; i++) - array[i - low] = switch2 (i); - } - - for (int i = low; i < high; i++) - assert (array[i - low] == switch2 (i)); - -#pragma omp target map(tofrom : array[:s]) - { - for (int i = low; i < high; i++) - array[i - low] = switch3 (i); - } - - for (int i = low; i < high; i++) - assert (array[i - low] == switch3 (i)); - -#pragma omp target map(tofrom : array[:s]) - { - for (int i = low; i < high; i++) - array[i - low] = switch4 (i); - } - - for (int i = low; i < high; i++) - assert (array[i - low] == switch4 (i)); - - return 0; -} diff --git a/libgomp/testsuite/libgomp.hsa.c/switch-sbr-2.c b/libgomp/testsuite/libgomp.hsa.c/switch-sbr-2.c deleted file mode 100644 index 06990d1c2c0..00000000000 --- a/libgomp/testsuite/libgomp.hsa.c/switch-sbr-2.c +++ /dev/null @@ -1,59 +0,0 @@ -/* { dg-additional-options "-fno-tree-switch-conversion" } */ - -#pragma omp declare target -int -foo (unsigned a) -{ - switch (a) - { - case 1 ... 5: - return 1; - case 9 ... 11: - return a + 3; - case 12 ... 13: - return a + 3; - default: - return 44; - } -} -#pragma omp end declare target - -#define s 100 - -void __attribute__((noinline, noclone)) -verify(int *a) -{ - if (a[0] != 44) - __builtin_abort (); - - for (int i = 1; i <= 5; i++) - if (a[i] != 1) - __builtin_abort (); - - for (int i = 6; i <= 8; i++) - if (a[i] != 44) - __builtin_abort (); - - for (int i = 9; i <= 13; i++) - if (a[i] != i + 3) - __builtin_abort (); - - for (int i = 14; i < s; i++) - if (a[i] != 44) - __builtin_abort (); -} - -int main(int argc) -{ - int array[s]; -#pragma omp target - { - for (int i = 0; i < s; i++) - { - int v = foo (i); - array[i] = v; - } - } - verify (array); - return 0; -} diff --git a/libgomp/testsuite/libgomp.hsa.c/tiling-1.c b/libgomp/testsuite/libgomp.hsa.c/tiling-1.c deleted file mode 100644 index 9149adc04e9..00000000000 --- a/libgomp/testsuite/libgomp.hsa.c/tiling-1.c +++ /dev/null @@ -1,212 +0,0 @@ -/* - - matmul.c : Matrix Multiplication with tiling for openmp4 example - -*/ - -#include <stdlib.h> -#include <math.h> - -#define BLOCK_SIZE 16 -/* - #define BLOCK_SIZE 32 -*/ -#define NSECPERSEC 1000000000L - -typedef struct { - int width; - int height; - int stride; - int hpad; - float* elements; -} Matrix; - -/* Correctly extract the number of nanoseconds from the two time structures */ -long int get_nanosecs( struct timespec start_time, struct timespec end_time) { - long int nanosecs; - if ((end_time.tv_nsec-start_time.tv_nsec)<0) nanosecs = - ((((long int) end_time.tv_sec- (long int) start_time.tv_sec )-1)*NSECPERSEC ) + - ( NSECPERSEC + (long int) end_time.tv_nsec - (long int) start_time.tv_nsec) ; - else nanosecs = - (((long int) end_time.tv_sec- (long int) start_time.tv_sec )*NSECPERSEC ) + - ( (long int) end_time.tv_nsec - (long int) start_time.tv_nsec ); - return nanosecs; -} - -void simple_sgemm_tt(const int M,const int N,const int K,const float alpha, const float* A,const int LDA, - const float* B,const int LDB, const float beta,float* C, const int LDC) ; -void simple_sgemm_tn(const int M,const int N,const int K,const float alpha, const float* A,const int LDA, - const float* B,const int LDB, const float beta,float* C, const int LDC) ; -void tiled_sgemm_tt(const int M,const int N,const int K,const float alpha, const float*A, const int LDA, - const float* B,const int LDB, const float beta,float* C, const int LDC) ; - -int verify(float* v_res, float* v_ref, int len) { - int passed = 1; - int i; - for (i = 0; i < len; ++i) { - if (fabs(v_res[i] - v_ref[i]) > 0.001*v_ref[i]) { - __builtin_abort (); - } - } - return passed; -} - - -int main(int argc, char* argv[]){ - - Matrix A,B,Bt,C,Cref; - int a1,a2,a3,i,j; - struct timespec start_time1, end_time1; - struct timespec start_time2, end_time2; - long int nanosecs,total_ops; - float gflopsTiled,gflopsCPU; - - a1 = 35; - a2 = 28; - a3 = 47; - - A.height = a1; - A.width = a2; - A.stride = (((A.width-1)/BLOCK_SIZE)+1) * BLOCK_SIZE; - A.hpad = (((A.height-1)/BLOCK_SIZE)+1) * BLOCK_SIZE; - A.elements = (float*)malloc(A.stride * A.hpad* sizeof(float)); - - B.height = a2; - B.width = a3; - B.stride = (((B.width-1)/BLOCK_SIZE)+1) * BLOCK_SIZE; - B.hpad = (((B.height-1)/BLOCK_SIZE)+1) * BLOCK_SIZE; - B.elements = (float*)malloc(B.stride * B.hpad * sizeof(float)); - - /* Bt is same as B but stored in column-major order */ - Bt.height = B.height; - Bt.width = B.width; - Bt.stride = B.stride; - Bt.hpad = B.hpad; - Bt.elements = (float*)malloc(Bt.stride * Bt.hpad * sizeof(float)); - - C.height = a1; - C.width = a3; - C.stride = (((C.width-1)/BLOCK_SIZE)+1) * BLOCK_SIZE; - C.hpad = (((C.height-1)/BLOCK_SIZE)+1) * BLOCK_SIZE; - C.elements = (float*)malloc(C.stride * C.hpad * sizeof(float)); - - Cref.height = a1; - Cref.width = a3; - Cref.stride = (((Cref.width-1)/BLOCK_SIZE)+1) * BLOCK_SIZE; - Cref.hpad = (((Cref.height-1)/BLOCK_SIZE)+1) * BLOCK_SIZE; - Cref.elements = (float*)malloc(Cref.stride * Cref.hpad * sizeof(float)); - - for(i = 0; i < A.hpad ; i++) - for(j = 0; j < A.stride; j++) { - if (( j<A.width ) && (i<A.height)) { - A.elements[i*A.stride + j] = (i % 3); - } else { - A.elements[i*A.stride + j] = 0.0; - } - } - - /* Initialize B and Bt */ - for(i = 0; i < B.hpad ; i++) - for(j = 0; j < B.stride; j++) { - if (( j<B.width ) && (i<B.height)) { - B.elements[i*B.stride+j] = (j % 2); - Bt.elements[j*Bt.stride+i] = B.elements[i*B.stride+j] ; - } else { - B.elements[i*B.stride+j] = 0.0; - Bt.elements[j*Bt.stride+i] = 0.0; - } - } - - /* zero C, and Cref */ - for(i = 0; i < C.hpad; i++) - for(j = 0; j < C.stride; j++) { - C.elements[i*C.stride+j] = 0.0; - Cref.elements[i*Cref.stride+j] = 0.0; - } - - simple_sgemm_tt(A.height,B.width,B.height,1.0,A.elements,A.stride,B.elements,B.stride,1.0,Cref.elements,Cref.stride); - tiled_sgemm_tt(A.height,B.width,B.height,1.0,A.elements,A.stride,B.elements,B.stride,1.0,C.elements,C.stride); - - verify(C.elements, Cref.elements, C.height * C.stride); - return 0; -} - -void simple_sgemm_tt(const int M,const int N,const int K,const float alpha, const float* A,const int LDA, -const float* B,const int LDB, const float beta,float* C, const int LDC) { - /* A,B, and C are in row-major order */ - int c_row,c_col,inner; - float sum; - for (c_col = 0 ; c_col<N; c_col++ ) { - for (c_row = 0 ; c_row<M; c_row++ ) { - sum = 0.0 ; - for (inner = 0 ; inner<K; inner++ ) { - sum += A[c_row*LDA + inner] * B[inner*LDB + c_col] ; - } - C[c_row*LDC + c_col] = alpha*sum + beta*C[ c_row*LDC + c_col] ; - } - } -} - -/*************************** - - tiled_sgemm_tt: Tiled matrix multiplication: - -***************************/ - -void tiled_sgemm_tt(const int M, const int N, const int K, const float alpha, const float*A, const int LDA, - const float*B, const int LDB, const float beta, float*C, const int LDC){ - -#pragma omp target teams map(to:A[M*K],B[K*N]) map(from:C[M*N]) -#pragma omp distribute collapse(2) - for (int C_row_start=0 ; C_row_start < M ; C_row_start+=BLOCK_SIZE) - for (int C_col_start=0 ; C_col_start < N ; C_col_start+=BLOCK_SIZE) - { -// Each team has a local copy of these mini matrices - float As[BLOCK_SIZE][BLOCK_SIZE]; - float Bs[BLOCK_SIZE][BLOCK_SIZE]; -#pragma omp parallel - { - int C_row, C_col; - float Cval = 0.0; - - for (int kblock = 0; kblock < K ; kblock += BLOCK_SIZE ) - { -#pragma omp for collapse(2) - for (int row=0 ; row < BLOCK_SIZE ; row++) - for (int col=0 ; col < BLOCK_SIZE ; col++) - { - C_row = C_row_start + row; - C_col = C_col_start + col; - if ((C_row < M) && (kblock + col < K)) - As[row][col] = A[(C_row*LDA)+ kblock + col]; - else - As[row][col] = 0; - if ((kblock + row < K) && C_col < N) - Bs[row][col] = B[((kblock+row)*LDB)+ C_col]; - else - Bs[row][col] = 0; - } - -#pragma omp for collapse(2) - for (int row=0 ; row < BLOCK_SIZE ; row++) - for (int col=0 ; col < BLOCK_SIZE ; col++) - { - for (int e = 0; e < BLOCK_SIZE; ++e) - Cval += As[row][e] * Bs[e][col]; - } - } /* End for kblock .. */ - - -#pragma omp for collapse(2) - for (int row=0 ; row < BLOCK_SIZE ; row++) - for (int col=0 ; col < BLOCK_SIZE ; col++) - { - C_row = C_row_start + row; - C_col = C_col_start + col; - if ((C_row < M) && (C_col < N)) - C[(C_row*LDC)+C_col] = alpha*Cval + beta*C[(C_row*LDC)+C_col]; - - } - } /* end parallel */ - } /* end target teams distribute */ -} diff --git a/libgomp/testsuite/libgomp.hsa.c/tiling-2.c b/libgomp/testsuite/libgomp.hsa.c/tiling-2.c deleted file mode 100644 index 2756d14ca82..00000000000 --- a/libgomp/testsuite/libgomp.hsa.c/tiling-2.c +++ /dev/null @@ -1,258 +0,0 @@ -/* - - matmul.c : Matrix Multiplication with tiling for openmp4 example - -*/ - -#include <stdlib.h> -#include <math.h> - -#define BLOCK_SIZE 16 -/* - #define BLOCK_SIZE 32 -*/ -#define NSECPERSEC 1000000000L - -typedef struct { - int width; - int height; - int stride; - int hpad; - float* elements; -} Matrix; - -/* Correctly extract the number of nanoseconds from the two time structures */ -long int get_nanosecs( struct timespec start_time, struct timespec end_time) { - long int nanosecs; - if ((end_time.tv_nsec-start_time.tv_nsec)<0) nanosecs = - ((((long int) end_time.tv_sec- (long int) start_time.tv_sec )-1)*NSECPERSEC ) + - ( NSECPERSEC + (long int) end_time.tv_nsec - (long int) start_time.tv_nsec) ; - else nanosecs = - (((long int) end_time.tv_sec- (long int) start_time.tv_sec )*NSECPERSEC ) + - ( (long int) end_time.tv_nsec - (long int) start_time.tv_nsec ); - return nanosecs; -} - -void simple_sgemm_tt(const int M,const int N,const int K,const float alpha, const float* A,const int LDA, - const float* B,const int LDB, const float beta,float* C, const int LDC) ; -void simple_sgemm_tn(const int M,const int N,const int K,const float alpha, const float* A,const int LDA, - const float* B,const int LDB, const float beta,float* C, const int LDC) ; -void tiled_sgemm_tt(const int M,const int N,const int K,const float alpha, const float*A, const int LDA, - const float* B,const int LDB, const float beta,float* C, const int LDC) ; - -int verify(float* v_res, float* v_ref, int len) { - int passed = 1; - int i; - for (i = 0; i < len; ++i) { - if (fabs(v_res[i] - v_ref[i]) > 0.001*v_ref[i]) { - __builtin_abort (); - } - } - return passed; -} - - -int main(int argc, char* argv[]){ - - Matrix A,B,Bt,C,Cref; - int a1,a2,a3,i,j; - struct timespec start_time1, end_time1; - struct timespec start_time2, end_time2; - long int nanosecs,total_ops; - float gflopsTiled,gflopsCPU; - - a1 = 35; - a2 = 28; - a3 = 47; - - A.height = a1; - A.width = a2; - A.stride = (((A.width-1)/BLOCK_SIZE)+1) * BLOCK_SIZE; - A.hpad = (((A.height-1)/BLOCK_SIZE)+1) * BLOCK_SIZE; - A.elements = (float*)malloc(A.stride * A.hpad* sizeof(float)); - - B.height = a2; - B.width = a3; - B.stride = (((B.width-1)/BLOCK_SIZE)+1) * BLOCK_SIZE; - B.hpad = (((B.height-1)/BLOCK_SIZE)+1) * BLOCK_SIZE; - B.elements = (float*)malloc(B.stride * B.hpad * sizeof(float)); - - /* Bt is same as B but stored in column-major order */ - Bt.height = B.height; - Bt.width = B.width; - Bt.stride = B.stride; - Bt.hpad = B.hpad; - Bt.elements = (float*)malloc(Bt.stride * Bt.hpad * sizeof(float)); - - C.height = a1; - C.width = a3; - C.stride = (((C.width-1)/BLOCK_SIZE)+1) * BLOCK_SIZE; - C.hpad = (((C.height-1)/BLOCK_SIZE)+1) * BLOCK_SIZE; - C.elements = (float*)malloc(C.stride * C.hpad * sizeof(float)); - - Cref.height = a1; - Cref.width = a3; - Cref.stride = (((Cref.width-1)/BLOCK_SIZE)+1) * BLOCK_SIZE; - Cref.hpad = (((Cref.height-1)/BLOCK_SIZE)+1) * BLOCK_SIZE; - Cref.elements = (float*)malloc(Cref.stride * Cref.hpad * sizeof(float)); - - for(i = 0; i < A.hpad ; i++) - for(j = 0; j < A.stride; j++) { - if (( j<A.width ) && (i<A.height)) { - A.elements[i*A.stride + j] = (i % 3); - } else { - A.elements[i*A.stride + j] = 0.0; - } - } - - /* Initialize B and Bt */ - for(i = 0; i < B.hpad ; i++) - for(j = 0; j < B.stride; j++) { - if (( j<B.width ) && (i<B.height)) { - B.elements[i*B.stride+j] = (j % 2); - Bt.elements[j*Bt.stride+i] = B.elements[i*B.stride+j] ; - } else { - B.elements[i*B.stride+j] = 0.0; - Bt.elements[j*Bt.stride+i] = 0.0; - } - } - - /* zero C, and Cref */ - for(i = 0; i < C.hpad; i++) - for(j = 0; j < C.stride; j++) { - C.elements[i*C.stride+j] = 0.0; - Cref.elements[i*Cref.stride+j] = 0.0; - } - - simple_sgemm_tt(A.height,B.width,B.height,1.0,A.elements,A.stride,B.elements,B.stride,1.0,Cref.elements,Cref.stride); - tiled_sgemm_tt(A.height,B.width,B.height,1.0,A.elements,A.stride,B.elements,B.stride,1.0,C.elements,C.stride); - - verify(C.elements, Cref.elements, C.height * C.stride); - return 0; -} - -void simple_sgemm_tt(const int M,const int N,const int K,const float alpha, const float* A,const int LDA, -const float* B,const int LDB, const float beta,float* C, const int LDC) { - /* A,B, and C are in row-major order */ - int c_row,c_col,inner; - float sum; - for (c_col = 0 ; c_col<N; c_col++ ) { - for (c_row = 0 ; c_row<M; c_row++ ) { - sum = 0.0 ; - for (inner = 0 ; inner<K; inner++ ) { - sum += A[c_row*LDA + inner] * B[inner*LDB + c_col] ; - } - C[c_row*LDC + c_col] = alpha*sum + beta*C[ c_row*LDC + c_col] ; - } - } -} - -/*************************** - - tiled_sgemm_tt: Tiled matrix multiplication: - -***************************/ - -void tiled_sgemm_tt(const int M, const int N, const int K, const float alpha, const float*A, const int LDA, - const float*B, const int LDB, const float beta, float*C, const int LDC){ - -#pragma omp target teams map(to:A[M*K],B[K*N]) map(from:C[M*N]) -#pragma omp distribute collapse(2) - for (int C_row_start=0 ; C_row_start < M ; C_row_start+=BLOCK_SIZE) { - for (int C_col_start=0 ; C_col_start < N ; C_col_start+=BLOCK_SIZE) { - -// We now have M/BLOCK_SIZE * N/BLOCK_SIZE teams = (M*N)/(BLOCK_SIZE*BLOCK_SIZE) -// The grid global dimensions are M,N,1 -// The grid local dimensions are BLOCK_SIZE,BLOCK_SIZE,1 - -// ------------------------------------------------------------------- -// The rest of this code forms the HSAIL kernel with the -// pairs of "parallel for collapse(2)" loops replaced with a barrier. -// The kernel initializes these values -// C_row_start = get_group_id(0) * BLOCK_SIZE -// C_col_start = get_group_id(1) * BLOCK_SIZE -// row=get_local_id(0) -// col=get_local_id(1) -// ------------------------------------------------------------------- - -// Each team has a local copy of these mini matrices - float As[BLOCK_SIZE][BLOCK_SIZE]; - float Bs[BLOCK_SIZE][BLOCK_SIZE]; - float Cs[BLOCK_SIZE][BLOCK_SIZE]; - int C_row, C_col; - - /* Zero Cs for this BLOCK */ -// - - - - - - - - - - - - - - - - - - - - -// REPLACE NEXT THREE LINES WITH A BARRIER -#pragma omp parallel for collapse(2) - for (int row=0 ; row < BLOCK_SIZE ; row++) { - for (int col=0 ; col < BLOCK_SIZE ; col++) { -// END BARRIER -// - - - - - - - - - - - - - - - - - - - - - Cs[row][col] = 0.0; - } - } - - // This kblock loop is run on the master thread of each team - for (int kblock = 0; kblock < K ; kblock += BLOCK_SIZE ) { - - // Copy global memory values to local memory -// - - - - - - - - - - - - - - - - - - - - -// REPLACE NEXT THREE LINES WITH A BARRIER -#pragma omp parallel for collapse(2) - for (int row=0 ; row < BLOCK_SIZE ; row++) { - for (int col=0 ; col < BLOCK_SIZE ; col++) { -// END BARRIER -// - - - - - - - - - - - - - - - - - - - - - C_row = C_row_start + row; - C_col = C_col_start + col; - if ((C_row < M) && (kblock + col < K)) - As[row][col] = A[(C_row*LDA)+ kblock + col]; - else - As[row][col] = 0; - if ((kblock + row < K) && C_col < N) - Bs[row][col] = B[((kblock+row)*LDB)+ C_col]; - else - Bs[row][col] = 0; - } - } - - // Calculate Cs <- Sum(As X Bs) across all kblocks -// - - - - - - - - - - - - - - - - - - - - -// REPLACE NEXT THREE LINES WITH A BARRIER -#pragma omp parallel for collapse(2) - for (int row=0 ; row < BLOCK_SIZE ; row++) { - for (int col=0 ; col < BLOCK_SIZE ; col++) { -// END BARRIER -// - - - - - - - - - - - - - - - - - - - - - for (int e = 0; e < BLOCK_SIZE; ++e) - Cs[row][col] += As[row][e] * Bs[e][col]; - } - } - - } /* End for kblock .. */ - - - // Scale Update actual C from Cs -// - - - - - - - - - - - - - - - - - - - - -// REPLACE NEXT THREE LINES WITH A BARRIER -#pragma omp parallel for collapse(2) - for (int row=0 ; row < BLOCK_SIZE ; row++) { - for (int col=0 ; col < BLOCK_SIZE ; col++) { -// END BARRIER -// - - - - - - - - - - - - - - - - - - - - - C_row = C_row_start + row; - C_col = C_col_start + col; - if ((C_row < M) && (C_col < N)) { - C[(C_row*LDC)+C_col] = alpha*Cs[row][col] + beta*C[(C_row*LDC)+C_col]; - } - } - } - -// ------------------------------------------------------------------- -// This is the end of the kernel - - } - } - -} |