Diffstat (limited to 'doc/source/reference')
-rw-r--r-- | doc/source/reference/arrays.interface.rst | 4
-rw-r--r-- | doc/source/reference/c-api/iterator.rst | 60
-rw-r--r-- | doc/source/reference/distutils.rst | 8
-rw-r--r-- | doc/source/reference/distutils_status_migration.rst | 6
-rw-r--r-- | doc/source/reference/random/index.rst | 3
-rw-r--r-- | doc/source/reference/random/parallel.rst | 83
-rw-r--r-- | doc/source/reference/routines.other.rst | 1 |
7 files changed, 158 insertions, 7 deletions
diff --git a/doc/source/reference/arrays.interface.rst b/doc/source/reference/arrays.interface.rst
index 904d0132b..74432c8a7 100644
--- a/doc/source/reference/arrays.interface.rst
+++ b/doc/source/reference/arrays.interface.rst
@@ -125,7 +125,7 @@ This approach to the interface consists of the object having an
        **Default**: ``[('', typestr)]``
 
    **data** (optional)
-       A 2-tuple whose first argument is a :doc:`Python integer <c-api/long>`
+       A 2-tuple whose first argument is a :doc:`Python integer <python:c-api/long>`
        that points to the data-area storing the array contents.
 
        .. note::
@@ -253,7 +253,7 @@ flag is present.
 .. note::
 
     :obj:`__array_struct__` is considered legacy and should not be used for new
-    code. Use the :py:doc:`buffer protocol <c-api/buffer>` or the DLPack protocol
+    code. Use the :doc:`buffer protocol <python:c-api/buffer>` or the DLPack protocol
     `numpy.from_dlpack` instead.
 
 
diff --git a/doc/source/reference/c-api/iterator.rst b/doc/source/reference/c-api/iterator.rst
index b4adaef9b..07187e7f1 100644
--- a/doc/source/reference/c-api/iterator.rst
+++ b/doc/source/reference/c-api/iterator.rst
@@ -203,6 +203,66 @@ is used to control the memory layout of the allocated result, typically
     }
 
 
+Multi Index Tracking Example
+----------------------------
+
+This example shows you how to work with the :c:data:`NPY_ITER_MULTI_INDEX` flag. For simplicity, we assume the argument is a two-dimensional array.
+
+.. code-block:: c
+
+    int PrintMultiIndex(PyArrayObject *arr) {
+        NpyIter *iter;
+        NpyIter_IterNextFunc *iternext;
+        npy_intp multi_index[2];
+
+        iter = NpyIter_New(
+            arr, NPY_ITER_READONLY | NPY_ITER_MULTI_INDEX | NPY_ITER_REFS_OK,
+            NPY_KEEPORDER, NPY_NO_CASTING, NULL);
+        if (iter == NULL) {
+            return -1;
+        }
+        if (NpyIter_GetNDim(iter) != 2) {
+            NpyIter_Deallocate(iter);
+            PyErr_SetString(PyExc_ValueError, "Array must be 2-D");
+            return -1;
+        }
+        if (NpyIter_GetIterSize(iter) != 0) {
+            iternext = NpyIter_GetIterNext(iter, NULL);
+            if (iternext == NULL) {
+                NpyIter_Deallocate(iter);
+                return -1;
+            }
+            NpyIter_GetMultiIndexFunc *get_multi_index =
+                NpyIter_GetGetMultiIndex(iter, NULL);
+            if (get_multi_index == NULL) {
+                NpyIter_Deallocate(iter);
+                return -1;
+            }
+
+            do {
+                get_multi_index(iter, multi_index);
+                printf("multi_index is [%" NPY_INTP_FMT ", %" NPY_INTP_FMT "]\n",
+                       multi_index[0], multi_index[1]);
+            } while (iternext(iter));
+        }
+        if (!NpyIter_Deallocate(iter)) {
+            return -1;
+        }
+        return 0;
+    }
+
+When called with a 2x3 array, the above example prints:
+
+.. code-block:: sh
+
+    multi_index is [0, 0]
+    multi_index is [0, 1]
+    multi_index is [0, 2]
+    multi_index is [1, 0]
+    multi_index is [1, 1]
+    multi_index is [1, 2]
+
+
 Iterator Data Types
 ---------------------
 
diff --git a/doc/source/reference/distutils.rst b/doc/source/reference/distutils.rst
index ff1ba3b0d..a5991f2c1 100644
--- a/doc/source/reference/distutils.rst
+++ b/doc/source/reference/distutils.rst
@@ -9,6 +9,14 @@ Packaging (:mod:`numpy.distutils`)
    ``numpy.distutils`` is deprecated, and will be removed for Python >= 3.12.
    For more details, see :ref:`distutils-status-migration`
 
+.. warning::
+
+   Note that ``setuptools`` does major releases often and those may contain
+   changes that break ``numpy.distutils``, which will *not* be updated anymore
+   for new ``setuptools`` versions. It is therefore recommended to set an
+   upper version bound in your build configuration for the last known version
+   of ``setuptools`` that works with your build.
+
 NumPy provides enhanced distutils functionality to make it easier to
 build and install sub-packages, auto-generate code, and extension
 modules that use Fortran-compiled libraries. To use features of NumPy
diff --git a/doc/source/reference/distutils_status_migration.rst b/doc/source/reference/distutils_status_migration.rst
index f5f4dbb29..eda4790b5 100644
--- a/doc/source/reference/distutils_status_migration.rst
+++ b/doc/source/reference/distutils_status_migration.rst
@@ -32,7 +32,7 @@ recommend:
 If you have modest needs (only simple Cython/C extensions, and perhaps nested
 ``setup.py`` files) and have been happy with ``numpy.distutils`` so far, you
 can also consider switching to ``setuptools``. Note that most functionality of
-``numpy.disutils`` is unlikely to be ported to ``setuptools``.
+``numpy.distutils`` is unlikely to be ported to ``setuptools``.
 
 
 Moving to Meson
@@ -111,8 +111,8 @@ For more details, see the
 
 .. _numpy-setuptools-interaction:
 
-Interaction of ``numpy.disutils`` with ``setuptools``
------------------------------------------------------
+Interaction of ``numpy.distutils`` with ``setuptools``
+------------------------------------------------------
 
 It is recommended to use ``setuptools < 60.0``. Newer versions may work, but
 are not guaranteed to. The reason for this is that ``setuptools`` 60.0 enabled
diff --git a/doc/source/reference/random/index.rst b/doc/source/reference/random/index.rst
index 674799d47..83a27d80c 100644
--- a/doc/source/reference/random/index.rst
+++ b/doc/source/reference/random/index.rst
@@ -216,9 +216,10 @@ Parallel Generation
 ~~~~~~~~~~~~~~~~~~~
 
 The included generators can be used in parallel, distributed applications in
-one of three ways:
+a number of ways:
 
 * :ref:`seedsequence-spawn`
+* :ref:`sequence-of-seeds`
 * :ref:`independent-streams`
 * :ref:`parallel-jumped`
 
diff --git a/doc/source/reference/random/parallel.rst b/doc/source/reference/random/parallel.rst
index bff955948..b625d34b7 100644
--- a/doc/source/reference/random/parallel.rst
+++ b/doc/source/reference/random/parallel.rst
@@ -1,7 +1,7 @@
 Parallel Random Number Generation
 =================================
 
-There are three strategies implemented that can be used to produce
+There are four main strategies implemented that can be used to produce
 repeatable pseudo-random numbers across multiple processes (local
 or distributed).
 
@@ -109,6 +109,87 @@ territory ([2]_).
 .. _`not unique to numpy`: https://www.iro.umontreal.ca/~lecuyer/myftp/papers/parallel-rng-imacs.pdf
 
 
+.. _sequence-of-seeds:
+
+Sequence of Integer Seeds
+-------------------------
+
+As discussed in the previous section, `~SeedSequence` can not only take an
+integer seed, it can also take an arbitrary-length sequence of (non-negative)
+integers. If one exercises a little care, one can use this feature to design
+*ad hoc* schemes for getting safe parallel PRNG streams with similar safety
+guarantees as spawning.
+
+For example, one common use case is that a worker process is passed one
+root seed integer for the whole calculation and also an integer worker ID (or
+something more granular like a job ID, batch ID, or something similar). If
+these IDs are created deterministically and uniquely, then one can derive
+reproducible parallel PRNG streams by combining the ID and the root seed
+integer in a list.
+
+.. code-block:: python
+
+    # default_rng() and each of the BitGenerators use SeedSequence underneath, so
+    # they all accept sequences of integers as seeds the same way.
+    from numpy.random import default_rng
+
+    def worker(root_seed, worker_id):
+        rng = default_rng([worker_id, root_seed])
+        # Do work ...
+
+    root_seed = 0x8c3c010cb4754c905776bdac5ee7501
+    results = [worker(root_seed, worker_id) for worker_id in range(10)]
+
+.. end_block
+
+This can be used to replace a number of unsafe strategies that have been used
+in the past which try to combine the root seed and the ID back into a single
+integer seed value. For example, it is common to see users add the worker ID to
+the root seed, especially with the legacy `~RandomState` code.
+
+.. code-block:: python
+
+    # UNSAFE! Do not do this!
+    worker_seed = root_seed + worker_id
+    rng = np.random.RandomState(worker_seed)
+
+.. end_block
+
+It is true that for any one run of a parallel program constructed this way,
+each worker will have distinct streams. However, it is quite likely that
+multiple invocations of the program with different seeds will get overlapping
+sets of worker seeds. It is not uncommon (in the author's self-experience) to
+change the root seed merely by an increment or two when doing these repeat
+runs. If the worker seeds are also derived by small increments of the worker
+ID, then subsets of the workers will return identical results, causing a bias
+in the overall ensemble of results.
+
+Combining the worker ID and the root seed as a list of integers eliminates this
+risk. Lazy seeding practices will still be fairly safe.
+
+This scheme does require that the extra IDs be unique and deterministically
+created. This may require coordination between the worker processes. It is
+recommended to place the varying IDs *before* the unvarying root seed.
+`~SeedSequence.spawn` *appends* integers after the user-provided seed, so if
+you might be mixing both this *ad hoc* mechanism and spawning, or passing your
+objects down to library code that might be spawning, then it is a little bit
+safer to prepend your worker IDs rather than append them to avoid a collision.
+
+.. code-block:: python
+
+    # Good.
+    worker_seed = [worker_id, root_seed]
+
+    # Less good. It will *work*, but it's less flexible.
+    worker_seed = [root_seed, worker_id]
+
+.. end_block
+
+With those caveats in mind, the safety guarantees against collision are about
+the same as with spawning, discussed in the previous section. The algorithmic
+mechanisms are the same.
+
+
 .. _independent-streams:
 
 Independent Streams
diff --git a/doc/source/reference/routines.other.rst b/doc/source/reference/routines.other.rst
index bb0be7137..e980406eb 100644
--- a/doc/source/reference/routines.other.rst
+++ b/doc/source/reference/routines.other.rst
@@ -45,6 +45,7 @@ Utility
 
    get_include
    show_config
+   show_runtime
    deprecate
    deprecate_with_doc
    broadcast_shapes