Diffstat (limited to 'doc/source/reference')
-rw-r--r-- | doc/source/reference/arrays.nditer.rst | 22
-rw-r--r-- | doc/source/reference/c-api/data_memory.rst | 84
-rw-r--r-- | doc/source/reference/c-api/deprecations.rst | 4
-rw-r--r-- | doc/source/reference/c-api/types-and-structures.rst | 172
-rw-r--r-- | doc/source/reference/distutils_guide.rst | 2
-rw-r--r-- | doc/source/reference/global_state.rst | 10
-rw-r--r-- | doc/source/reference/index.rst | 2
-rw-r--r-- | doc/source/reference/internals.code-explanations.rst | 2
-rw-r--r-- | doc/source/reference/maskedarray.baseclass.rst | 14
-rw-r--r-- | doc/source/reference/random/bit_generators/index.rst | 2
-rw-r--r-- | doc/source/reference/random/compatibility.rst | 87
-rw-r--r-- | doc/source/reference/random/index.rst | 309
-rw-r--r-- | doc/source/reference/random/legacy.rst | 15
-rw-r--r-- | doc/source/reference/random/new-or-different.rst | 56
-rw-r--r-- | doc/source/reference/routines.ctypeslib.rst | 2
-rw-r--r-- | doc/source/reference/routines.datetime.rst | 4
-rw-r--r-- | doc/source/reference/routines.io.rst | 2
17 files changed, 433 insertions, 356 deletions
diff --git a/doc/source/reference/arrays.nditer.rst b/doc/source/reference/arrays.nditer.rst index 8cabc1a06..fb1c91a07 100644 --- a/doc/source/reference/arrays.nditer.rst +++ b/doc/source/reference/arrays.nditer.rst @@ -3,7 +3,7 @@ .. _arrays.nditer: ********************* -Iterating Over Arrays +Iterating over arrays ********************* .. note:: @@ -23,7 +23,7 @@ can accelerate the inner loop in Cython. Since the Python exposure of iterator API, these ideas will also provide help working with array iteration from C or C++. -Single Array Iteration +Single array iteration ====================== The most basic task that can be done with the :class:`nditer` is to @@ -64,7 +64,7 @@ namely the order they are stored in memory, whereas the elements of `a.T.copy(order='C')` get visited in a different order because they have been put into a different memory layout. -Controlling Iteration Order +Controlling iteration order --------------------------- There are times when it is important to visit the elements of an array @@ -88,7 +88,7 @@ order='C' for C order and order='F' for Fortran order. .. _nditer-context-manager: -Modifying Array Values +Modifying array values ---------------------- By default, the :class:`nditer` treats the input operand as a read-only @@ -128,7 +128,7 @@ note that prior to 1.15, :class:`nditer` was not a context manager and did not have a `close` method. Instead it relied on the destructor to initiate the writeback of the buffer. -Using an External Loop +Using an external loop ---------------------- In all the examples so far, the elements of `a` are provided by the @@ -161,7 +161,7 @@ elements each. ... [0 3] [1 4] [2 5] -Tracking an Index or Multi-Index +Tracking an index or multi-index -------------------------------- During iteration, you may want to use the index of the current @@ -210,7 +210,7 @@ raise an exception. File "<stdin>", line 1, in <module> ValueError: Iterator flag EXTERNAL_LOOP cannot be used if an index or multi-index is being tracked -Alternative Looping and Element Access +Alternative looping and element access -------------------------------------- To make its properties more readily accessible during iteration, @@ -246,7 +246,7 @@ produce identical results to the ones in the previous section. array([[ 0, 1, 2], [-1, 0, 1]]) -Buffering the Array Elements +Buffering the array elements ---------------------------- When forcing an iteration order, we observed that the external loop @@ -274,7 +274,7 @@ is enabled. ... [0 3 1 4 2 5] -Iterating as a Specific Data Type +Iterating as a specific data type --------------------------------- There are times when it is necessary to treat an array as a different @@ -382,7 +382,7 @@ would violate the casting rule. File "<stdin>", line 2, in <module> TypeError: Iterator requested dtype could not be cast from dtype('float64') to dtype('int64'), the operand 0 dtype, according to the rule 'same_kind' -Broadcasting Array Iteration +Broadcasting array iteration ============================ NumPy has a set of rules for dealing with arrays that have differing @@ -417,7 +417,7 @@ which includes the input shapes to help diagnose the problem. ... 
ValueError: operands could not be broadcast together with shapes (2,) (2,3) -Iterator-Allocated Output Arrays +Iterator-allocated output arrays -------------------------------- A common case in NumPy functions is to have outputs allocated based diff --git a/doc/source/reference/c-api/data_memory.rst b/doc/source/reference/c-api/data_memory.rst index 2084ab5d0..cab587efc 100644 --- a/doc/source/reference/c-api/data_memory.rst +++ b/doc/source/reference/c-api/data_memory.rst @@ -159,3 +159,87 @@ A better technique would be to use a ``PyCapsule`` as a base object: return NULL; } ... + +Example of memory tracing with ``np.lib.tracemalloc_domain`` +------------------------------------------------------------ + +Note that since Python 3.6 (or newer), the builtin ``tracemalloc`` module can be used to +track allocations inside NumPy. NumPy places its CPU memory allocations into the +``np.lib.tracemalloc_domain`` domain. +For additional information, check: `https://docs.python.org/3/library/tracemalloc.html`. + +Here is an example on how to use ``np.lib.tracemalloc_domain``: + +.. code-block:: python + + """ + The goal of this example is to show how to trace memory + from an application that has NumPy and non-NumPy sections. + We only select the sections using NumPy related calls. + """ + + import tracemalloc + import numpy as np + + # Flag to determine if we select NumPy domain + use_np_domain = True + + nx = 300 + ny = 500 + + # Start to trace memory + tracemalloc.start() + + # Section 1 + # --------- + + # NumPy related call + a = np.zeros((nx,ny)) + + # non-NumPy related call + b = [i**2 for i in range(nx*ny)] + + snapshot1 = tracemalloc.take_snapshot() + # We filter the snapshot to only select NumPy related calls + np_domain = np.lib.tracemalloc_domain + dom_filter = tracemalloc.DomainFilter(inclusive=use_np_domain, + domain=np_domain) + snapshot1 = snapshot1.filter_traces([dom_filter]) + top_stats1 = snapshot1.statistics('traceback') + + print("================ SNAPSHOT 1 =================") + for stat in top_stats1: + print(f"{stat.count} memory blocks: {stat.size / 1024:.1f} KiB") + print(stat.traceback.format()[-1]) + + # Clear traces of memory blocks allocated by Python + # before moving to the next section. + tracemalloc.clear_traces() + + # Section 2 + #---------- + + # We are only using NumPy + c = np.sum(a*a) + + snapshot2 = tracemalloc.take_snapshot() + top_stats2 = snapshot2.statistics('traceback') + + print() + print("================ SNAPSHOT 2 =================") + for stat in top_stats2: + print(f"{stat.count} memory blocks: {stat.size / 1024:.1f} KiB") + print(stat.traceback.format()[-1]) + + tracemalloc.stop() + + print() + print("============================================") + print("\nTracing Status : ", tracemalloc.is_tracing()) + + try: + print("\nTrying to Take Snapshot After Tracing is Stopped.") + snap = tracemalloc.take_snapshot() + except Exception as e: + print("Exception : ", e) + diff --git a/doc/source/reference/c-api/deprecations.rst b/doc/source/reference/c-api/deprecations.rst index 5b1abc6f2..d805421f2 100644 --- a/doc/source/reference/c-api/deprecations.rst +++ b/doc/source/reference/c-api/deprecations.rst @@ -58,3 +58,7 @@ On compilers which support a #warning mechanism, NumPy issues a compiler warning if you do not define the symbol NPY_NO_DEPRECATED_API. This way, the fact that there are deprecations will be flagged for third-party developers who may not have read the release notes closely. 
+ +Note that defining NPY_NO_DEPRECATED_API is not sufficient to make your +extension ABI compatible with a given NumPy version. See +:ref:`for-downstream-package-authors`. diff --git a/doc/source/reference/c-api/types-and-structures.rst b/doc/source/reference/c-api/types-and-structures.rst index 1fe7c0e93..cff1b3d38 100644 --- a/doc/source/reference/c-api/types-and-structures.rst +++ b/doc/source/reference/c-api/types-and-structures.rst @@ -285,83 +285,16 @@ PyArrayDescr_Type and PyArray_Descr array like behavior. Each bit in this member is a flag which are named as: - .. c:member:: int alignment - - Non-NULL if this type is an array (C-contiguous) of some other type - - -.. - dedented to allow internal linking, pending a refactoring - -.. c:macro:: NPY_ITEM_REFCOUNT - - Indicates that items of this data-type must be reference - counted (using :c:func:`Py_INCREF` and :c:func:`Py_DECREF` ). - - .. c:macro:: NPY_ITEM_HASOBJECT - - Same as :c:data:`NPY_ITEM_REFCOUNT`. - -.. - dedented to allow internal linking, pending a refactoring - -.. c:macro:: NPY_LIST_PICKLE - - Indicates arrays of this data-type must be converted to a list - before pickling. - -.. c:macro:: NPY_ITEM_IS_POINTER - - Indicates the item is a pointer to some other data-type - -.. c:macro:: NPY_NEEDS_INIT - - Indicates memory for this data-type must be initialized (set - to 0) on creation. - -.. c:macro:: NPY_NEEDS_PYAPI - - Indicates this data-type requires the Python C-API during - access (so don't give up the GIL if array access is going to - be needed). - -.. c:macro:: NPY_USE_GETITEM - - On array access use the ``f->getitem`` function pointer - instead of the standard conversion to an array scalar. Must - use if you don't define an array scalar to go along with - the data-type. - -.. c:macro:: NPY_USE_SETITEM - - When creating a 0-d array from an array scalar use - ``f->setitem`` instead of the standard copy from an array - scalar. Must use if you don't define an array scalar to go - along with the data-type. - - .. c:macro:: NPY_FROM_FIELDS - - The bits that are inherited for the parent data-type if these - bits are set in any field of the data-type. Currently ( - :c:data:`NPY_NEEDS_INIT` \| :c:data:`NPY_LIST_PICKLE` \| - :c:data:`NPY_ITEM_REFCOUNT` \| :c:data:`NPY_NEEDS_PYAPI` ). - - .. c:macro:: NPY_OBJECT_DTYPE_FLAGS - - Bits set for the object data-type: ( :c:data:`NPY_LIST_PICKLE` - \| :c:data:`NPY_USE_GETITEM` \| :c:data:`NPY_ITEM_IS_POINTER` \| - :c:data:`NPY_ITEM_REFCOUNT` \| :c:data:`NPY_NEEDS_INIT` \| - :c:data:`NPY_NEEDS_PYAPI`). - - .. c:function:: int PyDataType_FLAGCHK(PyArray_Descr *dtype, int flags) - - Return true if all the given flags are set for the data-type - object. - - .. c:function:: int PyDataType_REFCHK(PyArray_Descr *dtype) - - Equivalent to :c:func:`PyDataType_FLAGCHK` (*dtype*, - :c:data:`NPY_ITEM_REFCOUNT`). + * :c:macro:`NPY_ITEM_REFCOUNT` + * :c:macro:`NPY_ITEM_HASOBJECT` + * :c:macro:`NPY_LIST_PICKLE` + * :c:macro:`NPY_ITEM_IS_POINTER` + * :c:macro:`NPY_NEEDS_INIT` + * :c:macro:`NPY_NEEDS_PYAPI` + * :c:macro:`NPY_USE_GETITEM` + * :c:macro:`NPY_USE_SETITEM` + * :c:macro:`NPY_FROM_FIELDS` + * :c:macro:`NPY_OBJECT_DTYPE_FLAGS` .. c:member:: int type_num @@ -452,6 +385,73 @@ PyArrayDescr_Type and PyArray_Descr Currently unused. Reserved for future use in caching hash values. +.. c:macro:: NPY_ITEM_REFCOUNT + + Indicates that items of this data-type must be reference + counted (using :c:func:`Py_INCREF` and :c:func:`Py_DECREF` ). + +.. 
c:macro:: NPY_ITEM_HASOBJECT + + Same as :c:data:`NPY_ITEM_REFCOUNT`. + +.. c:macro:: NPY_LIST_PICKLE + + Indicates arrays of this data-type must be converted to a list + before pickling. + +.. c:macro:: NPY_ITEM_IS_POINTER + + Indicates the item is a pointer to some other data-type + +.. c:macro:: NPY_NEEDS_INIT + + Indicates memory for this data-type must be initialized (set + to 0) on creation. + +.. c:macro:: NPY_NEEDS_PYAPI + + Indicates this data-type requires the Python C-API during + access (so don't give up the GIL if array access is going to + be needed). + +.. c:macro:: NPY_USE_GETITEM + + On array access use the ``f->getitem`` function pointer + instead of the standard conversion to an array scalar. Must + use if you don't define an array scalar to go along with + the data-type. + +.. c:macro:: NPY_USE_SETITEM + + When creating a 0-d array from an array scalar use + ``f->setitem`` instead of the standard copy from an array + scalar. Must use if you don't define an array scalar to go + along with the data-type. + +.. c:macro:: NPY_FROM_FIELDS + + The bits that are inherited for the parent data-type if these + bits are set in any field of the data-type. Currently ( + :c:data:`NPY_NEEDS_INIT` \| :c:data:`NPY_LIST_PICKLE` \| + :c:data:`NPY_ITEM_REFCOUNT` \| :c:data:`NPY_NEEDS_PYAPI` ). + +.. c:macro:: NPY_OBJECT_DTYPE_FLAGS + + Bits set for the object data-type: ( :c:data:`NPY_LIST_PICKLE` + \| :c:data:`NPY_USE_GETITEM` \| :c:data:`NPY_ITEM_IS_POINTER` \| + :c:data:`NPY_ITEM_REFCOUNT` \| :c:data:`NPY_NEEDS_INIT` \| + :c:data:`NPY_NEEDS_PYAPI`). + +.. c:function:: int PyDataType_FLAGCHK(PyArray_Descr *dtype, int flags) + + Return true if all the given flags are set for the data-type + object. + +.. c:function:: int PyDataType_REFCHK(PyArray_Descr *dtype) + + Equivalent to :c:func:`PyDataType_FLAGCHK` (*dtype*, + :c:data:`NPY_ITEM_REFCOUNT`). + .. c:type:: PyArray_ArrFuncs Functions implementing internal features. Not all of these @@ -973,7 +973,7 @@ PyUFunc_Type and PyUFuncObject Some fallback support for this slot exists, but will be removed eventually. A universal function that relied on this will have to be ported eventually. - See ref:`NEP 41 <NEP41>` and ref:`NEP 43 <NEP43>` + See :ref:`NEP 41 <NEP41>` and :ref:`NEP 43 <NEP43>` .. c:member:: void *reserved2 @@ -1456,22 +1456,6 @@ memory management. These types are not accessible directly from Python, and are not exposed to the C-API. They are included here only for completeness and assistance in understanding the code. - -.. c:type:: PyUFuncLoopObject - - A loose wrapper for a C-structure that contains the information - needed for looping. This is useful if you are trying to understand - the ufunc looping code. The :c:type:`PyUFuncLoopObject` is the associated - C-structure. It is defined in the ``ufuncobject.h`` header. - -.. c:type:: PyUFuncReduceObject - - A loose wrapper for the C-structure that contains the information - needed for reduce-like methods of ufuncs. This is useful if you are - trying to understand the reduce, accumulate, and reduce-at - code. The :c:type:`PyUFuncReduceObject` is the associated C-structure. It - is defined in the ``ufuncobject.h`` header. - .. c:type:: PyUFunc_Loop1d A simple linked-list of C-structures containing the information needed diff --git a/doc/source/reference/distutils_guide.rst b/doc/source/reference/distutils_guide.rst index a72a9c4d6..6f927c231 100644 --- a/doc/source/reference/distutils_guide.rst +++ b/doc/source/reference/distutils_guide.rst @@ -1,6 +1,6 @@ .. 
_distutils-user-guide: -NumPy Distutils - Users Guide +NumPy distutils - users guide ============================= .. warning:: diff --git a/doc/source/reference/global_state.rst b/doc/source/reference/global_state.rst index d73da4244..a898450af 100644 --- a/doc/source/reference/global_state.rst +++ b/doc/source/reference/global_state.rst @@ -1,7 +1,7 @@ .. _global_state: ************ -Global State +Global state ************ NumPy has a few import-time, compile-time, or runtime options @@ -11,10 +11,10 @@ purposes and will not be interesting to the vast majority of users. -Performance-Related Options +Performance-related options =========================== -Number of Threads used for Linear Algebra +Number of threads used for Linear Algebra ----------------------------------------- NumPy itself is normally intentionally limited to a single thread @@ -56,10 +56,10 @@ Setting ``NPY_DISABLE_CPU_FEATURES`` will exclude simd features at runtime. See :ref:`runtime-simd-dispatch`. -Debugging-Related Options +Debugging-related options ========================= -Relaxed Strides Checking +Relaxed strides checking ------------------------ The *compile-time* environment variable:: diff --git a/doc/source/reference/index.rst b/doc/source/reference/index.rst index 8e4a81436..4756af5fa 100644 --- a/doc/source/reference/index.rst +++ b/doc/source/reference/index.rst @@ -3,7 +3,7 @@ .. _reference: ############### -NumPy Reference +NumPy reference ############### :Release: |version| diff --git a/doc/source/reference/internals.code-explanations.rst b/doc/source/reference/internals.code-explanations.rst index d34314610..db3b3b452 100644 --- a/doc/source/reference/internals.code-explanations.rst +++ b/doc/source/reference/internals.code-explanations.rst @@ -1,7 +1,7 @@ :orphan: ************************* -NumPy C Code Explanations +NumPy C code explanations ************************* .. This document has been moved to ../dev/internals.code-explanations.rst. diff --git a/doc/source/reference/maskedarray.baseclass.rst b/doc/source/reference/maskedarray.baseclass.rst index fcd310faa..7121914b9 100644 --- a/doc/source/reference/maskedarray.baseclass.rst +++ b/doc/source/reference/maskedarray.baseclass.rst @@ -72,19 +72,19 @@ Attributes and properties of masked arrays .. seealso:: :ref:`Array Attributes <arrays.ndarray.attributes>` -.. autoattribute:: MaskedArray.data +.. autoattribute:: numpy::ma.MaskedArray.data -.. autoattribute:: MaskedArray.mask +.. autoattribute:: numpy::ma.MaskedArray.mask -.. autoattribute:: MaskedArray.recordmask +.. autoattribute:: numpy::ma.MaskedArray.recordmask -.. autoattribute:: MaskedArray.fill_value +.. autoattribute:: numpy::ma.MaskedArray.fill_value -.. autoattribute:: MaskedArray.baseclass +.. autoattribute:: numpy::ma.MaskedArray.baseclass -.. autoattribute:: MaskedArray.sharedmask +.. autoattribute:: numpy::ma.MaskedArray.sharedmask -.. autoattribute:: MaskedArray.hardmask +.. autoattribute:: numpy::ma.MaskedArray.hardmask As :class:`MaskedArray` is a subclass of :class:`~numpy.ndarray`, a masked array also inherits all the attributes and properties of a :class:`~numpy.ndarray` instance. diff --git a/doc/source/reference/random/bit_generators/index.rst b/doc/source/reference/random/bit_generators/index.rst index 14c19a6bd..e523b4339 100644 --- a/doc/source/reference/random/bit_generators/index.rst +++ b/doc/source/reference/random/bit_generators/index.rst @@ -1,5 +1,7 @@ .. currentmodule:: numpy.random +.. 
_random-bit-generators: + Bit Generators ============== diff --git a/doc/source/reference/random/compatibility.rst b/doc/source/reference/random/compatibility.rst new file mode 100644 index 000000000..138f03ca3 --- /dev/null +++ b/doc/source/reference/random/compatibility.rst @@ -0,0 +1,87 @@ +.. _random-compatibility: + +.. currentmodule:: numpy.random + +Compatibility Policy +==================== + +`numpy.random` has a somewhat stricter compatibility policy than the rest of +NumPy. Users of pseudorandomness often have use cases for being able to +reproduce runs in fine detail given the same seed (so-called "stream +compatibility"), and so we try to balance those needs with the flexibility to +enhance our algorithms. :ref:`NEP 19 <NEP19>` describes the evolution of this +policy. + +The main kind of compatibility that we enforce is stream-compatibility from run +to run under certain conditions. If you create a `Generator` with the same +`BitGenerator`, with the same seed, perform the same sequence of method calls +with the same arguments, on the same build of ``numpy``, in the same +environment, on the same machine, you should get the same stream of numbers. +Note that these conditions are very strict. There are a number of factors +outside of NumPy's control that limit our ability to guarantee much more than +this. For example, different CPUs implement floating point arithmetic +differently, and this can cause differences in certain edge cases that cascade +to the rest of the stream. `Generator.multivariate_normal`, for another +example, uses a matrix decomposition from ``numpy.linalg``. Even on the same +platform, a different build of ``numpy`` may use a different version of this +matrix decomposition algorithm from the LAPACK that it links to, causing +`Generator.multivariate_normal` to return completely different (but equally +valid!) results. We strive to prefer algorithms that are more resistant to +these effects, but this is always imperfect. + +.. note:: + + Most of the `Generator` methods allow you to draw multiple values from + a distribution as arrays. The requested size of this array is a parameter, + for the purposes of the above policy. Calling ``rng.random()`` 5 times is + not *guaranteed* to give the same numbers as ``rng.random(5)``. We reserve + the ability to decide to use different algorithms for different-sized + blocks. In practice, this happens rarely. + +Like the rest of NumPy, we generally maintain API source +compatibility from version to version. If we *must* make an API-breaking +change, then we will only do so with an appropriate deprecation period and +warnings, according to :ref:`general NumPy policy <NEP23>`. + +Breaking stream-compatibility in order to introduce new features or +improve performance in `Generator` or `default_rng` will be *allowed* with +*caution*. Such changes will be considered features, and as such will be no +faster than the standard release cadence of features (i.e. on ``X.Y`` releases, +never ``X.Y.Z``). Slowness will not be considered a bug for this purpose. +Correctness bug fixes that break stream-compatibility can happen on bugfix +releases, per usual, but developers should consider if they can wait until the +next feature release. We encourage developers to strongly weight user’s pain +from the break in stream-compatibility against the improvements. 
One example +of a worthwhile improvement would be to change algorithms for a significant +increase in performance, for example, moving from the `Box-Muller transform +<https://en.wikipedia.org/wiki/Box%E2%80%93Muller_transform>`_ method of +Gaussian variate generation to the faster `Ziggurat algorithm +<https://en.wikipedia.org/wiki/Ziggurat_algorithm>`_. An example of +a discouraged improvement would be tweaking the Ziggurat tables just a little +bit for a small performance improvement. + +.. note:: + + In particular, `default_rng` is allowed to change the default + `BitGenerator` that it uses (again, with *caution* and plenty of advance + warning). + +In general, `BitGenerator` classes have stronger guarantees of +version-to-version stream compatibility. This allows them to be a firmer +building block for downstream users that need it. Their limited API surface +makes it easier for them to maintain this compatibility from version to version. See +the docstrings of each `BitGenerator` class for their individual compatibility +guarantees. + +The legacy `RandomState` and the :ref:`associated convenience functions +<functions-in-numpy-random>` have a stricter version-to-version +compatibility guarantee. For reasons outlined in :ref:`NEP 19 <NEP19>`, we had made +stronger promises about their version-to-version stability early in NumPy's +development. There are still some limited use cases for this kind of +compatibility (like generating data for tests), so we maintain as much +compatibility as we can. There will be no more modifications to `RandomState`, +not even to fix correctness bugs. There are a few gray areas where we can make +minor fixes to keep `RandomState` working without segfaulting as NumPy's +internals change, and some docstring fixes. However, the previously-mentioned +caveats about the variability from machine to machine and build to build still +apply to `RandomState` just as much as it does to `Generator`. diff --git a/doc/source/reference/random/index.rst b/doc/source/reference/random/index.rst index 83a27d80c..38555b133 100644 --- a/doc/source/reference/random/index.rst +++ b/doc/source/reference/random/index.rst @@ -7,210 +7,124 @@ Random sampling (:mod:`numpy.random`) ===================================== -Numpy's random number routines produce pseudo random numbers using -combinations of a `BitGenerator` to create sequences and a `Generator` -to use those sequences to sample from different statistical distributions: - -* BitGenerators: Objects that generate random numbers. These are typically - unsigned integer words filled with sequences of either 32 or 64 random bits. -* Generators: Objects that transform sequences of random bits from a - BitGenerator into sequences of numbers that follow a specific probability - distribution (such as uniform, Normal or Binomial) within a specified - interval. - -Since Numpy version 1.17.0 the Generator can be initialized with a -number of different BitGenerators. It exposes many different probability -distributions. See `NEP 19 <https://www.numpy.org/neps/ -nep-0019-rng-policy.html>`_ for context on the updated random Numpy number -routines. The legacy `RandomState` random number routines are still -available, but limited to a single BitGenerator. See :ref:`new-or-different` -for a complete list of improvements and differences from the legacy -``RandomState``. 
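A minimal sketch of the run-to-run stream-compatibility guarantee described in ``compatibility.rst`` above: two generators built from the same ``BitGenerator`` with the same seed, on the same build of ``numpy``, yield the same stream when the same methods are called with the same arguments. The seed below is arbitrary and the drawn values are whatever your build produces.

.. code-block:: python

    import numpy as np

    # Same BitGenerator (PCG64 via default_rng), same seed, same sequence of
    # calls with the same arguments -> the same stream of numbers.
    rng_a = np.random.default_rng(12345)
    rng_b = np.random.default_rng(12345)

    assert np.array_equal(rng_a.standard_normal(4), rng_b.standard_normal(4))
    assert np.array_equal(rng_a.integers(0, 10, size=3),
                          rng_b.integers(0, 10, size=3))

    # The policy does not promise that block size is irrelevant: five separate
    # rng.random() calls are not *guaranteed* to reproduce one rng.random(5).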
- -For convenience and backward compatibility, a single `RandomState` -instance's methods are imported into the numpy.random namespace, see -:ref:`legacy` for the complete list. - .. _random-quick-start: Quick Start ----------- -Call `default_rng` to get a new instance of a `Generator`, then call its -methods to obtain samples from different distributions. By default, -`Generator` uses bits provided by `PCG64` which has better statistical -properties than the legacy `MT19937` used in `RandomState`. - -.. code-block:: python - - # Do this (new version) - from numpy.random import default_rng - rng = default_rng() - vals = rng.standard_normal(10) - more_vals = rng.standard_normal(10) - - # instead of this (legacy version) - from numpy import random - vals = random.standard_normal(10) - more_vals = random.standard_normal(10) - -`Generator` can be used as a replacement for `RandomState`. Both class -instances hold an internal `BitGenerator` instance to provide the bit -stream, it is accessible as ``gen.bit_generator``. Some long-overdue API -cleanup means that legacy and compatibility methods have been removed from -`Generator` - -=================== ============== ============ -`RandomState` `Generator` Notes -------------------- -------------- ------------ -``random_sample``, ``random`` Compatible with `random.random` -``rand`` -------------------- -------------- ------------ -``randint``, ``integers`` Add an ``endpoint`` kwarg -``random_integers`` -------------------- -------------- ------------ -``tomaxint`` removed Use ``integers(0, np.iinfo(np.int_).max,`` - ``endpoint=False)`` -------------------- -------------- ------------ -``seed`` removed Use `SeedSequence.spawn` -=================== ============== ============ - -See :ref:`new-or-different` for more information. - -Something like the following code can be used to support both ``RandomState`` -and ``Generator``, with the understanding that the interfaces are slightly -different - -.. code-block:: python - - try: - rng_integers = rng.integers - except AttributeError: - rng_integers = rng.randint - a = rng_integers(1000) - -Seeds can be passed to any of the BitGenerators. The provided value is mixed -via `SeedSequence` to spread a possible sequence of seeds across a wider -range of initialization states for the BitGenerator. Here `PCG64` is used and -is wrapped with a `Generator`. - -.. code-block:: python - - from numpy.random import Generator, PCG64 - rng = Generator(PCG64(12345)) - rng.standard_normal() - -Here we use `default_rng` to create an instance of `Generator` to generate a -random float: - ->>> import numpy as np ->>> rng = np.random.default_rng(12345) ->>> print(rng) -Generator(PCG64) ->>> rfloat = rng.random() ->>> rfloat -0.22733602246716966 ->>> type(rfloat) -<class 'float'> - -Here we use `default_rng` to create an instance of `Generator` to generate 3 -random integers between 0 (inclusive) and 10 (exclusive): - ->>> import numpy as np ->>> rng = np.random.default_rng(12345) ->>> rints = rng.integers(low=0, high=10, size=3) ->>> rints -array([6, 2, 7]) ->>> type(rints[0]) -<class 'numpy.int64'> - -Introduction ------------- -The new infrastructure takes a different approach to producing random numbers -from the `RandomState` object. Random number generation is separated into -two components, a bit generator and a random generator. - -The `BitGenerator` has a limited set of responsibilities. It manages state -and provides functions to produce random doubles and random unsigned 32- and -64-bit values. 
- -The `random generator <Generator>` takes the -bit generator-provided stream and transforms them into more useful -distributions, e.g., simulated normal random values. This structure allows -alternative bit generators to be used with little code duplication. - -The `Generator` is the user-facing object that is nearly identical to the -legacy `RandomState`. It accepts a bit generator instance as an argument. -The default is currently `PCG64` but this may change in future versions. -As a convenience NumPy provides the `default_rng` function to hide these -details: - ->>> from numpy.random import default_rng ->>> rng = default_rng(12345) ->>> print(rng) -Generator(PCG64) ->>> print(rng.random()) -0.22733602246716966 - -One can also instantiate `Generator` directly with a `BitGenerator` instance. - -To use the default `PCG64` bit generator, one can instantiate it directly and -pass it to `Generator`: - ->>> from numpy.random import Generator, PCG64 ->>> rng = Generator(PCG64(12345)) ->>> print(rng) -Generator(PCG64) - -Similarly to use the older `MT19937` bit generator (not recommended), one can -instantiate it directly and pass it to `Generator`: - ->>> from numpy.random import Generator, MT19937 ->>> rng = Generator(MT19937(12345)) ->>> print(rng) -Generator(MT19937) - -What's New or Different -~~~~~~~~~~~~~~~~~~~~~~~ +The :mod:`numpy.random` module implements pseudo-random number generators +(PRNGs or RNGs, for short) with the ability to draw samples from a variety of +probability distributions. In general, users will create a `Generator` instance +with `default_rng` and call the various methods on it to obtain samples from +different distributions. + +:: + + >>> import numpy as np + >>> rng = np.random.default_rng() + # Generate one random float uniformly distributed over the range [0, 1) + >>> rng.random() #doctest: +SKIP + 0.06369197489564249 # may vary + # Generate an array of 10 numbers according to a unit Gaussian distribution. + >>> rng.standard_normal(10) #doctest: +SKIP + array([-0.31018314, -1.8922078 , -0.3628523 , -0.63526532, 0.43181166, # may vary + 0.51640373, 1.25693945, 0.07779185, 0.84090247, -2.13406828]) + # Generate an array of 5 integers uniformly over the range [0, 10). + >>> rng.integers(low=0, high=10, size=5) #doctest: +SKIP + array([8, 7, 6, 2, 0]) # may vary + +Our RNGs are deterministic sequences and can be reproduced by specifying a seed integer to +derive its initial state. By default, with no seed provided, `default_rng` will create +seed the RNG from nondeterministic data from the operating system and therefore +generate different numbers each time. The pseudo-random sequences will be +independent for all practical purposes, at least those purposes for which our +pseudo-randomness was good for in the first place. + +:: + + >>> rng1 = np.random.default_rng() + >>> rng1.random() #doctest: +SKIP + 0.6596288841243357 # may vary + >>> rng2 = np.random.default_rng() + >>> rng2.random() #doctest: +SKIP + 0.11885628817151628 # may vary + .. warning:: - The Box-Muller method used to produce NumPy's normals is no longer available - in `Generator`. It is not possible to reproduce the exact random - values using Generator for the normal distribution or any other - distribution that relies on the normal such as the `RandomState.gamma` or - `RandomState.standard_t`. If you require bitwise backward compatible - streams, use `RandomState`. 
- -* The Generator's normal, exponential and gamma functions use 256-step Ziggurat - methods which are 2-10 times faster than NumPy's Box-Muller or inverse CDF - implementations. -* Optional ``dtype`` argument that accepts ``np.float32`` or ``np.float64`` - to produce either single or double precision uniform random variables for - select distributions -* Optional ``out`` argument that allows existing arrays to be filled for - select distributions -* All BitGenerators can produce doubles, uint64s and uint32s via CTypes - (`PCG64.ctypes`) and CFFI (`PCG64.cffi`). This allows the bit generators - to be used in numba. -* The bit generators can be used in downstream projects via - :ref:`Cython <random_cython>`. -* `Generator.integers` is now the canonical way to generate integer - random numbers from a discrete uniform distribution. The ``rand`` and - ``randn`` methods are only available through the legacy `RandomState`. - The ``endpoint`` keyword can be used to specify open or closed intervals. - This replaces both ``randint`` and the deprecated ``random_integers``. -* `Generator.random` is now the canonical way to generate floating-point - random numbers, which replaces `RandomState.random_sample`, - `RandomState.sample`, and `RandomState.ranf`. This is consistent with - Python's `random.random`. -* All BitGenerators in numpy use `SeedSequence` to convert seeds into - initialized states. -* The addition of an ``axis`` keyword argument to methods such as - `Generator.choice`, `Generator.permutation`, and `Generator.shuffle` - improves support for sampling from and shuffling multi-dimensional arrays. - -See :ref:`new-or-different` for a complete list of improvements and -differences from the traditional ``Randomstate``. + The pseudo-random number generators implemented in this module are designed + for statistical modeling and simulation. They are not suitable for security + or cryptographic purposes. See the :py:mod:`secrets` module from the + standard library for such use cases. + +Seeds should be large positive integers. `default_rng` can take positive +integers of any size. We recommend using very large, unique numbers to ensure +that your seed is different from anyone else's. This is good practice to ensure +that your results are statistically independent from theirs unless you are +intentionally *trying* to reproduce their result. A convenient way to get +such a seed number is to use :py:func:`secrets.randbits` to get an +arbitrary 128-bit integer. + +:: + + >>> import secrets + >>> import numpy as np + >>> secrets.randbits(128) #doctest: +SKIP + 122807528840384100672342137672332424406 # may vary + >>> rng1 = np.random.default_rng(122807528840384100672342137672332424406) + >>> rng1.random() + 0.5363922081269535 + >>> rng2 = np.random.default_rng(122807528840384100672342137672332424406) + >>> rng2.random() + 0.5363922081269535 + +See the documentation on `default_rng` and `SeedSequence` for more advanced +options for controlling the seed in specialized scenarios. + +`Generator` and its associated infrastructure was introduced in NumPy version +1.17.0. There is still a lot of code that uses the older `RandomState` and the +functions in `numpy.random`. While there are no plans to remove them at this +time, we do recommend transitioning to `Generator` as you can. The algorithms +are faster, more flexible, and will receive more improvements in the future. +For the most part, `Generator` can be used as a replacement for `RandomState`. 
+See :ref:`legacy` for information on the legacy infrastructure, +:ref:`new-or-different` for information on transitioning, and :ref:`NEP 19 +<NEP19>` for some of the reasoning for the transition. + +Design +------ + +Users primarily interact with `Generator` instances. Each `Generator` instance +owns a `BitGenerator` instance that implements the core RNG algorithm. The +`BitGenerator` has a limited set of responsibilities. It manages state and +provides functions to produce random doubles and random unsigned 32- and 64-bit +values. + +The `Generator` takes the bit generator-provided stream and transforms them +into more useful distributions, e.g., simulated normal random values. This +structure allows alternative bit generators to be used with little code +duplication. + +NumPy implements several different `BitGenerator` classes implementing +different RNG algorithms. `default_rng` currently uses `~PCG64` as the +default `BitGenerator`. It has better statistical properties and performance +than the `~MT19937` algorithm used in the legacy `RandomState`. See +:ref:`random-bit-generators` for more details on the supported BitGenerators. + +`default_rng` and BitGenerators delegate the conversion of seeds into RNG +states to `SeedSequence` internally. `SeedSequence` implements a sophisticated +algorithm that intermediates between the user's input and the internal +implementation details of each `BitGenerator` algorithm, each of which can +require different amounts of bits for its state. Importantly, it lets you use +arbitrary-sized integers and arbitrary sequences of such integers to mix +together into the RNG state. This is a useful primitive for constructing +a :ref:`flexible pattern for parallel RNG streams <seedsequence-spawn>`. + +For backward compatibility, we still maintain the legacy `RandomState` class. +It continues to use the `~MT19937` algorithm by default, and old seeds continue +to reproduce the same results. The convenience :ref:`functions-in-numpy-random` +are still aliases to the methods on a single global `RandomState` instance. See +:ref:`legacy` for the complete details. See :ref:`new-or-different` for +a detailed comparison between `Generator` and `RandomState`. Parallel Generation ~~~~~~~~~~~~~~~~~~~ @@ -235,6 +149,7 @@ Concepts Legacy Generator (RandomState) <legacy> BitGenerators, SeedSequences <bit_generators/index> Upgrading PCG64 with PCG64DXSM <upgrading-pcg64> + compatibility Features -------- diff --git a/doc/source/reference/random/legacy.rst b/doc/source/reference/random/legacy.rst index b1fce49a1..00921c477 100644 --- a/doc/source/reference/random/legacy.rst +++ b/doc/source/reference/random/legacy.rst @@ -52,7 +52,7 @@ using the state of the `RandomState`: :exclude-members: __init__ Seeding and State ------------------ +================= .. autosummary:: :toctree: generated/ @@ -62,7 +62,7 @@ Seeding and State ~RandomState.seed Simple random data ------------------- +================== .. autosummary:: :toctree: generated/ @@ -75,7 +75,7 @@ Simple random data ~RandomState.bytes Permutations ------------- +============ .. autosummary:: :toctree: generated/ @@ -83,7 +83,7 @@ Permutations ~RandomState.permutation Distributions -------------- +============== .. autosummary:: :toctree: generated/ @@ -123,8 +123,10 @@ Distributions ~RandomState.weibull ~RandomState.zipf +.. 
_functions-in-numpy-random: + Functions in `numpy.random` ---------------------------- +=========================== Many of the RandomState methods above are exported as functions in `numpy.random` This usage is discouraged, as it is implemented via a global `RandomState` instance which is not advised on two counts: @@ -133,8 +135,7 @@ Many of the RandomState methods above are exported as functions in - It uses a `RandomState` rather than the more modern `Generator`. -For backward compatible legacy reasons, we cannot change this. See -:ref:`random-quick-start`. +For backward compatible legacy reasons, we will not change this. .. autosummary:: :toctree: generated/ diff --git a/doc/source/reference/random/new-or-different.rst b/doc/source/reference/random/new-or-different.rst index 8f4a70540..9b5bf38e5 100644 --- a/doc/source/reference/random/new-or-different.rst +++ b/doc/source/reference/random/new-or-different.rst @@ -5,17 +5,9 @@ What's New or Different ----------------------- -.. warning:: - - The Box-Muller method used to produce NumPy's normals is no longer available - in `Generator`. It is not possible to reproduce the exact random - values using ``Generator`` for the normal distribution or any other - distribution that relies on the normal such as the `Generator.gamma` or - `Generator.standard_t`. If you require bitwise backward compatible - streams, use `RandomState`, i.e., `RandomState.gamma` or - `RandomState.standard_t`. - -Quick comparison of legacy :ref:`mtrand <legacy>` to the new `Generator` +NumPy 1.17.0 introduced `Generator` as an improved replacement for +the :ref:`legacy <legacy>` `RandomState`. Here is a quick comparison of the two +implementations. ================== ==================== ============= Feature Older Equivalent Notes @@ -44,21 +36,17 @@ Feature Older Equivalent Notes ``high`` interval endpoint ================== ==================== ============= -And in more detail: - -* Simulate from the complex normal distribution - (`~.Generator.complex_normal`) * The normal, exponential and gamma generators use 256-step Ziggurat methods which are 2-10 times faster than NumPy's default implementation in `~.Generator.standard_normal`, `~.Generator.standard_exponential` or - `~.Generator.standard_gamma`. - + `~.Generator.standard_gamma`. Because of the change in algorithms, it is not + possible to reproduce the exact random values using ``Generator`` for these + distributions or any distribution method that relies on them. .. ipython:: python - from numpy.random import Generator, PCG64 import numpy.random - rng = Generator(PCG64()) + rng = np.random.default_rng() %timeit -n 1 rng.standard_normal(100000) %timeit -n 1 numpy.random.standard_normal(100000) @@ -74,18 +62,25 @@ And in more detail: * `~.Generator.integers` is now the canonical way to generate integer - random numbers from a discrete uniform distribution. The ``rand`` and - ``randn`` methods are only available through the legacy `~.RandomState`. - This replaces both ``randint`` and the deprecated ``random_integers``. -* The Box-Muller method used to produce NumPy's normals is no longer available. + random numbers from a discrete uniform distribution. This replaces both + ``randint`` and the deprecated ``random_integers``. +* The ``rand`` and ``randn`` methods are only available through the legacy + `~.RandomState`. +* `Generator.random` is now the canonical way to generate floating-point + random numbers, which replaces `RandomState.random_sample`, + `sample`, and `ranf`, all of which were aliases. 
This is consistent with + Python's `random.random`. * All bit generators can produce doubles, uint64s and uint32s via CTypes (`~PCG64.ctypes`) and CFFI (`~PCG64.cffi`). This allows these bit generators to be used in numba. * The bit generators can be used in downstream projects via Cython. +* All bit generators use `SeedSequence` to :ref:`convert seed integers to + initialized states <seeding_and_entropy>`. * Optional ``dtype`` argument that accepts ``np.float32`` or ``np.float64`` to produce either single or double precision uniform random variables for - select distributions + select distributions. `~.Generator.integers` accepts a ``dtype`` argument + with any signed or unsigned integer dtype. * Uniforms (`~.Generator.random` and `~.Generator.integers`) * Normals (`~.Generator.standard_normal`) @@ -94,9 +89,10 @@ And in more detail: .. ipython:: python - rng = Generator(PCG64(0)) - rng.random(3, dtype='d') - rng.random(3, dtype='f') + rng = np.random.default_rng() + rng.random(3, dtype=np.float64) + rng.random(3, dtype=np.float32) + rng.integers(0, 256, size=3, dtype=np.uint8) * Optional ``out`` argument that allows existing arrays to be filled for select distributions @@ -111,6 +107,7 @@ And in more detail: .. ipython:: python + rng = np.random.default_rng() existing = np.zeros(4) rng.random(out=existing[:2]) print(existing) @@ -121,9 +118,12 @@ And in more detail: .. ipython:: python - rng = Generator(PCG64(123456789)) + rng = np.random.default_rng() a = np.arange(12).reshape((3, 4)) a rng.choice(a, axis=1, size=5) rng.shuffle(a, axis=1) # Shuffle in-place a + +* Added a method to sample from the complex normal distribution + (`~.Generator.complex_normal`) diff --git a/doc/source/reference/routines.ctypeslib.rst b/doc/source/reference/routines.ctypeslib.rst index c6127ca64..fe5ffb5a8 100644 --- a/doc/source/reference/routines.ctypeslib.rst +++ b/doc/source/reference/routines.ctypeslib.rst @@ -1,7 +1,7 @@ .. module:: numpy.ctypeslib *********************************************************** -C-Types Foreign Function Interface (:mod:`numpy.ctypeslib`) +C-Types foreign function interface (:mod:`numpy.ctypeslib`) *********************************************************** .. currentmodule:: numpy.ctypeslib diff --git a/doc/source/reference/routines.datetime.rst b/doc/source/reference/routines.datetime.rst index 966ed5a47..72513882b 100644 --- a/doc/source/reference/routines.datetime.rst +++ b/doc/source/reference/routines.datetime.rst @@ -1,6 +1,6 @@ .. _routines.datetime: -Datetime Support Functions +Datetime support functions ************************** .. currentmodule:: numpy @@ -12,7 +12,7 @@ Datetime Support Functions datetime_data -Business Day Functions +Business day functions ====================== .. currentmodule:: numpy diff --git a/doc/source/reference/routines.io.rst b/doc/source/reference/routines.io.rst index 2542b336f..1ec2ccb5e 100644 --- a/doc/source/reference/routines.io.rst +++ b/doc/source/reference/routines.io.rst @@ -83,7 +83,7 @@ Data sources DataSource -Binary Format Description +Binary format description ------------------------- .. autosummary:: :toctree: generated/ |