author mattip <matti.picus@gmail.com> 2018-10-23 11:00:36 +0300
committer mattip <matti.picus@gmail.com> 2018-10-23 11:00:36 +0300
commit fc60b27257924ed7f3c140bed00c13a35b057dc8
tree 2307f8b03031305df6eb549790ea9627ab359af4 /doc/source/reference
parent 73151451437fa6ce0d8b5f033c1e005885f63cf8
DOC: add missing docs
Diffstat (limited to 'doc/source/reference')
-rw-r--r-- doc/source/reference/alignment.rst | 96
-rw-r--r-- doc/source/reference/distutils.rst | 3
-rw-r--r-- doc/source/reference/distutils_guide.rst | 7
-rw-r--r-- doc/source/reference/index.rst | 4
-rw-r--r-- doc/source/reference/internals.rst | 1
-rw-r--r-- doc/source/reference/style_guide.rst | 8
6 files changed, 116 insertions, 3 deletions
diff --git a/doc/source/reference/alignment.rst b/doc/source/reference/alignment.rst
new file mode 100644
index 000000000..f067f0d03
--- /dev/null
+++ b/doc/source/reference/alignment.rst
@@ -0,0 +1,96 @@
+.. _alignment:
+
+
+Numpy Alignment Goals
+=====================
+
+There are three use-cases related to memory alignment in numpy (as of 1.14):
+
+ 1. Creating structured datatypes with fields aligned like in a C-struct.
+ 2. Speeding up copy operations by using uint assignment instead of memcpy
+ 3. Guaranteeing safe aligned access for ufuncs/setitem/casting code
+
+Numpy uses two different forms of alignment to achieve these goals:
+"True alignment" and "Uint alignment".
+
+"True" alignment refers to the architecture-dependent alignment of an
+equivalent C-type in C. For example, on x64 systems ``numpy.float64`` is
+equivalent to ``double`` in C. On most systems this has either an alignment of
+4 or 8 bytes (and this can be controlled in gcc by the option
+``-malign-double``). A variable is aligned in memory if its memory offset is a
+multiple of its alignment. On some systems (e.g. SPARC) memory alignment is
+required; on others it gives a speedup.
+
+"Uint" alignment depends on the size of a datatype. It is defined to be the
+"True alignment" of the uint used by numpy's copy-code to copy the datatype, or
+undefined/unaligned if there is no equivalent uint. Currently numpy uses uint8,
+uint16, uint32, uint64 and uint64 to copy data of size 1, 2, 4, 8 and 16 bytes
+respectively; datatypes of any other size cannot be uint-aligned.
+
+For example, on a (typical linux x64 gcc) system, the numpy ``complex64``
+datatype is implemented as ``struct { float real, imag; }``. This has "true"
+alignment of 4 and "uint" alignment of 8 (equal to the true alignment of
+``uint64``).
+
+Variables in Numpy which control and describe alignment
+=======================================================
+
+The word ``align`` has 4 relevant uses in numpy:
+
+ * The ``dtype.alignment`` attribute (``descr->alignment`` in C). This is meant
+ to reflect the "true alignment" of the type. It has arch-dependent default
+ values for all datatypes, with the exception of structured types created
+ with ``align=True`` as described below.
+ * The ``ALIGNED`` flag of an ndarray, computed in ``IsAligned`` and checked
+ by ``PyArray_ISALIGNED``. This is computed from ``dtype.alignment``.
+ It is set to ``True`` if every item in the array is at a memory location
+ consistent with ``dtype.alignment``, which is the case if the data ptr and
+ all strides of the array are multiples of that alignment.
+ * The ``align`` keyword of the dtype constructor, which only affects structured
+ arrays. If the structure's field offsets are not manually provided, numpy
+ determines offsets automatically. In that case, ``align=True`` pads the
+ structure so that each field is "true" aligned in memory and sets
+ ``dtype.alignment`` to be the largest of the field "true" alignments. This
+ is like what C-structs usually do. Otherwise, if offsets or itemsize were
+ manually provided, ``align=True`` simply checks that all the fields are
+ "true" aligned and that the total itemsize is a multiple of the largest
+ field alignment. In either case ``dtype.isalignedstruct`` is also set to
+ True.
+ * ``IsUintAligned`` is used to determine if an ndarray is "uint aligned" in
+ an analogous way to how ``IsAligned`` checks for true-alignment.
+
+Consequences of alignment
+=========================
+
+Here is how the variables above are used:
+
+ 1. Creating aligned structs: In order to know how to offset a field when
+ ``align=True``, numpy looks up ``field.dtype.alignment``. This includes
+ fields which are nested structured arrays.
+ 2. Ufuncs: If the ``ALIGNED`` flag of an array is False, ufuncs will
+ buffer/cast the array before evaluation. This is needed since ufunc inner
+ loops access raw elements directly, which might fail on some archs if the
+ elements are not true-aligned.
+ 3. Getitem/setitem/copyswap functions: Similar to ufuncs, these functions
+ generally have two code paths. If ``ALIGNED`` is False they will
+ use a code path that buffers the arguments so they are true-aligned.
+ 4. Strided copy code: Here, "uint alignment" is used instead. If the itemsize
+ of an array is equal to 1, 2, 4, 8 or 16 bytes and the array is uint
+ aligned, numpy will do ``*((uintN*)dst) = *((uintN*)src)`` for
+ appropriate N. Otherwise numpy copies by doing ``memcpy(dst, src, N)``.
+ 5. Nditer code: Since this often calls the strided copy code, it must
+ check for "uint alignment".
+ 6. Cast code: if the array is "uint aligned" this will essentially do
+ ``*dst = CASTFUNC(*src)``. If not, it does
+ ``memmove(srcval, src); dstval = CASTFUNC(srcval); memmove(dst, dstval)``
+ where dstval/srcval are aligned.
+
+Note that in principle, only "true alignment" is required for casting code.
+However, because the casting code and copy code are deeply intertwined they
+both use "uint" alignment. This should be safe assuming uint alignment is never
+smaller than true alignment, though it can cause unnecessary buffering if
+an array is "true aligned" but not "uint aligned". If there is ever a big
+rewrite of this code it would be good to allow them to use different
+alignments.
+
+
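Note on the ``alignment.rst`` hunk above: the distinction between "true" and "uint" alignment can be sketched with Python's standard ``ctypes`` module, without touching numpy itself. This is only an illustration under stated assumptions. ``COPY_UINT``, ``Complex64``, ``true_alignment``, ``uint_alignment`` and ``is_aligned`` are hypothetical names invented for this sketch; they are not numpy APIs and do not reproduce numpy's ``IsAligned``/``IsUintAligned`` implementations.

```python
import ctypes

# Hypothetical mirror of the size -> uint table described in the text.
# 16-byte items are listed with uint64, exactly as the text states.
COPY_UINT = {
    1: ctypes.c_uint8,
    2: ctypes.c_uint16,
    4: ctypes.c_uint32,
    8: ctypes.c_uint64,
    16: ctypes.c_uint64,
}


class Complex64(ctypes.Structure):
    # Stand-in for ``struct { float real, imag; }`` from the text.
    _fields_ = [("real", ctypes.c_float), ("imag", ctypes.c_float)]


def true_alignment(ctype):
    """Alignment the C compiler's ABI assigns to the equivalent C type."""
    return ctypes.alignment(ctype)


def uint_alignment(itemsize):
    """True alignment of the uint used to copy items of this size, or None
    if no uint is listed for that size (the item cannot be uint-aligned)."""
    uint = COPY_UINT.get(itemsize)
    return None if uint is None else ctypes.alignment(uint)


def is_aligned(address, strides, alignment):
    """Every item is aligned if the data pointer and all strides are
    multiples of the alignment (the rule given for the ALIGNED flag)."""
    return address % alignment == 0 and all(s % alignment == 0 for s in strides)


if __name__ == "__main__":
    size = ctypes.sizeof(Complex64)
    print("complex64 stand-in: size", size,
          "true alignment", true_alignment(Complex64),
          "uint alignment", uint_alignment(size))
```

The printed values are platform-dependent, which is the point of the "true alignment" definition; on the typical linux x64 gcc system mentioned in the text one would expect the struct's true alignment to match that of ``float`` and its uint alignment to match that of ``uint64``. The same ``is_aligned`` helper can be called with either alignment to mimic the two checks.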
diff --git a/doc/source/reference/distutils.rst b/doc/source/reference/distutils.rst
index 289822909..88e533832 100644
--- a/doc/source/reference/distutils.rst
+++ b/doc/source/reference/distutils.rst
@@ -13,8 +13,7 @@ distutils, use the :func:`setup <core.setup>` command from
:mod:`numpy.distutils.misc_util` that can make it easier to construct
keyword arguments to pass to the setup function (by passing the
dictionary obtained from the todict() method of the class). More
-information is available in the NumPy Distutils Users Guide in
-``<site-packages>/numpy/doc/DISTUTILS.txt``.
+information is available in the :ref:`distutils-user-guide`.
.. index::
diff --git a/doc/source/reference/distutils_guide.rst b/doc/source/reference/distutils_guide.rst
new file mode 100644
index 000000000..081719d16
--- /dev/null
+++ b/doc/source/reference/distutils_guide.rst
@@ -0,0 +1,7 @@
+.. _distutils-user-guide:
+
+NumPy Distutils - Users Guide
+=============================
+
+.. include:: ../../DISTUTILS.rst.txt
+ :start-line: 6
diff --git a/doc/source/reference/index.rst b/doc/source/reference/index.rst
index 2140c57f7..7ba50b882 100644
--- a/doc/source/reference/index.rst
+++ b/doc/source/reference/index.rst
@@ -23,8 +23,10 @@ For learning how to use NumPy, see also :ref:`user`.
ufuncs
routines
distutils
+ distutils_guide
c-api
internals
+ style_guide
swig
@@ -35,4 +37,4 @@ Large parts of this manual originate from Travis E. Oliphant's book
`Guide to NumPy <https://archive.org/details/NumPyBook>`__ (which generously
entered Public Domain in August 2008). The reference documentation for many of
the functions are written by numerous contributors and developers of
-NumPy.
\ No newline at end of file
+NumPy.
diff --git a/doc/source/reference/internals.rst b/doc/source/reference/internals.rst
index e1d6644a6..03d081bf9 100644
--- a/doc/source/reference/internals.rst
+++ b/doc/source/reference/internals.rst
@@ -5,5 +5,6 @@ NumPy internals
.. toctree::
internals.code-explanations
+ alignment
.. automodule:: numpy.doc.internals
diff --git a/doc/source/reference/style_guide.rst b/doc/source/reference/style_guide.rst
new file mode 100644
index 000000000..bede3424d
--- /dev/null
+++ b/doc/source/reference/style_guide.rst
@@ -0,0 +1,8 @@
+.. _style_guide:
+
+===================
+NumPy C Style Guide
+===================
+
+.. include:: ../../C_STYLE_GUIDE.rst.txt
+ :start-line: 4