author | Andrew Martin <andrew.thaddeus@gmail.com> | 2019-07-17 09:51:47 -0400 |
---|---|---|
committer | Marge Bot <ben+marge-bot@smart-cactus.org> | 2019-10-08 13:24:52 -0400 |
commit | 77f3ba23b9fdde95a73c689791122070332733dc (patch) | |
tree | f055bd9813b078a22509396683db0ed941bb0aad | |
parent | 9ac3bcbb3605fa8ff2481806a3a87a0d9654cd87 (diff) | |
download | haskell-77f3ba23b9fdde95a73c689791122070332733dc.tar.gz |
Rephrase a bunch of things in the unlifted ffi types documentation. Add a section on pinned byte arrays.
-rw-r--r-- | docs/users_guide/ffi-chap.rst | 114 |
1 files changed, 71 insertions, 43 deletions
diff --git a/docs/users_guide/ffi-chap.rst b/docs/users_guide/ffi-chap.rst
index 944ceb478c..f0b9e3fd34 100644
--- a/docs/users_guide/ffi-chap.rst
+++ b/docs/users_guide/ffi-chap.rst
@@ -43,10 +43,10 @@ moving heap-allocated Haskell values around arbitrarily.

 This greatly constrains library authors since it implies that it is not safe to
 pass any heap object reference to a ``safe`` foreign function call. For
-instance, it is often desirable to pass an unpinned ``ByteArray#``\s directly
-to native code to avoid making an otherwise-unnecessary copy. However, this can
-only be done safely if the array is guaranteed not to be moved by the garbage
-collector in the middle of the call.
+instance, it is often desirable to pass an :ref:`unpinned <pinned-byte-arrays>`
+``ByteArray#``\s directly to native code to avoid making an otherwise-unnecessary
+copy. However, this can only be done safely if the array is guaranteed not to be
+moved by the garbage collector in the middle of the call.

 The Chapter does *not* require implementations to refrain from doing the
 same for ``unsafe`` calls, so strictly Haskell 2010-conforming programs
@@ -82,35 +82,53 @@ Unlifted FFI Types
 The following unlifted unboxed types may be used as basic foreign
 types (see FFI Chapter, Section 8.6) for both ``safe`` and
 ``unsafe`` foreign calls: ``Int#``, ``Word#``, ``Char#``, ``Float#``,
-``Double#``, ``Addr#``, and ``StablePtr# a``. The following unlifted
-boxed types may be used as arguments to (not results of) ``unsafe``
-foreign calls: ``Array#``, ``MutableArray#``, ``SmallArray#``,
-``SmallMutableArray#``, ``ArrayArray#``, ``MutableArrayArray#``,
-``ByteArray#``, and ``MutableByteArray#``. Additionally, ``ByteArray#``
-and ``MutableByteArray#`` can be passed to ``safe`` foreign calls
-if the object is pinned. (Such can be ascertained by judicious use of
-``isByteArrayPinned#``, ``isMutableByteArrayPinned#``, or
-``newPinnedByteArray#``.) Passing an unpinned argument to an ``safe``
-foreign call results in undefined behavior. This table sums up the
+``Double#``, ``Addr#``, and ``StablePtr# a``. Several unlifted boxed
+types may be used as arguments to FFI calls, subject to these
 restrictions:

-+--------------+-----------------------+----------------------------------+
-| Type         | Safe FFI Argument     | Unsafe FFI Argument              |
-+--------------+-----------------------+----------------------------------+
-| Array#       | No                    | Yes, but not useful with C calls |
-| SmallArray#  | No                    | Yes, but not useful with C calls |
-| ArrayArray#  | No                    | Yes                              |
-| ByteArray#   | Yes, only when pinned | Yes                              |
-+--------------+-----------------------+----------------------------------+
+* Valid arguments for ``foreign import unsafe`` FFI calls: ``Array#``,
+  ``SmallArray#``, ``ArrayArray#``, ``ByteArray#``, and the mutable
+  counterparts of these types.
+* Valid arguments for ``foreign import safe`` FFI calls: ``ByteArray#``
+  and ``MutableByteArray#``. The byte array must be
+  :ref:`pinned <pinned-byte-arrays>`.
+* Mutation: In both ``foreign import unsafe`` and ``foreign import safe``
+  FFI calls, it is safe to mutate a ``MutableByteArray``. Mutating any
+  other type of array leads to undefined behavior. Reason: Mutable arrays
+  of heap objects record writes for the purpose of garbage collection.
+  An array of heap objects is passed to a foreign C function, the
+  runtime does not record any writes. Consequently, it is not safe to
+  write to an array of heap objects in a foreign function.
+  Since the runtime has no facilities for tracking mutation of a
+  ``MutableByteArray#``, these can be safely mutated in any foreign
+  function.
+
+None of these restrictions are enforced at compile time. Failure
+to heed these restrictions will lead to runtime errors that can be
+very difficult to track down. (The errors likely will not manifest
+until garbage collection happens.) In tabular form, these restrictions
+are:
+
++------------------+----------------------------------------------------+
+|                  | When value is used as argument to FFI call that is |
++------------------+-----------------------+----------------------------+
+| Type             | Safe                  | Unsafe                     |
++------------------+-----------------------+----------------------------+
+| ``Array#``       | Unsound               | Sound, not useful          |
+| ``SmallArray#``  | Unsound               | Sound, not useful          |
+| ``ArrayArray#``  | Unsound               | Sound                      |
+| ``ByteArray#``   | Sound if pinned       | Sound                      |
++------------------+-----------------------+----------------------------+

 When passing any of the unlifted array types as an argument to a foreign C call, a foreign function
 sees a pointer that refers to the payload of the array, not to the
 ``StgArrBytes``/``StgMutArrPtrs``/``StgSmallMutArrPtrs`` heap object
-containing it [1]_. (By contrast, a foreign Cmm call sees the heap object,
-not just the payload.) This means that, in some situations, the foreign C
-function might not need any knowledge of the RTS closure types. The
-following example sums the first three bytes in a
+containing it [1]_. By contrast, a foreign Cmm call, introduced by
+``foreign import prim``, sees the heap object, not just the payload.
+This means that, in some situations, the foreign C function might not
+need any knowledge of the RTS closure types. The following example
+sums the first three bytes in a
 ``MutableByteArray#`` [2]_ without using anything from ``Rts.h``::

     // C source
@@ -127,7 +145,8 @@ closure types. The following example sums the first element of
 each ``ByteArray#`` (interpreting the bytes as an array of ``CInt``)
 element of an ``ArrayArray##`` [3]_::

-    // C source, must include the RTS
+    // C source, must include the RTS to make the struct StgArrBytes
+    // available along with its fields: ptrs and payload.
     #include "Rts.h"
     int sum_first (StgArrBytes **bufs) {
       StgArrBytes **bufs = (StgArrBytes**)bufsTmp;
@@ -138,29 +157,18 @@ element of an ``ArrayArray##`` [3]_::
       return res;
     }

-    -- Haskell source, all elements in the array must be
-    -- either ByteArray# or MutableByteArray#. This is not
-    -- enforced by the type system in this example.
+    -- Haskell source, all elements in the argument array must be
+    -- either ByteArray# or MutableByteArray#. This is not enforced
+    -- by the type system in this example since ArrayArray is untyped.
     foreign import ccall unsafe "sum_first" sumFirst :: ArrayArray# -> IO CInt

-Mutable arrays of heap objects record writes for the purpose of
-garbage collection. ``MutableArray#`` uses a card table, and
-``SmallMutableArray#`` uses only a dirty bit. When passing
-an array of heap objects into a foreign function, GHC assumes
-that the foreign import does not modify the contents. Consequently,
-it is not safe to write to an array of heap objects in a foreign
-function. Foreign functions must treat such arrays as read-only.
-However, note that the runtime has no facilities for tracking
-mutation of a ``MutableByteArray#``. It is safe to mutate these
-in a foreign function.
-
 Although GHC allows the user to pass all unlifted boxed types to
 foreign functions, some of them are not amenable to useful work.
 Although ``Array#`` is unlifted, the elements in its payload are
 lifted, and a foreign C function cannot safely force thunks. Consequently,
-a foreign C function do anything with the elements of an ``Array#``
-other checking pointer equality as a shortcut.
+a foreign C function cannot dereference any of the addresses that comprise
+the payload of the ``Array#``.

 .. _ffi-newtype-io:

@@ -966,6 +974,26 @@ to the floating point state, so that if you really need to use
 -  It is safe to modify the floating-point unit state temporarily during
    a foreign call, because foreign calls are never pre-empted by GHC.

+.. _pinned-byte-arrays:
+
+Pinned Byte Arrays
+~~~~~~~~~~~~~~~~~~
+
+A pinned byte array is one that the garbage collector is not allowed
+to move. Consequently, it has a stable address that can be safely
+requested with ``byteArrayContents#``. There are a handful of
+primitive functions in :ghc-prim-ref:`GHC.Prim <GHC-Prim.html>`
+used to enforce or check for pinnedness: ``isByteArrayPinned#``,
+``isMutableByteArrayPinned#``, and ``newPinnedByteArray#``. A
+byte array can be pinned as a result of three possible causes:
+
+1. It was allocated by ``newPinnedByteArray#``.
+2. It is large. Currently, GHC defines large object to be one
+   that is at least as large as 80% of a 4KB block (i.e. at
+   least 3277 bytes).
+3. It has been copied into a compact region. The documentation
+   for ``ghc-compact`` and ``compact`` describes this process.
+
 .. [1] Prior to GHC 8.10, when passing an ``ArrayArray#`` argument
        to a foreign function, the foreign function would see a pointer
        to the ``StgMutArrPtrs`` rather than just the payload.
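
Reviewer note: the documentation changed by this commit states that a foreign C function called with an unlifted array argument sees a pointer to the array payload rather than to the enclosing heap object. The C source of the first example falls outside the hunks shown, so it is not reproduced here. The following is only a minimal sketch of what such a payload-only function could look like; the name ``sum_first_three`` and its signature are hypothetical and are not taken from the commit.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of the commit: sums the first three
 * bytes of a buffer. If it were called through an unsafe foreign import
 * with a ByteArray# or MutableByteArray# argument, `payload` would point
 * directly at the array's bytes, so no RTS header is required. The
 * caller must guarantee that at least three bytes are readable. */
int sum_first_three(const uint8_t *payload) {
    assert(payload != NULL);
    int total = 0;
    for (size_t i = 0; i < 3; i++) {
        total += payload[i];
    }
    return total;
}
```

On the Haskell side, a declaration along the lines of ``foreign import ccall unsafe "sum_first_three" sumFirstThree :: MutableByteArray# RealWorld -> IO CInt`` (with ``UnliftedFFITypes`` and ``MagicHash`` enabled) would presumably bind it; that declaration is likewise an assumption made for illustration. Because the function only reads the buffer, it is compatible with the rule that only ``MutableByteArray#`` may be mutated from foreign code.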