.. image:: https://pypip.in/d/pysendfile/badge.png
    :target: https://crate.io/packages/pysendfile/
    :alt: Download this month

.. image:: https://pypip.in/v/pysendfile/badge.png
    :target: https://pypi.python.org/pypi/pysendfile/
    :alt: Latest version

.. image:: https://pypip.in/license/pysendfile/badge.png
    :target: https://pypi.python.org/pypi/pysendfile/
    :alt: License

.. image:: https://api.travis-ci.org/giampaolo/pysendfile.png?branch=master
    :target: https://travis-ci.org/giampaolo/pysendfile
    :alt: Travis

===========
Quick links
===========

- Home page
- Mailing list
- Blog
- What's new

=====
About
=====

sendfile(2) is a system call which provides a "zero-copy" way of copying data
from one file descriptor to another (a socket). The phrase "zero-copy" refers
to the fact that all of the copying of data between the two descriptors is done
entirely by the kernel, with no copying of data into userspace buffers. This is
particularly useful when sending a file over a socket (e.g. FTP).

The normal way of sending a file over a socket involves reading data from the
file into a userspace buffer, then writing that buffer to the socket via send()
or sendall():

.. code-block:: python

    # how a file is typically sent

    import socket

    file = open("somefile", "rb")
    sock = socket.socket()
    sock.connect(("127.0.0.1", 8021))

    while True:
        chunk = file.read(65536)
        if not chunk:
            break  # EOF
        sock.sendall(chunk)

Copying the data twice (once into the userland buffer, and once out of that
userland buffer) imposes some performance and resource penalties. The
sendfile(2) syscall avoids these penalties by avoiding any use of userland
buffers; it also results in a single system call (and thus only one context
switch), rather than the series of read(2) / write(2) system calls (each system
call requiring a context switch) used internally for the data copying.

.. code-block:: python

    import os
    import socket
    from sendfile import sendfile

    file = open("somefile", "rb")
    blocksize = os.path.getsize("somefile")
    sock = socket.socket()
    sock.connect(("127.0.0.1", 8021))
    offset = 0

    while True:
        sent = sendfile(sock.fileno(), file.fileno(), offset, blocksize)
        if sent == 0:
            break  # EOF
        offset += sent

==================
A simple benchmark
==================

This benchmark script implements the two examples above and compares plain
socket.send() and sendfile() performance in terms of CPU time spent and bytes
transmitted per second, with sendfile() coming out about **2.5x faster**. These
are the results I get on my Linux 2.6.38 box, AMD dual-core 1.6 GHz:

*send()*

+---------------+-----------------+
| CPU time      | 28.84 usec/pass |
+---------------+-----------------+
| transfer rate | 359.38 MB/sec   |
+---------------+-----------------+

*sendfile()*

+---------------+-----------------+
| CPU time      | 11.28 usec/pass |
+---------------+-----------------+
| transfer rate | 860.88 MB/sec   |
+---------------+-----------------+

===========================
When do you want to use it?
===========================

Basically any application sending files over the network can take advantage of
sendfile(2). HTTP and FTP servers are a typical example. proftpd and vsftpd are
known to use it, and so is pyftpdlib.

=================
API documentation
=================

The sendfile module provides a single function: sendfile().

- ``sendfile.sendfile(out, in, offset, nbytes, header="", trailer="", flags=0)``

  Copy *nbytes* bytes from file descriptor *in* (a regular file) to file
  descriptor *out* (a socket) starting at *offset*. Return the number of bytes
  just sent. When the end of file is reached return 0.
  On Linux, if *offset* is given as *None*, the bytes are read from the current
  position of *in* and the position of *in* is updated.
  *header* and *trailer* are strings that are written before and after the
  data from *in* is written. In cross platform applications their usage is
  discouraged (send() or sendall() can be used instead).
  On Solaris, *out* may be the file descriptor of a regular file or the file
  descriptor of a socket. On all other platforms, *out* must be the file
  descriptor of an open socket.
  The *flags* argument is only supported on FreeBSD.

- ``sendfile.SF_NODISKIO``
- ``sendfile.SF_MNOWAIT``
- ``sendfile.SF_SYNC``

  Parameters for the *flags* argument, if the implementation supports it. They
  are available on FreeBSD platforms. See FreeBSD's man sendfile(2).

=======================
Differences with send()
=======================

- sendfile(2) works with regular (mmap-like) files only (e.g. you can't use it
  with a StringIO object).
- Also, it must be clear that the file can only be sent "as is" (e.g. you
  can't modify the content while transmitting).
  There might be problems with non regular filesystems such as NFS,
  SMBFS/Samba and CIFS. For this please refer to the proftpd documentation.
- OSError is raised instead of socket.error.
  The accompanying error codes have the same meaning though: EAGAIN,
  EWOULDBLOCK and EBUSY mean you are supposed to retry; ECONNRESET, ENOTCONN,
  ESHUTDOWN and ECONNABORTED indicate a disconnection.

Some examples: benchmark script, test suite, pyftpdlib wrapper.

===================
Supported platforms
===================

This module works with Python versions from **2.5** to **3.4**. The supported
platforms are:

- **Linux**
- **Mac OSX**
- **FreeBSD**
- **Dragon Fly BSD**
- **Sun OS**
- **AIX** (not properly tested)

=======
Support
=======

Feel free to mail me at *g.rodola [AT] gmail [DOT] com* or post on the mailing
list: http://groups.google.com/group/py-sendfile.

======
Status
======

As of now the code includes a solid test suite and it's ready for production
use. It's been included in the pyftpdlib project and used in production
environments for almost a year now without any problem being reported so far.

=======
Authors
=======

pysendfile was originally written by *Ben Woolley* including Linux, FreeBSD and
DragonFly BSD support. Later on *Niklas Edmundsson* took over maintenance and
added AIX support. After a couple of years of project stagnation
Giampaolo Rodola' took over maintenance and rewrote it from scratch adding
support for:

- Python 3
- non-blocking sockets
- large file support
- Mac OSX
- Sun OS
- FreeBSD flag argument
- multiple threads (release GIL)
- a simple benchmark suite
- unit tests
- documentation

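
=========================================
Sketch: retrying on a non-blocking socket
=========================================

The "Differences with send()" section says that EAGAIN, EWOULDBLOCK and EBUSY
mean the call should be retried. The following is a minimal sketch of what such
a retry loop might look like on a non-blocking socket. It is not code from this
project: the names ``send_file_nonblocking`` and ``RETRY_ERRNOS`` are
hypothetical, waiting with ``select`` is just one possible strategy, and the
fallback to ``os.sendfile`` (Python 3.3+) assumes it accepts the same four
positional arguments as the function documented above.

```python
import errno
import select
import socket
import tempfile

try:
    # The module documented in this README.
    from sendfile import sendfile
except ImportError:
    # Assumption: the stdlib call takes the same (out, in, offset, nbytes).
    from os import sendfile

# Error codes this README lists as "you are supposed to retry".
RETRY_ERRNOS = {errno.EAGAIN, errno.EWOULDBLOCK, errno.EBUSY}


def send_file_nonblocking(sock, fileobj, blocksize=65536, timeout=5.0):
    """Send fileobj over a non-blocking socket; return total bytes sent."""
    offset = 0
    while True:
        try:
            sent = sendfile(sock.fileno(), fileobj.fileno(), offset, blocksize)
        except OSError as err:
            if err.errno in RETRY_ERRNOS:
                # Socket not ready: wait until it is writable (or the
                # timeout expires), then try the same offset again.
                select.select([], [sock], [], timeout)
                continue
            raise  # e.g. ECONNRESET: the peer has gone away
        if sent == 0:
            return offset  # EOF
        offset += sent


if __name__ == "__main__":
    # Tiny self-contained demo over a local socket pair.
    with tempfile.NamedTemporaryFile() as f:
        f.write(b"hello sendfile\n" * 100)
        f.flush()
        out_sock, in_sock = socket.socketpair()
        out_sock.setblocking(False)
        total = send_file_nonblocking(out_sock, f)
        out_sock.close()
        in_sock.close()
        print("bytes sent:", total)
```

A real server would more likely register the socket with its event loop than
block in ``select``; the loop above only illustrates which error codes lead to
a retry and which are re-raised.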