System V IPC for Python - Semaphores, Shared Memory and Message Queues



This describes the sysv_ipc module which gives Python access to System V inter-process semaphores, shared memory and message queues on most (all?) *nix flavors. Examples include OS X, Linux, FreeBSD, OpenSolaris 2008.11, and AIX 5.2. It might also work under Windows with a library like Cygwin.

It works with Python 2.4 – 3.x. It's released under a BSD license.

You can download sysv_ipc version 0.6.8 ([md5 sum], [sha1 sum]) which contains the source code, setup.py, installation instructions and sample code. You can read about all of the changes in this version.

You might also want to read about some known bugs.

You might be interested in the very similar module posix_ipc which provides Python access to POSIX IPC primitives. POSIX IPC is a little easier to use than SysV IPC, but not all operating systems support it completely.

Module sysv_ipc


Module Functions

attach(id, [address = None, [flags = 0]])
Attaches the (existing) shared memory that has the given id and returns a new SharedMemory object. See SharedMemory.attach() for details on the address and flags parameters.

This method is useful only under fairly unusual circumstances. You probably don't need it.

ftok(path, id, [silence_warning = False])
Calls ftok(path, id). Note that ftok() has limitations, and this function will issue a warning to that effect unless silence_warning is True.
remove_semaphore(id)
Removes the semaphore with the given id.
remove_shared_memory(id)
Removes the shared memory with the given id.
remove_message_queue(id)
Removes the message queue with the given id.

Module Constants

IPC_CREAT, IPC_EXCL and IPC_CREX
IPC_CREAT and IPC_EXCL are flags used when creating IPC objects. They're bitwise unique and can be ORed together. IPC_CREX is shorthand for IPC_CREAT | IPC_EXCL.

When passed to an IPC object's constructor, IPC_CREAT indicates that you want to create a new object or open an existing one. If you want the call to fail if an object with that key already exists, specify the IPC_EXCL flag, too.

IPC_PRIVATE
This is a special value that can be passed in place of a key. It implies that the IPC object should be available only to the creating process or its child processes (e.g. those created with fork()).
KEY_MIN and KEY_MAX
Denote the range of keys that this module accepts. Your OS might limit keys to a smaller range depending on the typedef of key_t.

Keys randomly generated by this module are in the range 1 ≤ key ≤ SHRT_MAX. That's type-safe unless your OS has a very bizarre definition of key_t.

SEMAPHORE_VALUE_MAX
The maximum value of a semaphore.
PAGE_SIZE
The operating system's memory page size, in bytes. It's probably a good idea to make shared memory segments some multiple of this size.
SEMAPHORE_TIMEOUT_SUPPORTED
True if the platform supports timed semaphore waits, False otherwise.
SHM_RND
You probably don't need this, but it can be used when attaching shared memory to force the address to be rounded down to SHMLBA. See your system's man page for shmat() for more information.
SHM_HUGETLB, SHM_NORESERVE and SHM_REMAP
You probably don't need these. They're Linux-specific flags that can be passed to the SharedMemory constructor, or to the .attach() function in the case of SHM_REMAP. See your system's man page for shmget() and shmat() for more information.
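
The PAGE_SIZE advice above can be sketched as a small rounding helper. This is only an illustration: round_up_to_page is a hypothetical name that is not part of sysv_ipc, and it takes the page size as a parameter so that it runs without the module installed. In real code one would presumably pass sysv_ipc.PAGE_SIZE.

```python
def round_up_to_page(size, page_size):
    """Return the smallest multiple of page_size that is >= size.

    Hypothetical helper, not part of sysv_ipc. A non-positive size is
    rounded up to one full page.
    """
    if page_size <= 0:
        raise ValueError("page_size must be positive")
    if size <= 0:
        return page_size
    # Integer arithmetic: add (page_size - 1), then drop the remainder.
    return ((size + page_size - 1) // page_size) * page_size


# Example with an assumed 4096-byte page.
print(round_up_to_page(5000, 4096))
```

The result can then be used as the size argument when creating a segment.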

Module Errors

In addition to standard Python errors (e.g. ValueError), this module raises custom errors. These errors cover situations specific to IPC.

Error
The base error class for all the custom errors in this module. This error is occasionally raised on its own but you'll almost always see a more specific error.
InternalError
Indicates that something has gone very wrong in the module code. Please report this to the maintainer.
PermissionsError
Indicates that you've attempted something that the permissions on the IPC object don't allow.
ExistentialError
Indicates an error related to the existence or non-existence of an IPC object.
BusyError
Raised when a semaphore call to .P() or .Z() either times out or would be forced to wait when its block attribute is False.
NotAttachedError
Raised when a process attempts to read from or write to a shared memory segment to which it is not attached.

The Semaphore Class

This is a handle to a semaphore.

Methods

Semaphore(key, [flags = 0, [mode = 0600, [initial_value = 0]]])
Creates a new semaphore or opens an existing one.

key must be None, IPC_PRIVATE or an integer > KEY_MIN and ≤ KEY_MAX. If the key is None, the module chooses a random unused key.

The flags specify whether you want to create a new semaphore or open an existing one.

When opening an existing semaphore, mode is ignored.

acquire([timeout = None, [delta = 1]])
Waits (conditionally) until the semaphore's value is > 0 and then returns, decrementing the semaphore.

The timeout (which can be a float) specifies how many seconds this call should wait, if at all.

The semantics of the timeout changed a little in version 0.3.

When the call returns, the semaphore's value decreases by delta (or more precisely, abs(delta)) which defaults to 1.

On platforms that don't support the semtimedop() API call, all timeouts (including zero) are treated as infinite. The call will not return until its wait condition is satisfied.

Most platforms provide semtimedop(). OS X is a notable exception. The module's Boolean constant SEMAPHORE_TIMEOUT_SUPPORTED is True on platforms that support semtimedop().
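
A minimal sketch of a timed acquire that respects the caveat above. The name acquire_with_timeout is hypothetical, and the choice to raise RuntimeError where timeouts are unsupported is an assumption of this sketch, not something the module does. sysv_ipc is imported inside the function only so that the sketch can be loaded on a machine without the module.

```python
def acquire_with_timeout(sem, seconds):
    """Try to acquire sem within `seconds`; return True on success.

    Hypothetical helper. `sem` is expected to be a sysv_ipc.Semaphore.
    """
    import sysv_ipc  # lazy import so the sketch loads without the module

    if not sysv_ipc.SEMAPHORE_TIMEOUT_SUPPORTED:
        # Without semtimedop() every timeout is treated as infinite, so
        # refuse here rather than block forever by surprise.
        raise RuntimeError("timed semaphore waits are not supported here")
    try:
        sem.acquire(seconds)
    except sysv_ipc.BusyError:
        return False  # the wait timed out
    return True
```

Callers that get False back know the semaphore was not acquired and must not call release().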

release([delta = 1])
Releases (increments) the semaphore.

The semaphore's value increases by delta (or more precisely, abs(delta)) which defaults to 1.

P()
A synonym for .acquire() that takes the same parameters.

"P" stands for prolaag or probeer te verlagen (try to decrease), the original name given by Edsger Dijkstra.

V()
A synonym for .release() that takes the same parameters.

"V" stands for verhoog (increase), the original name given by Edsger Dijkstra.

Z([timeout = None])
Blocks until zee zemaphore is zero.

Timeout has the same meaning as described in .acquire().

remove()
Removes (deletes) the semaphore from the system.

As far as I can tell, the effect of deleting a semaphore that other processes are still using is OS-dependent. Check your system's man pages for semctl(IPC_RMID).

Attributes

key (read-only)
The key passed in the call to the constructor.
id (read-only)
The id assigned to this semaphore by the OS.
value
The integer value of the semaphore.
undo
Defaults to False.

When True, operations that change the semaphore's value will be undone (reversed) when the process exits. Note that when a process exits, an undo operation may imply that a semaphore's value should become negative or exceed its maximum. Behavior in this case is system-dependent, which means that using this flag can make your code non-portable.

block
Defaults to True, which means that calls to acquire() and release() will not return until their wait conditions are satisfied.

When False, these calls will not block but will instead raise an error if they are unable to return immediately.

mode
The semaphore's permission bits.

Tip: the following Python code will display the mode in octal:
print(oct(my_sem.mode))

uid
The semaphore's user id.
gid
The semaphore's group id.
cuid (read-only)
The semaphore creator's user id.
cgid (read-only)
The semaphore creator's group id.
last_pid (read-only)
The PID of the process that last called semop() (.P(), .V() or .Z()) on this semaphore.
waiting_for_nonzero (read-only)
The number of processes waiting for the value of the semaphore to become non-zero (i.e. the number waiting in a call to .P()).
waiting_for_zero (read-only)
The number of processes waiting for the value of the semaphore to become zero (i.e. the number waiting in a call to .Z()).
o_time (read-only)
The last time semop() (i.e. .P(), .V() or .Z()) was called on this semaphore.

Context Manager Support

These semaphores provide __enter__() and __exit__() methods so they can be used as context managers. For instance --

with sysv_ipc.Semaphore(key) as sem:
    # Do something...

Entering the context acquires the semaphore, exiting the context releases the semaphore. See demo4/child.py for a complete example.

The SharedMemory Class

This is a handle to a shared memory segment.

Methods

SharedMemory(key, [flags = 0, [mode = 0600, [size = 0 or PAGE_SIZE, [init_character = ' ']]]])
Creates a new shared memory segment or opens an existing one. The memory is automatically attached.

key must be None, IPC_PRIVATE or an integer > 0 and ≤ KEY_MAX. If the key is None, the module chooses a random unused key.

The flags specify whether you want to create a new shared memory segment or open an existing one.

The value of size depends on whether one is opening an existing segment or creating a new one.

This module supplies a default size of PAGE_SIZE when IPC_CREX is specified and 0 otherwise.

attach([address = None, [flags = 0]])
Attaches this process to the shared memory. The memory must be attached before calling .read() or .write(). Note that the constructor automatically attaches the memory so you won't need to call this method unless you explicitly detach it and then want to use it again.

The address parameter allows one to specify (as a Python long) a memory address at which to attach the segment. Passing None (the default) is equivalent to passing NULL to shmat(). See that function's man page for details.

The flags are mostly only relevant if one specifies a specific address. One exception is the flag SHM_RDONLY which, surprisingly, attaches the segment read-only.

Note that on some (and perhaps all) platforms, each call to .attach() increments the system's "attached" count. Thus, if each call to .attach() isn't paired with a call to .detach(), the system's "attached" count for the shared memory segment will not go to zero when the process exits. As a result, the shared memory segment may not disappear even when its creator calls .remove() and exits.

detach()
Detaches this process from the shared memory.
read([byte_count = 0, [offset = 0]])
Reads up to byte_count bytes from the shared memory segment starting at offset and returns them as a string under Python 2 or as a bytes object under Python 3.

If byte_count is zero (the default) the entire buffer is returned.

This method will never attempt to read past the end of the shared memory segment, even when offset + byte_count exceeds the memory segment's size. In that case, the bytes from offset to the end of the segment are returned.
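
The clamping described for .read() can be modeled in pure Python. This is a model of the documented behavior, not the module's implementation; model_read is a hypothetical name, and the bytes object stands in for the segment. One assumption is labeled in the code: what byte_count = 0 means when offset is nonzero.

```python
def model_read(segment, byte_count=0, offset=0):
    """Pure-Python model of the documented read() clamping.

    `segment` is a bytes object standing in for the shared memory.
    Assumption: with byte_count == 0 and a nonzero offset, this model
    returns everything from offset to the end of the segment.
    """
    if byte_count == 0:
        return segment[offset:]
    # Slicing never runs past the end, mirroring the documented behavior.
    return segment[offset:offset + byte_count]


segment = b"0123456789"
print(model_read(segment))        # the whole buffer
print(model_read(segment, 4, 8))  # only two bytes remain after offset 8
```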

write(s, [offset = 0])
Writes the string s to the shared memory, starting at offset.

At most n bytes will be written, where n = the segment's size minus offset.

The string may contain embedded NULL bytes ('\0').

remove()
Removes (destroys) the shared memory. Note that actual destruction of the segment only occurs when all processes have detached.

Attributes

key (read-only)
The key provided in the constructor.
id (read-only)
The id assigned to this shared memory segment by the OS.
size (read-only)
The size of the segment in bytes.
address (read-only)
The address of the segment as a Python long.
attached (read-only)
If True, this segment is currently attached.
last_attach_time (read-only)
The last time a process attached this segment.
last_detach_time (read-only)
The last time a process detached this segment.
last_change_time (read-only)
The last time a process changed the uid, gid or mode on this segment.
creator_pid (read-only)
The PID of the process that created this segment.
last_pid (read-only)
The PID of the most recent process to attach or detach this segment.
number_attached (read-only)
The number of processes attached to this segment.
uid
The segment's user id.
gid
The segment's group id.
mode
The shared memory's permission bits.

Tip: the following Python code will display the mode in octal:
print(oct(my_mem.mode))

cuid (read-only)
The segment creator's user id.
cgid (read-only)
The segment creator's group id.

The MessageQueue Class

This is a handle to a message queue.

Methods

MessageQueue(key, [flags = 0, [mode = 0600, [max_message_size = 2048]]])
Creates a new message queue or opens an existing one.

key must be None, IPC_PRIVATE or an integer > 0 and ≤ KEY_MAX. If the key is None, the module chooses a random unused key.

The flags specify whether you want to create a new queue or open an existing one.

The max_message_size can be increased from the default, but be aware of the issues discussed in Message Queue Limits.

send(message, [block = True, [type = 1]])
Puts a message on the queue.

The message string can contain embedded NULLs (ASCII 0x00).

The block flag specifies whether or not the call should wait if the message can't be sent (if, for example, the queue is full). When block is False, the call will raise a BusyError if the message can't be sent immediately.

The type is associated with the message and is relevant when calling receive(). It must be > 0.

receive([block = True, [type = 0]])
Receives a message from the queue, returning a tuple of (message, type). Under Python 3, the message is a bytes object.

The block flag specifies whether or not the call should wait if there are no messages of the specified type to retrieve. When block is False, the call will raise a BusyError if a message can't be received immediately.

The type permits some control over which messages are retrieved.
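
The text above does not spell out how type selects a message. Assuming the value is handed unchanged to the system's msgrcv(), the selection rules from that call's man page can be modeled in pure Python as below. pick_index is a hypothetical name, and the list of (message, type) tuples stands in for a queue in arrival order.

```python
def pick_index(queue, msg_type=0):
    """Return the index of the message msgrcv() would select, or None.

    `queue` is a list of (message, type) tuples in arrival order.
    This models msgrcv() semantics; it is not sysv_ipc code.
    """
    if msg_type == 0:
        # Zero: the first message on the queue, whatever its type.
        return 0 if queue else None
    if msg_type > 0:
        # Positive: the first message of exactly that type.
        wanted = msg_type
    else:
        # Negative: the first message of the lowest type <= abs(msg_type).
        eligible = [t for _, t in queue if t <= -msg_type]
        if not eligible:
            return None
        wanted = min(eligible)
    for index, (_, t) in enumerate(queue):
        if t == wanted:
            return index
    return None


queue = [(b"a", 3), (b"b", 1), (b"c", 2), (b"d", 1)]
print(pick_index(queue, 0), pick_index(queue, 2), pick_index(queue, -2))
```

A return value of None corresponds to the case where a real receive() would block, or raise BusyError when block is False.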

remove()
Removes (deletes) the message queue.

Attributes

key (read-only)
The key provided in the constructor.
id (read-only)
The id assigned to this queue by the OS.
max_size
The maximum size of the queue in bytes. Only a process with "appropriate privileges" can increase this value, and on some systems even that won't work. See Message Queue Limits for details.
last_send_time (read-only)
The last time a message was placed on the queue.
last_receive_time (read-only)
The last time a message was received from the queue.
last_change_time (read-only)
The last time a process changed the queue's attributes.
last_send_pid (read-only)
The id of the most recent process to send a message.
last_receive_pid (read-only)
The id of the most recent process to receive a message.
current_messages (read-only)
The number of messages currently in the queue.
uid
The queue's user id.
gid
The queue's group id.
mode
The queue's permission bits.

Tip: the following Python code will display the mode in octal:
print(oct(my_mq.mode))

cuid (read-only)
The queue creator's user id.
cgid (read-only)
The queue creator's group id.

Supported Features and Differences from SHM

This module is almost, but not quite, a superset of shm. Some of the additional features are the ability to override the block flag on a per-call basis, the ability to change the semaphore's value in increments > 1 when calling .P() and .V() and exposure of sem_otime.

Differences that might trip you up are listed below.

Usage Tips

Sample Code

This module comes with two demonstration apps. The first (in the directory demo) shows how to use shared memory and semaphores. The second (in the directory demo2) shows how to use message queues.

The Weakness of ftok()

Most System V IPC sample code recommends ftok() for generating an integer key that's more-or-less random. It does not, however, guarantee that the key it generates is unused. If ftok() gives your application a key that some other application is already using, your app is in trouble unless it has a reliable second mechanism for generating a key. And if that's the case, why not just abandon ftok() and use the second mechanism exclusively?

This is the weakness of ftok() -- it isn't guaranteed to give you what you want. The BSD man page for ftok says it is "quite possible for the routine to return duplicate keys". The term "quite possible" isn't quantified, but suppose it means one-tenth of one percent. Who wants to have 1-in-1000 odds of a catastrophic failure in their program, or even 1-in-10000?
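
To see why duplicates are possible, here is a pure-Python sketch of how one common implementation (glibc) combines its inputs: only the low 16 bits of the inode number and the low 8 bits of the device number and project id survive. Other platforms may combine them differently, so treat this as an illustration of one implementation rather than a specification. ftok_like is a hypothetical name, and the device and inode numbers are made up.

```python
def ftok_like(st_dev, st_ino, proj_id):
    """Combine device, inode and project id the way glibc's ftok() does.

    Illustrative model only; other systems may use a different formula.
    """
    return (st_ino & 0xFFFF) | ((st_dev & 0xFF) << 16) | ((proj_id & 0xFF) << 24)


# Two different (made-up) inodes on the same device whose low 16 bits match.
key_a = ftok_like(0x801, 0x12345, 1)
key_b = ftok_like(0x801, 0x22345, 1)
print(hex(key_a), hex(key_b), key_a == key_b)
```

Any two files whose inode numbers differ only above bit 15 produce the same key under this formula.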

This module obviates the need for ftok() by generating random keys for you. If your application can't use sysv_ipc's automatically generated keys because it needs to know the key in advance, hardcoding a random number like 123456 in your app might be no worse than using ftok() and has the advantage of not hiding its limitations.

This module provides ftok() in case you want to experiment with it. However, to emphasize its weakness, this version of ftok() raises a warning with every call unless you explicitly pass a flag to silence it.

This package also provides ftok_experiment.py so that you can observe how often ftok() generates duplicate keys on your system.

Semaphore Initialization

When a System V semaphore is created at the C API level, the OS is not required to initialize the semaphore's value. (This is per the SUSv3 standard for semget().) Some (most? all?) operating systems initialize it to zero, but this behavior is non-standard and therefore can't be relied upon.

If semaphore creation happens in a predictable, orderly fashion, this isn't a problem. But a race condition arises when multiple processes vie to create/open the same semaphore. The problem lies in the fact that when an application calls semget() with only the IPC_CREAT flag, the caller can't tell whether it has created a new semaphore or opened an existing one. This makes it difficult to create reliable code without using IPC_EXCL. W. Richard Stevens' Unix Network Programming Volume 2 calls this "a fatal flaw in the design of System V semaphores" (p 284).

For instance, imagine processes P1 and P2. They're executing the same code, and that code intends to share a binary semaphore. Consider the following sequence of events at the startup of P1 and P2 –

  1. P1 calls semget(IPC_CREAT) to create the semaphore S.
  2. P2 calls semget(IPC_CREAT) to open S.
  3. P1 initializes the semaphore's value to 1.
  4. P1 calls acquire(), decrementing the value to 0.
  5. P2, assuming S is a newly-created semaphore that needs to be initialized, incorrectly sets the semaphore's value to 1.
  6. P2 calls acquire(), decrementing the value to 0. Both processes now think they own the lock.
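
The six steps above can be replayed in a single process, with a plain integer standing in for the semaphore S. This is a toy simulation, not real IPC: replay and try_acquire are hypothetical names, and try_acquire simply skips where a real acquire() would block.

```python
def replay(p2_reinitializes):
    """Replay the six steps; return who believes they hold the lock."""
    state = {"value": 0}  # steps 1 and 2: S exists and both have it open
    owners = []

    def try_acquire(who):
        # A real acquire() would block at zero; the toy version just skips.
        if state["value"] > 0:
            state["value"] -= 1
            owners.append(who)

    state["value"] = 1      # step 3: P1 initializes S to 1
    try_acquire("P1")       # step 4: P1 takes the lock
    if p2_reinitializes:
        state["value"] = 1  # step 5: P2 wrongly initializes S again
    try_acquire("P2")       # step 6: P2 tries to take the lock
    return owners


print(replay(True))   # the buggy interleaving
print(replay(False))  # without step 5, P2 would have had to wait
```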

W. Richard Stevens' solution for this race condition is to check the value of sem_otime, an element in the semid_ds struct that's populated on the call to semctl(IPC_STAT) and which this module exposes to Python as o_time. It is initialized to zero when the semaphore is created and otherwise holds the time of the last call to semop() (which is called by P()/acquire(), V()/release(), and Z()).

In Python, each process would run something like this:

import time

import sysv_ipc

try:
    sem = sysv_ipc.Semaphore(42, sysv_ipc.IPC_CREX)
except sysv_ipc.ExistentialError:
    # One of my peers created the semaphore already
    sem = sysv_ipc.Semaphore(42)
    # Waiting for that peer to do the first acquire or release
    while not sem.o_time:
        time.sleep(.1)
else:
    # Initializing sem.o_time to nonzero value
    sem.release()
# Now the semaphore is safe to use.

Shared Memory Initialization

With shared memory, using the IPC_CREAT flag without IPC_EXCL is problematic unless you know the size of the segment you're potentially opening.

Why? Because when creating a new segment, many (most? all?) operating systems demand a non-zero size. However, when opening an existing segment, zero is the only guaranteed safe value (again, assuming one doesn't know the size of the segment in advance). Since IPC_CREAT can open or create a segment, there's no safe value for the size under this circumstance.
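
One way around the problem, sketched under the assumption that every cooperating process agrees on the key: try to create the segment exclusively with a real size, and fall back to opening it with the default size of 0 if it already exists. create_or_open_segment is a hypothetical name. sysv_ipc is imported inside the function only so that the sketch can be loaded on a machine without the module.

```python
def create_or_open_segment(key, size):
    """Return (segment, created) without ever guessing an existing size.

    Hypothetical helper built on the constructor signature shown earlier.
    """
    import sysv_ipc  # lazy import so the sketch loads without the module

    try:
        # IPC_CREX fails if the key is in use, so a nonzero size is safe.
        mem = sysv_ipc.SharedMemory(key, sysv_ipc.IPC_CREX, 0o600, size)
        return mem, True
    except sysv_ipc.ExistentialError:
        # Opening: flags and size default to 0, the only safe size for
        # a segment whose real size is unknown.
        return sysv_ipc.SharedMemory(key), False
```

After opening, the segment's size attribute reports the actual size chosen by the creator.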

As a (sort of) side note, the SUSv3 specification for shmget() says only that the size of a new segment must not be less than "the system-imposed minimum". I gather that at one time, some systems set the minimum at zero despite the fact that it doesn't make much sense to create a zero-length shared memory segment. I think most modern systems do the sensible thing and insist on a minimum length of 1.

Message Queue Limits

Python programmers can usually remain blissfully ignorant of memory allocation issues. Unfortunately, a combination of factors makes them relevant when dealing with System V message queues.

Some implementations impose extremely stingy limits. For instance, many BSDish systems (OS X, FreeBSD, NetBSD, and OpenBSD) limit queues to 2048 bytes. Note that that's the total queue size, not the message size. Two 1k messages would fill the queue.

Those limits can be very difficult to change. At best, only privileged processes can increase the limit. At worst, the limit is a kernel parameter and requires a kernel change via a tunable or a recompile.

This module can't figure out what the limits are, so it can't cushion them or even report them to you. On some systems the limits are expressed in header files, on others they're available through kernel interfaces (like FreeBSD's sysctl). Under OS X and to some extent OpenSolaris I can't figure out where they're defined and what I report here is the result of experimentation and educated guesses formed by Googling.

The good news is that this module will still behave as advertised no matter what these limits are. Nevertheless you might be surprised when a call to .send() gets stuck because a queue is full even though you've only put 2048 bytes of messages in it.
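
If you want to find out empirically how much a queue holds on your system, a sketch along these lines may help. It sends with block set to False until BusyError is raised. count_until_full and its limit parameter are hypothetical, and the messages it sends are left on the queue, so use a scratch queue. sysv_ipc is imported inside the function only so that the sketch can be loaded on a machine without the module.

```python
def count_until_full(mq, payload, limit=10000):
    """Send payload without blocking until the queue refuses it.

    Returns the number of messages accepted. `limit` is a safety cap
    for systems with very generous queues. Hypothetical helper.
    """
    import sysv_ipc  # lazy import so the sketch loads without the module

    sent = 0
    while sent < limit:
        try:
            mq.send(payload, False)  # block=False: raise instead of waiting
        except sysv_ipc.BusyError:
            break  # the queue is full
        sent += 1
    return sent
```

On a system with a 2048-byte queue limit, one would expect a 1k payload to be accepted only about twice.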

Here are the limits I've been able to find under my test operating systems, ordered from best (most generous) to worst (most stingy). This information was current as of 2009 when I wrote the message queue code. It's getting pretty stale now. I hope the situation has improved over the 2009 numbers I describe below.

Under OpenSolaris 2008.05 each queue's maximum size defaults to 64k. A privileged process (e.g. root) can change this through the max_size attribute of a sysv_ipc.MessageQueue object. I was able to increase it to 16M and successfully sent sixteen 1M messages to the queue.

Under Ubuntu 8.04 (and perhaps other Linuxes) each queue's maximum size defaults to 16k. As with OpenSolaris, I was able to increase this to 16M, but only for a privileged process.

Under FreeBSD 7 and I think NetBSD and OpenBSD, each queue's maximum size defaults to 2048 bytes. Furthermore, one can (as root) set max_size to something larger and FreeBSD doesn't complain, but it also ignores the change.

OS X is the worst of the lot. Each queue is limited to 2048 bytes and OS X silently ignores attempts to increase this (just like FreeBSD). To add insult to injury, there appears to be no way to increase this limit short of recompiling the kernel. I'm guessing at this based on the Darwin message queue limits.

If you want to search for these limits on your operating system, the key constants are MSGSEG, MSGSSZ, MSGTQL, MSGMNB, MSGMNI and MSGMAX. Under BSD, sysctl kern.ipc should tell you what you need to know and may allow you to change these parameters.

Nobody Likes a Mr. Messy

Semaphores and especially shared memory are a little different from most Python objects and therefore require a little more care on the part of the programmer. When a program creates a semaphore or shared memory object, it creates something that resides outside of its own process, just like a file on a hard drive. It won't go away when your process ends unless you explicitly remove it.

In short, remember to clean up after yourself.
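
A minimal sketch of one way to do that, using try/finally so that the segment is removed even if something fails in between. roundtrip is a hypothetical name, and the sketch assumes data is a bytes object shorter than the default segment size. sysv_ipc is imported inside the function only so that the sketch can be loaded on a machine without the module.

```python
def roundtrip(data):
    """Write data to a temporary segment, read it back, and clean up.

    Hypothetical helper. Assumes len(data) fits in the default segment
    size (PAGE_SIZE when IPC_CREX is given).
    """
    import sysv_ipc  # lazy import so the sketch loads without the module

    # A key of None asks the module to pick a random unused key.
    mem = sysv_ipc.SharedMemory(None, sysv_ipc.IPC_CREX)
    try:
        mem.write(data)
        return mem.read(len(data))
    finally:
        # Runs even if write() or read() raises, so nothing is left behind.
        mem.detach()
        mem.remove()
```

The same pattern applies to semaphores and message queues, which have a remove() method but nothing to detach.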

Consult Your Local man Pages

The sysv_ipc module is just a wrapper around your system's API. If your system's implementation has quirks, the man pages for semget, semctl, semop, shmget, shmat, shmdt and shmctl will probably cover them.

Interesting Tools

Many systems (although not some older versions of OS X) come with ipcs and ipcrm. The former shows existing shared memory, semaphores and message queues on your system and the latter allows you to remove them.

Last But Not Least

For Pythonistas –

Known Bugs

Bugs? My code never has bugs! There are, however, some suboptimal anomalies...

Version History

Future Features/Changes

These are features that may or may not be added depending on technical difficulty, user interest and so forth.

I don't plan on adding support for semaphore sets.