			The AF_BUS socket address family
			================================

Introduction
------------

AF_BUS is a message-oriented inter-process communication system.

The principal features are:

 - Reliable datagram based communication (all sockets are of type
   SOCK_SEQPACKET)

 - Multicast message delivery (one to many, unicast as a subset)

 - Strict ordering (messages are delivered to every client in the same order)

 - Ability to pass file descriptors

 - Ability to pass credentials

The basic concept is to provide a virtual bus on which multiple
processes can communicate and policy is imposed by a "bus master".

A process can create buses to which other processes can connect and
communicate with each other by sending messages. Processes' addresses
are automatically assigned by the bus on connect and are
unique. Messages can be sent either to a process' unique address or to
a bus multicast address.

Netfilter rules or Berkeley Packet Filter can be used to restrict the
messages that each peer is allowed to receive. This is especially
important when sending to multicast addresses.

Besides messages, processes can send and receive ancillary data (i.e.,
SCM_RIGHTS for passing file descriptors or SCM_CREDENTIALS for passing
Unix credentials). In the case of a multicast message, every recipient
may obtain a copy of the file descriptor or credentials.

A bus is formed by processes connecting to an AF_BUS socket. The
"bus master" does not connect; it binds to the all-zero (NULL) address
instead.

The socket address is made up of a path component and a numeric
component. The path component is either a pathname or an abstract
socket similar to a unix socket. The numeric component is used to
uniquely identify each connection to the bus. Thus the path identifies
a specific bus and the numeric component the attachment to that bus.

The process that calls bind(2) on the socket is the owner of the bus
and is called the bus master. The master is a special client of the
bus and has some responsibility for the bus' operation. The master is
assigned a fixed address with all the bits zero (0x0000000000000000).

Each process connected to an AF_BUS socket has one or more addresses
within that bus. These addresses are 64-bit unsigned integers,
interpreted by splitting the address into two parts: the most
significant 16 bits are a prefix identifying the type of address, and
the remaining 48 bits are the actual client address within that
prefix, as shown in this figure:

Bit:  0             15 16                                            63
     +----------------+------------------------------------------------+
     |  Type prefix   |                Client address                  |
     +----------------+------------------------------------------------+

The prefix with all bits zero is reserved for use by the kernel, which
automatically assigns one address from this prefix to each client on
connection.  The address in this prefix with all bits zero is always
assigned to the bus master. Addresses on the prefix 0x0000 are unique
and will never repeat for the lifetime of the bus master.

A client may have multiple addresses. When data is sent to other
clients, those clients will always see the sender address that is in
the prefix 0x0000 address space when calling recvmsg(2) or
recvfrom(2). Similarly, the prefix 0x0000 address is returned by calls
to getsockname(2) and getpeername(2).

For each prefix, the address where the least significant 48 bits are
all 1 (i.e., 0xffffffffffff) is also reserved, and can be used to send
multicast messages to all the peers on a prefix.

The non-reserved addresses in each of the remaining prefixes are
managed by the bus master, which may assign additional addresses to
any other connected socket.
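
The address layout described above can be sketched with a few helper
functions. This is only an illustration of the 16-bit prefix / 48-bit
client address split; the macro and function names are hypothetical
and are not part of the AF_BUS API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names; only the 16/48-bit split comes from the text. */
#define BUS_PREFIX_SHIFT 48
#define BUS_CLIENT_MASK  UINT64_C(0xffffffffffff)

/* Build a bus address from a 16-bit prefix and a 48-bit client address. */
static inline uint64_t bus_addr_make(uint16_t prefix, uint64_t client)
{
	assert((client & ~BUS_CLIENT_MASK) == 0);
	return ((uint64_t)prefix << BUS_PREFIX_SHIFT) | client;
}

/* Most significant 16 bits: the type prefix. */
static inline uint16_t bus_addr_prefix(uint64_t addr)
{
	return (uint16_t)(addr >> BUS_PREFIX_SHIFT);
}

/* Least significant 48 bits: the client address within the prefix. */
static inline uint64_t bus_addr_client(uint64_t addr)
{
	return addr & BUS_CLIENT_MASK;
}

/* The all-ones client address of any prefix is its multicast address. */
static inline bool bus_addr_is_multicast(uint64_t addr)
{
	return bus_addr_client(addr) == BUS_CLIENT_MASK;
}

/* The all-zero address is always the bus master. */
static inline bool bus_addr_is_master(uint64_t addr)
{
	return addr == 0;
}
```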

Having different name-spaces has two advantages:

  - Clients can have addresses on different mutually-exclusive
    scopes. This permits sending multicast packets to only clients
    that have addresses on a given prefix.

  - The addressing scheme can be more flexible. The kernel will only
    assign unique addresses on the all-bits-zero prefix (0x0000) and
    allows the bus master process to assign additional addresses to
    clients on other prefixes.  By having different prefixes, the
    kernel and bus master assignments will not collide.

The AF_BUS transport supports two network topologies. When a process
first connects to the bus master, it can only communicate with the bus
master. The process cannot send packets to, or receive packets from,
other peers on the bus. So, from the client process's point of view,
the network topology is point-to-point.

The bus master can allow the connected peer to be part of the bus and
start to communicate with other peers by setting a socket option with
the setsockopt(2) system call using the accepted socket descriptor. At
this point, the topology becomes a bus to the client process.

Packets whose destination address is not assigned to any client are
routed by default to the bus master (the client accepted socket
descriptor).


Semantics
---------

Bus features:

 - Unicast and multicast addressing scheme.
 - Ability to assign addresses from user-space with different prefixes.
 - Automatic address assignment.
 - Ordered packet delivery (FIFO, total ordering).
 - File descriptor and credentials passing.
 - Support for both point-to-point and bus network topologies.
 - Bus access control managed from user-space.
 - Netfilter hooks for packet sending, routing and receiving.

A process (the "bus master") can create an AF_BUS bus with socket(2)
and use bind(2) to assign an address to the bus. Then it can listen(2)
on the created socket to start accepting incoming connections with
accept(2).

Processes can connect to the bus by creating a socket with socket(2)
and using connect(2). The kernel will assign a unique address to each
connection and messages can be sent and received by using BSD socket
primitives.

This uses the connect(2) semantics in a non-traditional way. With
AF_BUS sockets it is not possible to connect "my" socket to a specific
peer socket, whereas in traditional BSD sockets API usage connect(2)
either connects a stream socket to a peer or assigns a peer address to
a datagram socket (so that send(2) can be used instead of sendto(2)).

An AF_BUS socket address is represented as a combination of a bus
address and a bus path name. Addresses are unique within a path. The
unique bus address is further subdivided into a prefix and a client
address. Thus the path identifies a specific bus and the numeric
component the attachment to that bus.

#define BUS_PATH_MAX    108

/* Bus address */
struct bus_addr {
	uint64_t    s_addr; 	/* 16-bit prefix + 48-bit client address */
};

/* Structure describing an AF_BUS socket address. */
struct sockaddr_bus {
	sa_family_t     sbus_family; 	   	  /* AF_BUS */
	struct bus_addr sbus_addr;                /* bus address */
	char 		sbus_path[BUS_PATH_MAX];  /* pathname */
};

A process becomes a bus master for a given struct sockaddr_bus by
calling bind(2) on an AF_BUS address. The argument must be { AF_BUS,
0, path }.

AF_BUS supports both abstract and non-abstract path names. Abstract
names are distinguished by the fact that sbus_path[0] == '\0' and they
don't represent file system paths while non-abstract paths are bound
to a file system path name. (See the unix(7) man page for a discussion
of abstract socket addresses in the AF_UNIX address family.)

Then the process calls listen(2) to accept incoming connections. If
that process calls getsockname(2), the returned address will be {
AF_BUS, 0, path }.
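
The bus master setup described above can be sketched as follows. This
is a minimal sketch under stated assumptions: standard headers do not
define AF_BUS, so the address family number is passed in as a
parameter, the structures are repeated from this document so that the
fragment is self-contained, and the helper names bus_master_sockaddr()
and bus_master_listen() are hypothetical. Only non-abstract path names
are handled.

```c
#include <errno.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

#define BUS_PATH_MAX    108

struct bus_addr {
	uint64_t    s_addr;	/* 16-bit prefix + 48-bit client address */
};

struct sockaddr_bus {
	sa_family_t     sbus_family;
	struct bus_addr sbus_addr;
	char            sbus_path[BUS_PATH_MAX];
};

/*
 * Fill in the { af_bus, 0, path } address used by the bus master.
 * Returns 0 on success or -ENAMETOOLONG if the path does not fit.
 */
static inline int bus_master_sockaddr(struct sockaddr_bus *sa, int af_bus,
				      const char *path)
{
	size_t len = strlen(path);

	if (len >= BUS_PATH_MAX)
		return -ENAMETOOLONG;

	memset(sa, 0, sizeof(*sa));
	sa->sbus_family = (sa_family_t)af_bus;
	sa->sbus_addr.s_addr = 0;		/* master address is all zero */
	memcpy(sa->sbus_path, path, len + 1);
	return 0;
}

/*
 * socket(2) + bind(2) + listen(2), as described in the text.
 * Returns the listening descriptor, or a negative errno value.
 */
static inline int bus_master_listen(int af_bus, const char *path, int backlog)
{
	struct sockaddr_bus sa;
	int fd, err;

	err = bus_master_sockaddr(&sa, af_bus, path);
	if (err)
		return err;

	fd = socket(af_bus, SOCK_SEQPACKET, 0);
	if (fd < 0)
		return -errno;

	if (bind(fd, (struct sockaddr *)&sa, sizeof(sa)) < 0 ||
	    listen(fd, backlog) < 0) {
		err = -errno;
		close(fd);
		return err;
	}
	return fd;
}
```

On a kernel without AF_BUS support, bus_master_listen() is expected to
fail in socket(2) and return the corresponding negative errno value.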

The conventional string form of the full address is path + ":" +
prefix + "/" + client address. Prefix and client address are
represented in hex.

For example the address:

struct sockaddr_bus addr;
addr.sbus_family = AF_BUS;
strcpy(addr.sbus_path, "/tmp/test");
addr.sbus_addr.s_addr   = 0x0002f00ddeadbeef;

would be represented using the string /tmp/test:0002/f00ddeadbeef.

If the bus address is 0, then both the prefix and client address may
be omitted from the string form.  To connect to a bus as a client it
is sufficient to specify the path, since the listening address always
has sbus_addr == 0. It is not meaningful to specify an sbus_addr other
than 0 on connect(2).
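
The conventional string form can be produced with a small formatting
helper. The sketch below assumes the zero-padded widths seen in the
examples of this document (4 hex digits for the prefix, 12 for the
client address); the function name bus_addr_to_string() is
hypothetical, and abstract path names are not handled.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define BUS_PATH_MAX    108

/*
 * Write path + ":" + prefix + "/" + client address into buf.
 * The address part is omitted when the bus address is 0.
 * Returns 0 on success, -1 if the path is too long for sbus_path
 * or the result does not fit in buf.
 */
static inline int bus_addr_to_string(char *buf, size_t size,
				     const char *path, uint64_t addr)
{
	unsigned int prefix = (unsigned int)(addr >> 48);
	uint64_t client = addr & UINT64_C(0xffffffffffff);
	int n;

	if (strlen(path) >= BUS_PATH_MAX)
		return -1;

	if (addr == 0)
		n = snprintf(buf, size, "%s", path);
	else
		n = snprintf(buf, size, "%s:%04x/%012" PRIx64,
			     path, prefix, client);

	return (n < 0 || (size_t)n >= size) ? -1 : 0;
}
```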

The AF_BUS implementation will automatically assign a unique address
to each client but the bus master can assign additional addresses on a
different prefix by means of the setsockopt(2) system call. For
example:

struct bus_addr addr;
addr.s_addr = 0x0001deadfee1dead;
ret = setsockopt(afd, SOL_BUS, BUS_ADD_ADDR, &addr, sizeof(addr));

where afd is the accepted socket descriptor in the daemon. To show graphically:

	  L          The AF_BUS listening socket  }
       /  |  \                                    }-- listener process
     A1  A2  A3      The AF_BUS accepted sockets  }
      |   |   |
     C1  C2  C3      The AF_BUS connected sockets }-- client processes

So if setsockopt(A1, SOL_BUS, BUS_ADD_ADDR, &addr, sizeof(addr)) is
called, C1 will get the new address.
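
Before calling BUS_ADD_ADDR, a bus master may want to check that the
address it is about to hand out is not one of the reserved ones. The
following is a user-space sketch of the reservation rules given in the
introduction; the function name is hypothetical, and whether the
kernel performs the same checks is not stated in this document.

```c
#include <stdbool.h>
#include <stdint.h>

#define BUS_CLIENT_MASK  UINT64_C(0xffffffffffff)

/*
 * Return true if addr is a non-reserved address that the bus master
 * may assign to a client:
 *  - prefix 0x0000 is reserved for kernel-assigned addresses;
 *  - the all-ones client address of each prefix is its multicast
 *    address.
 */
static inline bool bus_addr_master_assignable(uint64_t addr)
{
	uint16_t prefix = (uint16_t)(addr >> 48);
	uint64_t client = addr & BUS_CLIENT_MASK;

	if (prefix == 0)
		return false;
	if (client == BUS_CLIENT_MASK)
		return false;
	return true;
}
```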

The inverse operation is BUS_DEL_ADDR, which the bus master can use to
remove an AF_BUS address from a client socket:

ret = setsockopt(afd, SOL_BUS, BUS_DEL_ADDR, &addr, sizeof(addr));

Besides assigning additional addresses, the bus master has to allow a
client process to communicate with other peers on the bus using a
setsockopt(2):

ret = setsockopt(afd, SOL_BUS, BUS_JOIN_BUS, NULL, 0);

Clients are not meant to send messages to each other until the master
tells them (in a protocol-specific way) that the BUS_JOIN_BUS
setsockopt(2) call was made.

If a client sends a message to a destination other than the bus
master's all-zero address before joining the bus, an EHOSTUNREACH (No
route to host) error is returned, since the only hosts that exist in
the point-to-point network before the client joins the bus are the
client and the bus master.

EHOSTUNREACH is also returned if a client that has joined a bus tries
to send a packet to a client on another bus. Cross-bus communication
is not permitted.
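
The two EHOSTUNREACH cases can be summarized by a small user-space
model. This is not kernel code; it only restates the rules above, and
the function name and its parameters are hypothetical.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Model of the send-time check from the sender's point of view.
 *
 * joined:   the master has issued BUS_JOIN_BUS for this client
 * same_bus: the destination path names the bus the client is on
 * dest:     destination bus address
 *
 * Returns 0 if the message may be sent, EHOSTUNREACH otherwise.
 */
static inline int bus_send_check(bool joined, bool same_bus, uint64_t dest)
{
	/* Cross-bus communication is never permitted. */
	if (!same_bus)
		return EHOSTUNREACH;

	/* Before joining, only the bus master (address 0) is reachable. */
	if (!joined && dest != 0)
		return EHOSTUNREACH;

	return 0;
}
```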

When a process wants to send a unicast message to a peer, it fills in
a sockaddr structure and performs a socket operation (e.g., sendto(2)):

struct sockaddr_bus addr;
char *msg = "Hello world";

addr.sbus_family      = AF_BUS;
strcpy(addr.sbus_path, "/tmp/test");
addr.sbus_addr.s_addr = 0x0001f00ddeadbeef;

ret = sendto(sockfd, msg, strlen(msg), 0,
	     (struct sockaddr *)&addr, sizeof(addr));

The current implementation requires that the addr.sbus_path component
match the one used to connect(2) to the bus, but in future this
requirement will be removed.

The kernel will first check that the socket is connected and that the
bus path of the socket corresponds to the destination. Then it will
extract the prefix and client address from the bus address using
fixed bitmasks (16 bits for the prefix, 48 bits for the client
address):

prefix 		= (bus address >> 48) & 0xffff
client address 	= bus address & 0xffffffffffff

If the client address is not all bits one, then the message is unicast
and is delivered to the socket with that assigned address
(0x0001f00ddeadbeef).  Otherwise the message is multicast and is
delivered to all the peers with this address prefix (0x0001 in this
case).

So, when a process wants to send a multicast message, it just has to
fill the address structure with the address prefix + 0xffffffffffff:

struct sockaddr_bus addr;
char *msg = "Hello world";

addr.sbus_family      = AF_BUS;
strcpy(addr.sbus_path, "/tmp/test");
addr.sbus_addr.s_addr = 0x0001ffffffffffff;

ret = sendto(sockfd, msg, strlen(msg), 0,
	     (struct sockaddr *)&addr, sizeof(addr));

The kernel will apply the bitwise AND operation, learn that the
client address is 0xffffffffffff and send the message to all the peers
on this prefix (0x0001).
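
The delivery decision can be modelled in user space as shown below.
This is a sketch of the rules described in this document (unicast,
multicast, and default routing of unassigned destinations to the bus
master), not the kernel implementation; all names are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define BUS_CLIENT_MASK  UINT64_C(0xffffffffffff)

enum bus_route {
	BUS_ROUTE_UNICAST,	/* deliver to the socket owning dest */
	BUS_ROUTE_MULTICAST,	/* deliver to all peers on the prefix */
	BUS_ROUTE_MASTER,	/* dest unassigned: goes to the master */
};

/* Classify dest against the addresses currently assigned on the bus. */
static inline enum bus_route bus_classify(const uint64_t *assigned,
					  size_t n, uint64_t dest)
{
	size_t i;

	if ((dest & BUS_CLIENT_MASK) == BUS_CLIENT_MASK)
		return BUS_ROUTE_MULTICAST;

	for (i = 0; i < n; i++)
		if (assigned[i] == dest)
			return BUS_ROUTE_UNICAST;

	/* Unassigned destinations are routed to the bus master. */
	return BUS_ROUTE_MASTER;
}

/* Does a peer holding addr receive a multicast sent to dest? */
static inline bool bus_multicast_matches(uint64_t addr, uint64_t dest)
{
	return (addr >> 48) == (dest >> 48);
}
```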

The number of bytes queued for transmission on a socket is limited by
a maximum send buffer size (sysctl_wmem_max) defined in the kernel,
which can be modified at runtime using the sysctl interface at
/proc/sys/net/core/wmem_max. This parameter is global for all the
socket families in a Linux system.

AF_BUS permits the definition of a per-bus maximum send buffer size
using the BUS_SET_SENDBUF socket option. The bus master can call the
setsockopt(2) system call, passing the listening socket as a
parameter. The option sets a maximum write buffer that will be
imposed on each new socket that connects to the bus:

ret = setsockopt(serverfd, SOL_BUS, BUS_SET_SENDBUF, &sndbuf,
		 sizeof(int));

In the transmission path both Berkeley Packet Filters and Netfilter
hooks are available, so they can be used to filter outgoing packets.


Using this addressing scheme with D-Bus
---------------------------------------

As an example of a use case for AF_BUS, let's analyze how the D-Bus
IPC system can be implemented on top of it.

We define a new D-Bus address type "afbus".

A D-Bus client may connect to an address of the form "afbus:path=X"
where X is a string. This means that it connect()s to { AF_BUS, 0, X }.

For example: afbus:path=/tmp/test connects to { AF_BUS, 0, /tmp/test }.
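
A client library could extract the bus path from such an address along
the following lines. This is a minimal sketch: the function name
afbus_parse_path() is hypothetical, the address is treated as a
comma-separated list of key=value pairs after the "afbus:" transport
name, and escaped characters in values are not decoded.

```c
#include <stddef.h>
#include <string.h>

/*
 * Extract X from "afbus:path=X[,key=value...]" into out.
 * Returns 0 on success, -1 if this is not an afbus address, it has
 * no path key, or the path does not fit in out.
 */
static inline int afbus_parse_path(const char *address, char *out,
				   size_t size)
{
	static const char scheme[] = "afbus:";
	static const char key[] = "path=";
	const char *p, *end;
	size_t len;

	if (strncmp(address, scheme, sizeof(scheme) - 1) != 0)
		return -1;
	p = address + sizeof(scheme) - 1;

	/* Walk the comma-separated key=value list looking for "path=". */
	while (*p) {
		end = strchr(p, ',');
		if (!end)
			end = p + strlen(p);

		if (strncmp(p, key, sizeof(key) - 1) == 0) {
			p += sizeof(key) - 1;
			len = (size_t)(end - p);
			if (len == 0 || len >= size)
				return -1;
			memcpy(out, p, len);
			out[len] = '\0';
			return 0;
		}
		p = *end ? end + 1 : end;
	}
	return -1;
}
```

The resulting path would then be used as the sbus_path of the
{ AF_BUS, 0, path } address passed to connect(2).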

A D-Bus daemon may listen on the address "afbus:", which means that it
binds to { AF_BUS, 0, /tmp/test }. It will advertise an address of the
form "afbus:path=/tmp/test" to clients, for instance via the
--print-address option, or via dbus-launch setting the
DBUS_SESSION_BUS_ADDRESS environment variable.  For instance, "afbus:"
is an appropriate default listening address for the session bus,
resulting in dbus-launch setting the DBUS_SESSION_BUS_ADDRESS
environment variable to something like
"afbus:path=/tmp/test,guid=...".

A D-Bus daemon may listen on the address "afbus:file=/some/file",
which means that it will do as above, then write its path into the
given well-known file.  For instance,
"afbus:file=/run/dbus/system_bus.afbus" is an appropriate listening
address for the system bus. Only processes with suitable privileges to
write to that file can impersonate the system bus.

D-Bus clients wishing to connect to the well-known system bus should
attempt to connect to afbus:file=/run/dbus/system_bus.afbus, falling
back to unix:path=/var/run/dbus/system_bus_socket if that fails. On
Linux systems, the well-known system bus daemon should attempt to
listen on both of those addresses.

The D-Bus daemon will serve as bus master as well since it will be the
process that creates and listens on the AF_BUS socket.

D-Bus clients will use the fixed bus master address (all zero bits) to
send messages to the D-Bus daemon and the clients' unique addresses to
send messages to other D-Bus clients using the bus.

When initially connected, D-Bus clients will only be able to
communicate with the D-Bus daemon and will send authentication
information (AUTH message and SCM_CREDENTIALS ancillary
messages). Since the D-Bus daemon is also the bus master, it can allow
D-Bus clients to join the bus and be able to send and receive D-Bus
messages from other peers.

On connection, the kernel will assign to each client an address in the
prefix 0x0000. If a client attempts to send messages to clients other
than the bus master, this is considered to be an error, and is
prevented by the kernel.

When the D-Bus daemon has authenticated a client and determined that
it is authorized to be on this bus, it uses a setsockopt(2) call to
tell the kernel that this client has permission to send messages. The
D-Bus daemon then tells the client, by sending the Hello() reply, that
it has made the setsockopt(2) call and that the client is now able to
send messages to other peers on the bus.

Well-known names are represented by addresses in the 0x0001, ... prefixes.

Addresses in prefix 0x0000 must be mapped to D-Bus unique names in a
way that can't collide with unique names allocated by the dbus-daemon
for legacy clients.

In order to be consistent with current D-Bus unique naming, the AF_BUS
addresses can be mapped directly to D-Bus unique names (for example,
0000/0000deadbeef to ":0.deadbeef"). Leading zeroes can be suppressed
since the common case should be relatively small numbers (the kernel
allocates client addresses sequentially, and machines could be
rebooted occasionally).
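
The mapping in the example above can be sketched in both directions.
The ":0." form and the lowercase hex digits are taken from that single
example; the function names are hypothetical and a real implementation
may choose a different naming scheme.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Format the unique name for a kernel-assigned (prefix 0x0000)
 * address, suppressing leading zeroes.
 * Returns 0 on success, -1 on a wrong prefix or short buffer.
 */
static inline int afbus_unique_name(char *buf, size_t size, uint64_t addr)
{
	int n;

	if ((addr >> 48) != 0)
		return -1;

	n = snprintf(buf, size, ":0.%" PRIx64, addr);
	return (n < 0 || (size_t)n >= size) ? -1 : 0;
}

/*
 * Inverse mapping: ":0.deadbeef" -> 0x0000deadbeef.
 * Returns 0 on success, -1 if the name is not of the ":0.<hex>" form.
 */
static inline int afbus_unique_name_to_addr(const char *name, uint64_t *addr)
{
	const char *hex;
	size_t len;

	if (strncmp(name, ":0.", 3) != 0)
		return -1;

	hex = name + 3;
	len = strlen(hex);
	/* At most 12 hex digits fit in the 48-bit client address. */
	if (len == 0 || len > 12 ||
	    strspn(hex, "0123456789abcdef") != len)
		return -1;

	*addr = strtoull(hex, NULL, 16);
	return 0;
}
```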

By having both AF_BUS and legacy D-Bus clients use the same address
space, the D-Bus daemon can act as a proxy between clients and can be
sure that D-Bus unique names will be unique for both AF_BUS and legacy
clients.

To act as a proxy between AF_BUS and legacy clients, each time the
D-Bus daemon accepts a legacy connection (i.e., AF_UNIX), it will
create an AF_BUS socket and establish a connection with itself. It
will then associate this newly created connection with the legacy one.

To explain it graphically:

	  L          The AF_BUS listening socket  }
       /  |  \                                    }-- listener process
     A1  A2  A3      The AF_BUS accepted sockets  }
      |   |   |
     C1  C2  C3      The AF_BUS connected sockets, where:
      |                    * C1 belongs to the listener process
      |                    * C2 and C3 belong to the client processes
      |
 L2--A4       The AF_UNIX listening and accepted sockets \
      |                            in the listener process
     C4       The AF_UNIX connected socket in the legacy client process


where C2 and C3 are normal AF_BUS clients and C4 is a legacy
client. The D-Bus daemon after accepting the connection using the
legacy transport (A4), will create an AF_BUS socket pair (C1, A1)
associated with the legacy client.

Legacy clients will send messages to the D-Bus daemon using their
legacy socket and the D-Bus daemon will extract the destination
address, resolve to the corresponding AF_BUS address and use this to
send the message to the right peer.  

Conversely, when an AF_BUS client sends a D-Bus message to a legacy
client, it will use the AF_BUS address of the connection associated
with that client. The D-Bus daemon will receive the message, modify
the message's content to set SENDER headers based on the AF_BUS source
address and use the legacy transport to send the D-Bus message to the
legacy client.

As a special case, the bus daemon's all-zeroes address maps to
"org.freedesktop.DBus" and vice versa.

When a D-Bus client receives an AF_BUS message from the bus master
(0/0), it must use the SENDER header field in the D-Bus message, as
for any other D-Bus transport, to determine whether the message is
actually from the D-Bus daemon (the SENDER is "org.freedesktop.DBus"
or missing), or from another client (the SENDER starts with ":"). It
is valid for messages from another AF_BUS client to be received via
the D-Bus daemon; if they are, the SENDER header field will always be
set.

Besides their unique names, D-Bus services can have well-known names
such as org.gnome.Keyring or org.freedesktop.Telepathy. These well-known
names can also be used as a D-Bus message destination
address. Well-known names are not numeric and AF_BUS is not able to
parse D-Bus messages.

To solve this, the D-Bus daemon will assign an additional AF_BUS
address to each D-Bus client that owns a well-known name. The mapping
between well-known names and AF_BUS addresses is maintained by the
D-Bus daemon in a persistent data structure.

D-Bus client libraries will maintain a cache of these mappings so they
can send messages to services with well-known names using their mapped
AF_BUS address.

If a client intending to send a D-Bus message to a given well-known
name does not have that well-known name in its cache, it must send the
AF_BUS message to the listener (0000/000000000000) instead. 

The listener must forward the D-Bus message to the owner of that
well-known name, setting the SENDER header field if necessary. It may
also send this AF_BUS-specific D-Bus signal to the sender, so that the
sender can update its cache:

     org.freedesktop.DBus.AF_BUS.Forwarded (STRING well_known_name,
	 UINT64 af_bus_client)

	 Emitted by the D-Bus daemon with sender "org.freedesktop.DBus"
	 and object path "/org/freedesktop/DBus" to indicate that
	 the well-known name well_known_name is represented by the
	 AF_BUS address { AF_BUS, af_bus_client, path } where
	 path is the path name used by this bus.

	 For instance, if the well-known name "org.gnome.Keyring"
	 is represented by AF_BUS address 0001/0000deadbeef,
	 the signal would have arguments ("org.gnome.Keyring",
	 0x00010000deadbeef), corresponding to the AF_BUS
	 address { AF_BUS, 0x00010000deadbeef, /tmp/test }.

If the D-Bus service for that well-known name is not active, then the
D-Bus daemon will first do the service activation, assign an
additional address to the recently activated service, store the
mapping from well-known name to numeric address in its persistent
cache, and then send the AF_BUS.Forwarded signal back to the client.

Once the mapping has been made, the AF_BUS address associated with a
well-known name cannot be reused for the lifetime of the D-Bus daemon
(which is the same as the lifetime of the socket). 

Nevertheless the AF_BUS address associated with a well-known name can
change, for example if a service goes away and a new instance gets
activated. This new instance can have a different AF_BUS address.  The
D-Bus daemon will maintain a list of the mappings that are currently
valid so it can send the AF_BUS.Forwarded signal with the mapping
information to the clients. Client libraries will maintain a
fixed-size Least Recently Used (LRU) cache with previous mappings sent
by the D-Bus daemon.

If a client overwrites a mapping due to the LRU replacement policy and
later wants to send a D-Bus message to the overwritten well-known
name, it will send the D-Bus message to the D-Bus daemon, which will
reply with the signal carrying the mapping information.
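
A client-side cache of this kind could look like the sketch below. It
is only an illustration of a fixed-size LRU cache keyed by well-known
name: the slot count, the maximum name length and all identifiers are
hypothetical choices, and a linear scan is used for simplicity.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define NAME_CACHE_SLOTS     4	/* illustrative size */
#define NAME_CACHE_NAME_MAX 64	/* illustrative limit, including NUL */

struct name_cache_entry {
	char     name[NAME_CACHE_NAME_MAX];
	uint64_t addr;		/* AF_BUS address from AF_BUS.Forwarded */
	uint64_t last_used;	/* 0 means the slot is empty */
};

struct name_cache {
	struct name_cache_entry slot[NAME_CACHE_SLOTS];
	uint64_t clock;		/* incremented on every use */
};

static inline void name_cache_init(struct name_cache *c)
{
	memset(c, 0, sizeof(*c));
}

/*
 * Look up a well-known name. On a hit, store its address in *addr and
 * mark the entry as most recently used. On a miss the caller would
 * send the message to the bus master (address 0) instead.
 */
static inline bool name_cache_lookup(struct name_cache *c, const char *name,
				     uint64_t *addr)
{
	for (size_t i = 0; i < NAME_CACHE_SLOTS; i++) {
		struct name_cache_entry *e = &c->slot[i];

		if (e->last_used && strcmp(e->name, name) == 0) {
			e->last_used = ++c->clock;
			*addr = e->addr;
			return true;
		}
	}
	return false;
}

/*
 * Record a mapping, reusing an existing entry for the same name, an
 * empty slot, or else the least recently used slot.
 * Returns false if the name is too long to cache.
 */
static inline bool name_cache_store(struct name_cache *c, const char *name,
				    uint64_t addr)
{
	struct name_cache_entry *victim = &c->slot[0];
	size_t len = strlen(name);

	if (len >= NAME_CACHE_NAME_MAX)
		return false;

	for (size_t i = 0; i < NAME_CACHE_SLOTS; i++) {
		struct name_cache_entry *e = &c->slot[i];

		if (e->last_used && strcmp(e->name, name) == 0) {
			victim = e;	/* refresh an existing mapping */
			break;
		}
		if (e->last_used < victim->last_used)
			victim = e;	/* empty or older slot */
	}

	memcpy(victim->name, name, len + 1);
	victim->addr = addr;
	victim->last_used = ++c->clock;
	return true;
}
```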

If a service goes away or if the service's AF_BUS address changes and
the client still has the old AF_BUS address in its cache, it will send
the D-Bus message to the old destination.

Since packets whose destination AF_BUS addresses are not assigned to
any process are routed by default to the bus master, the D-Bus daemon
will receive these D-Bus messages and send an AF_BUS.Forwarded signal
back to the client with the new AF_BUS address so it can update its
cache with the new mapping.

For well-known names, the D-Bus daemon will use a different address
prefix (0x0001) so it doesn't conflict with the D-Bus unique names
address prefix (0x0000).

Besides D-Bus method call messages, which are unicast, D-Bus allows
clients to send multicast messages (D-Bus signals). Clients can send
signal messages using the multicast address of the bus unique name
prefix (0x0000ffffffffffff).

A netfilter hook is used to filter these multicast messages and only
deliver to the correct peers based on match rules.


D-Bus aware netfilter module
----------------------------

AF_BUS is designed to be a generic bus transport supporting both
unicast and multicast communications.

In order for D-Bus to operate efficiently, the transport method has to
know the D-Bus message wire-protocol and D-Bus message structure. But
adding this D-Bus specific knowledge to AF_BUS will break one of the
fundamental design principles of any network protocol stack, namely
layer-independence: layer n must not make any assumptions about the
payload in layer n + 1.

So, in order to have a clean protocol design but be able to allow the
transport to analyze the D-Bus messages, netfilter hooks are used to
do the filtering based on match rules.

The kernel module has to maintain the match rules and the D-Bus daemon
is responsible for managing this information. Every time an add match
rule message is processed by the D-Bus daemon, the daemon updates the
netfilter module's match rule set so the netfilter hook function can
use that information to do the match-rule-based filtering.

The D-Bus daemon and the netfilter module will use the generic netlink
subsystem for communication between kernel and user space. Netlink is
already used by most of the networking subsystems in Linux
(iptables/netfilter, ip/routing, etc).

We enforce a security scheme so only the bus master's user ID can
update the netfilter module's match rule set.

The advantage of using the netfilter subsystem is that we decouple the
mechanism from the policy. AF_BUS will only add a set of hook points
and external modules will be used to enforce a given policy.