<?xml version="1.0" encoding="utf-8" ?>
<!DOCTYPE erlref SYSTEM "erlref.dtd">
<!-- %ExternalCopyright% -->
<erlref>
<header>
<copyright>
<year>2021</year><year>2021</year>
<holder>Maxim Fedorov, WhatsApp Inc.</holder>
</copyright>
<legalnotice>
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
</legalnotice>
<title>peer</title>
<prepared>maximfca@gmail.com</prepared>
<docno></docno>
<date></date>
<rev></rev>
<file>peer.xml</file>
</header>
<module since="OTP @OTP-17720@">peer</module>
<modulesummary>Start and control linked Erlang nodes.
</modulesummary>
<description>
<p>
This module provides functions for starting linked Erlang nodes.
The node spawning new nodes is called <em>origin</em>, and newly started
nodes are <em>peer</em> nodes, or peers. A peer node automatically
terminates when it loses the <em>control connection</em> to the origin. This
connection can be an Erlang distribution connection or an alternative
connection over TCP or standard I/O. The alternative connection provides a way
to execute remote procedure calls even when Erlang distribution is not
available, which makes it possible to test the distribution itself.
</p>
<p>
Peer node terminal input/output is relayed through the origin.
If a standard I/O alternative connection is requested, console output
also goes via the origin, allowing debugging of node startup and boot
script execution (see <seecom marker="erts:erl#init_debug">
<c>-init_debug</c></seecom>). File I/O is not redirected, contrary to
<seeerl marker="slave"><c>slave(3)</c></seeerl> behaviour.
</p>
<p>
The peer node can start on the same or a different host (via <c>ssh</c>)
or in a separate container (for example Docker).
When the peer starts on the same host as the origin, it inherits
the current directory and environment variables from the origin.
</p>
<note>
<p>
This module is designed to facilitate multi-node testing with Common Test.
Use the <c>?CT_PEER()</c> macro to start a linked peer node according to
Common Test conventions (crash dumps written to a specific location, and the
node name prefixed with the module name, calling function, and origin OS process
ID). Use <seemfa marker="#random_name/1"><c>random_name/1</c></seemfa> to
create sufficiently unique node names if you need more control.
</p>
<p>
A peer node started without alternative connection behaves similarly
to <seeerl marker="slave"><c>slave(3)</c></seeerl>. When an alternative
connection is requested, the behaviour is similar to
<c>test_server:start_node(Name, peer, Args).</c>
</p> </note>
</description>
<section>
<title>Example</title>
<p>
The following example implements a test suite starting extra Erlang nodes.
It employs a number of techniques to speed up testing and reliably shut
down peer nodes:
</p>
<list>
<item>Peers start linked to the test runner process. If the test case fails,
the peer node is stopped automatically, leaving no rogue nodes running in
the background.</item>
<item>Arguments used to start the peer are saved in the control process
state for manual analysis. If the test case fails, the CRASH REPORT contains
these arguments.</item>
<item>Multiple test cases can run concurrently, speeding up the overall testing
process. Peer node names are unique even when multiple instances
of the same test suite run in parallel.</item>
</list>
<code type="erl">
-module(my_SUITE).
-behaviour(ct_suite).
-export([all/0, groups/0]).
-export([basic/1, args/1, named/1, restart_node/1, multi_node/1]).

-include_lib("common_test/include/ct.hrl").

groups() ->
    [{quick, [parallel],
        [basic, args, named, restart_node, multi_node]}].

all() ->
    [{group, quick}].

basic(Config) when is_list(Config) ->
    {ok, Peer, _Node} = ?CT_PEER(),
    peer:stop(Peer).

args(Config) when is_list(Config) ->
    %% specify additional arguments to the new node
    {ok, Peer, _Node} = ?CT_PEER(["-emu_flavor", "smp"]),
    peer:stop(Peer).

named(Config) when is_list(Config) ->
    %% pass test case name down to function starting nodes
    Peer = start_node_impl(named_test),
    peer:stop(Peer).

start_node_impl(ActualTestCase) ->
    {ok, Peer, Node} = ?CT_PEER(#{name => ?CT_PEER_NAME(ActualTestCase)}),
    %% extra setup needed for multiple test cases
    ok = rpc:call(Node, application, set_env, [kernel, key, value]),
    Peer.

restart_node(Config) when is_list(Config) ->
    Name = ?CT_PEER_NAME(),
    {ok, Peer, Node} = ?CT_PEER(#{name => Name}),
    peer:stop(Peer),
    %% restart the node with the same name as before
    {ok, Peer2, Node} = ?CT_PEER(#{name => Name, args => ["+fnl"]}),
    peer:stop(Peer2).
</code>
<p>
The next example demonstrates how to start multiple nodes concurrently:
</p>
<code type="erl">
multi_node(Config) when is_list(Config) ->
    Peers = [?CT_PEER(#{wait_boot => {self(), tag}})
        || _ &lt;- lists:seq(1, 4)],
    %% wait for all nodes to complete boot process, get their names:
    _Nodes = [receive {tag, {started, Node, Peer}} -> Node end
        || {ok, Peer} &lt;- Peers],
    [peer:stop(Peer) || {ok, Peer} &lt;- Peers].
</code>
<p>
The following snippet starts a peer on a different host. It requires
key-based <c>ssh</c> authentication to be set up, so that connecting to
"another_host" does not prompt for a password.
</p>
<code type="erl">
Ssh = os:find_executable("ssh"),
peer:start_link(#{exec => {Ssh, ["another_host", "erl"]},
    connection => standard_io}).
</code>
<p>
The following Common Test case demonstrates Docker integration, starting two
containers with hostnames "one" and "two". In this example Erlang nodes
running inside containers form an Erlang cluster.
</p>
<code type="erl">
docker(Config) when is_list(Config) ->
    Docker = os:find_executable("docker"),
    PrivDir = proplists:get_value(priv_dir, Config),
    build_release(PrivDir),
    build_image(PrivDir),

    %% start two Docker containers
    {ok, Peer, Node} = peer:start_link(#{name => lambda,
        connection => standard_io,
        exec => {Docker, ["run", "-h", "one", "-i", "lambda"]}}),
    {ok, Peer2, Node2} = peer:start_link(#{name => lambda,
        connection => standard_io,
        exec => {Docker, ["run", "-h", "two", "-i", "lambda"]}}),

    %% find IP address of the second node using alternative connection RPC
    {ok, Ips} = peer:call(Peer2, inet, getifaddrs, []),
    {"eth0", Eth0} = lists:keyfind("eth0", 1, Ips),
    {addr, Ip} = lists:keyfind(addr, 1, Eth0),

    %% make the first node discover the second one
    ok = peer:call(Peer, inet_db, set_lookup, [[file]]),
    ok = peer:call(Peer, inet_db, add_host, [Ip, ["two"]]),

    %% join a cluster
    true = peer:call(Peer, net_kernel, connect_node, [Node2]),
    %% verify that second peer node has only the first node visible
    [Node] = peer:call(Peer2, erlang, nodes, []),

    %% stop peers, causing containers to also stop
    peer:stop(Peer2),
    peer:stop(Peer).

build_release(Dir) ->
    %% load sasl.app file, otherwise application:get_key will fail
    application:load(sasl),
    %% create *.rel - release file
    RelFile = filename:join(Dir, "lambda.rel"),
    Release = {release, {"lambda", "1.0.0"},
        {erts, erlang:system_info(version)},
        [{App, begin {ok, Vsn} = application:get_key(App, vsn), Vsn end}
            || App &lt;- [kernel, stdlib, sasl]]},
    ok = file:write_file(RelFile, list_to_binary(lists:flatten(
        io_lib:format("~tp.", [Release])))),
    RelFileNoExt = filename:join(Dir, "lambda"),

    %% create boot script
    {ok, systools_make, []} = systools:make_script(RelFileNoExt,
        [silent, {outdir, Dir}]),
    %% package release into *.tar.gz
    ok = systools:make_tar(RelFileNoExt, [{erts, code:root_dir()}]).

build_image(Dir) ->
    %% Create a Dockerfile example, working only for Ubuntu 20.04.
    %% Expose port 4445, make Erlang distribution listen
    %% on this port, and connect to it without EPMD.
    %% Set the cookie on both nodes to be the same.
    BuildScript = filename:join(Dir, "Dockerfile"),
    Dockerfile =
        "FROM ubuntu:20.04 as runner\n"
        "EXPOSE 4445\n"
        "WORKDIR /opt/lambda\n"
        "COPY lambda.tar.gz /tmp\n"
        "RUN tar -zxvf /tmp/lambda.tar.gz -C /opt/lambda\n"
        "ENTRYPOINT [\"/opt/lambda/erts-" ++ erlang:system_info(version) ++
        "/bin/dyn_erl\", \"-boot\", \"/opt/lambda/releases/1.0.0/start\","
        " \"-kernel\", \"inet_dist_listen_min\", \"4445\","
        " \"-erl_epmd_port\", \"4445\","
        " \"-setcookie\", \"secret\"]\n",
    ok = file:write_file(BuildScript, Dockerfile),
    os:cmd("docker build -t lambda " ++ Dir).
</code>
</section>
<datatypes>
<datatype>
<name name="server_ref"/>
<desc>
<p>
Identifies the controlling process of a peer node.
</p>
</desc>
</datatype>
<datatype>
<name name="start_options"/>
<desc>
<p>
Options that can be used when starting
a <c>peer</c> node through <seemfa marker="#start/1"><c>start/1</c></seemfa>
and <seemfa marker="#start_link/0"><c>start_link/0,1</c></seemfa>.
</p>
<taglist>
<tag><c>name</c></tag>
<item>
<p>
Node name (the part before "@"). When <c>name</c> is not specified, but <c>host</c>
is, <c>peer</c> follows compatibility behaviour and uses the origin node name.
</p>
</item>
<tag><c>host</c></tag>
<item>
<p>
Enforces a specific host name. Can be used to override the default
behaviour and start "node@localhost" instead of "node@realhostname".
</p>
</item>
<tag><c>longnames</c></tag>
<item>
<p>
Use long names to start a node. The default is taken from the origin
using <c>net_kernel:longnames()</c>. If the origin is not distributed,
short names are used by default.
</p>
</item>
<tag><c>peer_down</c></tag>
<item>
<p>
Defines the peer control process behaviour when the control connection is
closed from the peer node side (for example when the peer crashes or dumps core).
When set to <c>stop</c> (default), a lost control connection causes
the control process to exit normally. Setting <c>peer_down</c> to <c>continue</c>
keeps the control process running, and <c>crash</c> will cause
the controlling process to exit abnormally.
</p>
</item>
<tag><c>exec</c></tag>
<item>
<p>
Alternative mechanism to start peer nodes with, for example, ssh instead of the
default bash.
</p>
</item>
<tag><c>connection</c></tag>
<item>
<p>Alternative connection specification. See the
<seetype marker="#connection"><c>connection</c> datatype</seetype>.</p>
</item>
<tag><c>args</c></tag>
<item>
<p>Extra command line arguments to append to the "erl" command. Arguments are
passed as is, no escaping or quoting is needed or accepted.</p>
</item>
<tag><c>env</c></tag>
<item>
<p>
List of environment variables with their values. This list is applied
to a locally started executable. If you need to change the environment of
the remote peer, adjust <c>args</c> to contain
<c>-env ENV_KEY ENV_VALUE</c>.
</p>
</item>
<tag><c>wait_boot</c></tag>
<item>
<p>Specifies the start/start_link timeout.
See <seetype marker="#wait_boot"><c>wait_boot</c> datatype</seetype>.
</p>
</item>
<tag><c>shutdown</c></tag>
<item>
<p>Specifies the peer node stopping behaviour. See
<seemfa marker="#stop/1"><c>stop()</c></seemfa>.</p>
</item>
</taglist>
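<p>
As an illustration, the options described above can be combined in a single
map. This is a minimal sketch, not a recommended configuration: the module
name <c>peer_opts_sketch</c>, the function <c>opts/1</c>, the
<c>SKETCH_MODE</c> variable, and every concrete value are assumptions made
for this example.
</p>
<code type="erl">
-module(peer_opts_sketch).
-export([opts/1]).

%% Build a start_options() map for a peer that adds CodeDir to its code path.
%% All concrete values below are illustrative assumptions.
opts(CodeDir) ->
    #{name => peer:random_name(opts_sketch), %% the part before "@"
      host => "localhost",                   %% start "node@localhost"
      longnames => false,                    %% use short names
      peer_down => crash,                    %% exit abnormally if the peer goes away
      connection => 0,                       %% alternative TCP connection, any free port
      args => ["-pa", CodeDir],              %% appended to the "erl" command as is
      env => [{"SKETCH_MODE", "test"}],      %% applied to the locally started executable
      wait_boot => 10000,                    %% start/start_link timeout, milliseconds
      shutdown => {halt, 5000}}.             %% how peer:stop/1 stops the node
</code>
<p>
The resulting map can be passed to <c>peer:start/1</c> or
<c>peer:start_link/1</c>, for example
<c>peer:start_link(peer_opts_sketch:opts("/tmp/ebin"))</c>.
</p>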
</desc>
</datatype>
<datatype>
<name name="peer_state"/>
<desc><p>Peer node state.</p></desc>
</datatype>
<datatype>
<name name="connection"/>
<desc><p>Alternative connection between the origin and the peer. When the
connection closes, the peer node terminates automatically. If
the <c>peer_down</c> startup flag is set to <c>crash</c>, the controlling
process on the origin node exits with corresponding reason, effectively
providing a two-way link. </p>
<p>When <c>connection</c> is set to a port number, the origin starts listening on
the requested TCP port, and the peer node connects to the port. When it is set to
an <c>{IP, Port}</c> tuple, the origin listens only on the specified IP. The port
number can be set to 0 for automatic selection.
</p>
<p>Using the <c>standard_io</c> alternative connection starts the peer attached to
the origin (other connections use <c>-detached</c> flag to erl). In this mode
peer and origin communicate via stdin/stdout.
</p>
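<p>
The three forms of the <c>connection</c> value can be sketched as follows.
The module <c>peer_conn_sketch</c>, its function names, and the atoms
<c>any_port</c>, <c>loopback</c> and <c>stdio</c> are hypothetical names
chosen for this example only.
</p>
<code type="erl">
-module(peer_conn_sketch).
-export([connection/1, with_peer/2]).

%% Map a hypothetical "kind" atom to a connection() value.
connection(any_port) -> 0;                 %% TCP, port selected automatically
connection(loopback) -> {{127,0,0,1}, 0};  %% TCP, listen on the loopback address only
connection(stdio)    -> standard_io.       %% communicate via stdin/stdout

%% Start a linked peer with the chosen alternative connection,
%% run Fun(Peer), and stop the peer even if Fun fails.
with_peer(Kind, Fun) ->
    {ok, Peer, _Node} =
        peer:start_link(#{name => peer:random_name(conn_sketch),
                          connection => connection(Kind)}),
    try
        Fun(Peer)
    after
        peer:stop(Peer)
    end.
</code>
<p>
For example, <c>peer_conn_sketch:with_peer(stdio, fun(P) -> peer:call(P, erlang, node, []) end)</c>
is expected to return the peer node name as seen by the peer itself.
</p>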
</desc>
</datatype>
<datatype>
<name name="exec"/>
<desc>
<p>
Overrides executable to start peer nodes with. By default it is
the path to "erl", taken from <c>init:get_argument(progname)</c>.
If <c>progname</c> is not known, <c>peer</c> makes best guess given the current
ERTS version.
</p>
<p>
When a tuple is passed, the first element is the path to executable,
and the second element is prepended to the final command line. This can be used
to start peers on a remote host or in a Docker container. See the examples
above.
</p>
<p>
This option is useful for testing backwards compatibility with previous releases,
installed at specific paths, or when the Erlang installation location
is missing from the <c>PATH</c>.
</p>
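<p>
Both forms of <c>exec</c> can be sketched as below. This is an illustration
under stated assumptions: the module <c>peer_exec_sketch</c> and its
functions are hypothetical, and the executables are looked up in the
<c>PATH</c> only to keep the example self-contained.
</p>
<code type="erl">
-module(peer_exec_sketch).
-export([local_exec/0, ssh_exec/1]).

%% Plain path form: use an explicitly located "erl" executable.
local_exec() ->
    case os:find_executable("erl") of
        false -> {error, erl_not_found};
        Erl -> {ok, #{exec => Erl, connection => standard_io}}
    end.

%% Tuple form: the first element is the executable to run, the second
%% element holds the arguments placed before the rest of the command line.
ssh_exec(Host) ->
    case os:find_executable("ssh") of
        false -> {error, ssh_not_found};
        Ssh -> {ok, #{exec => {Ssh, [Host, "erl"]},
                      connection => standard_io}}
    end.
</code>
<p>
On success, the returned map can be passed to <c>peer:start_link/1</c>.
</p>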
</desc>
</datatype>
<datatype>
<name name="wait_boot"/>
<desc><p>Specifies start/start_link timeout in milliseconds. Can be set to
<c>false</c>, allowing the peer to start asynchronously. If <c>{Pid, Tag}</c>
is specified instead of a timeout, the peer will send <c>Tag</c> to the
requested process.</p></desc>
</datatype>
<datatype>
<name name="disconnect_timeout"/>
<desc><p>Disconnect timeout. See
<seemfa marker="#stop/1"><c>stop()</c></seemfa>.</p></desc>
</datatype>
</datatypes>
<funcs>
<func>
<name name="call" arity="4" since="OTP @OTP-17720@"/>
<name name="call" arity="5" since="OTP @OTP-17720@"/>
<fsummary>Evaluates a function call on a peer node.</fsummary>
<desc>
<p>
Uses the alternative connection to
evaluate <c>apply(<anno>Module</anno>, <anno>Function</anno>,
<anno>Args</anno>)</c> on the peer node and returns
the corresponding value <c><anno>Result</anno></c>.
<c><anno>Timeout</anno></c> is an integer representing
the timeout in milliseconds or the atom <c>infinity</c>
which prevents the operation from ever timing out.
</p>
<p>
When no alternative connection was requested, this
function exits with the <c>noconnection</c>
reason. Use the <seeerl marker="kernel:erpc"><c>erpc</c></seeerl> module
to communicate over Erlang distribution.
</p>
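<p>
A minimal sketch of a call over the alternative connection is shown below.
The module <c>peer_call_sketch</c>, its functions, and the 5000 millisecond
timeout are assumptions made for this example.
</p>
<code type="erl">
-module(peer_call_sketch).
-export([start_opts/0, remote_sum/1]).

%% Options for a peer controlled over standard I/O, so that peer:call/5
%% can be used even when the origin node is not distributed.
start_opts() ->
    #{name => peer:random_name(call_sketch), connection => standard_io}.

%% Evaluate lists:sum(Numbers) on a freshly started peer, then stop the peer.
remote_sum(Numbers) ->
    {ok, Peer, _Node} = peer:start_link(start_opts()),
    try
        peer:call(Peer, lists, sum, [Numbers], 5000)
    after
        peer:stop(Peer)
    end.
</code>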
</desc>
</func>
<func>
<name name="cast" arity="4" since="OTP @OTP-17720@"/>
<fsummary>Evaluates a function call on a peer node ignoring the result.</fsummary>
<desc>
<p>
Uses the alternative connection to
evaluate <c>apply(<anno>Module</anno>, <anno>Function</anno>,
<anno>Args</anno>)</c> on the peer node. No response is delivered to the
calling process.
</p>
<p>
<c>peer:cast/4</c> fails silently when the alternative connection is not
configured. Use the <seeerl marker="kernel:erpc"><c>erpc</c></seeerl> module
to communicate over Erlang distribution.
</p>
</desc>
</func>
<func>
<name name="send" arity="3" since="OTP @OTP-17720@"/>
<fsummary>Sends a message to a process on the peer node.</fsummary>
<desc>
<p>
Uses the alternative connection to send <anno>Message</anno> to a process on
the peer node. Silently fails if no alternative connection is configured.
The process can be referenced by process ID or registered name.
</p>
</desc>
</func>
<func>
<name name="get_state" arity="1" since="OTP @OTP-17720@"/>
<fsummary>Returns peer node state.</fsummary>
<desc>
<p>Returns the peer node state. The initial state is <c>booting</c>; the node stays in that
state until the boot script is complete, and then the node progresses to <c>running</c>.
If the node stops (gracefully or not), the state changes to <c>down</c>.
</p>
</desc>
</func>
<func>
<name name="random_name" arity="0" since="OTP @OTP-17720@"/>
<fsummary>Creates a sufficiently unique node name.</fsummary>
<desc>
<p>
The same as <seemfa marker="#random_name/1"><c>random_name(peer)</c></seemfa>.
</p>
</desc>
</func>
<func>
<name name="random_name" arity="1" since="OTP @OTP-17720@"/>
<fsummary>Creates a sufficiently unique node name given a prefix.</fsummary>
<desc>
<p>
Creates a sufficiently unique node name for the current host,
combining a prefix, a unique number, and the current OS process ID.
</p>
<note>
<p>
Use the <c>?CT_PEER(["erl_arg1"])</c> macro provided by Common Test
<c>-include_lib("common_test/include/ct.hrl")</c> for convenience.
It starts a new peer using Erlang distribution as the control channel,
supplies the calling module's code path to the peer, and uses the calling
function name for the name prefix.
</p>
</note>
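<p>
Outside Common Test, names can be generated directly. The sketch below
assumes a hypothetical module <c>peer_name_sketch</c> and relies only on
each call to <c>random_name/1</c> returning a new name.
</p>
<code type="erl">
-module(peer_name_sketch).
-export([names/2]).

%% Generate Count node names (the part before "@") sharing one prefix.
names(Prefix, Count) ->
    lists:map(fun(_) -> peer:random_name(Prefix) end, lists:seq(1, Count)).
</code>
<p>
Each generated name can be used as the <c>name</c> start option, for
example <c>#{name => Name, connection => standard_io}</c>.
</p>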
</desc>
</func>
<func>
<name name="start" arity="1" since="OTP @OTP-17720@"/>
<fsummary>Starts a peer node.</fsummary>
<desc>
<p>
Starts a peer node with the specified
<seetype marker="#start_options"><c>start_options()</c></seetype>.
Returns the controlling process and the full peer node name, unless
<c>wait_boot</c> is not requested and the host name is not known in advance.
</p>
</desc>
</func>
<func>
<name name="start_link" arity="0" since="OTP @OTP-17720@"/>
<fsummary>Starts a peer node, and links controlling process to caller process.</fsummary>
<desc>
<p>
The same as
<seemfa marker="#start_link/1"><c>start_link(#{name => random_name()})</c></seemfa>.
</p>
</desc>
</func>
<func>
<name name="start_link" arity="1" since="OTP @OTP-17720@"/>
<fsummary>Starts a peer node, and links controlling process to caller process.</fsummary>
<desc>
<p>Starts a peer node in the same way as <seemfa marker="#start/1"><c>start/1</c></seemfa>,
except that the peer node is linked to the currently
executing process. If that process terminates, the peer node
also terminates.</p>
<p>
Accepts <seetype marker="#start_options"><c>start_options()</c></seetype>.
Returns the controlling process and the full peer node name, unless <c>wait_boot</c> is not
requested and host name is not known in advance.
</p>
<p>
When the <c>standard_io</c> alternative connection is requested, and <c>wait_boot</c> is
not set to <c>false</c>, a failed peer boot sequence causes the caller to exit with
the <c>{boot_failed, {exit_status, ExitCode}}</c> reason.
</p>
</desc>
</func>
<func>
<name name="stop" arity="1" since="OTP @OTP-17720@"/>
<fsummary>Stops the controlling process and terminates the peer node.</fsummary>
<type name="disconnect_timeout"/>
<desc>
<p>
Stops a peer node. How the node is stopped depends on the
<seetype marker="#start_options"><c>shutdown</c></seetype>
option passed when starting the peer node. Currently the
following <c>shutdown</c> options are supported:
</p>
<taglist>
<tag><c>halt</c></tag>
<item><p>
This is the default shutdown behavior. It behaves as <c>shutdown</c>
option <c>{halt, DefaultTimeout}</c> where <c>DefaultTimeout</c>
currently equals <c>5000</c>.
</p></item>
<tag><c>{halt, Timeout :: disconnect_timeout()}</c></tag>
<item><p>
Triggers a call to
<seemfa marker="erts:erlang#halt/0"><c>erlang:halt()</c></seemfa>
on the peer node and then waits for the Erlang distribution
connection to the peer node to be taken down. If this connection
has not been taken down after <c>Timeout</c> milliseconds, it will
forcefully be taken down by <c>peer:stop/1</c>. See the
<seeerl marker="#dist_connection_close">warning</seeerl> below for
more info about this.
</p></item>
<tag><c>Timeout :: disconnect_timeout()</c></tag>
<item><p>
Triggers a call to
<seemfa marker="erts:init#stop/0"><c>init:stop()</c></seemfa>
on the peer node and then waits for the Erlang distribution
connection to the peer node to be taken down. If this connection
has not been taken down after <c>Timeout</c> milliseconds, it will
forcefully be taken down by <c>peer:stop/1</c>. See the
<seeerl marker="#dist_connection_close">warning</seeerl> below for
more info about this.
</p></item>
<tag><c>close</c></tag>
<item><p>
Close the <i>control connection</i> to the peer node and
return. This is the fastest way for the caller of
<c>peer:stop/1</c> to stop a peer node.
</p>
<p>
Note that if the Erlang distribution connection is not used as
control connection it might not have been taken down when
<c>peer:stop/1</c> returns. Also note that the
<seeerl marker="#dist_connection_close">warning</seeerl> below
applies when the Erlang distribution connection is used as control
connection.
</p>
</item>
</taglist>
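<p>
The <c>shutdown</c> values listed above can be sketched as follows. The
module <c>peer_stop_sketch</c>, the atoms naming each choice, and the
timeout values are assumptions made for this example.
</p>
<code type="erl">
-module(peer_stop_sketch).
-export([shutdown/1, start_and_stop/1]).

%% Map a hypothetical "kind" atom to a shutdown option value.
shutdown(fast)         -> close;         %% only close the control connection
shutdown(default)      -> halt;          %% erlang:halt() with the default timeout
shutdown(bounded_halt) -> {halt, 2000};  %% erlang:halt(), wait up to 2000 ms
shutdown(graceful)     -> 10000.         %% init:stop(), wait up to 10000 ms

%% Start a linked peer with the chosen shutdown option and stop it right away.
start_and_stop(Kind) ->
    {ok, Peer, _Node} =
        peer:start_link(#{name => peer:random_name(stop_sketch),
                          connection => standard_io,
                          shutdown => shutdown(Kind)}),
    peer:stop(Peer).
</code>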
<marker id="dist_connection_close"/>
<warning>
<p>
In the cases where the Erlang distribution connection is taken
down by <c>peer:stop/1</c>, other code independent of the peer
code might react to the connection loss before the peer node is
stopped which might cause undesirable effects. For example,
<seeerl marker="kernel:global#prevent_overlapping_partitions"><c>global</c></seeerl>
might trigger even more Erlang distribution connections to other
nodes to be taken down. The potential undesirable effects are,
however, not limited to this. It is hard to say what the effects
will be since these effects can be caused by any code with links
or monitors to something on the origin node, or code monitoring
the connection to the origin node.
</p>
</warning>
</desc>
</func>
</funcs>
</erlref>