| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Valgrind reported:
1925: schema conversion online - standalone
==10727== 689 (56 direct, 633 indirect) bytes in 1 blocks are definitely lost in loss record 64 of 66
==10727== at 0x4C2FB55: calloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==10727== by 0x449D42: xcalloc (util.c:121)
==10727== by 0x40F45C: ovsdb_schema_create (ovsdb.c:41)
==10727== by 0x40F7F8: ovsdb_schema_from_json (ovsdb.c:217)
==10727== by 0x40FB4E: ovsdb_schema_from_file (ovsdb.c:101)
==10727== by 0x40B156: do_convert (ovsdb-client.c:1639)
==10727== by 0x4061C6: main (ovsdb-client.c:282)
This patch fixes it.
Acked-by: William Tu <u9012063@gmail.com>
Signed-off-by: Yifeng Sun <pkusunyifeng@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Valgrind reported:
1925: schema conversion online - standalone
==10884== 689 (56 direct, 633 indirect) bytes in 1 blocks are definitely lost in loss record 384 of 420
==10884== at 0x4C2FB55: calloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==10884== by 0x44A592: xcalloc (util.c:121)
==10884== by 0x40E2EC: ovsdb_schema_create (ovsdb.c:41)
==10884== by 0x40E688: ovsdb_schema_from_json (ovsdb.c:217)
==10884== by 0x416C6F: ovsdb_trigger_try (trigger.c:246)
==10884== by 0x40D4DE: ovsdb_jsonrpc_trigger_create (jsonrpc-server.c:1119)
==10884== by 0x40D4DE: ovsdb_jsonrpc_session_got_request (jsonrpc-server.c:986)
==10884== by 0x40D4DE: ovsdb_jsonrpc_session_run (jsonrpc-server.c:556)
==10884== by 0x40D4DE: ovsdb_jsonrpc_session_run_all (jsonrpc-server.c:586)
==10884== by 0x40D4DE: ovsdb_jsonrpc_server_run (jsonrpc-server.c:401)
==10884== by 0x406A6E: main_loop (ovsdb-server.c:209)
==10884== by 0x406A6E: main (ovsdb-server.c:460)
'new_schema' should also be freed when there is no error.
This patch fixes it.
Acked-by: William Tu <u9012063@gmail.com>
Signed-off-by: Yifeng Sun <pkusunyifeng@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Valgrind reported:
1924: compacting online - cluster
==29312== 2,886 (240 direct, 2,646 indirect) bytes in 6 blocks are definitely lost in loss record 406 of 413
==29312== at 0x4C2DB8F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==29312== by 0x44A5F4: xmalloc (util.c:138)
==29312== by 0x4308EA: json_create (json.c:1451)
==29312== by 0x4308EA: json_object_create (json.c:254)
==29312== by 0x430ED0: json_parser_push_object (json.c:1273)
==29312== by 0x430ED0: json_parser_input (json.c:1371)
==29312== by 0x431CF1: json_lex_input (json.c:991)
==29312== by 0x43233B: json_parser_feed (json.c:1149)
==29312== by 0x41D87F: parse_body.isra.0 (log.c:411)
==29312== by 0x41E141: ovsdb_log_read (log.c:476)
==29312== by 0x42646D: raft_read_log (raft.c:866)
==29312== by 0x42646D: raft_open (raft.c:951)
==29312== by 0x4151AF: ovsdb_storage_open__ (storage.c:81)
==29312== by 0x408FFC: open_db (ovsdb-server.c:642)
==29312== by 0x40657F: main (ovsdb-server.c:358)
This patch fixes it.
Acked-by: William Tu <u9012063@gmail.com>
Signed-off-by: Yifeng Sun <pkusunyifeng@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
OVN is separated into its own repo. This commit removes the OVN source,
OVN tests, and OVN documentation. It also removes mentions of OVN from
most documentation. The only place where OVN has been left is in
changelogs/NEWS, since we shouldn't mess with the history of the
project.
There is an exception here. The ovsdb-cluster tests rely on ovn-nbctl
and ovn-sbctl to run. Therefore those ovn utilities, as well as their
dependencies remain in the repo with this commit.
Acked-by: Numan Siddique <nusiddiq@redhat.com>
Signed-off-by: Mark Michelson <mmichels@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch store the latest election timer in snapshot during log
compression, and when server restarts it reads the value from the log.
Without this, any previous changes to election timer will be lost
in the log, and if server restarts, it will use the default value
instead of the changed value.
Fixes: commit 8e35461 ("ovsdb raft: Support leader election time change online.")
Signed-off-by: Han Zhou <hzhou8@ebay.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
After election timer is changed through cluster/change-election-timer
command, if a server restarts, it firstly initializes with the default
value and use it to reset the timer. Although it reads the latest
timer value later from the log, the first timeout may be much shorter
than expected by other servers that use latest timeout, and it would
start election before it receives the first heartbeat from the leader.
This patch fixes it by changing the order of reading log and resetting
timer so that the latest value is read from the log before the initial
resetting of the timer.
Fixes: commit 8e35461 ("ovsdb raft: Support leader election time change online.")
Signed-off-by: Han Zhou <hzhou8@ebay.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When a client uses monitor-cond-since with a non-zero last-id but the
server is not in cluster mode for the DB being monitored, it leads to
segmentation fault because the txn_history list is not initialized in
this case.
Program terminated with signal SIGSEGV, Segmentation fault.
1536 struct ovsdb_txn *txn = h_node->txn;
(gdb) bt
0 ovsdb_monitor_get_changes_after (txn_uuid=txn_uuid@entry=0x7ffe8605b7e0, dbmon=0x17c1b40, p_mcs=p_mcs@entry=0x17c4900) at ovsdb/monitor.c:1536
1 0x000000000040da2d in ovsdb_jsonrpc_monitor_create (request_id=0x1804630, version=<optimized out>, params=0x17ad330, db=0x18015b0, s=<optimized out>) at ovsdb/jsonrpc-server.c:1469
2 ovsdb_jsonrpc_session_got_request (request=0x17ad520, s=<optimized out>) at ovsdb/jsonrpc-server.c:1002
3 ovsdb_jsonrpc_session_run (s=<optimized out>) at ovsdb/jsonrpc-server.c:556
...
Although it doesn't happen in normal use cases, no one can prevent a
client to send this on purpose or in a corner case when a client firstly
connected to a clustered DB but later the server restarted with a
non-clustered DB.
This patch fixes it by always initialize the txn_history list to avoid
the undefined behavior in this case. It adds a test case to cover it, too.
Fixes: 695e815 ("ovsdb-server: Transaction history tracking.")
Reported-by: Aliasgar Ginwala <aginwala@ebay.com>
Signed-off-by: Han Zhou <hzhou8@ebay.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
|
|
|
|
|
|
|
|
|
|
|
|
| |
A new unixctl command cluster/change-election-timer is implemented to
change leader election timeout base value according to the scale needs.
The change takes effect upon consensus of the cluster, implemented through
the append-request RPC. A new field "election-timer" is added to raft log
entry for this purpose.
Signed-off-by: Han Zhou <hzhou8@ebay.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
candiate_retrying is used to determine if the current node is disconnected
from the cluster when the node is in candiate role. However, a node
can flap between candidate and follower role before a leader is elected
when majority of the cluster is down, so is_connected() will flap, too, which
confuses clients.
This patch avoids the flapping with the help of a new member had_leader,
so that if no leader was elected since last election, we know we are
still retrying, and keep as disconnected from the cluster.
Signed-off-by: Han Zhou <hzhou8@ebay.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
As mentioned in RAFT paper, section 6.2:
Leaders: A server might be in the leader state, but if it isn’t the current
leader, it could be needlessly delaying client requests. For example, suppose a
leader is partitioned from the rest of the cluster, but it can still
communicate with a particular client. Without additional mechanism, it could
delay a request from that client forever, being unable to replicate a log entry
to any other servers. Meanwhile, there might be another leader of a newer term
that is able to communicate with a majority of the cluster and would be able to
commit the client’s request. Thus, a leader in Raft steps down if an election
timeout elapses without a successful round of heartbeats to a majority of its
cluster; this allows clients to retry their requests with another server.
Reported-by: Aliasgar Ginwala <aginwala@ebay.com>
Tested-by: Aliasgar Ginwala <aginwala@ebay.com>
Signed-off-by: Han Zhou <hzhou8@ebay.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
|
|
|
|
|
|
| |
Fixes: commit 5a9b53a5 ("ovsdb raft: Fix duplicated transaction execution when leader failover.")
Signed-off-by: Han Zhou <hzhou8@ebay.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Prior to this patch, only db change aware connections were dropped
on a read/write status change. However, current schema in OVN does
not allow clients to monitor whether a particular DB changes this
status. In order to accomplish this, we'd need to change the schema
and adapting ovsdb-server and existing clients.
Before tackling that, this patch is changing ovsdb-server to drop
*all* the existing connections upon a read/write status change. This
will force clients to reconnect and honor the change.
Reported-at: https://mail.openvswitch.org/pipermail/ovs-discuss/2019-July/048981.html
Signed-off-by: Daniel Alvarez <dalvarez@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
|
|
|
|
|
| |
Signed-off-by: Justin Pettit <jpettit@ovn.org>
Acked-by: Ben Pfaff <blp@ovn.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If a server claims itself as "disconnected", all clients connected
to that server will try to reconnect to a new server in the cluster.
However, currently a server would claim itself as disconnected even
when itself is the candidate and try to become the new leader (most
likely it will be), and all its clients will reconnect to another
node.
During a leader fail-over (e.g. due to a leader failure), it is
expected that all clients of the old leader will have to reconnect
to other nodes in the cluster, but it is unnecessary for all the
clients of a healthy node to reconnect, which could cause more
disturbance in a large scale environment.
This patch fixes the problem by slightly change the condition that
a server regards itself as disconnected: if its role is candidate,
it is regarded as disconnected only if the election didn't succeed
at the first attempt. Related failure test cases are also unskipped
and all passed with this patch.
Signed-off-by: Han Zhou <hzhou8@ebay.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When a transaction is submitted from a client connected to a follower,
if leader crashes after receiving the execute_command_request from the
follower and sending out append request to the majority of followers,
but before sending execute_command_reply to the follower. The
transaction would finally got commited by the new leader. However,
with current implementation the transaction would be commited twice.
For the root cause, there are two cases:
Case 1, the connected follower becomes the new leader. In this case,
the pending command of the follower will be cancelled during its role
changing to leader, so the trigger for the transaction will be retried.
Case 2, another follower becomes the new leader. In this case, since
there is no execute_command_reply from the original leader (which has
crashed), the command will finally timed out, causing the trigger for
the transaction retried.
In both cases, the transaction will be retried by the server node's
trigger retrying logic. This patch fixes the problem by below changes:
1) A pending command can be completed not only by
execute_command_reply, but also when the eid is committed, if the
execute_command_reply never came.
2) Instead of cancelling all pending commands during role change, let
the commands continue waiting to be completed when the eid is
committed. The timer is increased to be twice the election base time,
so that it has the chance to be completed when leader crashes.
This patch fixes the two raft failure test cases previously disabled.
See the test case for details of how to reproduce the problem.
Signed-off-by: Han Zhou <hzhou8@ebay.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
|
|
|
|
|
|
|
|
| |
raft_command's eid should always be non-null in all 3 cases. Fix the
comment, and also replace if condition with assert.
Signed-off-by: Han Zhou <hzhou8@ebay.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
|
|
|
|
|
|
|
|
|
|
|
| |
Added unix commands cluster/... for ovsdb raft, which will be used in a future
patch to test more fine-grained failure scenarios. The commands either causes
a node to crash at certain point, or manipulate the election timer so that
we can control the election process to elect a new leader we desired for the
test cases.
Signed-off-by: Han Zhou <hzhou8@ebay.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When update is requested from follower, the leader sends AppendRequest
to all followers and wait until AppendReply received from majority, and
then it will update commit index - the new entry is regarded as committed
in raft log. However, this commit will not be notified to followers
(including the one initiated the request) until next heartbeat (ping
timeout), if no other pending requests. This results in long latency
for updates made through followers, especially when a batch of updates
are requested through the same follower.
$ time for i in `seq 1 100`; do ovn-nbctl ls-add ls$i; done
real 0m34.154s
user 0m0.083s
sys 0m0.250s
This patch solves the problem by sending heartbeat as soon as the commit
index is updated in leader. It also avoids unnessary heartbeat by resetting
the ping timer whenever AppendRequest is broadcasted. With this patch
the performance is improved more than 50 times in same test:
$ time for i in `seq 1 100`; do ovn-nbctl ls-add ls$i; done
real 0m0.564s
user 0m0.080s
sys 0m0.199s
Torture test cases are also updated because otherwise the tests will
all be skipped because of the improved performance.
Signed-off-by: Han Zhou <hzhou8@ebay.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In current OVSDB Raft design, when there are multiple transactions
pending, either from same server node or different nodes in the
cluster, only the first one can be successful at once, and following
ones will fail at the prerequisite check on leader node, because
the first one will update the expected prerequisite eid on leader
node, and the prerequisite used for proposing a commit has to be
committed eid, so it is not possible for a node to use the latest
prerequisite expected by the leader to propose a commit until the
lastest transaction is committed by the leader and updated the
committed_index on the node.
Current implementation proposes the commit as soon as the transaction
is requested by the client, which results in continously retry which
causes high CPU load and waste.
Particularly, even if all clients are using leader_only to connect to
only the leader, the prereq check failure still happens a lot when
a batch of transactions are pending on the leader node - the leader
node proposes a batch of commits using the same committed eid as
prerequisite and it updates the expected prereq as soon as the first
one is in progress, but it needs time to append to followers and wait
until majority replies to update the committed_index, which results in
continously useless retries of the following transactions proposed by
the leader itself.
This patch doesn't change the design but simplely pre-checks if current
eid is same as prereq, before proposing the commit, to avoid waste of
CPU cycles, for both leader and followers. When clients use leader_only
mode, this patch completely eliminates the prereq check failures.
In scale test of OVN with 1k HVs and creating and binding 10k lports,
the patch resulted in 90% CPU cost reduction on leader and >80% CPU cost
reduction on followers. (The test was with leader election base time
set to 10000ms, because otherwise the test couldn't complete because
of the frequent leader re-election.)
This is just one of the related performance problems of the prereq
checking mechanism dicussed at:
https://mail.openvswitch.org/pipermail/ovs-discuss/2019-February/048243.html
Signed-off-by: Han Zhou <hzhou8@ebay.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
|
|
|
|
|
|
|
|
| |
Run triggers after storage_run and read_db to make sure new raft
updates are utilized in current iteration.
Signed-off-by: Han Zhou <hzhou8@ebay.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Support the new monitor method monitor_cond_since so that a client
can request monitoring start from a specific point instead of always
from beginning. This will reduce the cost at scenarios when server
is restarted/failed-over but client still has all existing data. In
these scenarios only new changes (and in most cases no change) needed
to be transfered to client. When ovsdb-server restarted, history
transactions are read from disk file; when ovsdb-server failed over,
history transactions exists already in the memory of the new server.
There are situations that the requested transaction may not be found.
For example, a transaction that is too old and has been discarded
from the maintained history list in memory, or the transactions on
disk has been compacted during ovsdb compaction. In those situations
the server fall backs to transfer all data start from begining.
For more details of the protocol change, see
Documentation/ref/ovsdb-server.7.rst.
This change includes both server side and ovsdb-client side changes
with the new protocol. IDLs using this capability will be added in
future patches.
Now the feature takes effect only for cluster mode of ovsdb-server,
because cluster mode is the only mode that supports unique transcation
uuid today. For other modes, the monitor_cond_since always fall back
to transfer all data with found = false. Support for those modes can
be added in the future.
Signed-off-by: Han Zhou <hzhou8@ebay.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
|
|
|
|
|
|
|
|
|
| |
Maintaining last N (n = 100) transactions in memory, which will be
used for future patches for generating monitor data from any point
in this N transactions.
Signed-off-by: Han Zhou <hzhou8@ebay.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Current ovsdb monitor maintains its own transaction version through an
incremental integer and use it to identify changes starting from different
version, and also use it to figure out if each set of changes should be
flushed. In particular, it uses number 0 to represent that the change set
contains all data for initial client population. It is a smart way but it
prevents further extension of the monitoring mechanism to support future use
case for clients to request changes starting from a given history point. This
patch refactors the structures so that change sets are referenced directly
through the pointer. It uses additional members such as init_change_set,
new_change_set to indicates the specific change set explicitely, instead of
through calculated version numbers based on implicite rules.
At the same time, this patch provides better encapsulation for change set
(composed of data in a list of tables), while still allowing traversing
across change sets for a given table.
Signed-off-by: Han Zhou <hzhou8@ebay.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
|
|
|
|
|
|
|
|
|
|
|
| |
Ommiting the parameter names in prototypes, as suggested by coding
style: Omit parameter names from function prototypes when the names
do not give useful information.
Adjust orders of parameters as suggested by coding style.
Signed-off-by: Han Zhou <hzhou8@ebay.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
|
|
|
|
|
| |
Signed-off-by: Han Zhou <hzhou8@ebay.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The OVSDB conditional monitor implementation allows many clients
to share same copy of monitored data if the clients are sharing
same tables and columns being monitored, while they can have
different monitor conditions. In monitor conditions they can
have different columns which can be different from the columns
being monitored. So the struct ovsdb_monitor_table maintains the
union of the all the columns being used in any conditions.
The problem of the current implementation is that for each change
set generated, it doesn't maintain any metadata for the number of
columns for the data that has already populated in it. Instead, it
always rely on the n_columns field of the struct ovsdb_monitor_table
to manipulate the data. However, the n_columns in struct
ovsdb_monitor_table can increase (e.g. when a client changes its
condition which involves more columns). So it can result in that
the existing rows in a change set with N columns being later processed
as if it had more than N columns, typically, when the row is freed.
This causes the ovsdb-server crashing (see an example of the
backtrace).
The patch fixes the problem by maintaining n_columns for each
change set, and added a test case which fails without the fix.
(gdb) bt
at lib/ovsdb-data.c:1031
out>, mt=<optimized out>) at ovsdb/monitor.c:320
mt=0x1e7b940) at ovsdb/monitor.c:333
out>, transaction=<optimized out>) at ovsdb/monitor.c:527
initial=<optimized out>, cond_updated=cond_updated@entry=false,
unflushed_=unflushed_@entry=0x20dae70,
condition=<optimized out>, version=<optimized out>) at ovsdb/monitor.c:1156
(m=m@entry=0x20dae40, initial=initial@entry=false) at
ovsdb/jsonrpc-server.c:1655
at ovsdb/jsonrpc-server.c:1729
ovsdb/jsonrpc-server.c:551
ovsdb/jsonrpc-server.c:586
ovsdb/jsonrpc-server.c:401
exiting=0x7ffdb947f76f, run_process=0x0, remotes=0x7ffdb947f7c0,
unixctl=0x1e7a560, all_dbs=0x7ffdb947f800,
jsonrpc=<optimized out>, config=0x7ffdb947f820) at ovsdb/ovsdb-server.c:209
Signed-off-by: Han Zhou <hzhou8@ebay.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
|
|
|
|
|
| |
Signed-off-by: Han Zhou <hzhou8@ebay.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
|
|
|
|
|
| |
Signed-off-by: Justin Pettit <jpettit@ovn.org>
Acked-by: Ben Pfaff <blp@ovn.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
On some systems in case where remote is not responding, socket could
remain in SYN_SENT state for a really long time without errors waiting
for connection. This leads to situations where open_blok() hangs for
a few minutes waiting for connection to the DOWN remote.
For example, our "multiple remotes" idl tests hangs waiting for
connection to the WRONG_PORT on FreeBSD in CirrusCI environment.
This leads to test failures because Alarm signal arrives much faster
than ETIMEDOUT from the socket.
This patch allowes to specify timeout value for 'open_block' function.
If the connection takes more time, socket will be closed with
ETIMEDOUT error code. Negative value or None in python could be
used to wait infinitely.
Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
|
|
|
|
|
| |
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Justin Pettit <jpettit@ovn.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When server A sends the leader a request to remove server B from the
cluster, where A != B, the leader sends both A and B a notification when
the removal is complete. Until now, however, the notification (which is a
raft_remove_server_reply message) did not say which server had been
removed, and the receiver did not check. Instead, the receiver assumed
that it had been removed. The result was that B was removed and A stopped
serving out the database even though it was still part of the cluster,
This commit fixes the problem.
Reported-by: ramteja tadishetti <ramtejatadishetti@gmail.com>
Acked-by: Mark Michelson <mmichels@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
|
|
|
|
|
|
|
|
|
| |
When the server is leaving the cluster but remains leader, the
raft_find_server() call can return NULL. Previously this caused a null
dereference. This commit fixes the problem.
Acked-by: Mark Michelson <mmichels@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
raft_update_commit_index() iterates over a sequence of entries that may
have up to two components: a set of servers and a piece of data. When
a set of servers is present, it calls raft_run_reconfigure(), which can
call through the following chain of functions in some cases:
raft_log_reconfiguration()
raft_command_execute__()
raft_command_initiate()
raft_write_entry()
raft_add_entry()
and raft_add_entry() can reallocate raft->entries, which turns the pointer
'e' that raft_update_commit_index() has to the current entry into a wild
pointer.
This commit fixes the problem.
Acked-by: Mark Michelson <mmichels@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
|
|
|
|
|
|
|
|
| |
For debugging, it is useful to know the source code line that sent a
given RPC message.
Acked-by: Mark Michelson <mmichels@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When 'make check' is called by the mock rpm build (which disables networking),
the test "ovn-nbctl: LBs - daemon" fails when it runs the command
"ovn-nbctl lb-add lb0 30.0.0.1a 192.168.10.10:80,192.168.10.20:80". ovn-nbctl
extracts the vip by calling the socket util function 'inet_parse_active()',
and this function blocks when libunbound function ub_resolve() is called
further down. ub_resolve() is a blocking function without timeout and all the
ovs/ovn utilities use this function.
As reported by Timothy Redaelli, the issue can also be reproduced by running
the below commands
$ sudo unshare -mn -- sh -c 'ip addr add dev lo 127.0.0.1 && \
mount --bind /dev/null /etc/resolv.conf && runuser $SUDO_USER'
$ make sandbox SANDBOXFLAGS="--ovn"
$ ovn-nbctl -vsocket_util:off lb-add lb0 30.0.0.1a \
192.168.10.10:80,192.168.10.20:80
To address this issue, this patch adds a new bool argument 'resolve_host' to
the function inet_parse_active() to resolve the host only if it is 'true'.
ovn-nbctl/ovn-northd will pass 'false' when it calls this function to parse
the load balancer values.
Reported-by: Timothy Redaelli <tredaelli@redhat.com>
Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=1641672
Signed-off-by: Numan Siddique <nusiddiq@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
|
|
|
|
|
|
|
|
| |
This wasn't clear from the documentation.
Reported-by; Paul Greenberg <greenpau@outlook.com>
Reviewed-by: Yifeng Sun <pkusunyifeng@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This allows to not redefine common macroses in every single
file and allowes using things like .EX without warying about
compatibility.
manpages.mk updated automatically.
Files that are already complete pages (i.e. has no *.in sources)
wasn't touched, because this will require additional file
manipulations and changes in makefiles/specs without serious
profit.
Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
|
|
|
|
|
|
|
|
|
|
|
|
| |
When relational comparisons against optional scalars were introduced, it
was meant to work only when the right-hand side of the comparison was a
scalar, not the empty set. The implementation wasn't that picky. This
commit fixes the problem.
CC: Terry Wilson <twilson@redhat.com>
Fixes: 09e256031a62 ("ovsdb: Allow comparison on optional scalar types")
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Justin Pettit <jpettit@ovn.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Open vSwitch 2.4 introduced an OVSDB extension in which a column with
type optional integer or real could be compared with the operators <,
<=, >, and >=. At the same time, it broke the implementation of the
operators ==, !=, includes, and excludes on columns with the same types.
This fixes the problem.
Reported-by: Hans Ole Rafaelsen <hrafaelsen@gmail.com>
Reported-at: https://mail.openvswitch.org/pipermail/ovs-discuss/2018-September/047356.html
CC: Terry Wilson <twilson@redhat.com>
Fixes: 09e256031a62 ("ovsdb: Allow comparison on optional scalar types")
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Justin Pettit <jpettit@ovn.org>
|
|
|
|
|
|
|
| |
This patch fixes the incorrect index to argv.
Signed-off-by: Yifeng Sun <pkusunyifeng@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The present code resets the database when it is in the state -
'RPL_S_SCHEMA_REQUESTED' and repopulates the database when it
receives the monitor reply when it is in the state -
'RPL_S_MONITOR_REQUESTED'. If however, it goes to active mode
before it processes the monitor reply, the whole data is lost.
This patch alleviates the problem by resetting the database when it
receives the monitor reply (before processing it). So that
reset and repopulation of the db happens in the same state.
This approach still has a window for data loss if the function
process_notification() when processing the monitor reply fails for
some reason or ovsdb-server crashes for some reason during
process_notification().
Reported-by: Han Zhou <zhouhan@gmail.com>
Reported-at: https://mail.openvswitch.org/pipermail/ovs-discuss/2018-August/047161.html
Tested-by: aginwala <aginwala@ebay.com>
Acked-by: Han Zhou <zhouhan@gmail.com>
Signed-off-by: Numan Siddique <nusiddiq@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The *_index_init_row() function casts a generic ovsdb_idl_row pointer to
a specific type of row pointer. This can cause an increase in required
alignment with some kinds of data on some architectures. GCC complains,
e.g.:
lib/vswitch-idl.c: In function 'ovsrec_flow_sample_collector_set_index_init_row'
lib/vswitch-idl.c:9277:12: warning: cast increases required alignment of target
However, rows are always allocated with malloc(), which returns member
suitable for any type, so this is a false positive warning and this commit
suppresses it.
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Han Zhou <zhouhan@gmail.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The "wait" command in ovsdb-client (which was introduced as part of the
clustering support) fairly often logs things that are normal for it but
in other circumstances might be cause for concern, for example messages
about being unable to connect to a remote. Until now, it has tried to
suppress some of those itself by raising log levels. Unfortunately, in
some cases this had the opposite effect because it overrode any settings on
the command line, such as an attempt in ovsdb-cluster.at to suppress all
logging related to the timeval module. This commit drops the special
log levels from the "wait" command and puts equivalents into the tests
themselves.
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Justin Pettit <jpettit@ovn.org>
|
|
|
|
|
|
|
|
|
|
|
| |
Introduce new 'OVS_CTL_TIMEOUT' environment variable
that, if set, will be used as a default timeout for
OVS control utilities. Setting it in 'atlocal.in' will
cover all the hangs inside the testsuite, even when
utils called in a subshell.
Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Parsing of the '--timeout' option implemented differently
for every single control utility and, which is more
important, highly inaccurate. In most cases unsigned result
of 'strtoul' stored in signed variable. Parsing failures are
not tracked. 'ovs-appctl' even uses just 'atoi' without any
checking of the argument or result.
This patch unifies the parsing by using 'str_to_uint'.
Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When ovsdb-server closes a remote connection, it logs a message about it
that includes the reason. Until now this has included sessions that it
closes when it exits. That meant that, when --run was used, there was a
race between noticing that the subprocess exited and noticing that the
session that that subprocess (presumably) had open had been closed. If
it noticed the latter first, nothing was logged (because it didn't log
anything if a session was closed in the ordinary way by the client). If
it noticed the former first, it logged a message about closing the session
itself.
This is a benign race that causes no real problems--except that the tests
didn't expect to see the log message from the former case and fail with
errors like the following:
1826. ovsdb-server.at:92: testing truncating database log with bad transaction ...
./ovsdb-server.at:96: ovsdb-tool create db schema
stderr:
stdout:
./ovsdb-server.at:104: ovsdb-server --remote=punix:socket db --run="sh txnfile"
--- /dev/null 2018-04-24 08:50:58.769000000 +0000
+++ /root/openvswitch-2.9.2/rpm/rpmbuild/BUILD/openvswitch-2.9.2/tests/testsuite.dir/at-groups/1826/stderr 2018-05-29 14:29:56.529257295 +0000
@@ -0,0 +1,2 @@
+2018-05-29T14:29:56Z|00001|ovsdb_jsonrpc_server|INFO|unix#0: disconnecting (removing ordinals database due to server termination)
+2018-05-29T14:29:56Z|00002|ovsdb_jsonrpc_server|INFO|unix#0: disconnecting (removing _Server database due to server termination)
This fixes the race. This particular log message isn't too useful since
it's pretty obvious that ovsdb-server is closing those sessions, since
after all it's exiting!
Reported-by: Sanket Sudake <sanket@infracloud.io>
Reported-at: https://mail.openvswitch.org/pipermail/ovs-discuss/2018-May/046840.html
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Numan Siddique <nusiddiq@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
| |
tables.
This is a follow-up patch for commit 0eb1e37c, to add more interfaces
that supports passing around pointers of individual tables, which will
be used in incremental processing.
Signed-off-by: Han Zhou <hzhou8@ebay.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
raft_store_snapshot() constructs a new snapshot in a local variable then
destroys the current snapshot and replaces it by the new one. Until now,
it has not cloned the data in the new snapshot until it did the
replacement. This led to the unexpected consequence that, if 'servers' in
the old and new snapshots was the same, then it would first be freed and
later cloned, which could cause a segfault.
Multiple people reported the crash. Gurucharan Shetty provided a
reproduction case.
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Mark Michelson <mmichels@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
| |
Generally when we know the leader for a term, in "check-cluster", it's
because we read that leader's log file. In that case, we have the leader's
log_end because it told us. However, taking a snapshot can discard that
data. In that case, log_end is 0 and we should not try to check for leader
completeness on that basis.
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Mark Michelson <mmichels@redhat.com>
|
|
|
|
|
|
|
|
| |
The user shouldn't provide a given server's log more than once but this
check makes sure.
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Mark Michelson <mmichels@redhat.com>
|