summaryrefslogtreecommitdiff
path: root/ovsdb
Commit message (Collapse)AuthorAgeFilesLines
* ovsdb-client: Free ovsdb_schemaYifeng Sun2019-09-191-0/+1
| | | | | | | | | | | | | | | | | | | | | Valgrind reported: 1925: schema conversion online - standalone ==10727== 689 (56 direct, 633 indirect) bytes in 1 blocks are definitely lost in loss record 64 of 66 ==10727== at 0x4C2FB55: calloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so) ==10727== by 0x449D42: xcalloc (util.c:121) ==10727== by 0x40F45C: ovsdb_schema_create (ovsdb.c:41) ==10727== by 0x40F7F8: ovsdb_schema_from_json (ovsdb.c:217) ==10727== by 0x40FB4E: ovsdb_schema_from_file (ovsdb.c:101) ==10727== by 0x40B156: do_convert (ovsdb-client.c:1639) ==10727== by 0x4061C6: main (ovsdb-client.c:282) This patch fixes it. Acked-by: William Tu <u9012063@gmail.com> Signed-off-by: Yifeng Sun <pkusunyifeng@gmail.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
* trigger: Free leaked ovsdb_schemaYifeng Sun2019-09-191-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | Valgrind reported: 1925: schema conversion online - standalone ==10884== 689 (56 direct, 633 indirect) bytes in 1 blocks are definitely lost in loss record 384 of 420 ==10884== at 0x4C2FB55: calloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so) ==10884== by 0x44A592: xcalloc (util.c:121) ==10884== by 0x40E2EC: ovsdb_schema_create (ovsdb.c:41) ==10884== by 0x40E688: ovsdb_schema_from_json (ovsdb.c:217) ==10884== by 0x416C6F: ovsdb_trigger_try (trigger.c:246) ==10884== by 0x40D4DE: ovsdb_jsonrpc_trigger_create (jsonrpc-server.c:1119) ==10884== by 0x40D4DE: ovsdb_jsonrpc_session_got_request (jsonrpc-server.c:986) ==10884== by 0x40D4DE: ovsdb_jsonrpc_session_run (jsonrpc-server.c:556) ==10884== by 0x40D4DE: ovsdb_jsonrpc_session_run_all (jsonrpc-server.c:586) ==10884== by 0x40D4DE: ovsdb_jsonrpc_server_run (jsonrpc-server.c:401) ==10884== by 0x406A6E: main_loop (ovsdb-server.c:209) ==10884== by 0x406A6E: main (ovsdb-server.c:460) 'new_schema' should also be freed when there is no error. This patch fixes it. Acked-by: William Tu <u9012063@gmail.com> Signed-off-by: Yifeng Sun <pkusunyifeng@gmail.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
* raft: Free leaked json dataYifeng Sun2019-09-191-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | Valgrind reported: 1924: compacting online - cluster ==29312== 2,886 (240 direct, 2,646 indirect) bytes in 6 blocks are definitely lost in loss record 406 of 413 ==29312== at 0x4C2DB8F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so) ==29312== by 0x44A5F4: xmalloc (util.c:138) ==29312== by 0x4308EA: json_create (json.c:1451) ==29312== by 0x4308EA: json_object_create (json.c:254) ==29312== by 0x430ED0: json_parser_push_object (json.c:1273) ==29312== by 0x430ED0: json_parser_input (json.c:1371) ==29312== by 0x431CF1: json_lex_input (json.c:991) ==29312== by 0x43233B: json_parser_feed (json.c:1149) ==29312== by 0x41D87F: parse_body.isra.0 (log.c:411) ==29312== by 0x41E141: ovsdb_log_read (log.c:476) ==29312== by 0x42646D: raft_read_log (raft.c:866) ==29312== by 0x42646D: raft_open (raft.c:951) ==29312== by 0x4151AF: ovsdb_storage_open__ (storage.c:81) ==29312== by 0x408FFC: open_db (ovsdb-server.c:642) ==29312== by 0x40657F: main (ovsdb-server.c:358) This patch fixes it. Acked-by: William Tu <u9012063@gmail.com> Signed-off-by: Yifeng Sun <pkusunyifeng@gmail.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
* Remove OVN.Mark Michelson2019-09-061-12/+11
| | | | | | | | | | | | | | | | OVN is separated into its own repo. This commit removes the OVN source, OVN tests, and OVN documentation. It also removes mentions of OVN from most documentation. The only place where OVN has been left is in changelogs/NEWS, since we shouldn't mess with the history of the project. There is an exception here. The ovsdb-cluster tests rely on ovn-nbctl and ovn-sbctl to run. Therefore those ovn utilities, as well as their dependencies remain in the repo with this commit. Acked-by: Numan Siddique <nusiddiq@redhat.com> Signed-off-by: Mark Michelson <mmichels@redhat.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
* raft: Save and read new election timer in header snapshot.Han Zhou2019-08-231-0/+7
| | | | | | | | | | | | This patch store the latest election timer in snapshot during log compression, and when server restarts it reads the value from the log. Without this, any previous changes to election timer will be lost in the log, and if server restarts, it will use the default value instead of the changed value. Fixes: commit 8e35461 ("ovsdb raft: Support leader election time change online.") Signed-off-by: Han Zhou <hzhou8@ebay.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
* raft.c: Election timer initial reset with value from log.Han Zhou2019-08-231-2/+3
| | | | | | | | | | | | | | | | | After election timer is changed through cluster/change-election-timer command, if a server restarts, it firstly initializes with the default value and use it to reset the timer. Although it reads the latest timer value later from the log, the first timeout may be much shorter than expected by other servers that use latest timeout, and it would start election before it receives the first heartbeat from the leader. This patch fixes it by changing the order of reading log and resetting timer so that the latest value is read from the log before the initial resetting of the timer. Fixes: commit 8e35461 ("ovsdb raft: Support leader election time change online.") Signed-off-by: Han Zhou <hzhou8@ebay.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
* ovsdb monitor: Fix crash when using non-zero last-id with standalone DB.Han Zhou2019-08-213-6/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | When a client uses monitor-cond-since with a non-zero last-id but the server is not in cluster mode for the DB being monitored, it leads to segmentation fault because the txn_history list is not initialized in this case. Program terminated with signal SIGSEGV, Segmentation fault. 1536 struct ovsdb_txn *txn = h_node->txn; (gdb) bt 0 ovsdb_monitor_get_changes_after (txn_uuid=txn_uuid@entry=0x7ffe8605b7e0, dbmon=0x17c1b40, p_mcs=p_mcs@entry=0x17c4900) at ovsdb/monitor.c:1536 1 0x000000000040da2d in ovsdb_jsonrpc_monitor_create (request_id=0x1804630, version=<optimized out>, params=0x17ad330, db=0x18015b0, s=<optimized out>) at ovsdb/jsonrpc-server.c:1469 2 ovsdb_jsonrpc_session_got_request (request=0x17ad520, s=<optimized out>) at ovsdb/jsonrpc-server.c:1002 3 ovsdb_jsonrpc_session_run (s=<optimized out>) at ovsdb/jsonrpc-server.c:556 ... Although it doesn't happen in normal use cases, no one can prevent a client to send this on purpose or in a corner case when a client firstly connected to a clustered DB but later the server restarted with a non-clustered DB. This patch fixes it by always initialize the txn_history list to avoid the undefined behavior in this case. It adds a test case to cover it, too. Fixes: 695e815 ("ovsdb-server: Transaction history tracking.") Reported-by: Aliasgar Ginwala <aginwala@ebay.com> Signed-off-by: Han Zhou <hzhou8@ebay.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
* ovsdb raft: Support leader election time change online.Han Zhou2019-08-215-30/+185
| | | | | | | | | | | | A new unixctl command cluster/change-election-timer is implemented to change leader election timeout base value according to the scale needs. The change takes effect upon consensus of the cluster, implemented through the append-request RPC. A new field "election-timer" is added to raft log entry for this purpose. Signed-off-by: Han Zhou <hzhou8@ebay.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
* raft.c: Set candidate_retrying if no leader elected since last election.Han Zhou2019-08-211-6/+25
| | | | | | | | | | | | | | | candiate_retrying is used to determine if the current node is disconnected from the cluster when the node is in candiate role. However, a node can flap between candidate and follower role before a leader is elected when majority of the cluster is down, so is_connected() will flap, too, which confuses clients. This patch avoids the flapping with the help of a new member had_leader, so that if no leader was elected since last election, we know we are still retrying, and keep as disconnected from the cluster. Signed-off-by: Han Zhou <hzhou8@ebay.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
* raft.c: Stale leader should disconnect from cluster.Han Zhou2019-08-212-3/+43
| | | | | | | | | | | | | | | | | | | | As mentioned in RAFT paper, section 6.2: Leaders: A server might be in the leader state, but if it isn’t the current leader, it could be needlessly delaying client requests. For example, suppose a leader is partitioned from the rest of the cluster, but it can still communicate with a particular client. Without additional mechanism, it could delay a request from that client forever, being unable to replicate a log entry to any other servers. Meanwhile, there might be another leader of a newer term that is able to communicate with a majority of the cluster and would be able to commit the client’s request. Thus, a leader in Raft steps down if an election timeout elapses without a successful round of heartbeats to a majority of its cluster; this allows clients to retry their requests with another server. Reported-by: Aliasgar Ginwala <aginwala@ebay.com> Tested-by: Aliasgar Ginwala <aginwala@ebay.com> Signed-off-by: Han Zhou <hzhou8@ebay.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
* raft: Move raft_reset_ping_timer() out of the loop.Han Zhou2019-08-211-1/+1
| | | | | | Fixes: commit 5a9b53a5 ("ovsdb raft: Fix duplicated transaction execution when leader failover.") Signed-off-by: Han Zhou <hzhou8@ebay.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
* ovsdb-server: drop all connections on read/write status changeDaniel Alvarez2019-07-101-1/+1
| | | | | | | | | | | | | | | | Prior to this patch, only db change aware connections were dropped on a read/write status change. However, current schema in OVN does not allow clients to monitor whether a particular DB changes this status. In order to accomplish this, we'd need to change the schema and adapting ovsdb-server and existing clients. Before tackling that, this patch is changing ovsdb-server to drop *all* the existing connections upon a read/write status change. This will force clients to reconnect and honor the change. Reported-at: https://mail.openvswitch.org/pipermail/ovs-discuss/2019-July/048981.html Signed-off-by: Daniel Alvarez <dalvarez@redhat.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
* ovsdb-client.1: Fix typo.Justin Pettit2019-05-241-1/+1
| | | | | Signed-off-by: Justin Pettit <jpettit@ovn.org> Acked-by: Ben Pfaff <blp@ovn.org>
* ovsdb raft: Avoid unnecessary reconnecting during leader election.Han Zhou2019-04-221-2/+11
| | | | | | | | | | | | | | | | | | | | | | | | | If a server claims itself as "disconnected", all clients connected to that server will try to reconnect to a new server in the cluster. However, currently a server would claim itself as disconnected even when itself is the candidate and try to become the new leader (most likely it will be), and all its clients will reconnect to another node. During a leader fail-over (e.g. due to a leader failure), it is expected that all clients of the old leader will have to reconnect to other nodes in the cluster, but it is unnecessary for all the clients of a healthy node to reconnect, which could cause more disturbance in a large scale environment. This patch fixes the problem by slightly change the condition that a server regards itself as disconnected: if its role is candidate, it is regarded as disconnected only if the election didn't succeed at the first attempt. Related failure test cases are also unskipped and all passed with this patch. Signed-off-by: Han Zhou <hzhou8@ebay.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
* ovsdb raft: Fix duplicated transaction execution when leader failover.Han Zhou2019-04-151-27/+49
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When a transaction is submitted from a client connected to a follower, if leader crashes after receiving the execute_command_request from the follower and sending out append request to the majority of followers, but before sending execute_command_reply to the follower. The transaction would finally got commited by the new leader. However, with current implementation the transaction would be commited twice. For the root cause, there are two cases: Case 1, the connected follower becomes the new leader. In this case, the pending command of the follower will be cancelled during its role changing to leader, so the trigger for the transaction will be retried. Case 2, another follower becomes the new leader. In this case, since there is no execute_command_reply from the original leader (which has crashed), the command will finally timed out, causing the trigger for the transaction retried. In both cases, the transaction will be retried by the server node's trigger retrying logic. This patch fixes the problem by below changes: 1) A pending command can be completed not only by execute_command_reply, but also when the eid is committed, if the execute_command_reply never came. 2) Instead of cancelling all pending commands during role change, let the commands continue waiting to be completed when the eid is committed. The timer is increased to be twice the election base time, so that it has the chance to be completed when leader crashes. This patch fixes the two raft failure test cases previously disabled. See the test case for details of how to reproduce the problem. Signed-off-by: Han Zhou <hzhou8@ebay.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
* ovsdb raft: cmd->eid should always be non-null.Han Zhou2019-04-151-6/+3
| | | | | | | | raft_command's eid should always be non-null in all 3 cases. Fix the comment, and also replace if condition with assert. Signed-off-by: Han Zhou <hzhou8@ebay.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
* ovsdb raft: Support commands that are required for testing failure scenarios.Han Zhou2019-04-151-0/+83
| | | | | | | | | | | Added unix commands cluster/... for ovsdb raft, which will be used in a future patch to test more fine-grained failure scenarios. The commands either causes a node to crash at certain point, or manipulate the election timer so that we can control the election process to elect a new leader we desired for the test cases. Signed-off-by: Han Zhou <hzhou8@ebay.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
* ovsdb raft: Sync commit index to followers without delay.Han Zhou2019-04-151-14/+29
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When update is requested from follower, the leader sends AppendRequest to all followers and wait until AppendReply received from majority, and then it will update commit index - the new entry is regarded as committed in raft log. However, this commit will not be notified to followers (including the one initiated the request) until next heartbeat (ping timeout), if no other pending requests. This results in long latency for updates made through followers, especially when a batch of updates are requested through the same follower. $ time for i in `seq 1 100`; do ovn-nbctl ls-add ls$i; done real 0m34.154s user 0m0.083s sys 0m0.250s This patch solves the problem by sending heartbeat as soon as the commit index is updated in leader. It also avoids unnessary heartbeat by resetting the ping timer whenever AppendRequest is broadcasted. With this patch the performance is improved more than 50 times in same test: $ time for i in `seq 1 100`; do ovn-nbctl ls-add ls$i; done real 0m0.564s user 0m0.080s sys 0m0.199s Torture test cases are also updated because otherwise the tests will all be skipped because of the improved performance. Signed-off-by: Han Zhou <hzhou8@ebay.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
* ovsdb raft: Precheck prereq before proposing commit.Han Zhou2019-03-078-4/+28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In current OVSDB Raft design, when there are multiple transactions pending, either from same server node or different nodes in the cluster, only the first one can be successful at once, and following ones will fail at the prerequisite check on leader node, because the first one will update the expected prerequisite eid on leader node, and the prerequisite used for proposing a commit has to be committed eid, so it is not possible for a node to use the latest prerequisite expected by the leader to propose a commit until the lastest transaction is committed by the leader and updated the committed_index on the node. Current implementation proposes the commit as soon as the transaction is requested by the client, which results in continously retry which causes high CPU load and waste. Particularly, even if all clients are using leader_only to connect to only the leader, the prereq check failure still happens a lot when a batch of transactions are pending on the leader node - the leader node proposes a batch of commits using the same committed eid as prerequisite and it updates the expected prereq as soon as the first one is in progress, but it needs time to append to followers and wait until majority replies to update the committed_index, which results in continously useless retries of the following transactions proposed by the leader itself. This patch doesn't change the design but simplely pre-checks if current eid is same as prereq, before proposing the commit, to avoid waste of CPU cycles, for both leader and followers. When clients use leader_only mode, this patch completely eliminates the prereq check failures. In scale test of OVN with 1k HVs and creating and binding 10k lports, the patch resulted in 90% CPU cost reduction on leader and >80% CPU cost reduction on followers. (The test was with leader election base time set to 10000ms, because otherwise the test couldn't complete because of the frequent leader re-election.) This is just one of the related performance problems of the prereq checking mechanism dicussed at: https://mail.openvswitch.org/pipermail/ovs-discuss/2019-February/048243.html Signed-off-by: Han Zhou <hzhou8@ebay.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
* ovsdb: Move trigger_run after storage_run and read_db.Han Zhou2019-03-041-2/+4
| | | | | | | | Run triggers after storage_run and read_db to make sure new raft updates are utilized in current iteration. Signed-off-by: Han Zhou <hzhou8@ebay.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
* ovsdb-monitor: Support monitor_cond_since.Han Zhou2019-02-287-36/+315
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Support the new monitor method monitor_cond_since so that a client can request monitoring start from a specific point instead of always from beginning. This will reduce the cost at scenarios when server is restarted/failed-over but client still has all existing data. In these scenarios only new changes (and in most cases no change) needed to be transfered to client. When ovsdb-server restarted, history transactions are read from disk file; when ovsdb-server failed over, history transactions exists already in the memory of the new server. There are situations that the requested transaction may not be found. For example, a transaction that is too old and has been discarded from the maintained history list in memory, or the transactions on disk has been compacted during ovsdb compaction. In those situations the server fall backs to transfer all data start from begining. For more details of the protocol change, see Documentation/ref/ovsdb-server.7.rst. This change includes both server side and ovsdb-client side changes with the new protocol. IDLs using this capability will be added in future patches. Now the feature takes effect only for cluster mode of ovsdb-server, because cluster mode is the only mode that supports unique transcation uuid today. For other modes, the monitor_cond_since always fall back to transfer all data with found = false. Support for those modes can be added in the future. Signed-off-by: Han Zhou <hzhou8@ebay.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
* ovsdb-server: Transaction history tracking.Han Zhou2019-02-285-1/+149
| | | | | | | | | Maintaining last N (n = 100) transactions in memory, which will be used for future patches for generating monitor data from any point in this N transactions. Signed-off-by: Han Zhou <hzhou8@ebay.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
* ovsdb-monitor: Refactor ovsdb monitor implementation.Han Zhou2019-02-283-209/+236
| | | | | | | | | | | | | | | | | | | | | Current ovsdb monitor maintains its own transaction version through an incremental integer and use it to identify changes starting from different version, and also use it to figure out if each set of changes should be flushed. In particular, it uses number 0 to represent that the change set contains all data for initial client population. It is a smart way but it prevents further extension of the monitoring mechanism to support future use case for clients to request changes starting from a given history point. This patch refactors the structures so that change sets are referenced directly through the pointer. It uses additional members such as init_change_set, new_change_set to indicates the specific change set explicitely, instead of through calculated version numbers based on implicite rules. At the same time, this patch provides better encapsulation for change set (composed of data in a list of tables), while still allowing traversing across change sets for a given table. Signed-off-by: Han Zhou <hzhou8@ebay.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
* ovsdb_monitor: Fix style of prototypes.Han Zhou2019-02-223-42/+42
| | | | | | | | | | | Ommiting the parameter names in prototypes, as suggested by coding style: Omit parameter names from function prototypes when the names do not give useful information. Adjust orders of parameters as suggested by coding style. Signed-off-by: Han Zhou <hzhou8@ebay.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
* ovsdb-client: Fix typo.Han Zhou2019-02-221-1/+1
| | | | | Signed-off-by: Han Zhou <hzhou8@ebay.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
* monitor: Fix crash when monitor condition adds new columns.Han Zhou2019-02-141-31/+51
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The OVSDB conditional monitor implementation allows many clients to share same copy of monitored data if the clients are sharing same tables and columns being monitored, while they can have different monitor conditions. In monitor conditions they can have different columns which can be different from the columns being monitored. So the struct ovsdb_monitor_table maintains the union of the all the columns being used in any conditions. The problem of the current implementation is that for each change set generated, it doesn't maintain any metadata for the number of columns for the data that has already populated in it. Instead, it always rely on the n_columns field of the struct ovsdb_monitor_table to manipulate the data. However, the n_columns in struct ovsdb_monitor_table can increase (e.g. when a client changes its condition which involves more columns). So it can result in that the existing rows in a change set with N columns being later processed as if it had more than N columns, typically, when the row is freed. This causes the ovsdb-server crashing (see an example of the backtrace). The patch fixes the problem by maintaining n_columns for each change set, and added a test case which fails without the fix. (gdb) bt at lib/ovsdb-data.c:1031 out>, mt=<optimized out>) at ovsdb/monitor.c:320 mt=0x1e7b940) at ovsdb/monitor.c:333 out>, transaction=<optimized out>) at ovsdb/monitor.c:527 initial=<optimized out>, cond_updated=cond_updated@entry=false, unflushed_=unflushed_@entry=0x20dae70, condition=<optimized out>, version=<optimized out>) at ovsdb/monitor.c:1156 (m=m@entry=0x20dae40, initial=initial@entry=false) at ovsdb/jsonrpc-server.c:1655 at ovsdb/jsonrpc-server.c:1729 ovsdb/jsonrpc-server.c:551 ovsdb/jsonrpc-server.c:586 ovsdb/jsonrpc-server.c:401 exiting=0x7ffdb947f76f, run_process=0x0, remotes=0x7ffdb947f7c0, unixctl=0x1e7a560, all_dbs=0x7ffdb947f800, jsonrpc=<optimized out>, config=0x7ffdb947f820) at ovsdb/ovsdb-server.c:209 Signed-off-by: Han Zhou <hzhou8@ebay.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
* raft.c: Remove noisy INFO logHan Zhou2019-01-281-1/+0
| | | | | Signed-off-by: Han Zhou <hzhou8@ebay.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
* ovsdb-server: Correct json-rpc comment for "disable-monitor-cond".Justin Pettit2019-01-161-1/+1
| | | | | Signed-off-by: Justin Pettit <jpettit@ovn.org> Acked-by: Ben Pfaff <blp@ovn.org>
* stream: Allow timeout configuration for open_block.Ilya Maximets2019-01-101-1/+1
| | | | | | | | | | | | | | | | | | | | On some systems in case where remote is not responding, socket could remain in SYN_SENT state for a really long time without errors waiting for connection. This leads to situations where open_blok() hangs for a few minutes waiting for connection to the DOWN remote. For example, our "multiple remotes" idl tests hangs waiting for connection to the WRONG_PORT on FreeBSD in CirrusCI environment. This leads to test failures because Alarm signal arrives much faster than ETIMEDOUT from the socket. This patch allowes to specify timeout value for 'open_block' function. If the connection takes more time, socket will be closed with ETIMEDOUT error code. Negative value or None in python could be used to wait infinitely. Signed-off-by: Ilya Maximets <i.maximets@samsung.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
* osvdb: Add some helpful comments.Ben Pfaff2018-12-032-11/+20
| | | | | Signed-off-by: Ben Pfaff <blp@ovn.org> Acked-by: Justin Pettit <jpettit@ovn.org>
* raft: Fix notifications when a server leaves the cluster.Ben Pfaff2018-11-193-12/+50
| | | | | | | | | | | | | | | When server A sends the leader a request to remove server B from the cluster, where A != B, the leader sends both A and B a notification when the removal is complete. Until now, however, the notification (which is a raft_remove_server_reply message) did not say which server had been removed, and the receiver did not check. Instead, the receiver assumed that it had been removed. The result was that B was removed and A stopped serving out the database even though it was still part of the cluster, This commit fixes the problem. Reported-by: ramteja tadishetti <ramtejatadishetti@gmail.com> Acked-by: Mark Michelson <mmichels@redhat.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
* raft: Avoid null dereference in raft_update_our_match_index().Ben Pfaff2018-11-191-2/+4
| | | | | | | | | When the server is leaving the cluster but remains leader, the raft_find_server() call can return NULL. Previously this caused a null dereference. This commit fixes the problem. Acked-by: Mark Michelson <mmichels@redhat.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
* raft: Avoid use-after-free error in raft_update_commit_index().Ben Pfaff2018-11-191-3/+6
| | | | | | | | | | | | | | | | | | | | | | raft_update_commit_index() iterates over a sequence of entries that may have up to two components: a set of servers and a piece of data. When a set of servers is present, it calls raft_run_reconfigure(), which can call through the following chain of functions in some cases: raft_log_reconfiguration() raft_command_execute__() raft_command_initiate() raft_write_entry() raft_add_entry() and raft_add_entry() can reallocate raft->entries, which turns the pointer 'e' that raft_update_commit_index() has to the current entry into a wild pointer. This commit fixes the problem. Acked-by: Mark Michelson <mmichels@redhat.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
* raft: Improve logging for sent RPCs.Ben Pfaff2018-11-191-21/+34
| | | | | | | | For debugging, it is useful to know the source code line that sent a given RPC message. Acked-by: Mark Michelson <mmichels@redhat.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
* ovn-nbctl: Fix the ovn-nbctl test "LBs - daemon" which fails during rpm buildNuman Siddique2018-11-051-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When 'make check' is called by the mock rpm build (which disables networking), the test "ovn-nbctl: LBs - daemon" fails when it runs the command "ovn-nbctl lb-add lb0 30.0.0.1a 192.168.10.10:80,192.168.10.20:80". ovn-nbctl extracts the vip by calling the socket util function 'inet_parse_active()', and this function blocks when libunbound function ub_resolve() is called further down. ub_resolve() is a blocking function without timeout and all the ovs/ovn utilities use this function. As reported by Timothy Redaelli, the issue can also be reproduced by running the below commands $ sudo unshare -mn -- sh -c 'ip addr add dev lo 127.0.0.1 && \ mount --bind /dev/null /etc/resolv.conf && runuser $SUDO_USER' $ make sandbox SANDBOXFLAGS="--ovn" $ ovn-nbctl -vsocket_util:off lb-add lb0 30.0.0.1a \ 192.168.10.10:80,192.168.10.20:80 To address this issue, this patch adds a new bool argument 'resolve_host' to the function inet_parse_active() to resolve the host only if it is 'true'. ovn-nbctl/ovn-northd will pass 'false' when it calls this function to parse the load balancer values. Reported-by: Timothy Redaelli <tredaelli@redhat.com> Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=1641672 Signed-off-by: Numan Siddique <nusiddiq@redhat.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
* ovsdb: Clarify that a server that leaves a cluster may never rejoin.Ben Pfaff2018-11-021-0/+3
| | | | | | | | This wasn't clear from the documentation. Reported-by; Paul Greenberg <greenpau@outlook.com> Reviewed-by: Yifeng Sun <pkusunyifeng@gmail.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
* manpages: Include ovs.tmac in most man roots.Ilya Maximets2018-10-303-17/+3
| | | | | | | | | | | | | | | This allows to not redefine common macroses in every single file and allowes using things like .EX without warying about compatibility. manpages.mk updated automatically. Files that are already complete pages (i.e. has no *.in sources) wasn't touched, because this will require additional file manipulations and changes in makefiles/specs without serious profit. Signed-off-by: Ilya Maximets <i.maximets@samsung.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
* condition: Reject <, <=, >=, > with optional scalar against empty set.Ben Pfaff2018-10-031-0/+5
| | | | | | | | | | | | When relational comparisons against optional scalars were introduced, it was meant to work only when the right-hand side of the comparison was a scalar, not the empty set. The implementation wasn't that picky. This commit fixes the problem. CC: Terry Wilson <twilson@redhat.com> Fixes: 09e256031a62 ("ovsdb: Allow comparison on optional scalar types") Signed-off-by: Ben Pfaff <blp@ovn.org> Acked-by: Justin Pettit <jpettit@ovn.org>
* condition: Fix ==, !=, includes, excludes on optional scalars.Ben Pfaff2018-10-031-18/+11
| | | | | | | | | | | | | | | Open vSwitch 2.4 introduced an OVSDB extension in which a column with type optional integer or real could be compared with the operators <, <=, >, and >=. At the same time, it broke the implementation of the operators ==, !=, includes, and excludes on columns with the same types. This fixes the problem. Reported-by: Hans Ole Rafaelsen <hrafaelsen@gmail.com> Reported-at: https://mail.openvswitch.org/pipermail/ovs-discuss/2018-September/047356.html CC: Terry Wilson <twilson@redhat.com> Fixes: 09e256031a62 ("ovsdb: Allow comparison on optional scalar types") Signed-off-by: Ben Pfaff <blp@ovn.org> Acked-by: Justin Pettit <jpettit@ovn.org>
* ovsdb-client: Fix a bug that uses wrong indexYifeng Sun2018-09-271-2/+2
| | | | | | | This patch fixes the incorrect index to argv. Signed-off-by: Yifeng Sun <pkusunyifeng@gmail.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
* ovsdb-server: Alleviate the possible data loss in an active/standby setupNuman Siddique2018-09-171-14/+11
| | | | | | | | | | | | | | | | | | | | | | | | The present code resets the database when it is in the state - 'RPL_S_SCHEMA_REQUESTED' and repopulates the database when it receives the monitor reply when it is in the state - 'RPL_S_MONITOR_REQUESTED'. If however, it goes to active mode before it processes the monitor reply, the whole data is lost. This patch alleviates the problem by resetting the database when it receives the monitor reply (before processing it). So that reset and repopulation of the db happens in the same state. This approach still has a window for data loss if the function process_notification() when processing the monitor reply fails for some reason or ovsdb-server crashes for some reason during process_notification(). Reported-by: Han Zhou <zhouhan@gmail.com> Reported-at: https://mail.openvswitch.org/pipermail/ovs-discuss/2018-August/047161.html Tested-by: aginwala <aginwala@ebay.com> Acked-by: Han Zhou <zhouhan@gmail.com> Signed-off-by: Numan Siddique <nusiddiq@redhat.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
* ovsdb-idlc: Use ALIGNED_CAST to avoid spurious warnings for index rows.Ben Pfaff2018-09-171-1/+1
| | | | | | | | | | | | | | | | | The *_index_init_row() function casts a generic ovsdb_idl_row pointer to a specific type of row pointer. This can cause an increase in required alignment with some kinds of data on some architectures. GCC complains, e.g.: lib/vswitch-idl.c: In function 'ovsrec_flow_sample_collector_set_index_init_row' lib/vswitch-idl.c:9277:12: warning: cast increases required alignment of target However, rows are always allocated with malloc(), which returns member suitable for any type, so this is a false positive warning and this commit suppresses it. Signed-off-by: Ben Pfaff <blp@ovn.org> Acked-by: Han Zhou <zhouhan@gmail.com>
* ovsdb-client: Make "wait" command logging more sensible.Ben Pfaff2018-08-171-4/+0
| | | | | | | | | | | | | | | | The "wait" command in ovsdb-client (which was introduced as part of the clustering support) fairly often logs things that are normal for it but in other circumstances might be cause for concern, for example messages about being unable to connect to a remote. Until now, it has tried to suppress some of those itself by raising log levels. Unfortunately, in some cases this had the opposite effect because it overrode any settings on the command line, such as an attempt in ovsdb-cluster.at to suppress all logging related to the timeval module. This commit drops the special log levels from the "wait" command and puts equivalents into the tests themselves. Signed-off-by: Ben Pfaff <blp@ovn.org> Acked-by: Justin Pettit <jpettit@ovn.org>
* tests: Use environment variable for default timeout.Ilya Maximets2018-08-151-3/+3
| | | | | | | | | | | Introduce new 'OVS_CTL_TIMEOUT' environment variable that, if set, will be used as a default timeout for OVS control utilities. Setting it in 'atlocal.in' will cover all the hangs inside the testsuite, even when utils called in a subshell. Signed-off-by: Ilya Maximets <i.maximets@samsung.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
* utilities: Fix and unify parsing of timeout option.Ilya Maximets2018-08-151-5/+3
| | | | | | | | | | | | | | Parsing of the '--timeout' option implemented differently for every single control utility and, which is more important, highly inaccurate. In most cases unsigned result of 'strtoul' stored in signed variable. Parsing failures are not tracked. 'ovs-appctl' even uses just 'atoi' without any checking of the argument or result. This patch unifies the parsing by using 'str_to_uint'. Signed-off-by: Ilya Maximets <i.maximets@samsung.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
* ovsdb-server: Don't log closing session at program termination.Ben Pfaff2018-08-152-8/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When ovsdb-server closes a remote connection, it logs a message about it that includes the reason. Until now this has included sessions that it closes when it exits. That meant that, when --run was used, there was a race between noticing that the subprocess exited and noticing that the session that that subprocess (presumably) had open had been closed. If it noticed the latter first, nothing was logged (because it didn't log anything if a session was closed in the ordinary way by the client). If it noticed the former first, it logged a message about closing the session itself. This is a benign race that causes no real problems--except that the tests didn't expect to see the log message from the former case and fail with errors like the following: 1826. ovsdb-server.at:92: testing truncating database log with bad transaction ... ./ovsdb-server.at:96: ovsdb-tool create db schema stderr: stdout: ./ovsdb-server.at:104: ovsdb-server --remote=punix:socket db --run="sh txnfile" --- /dev/null 2018-04-24 08:50:58.769000000 +0000 +++ /root/openvswitch-2.9.2/rpm/rpmbuild/BUILD/openvswitch-2.9.2/tests/testsuite.dir/at-groups/1826/stderr 2018-05-29 14:29:56.529257295 +0000 @@ -0,0 +1,2 @@ +2018-05-29T14:29:56Z|00001|ovsdb_jsonrpc_server|INFO|unix#0: disconnecting (removing ordinals database due to server termination) +2018-05-29T14:29:56Z|00002|ovsdb_jsonrpc_server|INFO|unix#0: disconnecting (removing _Server database due to server termination) This fixes the race. This particular log message isn't too useful since it's pretty obvious that ovsdb-server is closing those sessions, since after all it's exiting! Reported-by: Sanket Sudake <sanket@infracloud.io> Reported-at: https://mail.openvswitch.org/pipermail/ovs-discuss/2018-May/046840.html Signed-off-by: Ben Pfaff <blp@ovn.org> Acked-by: Numan Siddique <nusiddiq@redhat.com>
* ovsdb-idlc.in: Support more interfaces for passing pointers of individual ↵Han Zhou2018-08-141-0/+25
| | | | | | | | | | | tables. This is a follow-up patch for commit 0eb1e37c, to add more interfaces that supports passing around pointers of individual tables, which will be used in incremental processing. Signed-off-by: Han Zhou <hzhou8@ebay.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
* raft: Fix use-after-free error in raft_store_snapshot().Ben Pfaff2018-08-071-5/+5
| | | | | | | | | | | | | | | raft_store_snapshot() constructs a new snapshot in a local variable then destroys the current snapshot and replaces it by the new one. Until now, it has not cloned the data in the new snapshot until it did the replacement. This led to the unexpected consequence that, if 'servers' in the old and new snapshots was the same, then it would first be freed and later cloned, which could cause a segfault. Multiple people reported the crash. Gurucharan Shetty provided a reproduction case. Signed-off-by: Ben Pfaff <blp@ovn.org> Acked-by: Mark Michelson <mmichels@redhat.com>
* ovsdb-tool: Only check leader completeness when we can, in "check-cluster".Ben Pfaff2018-08-031-1/+2
| | | | | | | | | | | Generally when we know the leader for a term, in "check-cluster", it's because we read that leader's log file. In that case, we have the leader's log_end because it told us. However, taking a snapshot can discard that data. In that case, log_end is 0 and we should not try to check for leader completeness on that basis. Signed-off-by: Ben Pfaff <blp@ovn.org> Acked-by: Mark Michelson <mmichels@redhat.com>
* ovsdb-tool: Check for duplicate server IDs in "check-cluster".Ben Pfaff2018-08-031-0/+7
| | | | | | | | The user shouldn't provide a given server's log more than once but this check makes sure. Signed-off-by: Ben Pfaff <blp@ovn.org> Acked-by: Mark Michelson <mmichels@redhat.com>