path: root/datapath/datapath.h
Commit message | Author | Age | Files | Lines
* datapath: Add meter infrastructure | Andy Zhou | 2018-02-12 | 1 | -0/+3
  Upstream commit:
    commit 96fbc13d7e770b542d2d1fcf700d0baadc6e8063
    Author: Andy Zhou <azhou@ovn.org>
    Date: Fri Nov 10 12:09:42 2017 -0800

    openvswitch: Add meter infrastructure

    The OVS kernel datapath so far does not support the OpenFlow meter action. This is the first stab at adding kernel datapath meter support. This implementation supports only the drop band type.

    Signed-off-by: Andy Zhou <azhou@ovn.org>
    Signed-off-by: David S. Miller <davem@davemloft.net>

  Added a compat layer fixup for nla_parse.
  Added another compat fixup for ktime_get_ns.

  Cc: Andy Zhou <azhou@ovn.org>
  Signed-off-by: Greg Rose <gvrose8192@gmail.com>
  Acked-by: Pravin B Shelar <pshelar@ovn.org>
* datapath: reliable interface indentification in port dumps | Jiri Benc | 2018-02-12 | 1 | -2/+2
  Upstream commit:
    commit 9354d452034273a50a4fd703bea31e5d6b1fc20b
    Author: Jiri Benc <jbenc@redhat.com>
    Date: Thu Nov 2 17:04:37 2017 -0200

    openvswitch: reliable interface indentification in port dumps

    This patch allows reliable identification of netdevice interfaces connected to openvswitch bridges. In particular, user space queries the netdev interfaces belonging to the ports for statistics, up/down state, etc. Datapath dump needs to provide enough information for the user space to be able to do that.

    Currently, only interface names are returned. This is not sufficient, as openvswitch allows its ports to be in different name spaces and the interface name is valid only in its name space. What is needed and generally used in other netlink APIs, is the pair ifindex+netnsid.

    The solution is addition of the ifindex+netnsid pair (or only ifindex if in the same name space) to vport get/dump operation.

    On request side, ideally the ifindex+netnsid pair could be used to get/set/del the corresponding vport. This is not implemented by this patch and can be added later if needed.

    Signed-off-by: Jiri Benc <jbenc@redhat.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

  Added compat fixup for peernet2id.

  Cc: Jiri Benc <jbenc@redhat.com>
  Signed-off-by: Greg Rose <gvrose8192@gmail.com>
  Acked-by: Pravin B Shelar <pshelar@ovn.org>
* datapath: export get_dp() API | Andy Zhou | 2018-02-12 | 1 | -0/+31
  Upstream commit:
    commit 9602c01e57f7b868d748c2ba2aef0efa64b71ffc
    Author: Andy Zhou <azhou@ovn.org>
    Date: Fri Nov 10 12:09:41 2017 -0800

    openvswitch: export get_dp() API.

    Later patches will invoke get_dp() outside of datapath.c. Export it.

    Signed-off-by: Andy Zhou <azhou@ovn.org>
    Signed-off-by: David S. Miller <davem@davemloft.net>

  Cc: Andy Zhou <azhou@ovn.org>
  Signed-off-by: Greg Rose <gvrose8192@gmail.com>
  Acked-by: Pravin B Shelar <pshelar@ovn.org>
* datapath: fix skb_panic due to the incorrect actions attrlen | Greg Rose | 2017-09-22 | 1 | -0/+2
  Upstream commit:
    commit 494bea39f3201776cdfddc232705f54a0bd210c4
    Author: Liping Zhang <zlpnobody@gmail.com>
    Date: Wed Aug 16 13:30:07 2017 +0800

    openvswitch: fix skb_panic due to the incorrect actions attrlen

    For sw_flow_actions, the actions_len only represents the kernel part's size, and when we dump the actions to userspace, we do the conversions, so its true size may become bigger than the actions_len.

    But unfortunately, for OVS_PACKET_ATTR_ACTIONS, we use the actions_len to alloc the skbuff, so the user_skb's size may become insufficient and an oops will happen like this:

      skbuff: skb_over_panic: text:ffffffff8148fabf len:1749 put:157
      head:ffff881300f39000 data:ffff881300f39000 tail:0x6d5 end:0x6c0
      dev:<NULL>
      ------------[ cut here ]------------
      kernel BUG at net/core/skbuff.c:129!
      [...]
      Call Trace:
       <IRQ>
       [<ffffffff8148be82>] skb_put+0x43/0x44
       [<ffffffff8148fabf>] skb_zerocopy+0x6c/0x1f4
       [<ffffffffa0290d36>] queue_userspace_packet+0x3a3/0x448 [openvswitch]
       [<ffffffffa0292023>] ovs_dp_upcall+0x30/0x5c [openvswitch]
       [<ffffffffa028d435>] output_userspace+0x132/0x158 [openvswitch]
       [<ffffffffa01e6890>] ? ip6_rcv_finish+0x74/0x77 [ipv6]
       [<ffffffffa028e277>] do_execute_actions+0xcc1/0xdc8 [openvswitch]
       [<ffffffffa028e3f2>] ovs_execute_actions+0x74/0x106 [openvswitch]
       [<ffffffffa0292130>] ovs_dp_process_packet+0xe1/0xfd [openvswitch]
       [<ffffffffa0292b77>] ? key_extract+0x63c/0x8d5 [openvswitch]
       [<ffffffffa029848b>] ovs_vport_receive+0xa1/0xc3 [openvswitch]
      [...]

    Also we can find that the actions_len is much smaller than the orig_len:

      crash> struct sw_flow_actions 0xffff8812f539d000
      struct sw_flow_actions {
        rcu = {
          next = 0xffff8812f5398800,
          func = 0xffffe3b00035db32
        },
        orig_len = 1384,
        actions_len = 592,
        actions = 0xffff8812f539d01c
      }

    So as a quick fix, use the orig_len instead of the actions_len to alloc the user_skb.

    Last, this oops happened on our system running a relatively old kernel, but the same risk still exists on the mainline, since we use the wrong actions_len from the beginning.

    Fixes: ccea74457bbd ("openvswitch: include datapath actions with sampled-pac
    Cc: Neil McKee <neil.mckee@inmon.com>
    Signed-off-by: Liping Zhang <zlpnobody@gmail.com>
    Acked-by: Pravin B Shelar <pshelar@ovn.org>
    Signed-off-by: David S. Miller <davem@davemloft.net>

  Fixes: 0e469d3b380c ("datapath: Include datapath actions with sampled-packet upcall to userspace.")
  Signed-off-by: Greg Rose <gvrose8192@gmail.com>
  Signed-off-by: Andy Zhou <azhou@ovn.org>
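  The sizing mistake described in this commit can be illustrated with a small user-space sketch. It is not the kernel code: `struct flow_actions_sketch` and the three helper functions are hypothetical names, and the sketch assumes that the dumped (userspace-format) actions are no larger than `orig_len`, which is what the quick fix relies on.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-in for the kernel's struct sw_flow_actions. */
struct flow_actions_sketch {
    size_t orig_len;     /* length of the actions as received from userspace */
    size_t actions_len;  /* length of the kernel-internal representation */
};

/* Pre-fix sizing: room for the actions attribute is taken from actions_len. */
size_t actions_room_before_fix(const struct flow_actions_sketch *acts)
{
    return acts->actions_len;
}

/* Post-fix sizing: room is taken from orig_len, as the commit describes. */
size_t actions_room_after_fix(const struct flow_actions_sketch *acts)
{
    return acts->orig_len;
}

/* Returns 1 when a dump of dumped_len bytes fits in room bytes, else 0. */
int dump_fits(size_t room, size_t dumped_len)
{
    return dumped_len <= room;
}
```

  With the values from the crash dump above (orig_len = 1384, actions_len = 592), a dump that grows back to 1384 bytes does not fit the pre-fix allocation but does fit the post-fix one.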
* datapath: fix mis-ordered comment lines for ovs_skb_cb | Greg Rose | 2017-07-24 | 1 | -1/+1
  Upstream commit:
    commit 52427fa0631269c62885dc48e0c32e2ad6e17f8c
    Author: Daniel Axtens <dja@axtens.net>
    Date: Mon Jul 3 21:46:43 2017 +1000

    openvswitch: fix mis-ordered comment lines for ovs_skb_cb

    I was trying to wrap my head around meaning of mru, and realised that the second line of the comment defining it had somehow ended up after the line defining cutlen, leading to much confusion.

    Reorder the lines to make sense.

    Signed-off-by: Daniel Axtens <dja@axtens.net>
    Signed-off-by: David S. Miller <davem@davemloft.net>

  Signed-off-by: Greg Rose <gvrose8192@gmail.com>
  Signed-off-by: Joe Stringer <joe@ovn.org>
* datapath: Fix kernel panic for ovs reassemble. | wangzhike | 2017-07-21 | 1 | -0/+6
  OVS and the kernel stack would add frag_queue entries to the same netns_frags list. As a result, OVS and the kernel may access a frag_queue without the correct lock. Also, struct ipq may be different on kernels older than 4.3, which leads to invalid pointer access.

  The fix creates a specific netns_frags for OVS.

  Signed-off-by: wangzhike <wangzhike@jd.com>
  Signed-off-by: Joe Stringer <joe@ovn.org>
* datapath: openvswitch: Optimize sample action for the clone use cases | Andy Zhou | 2017-04-19 | 1 | -2/+0
  Upstream commit:
    openvswitch: Optimize sample action for the clone use cases

    With the introduction of the OpenFlow 'clone' action, the OVS user space can now translate the 'clone' action into the kernel datapath 'sample' action, with 100% probability, to ensure that the clone semantics, which is that the packet seen by the clone action is the same as the packet seen by the action after clone, is faithfully carried out in the datapath.

    While the sample action in the datapath has the matching semantics, its implementation is only optimized for its original use. Specifically, there are two limitations: First, there is a 3-level nesting restriction, enforced at flow downloading time. This limit turns out to be too restrictive for the 'clone' use case. Second, the implementation avoids a recursive call only if the sample action list has a single userspace action.

    The main optimization implemented in this series removes the static nesting limit check and instead implements a run-time recursion limit check, with recursion avoidance similar to that of the 'recirc' action. This optimization solves both issues #1 and #2 above.

    One related optimization attempts to avoid copying the flow key as long as the enclosed actions do not change the flow key. The detection is performed only once, at flow downloading time.

    Another related optimization is to rewrite the action list at flow downloading time in order to save the fast path from repeatedly parsing the sample action list in its original form.

    Signed-off-by: Andy Zhou <azhou@ovn.org>
    Acked-by: Pravin B Shelar <pshelar@ovn.org>
    Signed-off-by: David S. Miller <davem@davemloft.net>

  Upstream: 798c166173ff ("openvswitch: Optimize sample action for the clone use cases")
  Signed-off-by: Andy Zhou <azhou@ovn.org>
  Acked-by: Joe Stringer <joe@ovn.org>
* datapath: netns: make struct pernet_operations::id unsigned int. | Alexey Dobriyan | 2017-03-02 | 1 | -1/+1
  Upstream commit:
    commit c7d03a00b56fc23c3a01a8353789ad257363e281
    Author: Alexey Dobriyan <adobriyan@gmail.com>
    Date: Thu Nov 17 04:58:21 2016 +0300

    netns: make struct pernet_operations::id unsigned int

    Make struct pernet_operations::id unsigned.

    There are 2 reasons to do so:

    1) This field is really an index into a zero-based array and thus is an unsigned entity. Using a negative value is out-of-bound access by definition.

    2) On x86_64 unsigned 32-bit data which are mixed with pointers via array indexing or offsets added or subtracted to pointers are preferred to signed 32-bit data.

       "int" being used as an array index needs to be sign-extended to 64-bit before being used.

         void f(long *p, int i)
         {
                 g(p[i]);
         }

       roughly translates to

         movsx   rsi, esi
         mov     rdi, [rsi+...]
         call    g

       MOVSX is a 3-byte instruction which isn't necessary if the variable is unsigned because x86_64 is zero extending by default.

    Now, there is the net_generic() function which, you guessed it right, uses "int" as an array index:

      static inline void *net_generic(const struct net *net, int id)
      {
              ...
              ptr = ng->ptr[id - 1];
              ...
      }

    And this function is used a lot, so those sign extensions add up.

    Patch snipes ~1730 bytes on allyesconfig kernel (without all junk messing with code generation):

      add/remove: 0/0 grow/shrink: 70/598 up/down: 396/-2126 (-1730)

    Unfortunately some functions actually grow bigger. This is a seemingly random artefact of code generation with the register allocator being used differently. gcc decides that some variable needs to live in the new r8+ registers and every access now requires a REX prefix. Or it is shifted into r12, so the [r12+0] addressing mode has to be used, which is longer than [r8].

    However, the overall balance is in the negative direction:

      add/remove: 0/0 grow/shrink: 70/598 up/down: 396/-2126 (-1730)
      function                                     old     new   delta
      nfsd4_lock                                  3886    3959     +73
      tipc_link_build_proto_msg                   1096    1140     +44
      mac80211_hwsim_new_radio                    2776    2808     +32
      tipc_mon_rcv                                1032    1058     +26
      svcauth_gss_legacy_init                     1413    1429     +16
      tipc_bcbase_select_primary                   379     392     +13
      nfsd4_exchange_id                           1247    1260     +13
      nfsd4_setclientid_confirm                    782     793     +11
      ...
      put_client_renew_locked                      494     480     -14
      ip_set_sockfn_get                            730     716     -14
      geneve_sock_add                              829     813     -16
      nfsd4_sequence_done                          721     703     -18
      nlmclnt_lookup_host                          708     686     -22
      nfsd4_lockt                                 1085    1063     -22
      nfs_get_client                              1077    1050     -27
      tcf_bpf_init                                1106    1076     -30
      nfsd4_encode_fattr                          5997    5930     -67
      Total: Before=154856051, After=154854321, chg -0.00%

    Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

  [Committer notes]
  It looks like changing the type of this doesn't affect the build on older kernels, so we can just make the change. I didn't go through all of the compat code to update the net_id variables there as none of that code should be enabled on kernels with this patch.

  Upstream: c7d03a00b56f ("netns: make struct pernet_operations::id unsigned int")
  Signed-off-by: Joe Stringer <joe@ovn.org>
  Acked-by: Jarno Rajahalme <jarno@ovn.org>
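  The signed-versus-unsigned index argument in this commit can be tried out with a minimal user-space sketch. The function names below are hypothetical, and `generic_lookup()` is only a stand-in shaped like the `net_generic()` excerpt quoted above; whether a given compiler actually emits a sign-extension instruction for the signed variant depends on the target and optimization level.

```c
/* Signed index: on x86_64 the compiler typically has to sign-extend i
 * to 64 bits before it can be used in the address computation. */
long load_signed(const long *p, int i)
{
    return p[i];
}

/* Unsigned index: a 32-bit value is already zero-extended on x86_64,
 * so no separate extension step should be needed. */
long load_unsigned(const long *p, unsigned int i)
{
    return p[i];
}

/* Hypothetical stand-in for a net_generic()-style lookup after the change:
 * ids start at 1, so the slot for id lives at index id - 1. */
void *generic_lookup(void *const *ptrs, unsigned int id)
{
    return ptrs[id - 1];
}
```

  Compiling both load functions with optimization and comparing the generated assembly is one way to check the claim on a particular toolchain; both return the same values for non-negative indices.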
* datapath: backport: ovs: propagate per dp max headroom to all vports | Pravin B Shelar | 2016-07-17 | 1 | -0/+4
  Upstream commit:
    commit 3a927bc7cf9d0fbe8f4a8189dd5f8440228f64e7
    Author: Paolo Abeni <pabeni@redhat.com>

    ovs: propagate per dp max headroom to all vports

    This patch implements bookkeeping support to compute the maximum headroom for all the devices in each datapath. When said value changes, the underlying devs are notified via the ndo_set_rx_headroom method.

    This also increases the internal vports xmit performance.

    Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

  Signed-off-by: Pravin B Shelar <pshelar@ovn.org>
  Acked-by: Jesse Gross <jesse@kernel.org>
* datapath: compat: Refactor egress tunnel info | Pravin B Shelar | 2016-07-08 | 1 | -1/+0
  Upstream, tunnel egress info is retrieved using ndo_fill_metadata_dst. Since older kernels do not have it, we need to keep the vport operation to do the same on those kernels. The following patch tries to merge these two operations into one to avoid code duplication.

  This commit backports fc4099f1 ("openvswitch: Fix egress tunnel info.")

  Signed-off-by: Pravin B Shelar <pshelar@ovn.org>
  Acked-by: Jesse Gross <jesse@kernel.org>
* datapath:backport: openvswitch: Add packet truncation support. | William Tu | 2016-06-24 | 1 | -1/+4
  Upstream commit:
    commit f2a4d086ed4c588d32fe9b7aa67fead7280e7bf1
    Author: William Tu <u9012063@gmail.com>
    Date: Fri Jun 10 11:49:33 2016 -0700

    openvswitch: Add packet truncation support.

    The patch adds a new OVS action, OVS_ACTION_ATTR_TRUNC, in order to truncate packets. A 'max_len' is added for setting up the maximum packet size, and a 'cutlen' field is to record the number of bytes to trim the packet when the packet is outputting to a port, or when the packet is sent to userspace.

    Signed-off-by: William Tu <u9012063@gmail.com>
    Cc: Pravin Shelar <pshelar@nicira.com>
    Acked-by: Pravin B Shelar <pshelar@ovn.org>
    Signed-off-by: David S. Miller <davem@davemloft.net>

  Signed-off-by: William Tu <u9012063@gmail.com>
  Acked-by: Pravin B Shelar <pshelar@ovn.org>
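  The relationship between 'max_len' and 'cutlen' in this commit can be sketched in user space as follows. `struct pkt_sketch` and both functions are hypothetical illustrations, not the datapath code: the truncate step only records how many bytes to trim, and the trim is applied when the packet is output.

```c
/* Hypothetical stand-in for the per-packet state the commit describes. */
struct pkt_sketch {
    unsigned int len;     /* current packet length in bytes */
    unsigned int cutlen;  /* bytes to trim when the packet is output */
};

/* Truncate action: record the excess over max_len; shorter packets are untouched. */
void trunc_action_sketch(struct pkt_sketch *pkt, unsigned int max_len)
{
    if (pkt->len > max_len)
        pkt->cutlen = pkt->len - max_len;
}

/* Length actually delivered to a port or to userspace after trimming. */
unsigned int output_len_sketch(const struct pkt_sketch *pkt)
{
    return pkt->len - pkt->cutlen;
}
```

  Recording the trim amount rather than shortening the packet immediately keeps the truncate step cheap and leaves the packet intact until the point of output.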
* datapath: Drop support for kernel older than 3.10 | Pravin B Shelar | 2016-03-14 | 1 | -1/+0
  Currently the OVS out-of-tree datapath supports a large number of kernel versions, from 2.6.32 to 4.3, plus various distribution-specific kernels. But at this point major features are only available on more recent kernels. For example, stateful services are only available starting with kernel 3.10, and STT is available starting with 3.5.

  Since these features are becoming essential to many OVS deployments, and the effort of maintaining the backports is high, we have decided to drop support for older kernels.

  The following patch drops support for kernels older than 3.10.

  Signed-off-by: Pravin B Shelar <pshelar@ovn.org>
  Acked-by: Jesse Gross <jesse@kernel.org>
* datapath: Allow matching on conntrack label | Joe Stringer | 2015-12-03 | 1 | -0/+3
  Allow matching and setting the ct_label field. As with ct_mark, this is populated by executing the CT action. The label field may be modified by specifying a label and mask nested under the CT action. It is stored as metadata attached to the connection. Label modification occurs after lookup, and will only persist when the conntrack entry is committed by providing the COMMIT flag to the CT action. Labels are currently fixed to 128 bits in size.

  Upstream: c2ac667 "openvswitch: Allow matching on conntrack label"
  Signed-off-by: Joe Stringer <joestringer@nicira.com>
  Acked-by: Pravin B Shelar <pshelar@nicira.com>
* datapath: Add conntrack action | Joe Stringer | 2015-12-03 | 1 | -0/+5
  Expose the kernel connection tracker via OVS. Userspace components can make use of the CT action to populate the connection state (ct_state) field for a flow. This state can be subsequently matched.

  Exposed connection states are OVS_CS_F_*:
  - NEW (0x01) - Beginning of a new connection.
  - ESTABLISHED (0x02) - Part of an existing connection.
  - RELATED (0x04) - Related to an established connection.
  - INVALID (0x20) - Could not track the connection for this packet.
  - REPLY_DIR (0x40) - This packet is in the reply direction for the flow.
  - TRACKED (0x80) - This packet has been sent through conntrack.

  When the CT action is executed by itself, it will send the packet through the connection tracker and populate the ct_state field with one or more of the connection state flags above. The CT action will always set the TRACKED bit.

  When the COMMIT flag is passed to the conntrack action, this specifies that information about the connection should be stored. This allows subsequent packets for the same (or related) connections to be correlated with this connection. Sending subsequent packets for the connection through conntrack allows the connection tracker to consider the packets as ESTABLISHED, RELATED, and/or REPLY_DIR.

  The CT action may optionally take a zone to track the flow within. This allows connections with the same 5-tuple to be kept logically separate from connections in other zones. If the zone is specified, then the "ct_zone" match field will be subsequently populated with the zone id.

  IP fragments are handled by transparently assembling them as part of the CT action. The maximum received unit (MRU) size is tracked so that refragmentation can occur during output.

  IP frag handling contributed by Andy Zhou.

  Based on original design by Justin Pettit.

  Upstream: 7f8a436 "openvswitch: Add conntrack action"
  Signed-off-by: Joe Stringer <joestringer@nicira.com>
  Signed-off-by: Justin Pettit <jpettit@nicira.com>
  Signed-off-by: Andy Zhou <azhou@nicira.com>
  Acked-by: Pravin B Shelar <pshelar@nicira.com>
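  The connection-state bits listed in this commit can be combined and tested as ordinary bit flags. The sketch below reuses the flag values from the message; the enum is redefined locally so the example is self-contained, and the two helper functions are hypothetical, not part of the OVS API.

```c
#include <stdint.h>

/* Flag values as listed in the commit message, redefined locally. */
enum {
    OVS_CS_F_NEW         = 0x01,
    OVS_CS_F_ESTABLISHED = 0x02,
    OVS_CS_F_RELATED     = 0x04,
    OVS_CS_F_INVALID     = 0x20,
    OVS_CS_F_REPLY_DIR   = 0x40,
    OVS_CS_F_TRACKED     = 0x80
};

/* Hypothetical helper modelling "the CT action will always set the TRACKED bit". */
uint8_t ct_state_after_ct_action(uint8_t conn_flags)
{
    return (uint8_t)(conn_flags | OVS_CS_F_TRACKED);
}

/* Hypothetical match helper: tracked packet travelling in the reply direction. */
int ct_is_tracked_reply(uint8_t ct_state)
{
    const uint8_t want = OVS_CS_F_TRACKED | OVS_CS_F_REPLY_DIR;

    return (ct_state & want) == want;
}
```

  For example, a packet that starts a new connection ends up with ct_state 0x81 (NEW plus TRACKED) after passing through the CT action in this model.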
* datapath: Move MASKED* macros to datapath.h | Joe Stringer | 2015-12-03 | 1 | -0/+4
  This will allow the ovs-conntrack code to reuse these macros.

  Upstream: be26b9a "openvswitch: Move MASKED* macros to datapath.h"
  Signed-off-by: Joe Stringer <joestringer@nicira.com>
  Acked-by: Pravin B Shelar <pshelar@nicira.com>
* datapath: Add support for lwtunnel | Pravin B Shelar | 2015-12-03 | 1 | -8/+4
  The following patch adds support for lwtunnel to the OVS datapath. With this change the OVS datapath detects lwtunnel support and makes use of the new APIs if available. On older kernels where the support is not there, the backported tunnel modules are used. These backported tunnel devices act as lwtunnel devices. I tried to keep the backported modules the same as upstream for easier bug-fix backports. Since STT and LISP are not upstream, OVS always needs to use the respective modules from the tunnel compat layer. To make it work on kernel 4.3 I have converted the STT and LISP modules to the lwtunnel API model.

  lwtunnel makes use of skb-dst to pass tunnel information to the tunnel module. On older kernels this is not possible, so in the case of an old kernel the metadata ref is stored in OVS_CB and a direct call to the tunnel transmit function is made by the respective tunnel vport module. Similarly, on the receive side, tunnel recv directly calls netdev-vport-receive to pass the skb to OVS.

  Major backported components include: Geneve, GRE, VXLAN, ip_tunnel, udp-tunnels GRO.

  Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
  Acked-by: Joe Stringer <joe@ovn.org>
  Acked-by: Jesse Gross <jesse@kernel.org>
* datapath: Add support for 4.1 kernel. | Joe Stringer | 2015-09-18 | 1 | -5/+4
  Signed-off-by: Joe Stringer <joestringer@nicira.com>
  Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
  Acked-by: Jesse Gross <jesse@nicira.com>
* datapath: Revert "datapath: Constify netlink structs." | Pravin B Shelar | 2015-08-05 | 1 | -1/+1
  This reverts commit 2023bdcfc44c149a8e3b38dcde8f04f2ec3f8501.

  This commit is causing segfaults when the genl compat code is in use. The compat code updates genl_multicast_group and genl_family type objects, therefore these cannot be const.

  Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
  Acked-by: Joe Stringer <joestringer@nicira.com>
* datapath: Constify netlink structs. | Joe Stringer | 2015-07-30 | 1 | -1/+1
  Signed-off-by: Joe Stringer <joestringer@nicira.com>
  Acked-by: Jesse Gross <jesse@nicira.com>
* datapath: Include datapath actions with sampled-packet upcall to userspace. | Neil McKee | 2015-07-17 | 1 | -0/+2
  If the new optional attribute OVS_USERSPACE_ATTR_ACTIONS is added to an OVS_ACTION_ATTR_USERSPACE action, then include the datapath actions in the upcall.

  This directly associates the sampled packet with the path it takes through the virtual switch. Path information currently includes mangling, encapsulation and decapsulation actions for tunneling protocols GRE, VXLAN, Geneve, MPLS and QinQ, but this extension requires no further changes to accommodate datapath actions that may be added in the future.

  Adding path information enhances visibility into complex virtual networks.

  Signed-off-by: Neil McKee <neil.mckee@inmon.com>
  Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
* datapath: fix coding style. | Pravin B Shelar | 2014-11-09 | 1 | -10/+10
  The kernel datapath code has diverged from the upstream code. This makes porting patches between these two code bases harder than it needs to be. The following patch addresses this by fixing coding style issues on this branch.

  Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
* datapath: Add support for OVS_FLOW_ATTR_PROBE. | Jarno Rajahalme | 2014-10-03 | 1 | -3/+3
  This new flag is useful for suppressing error logging while probing for datapath features using flow commands. For backwards compatibility reasons the commands are executed normally, but error logging is suppressed.

  Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
  Acked-by: Pravin B Shelar <pshelar@nicira.com>
* datapath: Constify various function arguments | Thomas Graf | 2014-09-23 | 1 | -3/+4
  Help produce better optimized code.

  Signed-off-by: Thomas Graf <tgraf@noironetworks.com>
  Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
* datapath: Remove pkt_key from OVS_CB. | Pravin B Shelar | 2014-09-20 | 1 | -5/+3
  OVS keeps a pointer to the packet key in skb->cb, but the packet key is stored on the stack. This could make the code a bit tricky, so it is better to get rid of the pointer.

  Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
  Acked-by: Andy Zhou <azhou@nicira.com>
* openvswitch: rename ->sync to ->syncp | WANG Cong | 2014-09-12 | 1 | -1/+1
  Openvswitch defines u64_stats_sync as ->sync rather than ->syncp, so fails to compile with netdev_alloc_pcpu_stats(). So just rename it to ->syncp.

  Reported-by: kbuild test robot <fengguang.wu@intel.com>
  Fixes: 1c213bd24ad04f4430031 (net: introduce netdev_alloc_pcpu_stats() for drivers)
  Cc: David S. Miller <davem@davemloft.net>
  Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
  Reviewed-by: Flavio Leitner <fbl@redhat.com>
  Signed-off-by: David S. Miller <davem@davemloft.net>
  Acked-by: Pravin B Shelar <pshelar@nicira.com>
* datapath: Implement recirc action without recursion | Andy Zhou | 2014-09-05 | 1 | -0/+3
  Since the kernel stack is limited in size, it is not wise to use recursive functions with large stack frames.

  This patch provides an alternative implementation of the recirc action without using recursion.

  A per-CPU, fixed-size 'deferred action FIFO' is used to store either recirc or sample actions encountered during execution of an action list. Not executing a recirc or sample action in place, but rather executing it later as a 'deferred action', avoids recursion.

  Deferred actions are only executed after all other actions have been executed, including the ones triggered by loopback from the kernel network stack.

  The size of the private FIFO, currently set to 20, limits the total number of 'deferred actions' any one packet can accumulate.

  Signed-off-by: Andy Zhou <azhou@nicira.com>
  Acked-by: Pravin B Shelar <pshelar@nicira.com>
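  The deferred-action idea in this commit can be sketched in user space as a fixed-size queue that is drained in a loop instead of recursing. Everything here is an illustrative assumption: the struct and function names are hypothetical, the FIFO is a local variable rather than per-CPU, each "action" is reduced to an integer id, and the size of 20 is simply the value quoted in the message.

```c
#define DEFERRED_FIFO_SIZE 20  /* value quoted in the commit message */

/* Hypothetical fixed-size FIFO; the kernel keeps one per CPU. */
struct action_fifo_sketch {
    int head;                            /* next free slot */
    int tail;                            /* next slot to execute */
    int recirc_id[DEFERRED_FIFO_SIZE];
};

static void fifo_init(struct action_fifo_sketch *f)
{
    f->head = 0;
    f->tail = 0;
}

/* Queue one deferred action; returns 0 on success, -1 when the FIFO is used up. */
static int fifo_put(struct action_fifo_sketch *f, int id)
{
    if (f->head >= DEFERRED_FIFO_SIZE)
        return -1;
    f->recirc_id[f->head++] = id;
    return 0;
}

/* Dequeue the oldest deferred action; returns 1 and stores its id, or 0 if none. */
static int fifo_get(struct action_fifo_sketch *f, int *id)
{
    if (f->tail == f->head)
        return 0;
    *id = f->recirc_id[f->tail++];
    return 1;
}

/* Runs a chain of chain_len recirculations with a loop instead of recursion.
 * Each executed recirculation defers the next one. Returns how many ran. */
int run_recirc_chain(int chain_len)
{
    struct action_fifo_sketch f;
    int executed = 0;
    int id;

    fifo_init(&f);
    if (chain_len > 0)
        fifo_put(&f, 1);
    while (fifo_get(&f, &id)) {
        executed++;
        if (id < chain_len && fifo_put(&f, id + 1) < 0)
            break;  /* too many deferred actions: stop processing this packet */
    }
    return executed;
}
```

  Because slots are never reused while a packet is being processed, the FIFO size bounds the total number of deferred actions per packet, which matches the limit described above; stack depth stays constant no matter how long the chain is.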
* datapath: Remove recirc stack depth limit check | Andy Zhou | 2014-09-05 | 1 | -2/+2
  Future patches will change the recirc action implementation so that it does not use recursion. The stack depth detection is no longer necessary.

  Signed-off-by: Andy Zhou <azhou@nicira.com>
  Acked-by: Pravin B Shelar <pshelar@nicira.com>
* datapath: Remove flow member from struct ovs_skb_cb | Lorand Jakab | 2014-08-25 | 1 | -4/+3
  struct ovs_skb_cb is full on kernels < 3.11 due to compatibility code. This patch removes the 'flow' member in order to make room for data needed by layer 3 flow/port support that will be added in an upcoming patch. The 'flow' member was chosen for removal because it's only used in ovs_execute_actions().

  Signed-off-by: Lorand Jakab <lojakab@cisco.com>
  Acked-by: Pravin B Shelar <pshelar@nicira.com>
* Extend OVS IPFIX exporter to export tunnel headers | Wenyu Zhang | 2014-08-18 | 1 | -0/+2
  Extend the IPFIX exporter to export tunnel headers for both the input and the output of the port.

  Add three other_config options in the IPFIX table: enable-input-sampling, enable-output-sampling and enable-tunnel-sampling, to control whether tunnel info is sampled and in which direction (input or output).

  Insert a sampling action before the output action; the output tunnel port is sent to the datapath in the sampling action.

  Make the datapath collect output tunnel info and send it back to userspace in the upcall message with a new additional optional attribute.

  Add a tunnel ports map to make the tunnel port lookup faster in sampling upcalls in the IPFIX exporter. Make the IPFIX exporter generate IPFIX template sets with enterprise elements for the tunnel info, save the tunnel info in IPFIX cache entries, and send IPFIX DATA with tunnel info.

  Add the flowDirection element in IPFIX templates.

  Signed-off-by: Wenyu Zhang <wenyuz@vmware.com>
  Acked-by: Romain Lenglet <rlenglet@vmware.com>
  Acked-by: Ben Pfaff <blp@nicira.com>
  Acked-by: Pravin B Shelar <pshelar@nicira.com>
* datapath: Use tun_info only for egress tunnel path. | Pravin B Shelar | 2014-08-06 | 1 | -6/+4
  Currently tun_info is used for passing tunnel information on both the ingress and egress paths, which causes confusion. The following patch removes its use on the ingress path, making it an egress-only parameter.

  Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
  Acked-by: Andy Zhou <azhou@nicira.com>
* datapath: Correct comment about 'tun_info' member in ovs_skb_cb. | Justin Pettit | 2014-08-05 | 1 | -1/+2
  Signed-off-by: Justin Pettit <jpettit@nicira.com>
  Acked-by: Pravin B Shelar <pshelar@nicira.com>
* datapath: Remove unlikely() for WARN_ON() conditions | Thomas Graf | 2014-07-30 | 1 | -1/+1
  No need for the unlikely(), WARN_ON() and BUG_ON() internally use unlikely() on the condition.

  Signed-off-by: Thomas Graf <tgraf@suug.ch>
  Acked-by: Pravin B Shelar <pshelar@nicira.com>
* datapath: Remove redundant key ref from upcall_info. | Pravin B Shelar | 2014-07-30 | 1 | -3/+1
  struct dp_upcall_info has a pointer to pkt_key, which is already available in OVS_CB. This also simplifies upcall handling for GSO packets.

  Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
  Acked-by: Andy Zhou <azhou@nicira.com>
* datapath: Initialize OVS_CB in ovs_vport_receive() | Pravin B Shelar | 2014-06-27 | 1 | -1/+1
  On packet receive, OVS_CB is initialized in multiple functions. The following patch moves all of this initialization to ovs_vport_receive(). This patch also saves a check in execute actions.

  Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
  Acked-by: Jesse Gross <jesse@nicira.com>
* datapath: Wrap struct ovs_key_ipv4_tunnel in a new structure. | Jesse Gross | 2014-06-19 | 1 | -1/+1
  Currently, the flow information that is matched for tunnels and the tunnel data passed around with packets is the same. However, as additional information is added this is not necessarily desirable, as in the case of pointers.

  This adds a new structure for tunnel metadata which currently contains only the existing struct. This change is purely internal to the kernel since the current OVS_KEY_ATTR_IPV4_TUNNEL is simply a compressed version of OVS_KEY_ATTR_TUNNEL that is translated at flow setup.

  Signed-off-by: Jesse Gross <jesse@nicira.com>
  Acked-by: Thomas Graf <tgraf@suug.ch>
  Acked-by: Pravin B Shelar <pshelar@nicira.com>
* datapath: handle recirculation loop detection | Andy Zhou | 2014-05-01 | 1 | -2/+2
  The current datapath limits the number of times the same packet can loop through action execution to avoid blowing out the kernel stack. Recirculation also adds to the action execution count, but does not use the same amount of stack compared to other services, such as IPsec.

  This patch introduces the concept of stack cost. Recirculation has a stack cost of 1 while other services have a stack cost of 4. Datapath packet processing can accommodate packets that need both services and recirculation as long as the total stack cost does not exceed the max stack cost. Packets that exceed the limit will be treated as looped packets and dropped. The max stack cost is set to allow up to 4 regular services, plus up to 3 recirculations.

  The behavior of packets that do not recirculate does not change.

  Signed-off-by: Andy Zhou <azhou@nicira.com>
  Acked-by: Jesse Gross <jesse@nicira.com>
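  The stack-cost accounting described in this commit can be worked through with a short sketch. The macro and function names are hypothetical; only the costs (1 per recirculation, 4 per other service) and the budget of 4 services plus 3 recirculations come from the message. How the kernel tracks the running total per packet is not shown here.

```c
#define RECIRC_STACK_COST  1   /* cost of one recirculation, per the commit */
#define SERVICE_STACK_COST 4   /* cost of one other service, per the commit */

/* Budget sized for up to 4 regular services plus up to 3 recirculations. */
#define MAX_STACK_COST (4 * SERVICE_STACK_COST + 3 * RECIRC_STACK_COST)

/* Total cost accumulated by a packet that used the given numbers of each. */
int total_stack_cost(int services, int recircs)
{
    return services * SERVICE_STACK_COST + recircs * RECIRC_STACK_COST;
}

/* Returns 1 when the packet exceeds the budget and would be dropped as looped. */
int packet_is_looping(int services, int recircs)
{
    return total_stack_cost(services, recircs) > MAX_STACK_COST;
}
```

  Under this model the budget works out to 19, so a packet using 4 services and 3 recirculations is accepted, while a fifth service or a fourth recirculation on top of that pushes it over the limit.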
* datapath: add recirc action | Andy Zhou | 2014-04-21 | 1 | -2/+6
  Recirculation implementation for Linux kernel data path.

  Signed-off-by: Andy Zhou <azhou@nicira.com>
  Acked-by: Jesse Gross <jesse@nicira.com>
* datapath: Add support for kernels 3.13 | Pravin Shelar | 2014-03-31 | 1 | -0/+1
  Add support for building the in-tree kernel datapath for Linux kernels up to 3.13. There were some changes in the netlink area which required adding new compatibility code for this layer. Also, some new per-cpu stats initialization code was added.

  Based on patch from Kyle Mestery.

  Signed-off-by: Kyle Mestery <mestery@noironetworks.com>
  Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
  Acked-by: Kyle Mestery <mestery@noironetworks.com>
* datapath: Use net_ratelimit in OVS_NLERR | Joe Perches | 2014-02-03 | 1 | -3/+5
  Each use of pr_<level>_once has a per-site flag.

  Some of the OVS_NLERR messages look as if seeing them multiple times could be useful, so use net_ratelimit() instead of pr_info_once.

  Signed-off-by: Joe Perches <joe@perches.com>
  Signed-off-by: Jesse Gross <jesse@nicira.com>
* datapath: Allow user space to announce ability to accept unaligned Netlink messages | Thomas Graf | 2013-12-16 | 1 | -0/+2
  Signed-off-by: Thomas Graf <tgraf@suug.ch>
  Reviewed-by: Daniel Borkmann <dborkman@redhat.com>
  Signed-off-by: Jesse Gross <jesse@nicira.com>
* datapath: Silence RCU lockdep checks from flow lookup. | Jesse Gross | 2013-12-03 | 1 | -0/+2
  Flow lookup can happen either in packet processing context or userspace context but it was annotated as requiring RCU read lock to be held. This also allows OVS mutex to be held without causing warnings.

  Reported-by: Justin Pettit <jpettit@nicira.com>
  Signed-off-by: Jesse Gross <jesse@nicira.com>
  Reviewed-by: Thomas Graf <tgraf@redhat.com>
* datapath: collect mega flow mask stats | Andy Zhou | 2013-10-22 | 1 | -0/+4
  Collect mega flow mask stats. The ovs-dpctl show command can be used to display them.

  Signed-off-by: Andy Zhou <azhou@nicira.com>
  Signed-off-by: Jesse Gross <jesse@nicira.com>
* datapath: Move mega-flow list out of rehashing struct. | Pravin B Shelar | 2013-10-01 | 1 | -4/+2
  ovs-flow rehash does not touch the mega-flow list. The following patch moves it to struct datapath. This avoids one extra indirection when accessing the mega-flow list head on every packet receive.

  Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
  Acked-by: Jesse Gross <jesse@nicira.com>
* datapath: Restructure datapath.c and flow.c | Pravin B Shelar | 2013-10-01 | 1 | -0/+1
  Over time datapath.c and flow.c have become pretty large files. The following patch restructures their functionality into three different components:

  flow.c: contains flow extract.
  flow_netlink.c: netlink flow api.
  flow_table.c: flow table api.

  Diffstat is showing wrong count. This patch mostly restructures code without changing logic.

  Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
  Acked-by: Jesse Gross <jesse@nicira.com>
* datapath: Move flow table rehashing to flow install. | Pravin B Shelar | 2013-09-07 | 1 | -0/+2
  Rehashing in the ovs-workqueue can cause ovs-mutex lock contention in the case of heavy flow setups, where both need the ovs-mutex. By moving rehashing to flow setup we can eliminate the contention. This also simplifies OVS locking and reduces dependence on the workqueue.

  Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
  Acked-by: Jesse Gross <jesse@nicira.com>
* datapath: Remove vlan compat support | Pravin B Shelar | 2013-09-06 | 1 | -5/+0
  Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
  Acked-by: Jesse Gross <jesse@nicira.com>
* datapath: Remove checksum compat support | Pravin B Shelar | 2013-09-06 | 1 | -10/+0
  Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
  Acked-by: Jesse Gross <jesse@nicira.com>
* datapath: Move generic tunnel functions to lisp module. | Pravin B Shelar | 2013-08-13 | 1 | -1/+0
  The generic tunnel rcv and send functions are only used by the lisp tunneling module, so it makes sense to move them to the lisp module.

  CC: Lori Jakab <lojakab@cisco.com>
  Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
  Acked-by: Lorand Jakab <lojakab@cisco.com>
  Acked-by: Jesse Gross <jesse@nicira.com>
* datapath: Fix Netlink error message header. | Jesse Gross | 2013-07-09 | 1 | -1/+1
  CC: Andy Zhou <azhou@nicira.com>
  Signed-off-by: Jesse Gross <jesse@nicira.com>
* datapath: add netlink error message to help kernel userspace integration. | Andy Zhou | 2013-07-03 | 1 | -0/+4
  When the kernel rejects a netlink message, it usually returns the EINVAL error code to userspace. The actual reason for rejecting the netlink message is not available, making it harder to debug netlink issues. This patch adds kernel log messages that give the reason whenever a netlink message is rejected. Those messages are logged at the info level.

  Those messages are logged only once per message, to keep the kernel log noise level down. Reload the kernel module to re-enable already logged messages.

  The messages are meant to help developers debug userspace and kernel integration issues. The actual messages may change or be removed over time. These messages are not expected to show up in a production environment.

  Signed-off-by: Andy Zhou <azhou@nicira.com>
  Signed-off-by: Jesse Gross <jesse@nicira.com>
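  The "log each message only once" behaviour described in this commit can be sketched in user space with a per-call-site flag. This is only an illustration of the pattern: `NLERR_ONCE_SKETCH`, `nlerr_lines_logged` and `reject_bad_attr()` are hypothetical names, and `fprintf` stands in for the kernel's info-level logging.

```c
#include <errno.h>
#include <stdio.h>

/* Counts how many lines were actually emitted, for demonstration only. */
static int nlerr_lines_logged;

/* Hypothetical log-once macro: each expansion site gets its own static flag,
 * so a given message is printed the first time only. */
#define NLERR_ONCE_SKETCH(msg)                              \
    do {                                                    \
        static int logged;                                  \
        if (!logged) {                                      \
            logged = 1;                                     \
            nlerr_lines_logged++;                           \
            fprintf(stderr, "netlink: %s\n", (msg));        \
        }                                                   \
    } while (0)

/* Rejects a message: the caller still only sees -EINVAL,
 * but the reason is logged once for the developer. */
int reject_bad_attr(void)
{
    NLERR_ONCE_SKETCH("unknown attribute");
    return -EINVAL;
}
```

  In this sketch the flag lives for the lifetime of the process, which mirrors the note above that the kernel module has to be reloaded before an already logged message can appear again.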