path: root/datapath/actions.c
Commit message | Author | Age | Files | Lines
* datapath: avoid deferred execution of recirc actions | Lance Richardson | 2016-09-20 | 1 | -3/+28
  Port upstream fix to datapath module. The only notable difference between this patch and the upstream version is that the value of ovs_recursion_limit (5 for upstream kernel, 4 for out-of-tree module) is maintained in this patch.

  Upstream commit:
    commit f43e6dfb056b58628e43179d8f6b59eae417754d
    Author: Lance Richardson <lrichard@redhat.com>
    Date: Mon Sep 12 17:07:23 2016 -0400

    openvswitch: avoid deferred execution of recirc actions

    The ovs kernel data path currently defers the execution of all recirc actions until stack utilization is at a minimum. This is too limiting for some packet forwarding scenarios due to the small size of the deferred action FIFO (10 entries). For example, broadcast traffic sent out more than 10 ports with recirculation results in packet drops when the deferred action FIFO becomes full, as reported here:

      http://openvswitch.org/pipermail/dev/2016-March/067672.html

    Since the current recursion depth is available (it is already tracked by the exec_actions_level pcpu variable), we can use it to determine whether to execute recirculation actions immediately (safe when recursion depth is low) or defer execution until more stack space is available.

    With this change, the deferred action fifo size becomes a non-issue for currently failing scenarios because it is no longer used when there are three or fewer recursions through ovs_execute_actions().

    Suggested-by: Pravin Shelar <pshelar@ovn.org>
    Signed-off-by: Lance Richardson <lrichard@redhat.com>
    Acked-by: Pravin B Shelar <pshelar@ovn.org>
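The decision described in this commit message can be sketched in user-space C. This is only an illustration under stated assumptions, not the datapath code: `recirc_decide`, `struct recirc_state`, and `DEFER_THRESHOLD` are hypothetical names, the threshold of 3 is read off the "three or fewer recursions" wording, and the FIFO size of 10 comes from the message.

```c
#include <assert.h>

/* Hypothetical constants; values are taken from the commit message text. */
#define DEFER_THRESHOLD     3   /* "three or fewer recursions" run immediately */
#define DEFERRED_FIFO_SIZE 10   /* deferred action FIFO entries */

/* Hypothetical per-CPU state: nesting depth and queued deferred actions. */
struct recirc_state {
    int exec_level;      /* current depth of nested action execution */
    int deferred_count;  /* entries currently queued in the FIFO */
};

enum recirc_result { RECIRC_IMMEDIATE, RECIRC_DEFERRED, RECIRC_DROPPED };

/* Decide how a recirc action is handled at the current recursion depth. */
static enum recirc_result recirc_decide(struct recirc_state *st)
{
    /* Shallow recursion: enough stack is left, so recirculate right away. */
    if (st->exec_level <= DEFER_THRESHOLD)
        return RECIRC_IMMEDIATE;

    /* Deep recursion: queue the action, or drop if the FIFO is full. */
    if (st->deferred_count >= DEFERRED_FIFO_SIZE)
        return RECIRC_DROPPED;

    st->deferred_count++;
    return RECIRC_DEFERRED;
}
```

With this shape, a broadcast to more than 10 ports at low recursion depth never touches the FIFO, which is the failure case the message describes.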
* datapath: Add support for kernel 4.4 | Pravin B Shelar | 2016-07-18 | 1 | -3/+4
  Most of the changes are related to ip-fragment API and genetlink API changes.

  Signed-off-by: Pravin B Shelar <pshelar@ovn.org>
  Acked-by: Jesse Gross <jesse@kernel.org>
* datapath: Sync OVS recursive loop counter with upstream. | Pravin B Shelar | 2016-07-18 | 1 | -19/+12
  Signed-off-by: Pravin B Shelar <pshelar@ovn.org>
  Acked-by: Jesse Gross <jesse@kernel.org>
* datapath: openvswitch: update checksum in {push,pop}_mpls | Pravin B Shelar | 2016-07-17 | 1 | -4/+15
  Upstream commit:
    commit bc7cc5999fd392cc799630d7e375b2f4e29cc398
    Author: Simon Horman <simon.horman@netronome.com>

    openvswitch: update checksum in {push,pop}_mpls

    In the case of CHECKSUM_COMPLETE the skb checksum should be updated in {push,pop}_mpls() as they change the type in the ethernet header.

    As suggested by Pravin Shelar.

    Cc: Pravin Shelar <pshelar@ovn.org>
    Fixes: 25cd9ba0abc0 ("openvswitch: Add basic MPLS support to kernel")
    Signed-off-by: Simon Horman <simon.horman@netronome.com>
    Acked-by: Pravin B Shelar <pshelar@ovn.org>
    Signed-off-by: David S. Miller <davem@davemloft.net>

  Signed-off-by: Pravin B Shelar <pshelar@ovn.org>
  Acked-by: Jesse Gross <jesse@kernel.org>
* datapath: remove get_pcpu_ptr | Pravin B Shelar | 2016-07-17 | 1 | -2/+2
  There is no need to support old kernels, so we can now use the newer API to access per-CPU data.

  Signed-off-by: Pravin B Shelar <pshelar@ovn.org>
  Acked-by: Jesse Gross <jesse@kernel.org>
* datapath: Use skb_postpush_rcsum() | Pravin B Shelar | 2016-07-17 | 1 | -5/+3
  Use kernel function to update checksum.

  Signed-off-by: Pravin B Shelar <pshelar@ovn.org>
  Acked-by: Jesse Gross <jesse@kernel.org>
* datapath: compat: Refactor egress tunnel info | Pravin B Shelar | 2016-07-08 | 1 | -8/+9
  Upstream, tunnel egress info is retrieved using ndo_fill_metadata_dst. Since older kernels do not have it, we need to keep the vport operation to do the same on those kernels. This patch tries to merge these two operations into one to avoid code duplication.

  This commit backports fc4099f1 ("openvswitch: Fix egress tunnel info.")

  Signed-off-by: Pravin B Shelar <pshelar@ovn.org>
  Acked-by: Jesse Gross <jesse@kernel.org>
* compat: ipv4: Pass struct net through ip_fragment. | Eric W. Biederman | 2016-06-27 | 1 | -1/+1
  Upstream commit:
    ipv4: Pass struct net through ip_fragment

    Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>

  Upstream: 694869b3c544 ("ipv4: Pass struct net through ip_fragment")
  Signed-off-by: Joe Stringer <joe@ovn.org>
  Acked-by: Jesse Gross <jesse@kernel.org>
* datapath: Pass net into ovs_fragment. | Eric W. Biederman | 2016-06-27 | 1 | -3/+4
  Upstream commit:
    openvswitch: Pass net into ovs_fragment

    In preparation for the ipv4 and ipv6 fragmentation code taking a net parameter, pass a struct net into ovs_fragment where the v4 and v6 fragmentation code is called.

    Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>

  Upstream: c559cd3ad32b ("openvswitch: Pass net into ovs_fragment")
  Signed-off-by: Joe Stringer <joe@ovn.org>
  Acked-by: Jesse Gross <jesse@kernel.org>
* datapath:backport: openvswitch: Add packet truncation support. | William Tu | 2016-06-24 | 1 | -4/+36
  Upstream commit:
    commit f2a4d086ed4c588d32fe9b7aa67fead7280e7bf1
    Author: William Tu <u9012063@gmail.com>
    Date: Fri Jun 10 11:49:33 2016 -0700

    openvswitch: Add packet truncation support.

    The patch adds a new OVS action, OVS_ACTION_ATTR_TRUNC, in order to truncate packets. A 'max_len' is added for setting up the maximum packet size, and a 'cutlen' field is to record the number of bytes to trim the packet when the packet is outputting to a port, or when the packet is sent to userspace.

    Signed-off-by: William Tu <u9012063@gmail.com>
    Cc: Pravin Shelar <pshelar@nicira.com>
    Acked-by: Pravin B Shelar <pshelar@ovn.org>
    Signed-off-by: David S. Miller <davem@davemloft.net>

  Signed-off-by: William Tu <u9012063@gmail.com>
  Acked-by: Pravin B Shelar <pshelar@ovn.org>
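The relationship between 'max_len' and 'cutlen' can be sketched as plain arithmetic. The helper names below (`trunc_cutlen`, `trunc_output_len`) are hypothetical and the code is a user-space illustration of the bookkeeping the message describes, not the kernel implementation.

```c
#include <assert.h>
#include <stdint.h>

/* Number of bytes to trim so that a packet of pkt_len bytes is emitted with
 * at most max_len bytes. Packets already short enough are left untouched. */
static uint32_t trunc_cutlen(uint32_t pkt_len, uint32_t max_len)
{
    return pkt_len > max_len ? pkt_len - max_len : 0;
}

/* Length actually sent to the port or to userspace once cutlen is applied.
 * A cutlen larger than the packet is treated as "trim everything". */
static uint32_t trunc_output_len(uint32_t pkt_len, uint32_t cutlen)
{
    return cutlen < pkt_len ? pkt_len - cutlen : 0;
}
```

For a 1500-byte packet and a 'max_len' of 100, the recorded 'cutlen' is 1400 and the emitted length is 100.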
* datapath:backport: openvswitch: use flow protocol when recalculating ipv6 checksums | Pravin B Shelar | 2016-06-09 | 1 | -2/+2
  Upstream commit:
    commit b4f70527f052b0c00be4d7cac562baa75b212df5
    Author: Simon Horman <simon.horman@netronome.com>
    Date: Thu Apr 21 11:49:15 2016 +1000

    openvswitch: use flow protocol when recalculating ipv6 checksums

    When using masked actions the ipv6_proto field of an action to set IPv6 fields may be zero rather than the prevailing protocol which will result in skipping checksum recalculation.

    This patch resolves the problem by relying on the protocol in the flow key rather than that in the set field action.

    Fixes: 83d2b9ba1abc ("net: openvswitch: Support masked set actions.")
    Cc: Jarno Rajahalme <jrajahalme@nicira.com>
    Signed-off-by: Simon Horman <simon.horman@netronome.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

  Signed-off-by: Pravin B Shelar <pshelar@ovn.org>
  Acked-by: Jesse Gross <jesse@kernel.org>
* datapath: Drop support for kernel older than 3.10 | Pravin B Shelar | 2016-03-14 | 1 | -21/+7
  Currently the OVS out-of-tree datapath supports a large number of kernel versions, from 2.6.32 to 4.3, plus various distribution-specific kernels. But at this point major features are only available on more recent kernels. For example, stateful services are only available starting with kernel 3.10, and STT is available starting with 3.5. Since these features are becoming essential to many OVS deployments, and the effort of maintaining the backports is high, we have decided to drop support for older kernels. This patch drops support for kernels older than 3.10.

  Signed-off-by: Pravin B Shelar <pshelar@ovn.org>
  Acked-by: Jesse Gross <jesse@kernel.org>
* datapath: Fix panic sending IP frags over tunnels. | Joe Stringer | 2016-01-20 | 1 | -3/+3
  The entire OVS_GSO_CB was not preserved when handling IP fragments, leading to the following NULL pointer dereference in ovs_stt_xmit(). Fix this in the fragmentation handling code by preserving the whole CB.

  BUG: unable to handle kernel NULL pointer dereference at 000000000000001c
  IP: [<ffffffffa0cfc5b1>] ovs_stt_xmit+0x61/0x260 [openvswitch]
  Call Trace:
  [<ffffffff815f682e>] ? __alloc_skb+0x7e/0x2b0
  [<ffffffffa0cf1134>] ovs_vport_send+0x44/0xb0 [openvswitch]
  [<ffffffffa0ce241f>] ovs_vport_output+0x10f/0x190 [openvswitch]
  [<ffffffff8163fe98>] ip_fragment+0x238/0x870
  [<ffffffffa0ce2310>] ? do_output.isra.35+0x120/0x120 [openvswitch]
  [<ffffffffa0d02093>] ovs_fragment+0x283/0x292 [openvswitch]
  [<ffffffff81073ff7>] ? mod_timer_pending+0x67/0x1b0
  [<ffffffff8160e2d0>] ? dst_ifdown+0x90/0x90
  [<ffffffff8160e2d0>] ? dst_ifdown+0x90/0x90
  [<ffffffffa0b30165>] ? nfnetlink_has_listeners+0x15/0x20 [nfnetlink]
  [<ffffffffa0cdb164>] ? ctnetlink_conntrack_event+0x74/0x7ee [nf_conntrack_netlink]
  [<ffffffffa0b873cd>] ? nf_ct_deliver_cached_events+0xad/0xf0 [nf_conntrack]
  [<ffffffff81360331>] ? csum_partial+0x11/0x20
  [<ffffffffa0ce2747>] ? execute_masked_set_action+0x2a7/0xa60 [openvswitch]
  [<ffffffffa0ce22a8>] do_output.isra.35+0xb8/0x120 [openvswitch]
  [<ffffffffa0ce2ff4>] do_execute_actions+0xf4/0x7f0 [openvswitch]
  [<ffffffffa0ce3730>] ovs_execute_actions+0x40/0x130 [openvswitch]
  [<ffffffffa0ce7c69>] ovs_packet_cmd_execute+0x2b9/0x2e0 [openvswitch]
  [<ffffffff81634fad>] genl_family_rcv_msg+0x18d/0x370
  [<ffffffff81635190>] ? genl_family_rcv_msg+0x370/0x370
  [<ffffffff81635221>] genl_rcv_msg+0x91/0xd0
  [<ffffffff816332c9>] netlink_rcv_skb+0xa9/0xc0
  [<ffffffff816337c8>] genl_rcv+0x28/0x40
  [<ffffffff816329b5>] netlink_unicast+0xd5/0x1b0
  [<ffffffff81632d9e>] netlink_sendmsg+0x30e/0x680
  [<ffffffff8162fc84>] ? netlink_rcv_wake+0x44/0x60
  [<ffffffff81630d12>] ? netlink_recvmsg+0x1a2/0x3a0
  [<ffffffff815ed7fb>] sock_sendmsg+0x8b/0xc0
  [<ffffffff8114d06d>] ? __alloc_pages_nodemask+0x16d/0xac0
  [<ffffffff8101c4b9>] ? sched_clock+0x9/0x10
  [<ffffffff815edbc9>] ___sys_sendmsg+0x349/0x360
  [<ffffffff811f8a39>] ? ep_scan_ready_list.isra.7+0x199/0x1c0
  [<ffffffff8110705c>] ? acct_account_cputime+0x1c/0x20
  [<ffffffff811cd90f>] ? fget_light+0x8f/0xf0
  [<ffffffff815ee922>] __sys_sendmsg+0x42/0x80
  [<ffffffff815ee972>] SyS_sendmsg+0x12/0x20
  [<ffffffff8170f22f>] tracesys+0xe1/0xe6

  VMware-BZ: #1587324
  Fixes: a94ebc39996b ("datapath: Add conntrack action")
  Signed-off-by: Joe Stringer <joe@ovn.org>
  Acked-by: Pravin B Shelar <pshelar@ovn.org>
* compat: Backport conntrack strictly to v3.10+. | Joe Stringer | 2015-12-18 | 1 | -3/+3
  The conntrack/ipfrag backport was previously not entirely consistent in its include for versions 3.9 and 3.10. The intention was to build it for all kernels 3.10 and newer, so fix the version checks.

  Reported-by: Simon Horman <simon.horman@netronome.com>
  Signed-off-by: Joe Stringer <joe@ovn.org>
  Acked-by: Pravin B Shelar <pshelar@nicira.com>
  Tested-by: Simon Horman <simon.horman@netronome.com>
* datapath: Avoid warning for unused static data on Linux <=3.9.0. | Ben Pfaff | 2015-12-08 | 1 | -0/+2
  Signed-off-by: Ben Pfaff <blp@ovn.org>
  Acked-by: Pravin B Shelar <pshelar@nicira.com>
* datapath: Backport conntrack fixes. | Joe Stringer | 2015-12-03 | 1 | -7/+16
  Backport the following fixes for conntrack from upstream.

  9723e6abc70a openswitch: fix typo CONFIG_NF_CONNTRACK_LABEL
  0d5cdef8d5dd openvswitch: Fix conntrack compilation without mark.
  982b52700482 openvswitch: Fix mask generation for nested attributes.
  cc5706056baa openvswitch: Fix IPv6 exthdr handling with ct helpers.
  33db4125ec74 openvswitch: Rename LABEL->LABELS
  b8f2257069f1 openvswitch: Fix skb leak in ovs_fragment()
  ec0d043d05e6 openvswitch: Ensure flow is valid before executing ct
  6f225952461b openvswitch: Reject ct_state unsupported bits
  fbccce5965a5 openvswitch: Extend ct_state match field to 32 bits
  ab38a7b5a449 openvswitch: Change CT_ATTR_FLAGS to CT_ATTR_COMMIT
  9e384715e9e7 openvswitch: Reject ct_state masks for unknown bits
  4f0909ee3d8e openvswitch: Mark connections new when not confirmed.
  e754ec69ab69 openvswitch: Serialize nested ct actions if provided
  74c16618137f openvswitch: Fix double-free on ip_defrag() errors
  6f5cadee44d8 openvswitch: Fix skb leak using IPv6 defrag

  Signed-off-by: Joe Stringer <joestringer@nicira.com>
  Acked-by: Pravin B Shelar <pshelar@nicira.com>
* datapath: Allow matching on conntrack label | Joe Stringer | 2015-12-03 | 1 | -0/+1
  Allow matching and setting the ct_label field. As with ct_mark, this is populated by executing the CT action. The label field may be modified by specifying a label and mask nested under the CT action. It is stored as metadata attached to the connection. Label modification occurs after lookup, and will only persist when the conntrack entry is committed by providing the COMMIT flag to the CT action. Labels are currently fixed to 128 bits in size.

  Upstream: c2ac667 "openvswitch: Allow matching on conntrack label"
  Signed-off-by: Joe Stringer <joestringer@nicira.com>
  Acked-by: Pravin B Shelar <pshelar@nicira.com>
* datapath: Allow matching on conntrack mark | Joe Stringer | 2015-12-03 | 1 | -0/+1
  Allow matching and setting the ct_mark field. As with ct_state and ct_zone, these fields are populated when the CT action is executed. To write to this field, a value and mask can be specified as a nested attribute under the CT action. This data is stored with the conntrack entry, and is executed after the lookup occurs for the CT action. The conntrack entry itself must be committed using the COMMIT flag in the CT action flags for this change to persist.

  Upstream: 182e304 "openvswitch: Allow matching on conntrack mark"
  Signed-off-by: Justin Pettit <jpettit@nicira.com>
  Signed-off-by: Joe Stringer <joestringer@nicira.com>
  Acked-by: Pravin B Shelar <pshelar@nicira.com>
* datapath: Add conntrack action | Joe Stringer | 2015-12-03 | 1 | -6/+181
  Expose the kernel connection tracker via OVS. Userspace components can make use of the CT action to populate the connection state (ct_state) field for a flow. This state can be subsequently matched.

  Exposed connection states are OVS_CS_F_*:
  - NEW (0x01) - Beginning of a new connection.
  - ESTABLISHED (0x02) - Part of an existing connection.
  - RELATED (0x04) - Related to an established connection.
  - INVALID (0x20) - Could not track the connection for this packet.
  - REPLY_DIR (0x40) - This packet is in the reply direction for the flow.
  - TRACKED (0x80) - This packet has been sent through conntrack.

  When the CT action is executed by itself, it will send the packet through the connection tracker and populate the ct_state field with one or more of the connection state flags above. The CT action will always set the TRACKED bit.

  When the COMMIT flag is passed to the conntrack action, this specifies that information about the connection should be stored. This allows subsequent packets for the same (or related) connections to be correlated with this connection. Sending subsequent packets for the connection through conntrack allows the connection tracker to consider the packets as ESTABLISHED, RELATED, and/or REPLY_DIR.

  The CT action may optionally take a zone to track the flow within. This allows connections with the same 5-tuple to be kept logically separate from connections in other zones. If the zone is specified, then the "ct_zone" match field will be subsequently populated with the zone id.

  IP fragments are handled by transparently assembling them as part of the CT action. The maximum received unit (MRU) size is tracked so that refragmentation can occur during output.

  IP frag handling contributed by Andy Zhou.

  Based on original design by Justin Pettit.

  Upstream: 7f8a436 "openvswitch: Add conntrack action"
  Signed-off-by: Joe Stringer <joestringer@nicira.com>
  Signed-off-by: Justin Pettit <jpettit@nicira.com>
  Signed-off-by: Andy Zhou <azhou@nicira.com>
  Acked-by: Pravin B Shelar <pshelar@nicira.com>
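The ct_state flags listed in this message can be written out as bit masks and tested in combination. The flag values below are the ones quoted in the message; the predicate `ct_state_is_known_good` is a hypothetical helper added purely to show how a match on several bits might be expressed, and is not part of the datapath.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Connection state bits, with the values quoted in the commit message. */
#define OVS_CS_F_NEW         0x01 /* Beginning of a new connection. */
#define OVS_CS_F_ESTABLISHED 0x02 /* Part of an existing connection. */
#define OVS_CS_F_RELATED     0x04 /* Related to an established connection. */
#define OVS_CS_F_INVALID     0x20 /* Could not track the connection. */
#define OVS_CS_F_REPLY_DIR   0x40 /* Packet is in the reply direction. */
#define OVS_CS_F_TRACKED     0x80 /* Packet has been through conntrack. */

/* Hypothetical policy check: accept only packets that were tracked, are not
 * invalid, and belong to (or relate to) an existing connection. */
static bool ct_state_is_known_good(uint8_t ct_state)
{
    if (!(ct_state & OVS_CS_F_TRACKED))
        return false;   /* CT action has not run on this packet yet */
    if (ct_state & OVS_CS_F_INVALID)
        return false;   /* conntrack could not classify the packet */
    return (ct_state & (OVS_CS_F_ESTABLISHED | OVS_CS_F_RELATED)) != 0;
}
```

Because the CT action always sets TRACKED, a state of zero means the packet has not yet passed through the CT action at all.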
* datapath: Move MASKED* macros to datapath.h | Joe Stringer | 2015-12-03 | 1 | -27/+25
  This will allow the ovs-conntrack code to reuse these macros.

  Upstream: be26b9a "openvswitch: Move MASKED* macros to datapath.h"
  Signed-off-by: Joe Stringer <joestringer@nicira.com>
  Acked-by: Pravin B Shelar <pshelar@nicira.com>
* datapath: Add support for lwtunnel | Pravin B Shelar | 2015-12-03 | 1 | -7/+12
  This patch adds support for lwtunnel to the OVS datapath. With this change the OVS datapath detects lwtunnel support and makes use of the new APIs if available. On older kernels where the support is not there, the backported tunnel modules are used. These backported tunnel devices act as lwtunnel devices. I tried to keep the backported modules the same as upstream for easier bug-fix backports. Since STT and LISP are not upstream, OVS always needs to use the respective modules from the tunnel compat layer. To make it work on kernel 4.3 I have converted the STT and LISP modules to the lwtunnel API model.

  lwtunnel makes use of skb-dst to pass tunnel information to the tunnel module. On older kernels this is not possible, so in the case of an old kernel the metadata ref is stored in OVS_CB and a direct call to the tunnel transmit function is made by the respective tunnel vport modules. Similarly, on the receive side the tunnel recv directly calls netdev-vport-receive to pass the skb to OVS.

  Major backported components include: Geneve, GRE, VXLAN, ip_tunnel, udp-tunnels GRO.

  Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
  Acked-by: Joe Stringer <joe@ovn.org>
  Acked-by: Jesse Gross <jesse@kernel.org>
* datapath: Make 100 percents packets sampled when sampling rate is 1. | Wenyu Zhang | 2015-08-25 | 1 | -1/+4
  When the sampling rate is 1, the sampling probability is UINT32_MAX. The packet should be sampled even if prandom32() generates the number UINT32_MAX. And no packet needs to be sampled when the probability is 0.

  Signed-off-by: Wenyu Zhang <wenyuz@vmware.com>
  Acked-by: Pravin B Shelar <pshelar@nicira.com>
  Signed-off-by: David S. Miller <davem@davemloft.net>

  Upstream: e05176a3283 ("openvswitch: Make 100 percents packets sampled when sampling rate is 1.")
  Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
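The two boundary cases in this message (probability UINT32_MAX must always sample, probability 0 must never sample) can be made concrete with a small predicate. This is a sketch only: `should_sample` is a hypothetical helper, and the random value is passed in as a parameter so the behavior is deterministic and easy to check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Decide whether to sample a packet, given the configured probability and a
 * uniformly distributed 32-bit random value. */
static bool should_sample(uint32_t probability, uint32_t rnd)
{
    if (probability == 0)
        return false;           /* never sample at probability 0 */

    /* "<=" rather than "<" so that probability == UINT32_MAX accepts every
     * possible rnd, including rnd == UINT32_MAX. */
    return rnd <= probability;
}
```

A strict comparison (`rnd < probability`) would skip the packet whenever the generator happened to return UINT32_MAX, which is the corner case the message calls out.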
* datapath: Fix L4 checksum handling when dealing with IP fragments | Glenn Griffin | 2015-08-17 | 1 | -4/+13
  openvswitch modifies the L4 checksum of a packet when modifying the ip address. When an IP packet is fragmented only the first fragment contains an L4 header and checksum. Prior to this change openvswitch would modify all fragments, modifying application data in non-first fragments, causing checksum failures in the reassembled packet.

  Signed-off-by: Glenn Griffin <ggriffin.kernel@gmail.com>
  Acked-by: Pravin B Shelar <pshelar@nicira.com>
  Signed-off-by: David S. Miller <davem@davemloft.net>

  Upstream: 3576fd794b3 ("openvswitch: Fix L4 checksum handling when dealing with IP fragments").
  Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
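The rule behind this fix is that only a fragment with offset zero carries the L4 header. A minimal sketch of that test follows; `FRAG_OFFSET_MASK`, `FRAG_MORE_FRAGMENTS`, and `fragment_has_l4_header` are hypothetical names, and the field is assumed to be the IPv4 flags/fragment-offset word already converted to host byte order.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* IPv4 flags/offset word layout (host byte order): the low 13 bits hold the
 * fragment offset, bit 13 is the "more fragments" flag. */
#define FRAG_OFFSET_MASK    0x1FFF
#define FRAG_MORE_FRAGMENTS 0x2000

/* Only unfragmented packets and first fragments (offset 0) contain the L4
 * header, so only those may have their L4 checksum rewritten. */
static bool fragment_has_l4_header(uint16_t frag_off_host)
{
    return (frag_off_host & FRAG_OFFSET_MASK) == 0;
}
```

A caller rewriting an IP address would update the L4 checksum only when this returns true, leaving the payload bytes of later fragments untouched.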
* datapath: Use skb_postpull_rcsum(). | Joe Stringer | 2015-07-30 | 1 | -4/+1
  Signed-off-by: Joe Stringer <joestringer@nicira.com>
  Acked-by: Jesse Gross <jesse@nicira.com>
* datapath: Whitespace fixes. | Joe Stringer | 2015-07-30 | 1 | -5/+1
  Signed-off-by: Joe Stringer <joestringer@nicira.com>
  Acked-by: Jesse Gross <jesse@nicira.com>
* datapath: Include datapath actions with sampled-packet upcall to userspace. | Neil McKee | 2015-07-17 | 1 | -8/+15
  If the new optional attribute OVS_USERSPACE_ATTR_ACTIONS is added to an OVS_ACTION_ATTR_USERSPACE action, then include the datapath actions in the upcall.

  This directly associates the sampled packet with the path it takes through the virtual switch. Path information currently includes mangling, encapsulation and decapsulation actions for tunneling protocols GRE, VXLAN, Geneve, MPLS and QinQ, but this extension requires no further changes to accommodate datapath actions that may be added in the future.

  Adding path information enhances visibility into complex virtual networks.

  Signed-off-by: Neil McKee <neil.mckee@inmon.com>
  Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
* datapath: Support masked set actions. | Jarno Rajahalme | 2015-05-22 | 1 | -137/+237
  OVS kernel module support for masked set actions is already upstream in Linux (commit 83d2b9ba1abca241df44a502b6da950a25856b5b). This patch adds the same for the OVS tree kernel module.

  The existing set action sets many fields at once. When only a subset of the IP header fields, for example, should be modified, all the IP fields need to be exact matched so that the other field values can be copied to the set action. A masked set action allows modification of an arbitrary subset of the supported header bits without requiring the rest to be matched.

  Masked set action is now supported for all writeable key types, except for the tunnel key. The set tunnel action is an exception as any input tunnel info is cleared before action processing starts, so there is no tunnel info to mask.

  The kernel module converts all (non-tunnel) set actions to masked set actions. This makes action processing more uniform, and results in less branching and duplication of the action processing code. When returning actions to userspace, the conversion is inverted. We use a kernel internal action code to be able to tell the userspace provided and converted masked set actions apart.

  Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
  Acked-by: Jesse Gross <jesse@nicira.com>
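The core of a masked set is a single bitwise merge: bits selected by the mask come from the new value, all other bits keep their old contents. The sketch below shows this on a 32-bit field; `masked_set_u32` is a hypothetical helper for illustration and is not taken from the datapath sources.

```c
#include <assert.h>
#include <stdint.h>

/* Return old_val with only the bits selected by mask replaced by the
 * corresponding bits of new_val. */
static uint32_t masked_set_u32(uint32_t old_val, uint32_t new_val, uint32_t mask)
{
    return (old_val & ~mask) | (new_val & mask);
}
```

For example, rewriting only the last octet of 192.168.0.1 (0xC0A80001) to 99 uses a mask of 0x000000FF and yields 0xC0A80063. A full-width mask reproduces an ordinary (unmasked) set, which is why converting every set action to a masked one loses nothing.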
* datapath: Account for "rename vlan_tx_* helpers since "tx" is misleading there" | Thomas Graf | 2015-02-03 | 1 | -2/+2
  Upstream commit:
    net: rename vlan_tx_* helpers since "tx" is misleading there

    The same macros are used for rx as well. So rename it.

    Signed-off-by: Jiri Pirko <jiri@resnulli.us>
    Signed-off-by: David S. Miller <davem@davemloft.net>

  Upstream: df8a39d ("net: rename vlan_tx_* helpers since "tx" is misleading there")
  Signed-off-by: Thomas Graf <tgraf@noironetworks.com>
  Acked-by: Pravin B Shelar <pshelar@nicira.com>
* datapath: move vlan pop/push functions into common code | Thomas Graf | 2015-01-07 | 1 | -73/+10
  So it can be used from outside of the openvswitch code. Made a couple of cosmetic changes along the way, namely variable naming and adding support for the 8021AD proto.

  Note on backwards compatibility: Unlike the upstream version, the backport of skb_vlan_push() does not support translating a hardware accelerated 8021AD tag to software. This is not a problem though as it preserves existing behaviour.

  Upstream: 93515d53 ("net: move vlan pop/push functions into common code")
  Signed-off-by: Thomas Graf <tgraf@noironetworks.com>
  Acked-by: Pravin B Shelar <pshelar@nicira.com>
* datapath: move make_writable helper into common code | Thomas Graf | 2015-01-07 | 1 | -25/+14
  Note that skb_make_writable already exists in net/netfilter/core.c but does something slightly different.

  Upstream: e219512 ("net: move make_writable helper into common code")
  Signed-off-by: Thomas Graf <tgraf@noironetworks.com>
  Acked-by: Pravin B Shelar <pshelar@nicira.com>
* datapath: Account for rename to vlan_insert_tag_set_proto() | Thomas Graf | 2015-01-07 | 1 | -1/+1
  __vlan_put_tag() was renamed to vlan_insert_tag_set_proto() with the argument list kept intact.

  Upstream: 62749e ("vlan: rename __vlan_put_tag to vlan_insert_tag_set_proto")
  Signed-off-by: Thomas Graf <tgraf@noironetworks.com>
  Acked-by: Pravin B Shelar <pshelar@nicira.com>
* datapath: fix coding style. | Pravin B Shelar | 2014-11-09 | 1 | -8/+9
  The kernel datapath code has diverged from the upstream code. This makes porting patches between these two code bases harder than it needs to be. This patch addresses that by fixing coding style issues on this branch.

  Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
* datapath: Fix few mpls issues. | Pravin B Shelar | 2014-11-09 | 1 | -21/+14
  Found during MPLS upstreaming. Also sync up the MPLS header files with upstream code.

  Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
* datapath: Rename last_action() as nla_is_last() and move to netlink.h | Simon Horman | 2014-11-09 | 1 | -8/+3
  The original motivation for this change was to allow the helper to be used in files other than actions.c as part of work on an odp select group action.

  It was pointed out by Thomas Graf that this helper would be best off living in netlink.h. Furthermore, I think that the generic nature of this helper means it is best off in netlink.h regardless of whether it is used in more than one .c file or not. Thus, I would like it considered independently of the work on an odp select group action.

  Cc: Thomas Graf <tgraf@suug.ch>
  Cc: Pravin Shelar <pshelar@nicira.com>
  Cc: Andy Zhou <azhou@nicira.com>
  Signed-off-by: Simon Horman <simon.horman@netronome.com>
  Acked-by: Thomas Graf <tgraf@noironetworks.com>
  Signed-off-by: David S. Miller <davem@davemloft.net>
  Acked-by: Pravin B Shelar <pshelar@nicira.com>
* datapath: Use upstream ipv6_find_hdr(). | Pravin B Shelar | 2014-10-23 | 1 | -1/+1
  ipv6_find_hdr() has already been fixed in newer upstream kernels by Ansis, so we can start using this API safely. This patch also backports the fix (ipv6: ipv6_find_hdr restore prev functionality) to the compat ipv6_find_hdr().

  CC: Ansis Atteka <aatteka@nicira.com>
  Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
  Acked-by: Andy Zhou <azhou@nicira.com>
* datapath: Fix comment style. | Pravin B Shelar | 2014-10-23 | 1 | -1/+2
  Use netdev comment style.

  Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
  Acked-by: Andy Zhou <azhou@nicira.com>
* datapath: Constify various function arguments | Thomas Graf | 2014-09-23 | 1 | -4/+5
  Help produce better optimized code.

  Signed-off-by: Thomas Graf <tgraf@noironetworks.com>
  Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
* datapath: Remove pkt_key from OVS_CB. | Pravin B Shelar | 2014-09-20 | 1 | -185/+113
  OVS keeps a pointer to the packet key in skb->cb, but the packet key is stored on the stack. This can make the code a bit tricky, so it is better to get rid of the pointer.

  Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
  Acked-by: Andy Zhou <azhou@nicira.com>
* datapath: fix panic with multiple vlan headers | Jiri Benc | 2014-09-08 | 1 | -0/+3
  When there are multiple vlan headers present in a received frame, the first one is put into vlan_tci and protocol is set to ETH_P_8021Q. Anything in the skb beyond the VLAN TPID may be still non-linear, including the inner TCI and ethertype. While ovs_flow_extract takes care of IP and IPv6 headers, it does nothing with ETH_P_8021Q. Later, if OVS_ACTION_ATTR_POP_VLAN is executed, __pop_vlan_tci pulls the next vlan header into vlan_tci.

  This leads to two things:

  1. Part of the resulting ethernet header is in the non-linear part of the skb. When eth_type_trans is called later as the result of OVS_ACTION_ATTR_OUTPUT, kernel BUGs in __skb_pull. Also, __pop_vlan_tci is in fact accessing random data when it reads past the TPID.

  2. network_header points into the ethernet header instead of behind it. mac_len is set to a wrong value (10), too.

  Reported-by: Yulong Pei <ypei@redhat.com>
  Signed-off-by: Jiri Benc <jbenc@redhat.com>

  I have dropped second change. Since it assumes inner mac header is of ETH_HLEN len which is not always true.

  Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
* datapath: Implement recirc action without recursion | Andy Zhou | 2014-09-05 | 1 | -15/+154
  Since the kernel stack is limited in size, it is not wise to use recursive functions with large stack frames.

  This patch provides an alternative implementation of the recirc action without using recursion.

  A per-CPU, fixed-size 'deferred action FIFO' is used to store either recirc or sample actions encountered during execution of an action list. Not executing a recirc or sample action in place, but rather executing it later as a 'deferred action', avoids recursion.

  Deferred actions are only executed after all other actions have been executed, including the ones triggered by loopback from the kernel network stack.

  The size of the private FIFO, currently set to 20, limits the number of total 'deferred actions' any one packet can accumulate.

  Signed-off-by: Andy Zhou <azhou@nicira.com>
  Acked-by: Pravin B Shelar <pshelar@nicira.com>
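A fixed-size deferred action FIFO of the kind described here can be sketched in a few lines of user-space C. All names below (`struct action_fifo`, `fifo_put`, `fifo_get`, `run_deferred`) and the payload field are hypothetical; the only value taken from the message is the capacity of 20. The real datapath keeps one such FIFO per CPU, which this sketch does not model.

```c
#include <assert.h>
#include <stddef.h>

#define DEFERRED_FIFO_SIZE 20   /* capacity quoted in the commit message */

/* Hypothetical payload: just enough to identify the deferred action. */
struct deferred_action {
    unsigned int recirc_id;
};

struct action_fifo {
    size_t head;   /* index of the next free slot */
    size_t tail;   /* index of the next entry to execute */
    struct deferred_action slots[DEFERRED_FIFO_SIZE];
};

static void fifo_reset(struct action_fifo *f)
{
    f->head = 0;
    f->tail = 0;
}

/* Reserve a slot for a new deferred action; NULL means the FIFO is full and
 * the caller has to drop the packet. */
static struct deferred_action *fifo_put(struct action_fifo *f)
{
    if (f->head >= DEFERRED_FIFO_SIZE)
        return NULL;
    return &f->slots[f->head++];
}

/* Take the oldest pending action, or NULL when nothing is pending. */
static struct deferred_action *fifo_get(struct action_fifo *f)
{
    if (f->tail == f->head)
        return NULL;
    return &f->slots[f->tail++];
}

/* Drain the FIFO iteratively after the main action list has finished, then
 * reset it for the next packet. Returns the number of actions drained. */
static size_t run_deferred(struct action_fifo *f)
{
    size_t executed = 0;

    while (fifo_get(f) != NULL)
        executed++;   /* a real implementation would execute the action here */
    fifo_reset(f);
    return executed;
}
```

Draining in a loop, rather than calling back into the executor from inside an action, is what keeps stack usage flat regardless of how many recirculations a packet triggers.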
* datapath: Remove recirc stack depth limit check | Andy Zhou | 2014-09-05 | 1 | -59/+4
  Future patches will change the recirc action implementation to not use recursion. The stack depth detection is no longer necessary.

  Signed-off-by: Andy Zhou <azhou@nicira.com>
  Acked-by: Pravin B Shelar <pshelar@nicira.com>
* datapath: simplify sample action implementation | Andy Zhou | 2014-08-29 | 1 | -26/+19
  The current sample() function implementation is more complicated than necessary in handling the single userspace action optimization and skb reference counting. There are no functional changes.

  Signed-off-by: Andy Zhou <azhou@nicira.com>
  Acked-by: Pravin B Shelar <pshelar@nicira.com>
* datapath: Fix checksum calculation when modifying ICMPv6 packets. | Jesse Gross | 2014-08-29 | 1 | -2/+6
  The checksum of ICMPv6 packets uses the IP pseudoheader as part of the calculation, unlike ICMP in IPv4. This was not implemented, which means that modifying the IP addresses of an ICMPv6 packet would cause the checksum to no longer be correct as the pseudoheader did not match.

  Reported-by: Neal Shrader <icosahedral@gmail.com>
  Signed-off-by: Jesse Gross <jesse@nicira.com>
  Acked-by: Pravin B Shelar <pshelar@nicira.com>
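To see why an address rewrite invalidates an ICMPv6 checksum, it helps to compute the checksum from scratch over the IPv6 pseudoheader plus the message. The following is a straightforward user-space sketch with hypothetical helper names (`sum16`, `icmp6_checksum`); it recomputes the full checksum rather than applying the incremental update a datapath would normally use.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Add len bytes at p to a running sum of big-endian 16-bit words. */
static uint32_t sum16(const uint8_t *p, size_t len, uint32_t sum)
{
    size_t i;

    for (i = 0; i + 1 < len; i += 2)
        sum += ((uint32_t)p[i] << 8) | p[i + 1];
    if (len & 1)
        sum += (uint32_t)p[len - 1] << 8;   /* pad odd trailing byte */
    return sum;
}

/* Internet checksum of an ICMPv6 message, including the IPv6 pseudoheader
 * (source address, destination address, upper-layer length, next header 58).
 * The checksum field inside msg must be zero when generating a checksum. */
static uint16_t icmp6_checksum(const uint8_t src[16], const uint8_t dst[16],
                               const uint8_t *msg, size_t len)
{
    const uint8_t tail[8] = {
        (uint8_t)(len >> 24), (uint8_t)(len >> 16),
        (uint8_t)(len >> 8),  (uint8_t)len,
        0, 0, 0, 58           /* three zero bytes, then next header = ICMPv6 */
    };
    uint32_t sum = 0;

    sum = sum16(src, 16, sum);
    sum = sum16(dst, 16, sum);
    sum = sum16(tail, sizeof tail, sum);
    sum = sum16(msg, len, sum);

    while (sum >> 16)                        /* fold carries back in */
        sum = (sum & 0xFFFF) + (sum >> 16);
    return (uint16_t)~sum;
}
```

Because both addresses feed into the sum, changing either one changes the result, so a set action that rewrites an IPv6 address has to adjust the ICMPv6 checksum as well. Running the same function over a message that already contains a valid checksum returns 0.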
* datapath: Fix recirc bug where skb is double freed. | Andy Zhou | 2014-08-26 | 1 | -12/+17
  If the recirc action is the last action of an action list, the SKB that triggers the recirc will be freed twice. This patch fixes this bug.

  Reported-by: Justin Pettit <jpettit@nicira.com>
  Signed-off-by: Andy Zhou <azhou@nicira.com>
* datapath: Remove flow member from struct ovs_skb_cb | Lorand Jakab | 2014-08-25 | 1 | -2/+2
  struct ovs_skb_cb is full on kernels < 3.11 due to compatibility code. This patch removes the 'flow' member in order to make room for data needed by layer 3 flow/port support that will be added in an upcoming patch. The 'flow' member was chosen for removal because it's only used in ovs_execute_actions().

  Signed-off-by: Lorand Jakab <lojakab@cisco.com>
  Acked-by: Pravin B Shelar <pshelar@nicira.com>
* Extend OVS IPFIX exporter to export tunnel headers | Wenyu Zhang | 2014-08-18 | 1 | -0/+19
  Extend the IPFIX exporter to export tunnel headers on both input and output of the port.

  Add three other_config options in the IPFIX table: enable-input-sampling, enable-output-sampling and enable-tunnel-sampling, to control whether tunnel info is sampled and in which direction (input or output).

  Insert the sampling action before the output action; the output tunnel port is sent to the datapath in the sampling action.

  Make the datapath collect output tunnel info and send it back to userspace in the upcall message with a new additional optional attribute.

  Add a tunnel ports map to make the tunnel port lookup faster in sampling upcalls in the IPFIX exporter. Make the IPFIX exporter generate IPFIX template sets with enterprise elements for the tunnel info, save the tunnel info in IPFIX cache entries, and send IPFIX DATA with tunnel info.

  Add flowDirection element in IPFIX templates.

  Signed-off-by: Wenyu Zhang <wenyuz@vmware.com>
  Acked-by: Romain Lenglet <rlenglet@vmware.com>
  Acked-by: Ben Pfaff <blp@nicira.com>
  Acked-by: Pravin B Shelar <pshelar@nicira.com>
* datapath: Update flow key before recirc | Andy Zhou | 2014-08-12 | 1 | -24/+19
  When the flow key becomes invalid due to push or pop actions, the current implementation leaves it invalid and only rebuilds the flow key used for recirculation. This works, but is less efficient in the case of multiple recirc actions: each recirc action has to re-extract its own flow key.

  This patch updates the original flow key as soon as the first recirc action is encountered, avoiding an expensive flow extract call for any future recirc actions as long as the flow key remains valid.

  Signed-off-by: Andy Zhou <azhou@nicira.com>
  Acked-by: Pravin B Shelar <pshelar@nicira.com>
* datapath/actions: Mark recalculate_csum as likely in set_ipv6_addr(). | Jarno Rajahalme | 2014-08-11 | 1 | -1/+1
  The ‘recalculate_csum’ is almost always ‘true’. It is false only if the ipv6 nexthdr is an extension header, and a routing header is found. For the majority of ipv6 packets this would not be the case, so this can be marked as 'likely'.

  Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
  Acked-by: Jesse Gross <jesse@nicira.com>
* datapath: Optimize recirc action. | Pravin B Shelar | 2014-08-08 | 1 | -37/+185
  OVS needs a flow key for flow lookup in the recirc action, and currently does a key extract in the recirc action. In most cases we could use the OVS_CB packet key directly and avoid the packet flow key extract. For the SET action we can update the flow key along with the packet to keep it consistent. But there are some actions, like MPLS pop, which force OVS to do a flow extract. In such cases we can mark the flow key as invalid so that a subsequent recirc action does a full flow extract.

  Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
  Acked-by: Jarno Rajahalme <jrajahalme@nicira.com>
  Acked-by: Andy Zhou <azhou@nicira.com>
* datapath: Use tun_info only for egress tunnel path. | Pravin B Shelar | 2014-08-06 | 1 | -5/+3
  Currently tun_info is used for passing tunnel information on both the ingress and egress paths, which causes confusion. This patch removes its use on the ingress path, making it an egress-only parameter.

  Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
  Acked-by: Andy Zhou <azhou@nicira.com>