summaryrefslogtreecommitdiff
path: root/datapath/flow_netlink.c
Commit message (Collapse)AuthorAgeFilesLines
* datapath: Support masked set actions.Jarno Rajahalme2015-05-221-34/+126
| | | | | | | | | | | | | | | | | | | | | | | | | | | | OVS kernel module support for masked set actions in already upstream in Linux (commit 83d2b9ba1abca241df44a502b6da950a25856b5b). This patch adds the same for the OVS tree kernel module. The existing set action sets many fields at once. When only a subset of the IP header fields, for example, should be modified, all the IP fields need to be exact matched so that the other field values can be copied to the set action. A masked set action allows modification of an arbitrary subset of the supported header bits without requiring the rest to be matched. Masked set action is now supported for all writeable key types, except for the tunnel key. The set tunnel action is an exception as any input tunnel info is cleared before action processing starts, so there is no tunnel info to mask. The kernel module converts all (non-tunnel) set actions to masked set actions. This makes action processing more uniform, and results in less branching and duplicating the action processing code. When returning actions to userspace, the conversion is inverted. We use a kernel internal action code to be able to tell the userspace provided and converted masked set actions apart. Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com> Acked-by: Jesse Gross <jesse@nicira.com>
* datapath: Fix masked key serialization.Pravin B Shelar2015-02-271-1/+1
| | | | | | | | Fix typo where mask is used rather than key. Fixes: 74ed7ab9264("openvswitch: Add support for unique flow IDs.") Reported-by: Joe Stringer <joestringer@nicira.com> Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
* datapath: Add support for unique flow IDs.Joe Stringer2015-02-271-2/+66
| | | | | | | | | | | | | | | | | | | | | | | | | | Previously, flows were manipulated by userspace specifying a full, unmasked flow key. This adds significant burden onto flow serialization/deserialization, particularly when dumping flows. This patch adds an alternative way to refer to flows using a variable-length "unique flow identifier" (UFID). At flow setup time, userspace may specify a UFID for a flow, which is stored with the flow and inserted into a separate table for lookup, in addition to the standard flow table. Flows created using a UFID must be fetched or deleted using the UFID. All flow dump operations may now be made more terse with OVS_UFID_F_* flags. For example, the OVS_UFID_F_OMIT_KEY flag allows responses to omit the flow key from a datapath operation if the flow has a corresponding UFID. This significantly reduces the time spent assembling and transacting netlink messages. With all OVS_UFID_F_OMIT_* flags enabled, the datapath only returns the UFID and statistics for each flow during flow dump, increasing ovs-vswitchd revalidator performance by 40% or more. Signed-off-by: Joe Stringer <joestringer@nicira.com> Acked-by: Pravin B Shelar <pshelar@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* datapath: Refactor ovs_nla_fill_match().Joe Stringer2015-02-271-3/+35
| | | | | | | | | | | | Refactor the ovs_nla_fill_match() function into separate netlink serialization functions ovs_nla_put_{unmasked_key,mask}(). Modify ovs_nla_put_flow() to handle attribute nesting and expose the 'is_mask' parameter - all callers need to nest the flow, and callers have better knowledge about whether it is serializing a mask or not. Signed-off-by: Joe Stringer <joestringer@nicira.com> Acked-by: Pravin B Shelar <pshelar@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* datapath: Fix return of uninitialized variable.Alex Wang2015-02-261-1/+2
| | | | | | | | | | This commit fixes a return of uninitialized variable bug. The bug can cause failures of operations like flow_add. VMware-BZ: #1405810 Signed-off-by: Alex Wang <alexw@nicira.com> Acked-by: Pravin B Shelar <pshelar@nicira.com>
* datapath: Support VXLAN Group Policy extensionThomas Graf2015-02-061-13/+101
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Upstream commit: openvswitch: Support VXLAN Group Policy extension Introduces support for the group policy extension to the VXLAN virtual port. The extension is disabled by default and only enabled if the user has provided the respective configuration. ovs-vsctl add-port br0 vxlan0 -- \ set Interface vxlan0 type=vxlan options:exts=gbp The configuration interface to enable the extension is based on a new attribute OVS_VXLAN_EXT_GBP nested inside OVS_TUNNEL_ATTR_EXTENSION which can carry additional extensions as needed in the future. The group policy metadata is stored as binary blob (struct ovs_vxlan_opts) internally just like Geneve options but transported as nested Netlink attributes to user space. Renames the existing TUNNEL_OPTIONS_PRESENT to TUNNEL_GENEVE_OPT with the binary value kept intact, a new flag TUNNEL_VXLAN_OPT is introduced. The attributes OVS_TUNNEL_KEY_ATTR_VXLAN_OPTS and existing OVS_TUNNEL_KEY_ATTR_GENEVE_OPTS are implemented mutually exclusive. Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net> Upstream: 1dd144 ("openvswitch: Support VXLAN Group Policy extension") Signed-off-by: Thomas Graf <tgraf@noironetworks.com> Acked-by: Pravin B Shelar <pshelar@nicira.com>
* datapath: Allow for any level of nesting in flow attributesThomas Graf2015-02-031-51/+56
| | | | | | | | | | | | | | | | Upstream commit: openvswitch: Allow for any level of nesting in flow attributes nlattr_set() is currently hardcoded to two levels of nesting. This change introduces struct ovs_len_tbl to define minimal length requirements plus next level nesting tables to traverse the key attributes to arbitrary depth. Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net> Upstream: 81bfe3 ("openvswitch: Allow for any level of nesting in flow attributes") Signed-off-by: Thomas Graf <tgraf@noironetworks.com> Acked-by: Pravin B Shelar <pshelar@nicira.com>
* datapath: Rename GENEVE_TUN_OPTS() to TUN_METADATA_OPTS()Thomas Graf2015-02-031-33/+40
| | | | | | | | | | | | | | | | | | | | Backport of upstream commit: openvswitch: Rename GENEVE_TUN_OPTS() to TUN_METADATA_OPTS() Also factors out Geneve validation code into a new separate function validate_and_copy_geneve_opts(). A subsequent patch will introduce VXLAN options. Rename the existing GENEVE_TUN_OPTS() to reflect its extended purpose of carrying generic tunnel metadata options. Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net> Upstream: d91641d ("openvswitch: Rename GENEVE_TUN_OPTS() to TUN_METADATA_OPTS()") Signed-off-by: Thomas Graf <tgraf@noironetworks.com> Acked-by: Pravin B Shelar <pshelar@nicira.com>
* datapath: Fix MPLS action validation.Pravin B Shelar2014-12-191-11/+2
| | | | | | | | | | | | | | Linux stack do not allow GSO for packet with multiple encapsulations. Therefore there was check in MPLS action validation to detect such case, But it is better to add such check at run time to detect such cases. Removing this check also fixes bug in action copy to no skip multiple set actions. Signed-off-by: Pravin B Shelar <pshelar@nicira.com> Reported-by: Srinivas Neginhal <sneginha@vmware.com> Acked-by: Jarno Rajahalme <jrajahalme@nicira.com> Bug #1367702
* datapath: Don't validate IPv6 label masks.Joe Stringer2014-11-251-1/+1
| | | | | | | | | | | | | | | | When userspace doesn't provide a mask, OVS datapath generates a fully unwildcarded mask for the flow by copying the flow and setting all bits in all fields. For IPv6 label, this creates a mask that matches on the upper 12 bits, causing the following error: openvswitch: netlink: Invalid IPv6 flow label value (value=ffffffff, max=fffff) This patch ignores the label validation check for masks, avoiding this error. Signed-off-by: Joe Stringer <joestringer@nicira.com> Acked-by: Pravin B Shelar <pshelar@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* datapath: fix coding style.Pravin B Shelar2014-11-091-117/+104
| | | | | | | | | Kernel datapath code has diverged from upstream code. This makes porting patches between these two code bases harder than it needs to be. Following patch fixes this by fixing coding style issues on this branch. Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
* datapath: Fix few mpls issues.Pravin B Shelar2014-11-091-6/+17
| | | | | | | Found during MPLS upstreaming. Also sync-up MPLS header files with upstream code. Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
* datapath: Fix comment style.Pravin B Shelar2014-10-231-4/+8
| | | | | | | Use netdev comment style. Signed-off-by: Pravin B Shelar <pshelar@nicira.com> Acked-by: Andy Zhou <azhou@nicira.com>
* datapath: Add support for OVS_FLOW_ATTR_PROBE.Jarno Rajahalme2014-10-031-98/+140
| | | | | | | | | This new flag is useful for suppressing error logging while probing for datapath features using flow commands. For backwards compatibility reasons the commands are executed normally, but error logging is suppressed. Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com> Acked-by: Pravin B Shelar <pshelar@nicira.com>
* datapath: Constify various function argumentsThomas Graf2014-09-231-2/+2
| | | | | | | Help produce better optimized code. Signed-off-by: Thomas Graf <tgraf@noironetworks.com> Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
* datapath: Remove unused dp parameter.Pravin B Shelar2014-09-081-1/+1
| | | | | Signed-off-by: Pravin B Shelar <pshelar@nicira.com> Acked-by: Andy Zhou <azhou@nicira.com>
* Extend OVS IPFIX exporter to export tunnel headersWenyu Zhang2014-08-181-19/+66
| | | | | | | | | | | | | | | | | | | | | | Extend IPFIX exporter to export tunnel headers when both input and output of the port. Add three other_config options in IPFIX table: enable-input-sampling, enable-output-sampling and enable-tunnel-sampling, to control whether sampling tunnel info, on which direction (input or output). Insert sampling action before output action and the output tunnel port is sent to datapath in the sampling action. Make datapath collect output tunnel info and send it back to userpace in upcall message with a new additional optional attribute. Add a tunnel ports map to make the tunnel port lookup faster in sampling upcalls in IPFIX exporter. Make the IPFIX exporter generate IPFIX template sets with enterprise elements for the tunnel info, save the tunnel info in IPFIX cache entries, and send IPFIX DATA with tunnel info. Add flowDirection element in IPFIX templates. Signed-off-by: Wenyu Zhang <wenyuz@vmware.com> Acked-by: Romain Lenglet <rlenglet@vmware.com> Acked-by: Ben Pfaff <blp@nicira.com> Acked-by: Pravin B Shelar <pshelar@nicira.com>
* datapath: Move key_attr_size() to flow_netlink.h.Joe Stringer2014-08-151-0/+31
| | | | | Signed-off-by: Joe Stringer <joestringer@nicira.com> Acked-by: Pravin B Shelar <pshelar@nicira.com>
* datapath/flow_netlink: Validate IPv6 flow key and mask values.Jarno Rajahalme2014-08-111-0/+5
| | | | | | Reject flow label key and mask values with invalid bits set. Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com> Acked-by: Jesse Gross <jesse@nicira.com>
* datapath: Avoid NULL mask check while building maskPravin B Shelar2014-08-081-61/+60
| | | | | | | | | | | | | | OVS does mask validation even if it does not need to convert netlink mask attributes to mask structure. ovs_nla_get_match() caller can pass NULL mask structure pointer if the caller does not need mask. Therefore NULL check is required in SW_FLOW_KEY* macros. Following patch does not convert mask netlink attributes if mask pointer is NULL, so we do not need these checks in SW_FLOW_KEY* macro. Signed-off-by: Pravin B Shelar <pshelar@nicira.com> Acked-by: Daniele Di Proietto <ddiproietto@vmware.com> Acked-by: Andy Zhou <azhou@nicira.com>
* datapath: Refactor action alloc and copy api.Pravin B Shelar2014-08-071-7/+17
| | | | | | | | | | There are two separate API to allocate and copy actions list. Anytime OVS needs to copy action list, it needs to call both functions. Following patch moves action allocation to copy function to avoid code duplication. Signed-off-by: Pravin B Shelar <pshelar@nicira.com> Acked-by: Jarno Rajahalme <jrajahalme@nicira.com>
* datapath: Avoid using wrong metadata for recic action.Pravin B Shelar2014-08-061-6/+1
| | | | | | | | | | | | Recirc action needs to extract flow key from packet, it uses tun_info from OVS_CB for setting tunnel meta data in flow key. But tun_info can be overwritten by tunnel send action. This would result in wrong flow key for the recirculation. Following patch copies flow-key meta data from OVS_CB packet key itself thus avoids this bug. Signed-off-by: Pravin B Shelar <pshelar@nicira.com> Acked-by: Andy Zhou <azhou@nicira.com>
* datapath: refactor ovs flow extract API.Pravin B Shelar2014-08-061-18/+13
| | | | | | | | | OVS flow extract is called on packet receive or packet execute code path. Following patch defines separate API for extracting flow-key in packet execute code path. Signed-off-by: Pravin B Shelar <pshelar@nicira.com> Acked-by: Andy Zhou <azhou@nicira.com>
* datapath: do not use vport type to determine presence of Geneve attributesAnsis Atteka2014-08-011-10/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch fixes following kernel crash that could happen, if geneve vport was not added yet, but revalidator thread attempted to dump flows. To reproduce: 1. switch tunnel type between geneve and gre in a loop; and 2. run ping. BUG: unable to handle kernel NULL pointer dereference at 0000000000000048 IP: [<ffffffffa0385470>] ovs_nla_put_flow+0x3d0/0x7c0 [openvswitch] PGD 3b32b067 PUD 3b2ef067 PMD 0 Oops: 0000 [#2] SMP ... CPU: 0 PID: 6450 Comm: revalidator2 Tainted: GF D O 3.13.0-24-generic #46-Ubuntu Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 07/02/2012 task: ffff88003b4aafe0 ti: ffff88003d314000 task.ti: ffff88003d314000 RIP: 0010:[<ffffffffa0385470>] [<ffffffffa0385470>] ovs_nla_put_flow+0x3d0/0x7c0 [openvswitch] RSP: 0018:ffff88003d315a10 EFLAGS: 00010246 RAX: 0000000000000000 RBX: ffff88003a9a9960 RCX: 0000000000000000 RDX: 0000000000000002 RSI: ffffffffffffffc8 RDI: ffff88003babcb80 RBP: ffff88003d315a68 R08: 0000000000000000 R09: 0000000000000004 R10: ffff880039c23034 R11: 0000000000000008 R12: ffff88003a861600 R13: ffff88003a9a9960 R14: ffff88003babcb80 R15: qffff88003a861600 FS: 00007ff0f5d94700(0000) GS:ffff88003f600000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000048 CR3: 000000003b55b000 CR4: 00000000000007f0 Stack: ffffffff81385093 0000000000000000 0000000000000000 0000000000000000 ffff880000000000 ffff88003d315a58 ffff880039c23014 ffff88003a9a97a0 ffff88003babcb80 ffff880039c23018 ffff88003a861600 ffff88003d315ad0 Call Trace: [<ffffffff81385093>] ? __nla_reserve+0x43/0x50 [<ffffffffa037e683>] ovs_flow_cmd_fill_info+0x93/0x2b0 [openvswitch] [<ffffffffa0387159>] ? ovs_flow_tbl_dump_next+0x49/0xc0 [openvswitch] [<ffffffffa037e920>] ovs_flow_cmd_dump+0x80/0xd0 [openvswitch] [<ffffffff81645004>] netlink_dump+0x84/0x240 [<ffffffff816458eb>] __netlink_dump_start+0x1ab/0x220 [<ffffffff816498d7>] genl_family_rcv_msg+0x337/0x370 [<ffffffffa037e8a0>] ? ovs_flow_cmd_fill_info+0x2b0/0x2b0 [openvswitch] [<ffffffff811a2778>] ? __kmalloc_node_track_caller+0x58/0x1e0 [<ffffffff81649910>] ? genl_family_rcv_msg+0x370/0x370 [<ffffffff816499a1>] genl_rcv_msg+0x91/0xd0 [<ffffffff81647a29>] netlink_rcv_skb+0xa9/0xc0 [<ffffffff81647f28>] genl_rcv+0x28/0x40 [<ffffffff81647055>] netlink_unicast+0xd5/0x1b0 [<ffffffff8164742f>] netlink_sendmsg+0x2ff/0x740 [<ffffffff816024eb>] sock_sendmsg+0x8b/0xc0 [<ffffffff811bbaa1>] ? __sb_end_write+0x31/0x60 [<ffffffff811d42bf>] ? touch_atime+0x10f/0x140 [<ffffffff811c2471>] ? pipe_read+0x371/0x400 [<ffffffff81602691>] SYSC_sendto+0x121/0x1c0 [<ffffffff8109dd84>] ? vtime_account_user+0x54/0x60 [<ffffffff81020d35>] ? syscall_trace_enter+0x145/0x250 [<ffffffff8160319e>] SyS_sendto+0xe/0x10 [<ffffffff8172663f>] tracesys+0xe1/0xe6 Signed-Off-By: Ansis Atteka <aatteka@nicira.com> Acked-by: Jesse Gross <jesse@nicira.com>
* datapath/flow_netlink: Avoid wildcarding tunnel key with disabled megaflowsDaniele Di Proietto2014-07-231-0/+19
| | | | | | | | | | | | | | If the userspace wants to match on a flow with some tunnel attributesset to 0, it simply omits them in the netlink attributes stream. Since our wildcarding logic (when megaflows are disabled) is based on the attributes in the netlink stream, we set our mask incorrectly. This commit adds a check to detect if the userspace wants to match on a tunnel, in which case we simply unwildcard the whole tun_key Reported-by: Andy Zhou <azhou@nicira.com> Signed-off-by: Daniele Di Proietto <ddiproietto@vmware.com> Signed-off-by: Andy Zhou <azhou@nicira.com>
* datapath: flow_netlink: Fix a bug.Alex Wang2014-07-231-1/+2
| | | | | | | | | | | | | | | | | | | | Commit 62974663fe (datapath/flow_netlink: Create right mask with disabled megaflows) introduced the bug which caused ovs_nla_get_match() returns immediately after parsing the flow mask for OVS_KEY_ATTR_ENCAP. Consequently, when vlan encapsulated packets are present, the corresponding datapath flows will have incorrect mask like below. And the incorrect flows could affect other non-vlan packets. ~/ovs# ovs-dpctl dump-flows in_port(3/0xffff0000),eth_type(0x8100),encap(), packets:0, bytes:0, used:never, actions:2 This commit fixes the bug by checking and handling the return value of the parsing function correctly. Signed-off-by: Alex Wang <alexw@nicira.com> Acked-by: Pravin B Shelar <pshelar@nicira.com>
* datapath/flow_netlink: Create right mask with disabled megaflowsDaniele Di Proietto2014-07-111-23/+53
| | | | | | | | | | | | | | | | | | | | | | | | If megaflows are disabled, the userspace does not send the netlink attribute OVS_FLOW_ATTR_MASK, and the kernel must create an exact match mask. sw_flow_mask_set() sets every bytes (in 'range') of the mask to 0xff, even the bytes that represent padding for struct sw_flow, or the bytes that represent fields that may not be set during ovs_flow_extract(). This is a problem, because when we extract a flow from a packet, we do not memset() anymore the struct sw_flow to 0 (since commit 9cef26ac6a71). This commit gets rid of sw_flow_mask_set() and introduces mask_set_nlattr(), which operates on the netlink attributes rather than on the mask key. Using this approach we are sure that only the bytes that the user provided in the flow are matched. Also, if the parse_flow_mask_nlattrs() for the mask ENCAP attribute fails, we now return with an error. Reported-by: Alex Wang <alexw@nicira.com> Suggested-by: Pravin B Shelar <pshelar@nicira.com> Signed-off-by: Daniele Di Proietto <ddiproietto@vmware.com> Acked-by: Pravin B Shelar <pshelar@nicira.com>
* datapath/flow_netlink: Fix NDP flow mask validationDaniele Di Proietto2014-07-101-1/+1
| | | | | | | | | | | | | | match_validate() enforce that a mask matching on NDP attributes has also an exact match on ICMPv6 type. The ICMPv6 type, which is 8-bit wide, is stored in the 'tp.src' field of 'struct sw_flow_key', which is 16-bit wide. Therefore, an exact match on ICMPv6 type should only check the first 8 bits. This commit fixes a bug that prevented flows with an exact match on NDP field from being installed Signed-off-by: Daniele Di Proietto <ddiproietto@vmware.com> Acked-by: Pravin B Shelar <pshelar@nicira.com>
* datapath: Additional logging for -EINVAL on flow setups.Jesse Gross2014-07-021-4/+14
| | | | | | | | | | There are many possible ways that a flow can be invalid so we've added logging for most of them. This adds logs for the remaining possible cases so there isn't any ambiguity while debugging. CC: Federico Iezzi <fiezzi@enter.it> Signed-off-by: Jesse Gross <jesse@nicira.com> Acked-by: Thomas Graf <tgraf@noironetworks.com>
* datapath: Allow pop and push MPLS actions after pop VLANSimon Horman2014-07-011-7/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | This patch loosens the restrictions surrounding push and pop MPLS actions such that they will be allowed after a pop VLAN action if the inner ethernet type is acceptable for pop and push MPLS actions. This implies that there is only one VLAN tag present. Some analysis of logic of this change is as follows: The purpose of tracking vlan_tci is to allow prohibition of push and pop MPLS actions in the presence of a VLAN. In this scenario the VLAN_TAG_PRESENT bit of vlan_tci is set and eth_type is that of the packet with the outermost VLAN tag removed. A pop VLAN action may clear vlan_tci as it removes the outermost VLAN tag and the push and pop MPLS logic may rely on eth_type for their prohibition logic. This will not allow push and pop MPLS on packets with multiple VLAN tags, regardless of if they are all remove using POP VLAN, as there is no mechanism to expose the inner ethernet type beyond that of the outermost VLAN tag. Suggested-by: Jesse Gross <jgross@nicira.com> Signed-off-by: Simon Horman <horms@verge.net.au> Signed-off-by: Jesse Gross <jesse@nicira.com>
* datapath: Fix error handling for Geneve options in ipv4_tun_to_nlattr().Ben Pfaff2014-06-301-2/+3
| | | | | | | Found by inspection. Signed-off-by: Ben Pfaff <blp@nicira.com> Acked-by: Jesse Gross <jesse@nicira.com>
* datapath: Remove redundant tcp_flags code.Joe Stringer2014-06-271-10/+3
| | | | | | | | | These two cases used to be treated differently for IPv4/IPv6, but they are now identical. Signed-off-by: Joe Stringer <joestringer@nicira.com> Acked-by: Jarno Rajahalme <jrajahalme@nicira.com> Acked-by: Pravin B Shelar <pshelar@nicira.com>
* datapath: Add basic MPLS support to kernelSimon Horman2014-06-241-18/+112
| | | | | | | | | | | | | | Allow datapath to recognize and extract MPLS labels into flow keys and execute actions which push, pop, and set labels on packets. Based heavily on work by Leo Alterman, Ravi K, Isaku Yamahata and Joe Stringer. Cc: Ravi K <rkerur@gmail.com> Cc: Leo Alterman <lalterman@nicira.com> Cc: Isaku Yamahata <yamahata@valinux.co.jp> Cc: Joe Stringer <joe@wand.net.nz> Signed-off-by: Simon Horman <horms@verge.net.au> Signed-off-by: Jesse Gross <jesse@nicira.com>
* datapath: Add support for Geneve tunneling.Jesse Gross2014-06-201-14/+129
| | | | | | | | | | | | | | | | | | | | | | This adds support for Geneve - Generic Network Virtualization Encapsulation. The protocol is documented at http://tools.ietf.org/html/draft-gross-geneve-00 The kernel implementation is completely agnostic to the options that are in use and can handle newly defined options without further work. It does this by simply matching on a byte array of options and allowing userspace to setup flows on this array. Userspace currently implements only support for basic version of Geneve. It can work with the base header (including the VNI) and is capable of parsing options but does not currently support any particular option definitions. Over time, the intention is to allow options to be matched through OpenFlow without requiring explicit support in OVS userspace. Signed-off-by: Jesse Gross <jesse@nicira.com> Acked-by: Thomas Graf <tgraf@suug.ch> Acked-by: Pravin B Shelar <pshelar@nicira.com>
* tunnel: Add support for matching on OAM packets.Jesse Gross2014-06-191-0/+7
| | | | | | | | | | | Some tunnel formats have mechanisms for indicating that packets are OAM frames that should be handled specially (either as high priority or not forwarded beyond an endpoint). This provides support for allowing those types of packets to be matched. Signed-off-by: Jesse Gross <jesse@nicira.com> Acked-by: Thomas Graf <tgraf@suug.ch> Acked-by: Pravin B Shelar <pshelar@nicira.com>
* datapath: Wrap struct ovs_key_ipv4_tunnel in a new structure.Jesse Gross2014-06-191-7/+31
| | | | | | | | | | | | | | | | Currently, the flow information that is matched for tunnels and the tunnel data passed around with packets is the same. However, as additional information is added this is not necessarily desirable, as in the case of pointers. This adds a new structure for tunnel metadata which currently contains only the existing struct. This change is purely internal to the kernel since the current OVS_KEY_ATTR_IPV4_TUNNEL is simply a compressed version of OVS_KEY_ATTR_TUNNEL that is translated at flow setup. Signed-off-by: Jesse Gross <jesse@nicira.com> Acked-by: Thomas Graf <tgraf@suug.ch> Acked-by: Pravin B Shelar <pshelar@nicira.com>
* datapath: add recirc actionAndy Zhou2014-04-211-0/+16
| | | | | | | Recirculation implementation for Linux kernel data path. Signed-off-by: Andy Zhou <azhou@nicira.com> Acked-by: Jesse Gross <jesse@nicira.com>
* datapath: add hash actionAndy Zhou2014-04-211-1/+26
| | | | | Signed-off-by: Andy Zhou <azhou@nicira.com> Acked-by: Jesse Gross <jesse@nicira.com>
* datapath: Compact sw_flow_key.Jarno Rajahalme2014-03-241-80/+32
| | | | | | | | | | | | | | | | | | | | | | | | Minimize padding in sw_flow_key and move 'tp' top the main struct. These changes simplify code when accessing the transport port numbers and the tcp flags, and makes the sw_flow_key 8 bytes smaller on 64-bit systems (128->120 bytes). These changes also make the keys for IPv4 packets to fit in one cache line. There is a valid concern for safety of packing the struct ovs_key_ipv4_tunnel, as it would be possible to take the address of the tun_id member as a __be64 * which could result in unaligned access in some systems. However: - sw_flow_key itself is 64-bit aligned, so the tun_id within is always 64-bit aligned. - We never make arrays of ovs_key_ipv4_tunnel (which would force every second tun_key to be misaligned). - We never take the address of the tun_id in to a __be64 *. - Whereever we use struct ovs_key_ipv4_tunnel outside the sw_flow_key, it is in stack (on tunnel input functions), where compiler has full control of the alignment. Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com> Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
* datapath: Fix output of SCTP mask.Jarno Rajahalme2014-03-241-4/+4
| | | | | | | | | | The 'output' argument of the ovs_nla_put_flow() is the one from which the bits are written to the netlink attributes. For SCTP we accidentally used the bits from the 'swkey' instead. This caused the mask attributes to include the bits from the actual flow key instead of the mask. Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com> Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
* datapath: Add support for Linux 3.12Pravin Shelar2014-03-071-0/+1
| | | | | | | | | | | Bump kernel support for datapath module to include 3.12. Make use of native ip-tunnel API for Kernel >= 3.12. Based on patch from James Page. Signed-off-by: James Page <james.page@ubuntu.com> Signed-off-by: Pravin B Shelar <pshelar@nicira.com> Reviewed-by: Kyle Mestery <mestery@noironetworks.com>
* datapath: Use ether_addr_copyJoe Perches2014-02-161-6/+6
| | | | | | | It's slightly smaller/faster for some architectures. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
* datapath: Remove 5-tuple optimization.Jarno Rajahalme2014-02-181-54/+4
| | | | | | | | The 5-tuple optimization becomes unnecessary with a later per-NUMA node stats patch. Remove it first to make the changes easier to grasp. Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com> Signed-off-by: Jesse Gross <jesse@nicira.com>
* datapath: flow_netlink: Use pr_fmt to OVS_NLERRJoe Perches2014-02-031-0/+2
| | | | | | | | Add "openvswitch: " prefix to OVS_NLERR output to match the other OVS_NLERR output of datapath.c Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: Jesse Gross <jesse@nicira.com>
* datapath: Added (unsigned long long) cast in printfDaniele Di Proietto2014-02-031-2/+2
| | | | | | | | This is necessary, since u64 is not unsigned long long in all architectures: u64 could be also uint64_t. Signed-off-by: Daniele Di Proietto <daniele.di.proietto@gmail.com> Signed-off-by: Jesse Gross <jesse@nicira.com>
* datapath: use const in some local vars and castsDaniele Di Proietto2014-01-231-2/+4
| | | | | | | | | In few functions, const formal parameters are assigned or cast to non-const. These changes suppress warnings if compiled with -Wcast-qual. Signed-off-by: Daniele Di Proietto <daniele.di.proietto@gmail.com> Signed-off-by: Jesse Gross <jesse@nicira.com>
* datapath: Use percpu allocator for flow-stats.Pravin B Shelar2013-12-031-4/+52
| | | | | | | | | | | | | | Use percpu allocator for stats due to objection to stats array. But percpu allocator is not designed for high churn allocation/ deallcation. so we need to avoid allocating percpu flow for short lived flows. One cheaper way to detect flow is by checking if 5-tuple used in RSS are masked or not. if any one of them is masked, flow is likely shared across CPU where percpu stat should be more scalable. And that flow should be relatively long lived flow. Signed-off-by: Pravin B Shelar <pshelar@nicira.com> Acked-by: Jesse Gross <jesse@nicira.com>
* TCP flags matching support.Jarno Rajahalme2013-10-291-2/+29
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | tcp_flags=flags/mask Bitwise match on TCP flags. The flags and mask are 16-bit num‐ bers written in decimal or in hexadecimal prefixed by 0x. Each 1-bit in mask requires that the corresponding bit in port must match. Each 0-bit in mask causes the corresponding bit to be ignored. TCP protocol currently defines 9 flag bits, and additional 3 bits are reserved (must be transmitted as zero), see RFCs 793, 3168, and 3540. The flag bits are, numbering from the least significant bit: 0: FIN No more data from sender. 1: SYN Synchronize sequence numbers. 2: RST Reset the connection. 3: PSH Push function. 4: ACK Acknowledgement field significant. 5: URG Urgent pointer field significant. 6: ECE ECN Echo. 7: CWR Congestion Windows Reduced. 8: NS Nonce Sum. 9-11: Reserved. 12-15: Not matchable, must be zero. Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com> Signed-off-by: Ben Pfaff <blp@nicira.com> Acked-by: Jesse Gross <jesse@nicira.com>
* datapath: Restructure datapath.c and flow.cPravin B Shelar2013-10-011-0/+1603
Over the time datapath.c and flow.c has became pretty large files. Following patch restructures functionality of component into three different components: flow.c: contains flow extract. flow_netlink.c: netlink flow api. flow_table.c: flow table api. Diffstat is showing wrong count. This patch mostly restructures code without changing logic. Signed-off-by: Pravin B Shelar <pshelar@nicira.com> Acked-by: Jesse Gross <jesse@nicira.com>