delta/openvswitch.git - github.com: openvswitch/ovs.git

	Commit message	Author	Age	Files	Lines
*	flow: Add some L7 payload data to most L4 protocols that accept it.	Ben Pfaff	2018-01-27	1	-1/+3
	This makes traffic generated by flow_compose() look slightly more realistic. It requires lots of updates to tests, but at least the tests themselves should be slightly more realistic too. At the same time, add --l7 and --l7-len options to ofproto/trace to allow users to specify the amount or contents of payloads that they want. Suggested-by: Brad Cowie <brad@cowie.nz> Signed-off-by: Ben Pfaff <blp@ovn.org> Tested-by: Yifeng Sun <pkusunyifeng@gmail.com> Reviewed-by: Yifeng Sun <pkusunyifeng@gmail.com>
*	nsh: add new flow key 'ttl'	Yi Yang	2018-01-11	1	-1/+1
	The IETF NSH draft added a new field, ttl, to the NSH header; this patch adds a new nsh key 'ttl' for it. Signed-off-by: Yi Yang <yi.y.yang@intel.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
*	Check flow's dl_type before setting ct_orig_tuple in 'pkt_metadata_from_flow()'	Numan Siddique	2017-10-25	1	-1/+4
	Normally a flow's dl_type will be a valid value. However, when a packet is sent to the controller, dl_type is not stored in the 'ofputil_packet_in_private'. When the controller resumes (OFPRAW_NXT_RESUME) the packet, the flow's dl_type will be 0. If the flow's ct_state has a valid value, then 'pkt_metadata_from_flow' neither sets the ct_orig_tuple from the flow nor resets it. This results in an invalid ct_orig_tuple value in the pkt_metadata. This patch handles this situation by checking the dl_type before setting the ct_orig_tuple. If dl_type is 0, it resets it. It also resets ct_orig_tuple if dl_type is nonzero and other than IPv4 or IPv6. Reported-by: Daniel Alvarez Sanchez <dalvarez@redhat.com> Reported-at: https://mail.openvswitch.org/pipermail/ovs-dev/2017-October/339868.html Signed-off-by: Numan Siddique <nusiddiq@redhat.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
*	userspace: Add support for NSH MD1 match fields	Jan Scheurich	2017-08-07	1	-1/+2
	This patch adds support for NSH packet header fields to the OVS control plane and the userspace datapath. Initially we support the fields of the NSH base header as defined in https://www.ietf.org/id/draft-ietf-sfc-nsh-13.txt and the fixed context headers specified for metadata format MD1. The variable length MD2 format is parsed but the TLV context headers are not yet available for matching. The NSH fields are modelled as experimenter fields with the dedicated experimenter class 0x005ad650 proposed for NSH in ONF. The following fields are defined:
	NXOXM code            ofctl name    Size      Comment
	=====================================================================
	NXOXM_NSH_FLAGS       nsh_flags       8       Bits 2-9 of 1st NSH word
	(0x005ad650,1)
	NXOXM_NSH_MDTYPE      nsh_mdtype      8       Bits 16-23
	(0x005ad650,2)
	NXOXM_NSH_NEXTPROTO   nsh_np          8       Bits 24-31
	(0x005ad650,3)
	NXOXM_NSH_SPI         nsh_spi         24      Bits 0-23 of 2nd NSH word
	(0x005ad650,4)
	NXOXM_NSH_SI          nsh_si          8       Bits 24-31
	(0x005ad650,5)
	NXOXM_NSH_C1          nsh_c1          32      Maskable, nsh_mdtype==1
	(0x005ad650,6)
	NXOXM_NSH_C2          nsh_c2          32      Maskable, nsh_mdtype==1
	(0x005ad650,7)
	NXOXM_NSH_C3          nsh_c3          32      Maskable, nsh_mdtype==1
	(0x005ad650,8)
	NXOXM_NSH_C4          nsh_c4          32      Maskable, nsh_mdtype==1
	(0x005ad650,9)
	Co-authored-by: Johnson Li <johnson.li@intel.com> Signed-off-by: Yi Yang <yi.y.yang@intel.com> Signed-off-by: Jan Scheurich <jan.scheurich@ericsson.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
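The bit ranges in the field table above can be illustrated with a small extraction helper. This is only a sketch, not OVS code: it assumes the table uses IETF-style numbering (bit 0 is the most significant bit of each 32-bit word) and that both words have already been converted to host byte order. The names `bits_msb0`, `nsh_base_sketch`, and `nsh_base_parse` are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Extract 'width' bits starting at bit offset 'first', where bit 0 is the
 * most significant bit (an assumption about the table's numbering), from a
 * 32-bit word that is already in host byte order. */
static inline uint32_t
bits_msb0(uint32_t word, unsigned first, unsigned width)
{
    assert(width >= 1 && width <= 32 && first + width <= 32);
    uint32_t shifted = word >> (32 - first - width);
    return width == 32 ? shifted : shifted & ((UINT32_C(1) << width) - 1);
}

/* Hypothetical container for the base-header fields listed in the table. */
struct nsh_base_sketch {
    uint8_t flags;   /* nsh_flags:  bits 2-9 of 1st word.   */
    uint8_t mdtype;  /* nsh_mdtype: bits 16-23 of 1st word. */
    uint8_t np;      /* nsh_np:     bits 24-31 of 1st word. */
    uint32_t spi;    /* nsh_spi:    bits 0-23 of 2nd word.  */
    uint8_t si;      /* nsh_si:     bits 24-31 of 2nd word. */
};

static inline struct nsh_base_sketch
nsh_base_parse(uint32_t word1, uint32_t word2)
{
    struct nsh_base_sketch h;
    h.flags = (uint8_t) bits_msb0(word1, 2, 8);
    h.mdtype = (uint8_t) bits_msb0(word1, 16, 8);
    h.np = (uint8_t) bits_msb0(word1, 24, 8);
    h.spi = bits_msb0(word2, 0, 24);
    h.si = (uint8_t) bits_msb0(word2, 24, 8);
    return h;
}
```

A caller would typically apply ntohl() to the two wire-format words before passing them in.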
*	flow: Refactor flow_compose() API.	Andy Zhou	2017-07-27	1	-2/+1
	Currently, flow_compose_size() is only supposed to be called after flow_compose(). I find this API to be unintuitive. Change the flow_compose() API to take a 'size' argument and return 'true' if the packet can be created, 'false' otherwise. This change also improves error detection and reporting when 'size' is unreasonably small. Signed-off-by: Andy Zhou <azhou@ovn.org> Acked-by: Ilya Maximets <i.maximets@samsung.com>
*	flow: Add flow_compose_size().	Ilya Maximets	2017-07-25	1	-0/+1
	This allows composing packets with different real lengths from odp flows, i.e. memory will be allocated for the requested packet size and all required headers, like ip->tot_len, filled in correctly. It will be used in netdev-dummy to properly handle the '--len' option. Suggested-by: Andy Zhou <azhou@ovn.org> Signed-off-by: Ilya Maximets <i.maximets@samsung.com> Signed-off-by: Andy Zhou <azhou@ovn.org>
*	conntrack: Move ct_state parsing to lib/flow.c	Yi-Hung Wei	2017-07-12	1	-0/+3
	This patch moves the conntrack state parsing function from ovn-trace.c to lib/flow.c, because it will be used by the ofproto/trace unixctl command later on. It also updates the ct_state checking logic, since we no longer assume CS_TRACKED is enabled by default. Signed-off-by: Yi-Hung Wei <yihung.wei@gmail.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
*	userspace: Add OXM field MFF_PACKET_TYPE	Jan Scheurich	2017-06-27	1	-6/+21
	Allow packet type namespace OFPHTN_ETHERTYPE as alternative pre-requisite for matching L3 protocols (MPLS, IP, IPv6, ARP etc). Change the meta-flow definition of packet_type field to use the new custom format MFS_PACKET_TYPE representing "(NS,NS_TYPE)". Parsing routine for MFS_PACKET_TYPE added to meta-flow.c. Formatting routine for field packet_type extracted from match_format() and moved to flow.c to be used from meta-flow.c for formatting MFS_PACKET_TYPE. Updated the ovs-fields man page source meta-flow.xml with documentation for packet-type-aware bridges and added documentation for field packet_type. Added packet_type to the matching properties in tests/ofproto.at. If dl_type is unwildcarded due to later packet modification, make sure it is cleared again if the original packet_type was not PT_ETH. Signed-off-by: Jan Scheurich <jan.scheurich@ericsson.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
*	Support accepting and displaying port names in OVS tools.	Ben Pfaff	2017-05-31	1	-3/+5
	Until now, most ovs-ofctl commands have not accepted names for ports, only numbers, and have not been able to display port names either. It's a lot easier for users if they can use and see meaningful names instead of arbitrary numbers. This commit adds that support. For backward compatibility, only interactive ovs-ofctl commands by default display port names; to display them in scripts, use the new --names option. Signed-off-by: Ben Pfaff <blp@ovn.org> Tested-by: Aaron Conole <aconole@redhat.com>
*	userspace: Add packet_type in dp_packet and flow	Jan Scheurich	2017-05-03	1	-1/+10
	This commit adds a packet_type attribute to the structs dp_packet and flow to explicitly carry the type of the packet as preparation for the introduction of the so-called packet type-aware pipeline (PTAP) in OVS.
	The packet_type is a big-endian 32 bit integer with the encoding as specified in OpenFlow version 1.5. The upper 16 bits contain the packet type name space. Pre-defined values are defined in openflow-common.h:
	    enum ofp_header_type_namespaces {
	        OFPHTN_ONF = 0,             /* ONF namespace. */
	        OFPHTN_ETHERTYPE = 1,       /* ns_type is an Ethertype. */
	        OFPHTN_IP_PROTO = 2,        /* ns_type is an IP protocol number. */
	        OFPHTN_UDP_TCP_PORT = 3,    /* ns_type is a TCP or UDP port. */
	        OFPHTN_IPV4_OPTION = 4,     /* ns_type is an IPv4 option number. */
	    };
	The lower 16 bits specify the actual type in the context of the name space. Only name spaces 0 and 1 will be supported for now.
	For name space OFPHTN_ONF the relevant packet type is 0 (Ethernet). This is the default packet_type in OVS and the only one supported so far. Packets of type (OFPHTN_ONF, 0) are called Ethernet packets.
	In name space OFPHTN_ETHERTYPE the type is the Ethertype of the packet. A packet of type (OFPHTN_ETHERTYPE, <Ethertype>) is a standard L2 packet with the Ethernet header (and any VLAN tags) removed to expose the L3 (or L2.5) payload of the packet. These will simply be called L3 packets.
	The Ethernet address fields dl_src and dl_dst in struct flow are not applicable for an L3 packet and must be zero. However, to maintain compatibility with the large code base, we have chosen to copy the Ethertype of an L3 packet into the dl_type field of struct flow. This does not mean that it will be possible to match on dl_type for L3 packets with PTAP later on. Matching must be done on packet_type instead.
	New dp_packets are initialized with packet_type Ethernet. Ports that receive L3 packets will have to explicitly adjust the packet_type.
	Signed-off-by: Jean Tourrilhes <jt@labs.hpe.com> Signed-off-by: Jan Scheurich <jan.scheurich@ericsson.com> Co-authored-by: Zoltan Balogh <zoltan.balogh@ericsson.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
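The packet_type layout described above (name space in the upper 16 bits, type in the lower 16 bits, stored big-endian) can be sketched as follows. The helper names `pt_compose_sketch`, `pt_ns_sketch`, and `pt_ns_type_sketch` are hypothetical and are not claimed to match the OVS helpers; only the enum values come from the commit message.

```c
#include <assert.h>
#include <stdint.h>
#include <arpa/inet.h>   /* htonl(), ntohl() (POSIX) */

/* Name space values as listed in the commit message. */
enum ofp_header_type_namespaces {
    OFPHTN_ONF = 0,
    OFPHTN_ETHERTYPE = 1,
    OFPHTN_IP_PROTO = 2,
    OFPHTN_UDP_TCP_PORT = 3,
    OFPHTN_IPV4_OPTION = 4,
};

/* Build a big-endian packet_type from a name space and a type. */
static inline uint32_t
pt_compose_sketch(uint16_t ns, uint16_t ns_type)
{
    return htonl(((uint32_t) ns << 16) | ns_type);
}

/* Upper 16 bits: the packet type name space. */
static inline uint16_t
pt_ns_sketch(uint32_t packet_type_be)
{
    return (uint16_t) (ntohl(packet_type_be) >> 16);
}

/* Lower 16 bits: the type within that name space. */
static inline uint16_t
pt_ns_type_sketch(uint32_t packet_type_be)
{
    return (uint16_t) (ntohl(packet_type_be) & 0xffff);
}
```

Under this encoding an Ethernet packet is (OFPHTN_ONF, 0), which is all-zero in either byte order, while an IPv4 "L3 packet" would be (OFPHTN_ETHERTYPE, 0x0800).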
*	flow: New function flow_clear_conntrack().	Ben Pfaff	2017-04-21	1	-0/+1
	This will have a new user in an upcoming commit. Signed-off-by: Ben Pfaff <blp@ovn.org> Acked-by: Miguel Angel Ajo <majopela@redhat.com>
*	flow: New function ct_state_from_string().	Ben Pfaff	2017-04-21	1	-1/+3
	This will have its first user in an upcoming commit. Signed-off-by: Ben Pfaff <blp@ovn.org> Acked-by: Miguel Angel Ajo <majopela@redhat.com>
*	Add support for 802.1ad (QinQ tunneling)	Eric Garver	2017-03-16	1	-10/+26
	Flow key handling changes:
	- Add VLAN header array in struct flow, to record multiple 802.1q VLAN headers.
	- Add dpif multi-VLAN capability probing. If datapath supports multi-VLAN, increase the maximum depth of nested OVS_KEY_ATTR_ENCAP.
	Refactor VLAN handling in dpif-xlate:
	- Introduce 'xvlan' to track VLAN stack during flow processing.
	- Input and output VLAN translation according to the xbundle type.
	Push VLAN action support:
	- Allow ethertype 0x88a8 in VLAN headers and push_vlan action.
	- Support push_vlan on dot1q packets.
	Use other_config:vlan-limit in table Open_vSwitch to limit maximum VLANs that can be matched. This allows us to preserve backwards compatibility.
	Add test cases for VLAN depth limit, Multi-VLAN actions and QinQ VLAN handling.
	Co-authored-by: Thomas F Herbert <thomasfherbert@gmail.com> Signed-off-by: Thomas F Herbert <thomasfherbert@gmail.com> Co-authored-by: Xiao Liang <shaw.leon@gmail.com> Signed-off-by: Xiao Liang <shaw.leon@gmail.com> Signed-off-by: Eric Garver <e@erig.me> Signed-off-by: Ben Pfaff <blp@ovn.org>
*	odp: Support conntrack orig tuple key.	Jarno Rajahalme	2017-03-08	1	-0/+50
	Userspace support for datapath original direction conntrack tuple. Signed-off-by: Jarno Rajahalme <jarno@ovn.org> Acked-by: Joe Stringer <joe@ovn.org>
*	mpls: Fix MPLS restoration after patch port and group bucket.	Jarno Rajahalme	2016-12-02	1	-1/+1
	This patch fixes problems with MPLS handling related to patch ports and group buckets. If a group bucket or a peer bridge across a patch port pushes MPLS headers to a non-MPLS packet and outputs, the flow translation after returning from the group bucket or patch port would undo the packet transformations so that the processing could continue with the packet as it was before entering the patch port. There were two problems with this:
	1. As part of the first MPLS push on a non-MPLS packet, the flow translation would first clear the L3/4 headers of the 'flow' to mark those fields invalid. Later, when committing 'flow' changes to datapath actions before output, the necessary datapath MPLS actions are created and the corresponding changes updated to the 'base flow'. This was done using the same flow_push_mpls() function that clears the L2/3 headers, so also the 'base flow' L2/3 headers were cleared. Then, when translation returns from a patch port or group bucket, the original 'flow' is restored, now showing no sign of the MPLS labels. Since the 'base flow' now has the MPLS labels, following translations know to issue MPLS POP actions before any output actions. However, as part of checking for changes to IP headers we test that the IP protocol type was not changed. But now the 'base flow's 'nw_proto' field is zero and an assert fail crashes OVS. This is solved by not clearing the L3/4 fields of the 'base flow'. This allows the processing after the patch port to continue with L3/4 fields as if no MPLS was done, after first issuing the necessary MPLS POP actions.
	2. IP header updates were done before the MPLS POP actions were issued. This caused incorrect packet output after, e.g., group action or patch port. For example, with actions:
	    group 1234: all bucket=push_mpls,output:LOCAL
	    ip actions=group:1234,dec_ttl,output:LOCAL,output:LOCAL
	the dec_ttl would only be executed before the last output to LOCAL, since at the time of committing IP changes after the group action the packet was still an MPLS packet. This is solved by checking the dl_type of both 'flow' and 'base flow' and issuing MPLS actions if they can transform the packet from an MPLS packet to a non-MPLS packet. For an IP packet the change in ttl can then be correctly committed before the last two output actions.
	Two test cases are added to prevent future regressions.
	Reported-by: Thomas Morin <thomas.morin@orange.com> Suggested-by: Takashi YAMAMOTO <yamamoto@ovn.org> Fixes: 8bfd0fdac ("Enhance userspace support for MPLS, for up to 3 labels.") Fixes: 1b035ef20 ("mpls: Allow l3 and l4 actions to prior to a push_mpls action") Signed-off-by: Jarno Rajahalme <jarno@ovn.org> Acked-by: YAMAMOTO Takashi <yamamoto@ovn.org>
*	flow: Add comments to mf_get_next_in_map().	Bhanuprakash Bodireddy	2016-11-14	1	-5/+27
	This patch adds comments to mf_get_next_in_map() to make it more comprehensible. Signed-off-by: Bhanuprakash Bodireddy <bhanuprakash.bodireddy@intel.com> Signed-off-by: Jarno Rajahalme <jarno@ovn.org> Acked-by: Antonio Fischetti <antonio.fischetti@intel.com> Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
*	flow: Skip invoking expensive count_1bits() with zero input.	Bhanuprakash Bodireddy	2016-11-14	1	-5/+6
	This patch checks whether trash is non-zero and only then resets the flowmap bit and increments the pointer by the number of set bits found in trash. Signed-off-by: Bhanuprakash Bodireddy <bhanuprakash.bodireddy@intel.com> Co-authored-by: Antonio Fischetti <antonio.fischetti@intel.com> Signed-off-by: Antonio Fischetti <antonio.fischetti@intel.com> Acked-by: Jarno Rajahalme <jarno@ovn.org> Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
*	tnl-neigh-cache: Unwildcard flow members before inspecting them.	Daniele Di Proietto	2016-09-21	1	-0/+7
	tnl_neigh_snoop() is part of the translation. During translation we have to unwildcard all the fields we examine to make a decision. tnl_arp_snoop() and tnl_nd_snoop() failed to unwildcard fields in case of failure. The solution is to do unwildcarding before the field is inspected. Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com> Acked-by: Jarno Rajahalme <jarno@ovn.org>
*	meta-flow: Clean up masking with prerequisities checking.	Jarno Rajahalme	2016-07-29	1	-0/+39
	Change mf_are_prereqs_ok() to take a flow_wildcards pointer, so that the wildcards can be set at the same time as the prerequisites are checked. This makes it easier to write more obviously correct code. Remove the functions mf_mask_field_and_prereqs() and mf_mask_field_and_prereqs__(), and make the callers first check the prerequisites, while supplying 'wc' to mf_are_prereqs_ok(), and if successful, mask the bits of the field that were read or set using mf_mask_field_masked(). Signed-off-by: Jarno Rajahalme <jarno@ovn.org> Acked-by: Ben Pfaff <blp@ovn.org>
*	flow: Introduce parse_dl_type().	Daniele Di Proietto	2016-07-27	1	-0/+1
	The function simply returns the Ethernet type of the packet (after discarding the VLAN tag, if any). It will be used by a following commit. Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com> Acked-by: Flavio Leitner <fbl@sysclose.org>
*	flow: Export parse_ipv6_ext_hdrs().	Daniele Di Proietto	2016-07-27	1	-0/+3
	This will be used by a future commit. Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com> Acked-by: Joe Stringer <joe@ovn.org> Acked-by: Flavio Leitner <fbl@sysclose.org>
*	Introduce 128-bit xxregs.	Justin Pettit	2016-07-12	1	-0/+22
	These are needed to handle IPv6 addresses. Signed-off-by: Justin Pettit <jpettit@ovn.org> Acked-by: Ben Pfaff <blp@ovn.org>
*	flow: New function is_nd().	Ben Pfaff	2016-07-02	1	-0/+21
	This simplifies a few pieces of code and will acquire another user in an upcoming commit. Signed-off-by: Ben Pfaff <blp@ovn.org>
*	ofproto-dpif-xlate: Fix IGMP megaflow matching.	Ben Pfaff	2016-05-20	1	-19/+50
	IGMP translation wasn't setting enough bits in the wildcards to ensure different packets were handled differently. Reported-by: "O'Reilly, Darragh" <darragh.oreilly@hpe.com> Reported-at: http://openvswitch.org/pipermail/discuss/2016-April/021036.html Signed-off-by: Ben Pfaff <blp@ovn.org>
*	Break flow.h into private and public parts	Ben Warren	2016-04-14	1	-173/+2
	Public parts (struct definitions and some prototypes) go in include/openvswitch. Signed-off-by: Ben Warren <ben@skyportsystems.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
*	hash: New helper functions hash_bytes32() and hash_bytes64().	Ben Pfaff	2016-01-20	1	-3/+2
	All of the callers of hash_words() and hash_words64() actually find it easier to pass in the number of bytes instead of the number of 32-bit or 64-bit words. These new functions allow the callers to be a little simpler. Signed-off-by: Ben Pfaff <blp@ovn.org> Acked-by: Jarno Rajahalme <jarno@ovn.org>
*	ofproto-dpif: Reject partial ct_labels if unsupported.	Joe Stringer	2015-12-01	1	-8/+6
	If only half of a ct_label is present in a miniflow/minimask (e.g., only matching on one specific bit), then rule_check() would allow the flow even if ct_label was unsupported, because it required both 64-bit fields that comprise the ct_label to be present in the miniflow before performing the check. Fix this by populating the stack copy of the label directly from the miniflow fields if available (or zero each 64-bit word if unavailable). Suggested-by: Jarno Rajahalme <jrajahalme@nicira.com> Signed-off-by: Joe Stringer <joestringer@nicira.com> Acked-by: Jarno Rajahalme <jarno@ovn.org>
*	tunneling: extend flow_tnl with ipv6 addresses	Jiri Benc	2015-11-30	1	-2/+2
	Note that because there's been no prerequisite on the outer protocol, we cannot add it now. Instead, treat the ipv4 and ipv6 dst fields in the way that either both are null, or at most one of them is non-null. [cascardo: abstract testing either dst with flow_tnl_dst_is_set] cascardo: using IPv4-mapped address is an exercise for the future, since this would require special handling of MFF_TUN_SRC and MFF_TUN_DST and OpenFlow messages. Signed-off-by: Jiri Benc <jbenc@redhat.com> Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@redhat.com> Co-authored-by: Thadeu Lima de Souza Cascardo <cascardo@redhat.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
*	vswitchd: Allow modifying ICMP type and code.	Justin Pettit	2015-11-09	1	-2/+2
	Signed-off-by: Justin Pettit <jpettit@nicira.com> Acked-by: Flavio Leitner <fbl@sysclose.org>
*	Add connection tracking label support.	Joe Stringer	2015-10-13	1	-1/+12
	This patch adds a new 128-bit metadata field to the connection tracking interface. When a label is specified as part of the ct action and the connection is committed, the value is saved with the current connection. Subsequent ct lookups with the table specified will expose this metadata as the "ct_label" field in the flow. For example, to allow new TCP connections from port 1->2 and only allow established connections from port 2->1, and to associate a label with those connections:
	    table=0,priority=1,action=drop
	    table=0,arp,action=normal
	    table=0,in_port=1,tcp,action=ct(commit,exec(set_field:1->ct_label)),2
	    table=0,in_port=2,ct_state=-trk,tcp,action=ct(table=1)
	    table=1,in_port=2,ct_state=+trk,ct_label=1,tcp,action=1
	Signed-off-by: Joe Stringer <joestringer@nicira.com> Acked-by: Jarno Rajahalme <jrajahalme@nicira.com> Acked-by: Ben Pfaff <blp@nicira.com>
*	Add connection tracking mark support.	Joe Stringer	2015-10-13	1	-3/+6
	This patch adds a new 32-bit metadata field to the connection tracking interface. When a mark is specified as part of the ct action and the connection is committed, the value is saved with the current connection. Subsequent ct lookups with the table specified will expose this metadata as the "ct_mark" field in the flow. For example, to allow new TCP connections from port 1->2 and only allow established connections from port 2->1, and to associate a mark with those connections:
	    table=0,priority=1,action=drop
	    table=0,arp,action=normal
	    table=0,in_port=1,tcp,action=ct(commit,exec(set_field:1->ct_mark)),2
	    table=0,in_port=2,ct_state=-trk,tcp,action=ct(table=1)
	    table=1,in_port=2,ct_state=+trk,ct_mark=1,tcp,action=1
	Signed-off-by: Joe Stringer <joestringer@nicira.com> Acked-by: Jarno Rajahalme <jrajahalme@nicira.com> Acked-by: Ben Pfaff <blp@nicira.com>
*	Add support for connection tracking.	Joe Stringer	2015-10-13	1	-3/+8
	This patch adds a new action and fields to OVS that allow connection tracking to be performed. This support works in conjunction with the Linux kernel support merged into the Linux-4.3 development cycle.
	Packets have two possible states with respect to connection tracking: Untracked packets have not previously passed through the connection tracker, while tracked packets have previously been through the connection tracker. For OpenFlow pipeline processing, untracked packets can become tracked, and they will remain tracked until the end of the pipeline. Tracked packets cannot become untracked.
	Connections can be unknown, uncommitted, or committed. Packets which are untracked have unknown connection state. To know the connection state, the packet must become tracked. Uncommitted connections have no connection state stored about them, so it is only possible for the connection tracker to identify whether they are a new connection or whether they are invalid. Committed connections have connection state stored beyond the lifetime of the packet, which allows later packets in the same connection to be identified as part of the same established connection, or related to an existing connection - for instance ICMP error responses.
	The new 'ct' action transitions the packet from "untracked" to "tracked" by sending this flow through the connection tracker. The following parameters are supported initially:
	- "commit": When commit is executed, the connection moves from uncommitted state to committed state. This signals that information about the connection should be stored beyond the lifetime of the packet within the pipeline. This allows future packets in the same connection to be recognized as part of the same "established" (est) connection, as well as identifying packets in the reply (rpl) direction, or packets related to an existing connection (rel).
	- "zone=[u16|NXM]": Perform connection tracking in the zone specified. Each zone is an independent connection tracking context. When the "commit" parameter is used, the connection will only be committed in the specified zone, and not in other zones. This is 0 by default.
	- "table=NUMBER": Fork pipeline processing in two. The original instance of the packet will continue processing the current actions list as an untracked packet. An additional instance of the packet will be sent to the connection tracker, which will be re-injected into the OpenFlow pipeline to resume processing in the specified table, with the ct_state and other ct match fields set. If the table is not specified, then the packet is submitted to the connection tracker, but the pipeline does not fork and the ct match fields are not populated. It is strongly recommended to specify a table later than the current table to prevent loops.
	When the "table" option is used, the packet that continues processing in the specified table will have the ct_state populated. The ct_state may have any of the following flags set:
	- Tracked (trk): Connection tracking has occurred.
	- Reply (rpl): The flow is in the reply direction.
	- Invalid (inv): The connection tracker couldn't identify the connection.
	- New (new): This is the beginning of a new connection.
	- Established (est): This is part of an already existing connection.
	- Related (rel): This connection is related to an existing connection.
	For more information, consult the ovs-ofctl(8) man pages.
	Below is a simple example flow table to allow outbound TCP traffic from port 1 and drop traffic from port 2 that was not initiated by port 1:
	    table=0,priority=1,action=drop
	    table=0,arp,action=normal
	    table=0,in_port=1,tcp,ct_state=-trk,action=ct(commit,zone=9),2
	    table=0,in_port=2,tcp,ct_state=-trk,action=ct(zone=9,table=1)
	    table=1,in_port=2,ct_state=+trk+est,tcp,action=1
	    table=1,in_port=2,ct_state=+trk+new,tcp,action=drop
	Based on original design by Justin Pettit, contributions from Thomas Graf and Daniele Di Proietto. Signed-off-by: Joe Stringer <joestringer@nicira.com> Acked-by: Jarno Rajahalme <jrajahalme@nicira.com> Acked-by: Ben Pfaff <blp@nicira.com>
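The ct_state match syntax used in the example flows (for instance `+trk+est` or `-trk`) can be read as a value/mask pair over the flags listed above: `+flag` requires the flag to be set, `-flag` requires it to be clear, and unmentioned flags are wildcarded. The parser below is a minimal sketch of that reading, not the OVS implementation; the bit positions and all identifiers ending in `_SKETCH` or `_sketch` are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical bit assignments; only the flag names come from the text. */
enum {
    CT_NEW_SKETCH = 1 << 0, CT_EST_SKETCH = 1 << 1, CT_REL_SKETCH = 1 << 2,
    CT_RPL_SKETCH = 1 << 3, CT_INV_SKETCH = 1 << 4, CT_TRK_SKETCH = 1 << 5,
};

static const struct { const char *name; uint32_t bit; } ct_names_sketch[] = {
    { "new", CT_NEW_SKETCH }, { "est", CT_EST_SKETCH },
    { "rel", CT_REL_SKETCH }, { "rpl", CT_RPL_SKETCH },
    { "inv", CT_INV_SKETCH }, { "trk", CT_TRK_SKETCH },
};

/* Parse a string such as "+trk-new" into a value/mask pair.  Returns false
 * on a missing sign, an unknown flag name, or a flag given twice. */
static bool
ct_state_match_parse_sketch(const char *s, uint32_t *value, uint32_t *mask)
{
    assert(s && value && mask);
    *value = *mask = 0;
    while (*s) {
        char sign = *s++;
        if (sign != '+' && sign != '-') {
            return false;
        }
        size_t len = strcspn(s, "+-");   /* Length of the flag name. */
        uint32_t bit = 0;
        size_t n = sizeof ct_names_sketch / sizeof ct_names_sketch[0];
        for (size_t i = 0; i < n; i++) {
            if (strlen(ct_names_sketch[i].name) == len
                && !strncmp(s, ct_names_sketch[i].name, len)) {
                bit = ct_names_sketch[i].bit;
                break;
            }
        }
        if (!bit || (*mask & bit)) {
            return false;
        }
        *mask |= bit;
        if (sign == '+') {
            *value |= bit;
        }
        s += len;
    }
    return true;
}
```

A packet's ct_state would then match when `(ct_state & mask) == value`.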
*	flow: Fix MSVC compile errors.	Ben Pfaff	2015-08-31	1	-3/+3
	This fixes some MSVC build errors introduced by commit 74ff3298c (userspace: Define and use struct eth_addr.) MSVC doesn't like the change in 'const' between function declaration and definition: it reports "formal parameter 2 different from declaration" for each of the functions in flow.h corrected by this commit. (I think it's technically wrong about that, standards-wise.) MSVC doesn't like an empty-brace initializer. (I think it's technically right about that, standards-wise.) This commit attempts to fix both problems, but I have not tested it with MSVC. CC: Jarno Rajahalme <jrajahalme@nicira.com> Signed-off-by: Ben Pfaff <blp@nicira.com> Acked-by: Nithin Raju <nithin@vmware.com> Tested-by: Nithin Raju <nithin@vmware.com>
*	userspace: Define and use struct eth_addr.	Jarno Rajahalme	2015-08-28	1	-4/+4
	Define struct eth_addr and use it instead of a uint8_t array for all ethernet addresses in OVS userspace. The struct is always the right size, and it can be assigned without an explicit memcpy, which makes code more readable. "struct eth_addr" is a good type name for this as many utility functions are already named accordingly. struct eth_addr can be accessed as bytes as well as ovs_be16's, which makes the struct 16-bit aligned. All use seems to be 16-bit aligned, so some algorithms on the ethernet addresses can be made a bit more efficient making use of this fact. As the struct fits into a register (in 64-bit systems) we pass it by value when possible. This patch also changes the few uses of Linux specific ETH_ALEN to OVS's own ETH_ADDR_LEN, and removes the OFP_ETH_ALEN, as it is no longer needed. This work stemmed from a desire to make all struct flow members assignable for unrelated exploration purposes. However, I think this might be a nice code readability improvement by itself. Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
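The idea of an Ethernet address type that is viewable both as bytes and as 16-bit words, assignable without memcpy, and passed by value can be sketched as below. This is an illustration under assumptions, not the OVS definition: `eth_addr_sketch` and its helpers are hypothetical names, and plain `uint16_t` stands in for `ovs_be16`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Six bytes that can also be read as three 16-bit words.  The 16-bit view
 * gives the struct 2-byte alignment. */
struct eth_addr_sketch {
    union {
        uint8_t ea[6];
        uint16_t be16[3];   /* Stand-in for ovs_be16 (assumption). */
    };
};

_Static_assert(sizeof(struct eth_addr_sketch) == 6,
               "an Ethernet address is exactly 6 bytes");

/* Passed by value: the struct fits in a register on 64-bit systems. */
static inline bool
eth_addr_equals_sketch(struct eth_addr_sketch a, struct eth_addr_sketch b)
{
    /* Three 16-bit comparisons instead of six byte comparisons. */
    return a.be16[0] == b.be16[0]
           && a.be16[1] == b.be16[1]
           && a.be16[2] == b.be16[2];
}

static inline bool
eth_addr_is_zero_sketch(struct eth_addr_sketch a)
{
    return !(a.be16[0] | a.be16[1] | a.be16[2]);
}
```

Because the address is a struct rather than an array, `dst = src;` copies it directly, which is the readability gain the commit message describes.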
*	flow: Workaround for GCC false-positive compilation error.	Jarno Rajahalme	2015-08-27	1	-3/+9
	Without an explicit bounds check GCC 4.9 issues an array out of bounds error. This patch adds explicit checks which will however be optimized away as the relevant parameters are compile-time constants. Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com> Acked-by: Ben Pfaff <blp@nicira.com>
*	meta-flow: Minor refactoring.	Jarno Rajahalme	2015-08-26	1	-0/+2
	Change mf_mask_field_and_prereqs() to take a struct flow_wildcards pointer instead of a struct flow pointer so that we can use WC_MASK_FIELD() and WC_MASK_FIELD_MASK() macros to wildcard fields. Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com> Acked-by: Ben Pfaff <blp@nicira.com>
*	ofproto-dpif-rid: Make lookups cheaper.	Jarno Rajahalme	2015-08-26	1	-21/+1
	This patch removes a large-ish copy from the recirculation context lookup, which is performed for each recirculated upcall and revalidation of a recirculating flow. Tunnel metadata has grown large since the addition of Geneve options, and copying that metadata for performing a lookup is not necessary. Change recirc_metadata to use a pointer to struct flow_tnl, and only copy the tunnel metadata when needed, and only copy as little of it as possible. Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com> Acked-by: Ben Pfaff <blp@nicira.com>
*	flow: Add struct flowmap.	Jarno Rajahalme	2015-08-26	1	-155/+339
	Struct miniflow is now sometimes used just as a map. Define a new struct flowmap for that purpose. The flowmap is defined as an array of maps, and it is automatically sized according to the size of struct flow, so it will be easier to maintain in the future. It would have been tempting to use the existing struct bitmap for this purpose. The main reason this is not feasible at the moment is that some flowmap algorithms are simpler when it can be assumed that no struct flow member requires more bits than can fit in a single map unit. The tunnel member already requires more than 32 bits, so the map unit needs to be 64 bits wide. Performance critical algorithms enumerate the flowmap array units explicitly, as it is easier for the compiler to optimize, compared to the normal iterator. Without this optimization a classifier lookup without wildcard masks would be about 25% slower. With this more general (and maintainable) algorithm the classifier lookups are about 5% slower, when the struct flow actually becomes big enough to require a second map. This negates the performance gained in the "Pre-compute stage masks" patch earlier in the series. Requested-by: Ben Pfaff <blp@nicira.com> Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com> Acked-by: Ben Pfaff <blp@nicira.com>
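The flowmap idea described above, an array of 64-bit map units whose length is derived from the size of the flow structure, can be sketched as follows. This is a simplified illustration rather than the OVS code: it assumes one map bit per 64-bit word of the flow, uses a stand-in `flow_sketch` of 80 words (640 bytes) so that a second unit is needed, and all `_SKETCH`/`_sketch` identifiers are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for struct flow: 80 64-bit words, i.e. more than one map unit. */
struct flow_sketch {
    uint64_t words[80];
};

#define FLOW_U64S_SKETCH (sizeof(struct flow_sketch) / sizeof(uint64_t))
#define MAP_BITS_SKETCH 64
/* Number of 64-bit units, rounded up; follows the flow size automatically. */
#define FLOWMAP_UNITS_SKETCH \
    ((FLOW_U64S_SKETCH + MAP_BITS_SKETCH - 1) / MAP_BITS_SKETCH)

struct flowmap_sketch {
    uint64_t bits[FLOWMAP_UNITS_SKETCH];
};

/* Mark 64-bit word 'idx' of the flow as present. */
static inline void
flowmap_set_sketch(struct flowmap_sketch *fm, size_t idx)
{
    assert(idx < FLOW_U64S_SKETCH);
    fm->bits[idx / MAP_BITS_SKETCH] |= UINT64_C(1) << (idx % MAP_BITS_SKETCH);
}

static inline bool
flowmap_is_set_sketch(const struct flowmap_sketch *fm, size_t idx)
{
    assert(idx < FLOW_U64S_SKETCH);
    return (fm->bits[idx / MAP_BITS_SKETCH] >> (idx % MAP_BITS_SKETCH)) & 1;
}

/* Count set bits, walking the units explicitly and skipping empty ones. */
static inline unsigned int
flowmap_count_sketch(const struct flowmap_sketch *fm)
{
    unsigned int n = 0;
    for (size_t u = 0; u < FLOWMAP_UNITS_SKETCH; u++) {
        for (uint64_t map = fm->bits[u]; map; map &= map - 1) {
            n++;   /* Each iteration clears the lowest set bit. */
        }
    }
    return n;
}
```

Growing `flow_sketch` past 128 words would add a third unit without any change to the helper functions, which is the maintainability point the commit message makes.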
*	classifier: Pre-compute stage masks.	Jarno Rajahalme	2015-08-26	1	-0/+8
	This makes stage mask computation happen only when a subtable is inserted and allows simplification of the main lookup function. Classifier benchmark shows that this speeds up the classification (with wildcards) about 5%. Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com> Acked-by: Ben Pfaff <blp@nicira.com>
*	dpif-netdev: Translate Geneve options per-flow, not per-packet.	Jesse Gross	2015-08-05	1	-2/+11
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The kernel implementation of Geneve options stores the TLV option data in the flow exactly as received, without any further parsing. This is then translated to known options for the purposes of matching on flow setup (which will then install a datapath flow in the form the kernel is expecting). The userspace implementation behaves a little bit differently - it looks up known options as each packet is received. The reason for this is there is a much tighter coupling between datapath and flow translation and the representation is generally expected to be the same. This works but it incurs work on a per-packet basis that could be done per-flow instead. This introduces a small translation step for Geneve packets between datapath and flow lookup for the userspace datapath in order to allow the same kind of processing that the kernel does. A side effect of this is that unknown options are now shown when flows are dumped via ovs-appctl dpif/dump-flows, similar to the kernel. There is a second benefit to this as well: for some operations it is preferable to keep the options exactly as they were received on the wire, which this enables. One example is that for packets that are executed from ofproto-dpif-upcall to the datapath, this avoids the translation of Geneve metadata. Since this conversion is potentially lossy (for unknown options), keeping everything in the same format removes the possibility of dropping options if the packet comes back up to userspace and the Geneve option translation table has changed. To help with these types of operations, most functions can understand both formats of data and seamlessly do the right thing. Signed-off-by: Jesse Gross <jesse@nicira.com> Acked-by: Jarno Rajahalme <jrajahalme@nicira.com>
*	flow: Split miniflow's map.	Jarno Rajahalme	2015-07-17	1	-62/+130
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Use two maps in miniflow to allow for expansion of struct flow past 512 bytes. We now have one map for tunnel related fields, and another for the rest of the packet metadata and actual packet header fields. This split has the benefit that for non-tunneled packets the overhead should be minimal. Some miniflow utilities now exist in two variants, new ones operating over all the data, and the old ones operating only on a single 64-bit map at a time. The old ones require doubling of code but should execute faster, so those are used in the datapath and classifier's lookup path. Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com> Acked-by: Ben Pfaff <blp@nicira.com>
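A rough sketch of how a two-map compressed flow could be read, assuming one 64-bit map for tunnel words, one for all other words, and the present values packed contiguously in map order. The layout, the 128-word limit, and all names (demo_miniflow, demo_miniflow_get) are assumptions for illustration, not the actual struct miniflow.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical two-map compressed flow: words 0..63 are tracked by
 * 'tnl_map', words 64..127 by 'pkt_map'.  Only words whose bit is set
 * are stored, packed in order: tunnel words first, then the rest. */
struct demo_miniflow {
    uint64_t tnl_map;
    uint64_t pkt_map;
    const uint64_t *values;
};

static unsigned demo_count_1bits(uint64_t x)
{
    unsigned n = 0;
    while (x) {
        x &= x - 1;         /* Clear the lowest set bit. */
        n++;
    }
    return n;
}

/* Fetches flow word 'idx' (0..127).  Returns false if the word is absent,
 * which in a compressed flow means it is implicitly zero. */
static bool demo_miniflow_get(const struct demo_miniflow *mf, unsigned idx,
                              uint64_t *value)
{
    assert(idx < 128);
    uint64_t map = idx < 64 ? mf->tnl_map : mf->pkt_map;
    uint64_t bit = UINT64_C(1) << (idx % 64);

    if (!(map & bit)) {
        return false;
    }
    /* Position within this map = number of set bits below 'bit'. */
    unsigned ofs = demo_count_1bits(map & (bit - 1));
    if (idx >= 64) {
        /* Non-tunnel values are stored after all tunnel values. */
        ofs += demo_count_1bits(mf->tnl_map);
    }
    *value = mf->values[ofs];
    return true;
}
```

With this layout a non-tunneled packet has tnl_map equal to zero, so the extra popcount contributes nothing, which matches the claim that the split adds little overhead in that case.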
*	flow: Make compile with MSVC.	Jarno Rajahalme	2015-07-16	1	-7/+21
\| \| \| \| \| \| \| \| \|	MSVC does not like zero sized arrays in structs. Hence, remove the 'values' member from struct miniflow and add back the getters miniflow_values() and miniflow_get_values(). Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com> Acked-by: Ben Pfaff <blp@nicira.com>
*	tunneling: Allow matching and setting tunnel 'OAM' flag.	Jesse Gross	2015-07-15	1	-6/+15
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Several encapsulation formats have the concept of an 'OAM' bit which typically is used with networking tracing tools to distinguish test packets from real traffic. OVS already internally has support for this, however, it doesn't do anything with it and it also isn't exposed for controllers to use. This enables support through OpenFlow. There are several other tunnel flags which are consumed internally by OVS. It's not clear that it makes sense to use them externally so this does not expose those flags - although it should be easy to do so if necessary in the future. Signed-off-by: Jesse Gross <jesse@nicira.com> Acked-by: Ben Pfaff <blp@nicira.com>
*	flow: Factor out flag parsing and formatting routines.	Jesse Gross	2015-07-15	1	-1/+4
\| \| \| \| \| \| \| \| \|	There are several implementations of functions that parse/format flags and their binary representation. This factors them out into common routines. In addition to reducing code, it also makes things more consistent across different parts of OVS. Signed-off-by: Jesse Gross <jesse@nicira.com>
*	flow: Eliminate miniflow_clone() and minimask_clone().	Jarno Rajahalme	2015-07-15	1	-6/+2
\| \| \| \| \| \| \| \| \| \| \|	miniflow_clone() and minimask_clone() are no longer used, remove them from the API. Now that miniflow data is always inlined, it makes sense to rename miniflow_clone_inline() to miniflow_clone(). Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com> Acked-by: Ben Pfaff <blp@nicira.com>
*	match: Single malloc minimatch.	Jarno Rajahalme	2015-07-15	1	-2/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Allocate the miniflow and minimask in struct minimatch at once, so that they are consecutive in memory. This halves the number of allocations, and allows smaller minimatches to share the same cache line. After this a minimatch has one heap allocation for all its data. Previously it had either none (when data was small enough to fit in struct miniflow's inline buffer), or two (when the inline buffer was insufficient). Hopefully always having one performs almost the same as none or two, on average. Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com> Acked-by: Ben Pfaff <blp@nicira.com>
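The single-allocation layout described above can be sketched as follows: one malloc() holds both variable-length parts back to back, and the match keeps a pointer to each. The types and functions (demo_mini, demo_match, demo_match_init) are hypothetical and much simpler than the real struct minimatch. The values are placed after the header without a zero-sized array member, in the spirit of the MSVC note elsewhere in this log.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical compressed flow header; its 'n' values follow it in memory. */
struct demo_mini {
    uint64_t map;
};

/* Accessor for the values stored directly after the header. */
static uint64_t *demo_mini_values(struct demo_mini *m)
{
    return (uint64_t *) (m + 1);
}

/* Hypothetical match: both pointers refer into the same heap block. */
struct demo_match {
    struct demo_mini *flow;
    struct demo_mini *mask;
};

/* Builds a match with a single allocation.  Returns 0 on success, -1 if
 * the allocation fails. */
static int demo_match_init(struct demo_match *dm, uint64_t map,
                           const uint64_t *flow_vals,
                           const uint64_t *mask_vals, size_t n)
{
    assert(n == 0 || (flow_vals && mask_vals));

    size_t one = sizeof(struct demo_mini) + n * sizeof(uint64_t);
    char *block = malloc(2 * one);
    if (!block) {
        return -1;
    }

    dm->flow = (struct demo_mini *) block;
    dm->mask = (struct demo_mini *) (block + one);  /* Right after 'flow'. */
    dm->flow->map = map;
    dm->mask->map = map;
    if (n) {
        memcpy(demo_mini_values(dm->flow), flow_vals, n * sizeof(uint64_t));
        memcpy(demo_mini_values(dm->mask), mask_vals, n * sizeof(uint64_t));
    }
    return 0;
}

/* 'flow' is the start of the block, so one free() releases both parts. */
static void demo_match_destroy(struct demo_match *dm)
{
    free(dm->flow);
    dm->flow = NULL;
    dm->mask = NULL;
}
```

Since the flow and mask are adjacent, a small match spans only a few consecutive words, which is what lets both parts land in the same cache line.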
*	flow: Always inline miniflows.	Jarno Rajahalme	2015-07-15	1	-71/+26
\| \| \| \| \| \| \| \| \| \| \|	Now that performance critical code already inlines miniflows and minimasks, we can simplify struct miniflow by always dynamically allocating miniflows and minimasks to the correct size. This changes the struct minimatch to always contain pointers to its miniflow and minimask. Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com> Acked-by: Ben Pfaff <blp@nicira.com>
*	hash: Add symmetric L3/L4 hash functions for multipath, bundle hashing.	Jeroen van Bemmel	2015-07-08	1	-0/+2
\| \| \| \| \| \|	Signed-off-by: Jeroen van Bemmel <jvb127@gmail.com> [blp@nicira.com made code style fixes, expanded documentation] Signed-off-by: Ben Pfaff <blp@nicira.com>
*	mcast-snooping: Add Multicast Listener Discovery support	Thadeu Lima de Souza Cascardo	2015-07-01	1	-0/+25
\| \| \| \| \| \| \| \| \| \| \| \| \|	Add support for MLDv1 and MLDv2. The behavior is not that different from IGMP. Packets to all-hosts address and queries are always flooded, reports go to routers, routers are added when a query is observed, and all MLD packets go through slow path. Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@redhat.com> Cc: Flavio Leitner <fbl@redhat.com> Cc: Ben Pfaff <blp@nicira.com> [blp@nicira.com moved an assignment out of an 'if' statement] Signed-off-by: Ben Pfaff <blp@nicira.com>
*	tunnel: Geneve TLV handling support for OpenFlow.	Jesse Gross	2015-06-25	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The current support for Geneve in OVS is exactly equivalent to VXLAN: it is possible to set and match on the VNI but not on any options contained in the header. This patch enables the use of options. The goal for Geneve support is not to add support for any particular option but to allow end users or controllers to specify what they would like to match. That is, the full range of Geneve's capabilities should be exposed without modifying the code (the one exception being options that require per-packet computation in the fast path). The main issue with supporting Geneve options is how to integrate the fields into the existing OpenFlow pipeline. All existing operations are referred to by their NXM/OXM field name - matches, action generation, arithmetic operations (i.e. transfer to a register). However, the Geneve option space is exactly the same as the OXM space, so a direct mapping is not feasible. Instead, we create a pool of 64 NXMs that are then dynamically mapped on Geneve option TLVs using OpenFlow. Once mapped, these fields become first-class citizens in the OpenFlow pipeline. An example of how to use Geneve options: ovs-ofctl add-geneve-map br0 {class=0xffff,type=0,len=4}->tun_metadata0 ovs-ofctl add-flow br0 in_port=LOCAL,actions=set_field:0xffffffff->tun_metadata0,1 This will add a 4-byte option (filled with all 1's) to all packets coming from the LOCAL port and then send them out to port 1. A limitation of this patch is that although the option table is specified for a particular switch over OpenFlow, it is currently global to all switches. This will be addressed in a future patch. Based on work originally done by Madhu Challa. Ben Pfaff also significantly improved the comments. Signed-off-by: Madhu Challa <challa@noironetworks.com> Signed-off-by: Jesse Gross <jesse@nicira.com> Acked-by: Ben Pfaff <blp@nicira.com>
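The dynamic mapping described above, a fixed pool of 64 metadata fields bound at run time to Geneve option TLVs, might be modeled roughly as below. This is a toy table with a linear search, not the OVS implementation; the names (demo_tlv_table, demo_tlv_map_add, demo_tlv_map_lookup) and the rule that a (class, type) pair may be bound to only one slot are assumptions made for the sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Size of the field pool, taken from the "pool of 64 NXMs" in the text. */
#define DEMO_TUN_METADATA_SLOTS 64

/* One binding from a Geneve option (class, type, len) to a pool slot. */
struct demo_tlv_key {
    uint16_t opt_class;
    uint8_t type;
    uint8_t len;            /* Option data length in bytes. */
    bool in_use;
};

struct demo_tlv_table {
    struct demo_tlv_key slots[DEMO_TUN_METADATA_SLOTS];
};

/* Returns the slot bound to (opt_class, type), or -1 if there is none. */
static int demo_tlv_map_lookup(const struct demo_tlv_table *t,
                               uint16_t opt_class, uint8_t type)
{
    for (int i = 0; i < DEMO_TUN_METADATA_SLOTS; i++) {
        const struct demo_tlv_key *k = &t->slots[i];
        if (k->in_use && k->opt_class == opt_class && k->type == type) {
            return i;
        }
    }
    return -1;
}

/* Binds 'slot' to the given option.  Fails if the slot is out of range or
 * already taken, or if the option is already bound to some slot. */
static bool demo_tlv_map_add(struct demo_tlv_table *t, unsigned slot,
                             uint16_t opt_class, uint8_t type, uint8_t len)
{
    if (slot >= DEMO_TUN_METADATA_SLOTS || t->slots[slot].in_use
        || demo_tlv_map_lookup(t, opt_class, type) >= 0) {
        return false;
    }
    t->slots[slot].opt_class = opt_class;
    t->slots[slot].type = type;
    t->slots[slot].len = len;
    t->slots[slot].in_use = true;
    return true;
}
```

In terms of the commit's example, the add-geneve-map command corresponds to binding slot 0 to class 0xffff, type 0, length 4; a received option with that class and type would then resolve to slot 0.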