path: root/lib/packets.h
Commit message (Author, Age, Files, Lines)
* ovn: Add 'na' action and lflow for ND (Zong Kai LI, 2016-07-02, 1 file, -0/+4)

  This patch adds ND support to OVN, as the IPv6 counterpart of ARP. It adds
  a new OVN action 'na' on the ovn-controller side, and modifies the lflows
  in ovn-northd for the 'na' action and the relevant packets.

  First, ovn-northd generates lflows for each lport from its IPv6 addresses
  and MAC address, using the 'na' action, such as:

      match=(icmp6 && icmp6.type == 135 &&
             (nd.target == fd81:ce49:a948:0:f816:3eff:fe46:8a42 ||
              nd.target == fd81:ce49:b123:0:f816:3eff:fe46:8a42)),
      action=(na { eth.src = fa:16:3e:46:8a:42;
                   nd.tll = fa:16:3e:46:8a:42;
                   outport = inport;
                   inport = ""; /* Allow sending out inport. */
                   output; };)

  The new lflows are installed in table ls_in_arp_nd_rsp, which is renamed
  from the previous ls_in_arp_rsp.

  Later, when ovn-controller receives an ND packet, it builds a template NA
  packet for the reply. The NA packet is initialized from the ND packet, so
  that the NA packet uses:
  - ND packet eth.src as eth.dst,
  - ND packet eth.dst as eth.src,
  - ND packet ip6.src as ip6.dst,
  - ND packet nd.target as ip6.src,
  - ND packet eth.dst as nd.tll.

  Finally, the nested actions in the 'na' action update the necessary fields
  of the NA packet, such as:
  - eth.src, nd.tll
  - inport, outport

  Since the patch port for the IPv6 router interface is not ready yet, this
  patch only deals with ND from VMs.

  This patch sets the RSO flags to 011 for NA packets.

  This patch also modifies the current ACL lflows for ND so that conntrack
  is not performed on ND and NA packets in the following tables:
  - S_SWITCH_IN_PRE_ACL
  - S_SWITCH_OUT_PRE_ACL
  - S_SWITCH_IN_ACL
  - S_SWITCH_OUT_ACL

  Signed-off-by: Zong Kai LI <zealokii@gmail.com>
  [blp@ovn.org made several minor simplifications and improvements]
  Signed-off-by: Ben Pfaff <blp@ovn.org>
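  The field mapping listed in the commit message above can be sketched in C
  as follows. This is only an illustration: struct sketch_nd_fields and
  sketch_init_na_from_ns() are hypothetical names, not the ovn-controller
  code, and copying nd.target into the NA follows RFC 4861 rather than
  anything stated in the message.

```c
#include <assert.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical flat view of the fields the commit message talks about. */
struct sketch_nd_fields {
    uint8_t eth_src[6];
    uint8_t eth_dst[6];
    struct in6_addr ip6_src;
    struct in6_addr ip6_dst;
    struct in6_addr nd_target;
    uint8_t nd_tll[6];
    uint8_t rso_flags;          /* Router/Solicited/Override as 3 bits. */
};

/* Initialize template NA 'na' from the received solicitation 'ns'.
 * Nested actions (eth.src, nd.tll, inport, outport) would run afterwards. */
static void
sketch_init_na_from_ns(const struct sketch_nd_fields *ns,
                       struct sketch_nd_fields *na)
{
    assert(ns != na);
    memset(na, 0, sizeof *na);
    memcpy(na->eth_dst, ns->eth_src, sizeof na->eth_dst);
    memcpy(na->eth_src, ns->eth_dst, sizeof na->eth_src);
    na->ip6_dst = ns->ip6_src;
    na->ip6_src = ns->nd_target;
    na->nd_target = ns->nd_target;  /* Assumption: per RFC 4861. */
    memcpy(na->nd_tll, ns->eth_dst, sizeof na->nd_tll);
    na->rso_flags = 0x3;            /* Binary 011: R=0, S=1, O=1. */
}
```

  The template deliberately starts with eth.dst of the request as nd.tll;
  in the lflow shown above, the nested actions then overwrite eth.src and
  nd.tll with the lport's MAC address.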
* tunnel: Add IP ECN related functions. (Pravin B Shelar, 2016-05-18, 1 file, -0/+7)

  Set and get functions for the IP explicit congestion notification flag.
  These functions will be used by the STT reassembly code.

  Signed-off-by: Pravin B Shelar <pshelar@ovn.org>
  Acked-by: Jesse Gross <jesse@kernel.org>
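  As a rough illustration of what such get and set helpers do: ECN occupies
  the two least-significant bits of the IPv4 TOS byte (RFC 3168). The names
  below are hypothetical and are not the functions added by this commit.

```c
#include <assert.h>
#include <stdint.h>

/* ECN is the low two bits of the TOS byte; the upper six are the DSCP. */
#define SKETCH_ECN_MASK 0x03u

/* Hypothetical helper: extract the ECN codepoint from a TOS byte. */
static inline uint8_t
sketch_tos_get_ecn(uint8_t tos)
{
    return tos & SKETCH_ECN_MASK;
}

/* Hypothetical helper: return 'tos' with its ECN bits replaced by 'ecn',
 * leaving the DSCP bits untouched. */
static inline uint8_t
sketch_tos_set_ecn(uint8_t tos, uint8_t ecn)
{
    assert(ecn <= SKETCH_ECN_MASK);
    return (uint8_t) ((tos & ~SKETCH_ECN_MASK) | ecn);
}
```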
* netdev-vport: Factor-out tunnel Push-pop code into separate module. (Pravin B Shelar, 2016-05-18, 1 file, -0/+9)

  It is better to move the functions specific to tunnel push/pop actions
  into a separate module.

  Signed-off-by: Pravin B Shelar <pshelar@ovn.org>
  Acked-by: Jesse Gross <jesse@kernel.org>
* packets: use flow protocol when recalculating ipv6 checksums (Simon Horman, 2016-04-23, 1 file, -1/+1)

  When using masked actions, the ipv6_proto field of an action to set IPv6
  fields may be zero rather than the prevailing protocol, which will result
  in skipping checksum recalculation.

  This patch resolves the problem by relying on the protocol in the packet
  rather than that in the set field action.

  A similar fix for the kernel datapath has been accepted into David
  Miller's 'net' tree as b4f70527f052 ("openvswitch: use flow protocol when
  recalculating ipv6 checksums").

  Cc: Jarno Rajahalme <jrajahalme@nicira.com>
  Fixes: 6d670e7f0d45 ("lib/odp: Masked set action execution and printing.")
  Signed-off-by: Simon Horman <simon.horman@netronome.com>
  Acked-by: Ben Pfaff <blp@ovn.org>
* Break packets.h into private and public parts (Ben Warren, 2016-04-14, 1 file, -42/+1)

  The public parts (struct definitions and some prototypes) go in
  include/openvswitch.

  Signed-off-by: Ben Warren <ben@skyportsystems.com>
  Signed-off-by: Ben Pfaff <blp@ovn.org>
* Move lib/geneve.h to include/openvswitch directory (Ben Warren, 2016-03-19, 1 file, -1/+1)

  Signed-off-by: Ben Warren <ben@skyportsystems.com>
  Signed-off-by: Ben Pfaff <blp@ovn.org>
* ovn: Add l3 port security for IPv4 and IPv6 (Numan Siddique, 2016-03-18, 1 file, -0/+16)

  This patch extends port security to support L3.

  The ingress stage 'ls_in_port_sec' is renamed to 'ls_in_port_sec_l2', and
  two new stages, 'ls_in_port_sec_ip' (table 1) and 'ls_in_port_sec_nd'
  (table 2), are added.

  'ls_in_port_sec_ip' adds flows to restrict IPv4 and IPv6 traffic to the
  valid IPv4 and IPv6 addresses of the port. 'ls_in_port_sec_nd' adds flows
  to restrict ARP and IPv6 ND packets.

  For the egress pipeline, 'ls_out_port_sec' is renamed to
  'ls_out_port_sec_l2', and a new stage 'ls_out_port_sec_ip' is added before
  'ls_out_port_sec_l2' to restrict IPv4 and IPv6 traffic to valid IPs.

  Signed-off-by: Numan Siddique <nusiddiq@redhat.com>
  Co-authored-by: Ben Pfaff <blp@ovn.org>
  Signed-off-by: Ben Pfaff <blp@ovn.org>
* actions: Implement OVN "arp" action. (Ben Pfaff, 2016-03-11, 1 file, -1/+2)

  An upcoming commit will use this as a building block in adding ARP support
  to the OVN L3 logical router implementation.

  Signed-off-by: Ben Pfaff <blp@ovn.org>
* ovn-northd: Allow lport 'addresses' to store multiple ips in each set (Numan Siddique, 2016-02-25, 1 file, -0/+10)

  If a logical port has two ipv4 addresses and one ipv6 address, it will be
  stored as ["MAC IPv41 IPv42 IPv61"] instead of
  ["MAC IPv41", "MAC IPv42", "MAC IPv61"].

  Signed-off-by: Numan Siddique <nusiddiq@redhat.com>
  [blp@ovn.org made changes to comments and ovn.at]
  Signed-off-by: Ben Pfaff <blp@ovn.org>
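  To make the "MAC IPv41 IPv42 IPv61" layout concrete, here is a minimal
  parsing sketch. The struct and function names are hypothetical, it only
  counts the addresses instead of storing them, and its MAC parsing is
  looser than a real parser would be; it is not the code this commit added.

```c
#define _POSIX_C_SOURCE 200809L

#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical result: the MAC plus a count of each address family. */
struct sketch_lport_addrs {
    unsigned char mac[6];
    int n_ipv4;
    int n_ipv6;
};

/* Parse one set such as "fa:16:3e:46:8a:42 10.0.0.4 fd81::1".
 * Returns 0 on success, -1 on a malformed or overlong entry. */
static int
sketch_parse_lport_addrs(const char *s, struct sketch_lport_addrs *out)
{
    char copy[256];
    char *save = NULL;
    char *tok;
    unsigned int m[6];

    assert(s && out);
    if (strlen(s) >= sizeof copy) {
        return -1;
    }
    strcpy(copy, s);
    memset(out, 0, sizeof *out);

    /* The first token must be the Ethernet address. */
    tok = strtok_r(copy, " ", &save);
    if (!tok || sscanf(tok, "%2x:%2x:%2x:%2x:%2x:%2x",
                       &m[0], &m[1], &m[2], &m[3], &m[4], &m[5]) != 6) {
        return -1;
    }
    for (int i = 0; i < 6; i++) {
        out->mac[i] = (unsigned char) m[i];
    }

    /* Every remaining token is an IPv4 or an IPv6 address. */
    while ((tok = strtok_r(NULL, " ", &save)) != NULL) {
        unsigned char buf[sizeof(struct in6_addr)];

        if (inet_pton(AF_INET, tok, buf) == 1) {
            out->n_ipv4++;
        } else if (inet_pton(AF_INET6, tok, buf) == 1) {
            out->n_ipv6++;
        } else {
            return -1;
        }
    }
    return 0;
}
```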
* dpif-netdev: Delay packets' metadata initialization. (Daniele Di Proietto, 2016-02-02, 1 file, -0/+8)

  When a group of packets arrives from a port, we loop through them to
  initialize metadata and then we loop through them again to extract the
  flow and perform the exact match classification.

  This commit combines the two loops into one, and initializes packet->md in
  emc_processing() to improve performance.

  Since emc_processing() might also be called after recirculation (in which
  case the metadata is already valid), an extra parameter is added to
  support both cases.

  This commit also implements simple prefetching of packet metadata, to
  further improve performance.

  Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
  Acked-by: Andy Zhou <azhou@ovn.org>
  Acked-by: Chandran, Sugesh <sugesh.chandran@intel.com>
* packets: Add new functions for IPv4 and IPv6 address parsing. (Ben Pfaff, 2015-12-15, 1 file, -0/+7)

  These will be used in an upcoming patch to reduce duplicated code.

  Signed-off-by: Ben Pfaff <blp@ovn.org>
  Acked-by: Justin Pettit <jpettit@ovn.org>
* packets: New macro ETH_ADDR_STRLEN. (Ben Pfaff, 2015-12-15, 1 file, -0/+1)

  An upcoming commit will introduce another user.

  Signed-off-by: Ben Pfaff <blp@ovn.org>
  Acked-by: Justin Pettit <jpettit@ovn.org>
* ovn: Use constants for conntrack state bits. (Russell Bryant, 2015-12-15, 1 file, -8/+21)

  A previous commit fixed this code to match changes to the conntrack state
  bit assignments. This patch further updates the code to use the defined
  constants to ensure this code adapts automatically to any possible future
  changes.

  Signed-off-by: Russell Bryant <russell@ovn.org>
  Requested-by: Joe Stringer <joe@ovn.org>
  Acked-by: Joe Stringer <joe@ovn.org>
* ofproto-dpif-xlate: Support IPv6 when sending to tunnel (Thadeu Lima de Souza Cascardo, 2015-12-04, 1 file, -0/+24)

  When doing push/pop and building the tunnel header, do IPv6 route lookups
  and send Neighbor Solicitations if needed.

  Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@redhat.com>
  Cc: Flavio Leitner <fbl@sysclose.org>
  Signed-off-by: Ben Pfaff <blp@ovn.org>
* netdev-vport: Add IPv6 support for build/push/pop tunnel header (Thadeu Lima de Souza Cascardo, 2015-12-04, 1 file, -0/+4)

  This includes VXLAN, GRE and Geneve.

  Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@redhat.com>
  Signed-off-by: Ben Pfaff <blp@ovn.org>
* packets: Introduce in6_addr_mapped_ipv4() and use where appropriate. (Ben Pfaff, 2015-12-04, 1 file, -5/+10)

  This allows code to be written more naturally in some cases.

  Signed-off-by: Ben Pfaff <blp@ovn.org>
  Acked-by: Thadeu Lima de Souza Cascardo <cascardo@redhat.com>
* tunneling: extend flow_tnl with ipv6 addresses (Jiri Benc, 2015-11-30, 1 file, -2/+16)

  Note that because there's been no prerequisite on the outer protocol, we
  cannot add it now. Instead, treat the ipv4 and ipv6 dst fields in the way
  that either both are null, or at most one of them is non-null.

  [cascardo: abstract testing either dst with flow_tnl_dst_is_set]
  cascardo: using IPv4-mapped address is an exercise for the future, since
  this would require special handling of MFF_TUN_SRC and MFF_TUN_DST and
  OpenFlow messages.

  Signed-off-by: Jiri Benc <jbenc@redhat.com>
  Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@redhat.com>
  Co-authored-by: Thadeu Lima de Souza Cascardo <cascardo@redhat.com>
  Signed-off-by: Ben Pfaff <blp@ovn.org>
* packets: Add ipv6_string_mapped. (Thadeu Lima de Souza Cascardo, 2015-11-30, 1 file, -0/+1)

  ipv6_string_mapped stores an IPv6 or IPv4 representation of an IPv6
  address into a string. If the address is IPv4-mapped, it's represented in
  IPv4 dotted-decimal format.

  Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@redhat.com>
  Signed-off-by: Ben Pfaff <blp@ovn.org>
* conntrack: Add support for NAT. (Jarno Rajahalme, 2015-11-25, 1 file, -1/+6)

  Extend the OVS conntrack interface to cover NAT. A new nested NAT action
  may be included with a CT action. A bare NAT action only mangles existing
  connections. If a NAT action with a src or dst range attribute is
  included, new (non-committed) connections are mangled according to the NAT
  attributes.

  This work builds on a branch by Thomas Graf at
  https://github.com/tgraf/ovs/tree/nat.

  Signed-off-by: Jarno Rajahalme <jarno@ovn.org>
  Acked-by: Ben Pfaff <blp@ovn.org>
* packets: Reorder CS_* flags to remove gap. (Jarno Rajahalme, 2015-11-25, 1 file, -3/+3)

  This changes the conntrack state flags used in the OpenFlow interface to
  match the ones we currently use in the datapath. While these do not need
  to be synced, it is nice to get rid of the gap.

  This should be merged before the first OVS release with connection
  tracking, or not at all.

  Signed-off-by: Jarno Rajahalme <jarno@ovn.org>
  Acked-by: Ben Pfaff <blp@ovn.org>
* packets: Add ipv6_parse_masked() function. (Justin Pettit, 2015-11-24, 1 file, -0/+2)

  Signed-off-by: Justin Pettit <jpettit@ovn.org>
  Acked-by: Ben Pfaff <blp@ovn.org>
* packets: Change IPv6 functions to more closely resemble IPv4 ones. (Justin Pettit, 2015-11-24, 1 file, -5/+4)

  Signed-off-by: Justin Petitt <jpettit@ovn.org>
  Acked-by: Ben Pfaff <blp@ovn.org>
* packets: Add support for modifying ICMP type and code. (Justin Pettit, 2015-11-09, 1 file, -0/+1)

  Signed-off-by: Justin Pettit <jpettit@nicira.com>
  Acked-by: Flavio Leitner <fbl@sysclose.org>
* packets: New function ip_parse_masked(). (Ben Pfaff, 2015-10-16, 1 file, -0/+2)

  Signed-off-by: Ben Pfaff <blp@nicira.com>
  Acked-by: Justin Pettit <jpettit@nicira.com>
* Add connection tracking label support. (Joe Stringer, 2015-10-13, 1 file, -0/+1)

  This patch adds a new 128-bit metadata field to the connection tracking
  interface. When a label is specified as part of the ct action and the
  connection is committed, the value is saved with the current connection.
  Subsequent ct lookups with the table specified will expose this metadata
  as the "ct_label" field in the flow.

  For example, to allow new TCP connections from port 1->2 and only allow
  established connections from port 2->1, and to associate a label with
  those connections:

      table=0,priority=1,action=drop
      table=0,arp,action=normal
      table=0,in_port=1,tcp,action=ct(commit,exec(set_field:1->ct_label)),2
      table=0,in_port=2,ct_state=-trk,tcp,action=ct(table=1)
      table=1,in_port=2,ct_state=+trk,ct_label=1,tcp,action=1

  Signed-off-by: Joe Stringer <joestringer@nicira.com>
  Acked-by: Jarno Rajahalme <jrajahalme@nicira.com>
  Acked-by: Ben Pfaff <blp@nicira.com>
* Add connection tracking mark support. (Joe Stringer, 2015-10-13, 1 file, -0/+1)

  This patch adds a new 32-bit metadata field to the connection tracking
  interface. When a mark is specified as part of the ct action and the
  connection is committed, the value is saved with the current connection.
  Subsequent ct lookups with the table specified will expose this metadata
  as the "ct_mark" field in the flow.

  For example, to allow new TCP connections from port 1->2 and only allow
  established connections from port 2->1, and to associate a mark with those
  connections:

      table=0,priority=1,action=drop
      table=0,arp,action=normal
      table=0,in_port=1,tcp,action=ct(commit,exec(set_field:1->ct_mark)),2
      table=0,in_port=2,ct_state=-trk,tcp,action=ct(table=1)
      table=1,in_port=2,ct_state=+trk,ct_mark=1,tcp,action=1

  Signed-off-by: Joe Stringer <joestringer@nicira.com>
  Acked-by: Jarno Rajahalme <jrajahalme@nicira.com>
  Acked-by: Ben Pfaff <blp@nicira.com>
* Add support for connection tracking. (Joe Stringer, 2015-10-13, 1 file, -1/+17)

  This patch adds a new action and fields to OVS that allow connection
  tracking to be performed. This support works in conjunction with the Linux
  kernel support merged into the Linux-4.3 development cycle.

  Packets have two possible states with respect to connection tracking:
  Untracked packets have not previously passed through the connection
  tracker, while tracked packets have previously been through the connection
  tracker. For OpenFlow pipeline processing, untracked packets can become
  tracked, and they will remain tracked until the end of the pipeline.
  Tracked packets cannot become untracked.

  Connections can be unknown, uncommitted, or committed. Packets which are
  untracked have unknown connection state. To know the connection state, the
  packet must become tracked. Uncommitted connections have no connection
  state stored about them, so it is only possible for the connection tracker
  to identify whether they are a new connection or whether they are invalid.
  Committed connections have connection state stored beyond the lifetime of
  the packet, which allows later packets in the same connection to be
  identified as part of the same established connection, or related to an
  existing connection - for instance ICMP error responses.

  The new 'ct' action transitions the packet from "untracked" to "tracked"
  by sending this flow through the connection tracker. The following
  parameters are supported initially:

  - "commit": When commit is executed, the connection moves from uncommitted
    state to committed state. This signals that information about the
    connection should be stored beyond the lifetime of the packet within the
    pipeline. This allows future packets in the same connection to be
    recognized as part of the same "established" (est) connection, as well
    as identifying packets in the reply (rpl) direction, or packets related
    to an existing connection (rel).
  - "zone=[u16|NXM]": Perform connection tracking in the zone specified.
    Each zone is an independent connection tracking context. When the
    "commit" parameter is used, the connection will only be committed in the
    specified zone, and not in other zones. This is 0 by default.
  - "table=NUMBER": Fork pipeline processing in two. The original instance
    of the packet will continue processing the current actions list as an
    untracked packet. An additional instance of the packet will be sent to
    the connection tracker, which will be re-injected into the OpenFlow
    pipeline to resume processing in the specified table, with the ct_state
    and other ct match fields set. If the table is not specified, then the
    packet is submitted to the connection tracker, but the pipeline does not
    fork and the ct match fields are not populated. It is strongly
    recommended to specify a table later than the current table to prevent
    loops.

  When the "table" option is used, the packet that continues processing in
  the specified table will have the ct_state populated. The ct_state may
  have any of the following flags set:

  - Tracked (trk): Connection tracking has occurred.
  - Reply (rpl): The flow is in the reply direction.
  - Invalid (inv): The connection tracker couldn't identify the connection.
  - New (new): This is the beginning of a new connection.
  - Established (est): This is part of an already existing connection.
  - Related (rel): This connection is related to an existing connection.

  For more information, consult the ovs-ofctl(8) man pages.

  Below is a simple example flow table to allow outbound TCP traffic from
  port 1 and drop traffic from port 2 that was not initiated by port 1:

      table=0,priority=1,action=drop
      table=0,arp,action=normal
      table=0,in_port=1,tcp,ct_state=-trk,action=ct(commit,zone=9),2
      table=0,in_port=2,tcp,ct_state=-trk,action=ct(zone=9,table=1)
      table=1,in_port=2,ct_state=+trk+est,tcp,action=1
      table=1,in_port=2,ct_state=+trk+new,tcp,action=drop

  Based on original design by Justin Pettit, contributions from Thomas Graf
  and Daniele Di Proietto.

  Signed-off-by: Joe Stringer <joestringer@nicira.com>
  Acked-by: Jarno Rajahalme <jrajahalme@nicira.com>
  Acked-by: Ben Pfaff <blp@nicira.com>
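  A match such as ct_state=+trk+est in the example flows requires some flags
  to be set and, with a "-" prefix, others to be clear. The sketch below
  shows one way to evaluate that against a flag word. The SKETCH_CS_* bit
  positions are made up for illustration and are not the CS_* values from
  packets.h.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit assignments for the six flags named above. */
enum {
    SKETCH_CS_NEW = 1 << 0,
    SKETCH_CS_EST = 1 << 1,
    SKETCH_CS_REL = 1 << 2,
    SKETCH_CS_RPL = 1 << 3,
    SKETCH_CS_INV = 1 << 4,
    SKETCH_CS_TRK = 1 << 5,
};

/* Return true if 'state' has every bit of 'must_set' set ("+flag") and
 * every bit of 'must_clear' clear ("-flag"); other bits are wildcarded. */
static bool
sketch_ct_state_matches(uint32_t state, uint32_t must_set,
                        uint32_t must_clear)
{
    assert(!(must_set & must_clear));
    return (state & (must_set | must_clear)) == must_set;
}
```

  With these definitions, ct_state=-trk corresponds to must_set = 0 and
  must_clear = SKETCH_CS_TRK, which is what the table=0 flows rely on to
  catch packets that have not yet been through the connection tracker.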
* lib: Add ipv6 helper functions. (Jiri Benc, 2015-10-05, 1 file, -0/+8)

  ipv6_addr_is_set is going to be used by next patches.

  [cascardo: compare with in6addr_any in ipv6_addr_is_set]
  [cascardo: keep only ipv6_addr_is_* functions]

  Signed-off-by: Jiri Benc <jbenc@redhat.com>
  Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@redhat.com>
  Signed-off-by: Ben Pfaff <blp@nicira.com>
* packets: Provide functions to work with IPv4-mapped IPv6 addresses. (Thadeu Lima de Souza Cascardo, 2015-10-05, 1 file, -0/+21)

  Move in6_addr_set_mapped_ipv4 out of mcast-snooping code to packets.h and
  provide an in6_addr_get_mapped_ipv4 function that gets the corresponding
  IPv4 address or the ANY address if it's not IPv4 mapped.

  Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@redhat.com>
  Signed-off-by: Ben Pfaff <blp@nicira.com>
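  The behavior described above can be sketched with standard socket headers
  as follows. The sketch_* names are hypothetical stand-ins; the real
  helpers live in packets.h and may differ in signature and types.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>

/* Store IPv4 address 'ip4' (network byte order) as ::ffff:a.b.c.d. */
static void
sketch_set_mapped_ipv4(struct in6_addr *ip6, uint32_t ip4)
{
    assert(ip6 != NULL);
    memset(ip6, 0, sizeof *ip6);
    ip6->s6_addr[10] = 0xff;
    ip6->s6_addr[11] = 0xff;
    memcpy(&ip6->s6_addr[12], &ip4, sizeof ip4);
}

/* Return the embedded IPv4 address (network byte order), or INADDR_ANY
 * if 'ip6' is not an IPv4-mapped address. */
static uint32_t
sketch_get_mapped_ipv4(const struct in6_addr *ip6)
{
    uint32_t ip4 = htonl(INADDR_ANY);

    assert(ip6 != NULL);
    if (IN6_IS_ADDR_V4MAPPED(ip6)) {
        memcpy(&ip4, &ip6->s6_addr[12], sizeof ip4);
    }
    return ip4;
}
```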
* netdev-dpdk: Fix build failure due to new struct eth_addr. (Aaron Conole, 2015-09-05, 1 file, -2/+3)

  netdev-dpdk uses struct ether_addr rather than the internal OVS datatype
  struct eth_addr. To facilitate using either the .ea OR the struct
  ether_addr.addr_bytes argument for printing/logging, add a new
  ETH_ADDR_BYTES_ARG() define.

  Signed-off-by: Aaron Conole <aconole@redhat.com>
  [blp@nicira.com made stylistic changes]
  Signed-off-by: Ben Pfaff <blp@nicira.com>
* packets: Avoid compile errors. (Aaron Conole, 2015-09-04, 1 file, -2/+2)

  Commit 74ff3298c880 (userspace: Define and use struct eth_addr.)
  introduced a compilation issue due to a bad unsigned 64-bit constant, as
  well as an implicit narrowing conversion.

  This commit uses the C99 ULL suffix to tell the compiler to treat the
  constant as 64 bits wide, and also masks portions of the uint64_t argument
  to the htons() calls to avoid compiler errors.

  Signed-off-by: Aaron Conole <aconole@redhat.com>
  Signed-off-by: Ben Pfaff <blp@nicira.com>
* userspace: Define and use struct eth_addr. (Jarno Rajahalme, 2015-08-28, 1 file, -91/+103)

  Define struct eth_addr and use it instead of a uint8_t array for all
  ethernet addresses in OVS userspace. The struct is always the right size,
  and it can be assigned without an explicit memcpy, which makes code more
  readable.

  "struct eth_addr" is a good type name for this as many utility functions
  are already named accordingly.

  struct eth_addr can be accessed as bytes as well as ovs_be16's, which
  makes the struct 16-bit aligned. All use seems to be 16-bit aligned, so
  some algorithms on the ethernet addresses can be made a bit more efficient
  making use of this fact.

  As the struct fits into a register (in 64-bit systems) we pass it by value
  when possible.

  This patch also changes the few uses of Linux specific ETH_ALEN to OVS's
  own ETH_ADDR_LEN, and removes the OFP_ETH_ALEN, as it is no longer needed.

  This work stemmed from a desire to make all struct flow members assignable
  for unrelated exploration purposes. However, I think this might be a nice
  code readability improvement by itself.

  Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
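  The properties listed in this message (fixed size, plain assignment,
  byte and 16-bit views, pass by value) can be illustrated with a small
  union-based type. The sketch_eth_addr names are hypothetical and use
  uint16_t where OVS uses ovs_be16; the actual definition is in the OVS
  headers and should be consulted for the real layout.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_ETH_ADDR_LEN 6

/* Six bytes, viewable as bytes or as three 16-bit words.  The 16-bit view
 * is what gives the struct its 2-byte alignment. */
struct sketch_eth_addr {
    union {
        uint8_t ea[SKETCH_ETH_ADDR_LEN];
        uint16_t be16[SKETCH_ETH_ADDR_LEN / 2];
    };
};

static_assert(sizeof(struct sketch_eth_addr) == SKETCH_ETH_ADDR_LEN,
              "the struct must be exactly as large as the address");

/* Compare by value using three 16-bit loads instead of a byte loop. */
static inline bool
sketch_eth_addr_equals(struct sketch_eth_addr a, struct sketch_eth_addr b)
{
    return a.be16[0] == b.be16[0]
           && a.be16[1] == b.be16[1]
           && a.be16[2] == b.be16[2];
}

/* True for 00:00:00:00:00:00. */
static inline bool
sketch_eth_addr_is_zero(struct sketch_eth_addr a)
{
    return !(a.be16[0] | a.be16[1] | a.be16[2]);
}
```

  Because the address is a struct rather than an array, "b = a;" copies it
  and both helpers can take their arguments by value.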
* ofproto-dpif-rid: Make lookups cheaper. (Jarno Rajahalme, 2015-08-26, 1 file, -2/+63)

  This patch removes a large-ish copy from the recirculation context lookup,
  which is performed for each recirculated upcall and revalidation of a
  recirculating flow.

  Tunnel metadata has grown large since the addition of Geneve options, and
  copying that metadata for performing a lookup is not necessary. Change
  recirc_metadata to use a pointer to struct flow_tnl, and only copy the
  tunnel metadata when needed, and only copy as little of it as possible.

  Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
  Acked-by: Ben Pfaff <blp@nicira.com>
* dpif-netdev: Translate Geneve options per-flow, not per-packet. (Jesse Gross, 2015-08-05, 1 file, -40/+1)

  The kernel implementation of Geneve options stores the TLV option data in
  the flow exactly as received, without any further parsing. This is then
  translated to known options for the purposes of matching on flow setup
  (which will then install a datapath flow in the form the kernel is
  expecting).

  The userspace implementation behaves a little bit differently - it looks
  up known options as each packet is received. The reason for this is there
  is a much tighter coupling between datapath and flow translation and the
  representation is generally expected to be the same. This works but it
  incurs work on a per-packet basis that could be done per-flow instead.

  This introduces a small translation step for Geneve packets between
  datapath and flow lookup for the userspace datapath in order to allow the
  same kind of processing that the kernel does. A side effect of this is
  that unknown options are now shown when flows are dumped via ovs-appctl
  dpif/dump-flows, similar to the kernel.

  There is a second benefit to this as well: for some operations it is
  preferable to keep the options exactly as they were received on the wire,
  which this enables. One example is that for packets that are executed from
  ofproto-dpif-upcall to the datapath, this avoids the translation of Geneve
  metadata. Since this conversion is potentially lossy (for unknown
  options), keeping everything in the same format removes the possibility of
  dropping options if the packet comes back up to userspace and the Geneve
  option translation table has changed. To help with these types of
  operations, most functions can understand both formats of data and
  seamlessly do the right thing.

  Signed-off-by: Jesse Gross <jesse@nicira.com>
  Acked-by: Jarno Rajahalme <jrajahalme@nicira.com>
* mcast-snooping: Add Multicast Listener Discovery support (Thadeu Lima de Souza Cascardo, 2015-07-01, 1 file, -0/+40)

  Add support for MLDv1 and MLDv2. The behavior is not that different from
  IGMP. Packets to all-hosts address and queries are always flooded, reports
  go to routers, routers are added when a query is observed, and all MLD
  packets go through slow path.

  Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@redhat.com>
  Cc: Flavio Leitner <fbl@redhat.com>
  Cc: Ben Pfaff <blp@nicira.com>
  [blp@nicira.com moved an assignment out of an 'if' statement]
  Signed-off-by: Ben Pfaff <blp@nicira.com>
* mcast-snooping: Use IPv6 address for MDB (Thadeu Lima de Souza Cascardo, 2015-07-01, 1 file, -0/+1)

  Use IPv6 internally for storing multicast addresses. IPv4 addresses are
  translated to their IPv4-mapped equivalent.

  Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@redhat.com>
  Cc: Flavio Leitner <fbl@redhat.com>
  Cc: Ben Pfaff <blp@nicira.com>
  [blp@nicira.com added a "sparse" implementation of IN6_IS_ADDR_V4MAPPED.]
  Signed-off-by: Ben Pfaff <blp@nicira.com>
* tunnels: Don't initialize unnecessary packet metadata. (Jesse Gross, 2015-07-01, 1 file, -3/+12)

  The addition of Geneve options to packet metadata significantly expanded
  its size. It was reported that this can decrease performance for DPDK
  ports by up to 25% since we need to initialize the whole structure on each
  packet receive.

  It is not really necessary to zero out the entire structure because
  miniflow_extract() only copies the tunnel metadata when particular fields
  indicate that it is valid. Therefore, as long as we zero out these fields
  when the metadata is initialized and ensure that the rest of the structure
  is correctly set in the presence of a tunnel, we can avoid touching the
  tunnel fields on packet reception.

  Reported-by: Ciara Loftus <ciara.loftus@intel.com>
  Tested-by: Ciara Loftus <ciara.loftus@intel.com>
  Signed-off-by: Jesse Gross <jesse@nicira.com>
  Acked-by: Ben Pfaff <blp@nicira.com>
* tunneling: Userspace datapath support for Geneve options. (Jesse Gross, 2015-06-26, 1 file, -0/+1)

  Currently the userspace datapath only supports Geneve in a basic mode -
  without options - since the rest of userspace previously didn't support
  options either. This enables the userspace datapath to send and receive
  options as well.

  The receive path for extracting the tunnel options isn't entirely optimal
  because it does a lookup on the options on a per-packet basis, rather than
  per-flow like the kernel does. This is not as straightforward to do in the
  userspace datapath since there is no translation step between packet
  formats used in packet vs. flow lookup. This can be optimized in the
  future and in the meantime option support is still useful for testing and
  simulation.

  Signed-off-by: Jesse Gross <jesse@nicira.com>
  Acked-by: Ben Pfaff <blp@nicira.com>
* pkt-metadata: Avoid introducing overhead for userspace tunnels. (Jesse Gross, 2015-06-25, 1 file, -1/+1)

  The addition of Geneve metadata requires a large amount of additional
  space to handle the maximum set of options. In most cases, this is not a
  big deal since it is only temporary storage on the stack or can be
  automatically stripped out for miniflows. However, userspace tunnels need
  to deal with this on a per-packet basis, so we should avoid introducing
  additional overhead if possible. Two small changes are aimed at this:

  * Move struct flow_tnl to the end of the packet metadata. Since the Geneve
    metadata is already at the end of flow_tnl and pkt_metadata is at the
    end of struct dp_packet, this avoids putting a large amount of metadata
    (which might be empty) in hot cache lines.

  * Only push the new metadata into a miniflow if any options are present
    during miniflow_extract(). This does not necessarily provide the most
    fine-grained flow generation but it is a quick check and the userspace
    implementation of Geneve does not currently support options anyway.

  Signed-off-by: Jesse Gross <jesse@nicira.com>
  Acked-by: Ben Pfaff <blp@nicira.com>
* tunnel: Geneve TLV handling support for OpenFlow. (Jesse Gross, 2015-06-25, 1 file, -0/+2)

  The current support for Geneve in OVS is exactly equivalent to VXLAN: it
  is possible to set and match on the VNI but not on any options contained
  in the header. This patch enables the use of options.

  The goal for Geneve support is not to add support for any particular
  option but to allow end users or controllers to specify what they would
  like to match. That is, the full range of Geneve's capabilities should be
  exposed without modifying the code (the one exception being options that
  require per-packet computation in the fast path).

  The main issue with supporting Geneve options is how to integrate the
  fields into the existing OpenFlow pipeline. All existing operations are
  referred to by their NXM/OXM field name - matches, action generation,
  arithmetic operations (i.e. transfer to a register). However, the Geneve
  option space is exactly the same as the OXM space, so a direct mapping is
  not feasible. Instead, we create a pool of 64 NXMs that are then
  dynamically mapped on Geneve option TLVs using OpenFlow. Once mapped,
  these fields become first-class citizens in the OpenFlow pipeline.

  An example of how to use Geneve options:

      ovs-ofctl add-geneve-map br0 {class=0xffff,type=0,len=4}->tun_metadata0
      ovs-ofctl add-flow br0 in_port=LOCAL,actions=set_field:0xffffffff->tun_metadata0,1

  This will add a 4-byte option (filled with all 1's) to all packets coming
  from the LOCAL port and then send them out to port 1.

  A limitation of this patch is that although the option table is specified
  for a particular switch over OpenFlow, it is currently global to all
  switches. This will be addressed in a future patch.

  Based on work originally done by Madhu Challa. Ben Pfaff also
  significantly improved the comments.

  Signed-off-by: Madhu Challa <challa@noironetworks.com>
  Signed-off-by: Jesse Gross <jesse@nicira.com>
  Acked-by: Ben Pfaff <blp@nicira.com>
* openflow: Table maintenance commands for Geneve options. (Jesse Gross, 2015-06-25, 1 file, -0/+3)

  In order to work with Geneve options, we need to maintain a mapping table
  between an option (defined by <class, type, length>) and an NXM field that
  can be operated on for the purposes of matches, actions, etc. This mapping
  must be explicitly specified by the user.

  Conceptually, this table could be communicated using either OpenFlow or
  OVSDB. Using OVSDB requires less code and definition of extensions than
  OpenFlow but introduces the possibility that mapping table updates and
  flow modifications are desynchronized from each other. This is dangerous
  because the mapping table significantly impacts the way that flows using
  Geneve options are installed and processed by OVS. Therefore, the mapping
  table is maintained using OpenFlow commands instead, which opens the
  possibility of using synchronization between table changes and flow
  modifications through barriers, bundles, etc.

  There are two primary groups of OpenFlow messages that are introduced as
  Nicira extensions: modification commands (add, delete, clear mappings) and
  table status request/reply to dump the current table along with switch
  information.

  Note that mappings should not be changed while they are in active use by a
  flow. The result of doing so is undefined.

  This only adds the OpenFlow infrastructure but doesn't actually do
  anything with the information yet after the messages have been decoded.

  Signed-off-by: Jesse Gross <jesse@nicira.com>
  Acked-by: Ben Pfaff <blp@nicira.com>
* Merge remote-tracking branch 'origin/master' into ovn4 (Justin Pettit, 2015-06-18, 1 file, -0/+26)
|\
| * Add IGMPv3 support. (Thadeu Lima de Souza Cascardo, 2015-06-17, 1 file, -0/+26)
| |
| |   Support IGMPv3 messages with multiple records. Make sure all IGMPv3
| |   messages go through slow path, since they may carry multiple multicast
| |   addresses, unlike IGMPv2.
| |
| |   Tests done:
| |   * multiple addresses in IGMPv3 report are inserted in mdb;
| |   * address is removed from IGMPv3 if record is INCLUDE_MODE;
| |   * reports sent on a burst with same flow all go to userspace;
| |   * IGMPv3 reports go to mrouters, i.e., ports that have issued a query.
| |
| |   Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@redhat.com>
| |   Acked-by: Flavio Leitner <fbl@sysclose.org>
| |   Signed-off-by: Ben Pfaff <blp@nicira.com>
* | packets: Generalize compose_arp(). (Ben Pfaff, 2015-06-16, 1 file, -3/+8)
|/
  Until now, compose_arp() has only been able to compose ARP requests. This
  extends it to composing general ARP packets, in particular replies. An
  upcoming commit will make use of this capability.

  Signed-off-by: Ben Pfaff <blp@nicira.com>
  Acked-by: Alex Wang <alexw@nicira.com>
* Add support functions for 8021.ad push and pop vlan. (Thomas F. Herbert, 2015-06-07, 1 file, -0/+7)

  Changes to allow the tpid to be specified and all vlan tpid checking to be
  generalized.

  Signed-off-by: Thomas F Herbert <thomasfherbert@gmail.com>
  Signed-off-by: Ben Pfaff <blp@nicira.com>
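  To show what "allow the tpid to be specified" means in practice, here is a
  minimal sketch of pushing a VLAN tag with a caller-chosen TPID (0x8100 for
  802.1Q, 0x88a8 for 802.1ad) into a raw Ethernet frame. sketch_push_vlan()
  is a hypothetical name operating on a plain byte buffer, not the OVS
  function, which works on packet buffers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_ETH_MACS_LEN 12   /* Destination plus source MAC. */
#define SKETCH_VLAN_TAG_LEN 4    /* TPID plus TCI. */

/* Insert a VLAN tag right after the two MAC addresses of 'frame', which
 * holds 'len' bytes and must have room for four more.  'tpid' and 'tci'
 * are in host byte order.  Returns the new frame length. */
static size_t
sketch_push_vlan(uint8_t *frame, size_t len, uint16_t tpid, uint16_t tci)
{
    assert(len >= SKETCH_ETH_MACS_LEN);

    /* Shift the EtherType and payload to make room for the tag. */
    memmove(frame + SKETCH_ETH_MACS_LEN + SKETCH_VLAN_TAG_LEN,
            frame + SKETCH_ETH_MACS_LEN, len - SKETCH_ETH_MACS_LEN);

    /* Write the tag in network byte order. */
    frame[12] = (uint8_t) (tpid >> 8);
    frame[13] = (uint8_t) (tpid & 0xff);
    frame[14] = (uint8_t) (tci >> 8);
    frame[15] = (uint8_t) (tci & 0xff);
    return len + SKETCH_VLAN_TAG_LEN;
}
```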
* packet: Avoid array of struct with zero length member. (Jesse Gross, 2015-04-07, 1 file, -1/+1)

  Windows doesn't like that the Geneve header has an array of options, each
  of which has a zero length member (the variable data). Nothing is
  accessing the data now, so just replace the member with a comment - we can
  use pointer arithmetic when necessary.

  Reported-by: Gurucharan Shetty <shettyg@nicira.com>
  Signed-off-by: Jesse Gross <jesse@nicira.com>
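  The "pointer arithmetic when necessary" approach might look like the
  sketch below: the option struct declares only the fixed 4-byte header, and
  the variable data is located by offset. Field widths follow the Geneve
  option format (RFC 8926); the struct and function names are hypothetical
  and are not the definitions in packets.h.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Fixed part of a Geneve option.  The variable-length data follows it in
 * memory and is deliberately not declared as a member. */
struct sketch_geneve_opt {
    uint16_t opt_class;     /* Network byte order. */
    uint8_t type;
    uint8_t length;         /* Low 5 bits: data length in 4-byte units. */
};

static_assert(sizeof(struct sketch_geneve_opt) == 4,
              "option header must be 4 bytes");

/* Walk the options in 'buf' and return how many there are, or -1 if the
 * buffer ends in the middle of an option. */
static int
sketch_geneve_count_opts(const uint8_t *buf, size_t len)
{
    int n = 0;

    while (len > 0) {
        struct sketch_geneve_opt opt;
        size_t size;

        if (len < sizeof opt) {
            return -1;
        }
        memcpy(&opt, buf, sizeof opt);      /* Avoids unaligned access. */
        size = sizeof opt + (size_t) (opt.length & 0x1f) * 4;
        if (len < size) {
            return -1;
        }
        /* This option's data, if any, starts at buf + sizeof opt. */
        buf += size;
        len -= size;
        n++;
    }
    return n;
}
```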
* packet: Add IP pseudoheader checksum calculation. (Jesse Gross, 2015-04-07, 1 file, -0/+1)

  As OVS adds userspace support for being the endpoint in protocols like
  tunnels, it will need to be able to calculate pseudoheaders as part of the
  checksum calculation.

  Signed-off-by: Jesse Gross <jesse@nicira.com>
  Acked-by: Pravin B Shelar <pshelar@nicira.com>
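  For reference, the IPv4 pseudo-header that TCP and UDP checksums cover
  consists of the source address, the destination address, a zero byte plus
  the protocol number, and the L4 length. A minimal sketch of summing it
  follows. The function names are hypothetical, and for simplicity it works
  on host-byte-order values rather than on packet headers as OVS does.

```c
#include <assert.h>
#include <stdint.h>

/* Sum the IPv4 pseudo-header as 16-bit words.  All arguments are in host
 * byte order.  The result is an unfolded partial sum to which the L4
 * header and payload words would be added next. */
static uint32_t
sketch_ipv4_pseudoheader_sum(uint32_t src, uint32_t dst, uint8_t proto,
                             uint16_t l4_len)
{
    uint32_t sum = 0;

    sum += src >> 16;
    sum += src & 0xffff;
    sum += dst >> 16;
    sum += dst & 0xffff;
    sum += proto;           /* The zero byte and protocol form one word. */
    sum += l4_len;
    return sum;
}

/* Fold a 32-bit partial sum into 16 bits with end-around carry. */
static uint16_t
sketch_csum_fold(uint32_t sum)
{
    while (sum >> 16) {
        sum = (sum & 0xffff) + (sum >> 16);
    }
    assert((sum >> 16) == 0);
    return (uint16_t) sum;
}
```

  The final checksum field would be the one's complement of the folded sum
  once the L4 header and payload have been added as well.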
* tunneling: Add userspace tunnel support for Geneve. (Jesse Gross, 2015-04-07, 1 file, -0/+19)

  This adds basic userspace dataplane support for the Geneve tunneling
  protocol. The rest of userspace only has the ability to handle Geneve
  without options and this follows that pattern for the time being. However,
  when the rest of userspace is updated it should be easy to extend the
  dataplane as well.

  Signed-off-by: Jesse Gross <jesse@nicira.com>
  Acked-by: Pravin B Shelar <pshelar@nicira.com>
* dp-packet: Remove ofpbuf dependency. (Pravin B Shelar, 2015-03-03, 1 file, -17/+17)

  Currently dp-packet makes use of ofpbuf for managing packet buffers. That
  complicates ofpbuf. By making dp-packet independent of ofpbuf, both
  libraries can be optimized for their own use case. This avoids the mapping
  operation between ofpbuf and dp_packet in datapath upcalls.

  Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
  Acked-by: Jarno Rajahalme <jrajahalme@nicira.com>
  Acked-by: Ben Pfaff <blp@nicira.com>
* ofproto: Add NXM_NX_TUN_GBP_ID and NXM_NX_TUN_GBP_FLAGS (Madhu Challa, 2015-02-14, 1 file, -0/+3)

  Introduces two new NXMs to represent VXLAN-GBP [0] fields.

      actions=load:0x10->NXM_NX_TUN_GBP_ID[],NORMAL
      tun_gbp_id=0x10,actions=drop

  This enables existing VXLAN tunnels to carry security label information
  such as an SELinux context to other network peers.

  The values are carried to/from the datapath using the attribute
  OVS_TUNNEL_KEY_ATTR_VXLAN_OPTS.

  [0] https://tools.ietf.org/html/draft-smith-vxlan-group-policy-00

  Signed-off-by: Madhu Challa <challa@noironetworks.com>
  Acked-by: Ben Pfaff <blp@nicira.com>
  Signed-off-by: Thomas Graf <tgraf@noironetworks.com>