path: root/lib/packets.h
Commit message | Author | Age | Files | Lines
* userspace: Add SRv6 tunnel support. | Nobuhiro MIKI | 2023-03-29 | 1 | -0/+15

SRv6 (Segment Routing IPv6) tunnel vport is responsible for encapsulating and decapsulating the inner packets with an IPv6 header and an extension header called SRH (Segment Routing Header). See spec in:

https://datatracker.ietf.org/doc/html/rfc8754

This patch implements SRv6 tunneling in the userspace datapath. It uses the `remote_ip` and `local_ip` options as with existing tunnel protocols. It also adds a dedicated `srv6_segs` option to define a sequence of routers called the segment list.

Signed-off-by: Nobuhiro MIKI <nmiki@yahoo-corp.jp>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>

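
As a rough illustration of the header this commit describes, the sketch below lays out the fixed part of an SRH following RFC 8754 and computes its size for a given segment list. The struct, macro, and function names are hypothetical and are not the definitions added to lib/packets.h; TLVs are assumed to be absent.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Fixed part of the Segment Routing Header (RFC 8754, section 2).
 * Hypothetical names, not the ones used in lib/packets.h. */
struct srh_fixed {
    uint8_t next_header;
    uint8_t hdr_ext_len;    /* In 8-octet units, excluding the first 8 octets. */
    uint8_t routing_type;   /* 4 identifies an SRH. */
    uint8_t segments_left;
    uint8_t last_entry;     /* Zero-based index of the last segment. */
    uint8_t flags;
    uint16_t tag;           /* Network byte order on the wire. */
};
static_assert(sizeof(struct srh_fixed) == 8, "SRH fixed part is 8 octets");

#define SRV6_SEG_LEN 16     /* One segment is one 128-bit IPv6 address. */

/* Total SRH size in bytes for 'n_segs' segments and no TLVs. */
size_t
srh_total_len(size_t n_segs)
{
    assert(n_segs > 0);
    return sizeof(struct srh_fixed) + n_segs * SRV6_SEG_LEN;
}

/* Value for the Hdr Ext Len field: two 8-octet units per segment. */
uint8_t
srh_hdr_ext_len(size_t n_segs)
{
    assert(n_segs > 0 && n_segs <= 127);
    return (uint8_t) (n_segs * SRV6_SEG_LEN / 8);
}
```

With a three-entry segment list, this gives a 56-byte SRH and a Hdr Ext Len of 6.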
* flow: Support rt_hdr in parse_ipv6_ext_hdrs(). | Nobuhiro MIKI | 2023-03-29 | 1 | -0/+9

Checks whether IPPROTO_ROUTING exists in the IPv6 extension headers. If it exists, the first address is retrieved.

If NULL is specified for "frag_hdr" and/or "rt_hdr", those addresses in the header are not reported to the caller. Of course, "frag_hdr" and "rt_hdr" are properly parsed inside this function.

Signed-off-by: Nobuhiro MIKI <nmiki@yahoo-corp.jp>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>

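
The kind of walk this commit describes can be sketched as below. This is not the OVS parse_ipv6_ext_hdrs() implementation: find_rt_hdr() and the enum names are hypothetical, and the sketch handles only the hop-by-hop, routing, fragment, and destination-options headers (an AH header, for instance, ends the walk).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* IPv6 next-header values used below (IANA protocol numbers). */
enum {
    NH_HOPOPTS = 0,
    NH_ROUTING = 43,
    NH_FRAGMENT = 44,
    NH_DSTOPTS = 60
};

/* Walks the extension header chain in 'data' (the bytes right after the
 * fixed IPv6 header), starting with next-header value 'nh'.  Returns the
 * offset of the first routing header, or -1 if there is none or the chain
 * is truncated.  Hypothetical helper, not the OVS function. */
long
find_rt_hdr(const uint8_t *data, size_t len, uint8_t nh)
{
    size_t ofs = 0;

    assert(data != NULL || len == 0);
    for (;;) {
        size_t hdr_len;

        if (nh != NH_HOPOPTS && nh != NH_ROUTING
            && nh != NH_FRAGMENT && nh != NH_DSTOPTS) {
            return -1;              /* Reached a header we do not walk. */
        }
        if (len - ofs < 8) {
            return -1;              /* Truncated extension header. */
        }
        /* The fragment header is always 8 bytes; the others encode their
         * length in 8-octet units, not counting the first 8 octets. */
        hdr_len = nh == NH_FRAGMENT ? 8 : ((size_t) data[ofs + 1] + 1) * 8;
        if (len - ofs < hdr_len) {
            return -1;
        }
        if (nh == NH_ROUTING) {
            return (long) ofs;
        }
        nh = data[ofs];             /* First byte is the next-header field. */
        ofs += hdr_len;
    }
}
```

A caller that does not care about the routing header can simply ignore the result, which mirrors the "pass NULL" convention in the commit message.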
* odp-execute: Add ISA implementation of set_masked IPv6 action | Emma Finn | 2022-12-21 | 1 | -0/+2

This commit adds support for the AVX512 implementation of the ipv6_set_addrs action as well as an AVX512 implementation of updating the L4 checksums.

Here are some relative performance numbers for this patch:

+-----------------------------+----------------+
| Actions                     | AVX with patch |
+-----------------------------+----------------+
| ipv6_src                    | 1.14x          |
+-----------------------------+----------------+
| ipv6_src + ipv6_dst         | 1.40x          |
+-----------------------------+----------------+
| ipv6_label                  | 1.14x          |
+-----------------------------+----------------+
| mod_ipv6 4 x field          | 1.43x          |
+-----------------------------+----------------+

Signed-off-by: Emma Finn <emma.finn@intel.com>
Acked-by: Eelco Chaudron <echaudro@redhat.com>
Signed-off-by: Ian Stokes <ian.stokes@intel.com>

* Encap & Decap actions for MPLS packet type. | Martin Varghese | 2022-01-17 | 1 | -0/+2

The encap & decap actions are extended to support the MPLS packet type. Encap & decap actions add and remove an MPLS header at the start of the packet.

The existing PUSH MPLS & POP MPLS actions insert and remove an MPLS header between the Ethernet header and the IP header. Though this behaviour is fine for L3 VPN, where an IP packet is encapsulated inside an MPLS tunnel, it does not satisfy the L2 VPN requirements. In L2 VPN the Ethernet packets must be encapsulated inside an MPLS tunnel.

In this change the encap & decap actions are extended to support the MPLS packet type. The encap & decap actions add and remove the MPLS header at the start of the packet as depicted below.

Encapsulation:

Actions - encap(mpls),encap(ethernet)

Incoming packet -> | ETH | IP | Payload |

1 Actions - encap(mpls) [Datapath action - ADD_MPLS:0x8847]

    Outgoing packet -> | MPLS | ETH | Payload|

2 Actions - encap(ethernet) [ Datapath action - push_eth ]

    Outgoing packet -> | ETH | MPLS | ETH | Payload|

Decapsulation:

Incoming packet -> | ETH | MPLS | ETH | IP | Payload |

Actions - decap(),decap(packet_type(ns=0,type=0))

1 Actions - decap() [Datapath action - pop_eth]

    Outgoing packet -> | MPLS | ETH | IP | Payload|

2 Actions - decap(packet_type(ns=0,type=0)) [Datapath action - POP_MPLS:0x6558]

    Outgoing packet -> | ETH | IP | Payload|

Signed-off-by: Martin Varghese <martin.varghese@nokia.com>
Acked-by: Eelco Chaudron <echaudro@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>

* packets: Correct VXLAN_GPE_FLAGS_P macro name. | lic121 | 2021-10-12 | 1 | -1/+1

Fix macro name from "VLXAN_GPE_FLAGS_P" to "VXLAN_GPE_FLAGS_P".

Signed-off-by: lic121 <lic121@chinatelecom.cn>
Reviewed-by: Greg Rose <gvrose8192@gmail.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>

* dpif-netdev: Provide orig_in_port in metadata for tunneled packets. | Sriharsha Basavapatna | 2021-06-24 | 1 | -1/+7

When an encapsulated packet is recirculated through a TUNNEL_POP action, the metadata gets reinitialized and the originating physical port information is lost. When this flow gets processed by the vport and it needs to be offloaded, we can't figure out the physical port through which the tunneled packet was received.

Add a new member to the metadata: 'orig_in_port'. This is passed to the next stage during recirculation and the offload layer can use it to offload the flow to this physical port.

Signed-off-by: Sriharsha Basavapatna <sriharsha.basavapatna@broadcom.com>
Signed-off-by: Eli Britstein <elibr@nvidia.com>
Reviewed-by: Gaetan Rivet <gaetanr@nvidia.com>
Tested-by: Emma Finn <emma.finn@intel.com>
Tested-by: Marko Kovacevic <marko.kovacevic@intel.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>

* packets: Un-inline functions needed by DDlog. | Leonid Ryzhyk | 2020-10-26 | 1 | -86/+11

DDlog uses these functions from Rust, but Rust can't use inline functions (since it doesn't compile C headers but only links against a C-compatible ABI). Thus, move the implementations of these functions to a .c file.

I don't think any of these functions is likely to be an important part of a "fast path" in OVS, but if that's wrong, then we could take another approach.

Signed-off-by: Leonid Ryzhyk <lryzhyk@vmware.com>
Co-authored-by: Ben Pfaff <blp@ovn.org>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Numan Siddique <numans@ovn.org>

* netdev-offload-tc: Expand tunnel source IPs masked match | Tonghao Zhang | 2020-06-03 | 1 | -0/+6

To support more use cases, for example DDoS mitigation, where packets should be dropped in hardware, this patch allows users to match only the tunnel source IPs with a masked value.

    $ ovs-appctl dpctl/add-flow "tunnel(src=2.2.2.0/255.255.255.0,tp_dst=4789,ttl=64),\
      recirc_id(2),in_port(3),eth(),eth_type(0x0800),ipv4()" ""

    $ ovs-appctl dpctl/dump-flows
      tunnel(src=2.2.2.0/255.255.255.0,ttl=64,tp_dst=4789) ... actions:drop

    $ tc filter show dev vxlan_sys_4789 ingress
      ...
      eth_type ipv4
      enc_src_ip 2.2.2.0/24
      enc_dst_port 4789
      enc_ttl 64
      in_hw in_hw_count 2
        action order 1: gact action drop
      ...

Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
Acked-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>

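
The flow above is written with a dotted netmask (2.2.2.0/255.255.255.0) while tc reports a prefix length (2.2.2.0/24). The sketch below shows, under the assumption of host-byte-order addresses, what a masked match means and how a contiguous netmask maps to a prefix length. All function names are hypothetical and are not taken from netdev-offload-tc.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Builds a host-order IPv4 address from dotted-quad components. */
uint32_t
ipv4_from_octets(uint8_t a, uint8_t b, uint8_t c, uint8_t d)
{
    return ((uint32_t) a << 24) | ((uint32_t) b << 16)
           | ((uint32_t) c << 8) | d;
}

/* True if 'addr' agrees with 'value' on every bit that is set in 'mask'. */
bool
ipv4_masked_match(uint32_t addr, uint32_t value, uint32_t mask)
{
    return ((addr ^ value) & mask) == 0;
}

/* Converts a contiguous netmask to a prefix length,
 * for example 255.255.255.0 to 24. */
int
ipv4_mask_to_prefix_len(uint32_t mask)
{
    int n = 0;

    /* A CIDR mask has no zero bit above a one bit. */
    assert((mask & (~mask >> 1)) == 0);
    while (mask & 0x80000000u) {
        n++;
        mask <<= 1;
    }
    return n;
}
```

For example, 2.2.2.77 matches src=2.2.2.0/255.255.255.0 because only the top 24 bits are compared, while 2.2.3.1 does not.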
* userspace: Add GTP-U support. | William Tu | 2020-03-25 | 1 | -0/+68

GTP, GPRS Tunneling Protocol, is a group of IP-based communications protocols used to carry general packet radio service (GPRS) within GSM, UMTS and LTE networks. GTP protocol has two parts: Signalling (GTP-Control, GTP-C) and User data (GTP-User, GTP-U). GTP-C is used for setting up GTP-U protocol, which is an IP-in-UDP tunneling protocol. Usually GTP is used in connecting between base station for radio, Serving Gateway (S-GW), and PDN Gateway (P-GW).

This patch implements GTP-U protocol for userspace datapath, supporting only required header fields and G-PDU message type. See spec in:

https://tools.ietf.org/html/draft-hmm-dmm-5g-uplane-analysis-00

Tested-at: https://travis-ci.org/github/williamtu/ovs-travis/builds/666518784
Signed-off-by: Feng Yang <yangfengee04@gmail.com>
Co-authored-by: Feng Yang <yangfengee04@gmail.com>
Signed-off-by: Yi Yang <yangyi01@inspur.com>
Co-authored-by: Yi Yang <yangyi01@inspur.com>
Signed-off-by: William Tu <u9012063@gmail.com>
Acked-by: Ben Pfaff <blp@ovn.org>

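
To make "only required header fields and G-PDU message type" concrete, here is a minimal sketch of the 8-byte mandatory GTPv1-U header as commonly described for GTP-U. The macro and function names are hypothetical and do not come from the 68 lines this commit adds to lib/packets.h; optional sequence number, N-PDU number, and extension headers are assumed absent.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define GTPU_HDR_LEN    8      /* Mandatory header, no optional fields. */
#define GTPU_FLAGS_V1PT 0x30   /* Version 1, protocol type GTP, E/S/PN clear. */
#define GTPU_MSG_GPDU   0xff   /* G-PDU: the message carries a user packet. */

/* Writes a minimal G-PDU header into 'out'.  'payload_len' is the size of
 * the encapsulated packet; the Length field excludes the 8 mandatory bytes. */
void
gtpu_put_header(uint8_t out[GTPU_HDR_LEN], uint16_t payload_len, uint32_t teid)
{
    assert(out != NULL);
    out[0] = GTPU_FLAGS_V1PT;
    out[1] = GTPU_MSG_GPDU;
    out[2] = payload_len >> 8;          /* Length, network byte order. */
    out[3] = payload_len & 0xff;
    out[4] = teid >> 24;                /* Tunnel endpoint identifier. */
    out[5] = (teid >> 16) & 0xff;
    out[6] = (teid >> 8) & 0xff;
    out[7] = teid & 0xff;
}
```

Writing the bytes one at a time keeps the sketch independent of host endianness and of struct packing rules.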
* packets: Fix typo in comment. | Ben Pfaff | 2020-03-05 | 1 | -1/+1

Acked-by: Han Zhou <hzhou@ovn.org>
Reported-by: Toms Atteka <tatteka@vmware.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>

* lib: packets: export compose_ipv6 routine to OVN | Lorenzo Bianconi | 2019-10-14 | 1 | -0/+3

Remove static qualifier from compose_ipv6 definition and export it to OVN. compose_ipv6 will be used in order to add IPv6 prefix delegation support to OVN.

Signed-off-by: Lorenzo Bianconi <lorenzo.bianconi@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>

* flow: Fix using pointer to member of packed struct icmp6_hdr. | Ilya Maximets | 2019-10-10 | 1 | -4/+8

OVS has no structure definition for an ICMPv6 header with additional data. More precisely, it has one, but this structure is named 'icmp6_error_header' and is only suitable for storing error-related extended information. 'flow_compose_l4' stores additional information in reserved bits by using the system-defined structure 'icmp6_hdr', which is marked as 'packed', and this leads to a build failure with gcc >= 9:

    lib/flow.c:3041:34:
      error: taking address of packed member of 'struct icmp6_hdr' may result
      in an unaligned pointer value [-Werror=address-of-packed-member]

        uint32_t *reserved = &icmp->icmp6_dataun.icmp6_un_data32[0];
                             ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Fix that by renaming 'icmp6_error_header' to 'icmp6_data_header' and allowing it to store not only errors, but any type of additional information, by analogy with 'struct icmp6_hdr'. All the usages of 'struct icmp6_hdr' are replaced with this new structure. Removed redundant conversions between network and host representations. Now fields are always in be.

This also, probably, makes flow_compose_l4 more robust by avoiding possible unaligned accesses to a 32-bit value.

Fixes: 9b2b84973db7 ("Support for match & set ICMPv6 reserved and options type fields")
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Acked-by: William Tu <u9012063@gmail.com>
Acked-by: Ben Pfaff <blp@ovn.org>

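
The warning quoted above can be reproduced and avoided on a small scale. The sketch below uses a hypothetical packed struct (not 'struct icmp6_hdr') and reads the misaligned member by copying its bytes, which is a different technique from the fix this commit applies (switching to a non-packed OVS structure). The packed attribute is a GCC/Clang extension.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical packed header, standing in for a system-defined packed
 * struct.  'pad' pushes 'data32' to an odd offset on purpose. */
struct packed_hdr {
    uint8_t type;
    uint8_t code;
    uint16_t cksum;
    uint8_t pad;
    uint32_t data32;
} __attribute__((packed));

/* Writing 'uint32_t *p = &hdr->data32;' is what gcc >= 9 flags with
 * -Waddress-of-packed-member.  Copying the bytes avoids ever forming a
 * possibly misaligned uint32_t pointer. */
uint32_t
packed_hdr_get_data32(const struct packed_hdr *hdr)
{
    uint32_t value;

    assert(hdr != NULL);
    memcpy(&value,
           (const uint8_t *) hdr + offsetof(struct packed_hdr, data32),
           sizeof value);
    return value;
}
```

Direct member access (hdr->data32) is also safe, since the compiler knows the member may be unaligned; only taking its address loses that information.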
* conntrack: Optimize recirculations. | Darrell Ball | 2019-09-25 | 1 | -0/+11

Cache the 'conn' context and use it when it is valid. The cached 'conn' context will get reset if it is not expected to be valid; the cost to do this is negligible. Besides being most optimal, this also handles corner cases, such as decapsulation leading to the same tuple, as in tunnel VPN cases. A negative test is added to check the resetting of the cached 'conn'.

Signed-off-by: Darrell Ball <dlu998@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>

* conntrack: Fix ICMPv4 error data L4 length check. | Darrell Ball | 2019-08-29 | 1 | -0/+3

The ICMPv4 error data L4 length check was found to be too strict for TCP, expecting a minimum of 20 rather than 8 bytes. This worked by happenstance for other inner protocols. The approach is to explicitly handle the ICMPv4 error data L4 length check and to do this for all supported inner protocols in the same way. Making the code common between protocols also allows the existing ICMPv4 related UDP tests to cover TCP and ICMP inner protocol cases. Note that ICMPv6 does not have an 8 byte limit for error L4 data.

Fixes: a489b16854b5 ("conntrack: New userspace connection tracker.")
CC: Daniele Di Proietto <diproiettod@ovn.org>
Reported-at: https://mail.openvswitch.org/pipermail/ovs-dev/2019-August/361949.html
Reported-by: Vishal Deep Ajmera <vishal.deep.ajmera@ericsson.com>
Signed-off-by: Vishal Deep Ajmera <vishal.deep.ajmera@ericsson.com>
Co-authored-by: Vishal Deep Ajmera <vishal.deep.ajmera@ericsson.com>
Signed-off-by: Darrell Ball <dlu998@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>

* ovs-tc: offload MPLS push actions to TC datapath | John Hurley | 2019-08-01 | 1 | -0/+24

TC can now be used to push an MPLS header onto a packet. The MPLS label is the only information that needs to be passed here, with the rest reverting to default values if none are supplied. OvS, however, gives the entire MPLS header to be pushed along with the MPLS protocol to use. TC can optionally accept these values, so it can be made to replicate the OvS datapath rule.

Convert OvS MPLS push datapath rules to TC format and offload to a TC datapath.

Signed-off-by: John Hurley <john.hurley@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>

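
Since OvS hands over a whole MPLS header while TC works with individual values, the conversion amounts to splitting a label stack entry into its fields. The sketch below does that for a host-byte-order entry using the RFC 3032 layout; the struct and function names are hypothetical and are not the helpers this commit adds.

```c
#include <assert.h>
#include <stdint.h>

/* Fields of one MPLS label stack entry (RFC 3032), host byte order. */
struct mpls_fields {
    uint32_t label;  /* 20 bits. */
    uint8_t tc;      /* 3 bits, traffic class. */
    uint8_t bos;     /* 1 bit, bottom of stack. */
    uint8_t ttl;     /* 8 bits. */
};

/* Splits a host-order label stack entry into its four fields. */
struct mpls_fields
mpls_lse_unpack(uint32_t lse)
{
    struct mpls_fields f;

    f.label = lse >> 12;
    f.tc = (lse >> 9) & 0x7;
    f.bos = (lse >> 8) & 0x1;
    f.ttl = lse & 0xff;
    return f;
}

/* Inverse of mpls_lse_unpack(). */
uint32_t
mpls_lse_pack(struct mpls_fields f)
{
    assert(f.label < (1u << 20) && f.tc < 8 && f.bos < 2);
    return (f.label << 12) | ((uint32_t) f.tc << 9)
           | ((uint32_t) f.bos << 8) | f.ttl;
}
```

An entry read from a packet would first need converting from network byte order, for example with ntohl().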
* packets: Add IGMPv3 query packet definitions | Dumitru Ceara | 2019-07-16 | 1 | -1/+18

Signed-off-by: Dumitru Ceara <dceara@redhat.com>
Acked-by: Mark Michelson <mmichels@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>

* OVN: Make periodic RAs consistent with RA responder. | Mark Michelson | 2019-03-25 | 1 | -3/+0

This commit makes periodic RAs from OVN consistent with the RAs sent in response to RSs. Specifically, this ensures that prefix flags are set correctly for each address mode.

This commit also gets rid of some redundant definitions for RA prefix option flags from packets.h in favor of the ones in ovn-l7.h.

Signed-off-by: Mark Michelson <mmichels@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>

* packets: Change return type for 'packet_csum_upperlayer6()'. | Darrell Ball | 2019-02-22 | 1 | -1/+1

Signed-off-by: Darrell Ball <dlu998@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>

* Support for match & set ICMPv6 reserved and options type fields | Vishal Deep Ajmera | 2019-02-04 | 1 | -0/+3

Currently OVS supports all ARP protocol fields as OXM match fields to implement the relevant ARP procedures for IPv4. This includes support for matching, copying and setting ARP fields. In IPv6, ARP has been replaced by ICMPv6 neighbor discovery (ND) procedures, neighbor advertisement and neighbor solicitation.

The support for ICMPv6 fields in OVS is not complete for the use cases equivalent to ARP in IPv4. OVS lacks support for matching, copying and setting the "ND option type" and "ND reserved" fields. Without these, a user cannot implement all ICMPv6 ND procedures for IPv6 support.

This commit adds additional OXM fields to OVS for ICMPv6 "ND option type" and ICMPv6 "ND reserved" using the OXM extension mechanism. This allows support for parsing these fields from an ICMPv6 packet header and extending the OpenFlow protocol with specifications for these new OXM fields for matching, copying and setting.

Signed-off-by: Vishal Deep Ajmera <vishal.deep.ajmera@ericsson.com>
Co-authored-by: Ashvin Lakshmikantha <ashvin.lakshmikantha@ericsson.com>
Signed-off-by: Ashvin Lakshmikantha <ashvin.lakshmikantha@ericsson.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>

* erspan: add big endian bit fields. | William Tu | 2018-08-21 | 1 | -0/+21

Big-endian systems arrange bit fields in the opposite order. The patch follows the Linux kernel's approach by defining the big- and little-endian bit-fields of the ERSPAN header using #ifdef.

Tested on zelenka.debian.org (https://db.debian.org/machines.cgi?host=zelenka).

Tested-by: Ben Pfaff <blp@ovn.org>
Reported-by: James Page <james.page@canonical.com>
Reported-at: https://mail.openvswitch.org/pipermail/ovs-dev/2018-August/351382.html
Signed-off-by: William Tu <u9012063@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>

* userspace: add erspan tunnel support. | William Tu | 2018-05-21 | 1 | -0/+126

ERSPAN is a tunneling protocol based on GRE tunnel. The patch adds erspan tunnel support for ovs-vswitchd with the userspace datapath. Configuring an erspan tunnel is similar to configuring a gre tunnel, but with additional erspan parameters. Matching a flow on erspan's metadata is also supported; see ovs-fields for more details.

Signed-off-by: William Tu <u9012063@gmail.com>
Signed-off-by: Greg Rose <gvrose8192@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>

* Make <host>:<port> parsing uniform treewide. | Ben Pfaff | 2018-04-16 | 1 | -10/+0

I didn't realize until now that the tree had two different ways of parsing strings in the form <host>:<port> and <port>:<host>. There are the long-standing inet_parse_active() and inet_parse_passive() functions, and more recently the ipv46_parse() function. This commit eliminates the latter and changes the code to use the former.

The two implementations interpreted some input differently. In particular, the older functions required IPv6 addresses to be [bracketed], but the newer ones do not. For compatibility this patch changes the merged code to use the more liberal interpretation.

Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Mark Michelson <mmichels@redhat.com>

* packets: Prefetch the packet metadata in cacheline1. | Bhanuprakash Bodireddy | 2018-01-12 | 1 | -1/+6

pkt_metadata_prefetch_init() is used to prefetch the packet metadata before initializing the metadata in pkt_metadata_init(). This is done for every packet in the userspace datapath and is performance critical.

Commit 99fc16c0 prefetches only cacheline0 and cacheline2, as the metadata part of the respective cachelines will be initialized by pkt_metadata_init().

However, in the VXLAN case, when popping the vxlan header, netdev_vxlan_pop_header() invokes pkt_metadata_init_tnl(), which zeroes out the metadata part of cacheline1 that wasn't prefetched earlier, and this causes performance degradation.

By prefetching cacheline1, a 9% performance improvement is observed with the vxlan decapsulation test case for packet sizes of 118 bytes. Performance variation is observed based on CFLAGS.

    CFLAGS="-O2"                    CFLAGS="-O2 -msse4.2"
    Master      4.667 Mpps          Master      4.710 Mpps
    With Patch  5.045 Mpps          With Patch  5.097 Mpps

    CFLAGS="-O2 -march=native"      CFLAGS="-Ofast -march=native"
    Master      5.072 Mpps          Master      5.349 Mpps
    With Patch  5.193 Mpps          With Patch  5.378 Mpps

Fixes: 99fc16c0 ("Reorganize the pkt_metadata structure.")
Signed-off-by: Bhanuprakash Bodireddy <bhanuprakash.bodireddy@intel.com>
Acked-by: Sugesh Chandran <sugesh.chandran@intel.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>

* nsh: rework NSH netlink keys and actions | Yi Yang | 2018-01-08 | 1 | -3/+2

This patch changes OVS_KEY_ATTR_NSH to a nested attribute and adds three new NSH sub attribute keys:

    OVS_NSH_KEY_ATTR_BASE: for length-fixed NSH base header
    OVS_NSH_KEY_ATTR_MD1:  for length-fixed MD type 1 context
    OVS_NSH_KEY_ATTR_MD2:  for length-variable MD type 2 metadata

Its intention is to align to the NSH kernel implementation.

NSH match fields, set and PUSH_NSH action all use the below nested attribute format:

    OVS_KEY_ATTR_NSH begin
        OVS_NSH_KEY_ATTR_BASE
        OVS_NSH_KEY_ATTR_MD1
    OVS_KEY_ATTR_NSH end

or

    OVS_KEY_ATTR_NSH begin
        OVS_NSH_KEY_ATTR_BASE
        OVS_NSH_KEY_ATTR_MD2
    OVS_KEY_ATTR_NSH end

In addition, NSH encap and decap actions are renamed as push_nsh and pop_nsh to meet the action naming convention.

Signed-off-by: Yi Yang <yi.y.yang@intel.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>

* OVN: Add support for periodic router advertisements. | Mark Michelson | 2018-01-05 | 1 | -1/+13

This change adds three new options to the Northbound Logical_Router_Port's ipv6_ra_configs option:

  * send_periodic: If set to "true", then OVN will send periodic router advertisements out of this router port.
  * max_interval: The maximum amount of time to wait between sending periodic router advertisements.
  * min_interval: The minimum amount of time to wait between sending periodic router advertisements.

When send_periodic is true, then IPv6 RA configs, as well as some layer 2 and layer 3 information about the router port, are copied to the southbound database. From there, ovn-controller can use this information to know when to send periodic RAs and what to send in them.

Because periodic RAs originate from each ovn-controller, the new keep-local flag is set on the packet so that ports don't receive an overabundance of RAs.

Signed-off-by: Mark Michelson <mmichels@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>

* lib: Adding explicit typecasts to fix C++ compilation issues | Shireesh Singh | 2017-12-19 | 1 | -5/+10

C++ does not allow implicit conversion from void pointer to a specific pointer type. This change adds explicit typecasts to appropriate types wherever needed.

Signed-off-by: Shireesh Kumar Singh <shireeshkum@vmware.com>
Signed-off-by: Sairam Venugopal <vsairam@vmware.com>
Co-authored-by: Sairam Venugopal <vsairam@vmware.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>

* types: New macros ETH_ADDR_C and ETH_ADDR64_C. | Ben Pfaff | 2017-11-29 | 1 | -7/+7

These macros expand to constants of type struct eth_addr and struct eth_addr64, respectively, and make it more convenient to initialize or assign to an Ethernet address object.

Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Mark Michelson <mmichels@redhat.com>

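
A macro of this kind can be built with preprocessor token pasting. The sketch below imitates the idea with a simplified struct and a hypothetical MY_ETH_ADDR_C macro; the real definitions of struct eth_addr and ETH_ADDR_C in OVS may differ in detail.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for OVS's 'struct eth_addr'. */
struct my_eth_addr {
    uint8_t ea[6];
};

/* Expands six hex-digit pairs into a brace initializer; the ## operator
 * pastes the "0x" prefix onto each argument.  Hypothetical macro that only
 * imitates the idea behind ETH_ADDR_C. */
#define MY_ETH_ADDR_C(A, B, C, D, E, F) \
    { { 0x##A, 0x##B, 0x##C, 0x##D, 0x##E, 0x##F } }

/* Usable directly as an initializer... */
const struct my_eth_addr my_eth_addr_broadcast
    = MY_ETH_ADDR_C(ff, ff, ff, ff, ff, ff);

/* ...or, wrapped in a C99 compound literal, in an assignment. */
void
my_eth_addr_set_stp(struct my_eth_addr *dst)
{
    assert(dst != NULL);
    *dst = (struct my_eth_addr) MY_ETH_ADDR_C(01, 80, c2, 00, 00, 00);
}
```

The arguments are written without the 0x prefix, so an address reads much like its usual colon-separated form.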
* packets: Fix C++ compilation issues when include packets.h | Yi-Hung Wei | 2017-11-02 | 1 | -1/+3

This patch fixes three C++ compilation errors when it includes "lib/packets.h".

1) Fix in "include/openvswitch/util.h" is to avoid duplicated named_member__ in struct pkt_metadata.

2) Fix in "lib/packets.h" is because designated initializers are not implemented in GNU C++ [1].

3) Fix in "lib/util.h" is because __builtin_types_compatible_p and __builtin_choose_expr are only supported in GCC. I use one solution for C++ that is type-safe and works at compile time from [2].

[1]: https://gcc.gnu.org/onlinedocs/gcc/Designated-Inits.html
[2]: https://goo.gl/xNe48A

Signed-off-by: Yi-Hung Wei <yihung.wei@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>

* Add general-purpose IP/port parsing function. | Mark Michelson | 2017-11-02 | 1 | -0/+10

OVS has functions for parsing IPv4 addresses, parsing IPv4 addresses with a port, and parsing IPv6 addresses. What is lacking though is a function that can take an IPv4 or IPv6 address, with or without a port.

This commit adds ipv46_parse(), which breaks the given input string into its component parts and stores them in a sockaddr_storage structure. The function accepts flags that determine how it should behave if a port is present in the input string.

Signed-off-by: Mark Michelson <mmichels@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>

* packets: Add ESP header and trailer. | Ian Stokes | 2017-10-31 | 1 | -0/+14

This patch introduces structs for both ESP headers and ESP trailers along with expected size assertions.

Signed-off-by: Ian Stokes <ian.stokes@intel.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>

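
Structs with size assertions of the sort this commit mentions might look like the sketch below, which follows the ESP layout in RFC 4303. The struct names are hypothetical, plain uint32_t stands in for a big-endian type, and C11 static_assert stands in for whatever assertion macro the patch uses.

```c
#include <assert.h>
#include <stdint.h>

/* ESP header per RFC 4303; both fields are in network byte order. */
struct my_esp_header {
    uint32_t spi;       /* Security Parameters Index. */
    uint32_t seq_no;    /* Sequence number. */
};
static_assert(sizeof(struct my_esp_header) == 8, "ESP header is 8 bytes");

/* ESP trailer; it follows the payload data and its padding. */
struct my_esp_trailer {
    uint8_t pad_len;    /* Number of padding bytes before this trailer. */
    uint8_t next_hdr;   /* Protocol number of the encapsulated payload. */
};
static_assert(sizeof(struct my_esp_trailer) == 2, "ESP trailer is 2 bytes");
```

The assertions fail the build if a compiler ever inserts padding, so a layout mismatch is caught at compile time.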
* Generic encap and decap support for NSH | Jan Scheurich | 2017-08-07 | 1 | -0/+4

This commit adds translation and netdev datapath support for generic encap and decap actions for the NSH MD1 header. The generic encap and decap actions are mapped to specific encap_nsh and decap_nsh actions in the datapath.

The translation follows the general scheme that decap() of an NSH packet triggers recirculation after decapsulation, while encap(nsh) just modifies struct flow and sets the ctx->pending_encap flag to generate the encap_nsh action at the next commit to be able to include subsequent set_field actions for NSH headers.

Support for the flexible MD2 format using TLV properties is foreseen in encap(nsh), but not yet fully implemented.

The CLI syntax for encap of NSH is

    encap(nsh(md_type=1))
    encap(nsh(md_type=2[,tlv(<tlv_class>,<tlv_type>,<hex_string>),...]))

Signed-off-by: Jan Scheurich <jan.scheurich@ericsson.com>
Signed-off-by: Yi Yang <yi.y.yang@intel.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>

* userspace: Add support for NSH MD1 match fields | Jan Scheurich | 2017-08-07 | 1 | -0/+3

This patch adds support for NSH packet header fields to the OVS control plane and the userspace datapath. Initially we support the fields of the NSH base header as defined in https://www.ietf.org/id/draft-ietf-sfc-nsh-13.txt and the fixed context headers specified for metadata format MD1. The variable length MD2 format is parsed but the TLV context headers are not yet available for matching.

The NSH fields are modelled as experimenter fields with the dedicated experimenter class 0x005ad650 proposed for NSH in ONF. The following fields are defined:

    NXOXM code            ofctl name    Size      Comment
    =====================================================================
    NXOXM_NSH_FLAGS       nsh_flags       8       Bits 2-9 of 1st NSH word
    (0x005ad650,1)
    NXOXM_NSH_MDTYPE      nsh_mdtype      8       Bits 16-23
    (0x005ad650,2)
    NXOXM_NSH_NEXTPROTO   nsh_np          8       Bits 24-31
    (0x005ad650,3)
    NXOXM_NSH_SPI         nsh_spi         24      Bits 0-23 of 2nd NSH word
    (0x005ad650,4)
    NXOXM_NSH_SI          nsh_si          8       Bits 24-31
    (0x005ad650,5)
    NXOXM_NSH_C1          nsh_c1          32      Maskable, nsh_mdtype==1
    (0x005ad650,6)
    NXOXM_NSH_C2          nsh_c2          32      Maskable, nsh_mdtype==1
    (0x005ad650,7)
    NXOXM_NSH_C3          nsh_c3          32      Maskable, nsh_mdtype==1
    (0x005ad650,8)
    NXOXM_NSH_C4          nsh_c4          32      Maskable, nsh_mdtype==1
    (0x005ad650,9)

Co-authored-by: Johnson Li <johnson.li@intel.com>
Signed-off-by: Yi Yang <yi.y.yang@intel.com>
Signed-off-by: Jan Scheurich <jan.scheurich@ericsson.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>

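
The bit positions in the table above translate into shifts and masks as in the sketch below. It assumes bit 0 is the most significant bit of each 32-bit word, as is usual in IETF header diagrams, and it reads the first 8 bytes of an NSH header in network byte order. The struct and function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct nsh_base_fields {
    uint8_t flags;      /* Bits 2-9 of the first word. */
    uint8_t mdtype;     /* Bits 16-23 of the first word. */
    uint8_t np;         /* Bits 24-31 of the first word. */
    uint32_t spi;       /* Bits 0-23 of the second word. */
    uint8_t si;         /* Bits 24-31 of the second word. */
};

/* Loads a 32-bit big-endian value from 'p'. */
static uint32_t
load_be32(const uint8_t *p)
{
    return ((uint32_t) p[0] << 24) | ((uint32_t) p[1] << 16)
           | ((uint32_t) p[2] << 8) | p[3];
}

/* Extracts the matchable base header fields from the first two words. */
struct nsh_base_fields
nsh_base_unpack(const uint8_t hdr[8])
{
    struct nsh_base_fields f;
    uint32_t w0, w1;

    assert(hdr != NULL);
    w0 = load_be32(hdr);
    w1 = load_be32(hdr + 4);

    f.flags = (w0 >> 22) & 0xff;    /* 32 - 10 = 22 bits to the right. */
    f.mdtype = (w0 >> 8) & 0xff;
    f.np = w0 & 0xff;
    f.spi = w1 >> 8;
    f.si = w1 & 0xff;
    return f;
}
```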
* packets: Reorganize the pkt_metadata structure. | Bhanuprakash Bodireddy | 2017-08-03 | 1 | -2/+21

pkt_metadata_init() is called for every packet in the userspace datapath and initializes a few members in pkt_metadata. Before this, the members that need to be initialized are prefetched using pkt_metadata_prefetch_init().

The above functions are critical to the userspace datapath performance and should be in sync. Any changes to pkt_metadata should also include changes to metadata_init() and prefetch_init() if necessary.

This commit slightly refactors the pkt_metadata structure and introduces cache line markers to catch any violations to the structure. Also, only prefetch the cachelines having the members that need to be zeroed out.

Signed-off-by: Bhanuprakash Bodireddy <bhanuprakash.bodireddy@intel.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>

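
One way to picture the idea is a structure grouped by cache line, with compile-time checks that fail if a member drifts across a boundary, and an init function that prefetches and zeroes only the line it needs. The sketch below is a loose illustration under assumed values: the 64-byte line size, the struct layout, and every name are hypothetical and do not reflect the real pkt_metadata or the OVS cache line marker macros. __builtin_prefetch is a GCC/Clang builtin.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MY_CACHE_LINE_SIZE 64   /* Assumed; typical for x86-64. */

/* Hypothetical metadata layout, grouped so that per-packet initialization
 * touches only cache line 0. */
struct my_pkt_md {
    /* Cache line 0: zeroed for every packet. */
    uint32_t recirc_id;
    uint32_t dp_hash;
    uint32_t skb_priority;
    uint32_t pkt_mark;
    uint8_t ct_state;
    uint8_t pad0[MY_CACHE_LINE_SIZE - 17];

    /* Cache line 1: tunnel data, left untouched by the init function. */
    uint8_t tunnel_md[MY_CACHE_LINE_SIZE];
};

/* These play the role of cache line markers: the build breaks if a member
 * is added without adjusting the padding. */
static_assert(offsetof(struct my_pkt_md, tunnel_md) == MY_CACHE_LINE_SIZE,
              "tunnel_md must start cache line 1");
static_assert(sizeof(struct my_pkt_md) == 2 * MY_CACHE_LINE_SIZE,
              "struct must span exactly two cache lines");

/* Prefetches for writing and zeroes only the members in cache line 0. */
void
my_pkt_md_init(struct my_pkt_md *md)
{
    assert(md != NULL);
    __builtin_prefetch(md, 1);
    memset(md, 0, offsetof(struct my_pkt_md, pad0));
}
```

In real code the prefetch would be issued well before the write, not immediately ahead of it, so that the memory access has time to complete.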
* OF support and translation of generic encap and decap | Jan Scheurich | 2017-08-02 | 1 | -1/+2

This commit adds support for the OpenFlow actions generic encap and decap (as specified in ONF EXT-382) to the OVS control plane.

CLI syntax for encap action with properties:

    encap(<header>)
    encap(<header>(<prop>=<value>,<tlv>(<class>,<type>,<value>),...))

For example:

    encap(ethernet)
    encap(nsh(md_type=1))
    encap(nsh(md_type=2,tlv(0x1000,10,0x12345678),tlv(0x2000,20,0xfedcba9876543210)))

CLI syntax for decap action:

    decap()
    decap(packet_type(ns=<pt_ns>,type=<pt_type>))

For example:

    decap()
    decap(packet_type(ns=0,type=0xfffe))
    decap(packet_type(ns=1,type=0x894f))

The first header supported for encap and decap is "ethernet" to convert packets between packet_type (1,Ethertype) and (0,0).

This commit also implements a skeleton for the translation of generic encap and decap actions in ofproto-dpif and adds support to encap and decap an Ethernet header.

In general, translation of encap commits pending actions and then rewrites struct flow in accordance with the new packet type and header. In the case of encap(ethernet) it suffices to change the packet type from (1, Ethertype) to (0,0) and set the dl_type accordingly. A new pending_encap flag in xlate ctx is set to mark that a corresponding datapath encap action must be triggered at the next commit. In the case of encap(ethernet) ofproto generates a push_eth action.

The general case for translation of decap() is to emit a datapath action to decap the current outermost header and then recirculate the packet to reparse the inner headers. In the special case of an Ethernet packet, decap() just changes the packet type from (0,0) to (1, dl_type) without a need to recirculate. The emission of the pop_eth action for the datapath is postponed to the next commit.

Hence encap(ethernet) and decap() on an Ethernet packet are OF actions that only incur a cost in the dataplane when a modified packet is actually committed, e.g. because it is sent out. They can freely be used for normalizing the packet type in the OF pipeline without degrading performance.

Signed-off-by: Jan Scheurich <jan.scheurich@ericsson.com>
Signed-off-by: Yi Yang <yi.y.yang@intel.com>
Signed-off-by: Zoltan Balogh <zoltan.balogh@ericsson.com>
Co-authored-by: Zoltan Balogh <zoltan.balogh@ericsson.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>

* packets: Do not initialize ct_orig_tuple. | Daniele Di Proietto | 2017-08-02 | 1 | -1/+9

Commit "odp: Support conntrack orig tuple key." introduced new fields in struct 'pkt_metadata'. pkt_metadata_init() is called for every packet in the userspace datapath. When testing a simple single flow case with DPDK, we observe a lower throughput after the above commit (it was 14.88 Mpps before, it is 13 Mpps after).

This patch skips initializing ct_orig_tuple in pkt_metadata_init(). It should be enough to initialize ct_state, because nobody should look at ct_orig_tuple unless ct_state is != 0.

It's discussed at:
https://mail.openvswitch.org/pipermail/ovs-dev/2017-May/332419.html

Fixes: daf4d3c18da4("odp: Support conntrack orig tuple key.")
Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Signed-off-by: Bhanuprakash Bodireddy <bhanuprakash.bodireddy@intel.com>
Co-authored-by: Bhanuprakash Bodireddy <bhanuprakash.bodireddy@intel.com>
Signed-off-by: Darrell Ball <dlu998@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>

* dpdk: Add more ICMP Related NAT support. | Darrell Ball | 2017-06-02 | 1 | -0/+7

This patch includes more complete support for icmp4 and icmp6 related NAT handling.

Signed-off-by: Darrell Ball <dlu998@gmail.com>
Acked-by: Daniele Di Proietto <diproiettod@ovn.org>
Signed-off-by: Ben Pfaff <blp@ovn.org>

* userspace: add vxlan gpe support to vport | Georg Schmuecking | 2017-06-02 | 1 | -1/+53

This patch is based on the "datapath: enable vxlangpe creation in compat mode" from Yi Yang. It introduces an extension option "gpe" to the vxlan port in the netdev-dpdk datapath. A description of the vxlan gpe protocol was added to the header file lib/packets.h. In the vxlan specific methods the different packets are introduced and handled.

Added VXLAN GPE tunnel push test.

Signed-off-by: Yi Yang <yi.y.yang at intel.com>
Signed-off-by: Georg Schmuecking <georg.schmuecking@ericsson.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>

* userspace: Switching of L3 packets in L2 pipeline | Jan Scheurich | 2017-06-02 | 1 | -1/+0

Ports have a new layer3 attribute if they send/receive L3 packets.

The packet_type included in structs dp_packet and flow is considered in ofproto-dpif. The classical L2 match fields (dl_src, dl_dst, dl_type, and vlan_tci, vlan_vid, vlan_pcp) now have Ethernet as a prerequisite.

A dummy Ethernet header is pushed to L3 packets received from L3 ports before the pipeline processing starts. The Ethernet header is popped before sending a packet to an L3 port.

For datapath ports that can receive L2 or L3 packets, the packet_type becomes part of the flow key for datapath flows and is handled appropriately in dpif-netdev.

In the 'else' branch in the flow_put_on_pmd() function, the additional check flow_equal(&match.flow, &netdev_flow->flow) was removed, as a) the dpcls lookup is sufficient to uniquely identify a flow and b) it caused false negatives because the flow in netdev->flow may not be properly masked.

In dpif_netdev_flow_put() we now use the same method for constructing the netdev_flow_key as the one used when adding the flow to the dpcls to make sure these always match. The function netdev_flow_key_from_flow() used so far was not only inefficient but sometimes caused mismatches and subsequent flow update failures.

The kernel datapath does not support the packet_type match field. Instead it encodes the packet type implicitly by the presence or absence of the Ethernet attribute in the flow key and mask. This patch filters the PACKET_TYPE attribute out of the netlink flow key and mask to be sent to the kernel datapath.

Signed-off-by: Lorand Jakab <lojakab@cisco.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: Jiri Benc <jbenc@redhat.com>
Signed-off-by: Yi Yang <yi.y.yang@intel.com>
Signed-off-by: Jan Scheurich <jan.scheurich@ericsson.com>
Co-authored-by: Zoltan Balogh <zoltan.balogh@ericsson.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>

* packets: Remove unnecessary "packed" annotations.Ben Pfaff2017-05-301-6/+3
  I know of two reasons to mark a structure as "packed". The first is because the structure must match some defined interface and therefore compiler-inserted padding between or after members would cause its layout to diverge from that interface. This is not a problem in a structure that follows the general alignment rules that are seen in ABIs for all the architectures that OVS cares about: basically, that a struct member needs to be aligned on a boundary that is a multiple of the member's size.

  The second reason is because instances of the struct tend to be at misaligned addresses.

  struct eth_header and struct vlan_eth_header are normally aligned on 16-bit boundaries (at least), and they contain only 16-bit members, so there's no need to pack them. This commit removes the packed annotation.

  This commit also removes the packed annotation from struct llc_header. Since that struct only contains 8-bit members, I don't know of any benefit to packing it, period.

  This commit also removes a few more packed annotations that are much less important.

  When these packed annotations were removed, it caused a few warnings related to casts from 'uint8_t *' to more strictly aligned pointer types, related to struct ovs_action_push_tnl. That's because that struct had a trailing member used to store packet headers, that was declared as a uint8_t[]. Before, when this was cast to 'struct eth_header *', there was no change in alignment since eth_header was packed; now that eth_header is not packed, the compiler considers it suspicious. This commit avoids that problem by changing the member from uint8_t[] to uint32_t[], which assures the compiler that it is properly aligned.

  Signed-off-by: Ben Pfaff <blp@ovn.org>
  Acked-by: Joe Stringer <joe@ovn.org>
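The first argument in the message above (a struct whose members are all naturally aligned gets no compiler-inserted padding) can be checked at compile time. The following is a minimal sketch, not the OVS definitions: `example_eth_addr` and `example_eth_header` are hypothetical stand-ins that only mirror the "16-bit members only" shape the message describes, and the expected sizes assume an ABI that follows the alignment rules quoted above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a MAC address stored as three 16-bit words. */
struct example_eth_addr {
    uint16_t be16[3];
};

/* Hypothetical stand-in for an Ethernet header: no "packed" annotation. */
struct example_eth_header {
    struct example_eth_addr dst;   /* Bytes 0..5. */
    struct example_eth_addr src;   /* Bytes 6..11. */
    uint16_t type;                 /* Bytes 12..13. */
};

/* Every member is 16-bit aligned, so the unpacked layout already matches
 * the 14-byte wire format: no padding between or after members. */
static_assert(sizeof(struct example_eth_header) == 14,
              "unexpected padding in example_eth_header");
static_assert(offsetof(struct example_eth_header, src) == 6,
              "unexpected padding before src");
static_assert(offsetof(struct example_eth_header, type) == 12,
              "unexpected padding before type");
```

If an ABI did insert padding here, the build would fail at the assertions rather than silently producing a header with the wrong layout.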
* userspace: Support for push_eth and pop_eth actionsJan Scheurich2017-05-081-0/+4
  Add support for actions push_eth and pop_eth to the netdev datapath and the supporting libraries. This patch relies on the support for these actions in the kernel datapath to be present.

  Signed-off-by: Lorand Jakab <lojakab@cisco.com>
  Signed-off-by: Simon Horman <simon.horman@netronome.com>
  Signed-off-by: Jiri Benc <jbenc@redhat.com>
  Signed-off-by: Yi Yang <yi.y.yang@intel.com>
  Signed-off-by: Jean Tourrilhes <jt@labs.hpe.com>
  Signed-off-by: Jan Scheurich <jan.scheurich@ericsson.com>
  Co-authored-by: Zoltan Balogh <zoltan.balogh@ericsson.com>
  Signed-off-by: Ben Pfaff <blp@ovn.org>
* lib: rename ovs_nd_opt to ovs_nd_lla_optZong Kai LI2017-05-041-8/+8
  Since ovs_nd_mtu_opt and ovs_nd_prefix_opt are being introduced, rename ovs_nd_opt to ovs_nd_lla_opt to make clear that it is the Source/Target Link-layer Address Option.

  Signed-off-by: Zongkai LI <zealokii@gmail.com>
  Signed-off-by: Numan Siddique <nusiddiq@redhat.com>
  Signed-off-by: Ben Pfaff <blp@ovn.org>
* packets: add compose_nd_raZong Kai LI2017-05-041-3/+63
  This patch introduces methods to compose a Router Advertisement (RA) packet and introduces flags for RA. The structures used to compose an RA packet follow the specification in RFC 4861.

  A caller can use compse_nd_ra_with_sll_mtu_opts to compose an RA packet with a Source Link-layer Address Option and an MTU Option. A caller can use packet_put_ra_prefix_opt to append a Prefix Information Option to an RA packet.

  Signed-off-by: Zongkai LI <zealokii@gmail.com>
  Signed-off-by: Ben Pfaff <blp@ovn.org>
* userspace: Add packet_type in dp_packet and flowJan Scheurich2017-05-031-0/+38
  This commit adds a packet_type attribute to the structs dp_packet and flow to explicitly carry the type of the packet as preparation for the introduction of the so-called packet type-aware pipeline (PTAP) in OVS.

  The packet_type is a big-endian 32 bit integer with the encoding as specified in OpenFlow version 1.5.

  The upper 16 bits contain the packet type name space. Pre-defined values are defined in openflow-common.h:

      enum ofp_header_type_namespaces {
          OFPHTN_ONF = 0,             /* ONF namespace. */
          OFPHTN_ETHERTYPE = 1,       /* ns_type is an Ethertype. */
          OFPHTN_IP_PROTO = 2,        /* ns_type is a IP protocol number. */
          OFPHTN_UDP_TCP_PORT = 3,    /* ns_type is a TCP or UDP port. */
          OFPHTN_IPV4_OPTION = 4,     /* ns_type is an IPv4 option number. */
      };

  The lower 16 bits specify the actual type in the context of the name space.

  Only name spaces 0 and 1 will be supported for now.

  For name space OFPHTN_ONF the relevant packet type is 0 (Ethernet). This is the default packet_type in OVS and the only one supported so far. Packets of type (OFPHTN_ONF, 0) are called Ethernet packets.

  In name space OFPHTN_ETHERTYPE the type is the Ethertype of the packet. A packet of type (OFPHTN_ETHERTYPE, <Ethertype>) is a standard L2 packet with the Ethernet header (and any VLAN tags) removed to expose the L3 (or L2.5) payload of the packet. These will simply be called L3 packets.

  The Ethernet address fields dl_src and dl_dst in struct flow are not applicable for an L3 packet and must be zero. However, to maintain compatibility with the large code base, we have chosen to copy the Ethertype of an L3 packet into the dl_type field of struct flow.

  This does not mean that it will be possible to match on dl_type for L3 packets with PTAP later on. Matching must be done on packet_type instead.

  New dp_packets are initialized with packet_type Ethernet. Ports that receive L3 packets will have to explicitly adjust the packet_type.

  Signed-off-by: Jean Tourrilhes <jt@labs.hpe.com>
  Signed-off-by: Jan Scheurich <jan.scheurich@ericsson.com>
  Co-authored-by: Zoltan Balogh <zoltan.balogh@ericsson.com>
  Signed-off-by: Ben Pfaff <blp@ovn.org>
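The encoding described in this message (name space in the upper 16 bits, type in the lower 16 bits, stored big-endian) can be sketched as follows. This is an illustration only: every `example_` name is hypothetical and is not claimed to match the helpers the commit adds to packets.h. Values are handled in host byte order and converted to network order only when serialized.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Name space values taken from the commit message; the EXAMPLE_ prefix
 * marks these as illustrative copies, not the OVS enum. */
enum example_pt_namespace {
    EXAMPLE_OFPHTN_ONF = 0,        /* ONF namespace. */
    EXAMPLE_OFPHTN_ETHERTYPE = 1,  /* ns_type is an Ethertype. */
};

/* Combine a name space and a type into a 32-bit value (host byte order). */
static inline uint32_t
example_pt_make(uint16_t ns, uint16_t ns_type)
{
    return ((uint32_t) ns << 16) | ns_type;
}

/* Extract the name space (upper 16 bits). */
static inline uint16_t
example_pt_ns(uint32_t pt)
{
    return (uint16_t) (pt >> 16);
}

/* Extract the type within the name space (lower 16 bits). */
static inline uint16_t
example_pt_ns_type(uint32_t pt)
{
    return (uint16_t) (pt & 0xffff);
}

/* Write the value in big-endian (network) byte order. */
static inline void
example_pt_to_wire(uint32_t pt, uint8_t out[4])
{
    assert(out != NULL);
    out[0] = (uint8_t) (pt >> 24);
    out[1] = (uint8_t) (pt >> 16);
    out[2] = (uint8_t) (pt >> 8);
    out[3] = (uint8_t) pt;
}
```

With these helpers, an Ethernet packet is `example_pt_make(EXAMPLE_OFPHTN_ONF, 0)`, and an L3 IPv4 packet is `example_pt_make(EXAMPLE_OFPHTN_ETHERTYPE, 0x0800)`.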
* ovn-controller: Add 'dns_lookup' actionNuman Siddique2017-05-021-0/+24
  This patch adds a new OVN action 'dns_lookup' to support native DNS. ovn-controller parses this action and adds a NXT_PACKET_IN2 OF flow with 'pause' flag set.

  A new table 'DNS' is added in the SB DB to look up and resolve the DNS queries. When a valid DNS packet is received by ovn-controller, it looks up the DNS name in the 'DNS' table and if successful, it frames a DNS reply, resumes the packet and stores 1 in the 1-bit subfield. If the packet is invalid or cannot be resolved, it resumes the packet without any modifications and stores 0 in the 1-bit subfield.

      reg0[4] = dns_lookup(); next;

  An upcoming patch will use this action and add logical flows.

  Signed-off-by: Numan Siddique <nusiddiq@redhat.com>
  Acked-by: Gurucharan Shetty <guru@ovn.org>
  Signed-off-by: Ben Pfaff <blp@ovn.org>
* packets: Reduce redundant copies of connection states.Ben Pfaff2017-04-211-22/+23
  I was about to add another complete list of all the connection states but this eliminates the need.

  Signed-off-by: Ben Pfaff <blp@ovn.org>
  Acked-by: Miguel Angel Ajo <majopela@redhat.com>
* ovn-northd ipam: Support IPv6 dynamic assignmentNuman Siddique2017-04-141-0/+20
  OVN will generate the IPv6 address for a logical port, if requested, using the IPv6 prefix and the MAC address (as an IEEE EUI-64 identifier).

  To generate the IPv6 address, the CMS should define the IPv6 prefix in the 'Logical_switch.other_config:ipv6_prefix' column.

  Signed-off-by: Numan Siddique <nusiddiq@redhat.com>
  Co-authored-by: Ben Pfaff <blp@ovn.org>
  Signed-off-by: Ben Pfaff <blp@ovn.org>
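Combining a 64-bit prefix with a MAC address used as an EUI-64 identifier can be sketched as below. This assumes the modified EUI-64 construction from RFC 4291 (insert ff:fe between the two halves of the MAC and invert the universal/local bit); the commit message does not spell out the exact bit handling, so treat that detail, and the hypothetical name `example_eui64_ipv6`, as assumptions rather than the code the commit adds.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Build a 16-byte IPv6 address from the upper 64 bits of a prefix and a
 * 48-bit MAC address, using the RFC 4291 modified EUI-64 layout
 * (an assumption; see the lead-in above). */
static void
example_eui64_ipv6(const uint8_t prefix[8], const uint8_t mac[6],
                   uint8_t out[16])
{
    assert(prefix != NULL && mac != NULL && out != NULL);

    memcpy(out, prefix, 8);        /* Network prefix, bytes 0..7. */
    out[8] = mac[0] ^ 0x02;        /* Invert the universal/local bit. */
    out[9] = mac[1];
    out[10] = mac[2];
    out[11] = 0xff;                /* Fixed filler between the MAC halves. */
    out[12] = 0xfe;
    out[13] = mac[3];
    out[14] = mac[4];
    out[15] = mac[5];
}
```

For example, prefix 2001:db8::/64 and MAC 00:11:22:33:44:55 give the interface identifier 0211:22ff:fe33:4455 under this construction.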
* types: New struct eth_addr64 for EUI-64 identifiers.Ben Pfaff2017-04-071-1/+33
  This will see its first real user in the following commit.

  Signed-off-by: Ben Pfaff <blp@ovn.org>
  Acked-by: Andy Zhou <azhou@ovn.org>
* conntrack: Add formatting support for IGMP, DCCP, and UDPLITE.Jarno Rajahalme2017-03-281-0/+12
  Print names for protocols that are supported by (Linux) conntrack (DCCP, UDPLITE) and IGMP, which has been seen in logs.

  Signed-off-by: Jarno Rajahalme <jarno@ovn.org>
  Acked-by: Joe Stringer <joe@ovn.org>
* odp: Support conntrack orig tuple key.Jarno Rajahalme2017-03-081-0/+5
  Userspace support for datapath original direction conntrack tuple.

  Signed-off-by: Jarno Rajahalme <jarno@ovn.org>
  Acked-by: Joe Stringer <joe@ovn.org>
* flow: Make room after ct_state.Jarno Rajahalme2017-03-081-1/+1
  'ct_state' currently only needs 8 bits, so we can make room for a new CT field introduced in the next patch.

  Signed-off-by: Jarno Rajahalme <jarno@ovn.org>
  Acked-by: Joe Stringer <joe@ovn.org>