Diffstat (limited to 'doc/route.txt')
-rw-r--r-- | doc/route.txt | 397 |
1 files changed, 368 insertions, 29 deletions
diff --git a/doc/route.txt b/doc/route.txt index 1f68116..b9b50b6 100644 --- a/doc/route.txt +++ b/doc/route.txt @@ -35,50 +35,316 @@ commonly used In user space the term _network interface_ is very common. The routing netlink protocol uses the term _link_ and so does the _iproute2_ utility and most routing daemons. -=== Protocol Definition +=== Netlink Protocol -This section describes the protocol semantics of the netlink link configuration -interface. The following netlink message types are defined: +This section describes the protocol semantics of the netlink based link +configuration interface. The following messages are defined: [options="header", cols="1,2,2"] |============================================================================== | Message Type | User -> Kernel | Kernel -> User -| +RTM_NEWLINK+ | Create new virtual network device | Notification: Link changed or added -| +RTM_DELLINK+ | Delete virtual network device | Notification: Link deleted or disappeared +| +RTM_NEWLINK+ | Create or update virtual network device +| Reply to +RTM_GETLINK+ request or notification of link added or updated +| +RTM_DELLINK+ | Delete virtual network device +| Notification of link deleted or disappeared | +RTM_GETLINK+ | Retrieve link configuration and statistics | | +RTM_SETLINK+ | Modify link configuration | |============================================================================== -See the link:core.html#core_msg_types[Message Types] section of the Netlink -Library documentation for more information on common semantics of these message -types. +See link:core.html#core_msg_types[Netlink Library - Message Types] for more +information on common semantics of these message types. + +==== Link Message Format + +All netlink link messages share a common header (+struct ifinfomsg+) which +is appended after the netlink header (+struct nlmsghdr+). + +image:ifinfomsg.png["Link Message Header"] + +The meaning of each field may differ depending on the message type. 
A ++struct ifinfomsg+ is defined in +<linux/rtnetlink.h>+ to represent the +header. + +Address Family (8bit):: +The address family is usually set to +AF_UNSPEC+ but may be specified in ++RTM_GETLINK+ requests to limit the returned links to a specific address +family. + +Link Layer Type (16bit):: +Currently only used in kernel->user messages to report the link layer type +of a link. The value corresponds to the +ARPHRD_*+ defines found in ++<linux/if_arp.h>+. Translation from/to strings can be done using the +functions nl_llproto2str()/nl_str2llproto(). + +Link Index (32bit):: +Carries the interface index and is used to identify existing links. + +Flags (32bit):: +In kernel->user messages the value of this field represents the current +state of the link flags. In user->kernel messages this field is used to +change flags or set the initial flag state of new links. Note that in order +to change a flag, the flag must also be set in the _Flags Change Mask_ field. + +Flags Change Mask (32bit):: +The primary use of this field is to specify a mask of flags that should be +changed based on the value of the _Flags_ field. A special meaning is given +to this field when present in link notifications, see TODO. + +Attributes (variable):: +All link message types may carry netlink attributes. They are defined in the +header file +<linux/if_link.h>+ and share the prefix +IFLA_+. + +==== Link Message Types + +.RTM_GETLINK (user->kernel) + +Look up a link by 1. interface index or 2. link name (+IFLA_IFNAME+) and return +a single +RTM_NEWLINK+ message containing the link configuration and statistics +or a netlink error message if no such link was found. + +*Parameters:* + +* *Address family* +** If the address family is set to +PF_BRIDGE+, only bridging devices will be + returned. +** If the address family is set to +PF_INET6+, only IPv6 enabled devices will + be returned. + +*Flags:* + +* +NLM_F_DUMP+ If set, all links will be returned in the form of a multipart + message. 
+ +*Returns:* + +* +EINVAL+ if neither interface index nor link name is set +* +ENODEV+ if no link was found +* +ENOBUFS+ if allocation failed + +.RTM_NEWLINK (user->kernel) + +Creates a new link or updates an existing one. Only virtual links may be created +but all links may be updated. + +*Flags:* + +- +NLM_F_CREATE+ Create link if it does not exist +- +NLM_F_EXCL+ Return +EEXIST+ if link already exists + +*Returns:* + +- +EINVAL+ malformed message or invalid configuration parameters +- +EAFNOSUPPORT+ if an address family specific configuration (+IFLA_AF_SPEC+) + is not supported. +- +EOPNOTSUPP+ if the link does not support modification of parameters +- +EEXIST+ if +NLM_F_EXCL+ was set and the link already exists +- +ENODEV+ if the link does not exist and +NLM_F_CREATE+ is not set + +.RTM_NEWLINK (kernel->user) + +This message type is used in reply to a +RTM_GETLINK+ request and carries +the configuration and statistics of a link. If multiple links need to +be sent, the messages will be sent in the form of a multipart message. + +The message type is also used for notifications sent by the kernel to the +multicast group +RTNLGRP_LINK+ to inform about various link events. It is +therefore recommended to always use a separate link socket for link +notifications in order to distinguish between the two uses of the message type. + +TODO: document how to detect different notifications + +.RTM_DELLINK (user->kernel) + +Look up a link by 1. interface index or 2. link name (+IFLA_IFNAME+) and delete +the virtual link. + +*Returns:* -.Link Message Header +* +EINVAL+ if neither interface index nor link name is set +* +ENODEV+ if no link was found +* +EOPNOTSUPP+ if the operation is not supported (not a virtual link) -All netlink link messages share the following common header which is appended -after the netlink message header (+struct nlmsghdr+). It is defined in the -header +<linux/rtnetlink.h>+ +.RTM_DELLINK (kernel->user) + +Notification sent by the kernel to the multicast group +RTNLGRP_LINK+ when + +a. 
a network device was unregistered (change == ~0) +b. a bridging device was deleted (address family will be +PF_BRIDGE+) [source,c] ----- -struct ifinfomsg { - unsigned char ifi_family; - unsigned char __ifi_pad; - unsigned short ifi_type; /* ARPHRD_* */ - int ifi_index; /* Link index */ - unsigned ifi_flags; /* IFF_* flags */ - unsigned ifi_change; /* IFF_* change mask */ -}; +#define IFF_UP 0x1 /* interface is up */ +#define IFF_BROADCAST 0x2 /* broadcast address valid */ +#define IFF_DEBUG 0x4 /* turn on debugging */ +#define IFF_LOOPBACK 0x8 /* is a loopback net */ +#define IFF_POINTOPOINT 0x10 /* interface is has p-p link */ +#define IFF_NOTRAILERS 0x20 /* avoid use of trailers */ +#define IFF_RUNNING 0x40 /* interface RFC2863 OPER_UP */ +#define IFF_NOARP 0x80 /* no ARP protocol */ +#define IFF_PROMISC 0x100 /* receive all packets */ +#define IFF_ALLMULTI 0x200 /* receive all multicast packets*/ + +#define IFF_MASTER 0x400 /* master of a load balancer */ +#define IFF_SLAVE 0x800 /* slave of a load balancer */ + +#define IFF_MULTICAST 0x1000 /* Supports multicast */ + +#define IFF_PORTSEL 0x2000 /* can set media type */ +#define IFF_AUTOMEDIA 0x4000 /* auto media select active */ +#define IFF_DYNAMIC 0x8000 /* dialup device with changing addresses*/ + +#define IFF_LOWER_UP 0x10000 /* driver signals L1 up */ +#define IFF_DORMANT 0x20000 /* driver signals dormant */ + +#define IFF_ECHO 0x40000 /* echo sent packets */ ----- -The meaning of each field may differ depending on the message type. +=== Get / List -.Attributes +[[link_list]] +==== Get list of links -All link message types may carry netlink attributes. They are defined in the -header file <linux/if_link.h> and share the prefix +IFLA_+. +To retrieve the list of links in the kernel, allocate a new link cache +using +rtnl_link_alloc_cache()+ to hold the links. 
It will automatically +construct and send a +RTM_GETLINK+ message requesting a dump of all links +from the kernel and feed the returned +RTM_NEWLINK+ messages to the internal link +message parser which adds the returned links to the cache. + +[source,c] +----- +#include <netlink/route/link.h> + +int rtnl_link_alloc_cache(struct nl_sock *sk, int family, struct nl_cache **result); +----- + +The cache will contain link objects (+struct rtnl_link+, see <<link_object>>) +and can be accessed using the standard cache functions. By setting the ++family+ parameter to an address family other than +AF_UNSPEC+, the resulting +cache will only contain links supporting the specified address family. + +The following direct search functions are provided to search by interface +index and by link name: + +[source,c] +----- +#include <netlink/route/link.h> + +struct rtnl_link *rtnl_link_get(struct nl_cache *cache, int ifindex); +struct rtnl_link *rtnl_link_get_by_name(struct nl_cache *cache, const char *name); +----- + +.Example: Link Cache + +[source,c] +----- +struct nl_cache *cache; +struct rtnl_link *link; + +if (rtnl_link_alloc_cache(sock, AF_UNSPEC, &cache) < 0) + /* error */ + +if (!(link = rtnl_link_get_by_name(cache, "eth1"))) + /* link does not exist */ + +/* do something with link */ + +rtnl_link_put(link); +nl_cache_put(cache); +----- + +==== Lookup Single Link (Direct Lookup) + +If only a single link is of interest, the link can be looked up directly +without the use of a link cache using the function +rtnl_link_get_kernel()+. + +[source,c] +----- +#include <netlink/route/link.h> + +int rtnl_link_get_kernel(struct nl_sock *sk, int ifindex, const char *name, struct rtnl_link **result); +----- + +It will construct and send a +RTM_GETLINK+ request using the parameters +provided and wait for a +RTM_NEWLINK+ or netlink error message sent in +return. If the link exists, the link is returned as a link object +(see <<link_object>>). 
+ +.Example: Direct link lookup +[source,c] +----- +struct rtnl_link *link; + +if (rtnl_link_get_kernel(sock, 0, "eth1", &link) < 0) + /* error */ + +/* do something with link */ + +rtnl_link_put(link); +----- + +NOTE: While using this function can save a substantial amount of bandwidth + on the netlink socket, the result will not be cached; subsequent calls + to rtnl_link_get_kernel() will always trigger sending a +RTM_GETLINK+ + request. + +==== Translating interface index to link name + +Applications that need to translate an interface index to a link name or +vice versa may use the following functions to do so. Both functions require +a filled link cache to work with. + +[source,c] +----- +char *rtnl_link_i2name (struct nl_cache *cache, int ifindex, char *dst, size_t len); +int rtnl_link_name2i (struct nl_cache *cache, const char *name); +----- +=== Add / Modify + +Several types of virtual links can be added on the fly using the function ++rtnl_link_add()+. + +[source,c] +----- +#include <netlink/route/link.h> + +int rtnl_link_add(struct nl_sock *sk, struct rtnl_link *link, int flags); +----- + +=== Delete + +The deletion of virtual links such as VLAN devices or dummy devices is done +using the function +rtnl_link_delete()+. The link passed to the function +can be a link from a link cache or it can be constructed with the minimal +attributes needed to identify the link. + +[source,c] +----- +#include <netlink/route/link.h> + +int rtnl_link_delete(struct nl_sock *sk, const struct rtnl_link *link); +----- + +The function will construct and send a +RTM_DELLINK+ request message and +return any error returned by the kernel. 
+ +.Example: Delete link by name +[source,c] +----- +struct rtnl_link *link; +if (!(link = rtnl_link_alloc())) + /* error */ + +rtnl_link_set_name(link, "my_vlan"); + +if (rtnl_link_delete(sock, link) < 0) + /* error */ + +rtnl_link_put(link); +----- + +[[link_object]] === Link Object Name:: @@ -86,6 +352,8 @@ The name of a network device is the human readable representation of a network device and secondary identification parameter besides the interface index. + +Kernels >= 2.6.11 support identification by link name. ++ [source,c] ----- void rtnl_link_set_name(struct rtnl_link *link, const char *name); @@ -141,8 +409,81 @@ void rtnl_link_set_weight(struct rtnl_link *link, unsigned int weight); unsigned int rtnl_link_get_weight(struct rtnl_link *link); ----- -=== Link Cache +=== Modules + +[[link_bonding]] +==== Bonding + +.Example: Add bonding link +[source,c] +----- +#include <netlink/route/link.h> + +struct rtnl_link *link; + +link = rtnl_link_alloc(); +rtnl_link_set_name(link, "my_bond"); +rtnl_link_set_type(link, "bond"); + +/* requires admin privileges */ +if (rtnl_link_add(sk, link, NLM_F_CREATE) < 0) + /* error */ + +rtnl_link_put(link); +----- + +==== VLAN + +[source,c] +----- +extern char * rtnl_link_vlan_flags2str(int, char *, size_t); +extern int rtnl_link_vlan_str2flags(const char *); + +extern int rtnl_link_vlan_set_id(struct rtnl_link *, int); +extern int rtnl_link_vlan_get_id(struct rtnl_link *); + +extern int rtnl_link_vlan_set_flags(struct rtnl_link *, + unsigned int); +extern int rtnl_link_vlan_unset_flags(struct rtnl_link *, + unsigned int); +extern unsigned int rtnl_link_vlan_get_flags(struct rtnl_link *); + +extern int rtnl_link_vlan_set_ingress_map(struct rtnl_link *, + int, uint32_t); +extern uint32_t * rtnl_link_vlan_get_ingress_map(struct rtnl_link *); +extern int rtnl_link_vlan_set_egress_map(struct rtnl_link *, + uint32_t, int); +extern struct vlan_map *rtnl_link_vlan_get_egress_map(struct rtnl_link *, + int *); +----- + +.Example: Add a 
VLAN device +[source,c] +----- +struct rtnl_link *link; +int master_index, err; + +/* lookup interface index of eth0 */ +if (!(master_index = rtnl_link_name2i(link_cache, "eth0"))) + /* error */ + +/* allocate new link object to configure the vlan device */ +link = rtnl_link_alloc(); + +/* set eth0 to be our master device */ +rtnl_link_set_link(link, master_index); + +if ((err = rtnl_link_set_type(link, "vlan")) < 0) + /* error */ + +rtnl_link_vlan_set_id(link, 10); + +if ((err = rtnl_link_add(sk, link, NLM_F_CREATE)) < 0) + /* error */ + +rtnl_link_put(link); +----- == Neighbouring @@ -347,8 +688,7 @@ uint32_t rtnl_tc_get_parent(struct rtnl_tc *tc); ----- Statistics:: -Generic statistics, see <<tc_stats, Accessing Statistics>> for -additional information. +Generic statistics, see <<tc_stats>> for additional information. + [source,c] ----- @@ -593,8 +933,7 @@ if (!(qdisc = rtnl_qdisc_alloc())) ----- The next step is to specify all generic qdisc attributes using the tc -object interface described in the section <<tc_attr, traffic control -object attributes>>. +object interface described in the section <<tc_attr>>. The following attributes must be specified: - IfIndex |
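Note on the _Flags_ / _Flags Change Mask_ fields documented in this patch: the text says a flag is only changed if it is also set in the change mask. A minimal sketch of that rule follows, assuming the two fields are combined as a plain mask-and-merge. The helper `apply_link_flags()` and the `EX_IFF_*` names are hypothetical and exist only for illustration; they are not part of libnl or the kernel headers. The values are local copies of the `IFF_*` values listed in the patch.

```c
#include <stdint.h>

/* Local copies of three IFF_* values from the list in the patch.
 * The EX_ prefix marks them as illustration-only names. */
#define EX_IFF_UP        0x1
#define EX_IFF_PROMISC   0x100
#define EX_IFF_MULTICAST 0x1000

/*
 * Hypothetical helper: compute the resulting link flags from the current
 * flags, the requested Flags field and the Flags Change Mask field.
 * Assumption: bits selected by the mask are taken from the request,
 * all other bits keep their current value.
 */
static uint32_t apply_link_flags(uint32_t current, uint32_t flags,
                                 uint32_t change)
{
    return (flags & change) | (current & ~change);
}
```

Under this assumption, a request carrying `EX_IFF_UP | EX_IFF_PROMISC` in the flags field with only `EX_IFF_UP` in the change mask would bring the link up and leave the promiscuous bit untouched, and a change mask of 0 would leave all flags as they are.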