diff options
author | Linus Torvalds <torvalds@linux-foundation.org> | 2013-11-13 17:40:34 +0900 |
---|---|---|
committer | Linus Torvalds <torvalds@linux-foundation.org> | 2013-11-13 17:40:34 +0900 |
commit | 42a2d923cc349583ebf6fdd52a7d35e1c2f7e6bd (patch) | |
tree | 2b2b0c03b5389c1301800119333967efafd994ca /Documentation | |
parent | 5cbb3d216e2041700231bcfc383ee5f8b7fc8b74 (diff) | |
parent | 75ecab1df14d90e86cebef9ec5c76befde46e65f (diff) | |
download | linux-next-42a2d923cc349583ebf6fdd52a7d35e1c2f7e6bd.tar.gz |
Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next
Pull networking updates from David Miller:
1) The addition of nftables. No longer will we need protocol aware
firewall filtering modules, it can all live in userspace.
At the core of nftables is a, for lack of a better term, virtual
machine that executes byte codes to inspect packet or metadata
(arriving interface index, etc.) and make verdict decisions.
Besides support for loading packet contents and comparing them, the
interpreter supports lookups in various datastructures as
fundamental operations. For example sets are supports, and
therefore one could create a set of whitelist IP address entries
which have ACCEPT verdicts attached to them, and use the appropriate
byte codes to do such lookups.
Since the interpreted code is composed in userspace, userspace can
do things like optimize things before giving it to the kernel.
Another major improvement is the capability of atomically updating
portions of the ruleset. In the existing netfilter implementation,
one has to update the entire rule set in order to make a change and
this is very expensive.
Userspace tools exist to create nftables rules using existing
netfilter rule sets, but both kernel implementations will need to
co-exist for quite some time as we transition from the old to the
new stuff.
Kudos to Patrick McHardy, Pablo Neira Ayuso, and others who have
worked so hard on this.
2) Daniel Borkmann and Hannes Frederic Sowa made several improvements
to our pseudo-random number generator, mostly used for things like
UDP port randomization and netfitler, amongst other things.
In particular the taus88 generater is updated to taus113, and test
cases are added.
3) Support 64-bit rates in HTB and TBF schedulers, from Eric Dumazet
and Yang Yingliang.
4) Add support for new 577xx tigon3 chips to tg3 driver, from Nithin
Sujir.
5) Fix two fatal flaws in TCP dynamic right sizing, from Eric Dumazet,
Neal Cardwell, and Yuchung Cheng.
6) Allow IP_TOS and IP_TTL to be specified in sendmsg() ancillary
control message data, much like other socket option attributes.
From Francesco Fusco.
7) Allow applications to specify a cap on the rate computed
automatically by the kernel for pacing flows, via a new
SO_MAX_PACING_RATE socket option. From Eric Dumazet.
8) Make the initial autotuned send buffer sizing in TCP more closely
reflect actual needs, from Eric Dumazet.
9) Currently early socket demux only happens for TCP sockets, but we
can do it for connected UDP sockets too. Implementation from Shawn
Bohrer.
10) Refactor inet socket demux with the goal of improving hash demux
performance for listening sockets. With the main goals being able
to use RCU lookups on even request sockets, and eliminating the
listening lock contention. From Eric Dumazet.
11) The bonding layer has many demuxes in it's fast path, and an RCU
conversion was started back in 3.11, several changes here extend the
RCU usage to even more locations. From Ding Tianhong and Wang
Yufen, based upon suggestions by Nikolay Aleksandrov and Veaceslav
Falico.
12) Allow stackability of segmentation offloads to, in particular, allow
segmentation offloading over tunnels. From Eric Dumazet.
13) Significantly improve the handling of secret keys we input into the
various hash functions in the inet hashtables, TCP fast open, as
well as syncookies. From Hannes Frederic Sowa. The key fundamental
operation is "net_get_random_once()" which uses static keys.
Hannes even extended this to ipv4/ipv6 fragmentation handling and
our generic flow dissector.
14) The generic driver layer takes care now to set the driver data to
NULL on device removal, so it's no longer necessary for drivers to
explicitly set it to NULL any more. Many drivers have been cleaned
up in this way, from Jingoo Han.
15) Add a BPF based packet scheduler classifier, from Daniel Borkmann.
16) Improve CRC32 interfaces and generic SKB checksum iterators so that
SCTP's checksumming can more cleanly be handled. Also from Daniel
Borkmann.
17) Add a new PMTU discovery mode, IP_PMTUDISC_INTERFACE, which forces
using the interface MTU value. This helps avoid PMTU attacks,
particularly on DNS servers. From Hannes Frederic Sowa.
18) Use generic XPS for transmit queue steering rather than internal
(re-)implementation in virtio-net. From Jason Wang.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1622 commits)
random32: add test cases for taus113 implementation
random32: upgrade taus88 generator to taus113 from errata paper
random32: move rnd_state to linux/random.h
random32: add prandom_reseed_late() and call when nonblocking pool becomes initialized
random32: add periodic reseeding
random32: fix off-by-one in seeding requirement
PHY: Add RTL8201CP phy_driver to realtek
xtsonic: add missing platform_set_drvdata() in xtsonic_probe()
macmace: add missing platform_set_drvdata() in mace_probe()
ethernet/arc/arc_emac: add missing platform_set_drvdata() in arc_emac_probe()
ipv6: protect for_each_sk_fl_rcu in mem_check with rcu_read_lock_bh
vlan: Implement vlan_dev_get_egress_qos_mask as an inline.
ixgbe: add warning when max_vfs is out of range.
igb: Update link modes display in ethtool
netfilter: push reasm skb through instead of original frag skbs
ip6_output: fragment outgoing reassembled skb properly
MAINTAINERS: mv643xx_eth: take over maintainership from Lennart
net_sched: tbf: support of 64bit rates
ixgbe: deleting dfwd stations out of order can cause null ptr deref
ixgbe: fix build err, num_rx_queues is only available with CONFIG_RPS
...
Diffstat (limited to 'Documentation')
-rw-r--r-- | Documentation/ABI/testing/sysfs-class-net-batman-adv | 4 | ||||
-rw-r--r-- | Documentation/ABI/testing/sysfs-class-net-mesh | 34 | ||||
-rw-r--r-- | Documentation/DocBook/80211.tmpl | 4 | ||||
-rw-r--r-- | Documentation/devicetree/bindings/net/cpsw-phy-sel.txt | 28 | ||||
-rw-r--r-- | Documentation/networking/batman-adv.txt | 54 | ||||
-rw-r--r-- | Documentation/networking/bonding.txt | 75 | ||||
-rw-r--r-- | Documentation/networking/can.txt | 217 | ||||
-rw-r--r-- | Documentation/networking/ip-sysctl.txt | 15 | ||||
-rw-r--r-- | Documentation/networking/netdevices.txt | 10 | ||||
-rw-r--r-- | Documentation/ptp/testptp.c | 65 |
10 files changed, 380 insertions, 126 deletions
diff --git a/Documentation/ABI/testing/sysfs-class-net-batman-adv b/Documentation/ABI/testing/sysfs-class-net-batman-adv index bdc00707c751..7f34a95bb963 100644 --- a/Documentation/ABI/testing/sysfs-class-net-batman-adv +++ b/Documentation/ABI/testing/sysfs-class-net-batman-adv @@ -1,13 +1,13 @@ What: /sys/class/net/<iface>/batman-adv/iface_status Date: May 2010 -Contact: Marek Lindner <lindner_marek@yahoo.de> +Contact: Marek Lindner <mareklindner@neomailbox.ch> Description: Indicates the status of <iface> as it is seen by batman. What: /sys/class/net/<iface>/batman-adv/mesh_iface Date: May 2010 -Contact: Marek Lindner <lindner_marek@yahoo.de> +Contact: Marek Lindner <mareklindner@neomailbox.ch> Description: The /sys/class/net/<iface>/batman-adv/mesh_iface file displays the batman mesh interface this <iface> diff --git a/Documentation/ABI/testing/sysfs-class-net-mesh b/Documentation/ABI/testing/sysfs-class-net-mesh index bdcd8b4e38f2..0baa657b18c4 100644 --- a/Documentation/ABI/testing/sysfs-class-net-mesh +++ b/Documentation/ABI/testing/sysfs-class-net-mesh @@ -1,22 +1,23 @@ What: /sys/class/net/<mesh_iface>/mesh/aggregated_ogms Date: May 2010 -Contact: Marek Lindner <lindner_marek@yahoo.de> +Contact: Marek Lindner <mareklindner@neomailbox.ch> Description: Indicates whether the batman protocol messages of the mesh <mesh_iface> shall be aggregated or not. -What: /sys/class/net/<mesh_iface>/mesh/ap_isolation +What: /sys/class/net/<mesh_iface>/mesh/<vlan_subdir>/ap_isolation Date: May 2011 -Contact: Antonio Quartulli <ordex@autistici.org> +Contact: Antonio Quartulli <antonio@meshcoding.com> Description: Indicates whether the data traffic going from a wireless client to another wireless client will be - silently dropped. + silently dropped. <vlan_subdir> is empty when referring + to the untagged lan. What: /sys/class/net/<mesh_iface>/mesh/bonding Date: June 2010 -Contact: Simon Wunderlich <siwu@hrz.tu-chemnitz.de> +Contact: Simon Wunderlich <sw@simonwunderlich.de> Description: Indicates whether the data traffic going through the mesh will be sent using multiple interfaces at the @@ -24,7 +25,7 @@ Description: What: /sys/class/net/<mesh_iface>/mesh/bridge_loop_avoidance Date: November 2011 -Contact: Simon Wunderlich <siwu@hrz.tu-chemnitz.de> +Contact: Simon Wunderlich <sw@simonwunderlich.de> Description: Indicates whether the bridge loop avoidance feature is enabled. This feature detects and avoids loops @@ -41,21 +42,21 @@ Description: What: /sys/class/net/<mesh_iface>/mesh/gw_bandwidth Date: October 2010 -Contact: Marek Lindner <lindner_marek@yahoo.de> +Contact: Marek Lindner <mareklindner@neomailbox.ch> Description: Defines the bandwidth which is propagated by this node if gw_mode was set to 'server'. What: /sys/class/net/<mesh_iface>/mesh/gw_mode Date: October 2010 -Contact: Marek Lindner <lindner_marek@yahoo.de> +Contact: Marek Lindner <mareklindner@neomailbox.ch> Description: Defines the state of the gateway features. Can be either 'off', 'client' or 'server'. What: /sys/class/net/<mesh_iface>/mesh/gw_sel_class Date: October 2010 -Contact: Marek Lindner <lindner_marek@yahoo.de> +Contact: Marek Lindner <mareklindner@neomailbox.ch> Description: Defines the selection criteria this node will use to choose a gateway if gw_mode was set to 'client'. @@ -77,25 +78,14 @@ Description: What: /sys/class/net/<mesh_iface>/mesh/orig_interval Date: May 2010 -Contact: Marek Lindner <lindner_marek@yahoo.de> +Contact: Marek Lindner <mareklindner@neomailbox.ch> Description: Defines the interval in milliseconds in which batman sends its protocol messages. What: /sys/class/net/<mesh_iface>/mesh/routing_algo Date: Dec 2011 -Contact: Marek Lindner <lindner_marek@yahoo.de> +Contact: Marek Lindner <mareklindner@neomailbox.ch> Description: Defines the routing procotol this mesh instance uses to find the optimal paths through the mesh. - -What: /sys/class/net/<mesh_iface>/mesh/vis_mode -Date: May 2010 -Contact: Marek Lindner <lindner_marek@yahoo.de> -Description: - Each batman node only maintains information about its - own local neighborhood, therefore generating graphs - showing the topology of the entire mesh is not easily - feasible without having a central instance to collect - the local topologies from all nodes. This file allows - to activate the collecting (server) mode. diff --git a/Documentation/DocBook/80211.tmpl b/Documentation/DocBook/80211.tmpl index f403ec3c5c9a..46ad6faee9ab 100644 --- a/Documentation/DocBook/80211.tmpl +++ b/Documentation/DocBook/80211.tmpl @@ -152,8 +152,8 @@ !Finclude/net/cfg80211.h cfg80211_scan_request !Finclude/net/cfg80211.h cfg80211_scan_done !Finclude/net/cfg80211.h cfg80211_bss -!Finclude/net/cfg80211.h cfg80211_inform_bss_frame -!Finclude/net/cfg80211.h cfg80211_inform_bss +!Finclude/net/cfg80211.h cfg80211_inform_bss_width_frame +!Finclude/net/cfg80211.h cfg80211_inform_bss_width !Finclude/net/cfg80211.h cfg80211_unlink_bss !Finclude/net/cfg80211.h cfg80211_find_ie !Finclude/net/cfg80211.h ieee80211_bss_get_ie diff --git a/Documentation/devicetree/bindings/net/cpsw-phy-sel.txt b/Documentation/devicetree/bindings/net/cpsw-phy-sel.txt new file mode 100644 index 000000000000..7ff57a119f81 --- /dev/null +++ b/Documentation/devicetree/bindings/net/cpsw-phy-sel.txt @@ -0,0 +1,28 @@ +TI CPSW Phy mode Selection Device Tree Bindings +----------------------------------------------- + +Required properties: +- compatible : Should be "ti,am3352-cpsw-phy-sel" +- reg : physical base address and size of the cpsw + registers map +- reg-names : names of the register map given in "reg" node + +Optional properties: +-rmii-clock-ext : If present, the driver will configure the RMII + interface to external clock usage + +Examples: + + phy_sel: cpsw-phy-sel@44e10650 { + compatible = "ti,am3352-cpsw-phy-sel"; + reg= <0x44e10650 0x4>; + reg-names = "gmii-sel"; + }; + +(or) + phy_sel: cpsw-phy-sel@44e10650 { + compatible = "ti,am3352-cpsw-phy-sel"; + reg= <0x44e10650 0x4>; + reg-names = "gmii-sel"; + rmii-clock-ext; + }; diff --git a/Documentation/networking/batman-adv.txt b/Documentation/networking/batman-adv.txt index c1d82047a4b1..89490beb3c0b 100644 --- a/Documentation/networking/batman-adv.txt +++ b/Documentation/networking/batman-adv.txt @@ -69,8 +69,7 @@ folder: # aggregated_ogms gw_bandwidth log_level # ap_isolation gw_mode orig_interval # bonding gw_sel_class routing_algo -# bridge_loop_avoidance hop_penalty vis_mode -# fragmentation +# bridge_loop_avoidance hop_penalty fragmentation There is a special folder for debugging information: @@ -78,7 +77,7 @@ There is a special folder for debugging information: # ls /sys/kernel/debug/batman_adv/bat0/ # bla_backbone_table log transtable_global # bla_claim_table originators transtable_local -# gateways socket vis_data +# gateways socket Some of the files contain all sort of status information regard- ing the mesh network. For example, you can view the table of @@ -127,51 +126,6 @@ ously assigned to interfaces now used by batman advanced, e.g. # ifconfig eth0 0.0.0.0 -VISUALIZATION -------------- - -If you want topology visualization, at least one mesh node must -be configured as VIS-server: - -# echo "server" > /sys/class/net/bat0/mesh/vis_mode - -Each node is either configured as "server" or as "client" (de- -fault: "client"). Clients send their topology data to the server -next to them, and server synchronize with other servers. If there -is no server configured (default) within the mesh, no topology -information will be transmitted. With these "synchronizing -servers", there can be 1 or more vis servers sharing the same (or -at least very similar) data. - -When configured as server, you can get a topology snapshot of -your mesh: - -# cat /sys/kernel/debug/batman_adv/bat0/vis_data - -This raw output is intended to be easily parsable and convertable -with other tools. Have a look at the batctl README if you want a -vis output in dot or json format for instance and how those out- -puts could then be visualised in an image. - -The raw format consists of comma separated values per entry where -each entry is giving information about a certain source inter- -face. Each entry can/has to have the following values: --> "mac" - mac address of an originator's source interface - (each line begins with it) --> "TQ mac value" - src mac's link quality towards mac address - of a neighbor originator's interface which - is being used for routing --> "TT mac" - TT announced by source mac --> "PRIMARY" - this is a primary interface --> "SEC mac" - secondary mac address of source - (requires preceding PRIMARY) - -The TQ value has a range from 4 to 255 with 255 being the best. -The TT entries are showing which hosts are connected to the mesh -via bat0 or being bridged into the mesh network. The PRIMARY/SEC -values are only applied on primary interfaces - - LOGGING/DEBUGGING ----------------- @@ -245,5 +199,5 @@ Mailing-list: b.a.t.m.a.n@open-mesh.org (optional subscription You can also contact the Authors: -Marek Lindner <lindner_marek@yahoo.de> -Simon Wunderlich <siwu@hrz.tu-chemnitz.de> +Marek Lindner <mareklindner@neomailbox.ch> +Simon Wunderlich <sw@simonwunderlich.de> diff --git a/Documentation/networking/bonding.txt b/Documentation/networking/bonding.txt index 9b28e714831a..2cdb8b66caa9 100644 --- a/Documentation/networking/bonding.txt +++ b/Documentation/networking/bonding.txt @@ -639,6 +639,15 @@ num_unsol_na are generated by the ipv4 and ipv6 code and the numbers of repetitions cannot be set independently. +packets_per_slave + + Specify the number of packets to transmit through a slave before + moving to the next one. When set to 0 then a slave is chosen at + random. + + The valid range is 0 - 65535; the default value is 1. This option + has effect only in balance-rr mode. + primary A string (eth0, eth2, etc) specifying which slave is the @@ -743,21 +752,16 @@ xmit_hash_policy protocol information to generate the hash. Uses XOR of hardware MAC addresses and IP addresses to - generate the hash. The IPv4 formula is - - (((source IP XOR dest IP) AND 0xffff) XOR - ( source MAC XOR destination MAC )) - modulo slave count - - The IPv6 formula is + generate the hash. The formula is - hash = (source ip quad 2 XOR dest IP quad 2) XOR - (source ip quad 3 XOR dest IP quad 3) XOR - (source ip quad 4 XOR dest IP quad 4) + hash = source MAC XOR destination MAC + hash = hash XOR source IP XOR destination IP + hash = hash XOR (hash RSHIFT 16) + hash = hash XOR (hash RSHIFT 8) + And then hash is reduced modulo slave count. - (((hash >> 24) XOR (hash >> 16) XOR (hash >> 8) XOR hash) - XOR (source MAC XOR destination MAC)) - modulo slave count + If the protocol is IPv6 then the source and destination + addresses are first hashed using ipv6_addr_hash. This algorithm will place all traffic to a particular network peer on the same slave. For non-IP traffic, @@ -779,21 +783,16 @@ xmit_hash_policy slaves, although a single connection will not span multiple slaves. - The formula for unfragmented IPv4 TCP and UDP packets is + The formula for unfragmented TCP and UDP packets is - ((source port XOR dest port) XOR - ((source IP XOR dest IP) AND 0xffff) - modulo slave count + hash = source port, destination port (as in the header) + hash = hash XOR source IP XOR destination IP + hash = hash XOR (hash RSHIFT 16) + hash = hash XOR (hash RSHIFT 8) + And then hash is reduced modulo slave count. - The formula for unfragmented IPv6 TCP and UDP packets is - - hash = (source port XOR dest port) XOR - ((source ip quad 2 XOR dest IP quad 2) XOR - (source ip quad 3 XOR dest IP quad 3) XOR - (source ip quad 4 XOR dest IP quad 4)) - - ((hash >> 24) XOR (hash >> 16) XOR (hash >> 8) XOR hash) - modulo slave count + If the protocol is IPv6 then the source and destination + addresses are first hashed using ipv6_addr_hash. For fragmented TCP or UDP packets and all other IPv4 and IPv6 protocol traffic, the source and destination port @@ -801,10 +800,6 @@ xmit_hash_policy formula is the same as for the layer2 transmit hash policy. - The IPv4 policy is intended to mimic the behavior of - certain switches, notably Cisco switches with PFC2 as - well as some Foundry and IBM products. - This algorithm is not fully 802.3ad compliant. A single TCP or UDP conversation containing both fragmented and unfragmented packets will see packets @@ -815,6 +810,26 @@ xmit_hash_policy conversations. Other implementations of 802.3ad may or may not tolerate this noncompliance. + encap2+3 + + This policy uses the same formula as layer2+3 but it + relies on skb_flow_dissect to obtain the header fields + which might result in the use of inner headers if an + encapsulation protocol is used. For example this will + improve the performance for tunnel users because the + packets will be distributed according to the encapsulated + flows. + + encap3+4 + + This policy uses the same formula as layer3+4 but it + relies on skb_flow_dissect to obtain the header fields + which might result in the use of inner headers if an + encapsulation protocol is used. For example this will + improve the performance for tunnel users because the + packets will be distributed according to the encapsulated + flows. + The default value is layer2. This option was added in bonding version 2.6.3. In earlier versions of bonding, this parameter does not exist, and the layer2 policy is the only policy. The diff --git a/Documentation/networking/can.txt b/Documentation/networking/can.txt index 820f55344edc..4c072414eadb 100644 --- a/Documentation/networking/can.txt +++ b/Documentation/networking/can.txt @@ -25,6 +25,12 @@ This file contains 4.1.5 RAW socket option CAN_RAW_FD_FRAMES 4.1.6 RAW socket returned message flags 4.2 Broadcast Manager protocol sockets (SOCK_DGRAM) + 4.2.1 Broadcast Manager operations + 4.2.2 Broadcast Manager message flags + 4.2.3 Broadcast Manager transmission timers + 4.2.4 Broadcast Manager message sequence transmission + 4.2.5 Broadcast Manager receive filter timers + 4.2.6 Broadcast Manager multiplex message receive filter 4.3 connected transport protocols (SOCK_SEQPACKET) 4.4 unconnected transport protocols (SOCK_DGRAM) @@ -593,6 +599,217 @@ solution for a couple of reasons: In order to receive such messages, CAN_RAW_RECV_OWN_MSGS must be set. 4.2 Broadcast Manager protocol sockets (SOCK_DGRAM) + + The Broadcast Manager protocol provides a command based configuration + interface to filter and send (e.g. cyclic) CAN messages in kernel space. + + Receive filters can be used to down sample frequent messages; detect events + such as message contents changes, packet length changes, and do time-out + monitoring of received messages. + + Periodic transmission tasks of CAN frames or a sequence of CAN frames can be + created and modified at runtime; both the message content and the two + possible transmit intervals can be altered. + + A BCM socket is not intended for sending individual CAN frames using the + struct can_frame as known from the CAN_RAW socket. Instead a special BCM + configuration message is defined. The basic BCM configuration message used + to communicate with the broadcast manager and the available operations are + defined in the linux/can/bcm.h include. The BCM message consists of a + message header with a command ('opcode') followed by zero or more CAN frames. + The broadcast manager sends responses to user space in the same form: + + struct bcm_msg_head { + __u32 opcode; /* command */ + __u32 flags; /* special flags */ + __u32 count; /* run 'count' times with ival1 */ + struct timeval ival1, ival2; /* count and subsequent interval */ + canid_t can_id; /* unique can_id for task */ + __u32 nframes; /* number of can_frames following */ + struct can_frame frames[0]; + }; + + The aligned payload 'frames' uses the same basic CAN frame structure defined + at the beginning of section 4 and in the include/linux/can.h include. All + messages to the broadcast manager from user space have this structure. + + Note a CAN_BCM socket must be connected instead of bound after socket + creation (example without error checking): + + int s; + struct sockaddr_can addr; + struct ifreq ifr; + + s = socket(PF_CAN, SOCK_DGRAM, CAN_BCM); + + strcpy(ifr.ifr_name, "can0"); + ioctl(s, SIOCGIFINDEX, &ifr); + + addr.can_family = AF_CAN; + addr.can_ifindex = ifr.ifr_ifindex; + + connect(s, (struct sockaddr *)&addr, sizeof(addr)) + + (..) + + The broadcast manager socket is able to handle any number of in flight + transmissions or receive filters concurrently. The different RX/TX jobs are + distinguished by the unique can_id in each BCM message. However additional + CAN_BCM sockets are recommended to communicate on multiple CAN interfaces. + When the broadcast manager socket is bound to 'any' CAN interface (=> the + interface index is set to zero) the configured receive filters apply to any + CAN interface unless the sendto() syscall is used to overrule the 'any' CAN + interface index. When using recvfrom() instead of read() to retrieve BCM + socket messages the originating CAN interface is provided in can_ifindex. + + 4.2.1 Broadcast Manager operations + + The opcode defines the operation for the broadcast manager to carry out, + or details the broadcast managers response to several events, including + user requests. + + Transmit Operations (user space to broadcast manager): + + TX_SETUP: Create (cyclic) transmission task. + + TX_DELETE: Remove (cyclic) transmission task, requires only can_id. + + TX_READ: Read properties of (cyclic) transmission task for can_id. + + TX_SEND: Send one CAN frame. + + Transmit Responses (broadcast manager to user space): + + TX_STATUS: Reply to TX_READ request (transmission task configuration). + + TX_EXPIRED: Notification when counter finishes sending at initial interval + 'ival1'. Requires the TX_COUNTEVT flag to be set at TX_SETUP. + + Receive Operations (user space to broadcast manager): + + RX_SETUP: Create RX content filter subscription. + + RX_DELETE: Remove RX content filter subscription, requires only can_id. + + RX_READ: Read properties of RX content filter subscription for can_id. + + Receive Responses (broadcast manager to user space): + + RX_STATUS: Reply to RX_READ request (filter task configuration). + + RX_TIMEOUT: Cyclic message is detected to be absent (timer ival1 expired). + + RX_CHANGED: BCM message with updated CAN frame (detected content change). + Sent on first message received or on receipt of revised CAN messages. + + 4.2.2 Broadcast Manager message flags + + When sending a message to the broadcast manager the 'flags' element may + contain the following flag definitions which influence the behaviour: + + SETTIMER: Set the values of ival1, ival2 and count + + STARTTIMER: Start the timer with the actual values of ival1, ival2 + and count. Starting the timer leads simultaneously to emit a CAN frame. + + TX_COUNTEVT: Create the message TX_EXPIRED when count expires + + TX_ANNOUNCE: A change of data by the process is emitted immediately. + + TX_CP_CAN_ID: Copies the can_id from the message header to each + subsequent frame in frames. This is intended as usage simplification. For + TX tasks the unique can_id from the message header may differ from the + can_id(s) stored for transmission in the subsequent struct can_frame(s). + + RX_FILTER_ID: Filter by can_id alone, no frames required (nframes=0). + + RX_CHECK_DLC: A change of the DLC leads to an RX_CHANGED. + + RX_NO_AUTOTIMER: Prevent automatically starting the timeout monitor. + + RX_ANNOUNCE_RESUME: If passed at RX_SETUP and a receive timeout occured, a + RX_CHANGED message will be generated when the (cyclic) receive restarts. + + TX_RESET_MULTI_IDX: Reset the index for the multiple frame transmission. + + RX_RTR_FRAME: Send reply for RTR-request (placed in op->frames[0]). + + 4.2.3 Broadcast Manager transmission timers + + Periodic transmission configurations may use up to two interval timers. + In this case the BCM sends a number of messages ('count') at an interval + 'ival1', then continuing to send at another given interval 'ival2'. When + only one timer is needed 'count' is set to zero and only 'ival2' is used. + When SET_TIMER and START_TIMER flag were set the timers are activated. + The timer values can be altered at runtime when only SET_TIMER is set. + + 4.2.4 Broadcast Manager message sequence transmission + + Up to 256 CAN frames can be transmitted in a sequence in the case of a cyclic + TX task configuration. The number of CAN frames is provided in the 'nframes' + element of the BCM message head. The defined number of CAN frames are added + as array to the TX_SETUP BCM configuration message. + + /* create a struct to set up a sequence of four CAN frames */ + struct { + struct bcm_msg_head msg_head; + struct can_frame frame[4]; + } mytxmsg; + + (..) + mytxmsg.nframes = 4; + (..) + + write(s, &mytxmsg, sizeof(mytxmsg)); + + With every transmission the index in the array of CAN frames is increased + and set to zero at index overflow. + + 4.2.5 Broadcast Manager receive filter timers + + The timer values ival1 or ival2 may be set to non-zero values at RX_SETUP. + When the SET_TIMER flag is set the timers are enabled: + + ival1: Send RX_TIMEOUT when a received message is not received again within + the given time. When START_TIMER is set at RX_SETUP the timeout detection + is activated directly - even without a former CAN frame reception. + + ival2: Throttle the received message rate down to the value of ival2. This + is useful to reduce messages for the application when the signal inside the + CAN frame is stateless as state changes within the ival2 periode may get + lost. + + 4.2.6 Broadcast Manager multiplex message receive filter + + To filter for content changes in multiplex message sequences an array of more + than one CAN frames can be passed in a RX_SETUP configuration message. The + data bytes of the first CAN frame contain the mask of relevant bits that + have to match in the subsequent CAN frames with the received CAN frame. + If one of the subsequent CAN frames is matching the bits in that frame data + mark the relevant content to be compared with the previous received content. + Up to 257 CAN frames (multiplex filter bit mask CAN frame plus 256 CAN + filters) can be added as array to the TX_SETUP BCM configuration message. + + /* usually used to clear CAN frame data[] - beware of endian problems! */ + #define U64_DATA(p) (*(unsigned long long*)(p)->data) + + struct { + struct bcm_msg_head msg_head; + struct can_frame frame[5]; + } msg; + + msg.msg_head.opcode = RX_SETUP; + msg.msg_head.can_id = 0x42; + msg.msg_head.flags = 0; + msg.msg_head.nframes = 5; + U64_DATA(&msg.frame[0]) = 0xFF00000000000000ULL; /* MUX mask */ + U64_DATA(&msg.frame[1]) = 0x01000000000000FFULL; /* data mask (MUX 0x01) */ + U64_DATA(&msg.frame[2]) = 0x0200FFFF000000FFULL; /* data mask (MUX 0x02) */ + U64_DATA(&msg.frame[3]) = 0x330000FFFFFF0003ULL; /* data mask (MUX 0x33) */ + U64_DATA(&msg.frame[4]) = 0x4F07FC0FF0000000ULL; /* data mask (MUX 0x4F) */ + + write(s, &msg, sizeof(msg)); + 4.3 connected transport protocols (SOCK_SEQPACKET) 4.4 unconnected transport protocols (SOCK_DGRAM) diff --git a/Documentation/networking/ip-sysctl.txt b/Documentation/networking/ip-sysctl.txt index a46d78583ae1..8b8a05787641 100644 --- a/Documentation/networking/ip-sysctl.txt +++ b/Documentation/networking/ip-sysctl.txt @@ -267,17 +267,6 @@ tcp_max_orphans - INTEGER more aggressively. Let me to remind again: each orphan eats up to ~64K of unswappable memory. -tcp_max_ssthresh - INTEGER - Limited Slow-Start for TCP with large congestion windows (cwnd) defined in - RFC3742. Limited slow-start is a mechanism to limit growth of the cwnd - on the region where cwnd is larger than tcp_max_ssthresh. TCP increases cwnd - by at most tcp_max_ssthresh segments, and by at least tcp_max_ssthresh/2 - segments per RTT when the cwnd is above tcp_max_ssthresh. - If TCP connection increased cwnd to thousands (or tens of thousands) segments, - and thousands of packets were being dropped during slow-start, you can set - tcp_max_ssthresh to improve performance for new TCP connection. - Default: 0 (off) - tcp_max_syn_backlog - INTEGER Maximal number of remembered connection requests, which have not received an acknowledgment from connecting client. @@ -451,7 +440,7 @@ tcp_fastopen - INTEGER connect() to perform a TCP handshake automatically. The values (bitmap) are - 1: Enables sending data in the opening SYN on the client. + 1: Enables sending data in the opening SYN on the client w/ MSG_FASTOPEN. 2: Enables TCP Fast Open on the server side, i.e., allowing data in a SYN packet to be accepted and passed to the application before 3-way hand shake finishes. @@ -464,7 +453,7 @@ tcp_fastopen - INTEGER different ways of setting max_qlen without the TCP_FASTOPEN socket option. - Default: 0 + Default: 1 Note that the client & server side Fast Open flags (1 and 2 respectively) must be also enabled before the rest of flags can take diff --git a/Documentation/networking/netdevices.txt b/Documentation/networking/netdevices.txt index c7ecc7080494..0b1cf6b2a592 100644 --- a/Documentation/networking/netdevices.txt +++ b/Documentation/networking/netdevices.txt @@ -10,12 +10,12 @@ network devices. struct net_device allocation rules ================================== Network device structures need to persist even after module is unloaded and -must be allocated with kmalloc. If device has registered successfully, -it will be freed on last use by free_netdev. This is required to handle the -pathologic case cleanly (example: rmmod mydriver </sys/class/net/myeth/mtu ) +must be allocated with alloc_netdev_mqs() and friends. +If device has registered successfully, it will be freed on last use +by free_netdev(). This is required to handle the pathologic case cleanly +(example: rmmod mydriver </sys/class/net/myeth/mtu ) -There are routines in net_init.c to handle the common cases of -alloc_etherdev, alloc_netdev. These reserve extra space for driver +alloc_netdev_mqs()/alloc_netdev() reserve extra space for driver private data which gets freed when the network device is freed. If separately allocated data is attached to the network device (netdev_priv(dev)) then it is up to the module exit handler to free that. diff --git a/Documentation/ptp/testptp.c b/Documentation/ptp/testptp.c index f59ded066108..a74d0a84d329 100644 --- a/Documentation/ptp/testptp.c +++ b/Documentation/ptp/testptp.c @@ -100,6 +100,11 @@ static long ppb_to_scaled_ppm(int ppb) return (long) (ppb * 65.536); } +static int64_t pctns(struct ptp_clock_time *t) +{ + return t->sec * 1000000000LL + t->nsec; +} + static void usage(char *progname) { fprintf(stderr, @@ -112,6 +117,8 @@ static void usage(char *progname) " -f val adjust the ptp clock frequency by 'val' ppb\n" " -g get the ptp clock time\n" " -h prints this message\n" + " -k val measure the time offset between system and phc clock\n" + " for 'val' times (Maximum 25)\n" " -p val enable output with a period of 'val' nanoseconds\n" " -P val enable or disable (val=1|0) the system clock PPS\n" " -s set the ptp clock time from the system time\n" @@ -133,8 +140,12 @@ int main(int argc, char *argv[]) struct itimerspec timeout; struct sigevent sigevent; + struct ptp_clock_time *pct; + struct ptp_sys_offset *sysoff; + + char *progname; - int c, cnt, fd; + int i, c, cnt, fd; char *device = DEVICE; clockid_t clkid; @@ -144,14 +155,19 @@ int main(int argc, char *argv[]) int extts = 0; int gettime = 0; int oneshot = 0; + int pct_offset = 0; + int n_samples = 0; int periodic = 0; int perout = -1; int pps = -1; int settime = 0; + int64_t t1, t2, tp; + int64_t interval, offset; + progname = strrchr(argv[0], '/'); progname = progname ? 1+progname : argv[0]; - while (EOF != (c = getopt(argc, argv, "a:A:cd:e:f:ghp:P:sSt:v"))) { + while (EOF != (c = getopt(argc, argv, "a:A:cd:e:f:ghk:p:P:sSt:v"))) { switch (c) { case 'a': oneshot = atoi(optarg); @@ -174,6 +190,10 @@ int main(int argc, char *argv[]) case 'g': gettime = 1; break; + case 'k': + pct_offset = 1; + n_samples = atoi(optarg); + break; case 'p': perout = atoi(optarg); break; @@ -376,6 +396,47 @@ int main(int argc, char *argv[]) } } + if (pct_offset) { + if (n_samples <= 0 || n_samples > 25) { + puts("n_samples should be between 1 and 25"); + usage(progname); + return -1; + } + + sysoff = calloc(1, sizeof(*sysoff)); + if (!sysoff) { + perror("calloc"); + return -1; + } + sysoff->n_samples = n_samples; + + if (ioctl(fd, PTP_SYS_OFFSET, sysoff)) + perror("PTP_SYS_OFFSET"); + else + puts("system and phc clock time offset request okay"); + + pct = &sysoff->ts[0]; + for (i = 0; i < sysoff->n_samples; i++) { + t1 = pctns(pct+2*i); + tp = pctns(pct+2*i+1); + t2 = pctns(pct+2*i+2); + interval = t2 - t1; + offset = (t2 + t1) / 2 - tp; + + printf("system time: %ld.%ld\n", + (pct+2*i)->sec, (pct+2*i)->nsec); + printf("phc time: %ld.%ld\n", + (pct+2*i+1)->sec, (pct+2*i+1)->nsec); + printf("system time: %ld.%ld\n", + (pct+2*i+2)->sec, (pct+2*i+2)->nsec); + printf("system/phc clock time offset is %ld ns\n" + "system clock time delay is %ld ns\n", + offset, interval); + } + + free(sysoff); + } + close(fd); return 0; } |