summaryrefslogtreecommitdiff
Commit message (Collapse)AuthorAgeFilesLines
...
* igb: Use netif_set_real_num_{rx,tx}_queues()Ben Hutchings2010-09-271-4/+8
| | | | | Signed-off-by: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* gianfar: Use netif_set_real_num_rx_queues()Ben Hutchings2010-09-271-2/+1
| | | | | | | | Do not set num_tx_queues or real_num_tx_queues, since alloc_etherdev_mq() does that. Signed-off-by: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* cxgb4vf: Use netif_set_real_num_{rx, tx}_queues()Ben Hutchings2010-09-271-1/+4
| | | | | Signed-off-by: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* cxgb4: Use netif_set_real_num_{rx,tx}_queues()Ben Hutchings2010-09-271-1/+4
| | | | | Signed-off-by: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* cxgb3: Use netif_set_real_num_{rx,tx}_queues()Ben Hutchings2010-09-271-1/+4
| | | | | Signed-off-by: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* bnx2x: Use netif_set_real_num_{rx,tx}_queues()Ben Hutchings2010-09-271-2/+4
| | | | | Signed-off-by: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* bnx2: Use netif_set_real_num_{rx,tx}_queues()Ben Hutchings2010-09-271-3/+6
| | | | | Signed-off-by: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* net: Add netif_copy_real_num_queues() for use by virtual net driversBen Hutchings2010-09-271-0/+12
| | | | | | | This sets the active numbers of queues on a net device to match another. Signed-off-by: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* net: Allow changing number of RX queues after device allocationBen Hutchings2010-09-274-19/+78
| | | | | | | | | | | | | | | For RPS, we create a kobject for each RX queue based on the number of queues passed to alloc_netdev_mq(). However, drivers generally do not determine the numbers of hardware queues to use until much later, so this usually represents the maximum number the driver may use and not the actual number in use. For TX queues, drivers can update the actual number using netif_set_real_num_tx_queues(). Add a corresponding function for RX queues, netif_set_real_num_rx_queues(). Signed-off-by: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* net: sk_{detach|attach}_filter() rcu fixesEric Dumazet2010-09-271-6/+4
| | | | | | | | | | | sk_attach_filter() and sk_detach_filter() are run with socket locked. Use the appropriate rcu_dereference_protected() instead of blocking BH, and rcu_dereference_bh(). There is no point adding BH prevention and memory barrier. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* fib: use atomic_inc_not_zero() in fib_rules_lookupEric Dumazet2010-09-271-3/+5
| | | | | | | | | | | | | | | | | | It seems we dont use appropriate refcount increment in an rcu_read_lock() protected section. fib_rule_get() might increment a null refcount and bad things could happen. While fib_nl_delrule() respects an rcu grace period before calling fib_rule_put(), fib_rules_cleanup_ops() calls fib_rule_put() without a grace period. Note : after this patch, we might avoid the synchronize_rcu() call done in fib_nl_delrule() Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* sit: percpu stats accountingEric Dumazet2010-09-271-18/+64
| | | | | | | | | | | | | Maintain per_cpu tx_bytes, tx_packets, rx_bytes, rx_packets. Other seldom used fields are kept in netdev->stats structure, possibly unsafe. This is a preliminary work to support lockless transmit path, and correct RX stats, that are already unsafe. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* ipip: percpu stats accountingEric Dumazet2010-09-271-34/+93
| | | | | | | | | | | | | Maintain per_cpu tx_bytes, tx_packets, rx_bytes, rx_packets. Other seldom used fields are kept in netdev->stats structure, possibly unsafe. This is a preliminary work to support lockless transmit path, and correct RX stats, that are already unsafe. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* ip_gre: percpu stats accountingEric Dumazet2010-09-271-39/+104
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Le lundi 27 septembre 2010 à 14:29 +0100, Ben Hutchings a écrit : > > diff --git a/net/ipv4/ip_gre.c b/net/ipv4/ip_gre.c > > index 5d6ddcb..de39b22 100644 > > --- a/net/ipv4/ip_gre.c > > +++ b/net/ipv4/ip_gre.c > [...] > > @@ -377,7 +405,7 @@ static struct ip_tunnel *ipgre_tunnel_locate(struct net *net, > > if (parms->name[0]) > > strlcpy(name, parms->name, IFNAMSIZ); > > else > > - sprintf(name, "gre%%d"); > > + strcpy(name, "gre%d"); > > > > dev = alloc_netdev(sizeof(*t), name, ipgre_tunnel_setup); > > if (!dev) > [...] > > This is a valid fix, but doesn't belong in this patch! > Sorry ? It was not a fix, but at most a cleanup ;) Anyway I forgot the gretap case... [PATCH 2/4 v2] ip_gre: percpu stats accounting Maintain per_cpu tx_bytes, tx_packets, rx_bytes, rx_packets. Other seldom used fields are kept in netdev->stats structure, possibly unsafe. This is a preliminary work to support lockless transmit path, and correct RX stats, that are already unsafe. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* tunnels: prepare percpu accountingEric Dumazet2010-09-273-10/+27
| | | | | | | | | | | | | Tunnels are going to use percpu for their accounting. They are going to use a new tstats field in net_device. skb_tunnel_rx() is changed to be a wrapper around __skb_tunnel_rx() IPTUNNEL_XMIT() is changed to be a wrapper around __IPTUNNEL_XMIT() Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* vlan: use this_cpu_ptr() in vlan_skb_recv()Eric Dumazet2010-09-271-2/+2
| | | | | Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* Phonet: Implement Pipe Controller to support Nokia Slim ModemsKumar Sanghvi2010-09-274-6/+479
| | | | | | | | | | | | | Phonet stack assumes the presence of Pipe Controller, either in Modem or on Application Processing Engine user-space for the Pipe data. Nokia Slim Modems like WG2.5 used in ST-Ericsson U8500 platform do not implement Pipe controller in them. This patch adds Pipe Controller implemenation to Phonet stack to support Pipe data over Phonet stack for Nokia Slim Modems. Signed-off-by: Kumar Sanghvi <kumar.sanghvi@stericsson.com> Acked-by: Linus Walleij <linus.walleij@stericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* Merge branch 'master' of ↵David S. Miller2010-09-2767-214/+549
|\ | | | | | | | | | | | | | | master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 Conflicts: drivers/net/qlcnic/qlcnic_init.c net/ipv4/ip_output.c
| * ipv6: add a missing unregister_pernet_subsys callNeil Horman2010-09-263-3/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Clean up a missing exit path in the ipv6 module init routines. In addrconf_init we call ipv6_addr_label_init which calls register_pernet_subsys for the ipv6_addr_label_ops structure. But if module loading fails, or if the ipv6 module is removed, there is no corresponding unregister_pernet_subsys call, which leaves a now-bogus address on the pernet_list, leading to oopses in subsequent registrations. This patch cleans up both the failed load path and the unload path. Tested by myself with good results. Signed-off-by: Neil Horman <nhorman@tuxdriver.com> include/net/addrconf.h | 1 + net/ipv6/addrconf.c | 11 ++++++++--- net/ipv6/addrlabel.c | 5 +++++ 3 files changed, 14 insertions(+), 3 deletions(-) Signed-off-by: David S. Miller <davem@davemloft.net>
| * s390: use free_netdev(netdev) instead of kfree()Vasiliy Kulikov2010-09-261-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Freeing netdev without free_netdev() leads to net, tx leaks. I might lead to dereferencing freed pointer. The semantic match that finds this problem is as follows: (http://coccinelle.lip6.fr/) @@ struct net_device* dev; @@ -kfree(dev) +free_netdev(dev) Signed-off-by: David S. Miller <davem@davemloft.net>
| * sgiseeq: use free_netdev(netdev) instead of kfree()Kulikov Vasiliy2010-09-261-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Freeing netdev without free_netdev() leads to net, tx leaks. I might lead to dereferencing freed pointer. The semantic match that finds this problem is as follows: (http://coccinelle.lip6.fr/) @@ struct net_device* dev; @@ -kfree(dev) +free_netdev(dev) Signed-off-by: David S. Miller <davem@davemloft.net>
| * rionet: use free_netdev(netdev) instead of kfree()Kulikov Vasiliy2010-09-261-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Freeing netdev without free_netdev() leads to net, tx leaks. I might lead to dereferencing freed pointer. The semantic match that finds this problem is as follows: (http://coccinelle.lip6.fr/) @@ struct net_device* dev; @@ -kfree(dev) +free_netdev(dev) Signed-off-by: David S. Miller <davem@davemloft.net>
| * ibm_newemac: use free_netdev(netdev) instead of kfree()Kulikov Vasiliy2010-09-261-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Freeing netdev without free_netdev() leads to net, tx leaks. I might lead to dereferencing freed pointer. The semantic match that finds this problem is as follows: (http://coccinelle.lip6.fr/) @@ struct net_device* dev; @@ -kfree(dev) +free_netdev(dev) Signed-off-by: David S. Miller <davem@davemloft.net>
| * smsc911x: Add MODULE_ALIAS()Vincent Stehlé2010-09-261-0/+1
| | | | | | | | | | | | | | This enables auto loading for the smsc911x ethernet driver. Signed-off-by: Vincent Stehlé <vincent.stehle@laposte.net> Signed-off-by: David S. Miller <davem@davemloft.net>
| * net: reset skb queue mapping when rx'ing over tunnelTom Herbert2010-09-261-0/+1
| | | | | | | | | | | | | | | | | | | | Reset queue mapping when an skb is reentering the stack via a tunnel. On second pass, the queue mapping from the original device is no longer valid. Signed-off-by: Tom Herbert <therbert@google.com> Acked-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * br2684: fix scheduling while atomicKarl Hiramoto2010-09-261-10/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | You can't call atomic_notifier_chain_unregister() while in atomic context. Fix, call un/register_atmdevice_notifier in module __init and __exit. Bug report: http://comments.gmane.org/gmane.linux.network/172603 Reported-by: Mikko Vinni <mmvinni@yahoo.com> Tested-by: Mikko Vinni <mmvinni@yahoo.com> Signed-off-by: Karl Hiramoto <karl@hiramoto.org> Signed-off-by: David S. Miller <davem@davemloft.net>
| * de2104x: fix TP link detectionOndrej Zary2010-09-261-2/+21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Compex FreedomLine 32 PnP-PCI2 cards have only TP and BNC connectors but the SROM contains AUI port too. When TP loses link, the driver switches to non-existing AUI port (which reports that carrier is always present). Connecting TP back generates LinkPass interrupt but de_media_interrupt() is broken - it only updates the link state of currently connected media, ignoring the fact that LinkPass and LinkFail bits of MacStatus register belong to the TP port only (the chip documentation says that). This patch changes de_media_interrupt() to switch media to TP when link goes up (and media type is not locked) and also to update the link state only when the TP port is used. Also the NonselPortActive (and also SelPortActive) bits of SIAStatus register need to be cleared (by writing 1) after reading or they're useless. Signed-off-by: Ondrej Zary <linux@rainbow-software.org> Acked-by: Jeff Garzik <jgarzik@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * de2104x: fix power managementOndrej Zary2010-09-261-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | At least my 21041 cards come out of suspend with bus mastering disabled so they did not work after resume(no data transferred). After adding pci_set_master(), the driver oopsed immediately on resume - because de_clean_rings() is called on suspend but de_init_rings() call was missing in resume. Also disable link (reset SIA) before sleep (de4x5 does this too). Signed-off-by: Ondrej Zary <linux@rainbow-software.org> Acked-by: Jeff Garzik <jgarzik@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * de2104x: disable autonegotiation on broken hardwareOndrej Zary2010-09-241-2/+10
| | | | | | | | | | | | | | | | | | | | | | | | At least on older 21041-AA chips (mine is rev. 11), TP duplex autonegotiation causes the card not to work at all (link is up but no packets are transmitted). de4x5 disables autonegotiation completely. But it seems to work on newer (21041-PA rev. 21) so disable it only on rev<20 chips. Signed-off-by: Ondrej Zary <linux@rainbow-software.org> Acked-by: Jeff Garzik <jgarzik@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * net: fix a lockdep splatEric Dumazet2010-09-246-26/+26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We have for each socket : One spinlock (sk_slock.slock) One rwlock (sk_callback_lock) Possible scenarios are : (A) (this is used in net/sunrpc/xprtsock.c) read_lock(&sk->sk_callback_lock) (without blocking BH) <BH> spin_lock(&sk->sk_slock.slock); ... read_lock(&sk->sk_callback_lock); ... (B) write_lock_bh(&sk->sk_callback_lock) stuff write_unlock_bh(&sk->sk_callback_lock) (C) spin_lock_bh(&sk->sk_slock) ... write_lock_bh(&sk->sk_callback_lock) stuff write_unlock_bh(&sk->sk_callback_lock) spin_unlock_bh(&sk->sk_slock) This (C) case conflicts with (A) : CPU1 [A] CPU2 [C] read_lock(callback_lock) <BH> spin_lock_bh(slock) <wait to spin_lock(slock)> <wait to write_lock_bh(callback_lock)> We have one problematic (C) use case in inet_csk_listen_stop() : local_bh_disable(); bh_lock_sock(child); // spin_lock_bh(&sk->sk_slock) WARN_ON(sock_owned_by_user(child)); ... sock_orphan(child); // write_lock_bh(&sk->sk_callback_lock) lockdep is not happy with this, as reported by Tetsuo Handa It seems only way to deal with this is to use read_lock_bh(callbacklock) everywhere. Thanks to Jarek for pointing a bug in my first attempt and suggesting this solution. Reported-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Tested-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> CC: Jarek Poplawski <jarkao2@gmail.com> Tested-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * e1000e: 82579 do not gate auto config of PHY by hardware during nominal useBruce Allan2010-09-221-9/+68
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | For non-managed versions of 82579, set the bit that prevents the hardware from automatically configuring the PHY after resets only when the driver performs a reset, clear the bit after resets. This is so the hardware can configure the PHY automatically when the part is reset in a manner that is not controlled by the driver (e.g. in a virtual environment via PCI FLR) otherwise the PHY will be mis-configured causing issues such as failing to link at 1000Mbps. For managed versions of 82579, keep the previous behavior since the manageability firmware will handle the PHY configuration. Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * e1000e: 82579 jumbo frame workaround causing CRC errorsBruce Allan2010-09-222-21/+20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The subject workaround was causing CRC errors due to writing the wrong register with updates of the RCTL register. It was also found that the workaround function which modifies the RCTL register was being called in the middle of a read-modify-write operation of the RCTL register, so the function call has been moved appropriately. Lastly, jumbo frames must not be allowed when CRC stripping is disabled by a module parameter because the workaround requires the CRC be stripped. Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * e1000e: 82579 unaccounted missed packetsBruce Allan2010-09-222-0/+49
| | | | | | | | | | | | | | | | | | | | | | | | | | On 82579, there is a hardware bug that can cause received packets to not get transferred from the PHY to the MAC due to K1 (a power saving feature of the PHY-MAC interconnect similar to ASPM L1). Since the MAC controls the accounting of missed packets, these will go unnoticed. Workaround the issue by setting the K1 beacon duration according to the link speed. Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * e1000e: 82566DC fails to get linkBruce Allan2010-09-221-2/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Two recent patches to cleanup the reset[1] and initial PHY configuration[2] code paths for ICH/PCH devices inadvertently left out a 10msec delay and device ID check respectively which are necessary for the 82566DC (device id 0x104b) to be configured properly, otherwise it will not get link. [1] commit e98cac447cc1cc418dff1d610a5c79c4f2bdec7f [2] commit 3f0c16e84438d657d29446f85fe375794a93f159 CC: stable@kernel.org Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * e1000e: 82579 SMBus address and LEDs incorrect after device resetBruce Allan2010-09-221-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | Since the hardware is prevented from performing automatic PHY configuration (the driver does it instead), the OEM_WRITE_ENABLE bit in the EXTCNF_CTRL register will not get cleared preventing the SMBus address and the LED configuration to be written to the PHY registers. On 82579, do not check the OEM_WRITE_ENABLE bit. Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * e1000e: 82577/8/9 issues with device in SxBruce Allan2010-09-221-8/+39
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When going to Sx, disable gigabit in PHY (e1000_oem_bits_config_ich8lan) in addition to the MAC before configuring PHY wakeup otherwise the PHY configuration writes might be missed. Also write the LED configuration and SMBus address to the PHY registers (e1000_oem_bits_config_ich8lan and e1000_write_smbus_addr, respectively). The reset is no longer needed since re-auto-negotiation is forced in e1000_oem_bits_config_ich8lan and leaving it in causes issues with auto-negotiating the link. Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * xfrm4: strip ECN bits from tos fieldUlrich Weber2010-09-221-1/+1
| | | | | | | | | | | | | | | | otherwise ECT(1) bit will get interpreted as RTO_ONLINK and routing will fail with XfrmOutBundleGenError. Signed-off-by: Ulrich Weber <uweber@astaro.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * atl1: zero out CMB and SBM in atl1_free_ring_resourcesLuca Tettamanti2010-09-221-0/+6
| | | | | | | | | | | | | | | | | | They are allocated in atl1_setup_ring_resources, zero out the pointers in atl1_free_ring_resources (like the other resources). Signed-off-by: Luca Tettamanti <kronos.it@gmail.com> Acked-by: Chris Snook <chris.snook@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * atl1: fix resumeLuca Tettamanti2010-09-221-2/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | adapter->cmb.cmb is initialized when the device is opened and freed when it's closed. Accessing it unconditionally during resume results either in a crash (NULL pointer dereference, when the interface has not been opened yet) or data corruption (when the interface has been used and brought down adapter->cmb.cmb points to a deallocated memory area). Cc: stable@kernel.org Signed-off-by: Luca Tettamanti <kronos.it@gmail.com> Acked-by: Chris Snook <chris.snook@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * net: Move "struct net" declaration inside the __KERNEL__ macro guardOllie Wild2010-09-221-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | This patch reduces namespace pollution by moving the "struct net" declaration out of the userspace-facing portion of linux/netlink.h. It has no impact on the kernel. (This came up because we have several C++ applications which use "net" as a namespace name.) Signed-off-by: Ollie Wild <aaw@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * netfilter: nf_conntrack_defrag: check socket type before touching nodefrag flagJiri Olsa2010-09-221-1/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | we need to check proper socket type within ipv4_conntrack_defrag function before referencing the nodefrag flag. For example the tun driver receive path produces skbs with AF_UNSPEC socket type, and so current code is causing unwanted fragmented packets going out. Signed-off-by: Jiri Olsa <jolsa@redhat.com> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
| * netfilter: nf_nat_snmp: fix checksum calculation (v4)Patrick McHardy2010-09-221-2/+4
| | | | | | | | | | | | | | | | | | | | | | | | Fix checksum calculation in nf_nat_snmp_basic. Based on patches by Clark Wang <wtweeker@163.com> and Stephen Hemminger <shemminger@vyatta.com>. https://bugzilla.kernel.org/show_bug.cgi?id=17622 Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
| * netfilter: fix a race in nf_ct_ext_create()Eric Dumazet2010-09-221-1/+3
| | | | | | | | | | | | | | | | | | | | | | | | As soon as rcu_read_unlock() is called, there is no guarantee current thread can safely derefence t pointer, rcu protected. Fix is to copy t->alloc_size in a temporary variable. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
| * netfilter: fix ipt_REJECT TCP RST routing for indev == outdevChangli Gao2010-09-221-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | ip_route_me_harder can't create the route cache when the outdev is the same with the indev for the skbs whichout a valid protocol set. __mkroute_input functions has this check: 1998 if (skb->protocol != htons(ETH_P_IP)) { 1999 /* Not IP (i.e. ARP). Do not create route, if it is 2000 * invalid for proxy arp. DNAT routes are always valid. 2001 * 2002 * Proxy arp feature have been extended to allow, ARP 2003 * replies back to the same interface, to support 2004 * Private VLAN switch technologies. See arp.c. 2005 */ 2006 if (out_dev == in_dev && 2007 IN_DEV_PROXY_ARP_PVLAN(in_dev) == 0) { 2008 err = -EINVAL; 2009 goto cleanup; 2010 } 2011 } This patch gives the new skb a valid protocol to bypass this check. In order to make ipt_REJECT work with bridges, you also need to enable ip_forward. This patch also fixes a regression. When we used skb_copy_expand(), we didn't have this issue stated above, as the protocol was properly set. Signed-off-by: Changli Gao <xiaosuo@gmail.com> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
| * netfilter: nf_ct_sip: default to NF_ACCEPT in sip_help_tcp()Simon Horman2010-09-221-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | I initially noticed this because of the compiler warning below, but it does seem to be a valid concern in the case where ct_sip_get_header() returns 0 in the first iteration of the while loop. net/netfilter/nf_conntrack_sip.c: In function 'sip_help_tcp': net/netfilter/nf_conntrack_sip.c:1379: warning: 'ret' may be used uninitialized in this function Signed-off-by: Simon Horman <horms@verge.net.au> [Patrick: changed NF_DROP to NF_ACCEPT] Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
| * netfilter: tproxy: nf_tproxy_assign_sock() can handle tw socketsEric Dumazet2010-09-221-1/+5
| | | | | | | | | | | | | | | | | | | | | | transparent field of a socket is either inet_twsk(sk)->tw_transparent for timewait sockets, or inet_sk(sk)->transparent for other sockets (TCP/UDP). Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Acked-by: David S. Miller <davem@davemloft.net> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
| * ip: fix truesize mismatch in ip fragmentationEric Dumazet2010-09-212-11/+26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Special care should be taken when slow path is hit in ip_fragment() : When walking through frags, we transfert truesize ownership from skb to frags. Then if we hit a slow_path condition, we must undo this or risk uncharging frags->truesize twice, and in the end, having negative socket sk_wmem_alloc counter, or even freeing socket sooner than expected. Many thanks to Nick Bowler, who provided a very clean bug report and test program. Thanks to Jarek for reviewing my first patch and providing a V2 While Nick bisection pointed to commit 2b85a34e911 (net: No more expensive sock_hold()/sock_put() on each tx), underlying bug is older (2.6.12-rc5) A side effect is to extend work done in commit b2722b1c3a893e (ip_fragment: also adjust skb->truesize for packets not owned by a socket) to ipv6 as well. Reported-and-bisected-by: Nick Bowler <nbowler@elliptictech.com> Tested-by: Nick Bowler <nbowler@elliptictech.com> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> CC: Jarek Poplawski <jarkao2@gmail.com> CC: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
| * netxen: dont set skb->truesizeEric Dumazet2010-09-211-3/+0
| | | | | | | | | | | | | | | | | | skb->truesize is set in core network. Dont change it unless dealing with fragments. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * qlcnic: dont set skb->truesizeEric Dumazet2010-09-211-5/+0
| | | | | | | | | | | | | | | | | | skb->truesize is set in core network. Dont change it unless dealing with fragments. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * Merge branch 'master' of ↵David S. Miller2010-09-212-1/+6
| |\ | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-2.6