summaryrefslogtreecommitdiff
Commit message (Collapse)AuthorAgeFilesLines
* ARM: highbank: Add smc calls to enable/disable the L2Rob Herring2012-06-074-1/+47
| | | | | | | | | Linux runs in non-secure mode on highbank, so we need secure monitor calls to enable and disable the PL310. Rather than invent new smc calls, the same calling convention used by OMAP is used here. Signed-off-by: Rob Herring <rob.herring@calxeda.com> Signed-off-by: Olof Johansson <olof@lixom.net>
* Merge branch 'imx/fixes' of git://git.pengutronix.de/git/imx/linux-2.6 into ↵Olof Johansson2012-06-076-14/+29
|\ | | | | | | | | | | | | | | | | | | | | fixes * 'imx/fixes' of git://git.pengutronix.de/git/imx/linux-2.6: ARM i.MX imx21ads: Fix overlapping static i/o mappings ARM i.MX27 Visstrim M10: fix gpio handling. ARM: imx: only call l2x0_init if it's available ARM: imx: only specify i2c device type once ARM: mx31_3ds: Fix build due to missing IMX_HAVE_PLATFORM_IMX_SSI
| * ARM i.MX imx21ads: Fix overlapping static i/o mappingsJaccon Bastiaansen2012-06-071-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | The statically defined I/O memory regions for the i.MX21 on chip peripherals and the on board I/O peripherals of the i.MX21ADS board overlap. This results in a kernel crash during startup. This is fixed by reducing the memory range for the on board I/O peripherals to the actually required range. Signed-off-by: Jaccon Bastiaansen <jaccon.bastiaansen@gmail.com> Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de> Cc: stable <stable@vger.kernel.org>
| * ARM i.MX27 Visstrim M10: fix gpio handling.Javier Martin2012-06-051-11/+25
| | | | | | | | | | | | | | | | | | Some GPIOs in Visstrim M10 are used without being registered. This leads to USB and video malfunctions. This patch registers those GPIOs to solve the issue. Signed-off-by: Javier Martin <javier.martin@vista-silicon.com> Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
| * ARM: imx: only call l2x0_init if it's availableUwe Kleine-König2012-05-291-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This fixes a build failure with CONFIG_CACHE_L2X0=n: arch/arm/mach-imx/built-in.o: In function `imx3_init_l2x0': imx53-dt.c:(.init.text+0x190): undefined reference to `l2x0_init' make[2]: *** [.tmp_vmlinux1] Error 1 make[1]: *** [sub-make] Error 2 make: *** [all] Error 2 When the l2 cache isn't enabled the quirk introduced in 9524705 (MX35: Fix bogus L2 cache settings) doesn't need to be done either. Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
| * ARM: imx: only specify i2c device type onceUwe Kleine-König2012-05-292-2/+0
| | | | | | | | | | | | | | | | The first argument of I2C_BOARD_INFO is used to assign .type, so drop second assignment. Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
| * ARM: mx31_3ds: Fix build due to missing IMX_HAVE_PLATFORM_IMX_SSIFabio Estevam2012-05-291-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | commit 5fb86e5 (ARM: mx31_3ds: Add sound support) missed to select IMX_HAVE_PLATFORM_IMX_SSI, which causes the following build error when only mx31_3ds is selected: arch/arm/mach-imx/built-in.o: In function `mx31_3ds_init': mach-mx31_3ds.c:(.init.text+0x15dc): undefined reference to `imx_add_imx_ssi' mach-mx31_3ds.c:(.init.text+0x16b8): undefined reference to `imx31_imx_ssi_data' Select IMX_HAVE_PLATFORM_IMX_SSI to fix it. Signed-off-by: Fabio Estevam <fabio.estevam@freescale.com> Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
* | Merge branch 'imx/clk-fixes' of git://git.pengutronix.de/git/imx/linux-2.6 ↵Olof Johansson2012-06-072-39/+56
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | into fixes * 'imx/clk-fixes' of git://git.pengutronix.de/git/imx/linux-2.6: ARM i.MX53: Fix PLL4 base address ARM i.MX pllv2: make round_rate accurate ARM i.MX pllv2: use standard register set unconditionally
| * | ARM i.MX53: Fix PLL4 base addressSascha Hauer2012-06-041-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | MX53_DPLL4_BASE accidently returned the base address of PLL3. Fix this. Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de> Cc: stable@vger.kernel.org
| * | ARM i.MX pllv2: make round_rate accurateSascha Hauer2012-06-041-24/+54
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | in round_rate we made the assumption that we can set arbitrary frequencies and thus returned the input rate. This is not correct, for certain frequencies after setting a frequency with set_rate, recalc_rate will return different values. To fix this, introduce set_rate/recalc_rate functions which work on variables instead of registers directly. This way we can call these in round_rate to get the exact rate which we would get if we call set_rate with this value. Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
| * | ARM i.MX pllv2: use standard register set unconditionallySascha Hauer2012-06-041-24/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The i.MX5 PLL has two different register sets for setting the rate. One is used for the standard case and and is used for DVFS. Which one of them is used depends on a hardware input of the PLL. Current implementation reads back from the hardware which setting is used. This is bogus: If we ever want to implement DVFS we have to program both register sets and not only the one which happens to be used at the moment. For now, just use the standard register set uncondionally. Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
* | | Merge branch 'imx/fixes-for-3.5' of ↵Olof Johansson2012-06-073-1/+45
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.linaro.org/people/shawnguo/linux-2.6 into fixes * 'imx/fixes-for-3.5' of git://git.linaro.org/people/shawnguo/linux-2.6: ARM: imx6: exit coherency when shutting down a cpu ARM: mx51: Add pinctrl_provide_dummies() ARM: mx31: Add pinctrl_provide_dummies() Signed-off-by: Olof Johansson <olof@lixom.net>
| * | | ARM: imx6: exit coherency when shutting down a cpuShawn Guo2012-06-071-1/+41
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There is a system hang issue on imx6q which can easily be seen with running a cpu hotplug stress testing (hotplug secondary cores from user space via sysfs interface for thousands iterations). It turns out that the issue is caused by coherency of the cpu that is being shut down. When shutting down a cpu, we need to have the cpu exit coherency to prevent it from receiving cache, TLB, or BTB maintenance operations broadcast by other CPUs in the cluster. Copy cpu_enter_lowpower() and cpu_leave_lowpower() from mach-vexpress to have coherency properly handled in platform_cpu_die(), thus fix the issue. Signed-off-by: Shawn Guo <shawn.guo@linaro.org> Cc: stable@kernel.org
| * | | ARM: mx51: Add pinctrl_provide_dummies()Fabio Estevam2012-06-071-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit a2aa65a (ARM: imx: enable pinctrl dummy states) missed to add pinctrl_provide_dummies() for mx51. Signed-off-by: Fabio Estevam <fabio.estevam@freescale.com> Signed-off-by: Shawn Guo <shawn.guo@linaro.org>
| * | | ARM: mx31: Add pinctrl_provide_dummies()Fabio Estevam2012-06-071-0/+2
| |/ / | | | | | | | | | | | | | | | | | | | | | commit a2aa65a (ARM: imx: enable pinctrl dummy states) missed to add pinctrl_provide_dummies() for mx31. Signed-off-by: Fabio Estevam <fabio.estevam@freescale.com> Signed-off-by: Shawn Guo <shawn.guo@linaro.org>
* | | Merge tag 'imx-clk-common-fixes' of ↵Olof Johansson2012-06-0711-50/+44
|\ \ \ | |/ / |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.pengutronix.de/git/imx/linux-2.6 into fixes From Sascha Hauer: "Some fixes for the fresh i.MX common clock support" Resolved trivial conflict in arch/arm/plat-mxc/include/mach/common.h. * tag 'imx-clk-common-fixes' of git://git.pengutronix.de/git/imx/linux-2.6: ARM: imx6q: prepare and enable init on clks directly instead of clk_get first ARM i.MX: remove now unnecessary argument from mxc_timer_init ARM: i.MX: change timer clock from ipg to perclk ARM i.MX5: fix gpt peripheral clock path Signed-off-by: Olof Johansson <olof@lixom.net>
| * | ARM: imx6q: prepare and enable init on clks directly instead of clk_get firstRichard Zhao2012-05-161-14/+6
| | | | | | | | | | | | | | | | | | | | | | | | This also removes the usboh3 clk from the initially turned on clocks which leaked in from an internal development tree. Signed-off-by: Richard Zhao <richard.zhao@freescale.com> Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
| * | ARM i.MX: remove now unnecessary argument from mxc_timer_initSascha Hauer2012-05-1611-32/+33
| | | | | | | | | | | | | | | | | | | | | | | | As the timer code now does a clk_get to get its clock we don't need the struct clk argument anymore. This also changes the alternative EPIT timer to do a clk_get. Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
| * | ARM: i.MX: change timer clock from ipg to perclkRichard Zhao2012-05-161-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Contrary to the ipg clock the perclk rate is not changed or gated in low power mode, so we choose perclk for gpt. With the port to the common clock framework as a side effect the timer used the rate returned from the peripheral clock but the hardware was still programmed to use the ipg clock, so this patch only changes the hardware to really use the clock it already assumed. Signed-off-by: Richard Zhao <richard.zhao@freescale.com> Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
| * | ARM i.MX5: fix gpt peripheral clock pathSascha Hauer2012-05-161-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | - The gpt peripheral clk parent is per_root, not ipg - The register for selectin per_lp_apm and per_root is MXC_CCM_CBCMR, not MXC_CCM_CBCDR Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
* | | Linux 3.5-rc1v3.5-rc1Linus Torvalds2012-06-021-2/+2
| | |
* | | Merge tag 'dm-3.5-changes-1' of ↵Linus Torvalds2012-06-026-90/+322
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/agk/linux-dm Pull device-mapper updates from Alasdair G Kergon: "Improve multipath's retrying mechanism in some defined circumstances and provide a simple reserve/release mechanism for userspace tools to access thin provisioning metadata while the pool is in use." * tag 'dm-3.5-changes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/agk/linux-dm: dm thin: provide userspace access to pool metadata dm thin: use slab mempools dm mpath: allow ioctls to trigger pg init dm mpath: delay retry of bypassed pg dm mpath: reduce size of struct multipath
| * | | dm thin: provide userspace access to pool metadataJoe Thornber2012-06-035-11/+193
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch implements two new messages that can be sent to the thin pool target allowing it to take a snapshot of the _metadata_. This, read-only snapshot can be accessed by userland, concurrently with the live target. Only one metadata snapshot can be held at a time. The pool's status line will give the block location for the current msnap. Since version 0.1.5 of the userland thin provisioning tools, the thin_dump program displays the msnap as follows: thin_dump -m <msnap root> <metadata dev> Available here: https://github.com/jthornber/thin-provisioning-tools Now that userland can access the metadata we can do various things that have traditionally been kernel side tasks: i) Incremental backups. By using metadata snapshots we can work out what blocks have changed over time. Combined with data snapshots we can ensure the data doesn't change while we back it up. A short proof of concept script can be found here: https://github.com/jthornber/thinp-test-suite/blob/master/incremental_backup_example.rb ii) Migration of thin devices from one pool to another. iii) Merging snapshots back into an external origin. iv) Asyncronous replication. Signed-off-by: Joe Thornber <ejt@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
| * | | dm thin: use slab mempoolsMike Snitzer2012-06-031-62/+99
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Use dedicated caches prefixed with a "dm_" name rather than relying on kmalloc mempools backed by generic slab caches so the memory usage of thin provisioning (and any leaks) can be accounted for independently. Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
| * | | dm mpath: allow ioctls to trigger pg initMikulas Patocka2012-06-031-9/+17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | After the failure of a group of paths, any alternative paths that need initialising do not become available until further I/O is sent to the device. Until this has happened, ioctls return -EAGAIN. With this patch, new paths are made available in response to an ioctl too. The processing of the ioctl gets delayed until this has happened. Instead of returning an error, we submit a work item to kmultipathd (that will potentially activate the new path) and retry in ten milliseconds. Note that the patch doesn't retry an ioctl if the ioctl itself fails due to a path failure. Such retries should be handled intelligently by the code that generated the ioctl in the first place, noting that some SCSI commands should not be retried because they are not idempotent (XOR write commands). For commands that could be retried, there is a danger that if the device rejected the SCSI command, the path could be errorneously marked as failed, and the request would be retried on another path which might fail too. It can be determined if the failure happens on the device or on the SCSI controller, but there is no guarantee that all SCSI drivers set these flags correctly. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
| * | | dm mpath: delay retry of bypassed pgMike Christie2012-06-031-2/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If I/O needs retrying and only bypassed priority groups are available, set the pg_init_delay_retry flag to wait before retrying. If, for example, the reason for the bypass is that the controller is getting reset or there is a firmware upgrade happening, retrying right away would cause a flood of log messages and retries for what could be a few seconds or even several minutes. Signed-off-by: Mike Christie <michaelc@cs.wisc.edu> Acked-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
| * | | dm mpath: reduce size of struct multipathMike Snitzer2012-06-031-6/+7
| | |/ | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | Move multipath structure's 'lock' and 'queue_size' members to eliminate two 4-byte holes. Also use a bit within a single unsigned int for each existing flag (saves 8-bytes). This allows future flags to be added without each consuming an unsigned int. Signed-off-by: Mike Snitzer <snitzer@redhat.com> Acked-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
* | | Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds2012-06-0221-84/+201
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull networking updates from David Miller: 1) Make syn floods consume significantly less resources by a) Not pre-COW'ing routing metrics for SYN/ACKs b) Mirroring the device queue mapping of the SYN for the SYN/ACK reply. Both from Eric Dumazet. 2) Fix calculation errors in Byte Queue Limiting, from Hiroaki SHIMODA. 3) Validate the length requested when building a paged SKB for a socket, so we don't overrun the page vector accidently. From Jason Wang. 4) When netlabel is disabled, we abort all IP option processing when we see a CIPSO option. This isn't the right thing to do, we should simply skip over it and continue processing the remaining options (if any). Fix from Paul Moore. 5) SRIOV fixes for the mellanox driver from Jack orgenstein and Marcel Apfelbaum. 6) 8139cp enables the receiver before the ring address is properly programmed, which potentially lets the device crap over random memory. Fix from Jason Wang. 7) e1000/e1000e fixes for i217 RST handling, and an improper buffer address reference in jumbo RX frame processing from Bruce Allan and Sebastian Andrzej Siewior, respectively. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: fec_mpc52xx: fix timestamp filtering mcs7830: Implement link state detection e1000e: fix Rapid Start Technology support for i217 e1000: look into the page instead of skb->data for e1000_tbi_adjust_stats() r8169: call netif_napi_del at errpaths and at driver unload tcp: reflect SYN queue_mapping into SYNACK packets tcp: do not create inetpeer on SYNACK message 8139cp/8139too: terminate the eeprom access with the right opmode 8139cp: set ring address before enabling receiver cipso: handle CIPSO options correctly when NetLabel is disabled net: sock: validate data_len before allocating skb in sock_alloc_send_pskb() bql: Avoid possible inconsistent calculation. bql: Avoid unneeded limit decrement. bql: Fix POSDIFF() to integer overflow aware. net/mlx4_core: Fix obscure mlx4_cmd_box parameter in QUERY_DEV_CAP net/mlx4_core: Check port out-of-range before using in mlx4_slave_cap net/mlx4_core: Fixes for VF / Guest startup flow net/mlx4_en: Fix improper use of "port" parameter in mlx4_en_event net/mlx4_core: Fix number of EQs used in ICM initialisation net/mlx4_core: Fix the slave_id out-of-range test in mlx4_eq_int
| * | | fec_mpc52xx: fix timestamp filteringStephan Gatzka2012-06-021-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | skb_defer_rx_timestamp was called with a freshly allocated skb but must be called with rskb instead. Signed-off-by: Stephan Gatzka <stephan@gatzka.org> Cc: stable <stable@vger.kernel.org> Acked-by: Richard Cochran <richardcochran@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | mcs7830: Implement link state detectionOndrej Zary2012-06-021-2/+23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add .status callback that detects link state changes. Tested with MCS7832CV-AA chip (9710:7830, identified as rev.C by the driver). Fixes https://bugzilla.kernel.org/show_bug.cgi?id=28532 Signed-off-by: Ondrej Zary <linux@rainbow-software.org> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | e1000e: fix Rapid Start Technology support for i217Bruce Allan2012-06-021-9/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The definition of I217_PROXY_CTRL must use the BM_PHY_REG() macro instead of the PHY_REG() macro for PHY page 800 register 70 since it is for a PHY register greater than the maximum allowed by the latter macro, and fix a typo setting the I217_MEMPWR register in e1000_suspend_workarounds_ich8lan. Also for clarity, rename a few defines as bit definitions instead of masks. Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
| * | | e1000: look into the page instead of skb->data for e1000_tbi_adjust_stats()Sebastian Andrzej Siewior2012-06-021-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is another fixup where the data is not transfered into buffer addressed by skb->data but into a page. Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
| * | | r8169: call netif_napi_del at errpaths and at driver unloadDevendra Naga2012-06-011-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | when register_netdev fails, the init'ed NAPIs by netif_napi_add must be deleted with netif_napi_del, and also when driver unloads, it should delete the NAPI before unregistering netdevice using unregister_netdev. Signed-off-by: Devendra Naga <devendra.aaru@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | tcp: reflect SYN queue_mapping into SYNACK packetsEric Dumazet2012-06-012-6/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | While testing how linux behaves on SYNFLOOD attack on multiqueue device (ixgbe), I found that SYNACK messages were dropped at Qdisc level because we send them all on a single queue. Obvious choice is to reflect incoming SYN packet @queue_mapping to SYNACK packet. Under stress, my machine could only send 25.000 SYNACK per second (for 200.000 incoming SYN per second). NIC : ixgbe with 16 rx/tx queues. After patch, not a single SYNACK is dropped. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Hans Schillstrom <hans.schillstrom@ericsson.com> Cc: Jesper Dangaard Brouer <brouer@redhat.com> Cc: Neal Cardwell <ncardwell@google.com> Cc: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | tcp: do not create inetpeer on SYNACK messageEric Dumazet2012-06-011-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Another problem on SYNFLOOD/DDOS attack is the inetpeer cache getting larger and larger, using lots of memory and cpu time. tcp_v4_send_synack() ->inet_csk_route_req() ->ip_route_output_flow() ->rt_set_nexthop() ->rt_init_metrics() ->inet_getpeer( create = true) This is a side effect of commit a4daad6b09230 (net: Pre-COW metrics for TCP) added in 2.6.39 Possible solution : Instruct inet_csk_route_req() to remove FLOWI_FLAG_PRECOW_METRICS Before patch : # grep peer /proc/slabinfo inet_peer_cache 4175430 4175430 192 42 2 : tunables 0 0 0 : slabdata 99415 99415 0 Samples: 41K of event 'cycles', Event count (approx.): 30716565122 + 20,24% ksoftirqd/0 [kernel.kallsyms] [k] inet_getpeer + 8,19% ksoftirqd/0 [kernel.kallsyms] [k] peer_avl_rebalance.isra.1 + 4,81% ksoftirqd/0 [kernel.kallsyms] [k] sha_transform + 3,64% ksoftirqd/0 [kernel.kallsyms] [k] fib_table_lookup + 2,36% ksoftirqd/0 [ixgbe] [k] ixgbe_poll + 2,16% ksoftirqd/0 [kernel.kallsyms] [k] __ip_route_output_key + 2,11% ksoftirqd/0 [kernel.kallsyms] [k] kernel_map_pages + 2,11% ksoftirqd/0 [kernel.kallsyms] [k] ip_route_input_common + 2,01% ksoftirqd/0 [kernel.kallsyms] [k] __inet_lookup_established + 1,83% ksoftirqd/0 [kernel.kallsyms] [k] md5_transform + 1,75% ksoftirqd/0 [kernel.kallsyms] [k] check_leaf.isra.9 + 1,49% ksoftirqd/0 [kernel.kallsyms] [k] ipt_do_table + 1,46% ksoftirqd/0 [kernel.kallsyms] [k] hrtimer_interrupt + 1,45% ksoftirqd/0 [kernel.kallsyms] [k] kmem_cache_alloc + 1,29% ksoftirqd/0 [kernel.kallsyms] [k] inet_csk_search_req + 1,29% ksoftirqd/0 [kernel.kallsyms] [k] __netif_receive_skb + 1,16% ksoftirqd/0 [kernel.kallsyms] [k] copy_user_generic_string + 1,15% ksoftirqd/0 [kernel.kallsyms] [k] kmem_cache_free + 1,02% ksoftirqd/0 [kernel.kallsyms] [k] tcp_make_synack + 0,93% ksoftirqd/0 [kernel.kallsyms] [k] _raw_spin_lock_bh + 0,87% ksoftirqd/0 [kernel.kallsyms] [k] __call_rcu + 0,84% ksoftirqd/0 [kernel.kallsyms] [k] rt_garbage_collect + 0,84% ksoftirqd/0 [kernel.kallsyms] [k] fib_rules_lookup Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Hans Schillstrom <hans.schillstrom@ericsson.com> Cc: Jesper Dangaard Brouer <brouer@redhat.com> Cc: Neal Cardwell <ncardwell@google.com> Cc: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | 8139cp/8139too: terminate the eeprom access with the right opmodeJason Wang2012-06-012-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently, we terminate the eeprom access through clearing the CS by: RTL_W8 (Cfg9346, ~EE_CS); or writeb (~EE_CS, ee_addr); This would left the eeprom into "Config. Register Write Enable:" state which is not expcted as the highest two bits were set to 0x11 ( expected is the "Normal" mode (0x00)). Solving this by write 0x0 instead of ~EE_CS when terminating the eeprom access. Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | 8139cp: set ring address before enabling receiverJason Wang2012-06-011-11/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently, we enable the receiver before setting the ring address which could lead the card DMA into unexpected areas. Solving this by set the ring address before enabling the receiver. btw. I find and test this in qemu as I didn't have a 8139cp card in hand. please review it carefully. Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | cipso: handle CIPSO options correctly when NetLabel is disabledPaul Moore2012-06-011-1/+28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When NetLabel is not enabled, e.g. CONFIG_NETLABEL=n, and the system receives a CIPSO tagged packet it is dropped (cipso_v4_validate() returns non-zero). In most cases this is the correct and desired behavior, however, in the case where we are simply forwarding the traffic, e.g. acting as a network bridge, this becomes a problem. This patch fixes the forwarding problem by providing the basic CIPSO validation code directly in ip_options_compile() without the need for the NetLabel or CIPSO code. The new validation code can not perform any of the CIPSO option label/value verification that cipso_v4_validate() does, but it can verify the basic CIPSO option format. The behavior when NetLabel is enabled is unchanged. Signed-off-by: Paul Moore <pmoore@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | net: sock: validate data_len before allocating skb in sock_alloc_send_pskb()Jason Wang2012-05-311-2/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We need to validate the number of pages consumed by data_len, otherwise frags array could be overflowed by userspace. So this patch validate data_len and return -EMSGSIZE when data_len may occupies more frags than MAX_SKB_FRAGS. Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | bql: Avoid possible inconsistent calculation.Hiroaki SHIMODA2012-05-311-5/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | dql->num_queued could change while processing dql_completed(). To provide consistent calculation, added an on stack variable. Signed-off-by: Hiroaki SHIMODA <shimoda.hiroaki@gmail.com> Cc: Tom Herbert <therbert@google.com> Cc: Eric Dumazet <eric.dumazet@gmail.com> Cc: Denys Fedoryshchenko <denys@visp.net.lb> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | bql: Avoid unneeded limit decrement.Hiroaki SHIMODA2012-05-311-2/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When below pattern is observed, TIME dql_queued() dql_completed() | a) initial state | | b) X bytes queued V c) Y bytes queued d) X bytes completed e) Z bytes queued f) Y bytes completed a) dql->limit has already some value and there is no in-flight packet. b) X bytes queued. c) Y bytes queued and excess limit. d) X bytes completed and dql->prev_ovlimit is set and also dql->prev_num_queued is set Y. e) Z bytes queued. f) Y bytes completed. inprogress and prev_inprogress are true. At f), according to the comment, all_prev_completed becomes true and limit should be increased. But POSDIFF() ignores (completed == dql->prev_num_queued) case, so limit is decreased. Signed-off-by: Hiroaki SHIMODA <shimoda.hiroaki@gmail.com> Cc: Tom Herbert <therbert@google.com> Cc: Eric Dumazet <eric.dumazet@gmail.com> Cc: Denys Fedoryshchenko <denys@visp.net.lb> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | bql: Fix POSDIFF() to integer overflow aware.Hiroaki SHIMODA2012-05-311-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | POSDIFF() fails to take into account integer overflow case. Signed-off-by: Hiroaki SHIMODA <shimoda.hiroaki@gmail.com> Cc: Tom Herbert <therbert@google.com> Cc: Eric Dumazet <eric.dumazet@gmail.com> Cc: Denys Fedoryshchenko <denys@visp.net.lb> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | net/mlx4_core: Fix obscure mlx4_cmd_box parameter in QUERY_DEV_CAPJack Morgenstein2012-05-311-3/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The "!mlx4_is_slave" is totally confusing. Fix with constant MLX4_CMD_NATIVE, which is the intended behavior. Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | net/mlx4_core: Check port out-of-range before using in mlx4_slave_capJack Morgenstein2012-05-311-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The range check was performed after using the port number. Reverse this to prevent a potential array overflow. Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | net/mlx4_core: Fixes for VF / Guest startup flowJack Morgenstein2012-05-314-14/+63
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | - pass the following parameters: - firmware version (added QUERY_FW paravirtualization for that) - disable Blueflame on slaves. KVM disables write combining on guests, and we get better performance without BF in this case. (This requires QUERY_DEV_CAP paravirtualization, also in this commit) - max qp rdma as destination - get rid of a chunk of "if (0)" dead code Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | net/mlx4_en: Fix improper use of "port" parameter in mlx4_en_eventJack Morgenstein2012-05-311-5/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Port is used as an array index before we know if that is proper. For example, in the catas event case, port is zero; however, the port index should lie in the range (1..2). Fix this by using 'port' only in the events where it is of interest. Test for port out of range in the default (unhandled event) case, and do not output a message if it is not an ethernet port. Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | net/mlx4_core: Fix number of EQs used in ICM initialisationMarcel Apfelbaum2012-05-313-15/+17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In SRIOV mode, the number of EQs used when computing the total ICM size was incorrect. To fix this, we do the following: 1. We add a new structure to mlx4_dev, mlx4_phys_caps, to contain physical HCA capabilities. The PPF uses the phys capabilities when it computes things like ICM size. The dev_caps structure will then contain the paravirtualized values, making bookkeeping much easier in SRIOV mode. We add a structure rather than a single parameter because there will be other fields in the phys_caps. The first field we add to the mlx4_phys_caps structure is num_phys_eqs. 2. In INIT_HCA, when running in SRIOV mode, the "log_num_eqs" parameter passed to the FW is the number of EQs per VF/PF; each function (PF or VF) has this number of EQs available. However, the total number of EQs which must be allowed for in the ICM is (1 << log_num_eqs) * (#VFs + #PFs). Rather than compute this quantity, we allocate ICM space for 1024 EQs (which is the device maximum number of EQs, and which is the value we place in the mlx4_phys_caps structure). For INIT_HCA, however, we use the per-function number of EQs as described above. Signed-off-by: Marcel Apfelbaum <marcela@dev.mellanox.co.il> Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | net/mlx4_core: Fix the slave_id out-of-range test in mlx4_eq_intJack Morgenstein2012-05-311-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Ths fixes the comparison in the FLR (Function Level Reset) event case. Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | Merge branch 'x86-urgent-for-linus' of ↵Linus Torvalds2012-06-0211-39/+297
|\ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull straggler x86 fixes from Peter Anvin: "Three groups of patches: - EFI boot stub documentation and the ability to print error messages; - Removal for PTRACE_ARCH_PRCTL for x32 (obsolete interface which should never have been ported, and the port is broken and potentially dangerous.) - ftrace stack corruption fixes. I'm not super-happy about the technical implementation, but it is probably the least invasive in the short term. In the future I would like a single method for nesting the debug stack, however." * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86, x32, ptrace: Remove PTRACE_ARCH_PRCTL for x32 x86, efi: Add EFI boot stub documentation x86, efi; Add EFI boot stub console support x86, efi: Only close open files in error path ftrace/x86: Do not change stacks in DEBUG when calling lockdep x86: Allow nesting of the debug stack IDT setting x86: Reset the debug_stack update counter ftrace: Use breakpoint method to update ftrace caller ftrace: Synchronize variable setting with breakpoints
| * \ \ \ Merge remote-tracking branch 'rostedt/tip/perf/urgent-2' into ↵H. Peter Anvin2012-06-016-16/+154
| |\ \ \ \ | | | | | | | | | | | | | | | | | | x86-urgent-for-linus