summaryrefslogtreecommitdiff
Commit message (Collapse)AuthorAgeFilesLines
* s390/mm: add __releases()/__acquires() annotations to gmap_alloc_table()Heiko Carstens2013-09-071-1/+3
| | | | | | Let sparse not incorrectly complain about unbalanced locking. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
* s390: keep Kconfig sortedHeiko Carstens2013-09-071-3/+3
| | | | Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
* s390/irq: rework irq subclass handlingHeiko Carstens2013-09-049-57/+38
| | | | | | | | | | | Let's not add a function for every external interrupt subclass for which we need reference counting. Just have two register/unregister functions which have a subclass parameter: void irq_subclass_register(enum irq_subclass subclass); void irq_subclass_unregister(enum irq_subclass subclass); Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
* s390/irq: use hlists for external interrupt handler arrayHeiko Carstens2013-09-041-12/+12
| | | | | | | Use hlists for the hashed array of external interrupt handlers. Reduces the size of the array by 50% (2KB). Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
* s390/dumpstack: convert print_symbol to %pSRHeiko Carstens2013-09-041-10/+10
| | | | | | | | This is the same as what other architectures did. The change has also the advantage that there won't be any interleaving messages between printk() and print_symbol(). Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
* s390/perf: Remove print_hex_dump_bytes() debug outputHendrik Brueckner2013-09-041-4/+1
| | | | | Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
* s390: update defconfigHeiko Carstens2013-09-041-12/+27
| | | | Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
* s390/bpf,jit: fix address randomizationHeiko Carstens2013-09-041-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add misssing braces to hole calculation. This resulted in an addition instead of an substraction. Which in turn means that the jit compiler could try to write out of bounds of the allocated piece of memory. This bug was introduced with aa2d2c73 "s390/bpf,jit: address randomize and write protect jit code". Fixes this one: [ 37.320956] Unable to handle kernel pointer dereference at virtual kernel address 000003ff80231000 [ 37.320984] Oops: 0011 [#1] PREEMPT SMP DEBUG_PAGEALLOC [ 37.320993] Modules linked in: dm_multipath scsi_dh eadm_sch dm_mod ctcm fsm autofs4 [ 37.321007] CPU: 28 PID: 6443 Comm: multipathd Not tainted 3.10.9-61.x.20130829-s390xdefault #1 [ 37.321011] task: 0000004ada778000 ti: 0000004ae3304000 task.ti: 0000004ae3304000 [ 37.321014] Krnl PSW : 0704c00180000000 000000000012d1de (bpf_jit_compile+0x198e/0x23d0) [ 37.321022] R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:0 PM:0 EA:3 Krnl GPRS: 000000004350207d 0000004a00000001 0000000000000007 000003ff80231002 [ 37.321029] 0000000000000007 000003ff80230ffe 00000000a7740000 000003ff80230f76 [ 37.321032] 000003ffffffffff 000003ff00000000 000003ff0000007d 000000000071e820 [ 37.321035] 0000004adbe99950 000000000071ea18 0000004af3d9e7c0 0000004ae3307b80 [ 37.321046] Krnl Code: 000000000012d1d0: 41305004 la %r3,4(%r5) 000000000012d1d4: e330f0f80021 clg %r3,248(%r15) #000000000012d1da: a7240009 brc 2,12d1ec >000000000012d1de: 50805000 st %r8,0(%r5) 000000000012d1e2: e330f0f00004 lg %r3,240(%r15) 000000000012d1e8: 41303004 la %r3,4(%r3) 000000000012d1ec: e380f0e00004 lg %r8,224(%r15) 000000000012d1f2: e330f0f00024 stg %r3,240(%r15) [ 37.321074] Call Trace: [ 37.321077] ([<000000000012da78>] bpf_jit_compile+0x2228/0x23d0) [ 37.321083] [<00000000006007c2>] sk_attach_filter+0xfe/0x214 [ 37.321090] [<00000000005d2d92>] sock_setsockopt+0x926/0xbdc [ 37.321097] [<00000000005cbfb6>] SyS_setsockopt+0x8a/0xe8 [ 37.321101] [<00000000005ccaa8>] SyS_socketcall+0x264/0x364 [ 37.321106] [<0000000000713f1c>] sysc_nr_ok+0x22/0x28 [ 37.321113] [<000003fffce10ea8>] 0x3fffce10ea8 [ 37.321118] INFO: lockdep is turned off. [ 37.321121] Last Breaking-Event-Address: [ 37.321124] [<000000000012d192>] bpf_jit_compile+0x1942/0x23d0 [ 37.321132] [ 37.321135] Kernel panic - not syncing: Fatal exception: panic_on_oops Cc: stable@vger.kernel.org # v3.11 Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
* Merge tag 'for-3.12-rc1' of git://gitorious.org/linux-pwm/linux-pwmLinus Torvalds2013-09-0326-98/+141
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull pwm changes from Thierry Reding: "A set of patches makes the device tree documentation for the various PWM drivers more consistent. Device tree support is added to the Renesas TPU driver. The sysfs interface now makes use of dev_groups. Other than that there is a healthy assortment of fixes and enhancements for minor issues that have shown up" * tag 'for-3.12-rc1' of git://gitorious.org/linux-pwm/linux-pwm: pwm: pxa: Use module_platform_driver pwm: tiehrpwm: add missing __iomem annotation pwm: tiecap: add CONFIG_PM_SLEEP to ecap_pwm_{save,restore}_context() pwm: simplify use of devm_ioremap_resource pwm: renesas-tpu: Add DT support ARM: dts: Use the PWM polarity flags pwm: Update DT bindings to reference pwm.txt for cells documentation pwm: Use the DT macro directly when parsing PWM DT flags pwm: Add PWM polarity flag macro for DT pwm: mxs: Check the return value from stmp_reset_block() pwm: convert class code to use dev_groups
| * pwm: pxa: Use module_platform_driverMike Dunn2013-09-031-11/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit 76abbdde2d95a3807d0dc6bf9f84d03d0dbd4f3d pwm: Add sysfs interface causes a kernel oops due to a null pointer dereference on PXA platforms. This happens because the class added by the patch is registered in a subsys_initcall (initcall4), but the pxa pwm driver is registered in arch_initcall (initcall3). If the class is not registered before the driver probe function runs, the oops occurs in device_add() when the uninitialized pointers in struct class are dereferenced. I don't see a reason that the driver must be an arch_initcall, so this patch makes it a regular module_platform_driver (initcall6), preventing the oops. Signed-off-by: Mike Dunn <mikedunn@newsguy.com> Acked-by: Robert Jarzmik <robert.jarzmik@free.fr> Acked-by: Marek Vasut <marex@denx.de> Signed-off-by: Thierry Reding <thierry.reding@gmail.com>
| * pwm: tiehrpwm: add missing __iomem annotationJingoo Han2013-09-031-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fix the following sparse warnings: drivers/pwm/pwm-tiehrpwm.c:144:16: warning: incorrect type in argument 1 (different address spaces) drivers/pwm/pwm-tiehrpwm.c:144:16: expected void const volatile [noderef] <asn:2>*addr drivers/pwm/pwm-tiehrpwm.c:144:16: got void * drivers/pwm/pwm-tiehrpwm.c:149:9: warning: incorrect type in argument 2 (different address spaces) drivers/pwm/pwm-tiehrpwm.c:149:9: expected void volatile [noderef] <asn:2>*addr drivers/pwm/pwm-tiehrpwm.c:149:9: got void * drivers/pwm/pwm-tiehrpwm.c:157:18: warning: incorrect type in argument 1 (different address spaces) drivers/pwm/pwm-tiehrpwm.c:157:18: expected void const volatile [noderef] <asn:2>*addr drivers/pwm/pwm-tiehrpwm.c:157:18: got void * drivers/pwm/pwm-tiehrpwm.c:160:9: warning: incorrect type in argument 2 (different address spaces) drivers/pwm/pwm-tiehrpwm.c:160:9: expected void volatile [noderef] <asn:2>*addr drivers/pwm/pwm-tiehrpwm.c:160:9: got void * Signed-off-by: Jingoo Han <jg1.han@samsung.com> Signed-off-by: Thierry Reding <thierry.reding@gmail.com>
| * pwm: tiecap: add CONFIG_PM_SLEEP to ecap_pwm_{save,restore}_context()Jingoo Han2013-09-031-1/+1
| | | | | | | | | | | | | | | | | | | | | | ecap_pwm_save_context() and ecap_pwm_restore_context() are only used when CONFIG_PM_SLEEP is selected. drivers/pwm/pwm-tiecap.c:293:13: warning: 'ecap_pwm_save_context' defined but not used [-Wunused-function] drivers/pwm/pwm-tiecap.c:302:13: warning: 'ecap_pwm_restore_context' defined but not used [-Wunused-function] Signed-off-by: Jingoo Han <jg1.han@samsung.com> Signed-off-by: Thierry Reding <thierry.reding@gmail.com>
| * pwm: simplify use of devm_ioremap_resourceJulia Lawall2013-09-033-14/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Remove unneeded error handling on the result of a call to platform_get_resource when the value is passed to devm_ioremap_resource. Move the call to platform_get_resource adjacent to the call to devm_ioremap_resource to make the connection between them more clear. A simplified version of the semantic patch that makes this change is as follows: (http://coccinelle.lip6.fr/) // <smpl> @@ expression pdev,res,n,e,e1; expression ret != 0; identifier l; @@ - res = platform_get_resource(pdev, IORESOURCE_MEM, n); ... when != res - if (res == NULL) { ... \(goto l;\|return ret;\) } ... when != res + res = platform_get_resource(pdev, IORESOURCE_MEM, n); e = devm_ioremap_resource(e1, res); // </smpl> Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Thierry Reding <thierry.reding@gmail.com>
| * pwm: renesas-tpu: Add DT supportLaurent Pinchart2013-09-032-7/+62
| | | | | | | | | | | | | | | | | | Specify DT bindings for the TPU PWM controller and add OF support to the driver. Signed-off-by: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com> Acked-by: Stephen Warren <swarren@nvidia.com> Signed-off-by: Thierry Reding <thierry.reding@gmail.com>
| * ARM: dts: Use the PWM polarity flagsLaurent Pinchart2013-09-032-2/+4
| | | | | | | | | | | | | | | | | | Replace the numerical polarity flags with the PWM_POLARITY_INVERTED symbolic constant. Signed-off-by: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com> Reviewed-by: Stephen Warren <swarren@nvidia.com> Signed-off-by: Thierry Reding <thierry.reding@gmail.com>
| * pwm: Update DT bindings to reference pwm.txt for cells documentationLaurent Pinchart2013-09-0312-43/+29
| | | | | | | | | | | | | | | | | | | | The PWM client cells format is documented in the generic pwm.txt documentation and duplicated in all PWM driver bindings. Remove duplicate information and reference pwm.txt instead. Signed-off-by: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com> Reviewed-by: Stephen Warren <swarren@nvidia.com> Signed-off-by: Thierry Reding <thierry.reding@gmail.com>
| * pwm: Use the DT macro directly when parsing PWM DT flagsLaurent Pinchart2013-09-031-4/+3
| | | | | | | | | | | | | | | | | | Don't redefine a PWM_SPEC_POLARITY macro with a value identical to PWM_POLARITY_INVERTED, use the PWM DT macro directly. Signed-off-by: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com> Reviewed-by: Stephen Warren <swarren@nvidia.com> Signed-off-by: Thierry Reding <thierry.reding@gmail.com>
| * pwm: Add PWM polarity flag macro for DTLaurent Pinchart2013-09-032-3/+18
| | | | | | | | | | | | | | | | | | Define a PWM_POLARITY_INVERTED macro in include/dt-bindings/pwm/pwm.h to be used by device tree sources. Signed-off-by: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com> Reviewed-by: Stephen Warren <swarren@nvidia.com> Signed-off-by: Thierry Reding <thierry.reding@gmail.com>
| * pwm: mxs: Check the return value from stmp_reset_block()Fabio Estevam2013-09-031-1/+7
| | | | | | | | | | | | | | | | stmp_reset_block() may fail, so let's check its return value and propagate it in the case of error. Signed-off-by: Fabio Estevam <fabio.estevam@freescale.com> Signed-off-by: Thierry Reding <thierry.reding@gmail.com>
| * pwm: convert class code to use dev_groupsGreg Kroah-Hartman2013-07-291-9/+12
| | | | | | | | | | | | | | | | | | The dev_attrs field of struct class is going away soon, dev_groups should be used instead. This converts the PWM class code to use the correct field. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Thierry Reding <thierry.reding@gmail.com>
* | Merge tag 'please-pull-pstore' of ↵Linus Torvalds2013-09-039-140/+306
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux Pull pstore changes from Tony Luck: "A big part of this is the addition of compression to the generic pstore layer so that all backends can use the pitiful amounts of storage they control more effectively. Three other small fixes/cleanups too. * tag 'please-pull-pstore' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux: pstore/ram: (really) fix undefined usage of rounddown_pow_of_two pstore/ram: Read and write to the 'compressed' flag of pstore efi-pstore: Read and write to the 'compressed' flag of pstore erst: Read and write to the 'compressed' flag of pstore powerpc/pseries: Read and write to the 'compressed' flag of pstore pstore: Add file extension to pstore file if compressed pstore: Add decompression support to pstore pstore: Introduce new argument 'compressed' in the read callback pstore: Add compression support to pstore pstore/Kconfig: Select ZLIB_DEFLATE and ZLIB_INFLATE when PSTORE is selected pstore: Add new argument 'compressed' in pstore write callback powerpc/pseries: Remove (de)compression in nvram with pstore enabled pstore: d_alloc_name() doesn't return an ERR_PTR acpi/apei/erst: Add missing iounmap() on error in erst_exec_move_data()
| * | pstore/ram: (really) fix undefined usage of rounddown_pow_of_twoMaxime Bizon2013-08-301-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Previous attempt to fix was b042e47491ba5f487601b5141a3f1d8582304170 Suggested use of is_power_of_2() was bogus because is_power_of_2(0) is false (documented behaviour). Signed-off-by: Maxime Bizon <mbizon@freebox.fr> Acked-by: Kees Cook <keescook@chromium.org> Signed-off-by: Tony Luck <tony.luck@intel.com>
| * | pstore/ram: Read and write to the 'compressed' flag of pstoreAruna Balakrishnaiah2013-08-191-8/+28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In pstore write, add character 'C'(compressed) or 'D'(decompressed) in the header while writing to Ram persistent buffer. In pstore read, read the header and update the 'compressed' flag accordingly. Signed-off-by: Aruna Balakrishnaiah <aruna@linux.vnet.ibm.com> Reviewed-by: Kees Cook <keescook@chromium.org> Signed-off-by: Tony Luck <tony.luck@intel.com>
| * | efi-pstore: Read and write to the 'compressed' flag of pstoreAruna Balakrishnaiah2013-08-191-4/+18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In pstore write, Efi will add a character 'C'(compressed) or D'(decompressed) in its header while writing to persistent store. In pstore read, read the header and update the 'compressed' flag accordingly. Signed-off-by: Aruna Balakrishnaiah <aruna@linux.vnet.ibm.com> Reviewed-by: Kees Cook <keescook@chromium.org> Signed-off-by: Tony Luck <tony.luck@intel.com>
| * | erst: Read and write to the 'compressed' flag of pstoreAruna Balakrishnaiah2013-08-191-1/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In pstore write, set the section type to CPER_SECTION_TYPE_DMESG_COMPR if the data is compressed. In pstore read, read the section type and update the 'compressed' flag accordingly. Signed-off-by: Aruna Balakrishnaiah <aruna@linux.vnet.ibm.com> Reviewed-by: Kees Cook <keescook@chromium.org> Signed-off-by: Tony Luck <tony.luck@intel.com>
| * | powerpc/pseries: Read and write to the 'compressed' flag of pstoreAruna Balakrishnaiah2013-08-191-0/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If data returned from pstore is compressed, nvram's write callback will add a flag ERR_TYPE_KERNEL_PANIC_GZ indicating the data is compressed while writing to nvram. If the data read from nvram is compressed, nvram's read callback will set the flag 'compressed'. The patch adds backward compatibilty with old format oops header when reading from pstore. Signed-off-by: Aruna Balakrishnaiah <aruna@linux.vnet.ibm.com> Reviewed-by: Kees Cook <keescook@chromium.org> Signed-off-by: Tony Luck <tony.luck@intel.com>
| * | pstore: Add file extension to pstore file if compressedAruna Balakrishnaiah2013-08-193-6/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In case decompression fails, add a ".enc.z" to indicate the file has compressed data. This will help user space utilities to figure out the file contents. Signed-off-by: Aruna Balakrishnaiah <aruna@linux.vnet.ibm.com> Reviewed-by: Kees Cook <keescook@chromium.org> Signed-off-by: Tony Luck <tony.luck@intel.com>
| * | pstore: Add decompression support to pstoreAruna Balakrishnaiah2013-08-191-2/+51
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Based on the flag 'compressed' set or not, pstore will decompress the data returning a plain text file. If decompression fails for a particular record it will have the compressed data in the file which can be decompressed with 'openssl' command line tool. Signed-off-by: Aruna Balakrishnaiah <aruna@linux.vnet.ibm.com> Reviewed-by: Kees Cook <keescook@chromium.org> Signed-off-by: Tony Luck <tony.luck@intel.com>
| * | pstore: Introduce new argument 'compressed' in the read callbackAruna Balakrishnaiah2013-08-196-7/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Backends will set the flag 'compressed' after reading the log from persistent store to indicate the data being returned to pstore is compressed or not. Signed-off-by: Aruna Balakrishnaiah <aruna@linux.vnet.ibm.com> Reviewed-by: Kees Cook <keescook@chromium.org> Signed-off-by: Tony Luck <tony.luck@intel.com>
| * | pstore: Add compression support to pstoreAruna Balakrishnaiah2013-08-191-9/+139
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add compression support to pstore which will help in capturing more data. Initially, pstore will make a call to kmsg_dump with a bigger buffer and will pass the size of bigger buffer to kmsg_dump and then compress the data to registered buffer of registered size. In case compression fails, pstore will capture the uncompressed data by making a call again to kmsg_dump with registered_buffer of registered size. Pstore will indicate the data is compressed or not with a flag in the write callback. Signed-off-by: Aruna Balakrishnaiah <aruna@linux.vnet.ibm.com> Reviewed-by: Kees Cook <keescook@chromium.org> Signed-off-by: Tony Luck <tony.luck@intel.com>
| * | pstore/Kconfig: Select ZLIB_DEFLATE and ZLIB_INFLATE when PSTORE is selectedAruna Balakrishnaiah2013-08-191-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | Pstore will make use of deflate and inflate algorithm to compress and decompress the data. So when Pstore is enabled select zlib_deflate and zlib_inflate. Signed-off-by: Aruna Balakrishnaiah <aruna@linux.vnet.ibm.com> Reviewed-by: Kees Cook <keescook@chromium.org> Signed-off-by: Tony Luck <tony.luck@intel.com>
| * | pstore: Add new argument 'compressed' in pstore write callbackAruna Balakrishnaiah2013-08-196-11/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Addition of new argument 'compressed' in the write call back will help the backend to know if the data passed from pstore is compressed or not (In case where compression fails.). If compressed, the backend can add a tag indicating the data is compressed while writing to persistent store. Signed-off-by: Aruna Balakrishnaiah <aruna@linux.vnet.ibm.com> Reviewed-by: Kees Cook <keescook@chromium.org> Signed-off-by: Tony Luck <tony.luck@intel.com>
| * | powerpc/pseries: Remove (de)compression in nvram with pstore enabledAruna Balakrishnaiah2013-08-191-90/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | (De)compression support is provided in pstore in subsequent patches which needs an additional argument 'compressed' to determine if the data is compressed or not. This patch will take care of removing (de)compression in nvram with pstore which was making use of 'hsize' argument in pstore write as 'hsize' will be removed in the subsequent patch. Signed-off-by: Aruna Balakrishnaiah <aruna@linux.vnet.ibm.com> Reviewed-by: Kees Cook <keescook@chromium.org> Signed-off-by: Tony Luck <tony.luck@intel.com>
| * | pstore: d_alloc_name() doesn't return an ERR_PTRDan Carpenter2013-08-191-2/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | d_alloc_name() returns NULL on error. Also I changed the error code from -ENOSPC to -ENOMEM to reflect that we were short on RAM not disk space. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Acked-by: Kees Cook <keescook@chromium.org> Signed-off-by: Tony Luck <tony.luck@intel.com>
| * | acpi/apei/erst: Add missing iounmap() on error in erst_exec_move_data()Wei Yongjun2013-08-191-1/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | Add the missing iounmap() before return from erst_exec_move_data() in the error handling case. Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn> Acked-by: Kees Cook <keescook@chromium.org> Signed-off-by: Tony Luck <tony.luck@intel.com>
* | | Merge branch 'for-3.12' of ↵Linus Torvalds2013-09-0324-1611/+1751
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup Pull cgroup updates from Tejun Heo: "A lot of activities on the cgroup front. Most changes aren't visible to userland at all at this point and are laying foundation for the planned unified hierarchy. - The biggest change is decoupling the lifetime management of css (cgroup_subsys_state) from that of cgroup's. Because controllers (cpu, memory, block and so on) will need to be dynamically enabled and disabled, css which is the association point between a cgroup and a controller may come and go dynamically across the lifetime of a cgroup. Till now, css's were created when the associated cgroup was created and stayed till the cgroup got destroyed. Assumptions around this tight coupling permeated through cgroup core and controllers. These assumptions are gradually removed, which consists bulk of patches, and css destruction path is completely decoupled from cgroup destruction path. Note that decoupling of creation path is relatively easy on top of these changes and the patchset is pending for the next window. - cgroup has its own event mechanism cgroup.event_control, which is only used by memcg. It is overly complex trying to achieve high flexibility whose benefits seem dubious at best. Going forward, new events will simply generate file modified event and the existing mechanism is being made specific to memcg. This pull request contains prepatory patches for such change. - Various fixes and cleanups" Fixed up conflict in kernel/cgroup.c as per Tejun. * 'for-3.12' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup: (69 commits) cgroup: fix cgroup_css() invocation in css_from_id() cgroup: make cgroup_write_event_control() use css_from_dir() instead of __d_cgrp() cgroup: make cgroup_event hold onto cgroup_subsys_state instead of cgroup cgroup: implement CFTYPE_NO_PREFIX cgroup: make cgroup_css() take cgroup_subsys * instead and allow NULL subsys cgroup: rename cgroup_css_from_dir() to css_from_dir() and update its syntax cgroup: fix cgroup_write_event_control() cgroup: fix subsystem file accesses on the root cgroup cgroup: change cgroup_from_id() to css_from_id() cgroup: use css_get() in cgroup_create() to check CSS_ROOT cpuset: remove an unncessary forward declaration cgroup: RCU protect each cgroup_subsys_state release cgroup: move subsys file removal to kill_css() cgroup: factor out kill_css() cgroup: decouple cgroup_subsys_state destruction from cgroup destruction cgroup: replace cgroup->css_kill_cnt with ->nr_css cgroup: bounce cgroup_subsys_state ref kill confirmation to a work item cgroup: move cgroup->subsys[] assignment to online_css() cgroup: reorganize css init / exit paths cgroup: add __rcu modifier to cgroup->subsys[] ...
| * | | cgroup: fix cgroup_css() invocation in css_from_id()Tejun Heo2013-08-271-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | ca8bdcaff0 ("cgroup: make cgroup_css() take cgroup_subsys * instead and allow NULL subsys") missed one conversion in css_from_id(), which was newly added. As css_from_id() doesn't have any user yet, this doesn't break anything other than generating a build warning. Convert it. Signed-off-by: Tejun Heo <tj@kernel.org> Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Reported-by: kbuild test robot <fengguang.wu@intel.com>
| * | | cgroup: make cgroup_write_event_control() use css_from_dir() instead of ↵Tejun Heo2013-08-261-13/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | __d_cgrp() cgroup_event will be moved to its only user - memcg. Replace __d_cgrp() usage with css_from_dir(), which is already exported. This also simplifies the code a bit. Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: Li Zefan <lizefan@huawei.com> Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
| * | | cgroup: make cgroup_event hold onto cgroup_subsys_state instead of cgroupTejun Heo2013-08-261-14/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently, each registered cgroup_event holds an extra reference to the cgroup. This is a bit weird as events are subsystem specific and will also be incorrect in the planned unified hierarchy as css (cgroup_subsys_state) may come and go dynamically across the lifetime of a cgroup. Holding onto cgroup won't prevent the target css from going away. Update cgroup_event to hold onto the css the traget file belongs to instead of cgroup. Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: Li Zefan <lizefan@huawei.com> Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
| * | | cgroup: implement CFTYPE_NO_PREFIXTejun Heo2013-08-262-1/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When cgroup files are created, cgroup core automatically prepends the name of the subsystem as prefix. This patch adds CFTYPE_NO_ which disables the automatic prefix. This is to work around historical baggages and shouldn't be used for new files. This will be used to move "cgroup.event_control" from cgroup core to memcg. Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: Li Zefan <lizefan@huawei.com> Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Glauber Costa <glommer@gmail.com>
| * | | cgroup: make cgroup_css() take cgroup_subsys * instead and allow NULL subsysTejun Heo2013-08-261-47/+43
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | cgroup_css() is no longer used in hot paths. Make it take struct cgroup_subsys * and allow the users to specify NULL subsys to obtain the dummy_css. This removes open-coded NULL subsystem testing in a couple users and generally simplifies the code. After this patch, css_from_dir() also allows NULL @ss and returns the matching dummy_css. This behavior change doesn't affect its only user - perf. Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: Li Zefan <lizefan@huawei.com> Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
| * | | cgroup: rename cgroup_css_from_dir() to css_from_dir() and update its syntaxTejun Heo2013-08-263-18/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | cgroup_css_from_dir() will grow another user. In preparation, make the following changes. * All css functions are prefixed with just "css_", rename it to css_from_dir(). * Take dentry * instead of file * as dentry is what ultimately identifies a cgroup and file may not always be available. Note that the function now checkes whether @dentry->d_inode is NULL as the caller now may specify a negative dentry. * Make it take cgroup_subsys * instead of integer subsys_id. This simplifies the function and allows specifying no subsystem for cgroup->dummy_css. * Make return section a bit less verbose. This patch doesn't introduce any behavior changes. Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: Li Zefan <lizefan@huawei.com> Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@redhat.com>
| * | | cgroup: fix cgroup_write_event_control()Tejun Heo2013-08-191-4/+21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 81eeaf0411 ("cgroup: make cftype->[un]register_event() deal with cgroup_subsys_state inst ead of cgroup") updated the cftype event methods to take @css (cgroup_subsys_state) instead of @cgroup; however, it incorrectly used @css passed to cgroup_write_event_control(), which the dummy_css for the cgroup as the file is a cgroup core file. This leads to oops on event registration. Fix it by using the css matching the event target file. Note that cgroup_write_event_control() now disallows cgroup core files from being event sources. This is for simplicity and doesn't matter as cgroup_event will be moved and made specific to memcg. Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: Li Zefan <lizefan@huawei.com>
| * | | cgroup: fix subsystem file accesses on the root cgroupTejun Heo2013-08-191-14/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 105347ba5 ("cgroup: make cgroup_file_open() rcu_read_lock() around cgroup_css() and add cfent->css") added cfent->css to cache the associted cgroup_subsys_state across file operations. A cfent is associated with single css throughout its lifetime and the origimal commit initialized the cache pointer during cgroup_add_file() and verified that it matches the actual one in cgroup_file_open(). While this works fine for !root cgroups, it's broken for root cgroups as files in a root cgroup are created before the css's are associated with the cgroup and thus cgroup_css() call in cgroup_add_file() returns NULL associating all cfents in the root cgroup with NULL css. This makes cgroup_file_open() trigger WARN and fail with -ENODEV for all !core subsystem files in the root cgroups. There's no reason to initialize cfent->css separately from cgroup_add_file(). As the association never changes, cgroup_file_open() can set it unconditionally every time and containing the logic in cgroup_file_open() makes more sense anyway as the only reason it's necessary is file->private_data being already occupied. Fix it by setting cfent->css unconditionally from cgroup_file_open(). Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: Li Zefan <lizefan@huawei.com>
| * | | cgroup: change cgroup_from_id() to css_from_id()Li Zefan2013-08-192-18/+24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Now we want cgroup core to always provide the css to use to the subsystems, so change this API to css_from_id(). Uninline css_from_id(), because it's getting bigger and cgroup_css() has been unexported. While at it, remove the #ifdef, and shuffle the order of the args. Signed-off-by: Li Zefan <lizefan@huawei.com> Signed-off-by: Tejun Heo <tj@kernel.org>
| * | | cgroup: use css_get() in cgroup_create() to check CSS_ROOTLi Zhong2013-08-161-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | It seems that the root css doesn't have refcnt allocated(not needed?), and would cause the booting error attached. This patch tries to use css_get() to not increase the refcnt if parent is root. BUG: unable to handle kernel NULL pointer dereference at (null) IP: [<ffffffff810b37cc>] cgroup_mkdir+0x37c/0x740 PGD 0 Oops: 0002 [#1] Modules linked in: CPU: 0 PID: 1 Comm: systemd Not tainted 3.11.0-rc5-next-20130815+ #1 Hardware name: Bochs Bochs, BIOS Bochs 01/01/2007 task: ffff88007f868000 ti: ffff88007f864000 task.ti: ffff88007f864000 RIP: 0010:[<ffffffff810b37cc>] [<ffffffff810b37cc>] cgroup_mkdir+0x37c/0x740 RSP: 0018:ffff88007f865df8 EFLAGS: 00010246 RAX: 0000000000000000 RBX: ffffffff81a46ee0 RCX: 0000000000000001 RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffffffff81a415c0 RBP: ffff88007f865ec8 R08: 0000000000000001 R09: 0000000000000000 R10: ffff88007ce6d060 R11: 0000000000000000 R12: ffff88007ce6d000 R13: ffff88007ce6d060 R14: ffffffff81a46d80 R15: ffff88007c6e8018 FS: 00007f13dbf6f840(0000) GS:ffffffff81a23000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000000 CR3: 000000007b7e5000 CR4: 00000000000006b0 Stack: ffffffff810b380d 0000000000000002 ffff88007f865e18 ffffffff81167069 ffff88007f865ed8 ffffffff8116a3f5 ffff880037454400 ffff88007c6e8018 ffff88007c6e8028 ffff88007c6e8328 ffff88007c6e8000 ffff88007ce6d000 Call Trace: [<ffffffff810b380d>] ? cgroup_mkdir+0x3bd/0x740 [<ffffffff81167069>] ? lookup_hash+0x19/0x20 [<ffffffff8116a3f5>] ? kern_path_create+0x95/0x170 [<ffffffff8116ce3e>] vfs_mkdir+0x9e/0xf0 [<ffffffff8116d7a0>] SyS_mkdirat+0x60/0xe0 [<ffffffff8116d839>] SyS_mkdir+0x19/0x20 [<ffffffff814c960d>] tracesys+0xcf/0xd4 Code: ad 70 ff ff ff 48 89 9d 60 ff ff ff 4d 89 d5 4c 8b bd 68 ff ff ff 4c 8b 65 88 eb 50 0f 1f 00 48 8b 43 18 a8 03 0f 85 6c 03 00 00 <ff> 00 e8 1d 0a fb ff 85 c0 74 0d 80 3d f0 45 a1 00 00 0f 84 4c RIP [<ffffffff810b37cc>] cgroup_mkdir+0x37c/0x740 RSP <ffff88007f865df8> CR2: 0000000000000000 ---[ end trace a4b14b49bc46fd60 ]--- Signed-off-by: Li Zhong <zhong@linux.vnet.ibm.com> Acked-by: Li Zefan <lizefan@huawei.com> Signed-off-by: Tejun Heo <tj@kernel.org>
| * | | cpuset: remove an unncessary forward declarationLi Zefan2013-08-131-3/+0
| | | | | | | | | | | | | | | | | | | | Signed-off-by: Li Zefan <lizefan@huawei.com> Signed-off-by: Tejun Heo <tj@kernel.org>
| * | | cgroup: RCU protect each cgroup_subsys_state releaseTejun Heo2013-08-132-17/+39
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | With the planned unified hierarchy, individual css's will be created and destroyed dynamically across the lifetime of a cgroup. To enable such usages, css destruction is being decoupled from cgroup destruction. Most of the destruction path has been decoupled but the actual free of css still depends on cgroup free path. When all css refs are drained, css_release() kicks off css_free_work_fn() which puts the cgroup. When the cgroup refcnt reaches zero, cgroup_diput() is invoked which in turn schedules RCU free of the cgroup. After a grace period, all css's are freed along with the cgroup itself. This patch moves the RCU grace period and css freeing from cgroup release path to css release path. css_release(), instead of kicking off css_free_work_fn() directly, schedules RCU callback css_free_rcu_fn() which in turn kicks off css_free_work_fn() after a RCU grace period. css_free_work_fn() is updated to free the css directly. The five-way punting - percpu ref kill confirmation, a work item, percpu ref release, RCU grace period, and again a work item - is quite hairy but the work items are there only to provide process context and the actual sequence is kill confirm -> release -> RCU free, which isn't simple but not too crazy. This removes cgroup_css() usage after offline_css() allowing clearing cgroup->subsys[] from offline_css(), which makes it consistent with online_css() and brings it closer to proper lifetime management for individual css's. Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: Li Zefan <lizefan@huawei.com>
| * | | cgroup: move subsys file removal to kill_css()Tejun Heo2013-08-131-7/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | With the planned unified hierarchy, individual css's will be created and destroyed dynamically across the lifetime of a cgroup. To enable such usages, css destruction is being decoupled from cgroup destruction. This patch moves subsys file removal from cgroup_destroy_locked() to kill_css(). While this changes the order of destruction operations, the changes shouldn't be noticeable to cgroup subsystems or userland. Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: Li Zefan <lizefan@huawei.com>
| * | | cgroup: factor out kill_css()Tejun Heo2013-08-131-23/+35
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Factor out css ref killing from cgroup_destroy_locked() into kill_css(). We're gonna add more to the path and the factored out function will eventually be called from other places too. While at it, replace open coded percpu_ref_get() with css_get() for consistency. This shouldn't cause any functional difference as the function is not used for root cgroups. Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: Li Zefan <lizefan@huawei.com>