From e6639117d624d5c8f531d22456a69e38dc23c501 Mon Sep 17 00:00:00 2001 From: Peter De Schrijver Date: Thu, 12 Jun 2014 18:58:27 +0300 Subject: kernel: add calibration_delay_done() Add calibration_delay_done() call and dummy implementation. This allows architectures to stop accepting registrations for new timer based delay functions. Signed-off-by: Peter De Schrijver Acked-by: Russell King Signed-off-by: Stephen Warren --- init/calibrate.c | 11 +++++++++++ 1 file changed, 11 insertions(+) (limited to 'init') diff --git a/init/calibrate.c b/init/calibrate.c index 520702db9acc..ce635dccf3d9 100644 --- a/init/calibrate.c +++ b/init/calibrate.c @@ -262,6 +262,15 @@ unsigned long __attribute__((weak)) calibrate_delay_is_known(void) return 0; } +/* + * Indicate the cpu delay calibration is done. This can be used by + * architectures to stop accepting delay timer registrations after this point. + */ + +void __attribute__((weak)) calibration_delay_done(void) +{ +} + void calibrate_delay(void) { unsigned long lpj; @@ -301,4 +310,6 @@ void calibrate_delay(void) loops_per_jiffy = lpj; printed = true; + + calibration_delay_done(); } -- cgit v1.2.1 From 23b2899f7f194f06e09b52a1f46f027a21fae17c Mon Sep 17 00:00:00 2001 From: "Luis R. Rodriguez" Date: Wed, 6 Aug 2014 16:08:56 -0700 Subject: printk: allow increasing the ring buffer depending on the number of CPUs The default size of the ring buffer is too small for machines with a large amount of CPUs under heavy load. What ends up happening when debugging is the ring buffer overlaps and chews up old messages making debugging impossible unless the size is passed as a kernel parameter. An idle system upon boot up will on average spew out only about one or two extra lines but where this really matters is on heavy load and that will vary widely depending on the system and environment. 
There are mechanisms to help increase the kernel ring buffer for tracing through debugfs, and those interfaces even allow growing the kernel ring buffer per CPU. We also have a static value which can be passed upon boot. Relying on debugfs however is not ideal for production, and the value passed upon bootup can only be used *after* an issue has crept up. Instead of being reactive this adds a proactive measure which lets you scale the amount of contributions you'd expect to the kernel ring buffer under load by each CPU in the worst case scenario.

We use num_possible_cpus() to avoid complexities which could be introduced by dynamically changing the ring buffer size at run time. num_possible_cpus() lets us use the upper limit on the possible number of CPUs, therefore avoiding having to deal with hotplugging CPUs on and off.

This introduces the kernel configuration option LOG_CPU_MAX_BUF_SHIFT which is used to specify the maximum amount of contributions to the kernel ring buffer in the worst case before the kernel ring buffer flips over. The size is specified as a power of 2. The total amount of contributions made by each CPU must be greater than half of the default kernel ring buffer size (1 << LOG_BUF_SHIFT bytes) in order to trigger an increase upon bootup. The kernel ring buffer is increased to the next power of two that would fit the required minimum kernel ring buffer size plus the additional CPU contribution.

For example if LOG_BUF_SHIFT is 18 (256 KB) you'd require more than 128 KB of contributions by other CPUs in order to trigger an increase of the kernel ring buffer. With a LOG_CPU_MAX_BUF_SHIFT of 12 (4 KB) you'd require more than 64 possible CPUs to trigger an increase. If you had 128 possible CPUs the minimum required kernel ring buffer size bumps to:

   ((1 << 18) + ((128 - 1) * (1 << 12))) / 1024 = 764 KB

Since we require the ring buffer to be a power of two the new required size would be 1024 KB.
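The sizing rule described above can be sketched in plain C. This is an illustrative userspace sketch, not the printk implementation: the names round_up_pow2() and log_buf_size() are hypothetical, and the only things taken from the commit message are the (possible CPUs - 1) * (1 << shift) contribution, the "greater than half of the default size" trigger, and the rounding up to a power of two. It assumes at least one possible CPU.

```c
#include <assert.h>

/* Hypothetical helper: round v up to the next power of two (v > 0). */
unsigned long round_up_pow2(unsigned long v)
{
	unsigned long p = 1;

	while (p < v)
		p <<= 1;
	return p;
}

/*
 * Hypothetical helper: ring buffer size in bytes for a given default
 * shift, per-CPU contribution shift and number of possible CPUs
 * (possible_cpus must be at least 1).
 */
unsigned long log_buf_size(unsigned int log_buf_shift,
			   unsigned int cpu_shift,
			   unsigned int possible_cpus)
{
	unsigned long base = 1UL << log_buf_shift;
	/* The boot CPU is covered by the default buffer, hence "- 1". */
	unsigned long extra = (unsigned long)(possible_cpus - 1) << cpu_shift;

	/* Contributions must exceed half the default size to matter. */
	if (extra <= base / 2)
		return base;

	return round_up_pow2(base + extra);
}
```

With a default shift of 18, a 4 KB per-CPU contribution and 128 possible CPUs, log_buf_size(18, 12, 128) gives 1048576 bytes (1024 KB), matching the figure worked out above; with a single possible CPU the default 256 KB is kept.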
This CPU contributions are ignored when the "log_buf_len" kernel parameter is used as it forces the exact size of the ring buffer to an expected power of two value. [pmladek@suse.cz: fix build] Signed-off-by: Luis R. Rodriguez Signed-off-by: Petr Mladek Tested-by: Davidlohr Bueso Tested-by: Petr Mladek Reviewed-by: Davidlohr Bueso Cc: Andrew Lunn Cc: Stephen Warren Cc: Michal Hocko Cc: Petr Mladek Cc: Joe Perches Cc: Arun KS Cc: Kees Cook Cc: Davidlohr Bueso Cc: Chris Metcalf Cc: Jan Kara Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds --- init/Kconfig | 46 ++++++++++++++++++++++++++++++++++++++++++---- 1 file changed, 42 insertions(+), 4 deletions(-) (limited to 'init') diff --git a/init/Kconfig b/init/Kconfig index 41066e49e880..a291b7ef4738 100644 --- a/init/Kconfig +++ b/init/Kconfig @@ -807,15 +807,53 @@ config LOG_BUF_SHIFT range 12 21 default 17 help - Select kernel log buffer size as a power of 2. + Select the minimal kernel log buffer size as a power of 2. + The final size is affected by LOG_CPU_MAX_BUF_SHIFT config + parameter, see below. Any higher size also might be forced + by "log_buf_len" boot parameter. + Examples: - 17 => 128 KB + 17 => 128 KB 16 => 64 KB - 15 => 32 KB - 14 => 16 KB + 15 => 32 KB + 14 => 16 KB 13 => 8 KB 12 => 4 KB +config LOG_CPU_MAX_BUF_SHIFT + int "CPU kernel log buffer size contribution (13 => 8 KB, 17 => 128KB)" + range 0 21 + default 12 if !BASE_SMALL + default 0 if BASE_SMALL + help + This option allows to increase the default ring buffer size + according to the number of CPUs. The value defines the contribution + of each CPU as a power of 2. The used space is typically only few + lines however it might be much more when problems are reported, + e.g. backtraces. + + The increased size means that a new buffer has to be allocated and + the original static one is unused. It makes sense only on systems + with more CPUs. 
Therefore this value is used only when the sum of + contributions is greater than the half of the default kernel ring + buffer as defined by LOG_BUF_SHIFT. The default values are set + so that more than 64 CPUs are needed to trigger the allocation. + + Also this option is ignored when "log_buf_len" kernel parameter is + used as it forces an exact (power of two) size of the ring buffer. + + The number of possible CPUs is used for this computation ignoring + hotplugging making the compuation optimal for the the worst case + scenerio while allowing a simple algorithm to be used from bootup. + + Examples shift values and their meaning: + 17 => 128 KB for each CPU + 16 => 64 KB for each CPU + 15 => 32 KB for each CPU + 14 => 16 KB for each CPU + 13 => 8 KB for each CPU + 12 => 4 KB for each CPU + # # Architectures with an unreliable sched_clock() should select this: # -- cgit v1.2.1 From 4dfe694f616e00e6fd83e5bbcd7a3c4d7113493d Mon Sep 17 00:00:00 2001 From: Paul Gortmaker Date: Fri, 8 Aug 2014 14:19:46 -0700 Subject: init: make rootdelay=N consistent with rootwait behaviour Currently rootdelay=N and rootwait behave differently (aside from the obvious unbounded wait duration) because they are at different places in the init sequence. The difference manifests itself for md devices because the call to md_run_setup() lives between rootdelay and rootwait, so if you try to use rootdelay=20 to try and allow a slow RAID0 array to assemble, you get this: [ 4.526011] sd 6:0:0:0: [sdc] Attached SCSI removable disk [ 22.972079] md: Waiting for all devices to be available before autodetect i.e. you've achieved nothing other than delaying the probing 20s, when what you wanted was a 20s delay _after_ the probing for md devices was initiated. Here we move the rootdelay code to be right beside the rootwait code, so that their behaviour is consistent. 
It should be noted that in doing so, the actions based on the saved_root_name[0] and initrd_load() were previously put on hold by rootdelay=N and now currently will not be delayed. However, I think consistent behaviour is more important than matching historical behaviour of delaying the above two operations. Signed-off-by: Paul Gortmaker Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds --- init/do_mounts.c | 12 ++++++------ 1 file changed, 6 insertions(+), 6 deletions(-) (limited to 'init') diff --git a/init/do_mounts.c b/init/do_mounts.c index 82f22885c87e..b6237c31b0e2 100644 --- a/init/do_mounts.c +++ b/init/do_mounts.c @@ -539,12 +539,6 @@ void __init prepare_namespace(void) { int is_floppy; - if (root_delay) { - printk(KERN_INFO "Waiting %d sec before mounting root device...\n", - root_delay); - ssleep(root_delay); - } - /* * wait for the known devices to complete their probing * @@ -571,6 +565,12 @@ void __init prepare_namespace(void) if (initrd_load()) goto out; + if (root_delay) { + pr_info("Waiting %d sec before mounting root device...\n", + root_delay); + ssleep(root_delay); + } + /* wait for any asynchronous scanning to complete */ if ((ROOT_DEV == 0) && root_wait) { printk(KERN_INFO "Waiting for root device %s...\n", -- cgit v1.2.1 From 38747439914c468ecba70b492b54dc4ef0b50453 Mon Sep 17 00:00:00 2001 From: Yinghai Lu Date: Fri, 8 Aug 2014 14:23:12 -0700 Subject: initramfs: support initrd that is bigger than 2GiB When initrd (compressed or not) is used, kernel report data corrupted with /dev/ram0. The root cause: During initramfs checking, if it is initrd, it will be transferred to /initrd.image with sys_write. sys_write only support 2G-4K write, so if the initrd ram is more than that, /initrd.image will not complete at all. Add local xwrite to loop calling sys_write to workaround the problem. Also need to use xwrite in write_buffer() to handle: image is uncompressed cpio and there is one big file (>2G) in it. 
unpack_to_rootfs ===> write_buffer ===> actions[]/do_copy At the same time, we don't need to worry about sys_read/sys_write in do_mounts_rd.c::crd_load. As decompressor will have fill/flush and local buffer that is smaller than 2G. Test with uncompressed initrd, and compressed ones with gz, bz2, lzma,xz, lzop. Signed-off-by: Yinghai Lu Acked-by: H. Peter Anvin Cc: Ingo Molnar Cc: Geert Uytterhoeven Cc: Tetsuo Handa Cc: "Daniel M. Weeks" Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds --- init/initramfs.c | 36 ++++++++++++++++++++++++++++++++---- 1 file changed, 32 insertions(+), 4 deletions(-) (limited to 'init') diff --git a/init/initramfs.c b/init/initramfs.c index a8497fab1c3d..4f276b6a167b 100644 --- a/init/initramfs.c +++ b/init/initramfs.c @@ -19,6 +19,29 @@ #include #include +static ssize_t __init xwrite(int fd, const char *p, size_t count) +{ + ssize_t out = 0; + + /* sys_write only can write MAX_RW_COUNT aka 2G-4K bytes at most */ + while (count) { + ssize_t rv = sys_write(fd, p, count); + + if (rv < 0) { + if (rv == -EINTR || rv == -EAGAIN) + continue; + return out ? 
out : rv; + } else if (rv == 0) + break; + + p += rv; + out += rv; + count -= rv; + } + + return out; +} + static __initdata char *message; static void __init error(char *x) { @@ -346,7 +369,7 @@ static int __init do_name(void) static int __init do_copy(void) { if (count >= body_len) { - sys_write(wfd, victim, body_len); + xwrite(wfd, victim, body_len); sys_close(wfd); do_utime(vcollected, mtime); kfree(vcollected); @@ -354,7 +377,7 @@ static int __init do_copy(void) state = SkipIt; return 0; } else { - sys_write(wfd, victim, count); + xwrite(wfd, victim, count); body_len -= count; eat(count); return 1; @@ -603,8 +626,13 @@ static int __init populate_rootfs(void) fd = sys_open("/initrd.image", O_WRONLY|O_CREAT, 0700); if (fd >= 0) { - sys_write(fd, (char *)initrd_start, - initrd_end - initrd_start); + ssize_t written = xwrite(fd, (char *)initrd_start, + initrd_end - initrd_start); + + if (written != initrd_end - initrd_start) + pr_err("/initrd.image: incomplete write (%zd != %ld)\n", + written, initrd_end - initrd_start); + sys_close(fd); free_initrd(); } -- cgit v1.2.1 From d97b07c54f34e88352ebe676beb798c8f59ac588 Mon Sep 17 00:00:00 2001 From: Yinghai Lu Date: Fri, 8 Aug 2014 14:23:14 -0700 Subject: initramfs: support initramfs that is bigger than 2GiB Now with 64bit bzImage and kexec tools, we support ramdisk that size is bigger than 2g, as we could put it above 4G. Found compressed initramfs image could not be decompressed properly. It turns out that image length is int during decompress detection, and it will become < 0 when length is more than 2G. Furthermore, during decompressing len as int is used for inbuf count, that has problem too. Change len to long, that should be ok as on 32 bit platform long is 32bits. Tested with following compressed initramfs image as root with kexec. gzip, bzip2, xz, lzma, lzop, lz4. 
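The sign problem described above can be illustrated with a small userspace sketch. It assumes a compiler where narrowing an out-of-range value to a signed 32-bit type wraps modulo 2^32 (implementation-defined in C, but the behaviour of GCC and Clang); len_as_int32() is a hypothetical helper, not kernel code.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: what a byte count looks like once it has been
 * stored in a 32-bit signed integer, as the old "int len" did.
 * Assumes the usual wrap-around narrowing conversion.
 */
int32_t len_as_int32(int64_t len)
{
	return (int32_t)len;
}
```

Under that assumption a length above 2 GiB, such as 3219399480 bytes, comes back negative, while small lengths pass through unchanged. Carrying the length in a type that is 64 bits wide on 64-bit platforms avoids the wrap.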
run time for populate_rootfs():

   size        name            Nehalem-EX  Westmere-EX  Ivybridge-EX
 9034400256    root_img     :      26s         24s          30s
 3561095057    root_img.lz4 :      28s         27s          27s
 3459554629    root_img.lzo :      29s         29s          28s
 3219399480    root_img.gz  :      64s         62s          49s
 2251594592    root_img.xz  :     262s        260s         183s
 2226366598    root_img.lzma:     386s        376s         277s
 2901482513    root_img.bz2 :     635s        599s

Signed-off-by: Yinghai Lu Cc: "H. Peter Anvin" Cc: Ingo Molnar Cc: Rashika Kheria Cc: Josh Triplett Cc: Kyungsik Lee Cc: P J P Cc: Al Viro Cc: Tetsuo Handa Cc: "Daniel M. Weeks" Cc: Alexandre Courbot Cc: Jan Beulich Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds --- init/do_mounts_rd.c | 10 +++++----- init/initramfs.c | 22 +++++++++++----------- 2 files changed, 16 insertions(+), 16 deletions(-) (limited to 'init') diff --git a/init/do_mounts_rd.c b/init/do_mounts_rd.c index a8227022e3a0..e5d059e8aa11 100644 --- a/init/do_mounts_rd.c +++ b/init/do_mounts_rd.c @@ -311,9 +311,9 @@ static int exit_code; static int decompress_error; static int crd_infd, crd_outfd; -static int __init compr_fill(void *buf, unsigned int len) +static long __init compr_fill(void *buf, unsigned long len) { - int r = sys_read(crd_infd, buf, len); + long r = sys_read(crd_infd, buf, len); if (r < 0) printk(KERN_ERR "RAMDISK: error while reading compressed data"); else if (r == 0) @@ -321,13 +321,13 @@ static int __init compr_fill(void *buf, unsigned int len) return r; } -static int __init compr_flush(void *window, unsigned int outcnt) +static long __init compr_flush(void *window, unsigned long outcnt) { - int written = sys_write(crd_outfd, window, outcnt); + long written = sys_write(crd_outfd, window, outcnt); if (written != outcnt) { if (decompress_error == 0) printk(KERN_ERR - "RAMDISK: incomplete write (%d != %d)\n", + "RAMDISK: incomplete write (%ld != %ld)\n", written, outcnt); decompress_error = 1; return -1; diff --git a/init/initramfs.c b/init/initramfs.c index 4f276b6a167b..a7566031242e 100644 --- a/init/initramfs.c +++ b/init/initramfs.c
@@ -197,7 +197,7 @@ static __initdata enum state { } state, next_state; static __initdata char *victim; -static __initdata unsigned count; +static unsigned long count __initdata; static __initdata loff_t this_header, next_header; static inline void __init eat(unsigned n) @@ -209,7 +209,7 @@ static inline void __init eat(unsigned n) static __initdata char *vcollected; static __initdata char *collected; -static __initdata int remains; +static long remains __initdata; static __initdata char *collect; static void __init read_into(char *buf, unsigned size, enum state next) @@ -236,7 +236,7 @@ static int __init do_start(void) static int __init do_collect(void) { - unsigned n = remains; + unsigned long n = remains; if (count < n) n = count; memcpy(collect, victim, n); @@ -407,7 +407,7 @@ static __initdata int (*actions[])(void) = { [Reset] = do_reset, }; -static int __init write_buffer(char *buf, unsigned len) +static long __init write_buffer(char *buf, unsigned long len) { count = len; victim = buf; @@ -417,11 +417,11 @@ static int __init write_buffer(char *buf, unsigned len) return len - count; } -static int __init flush_buffer(void *bufv, unsigned len) +static long __init flush_buffer(void *bufv, unsigned long len) { char *buf = (char *) bufv; - int written; - int origLen = len; + long written; + long origLen = len; if (message) return -1; while ((written = write_buffer(buf, len)) < len && !message) { @@ -440,13 +440,13 @@ static int __init flush_buffer(void *bufv, unsigned len) return origLen; } -static unsigned my_inptr; /* index of next byte to be processed in inbuf */ +static unsigned long my_inptr; /* index of next byte to be processed in inbuf */ #include -static char * __init unpack_to_rootfs(char *buf, unsigned len) +static char * __init unpack_to_rootfs(char *buf, unsigned long len) { - int written, res; + long written; decompress_fn decompress; const char *compress_name; static __initdata char msg_buf[64]; @@ -480,7 +480,7 @@ static char * __init 
unpack_to_rootfs(char *buf, unsigned len) decompress = decompress_method(buf, len, &compress_name); pr_debug("Detected %s compressed data\n", compress_name); if (decompress) { - res = decompress(buf, len, NULL, flush_buffer, NULL, + int res = decompress(buf, len, NULL, flush_buffer, NULL, &my_inptr, error); if (res) error("decompressor failed"); -- cgit v1.2.1 From 9687fd9101afaa1c4b1de7ffd2f9d7e53f45b29f Mon Sep 17 00:00:00 2001 From: David Engraf Date: Fri, 8 Aug 2014 14:23:16 -0700 Subject: initramfs: add write error checks On a system with low memory extracting the initramfs may fail. If this happens the user gets "Failed to execute /init" instead of an initramfs error. Check return value of sys_write and call error() when the write was incomplete or failed. Signed-off-by: David Engraf Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds --- init/initramfs.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) (limited to 'init') diff --git a/init/initramfs.c b/init/initramfs.c index a7566031242e..bece48c3461e 100644 --- a/init/initramfs.c +++ b/init/initramfs.c @@ -369,7 +369,8 @@ static int __init do_name(void) static int __init do_copy(void) { if (count >= body_len) { - xwrite(wfd, victim, body_len); + if (xwrite(wfd, victim, body_len) != body_len) + error("write error"); sys_close(wfd); do_utime(vcollected, mtime); kfree(vcollected); @@ -377,7 +378,8 @@ static int __init do_copy(void) state = SkipIt; return 0; } else { - xwrite(wfd, victim, count); + if (xwrite(wfd, victim, count) != count) + error("write error"); body_len -= count; eat(count); return 1; -- cgit v1.2.1 From dd4d9fecbeba893e9ce2488e4d619e5397a2712a Mon Sep 17 00:00:00 2001 From: Fabian Frederick Date: Fri, 8 Aug 2014 14:23:44 -0700 Subject: init/main.c: code clean-up Fixing some checkpatch warnings(remove global initialization, move __initdata, coalesce formats ...) 
Signed-off-by: Fabian Frederick Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds --- init/main.c | 23 ++++++++++++----------- 1 file changed, 12 insertions(+), 11 deletions(-) (limited to 'init') diff --git a/init/main.c b/init/main.c index e8ae1fef0908..bb1aed928f21 100644 --- a/init/main.c +++ b/init/main.c @@ -6,7 +6,7 @@ * GK 2/5/95 - Changed to support mounting root fs via NFS * Added initrd & change_root: Werner Almesberger & Hans Lermen, Feb '96 * Moan early if gcc is old, avoiding bogus kernels - Paul Gortmaker, May '96 - * Simplified starting of init: Michael A. Griffith + * Simplified starting of init: Michael A. Griffith */ #define DEBUG /* Enable initcall_debug */ @@ -136,7 +136,7 @@ static char *ramdisk_execute_command; * Used to generate warnings if static_key manipulation functions are used * before jump_label_init is called. */ -bool static_key_initialized __read_mostly = false; +bool static_key_initialized __read_mostly; EXPORT_SYMBOL_GPL(static_key_initialized); /* @@ -159,8 +159,8 @@ static int __init set_reset_devices(char *str) __setup("reset_devices", set_reset_devices); -static const char * argv_init[MAX_INIT_ARGS+2] = { "init", NULL, }; -const char * envp_init[MAX_INIT_ENVS+2] = { "HOME=/", "TERM=linux", NULL, }; +static const char *argv_init[MAX_INIT_ARGS+2] = { "init", NULL, }; +const char *envp_init[MAX_INIT_ENVS+2] = { "HOME=/", "TERM=linux", NULL, }; static const char *panic_later, *panic_param; extern const struct obs_kernel_param __setup_start[], __setup_end[]; @@ -199,7 +199,6 @@ static int __init obsolete_checksetup(char *line) * still work even if initially too large, it will just take slightly longer */ unsigned long loops_per_jiffy = (1<<12); - EXPORT_SYMBOL(loops_per_jiffy); static int __init debug_kernel(char *str) @@ -376,8 +375,8 @@ static void __init setup_command_line(char *command_line) initcall_command_line = memblock_virt_alloc(strlen(boot_command_line) + 1, 0); static_command_line = 
memblock_virt_alloc(strlen(command_line) + 1, 0); - strcpy (saved_command_line, boot_command_line); - strcpy (static_command_line, command_line); + strcpy(saved_command_line, boot_command_line); + strcpy(static_command_line, command_line); } /* @@ -445,8 +444,8 @@ void __init parse_early_options(char *cmdline) /* Arch code calls this early on, or if not, just before other parsing. */ void __init parse_early_param(void) { - static __initdata int done = 0; - static __initdata char tmp_cmdline[COMMAND_LINE_SIZE]; + static int done __initdata; + static char tmp_cmdline[COMMAND_LINE_SIZE] __initdata; if (done) return; @@ -500,7 +499,8 @@ static void __init mm_init(void) asmlinkage __visible void __init start_kernel(void) { - char * command_line, *after_dashes; + char *command_line; + char *after_dashes; extern const struct kernel_param __start___param[], __stop___param[]; /* @@ -572,7 +572,8 @@ asmlinkage __visible void __init start_kernel(void) * fragile until we cpu_idle() for the first time. */ preempt_disable(); - if (WARN(!irqs_disabled(), "Interrupts were enabled *very* early, fixing it\n")) + if (WARN(!irqs_disabled(), + "Interrupts were enabled *very* early, fixing it\n")) local_irq_disable(); idr_init_cache(); rcu_init(); -- cgit v1.2.1 From de5b56ba51f63973ceb5c184ee0855f0c8a13fc9 Mon Sep 17 00:00:00 2001 From: Vivek Goyal Date: Fri, 8 Aug 2014 14:25:41 -0700 Subject: kernel: build bin2c based on config option CONFIG_BUILD_BIN2C currently bin2c builds only if CONFIG_IKCONFIG=y. But bin2c will now be used by kexec too. So make it compilation dependent on CONFIG_BUILD_BIN2C and this config option can be selected by CONFIG_KEXEC and CONFIG_IKCONFIG. Signed-off-by: Vivek Goyal Cc: Borislav Petkov Cc: Michael Kerrisk Cc: Yinghai Lu Cc: Eric Biederman Cc: H. 
Peter Anvin Cc: Matthew Garrett Cc: Greg Kroah-Hartman Cc: Dave Young Cc: WANG Chao Cc: Baoquan He Cc: Andy Lutomirski Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds --- init/Kconfig | 5 +++++ 1 file changed, 5 insertions(+) (limited to 'init') diff --git a/init/Kconfig b/init/Kconfig index a291b7ef4738..44f9ed3dae22 100644 --- a/init/Kconfig +++ b/init/Kconfig @@ -783,8 +783,13 @@ endchoice endmenu # "RCU Subsystem" +config BUILD_BIN2C + bool + default n + config IKCONFIG tristate "Kernel .config support" + select BUILD_BIN2C ---help--- This option enables the complete Linux kernel ".config" file contents to be saved in the kernel. It provides documentation -- cgit v1.2.1 From a2a368d905472293d4e13d09fdd9e537edc74347 Mon Sep 17 00:00:00 2001 From: Geert Uytterhoeven Date: Tue, 12 Aug 2014 13:46:11 -0700 Subject: mm: fix CROSS_MEMORY_ATTACH help text grammar Signed-off-by: Geert Uytterhoeven Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds --- init/Kconfig | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) (limited to 'init') diff --git a/init/Kconfig b/init/Kconfig index 44f9ed3dae22..e84c6423a2e5 100644 --- a/init/Kconfig +++ b/init/Kconfig @@ -268,7 +268,7 @@ config CROSS_MEMORY_ATTACH help Enabling this option adds the system calls process_vm_readv and process_vm_writev which allow a process with the correct privileges - to directly read from or write to to another process's address space. + to directly read from or write to another process' address space. See the man page for more details. config FHANDLE -- cgit v1.2.1 From d3ac21cacc24790eb45d735769f35753f5b56ceb Mon Sep 17 00:00:00 2001 From: Josh Triplett Date: Sun, 17 Aug 2014 19:41:09 -0500 Subject: mm: Support compiling out madvise and fadvise Many embedded systems will not need these syscalls, and omitting them saves space. Add a new EXPERT config option CONFIG_ADVISE_SYSCALLS (default y) to support compiling them out. 
bloat-o-meter:

add/remove: 0/3 grow/shrink: 0/0 up/down: 0/-2250 (-2250)
function                                     old     new   delta
sys_fadvise64                                 57       -     -57
sys_fadvise64_64                             691       -    -691
sys_madvise                                 1502       -   -1502

Signed-off-by: Josh Triplett --- init/Kconfig | 10 ++++++++++ 1 file changed, 10 insertions(+) (limited to 'init') diff --git a/init/Kconfig b/init/Kconfig index e84c6423a2e5..782a65bf76ea 100644 --- a/init/Kconfig +++ b/init/Kconfig @@ -1537,6 +1537,16 @@ config AIO by some high performance threaded applications. Disabling this option saves about 7k. +config ADVISE_SYSCALLS + bool "Enable madvise/fadvise syscalls" if EXPERT + default y + help + This option enables the madvise and fadvise syscalls, used by + applications to advise the kernel about their future memory or file + usage, improving performance. If building an embedded system where no + applications use these syscalls, you can disable this option to save + space. + config PCI_QUIRKS default y bool "Enable PCI quirk workarounds" if EXPERT -- cgit v1.2.1 From beb50df39e91745604ce3cb9dc6a503f39f4383d Mon Sep 17 00:00:00 2001 From: Bertrand Jacquin Date: Wed, 27 Aug 2014 20:31:56 +0930 Subject: kbuild: handle module compression while running 'make modules_install'.

Since module-init-tools (gzip) and kmod (gzip and xz) support compressed modules, it could be useful to include support for compressing modules right after having them installed. Doing this in kbuild instead of per distro makes this kind of usage more generic.

This patch adds a Kconfig entry to the "Enable loadable module support" menu and lets you choose to compress using gzip (default) or xz.

Neither gzip nor xz is given an extra -[1-9] option, since Andi Kleen and Rusty Russell showed that no gain is made by using them. gzip is called with the -n argument to avoid storing the original filename inside the compressed file, which saves a few more bytes.
On a v3.16 kernel, 'make allmodconfig' generated 4680 modules for a total of 378MB (no strip, no sign, no compress). The following table shows the observed disk space gain based on the allmodconfig .config:

        |              time             |
        +-------------+-----------------+
        | manual .ko  |       make      | size | percent
        | compression | modules_install |      | gain
        +-------------+-----------------+------+--------
   -    |             |     18.61s      | 378M |
   GZIP |    3m16s    |      3m37s      | 102M | 73.41%
   XZ   |    5m22s    |      5m39s      |  77M | 79.83%

The gain looks worthwhile for space-restricted environments. Decompression can be time consuming, but it happens only when a module is loaded, which is generally done only once.

This is fully compatible with signed modules: it is the signed module that gets compressed. module-init-tools or kmod handles decompression and provides the uncompressed but signed payload to the other layers.

Reviewed-by: Willy Tarreau Signed-off-by: Bertrand Jacquin Signed-off-by: Rusty Russell --- init/Kconfig | 43 +++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 43 insertions(+) (limited to 'init') diff --git a/init/Kconfig b/init/Kconfig index e84c6423a2e5..4980925bf348 100644 --- a/init/Kconfig +++ b/init/Kconfig @@ -1906,6 +1906,49 @@ config MODULE_SIG_HASH default "sha384" if MODULE_SIG_SHA384 default "sha512" if MODULE_SIG_SHA512 +config MODULE_COMPRESS + bool "Compress modules on installation" + depends on MODULES + help + This option compresses the kernel modules when 'make + modules_install' is run. + + The modules will be compressed either using gzip or xz depend on the + choice made in "Compression algorithm". + + module-init-tools has support for gzip format while kmod handle gzip + and xz compressed modules. + + When a kernel module is installed from outside of the main kernel + source and uses the Kbuild system for installing modules then that + kernel module will also be compressed when it is installed.
+ + This option provides little benefit when the modules are to be used inside + an initrd or initramfs, it generally is more efficient to compress the whole + initrd or initramfs instead. + + This is fully compatible with signed modules while the signed module is + compressed. module-init-tools or kmod handles decompression and provide to + other layer the uncompressed but signed payload. + +choice + prompt "Compression algorithm" + depends on MODULE_COMPRESS + default MODULE_COMPRESS_GZIP + help + This determines which sort of compression will be used during + 'make modules_install'. + + GZIP (default) and XZ are supported. + +config MODULE_COMPRESS_GZIP + bool "GZIP" + +config MODULE_COMPRESS_XZ + bool "XZ" + +endchoice + endif # MODULES config INIT_ALL_POSSIBLE -- cgit v1.2.1 From 8315f42295d2667a7f942f154b73a86fd7cb2227 Mon Sep 17 00:00:00 2001 From: "Paul E. McKenney" Date: Fri, 27 Jun 2014 13:42:20 -0700 Subject: rcu: Add call_rcu_tasks() This commit adds a new RCU-tasks flavor of RCU, which provides call_rcu_tasks(). This RCU flavor's quiescent states are voluntary context switch (not preemption!) and userspace execution (not the idle loop -- use some sort of schedule_on_each_cpu() if you need to handle the idle tasks. Note that unlike other RCU flavors, these quiescent states occur in tasks, not necessarily CPUs. Includes fixes from Steven Rostedt. This RCU flavor is assumed to have very infrequent latency-tolerant updaters. This assumption permits significant simplifications, including a single global callback list protected by a single global lock, along with a single task-private linked list containing all tasks that have not yet passed through a quiescent state. If experience shows this assumption to be incorrect, the required additional complexity will be added. Suggested-by: Steven Rostedt Signed-off-by: Paul E. 
McKenney --- init/Kconfig | 10 ++++++++++ 1 file changed, 10 insertions(+) (limited to 'init') diff --git a/init/Kconfig b/init/Kconfig index e84c6423a2e5..c4539c4e177f 100644 --- a/init/Kconfig +++ b/init/Kconfig @@ -507,6 +507,16 @@ config PREEMPT_RCU This option enables preemptible-RCU code that is common between TREE_PREEMPT_RCU and, in the old days, TINY_PREEMPT_RCU. +config TASKS_RCU + bool "Task_based RCU implementation using voluntary context switch" + default n + help + This option enables a task-based RCU implementation that uses + only voluntary context switch (not preemption!), idle, and + user-mode execution as quiescent states. + + If unsure, say N. + config RCU_STALL_COMMON def_bool ( TREE_RCU || TREE_PREEMPT_RCU || RCU_TRACE ) help -- cgit v1.2.1 From a80e49e2cc3145af014a8ae44f575829cc236192 Mon Sep 17 00:00:00 2001 From: Frederic Weisbecker Date: Sat, 16 Aug 2014 17:47:18 +0200 Subject: nohz: Move nohz full init call to tick init This way we unbloat a bit main.c and more importantly we initialize nohz full after init_IRQ(). This dependency will be needed in further patches because nohz full needs irq work to raise its own IRQ. Information about the support for this ability on ARM64 is obtained on init_IRQ() which initialize the pointer to __smp_call_function. Since tick_init() is called right after init_IRQ(), this is a good place to call tick_nohz_init() and prepare for that dependency. Acked-by: Peter Zijlstra (Intel) Cc: Ingo Molnar Cc: Paul E. 
McKenney Cc: Peter Zijlstra Cc: Thomas Gleixner Signed-off-by: Frederic Weisbecker --- init/main.c | 1 - 1 file changed, 1 deletion(-) (limited to 'init') diff --git a/init/main.c b/init/main.c index bb1aed928f21..8af2f1abfe38 100644 --- a/init/main.c +++ b/init/main.c @@ -577,7 +577,6 @@ asmlinkage __visible void __init start_kernel(void) local_irq_disable(); idr_init_cache(); rcu_init(); - tick_nohz_init(); context_tracking_init(); radix_tree_init(); /* init some links before init_ISA_irqs() */ -- cgit v1.2.1 From f4579fc57cf4244057b713b1f73f4dc9f0b11e97 Mon Sep 17 00:00:00 2001 From: "Paul E. McKenney" Date: Fri, 25 Jul 2014 11:21:47 -0700 Subject: rcu: Fix attempt to avoid unsolicited offloading of callbacks Commit b58cc46c5f6b (rcu: Don't offload callbacks unless specifically requested) failed to adjust the callback lists of the CPUs that are known to be no-CBs CPUs only because they are also nohz_full= CPUs. This failure can result in callbacks that are posted during early boot getting stranded on nxtlist for CPUs whose no-CBs property becomes apparent late, and there can also be spurious warnings about offline CPUs posting callbacks. This commit fixes these problems by adding an early-boot rcu_init_nohz() that properly initializes the no-CBs CPUs. Note that kernels built with CONFIG_RCU_NOCB_CPU_ALL=y or with CONFIG_RCU_NOCB_CPU=n do not exhibit this bug. Neither do kernels booted without the nohz_full= boot parameter. Signed-off-by: Paul E. McKenney Reviewed-by: Pranith Kumar Tested-by: Paul Gortmaker --- init/Kconfig | 4 ++-- init/main.c | 1 + 2 files changed, 3 insertions(+), 2 deletions(-) (limited to 'init') diff --git a/init/Kconfig b/init/Kconfig index e84c6423a2e5..64ee4d967786 100644 --- a/init/Kconfig +++ b/init/Kconfig @@ -737,7 +737,7 @@ choice config RCU_NOCB_CPU_NONE bool "No build_forced no-CBs CPUs" - depends on RCU_NOCB_CPU && !NO_HZ_FULL_ALL + depends on RCU_NOCB_CPU help This option does not force any of the CPUs to be no-CBs CPUs. 
 	  Only CPUs designated by the rcu_nocbs= boot parameter will be
@@ -751,7 +751,7 @@ config RCU_NOCB_CPU_NONE
 
 config RCU_NOCB_CPU_ZERO
 	bool "CPU 0 is a build_forced no-CBs CPU"
-	depends on RCU_NOCB_CPU && !NO_HZ_FULL_ALL
+	depends on RCU_NOCB_CPU
 	help
 	  This option forces CPU 0 to be a no-CBs CPU, so that its RCU
 	  callbacks are invoked by a per-CPU kthread whose name begins
diff --git a/init/main.c b/init/main.c
index bb1aed928f21..e3c4cdd94d5b 100644
--- a/init/main.c
+++ b/init/main.c
@@ -578,6 +578,7 @@ asmlinkage __visible void __init start_kernel(void)
 	idr_init_cache();
 	rcu_init();
 	tick_nohz_init();
+	rcu_init_nohz();
 	context_tracking_init();
 	radix_tree_init();
 	/* init some links before init_ISA_irqs() */
-- 
cgit v1.2.1

From 8ba4caf1ee1585f018d32ab924244c9589bc9f37 Mon Sep 17 00:00:00 2001
From: Paul Gortmaker
Date: Wed, 17 Sep 2014 10:57:45 -0400
Subject: Revert "init: make rootdelay=N consistent with rootwait behaviour"

This reverts commit 4dfe694f616e00e6fd83e5bbcd7a3c4d7113493d.

In that, we did:

    Here we move the rootdelay code to be right beside the rootwait code,
    so that their behaviour is consistent.

...which is fine, but in hindsight, perhaps moving the rootwait to be
beside the rootdelay would have been better.

We also indicated:

    It should be noted that in doing so, the actions based on the
    saved_root_name[0] and initrd_load() were previously put on hold by
    rootdelay=N and now currently will not be delayed. However, I think
    consistent behaviour is more important than matching historical
    behaviour of delaying the above two operations.

But Pavel reported an instance where an ARM target with root on MMC was
failing to mount root, and Russell diagnosed it to the fact that the
call to set ROOT_DEV within the saved_root_name[0] processing block
mentioned above was no longer being delayed.
Rather than moving both wait clauses to the original position of
rootdelay and risking unearthing other possible corner case breakage at
this point in time, we simply revert now and we can revisit trying the
alternate/earlier location in another development cycle.

Cc: Pavel Machek
Cc: Russell King
Cc: Andrew Morton
Signed-off-by: Paul Gortmaker
Signed-off-by: Linus Torvalds
---
 init/do_mounts.c | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

(limited to 'init')

diff --git a/init/do_mounts.c b/init/do_mounts.c
index b6237c31b0e2..82f22885c87e 100644
--- a/init/do_mounts.c
+++ b/init/do_mounts.c
@@ -539,6 +539,12 @@ void __init prepare_namespace(void)
 {
 	int is_floppy;
 
+	if (root_delay) {
+		printk(KERN_INFO "Waiting %d sec before mounting root device...\n",
+		       root_delay);
+		ssleep(root_delay);
+	}
+
 	/*
 	 * wait for the known devices to complete their probing
 	 *
@@ -565,12 +571,6 @@ void __init prepare_namespace(void)
 	if (initrd_load())
 		goto out;
 
-	if (root_delay) {
-		pr_info("Waiting %d sec before mounting root device...\n",
-			root_delay);
-		ssleep(root_delay);
-	}
-
 	/* wait for any asynchronous scanning to complete */
 	if ((ROOT_DEV == 0) && root_wait) {
 		printk(KERN_INFO "Waiting for root device %s...\n",
-- 
cgit v1.2.1

From d4311ff1a8da48d609db9500f121c15580dfeeb7 Mon Sep 17 00:00:00 2001
From: Aaron Tomlin
Date: Fri, 12 Sep 2014 14:16:17 +0100
Subject: init/main.c: Give init_task a canary

Tasks get their end of stack set to STACK_END_MAGIC with the
aim to catch stack overruns. Currently this feature does not
apply to init_task. This patch removes this restriction.
Note that a similar patch was posted by Prarit Bhargava
some time ago but was never merged:

  http://marc.info/?l=linux-kernel&m=127144305403241&w=2

Signed-off-by: Aaron Tomlin
Signed-off-by: Peter Zijlstra (Intel)
Acked-by: Oleg Nesterov
Acked-by: Michael Ellerman
Cc: aneesh.kumar@linux.vnet.ibm.com
Cc: dzickus@redhat.com
Cc: bmr@redhat.com
Cc: jcastillo@redhat.com
Cc: jgh@redhat.com
Cc: minchan@kernel.org
Cc: tglx@linutronix.de
Cc: hannes@cmpxchg.org
Cc: Alex Thorlton
Cc: Andrew Morton
Cc: Benjamin Herrenschmidt
Cc: Daeseok Youn
Cc: David Rientjes
Cc: Fabian Frederick
Cc: Geert Uytterhoeven
Cc: Jiri Olsa
Cc: Kees Cook
Cc: Kirill A. Shutemov
Cc: Linus Torvalds
Cc: Masami Hiramatsu
Cc: Michael Opdenacker
Cc: Paul Mackerras
Cc: Prarit Bhargava
Cc: Rik van Riel
Cc: Rusty Russell
Cc: Seiji Aguchi
Cc: Steven Rostedt
Cc: Vladimir Davydov
Cc: Yasuaki Ishimatsu
Cc: linuxppc-dev@lists.ozlabs.org
Link: http://lkml.kernel.org/r/1410527779-8133-2-git-send-email-atomlin@redhat.com
Signed-off-by: Ingo Molnar
---
 init/main.c | 1 +
 1 file changed, 1 insertion(+)

(limited to 'init')

diff --git a/init/main.c b/init/main.c
index bb1aed928f21..5fc3fc7bd475 100644
--- a/init/main.c
+++ b/init/main.c
@@ -508,6 +508,7 @@ asmlinkage __visible void __init start_kernel(void)
 	 * lockdep hash:
 	 */
 	lockdep_init();
+	set_task_stack_end_magic(&init_task);
 	smp_setup_processor_id();
 	debug_objects_early_init();
 
-- 
cgit v1.2.1

From 361e9dfbaae84b0b246ed18d1ab7c82a1a41b53e Mon Sep 17 00:00:00 2001
From: Josh Triplett
Date: Fri, 3 Oct 2014 16:00:54 -0700
Subject: init/Kconfig: Hide printk log config if CONFIG_PRINTK=n

The buffers sized by CONFIG_LOG_BUF_SHIFT and
CONFIG_LOG_CPU_MAX_BUF_SHIFT do not exist if CONFIG_PRINTK=n, so don't
ask about their size at all.
Signed-off-by: Josh Triplett
Acked-by: Randy Dunlap
Cc: stable
---
 init/Kconfig | 2 ++
 1 file changed, 2 insertions(+)

(limited to 'init')

diff --git a/init/Kconfig b/init/Kconfig
index e84c6423a2e5..31505a52c165 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -811,6 +811,7 @@ config LOG_BUF_SHIFT
 	int "Kernel log buffer size (16 => 64KB, 17 => 128KB)"
 	range 12 21
 	default 17
+	depends on PRINTK
 	help
 	  Select the minimal kernel log buffer size as a power of 2.
 	  The final size is affected by LOG_CPU_MAX_BUF_SHIFT config
@@ -830,6 +831,7 @@ config LOG_CPU_MAX_BUF_SHIFT
 	range 0 21
 	default 12 if !BASE_SMALL
 	default 0 if BASE_SMALL
+	depends on PRINTK
 	help
 	  This option allows to increase the default ring buffer size
 	  according to the number of CPUs. The value defines the contribution
-- 
cgit v1.2.1

From 62b4d2041117f35ab2409c9f5c4b8d3dc8e59d0f Mon Sep 17 00:00:00 2001
From: Josh Triplett
Date: Fri, 3 Oct 2014 16:19:24 -0700
Subject: init/Kconfig: Fix HAVE_FUTEX_CMPXCHG to not break up the EXPERT menu

commit 03b8c7b623c80af264c4c8d6111e5c6289933666 ("futex: Allow
architectures to skip futex_atomic_cmpxchg_inatomic() test") added the
HAVE_FUTEX_CMPXCHG symbol right below FUTEX. This placed it right in
the middle of the options for the EXPERT menu. However,
HAVE_FUTEX_CMPXCHG does not depend on EXPERT or FUTEX, so Kconfig stops
placing items in the EXPERT menu, and displays the remaining several
EXPERT items (starting with EPOLL) directly in the General Setup menu.

Since both users of HAVE_FUTEX_CMPXCHG only select it "if FUTEX", make
HAVE_FUTEX_CMPXCHG itself depend on FUTEX. With this change, the
subsequent items display as part of the EXPERT menu again; the EMBEDDED
menu now appears as the next top-level item in the General Setup menu,
which makes General Setup much shorter and more usable.
Signed-off-by: Josh Triplett
Acked-by: Randy Dunlap
Cc: stable
---
 init/Kconfig | 1 +
 1 file changed, 1 insertion(+)

(limited to 'init')

diff --git a/init/Kconfig b/init/Kconfig
index 31505a52c165..80a6907f91c5 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -1477,6 +1477,7 @@ config FUTEX
 
 config HAVE_FUTEX_CMPXCHG
 	bool
+	depends on FUTEX
 	help
 	  Architectures should select this if futex_atomic_cmpxchg_inatomic()
 	  is implemented and always working. This removes a couple of runtime
-- 
cgit v1.2.1

From 6a33979d5bd7521497121c5ae4435d7003115a0f Mon Sep 17 00:00:00 2001
From: Mel Gorman
Date: Thu, 9 Oct 2014 15:26:33 -0700
Subject: mm: remove misleading ARCH_USES_NUMA_PROT_NONE

ARCH_USES_NUMA_PROT_NONE was defined for architectures that implemented
_PAGE_NUMA using _PROT_NONE. This saved using an additional PTE bit and
relied on the fact that PROT_NONE vmas were skipped by the NUMA hinting
fault scanner. This was found to be conceptually confusing with a lot of
implicit assumptions and it was asked that an alternative be found.

Commit c46a7c81 "x86: define _PAGE_NUMA by reusing software bits on the
PMD and PTE levels" redefined _PAGE_NUMA on x86 to be one of the swap PTE
bits and shrunk the maximum possible swap size but it did not go far
enough. There are no architectures that reuse _PROT_NONE as _PROT_NUMA
but the relics still exist.

This patch removes ARCH_USES_NUMA_PROT_NONE and removes some unnecessary
duplication in powerpc vs the generic implementation by defining the
types the core NUMA helpers expected to exist from x86 with their ppc64
equivalent. This necessitated that a PTE bit mask be created that
identified the bits that distinguish present from NUMA pte entries but it
is expected this will only differ between arches based on _PAGE_PROTNONE.
The naming for the generic helpers was taken from x86 originally but
ppc64 has types that are equivalent for the purposes of the helper so
they are mapped instead of duplicating code.
Signed-off-by: Mel Gorman
Cc: Hugh Dickins
Cc: "Kirill A. Shutemov"
Cc: Rik van Riel
Cc: Johannes Weiner
Cc: Cyrill Gorcunov
Reviewed-by: Aneesh Kumar K.V
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
---
 init/Kconfig | 11 -----------
 1 file changed, 11 deletions(-)

(limited to 'init')

diff --git a/init/Kconfig b/init/Kconfig
index e25a82a291a6..d2355812ba48 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -889,17 +889,6 @@ config ARCH_SUPPORTS_INT128
 config ARCH_WANT_NUMA_VARIABLE_LOCALITY
 	bool
 
-#
-# For architectures that are willing to define _PAGE_NUMA as _PAGE_PROTNONE
-config ARCH_WANTS_PROT_NUMA_PROT_NONE
-	bool
-
-config ARCH_USES_NUMA_PROT_NONE
-	bool
-	default y
-	depends on ARCH_WANTS_PROT_NUMA_PROT_NONE
-	depends on NUMA_BALANCING
-
 config NUMA_BALANCING_DEFAULT_ENABLED
 	bool "Automatically enable NUMA aware memory/task placement"
 	default y
-- 
cgit v1.2.1

From 2240a31db67582468e2f7a5a5962b7d0ffaaa6a4 Mon Sep 17 00:00:00 2001
From: Geert Uytterhoeven
Date: Mon, 13 Oct 2014 15:51:11 -0700
Subject: printk: don't bother using LOG_CPU_MAX_BUF_SHIFT on !SMP

When configuring a uniprocessor kernel, don't bother the user with an
irrelevant LOG_CPU_MAX_BUF_SHIFT question, and don't build the unused
code.

Signed-off-by: Geert Uytterhoeven
Acked-by: Luis R. Rodriguez
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
---
 init/Kconfig | 1 +
 1 file changed, 1 insertion(+)

(limited to 'init')

diff --git a/init/Kconfig b/init/Kconfig
index 1c505e090422..3ee28ae02cc8 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -838,6 +838,7 @@ config LOG_BUF_SHIFT
 
 config LOG_CPU_MAX_BUF_SHIFT
 	int "CPU kernel log buffer size contribution (13 => 8 KB, 17 => 128KB)"
+	depends on SMP
 	range 0 21
 	default 12 if !BASE_SMALL
 	default 0 if BASE_SMALL
-- 
cgit v1.2.1

From c34d85aca91729596f876604e147892b81ecbbe9 Mon Sep 17 00:00:00 2001
From: Mark Rustad
Date: Mon, 13 Oct 2014 15:54:07 -0700
Subject: init/initramfs.c: resolve shadow warnings

Resolve shadow warnings that are produced in W=2 builds by renaming a
global with a too-generic name and renaming a formal parameter.

[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Mark Rustad
Signed-off-by: Jeff Kirsher
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
---
 init/initramfs.c | 34 +++++++++++++++++-----------------
 1 file changed, 17 insertions(+), 17 deletions(-)

(limited to 'init')

diff --git a/init/initramfs.c b/init/initramfs.c
index bece48c3461e..ad1bd7787bbb 100644
--- a/init/initramfs.c
+++ b/init/initramfs.c
@@ -197,14 +197,14 @@ static __initdata enum state {
 } state, next_state;
 
 static __initdata char *victim;
-static unsigned long count __initdata;
+static unsigned long byte_count __initdata;
 static __initdata loff_t this_header, next_header;
 
 static inline void __init eat(unsigned n)
 {
 	victim += n;
 	this_header += n;
-	count -= n;
+	byte_count -= n;
 }
 
 static __initdata char *vcollected;
@@ -214,7 +214,7 @@ static __initdata char *collect;
 
 static void __init read_into(char *buf, unsigned size, enum state next)
 {
-	if (count >= size) {
+	if (byte_count >= size) {
 		collected = victim;
 		eat(size);
 		state = next;
@@ -237,8 +237,8 @@ static int __init do_start(void)
 static int __init do_collect(void)
 {
 	unsigned long n = remains;
-	if (count < n)
-		n = count;
+	if (byte_count < n)
+		n = byte_count;
 	memcpy(collect, victim, n);
 	eat(n);
 	collect += n;
@@ -280,8 +280,8 @@ static int __init do_header(void)
 
 static int __init do_skip(void)
 {
-	if (this_header + count < next_header) {
-		eat(count);
+	if (this_header + byte_count < next_header) {
+		eat(byte_count);
 		return 1;
 	} else {
 		eat(next_header - this_header);
@@ -292,9 +292,9 @@ static int __init do_skip(void)
 
 static int __init do_reset(void)
 {
-	while(count && *victim == '\0')
+	while (byte_count && *victim == '\0')
 		eat(1);
-	if (count && (this_header & 3))
+	if (byte_count && (this_header & 3))
 		error("broken padding");
 	return 1;
 }
@@ -309,11 +309,11 @@ static int __init maybe_link(void)
 	return 0;
 }
 
-static void __init clean_path(char *path, umode_t mode)
+static void __init clean_path(char *path, umode_t fmode)
 {
 	struct stat st;
 
-	if (!sys_newlstat(path, &st) && (st.st_mode^mode) & S_IFMT) {
+	if (!sys_newlstat(path, &st) && (st.st_mode ^ fmode) & S_IFMT) {
 		if (S_ISDIR(st.st_mode))
 			sys_rmdir(path);
 		else
@@ -368,7 +368,7 @@ static int __init do_name(void)
 
 static int __init do_copy(void)
 {
-	if (count >= body_len) {
+	if (byte_count >= body_len) {
 		if (xwrite(wfd, victim, body_len) != body_len)
 			error("write error");
 		sys_close(wfd);
@@ -378,10 +378,10 @@ static int __init do_copy(void)
 		state = SkipIt;
 		return 0;
 	} else {
-		if (xwrite(wfd, victim, count) != count)
+		if (xwrite(wfd, victim, byte_count) != byte_count)
 			error("write error");
-		body_len -= count;
-		eat(count);
+		body_len -= byte_count;
+		eat(byte_count);
 		return 1;
 	}
 }
@@ -411,12 +411,12 @@ static __initdata int (*actions[])(void) = {
 
 static long __init write_buffer(char *buf, unsigned long len)
 {
-	count = len;
+	byte_count = len;
 	victim = buf;
 
 	while (!actions[state]())
 		;
-	return len - count;
+	return len - byte_count;
 }
 
 static long __init flush_buffer(void *bufv, unsigned long len)
-- 
cgit v1.2.1

From 63a12d9d01831208a47f5c0fbbf93f503d1fb162 Mon Sep 17 00:00:00 2001
From: Geert Uytterhoeven
Date: Mon, 13 Oct 2014 15:55:44 -0700
Subject: kernel/param: consolidate __{start,stop}___param[] in

Consolidate the various external const and non-const declarations of
__start___param[] and __stop___param in . This requires making a few
struct kernel_param pointers in kernel/params.c const.

Signed-off-by: Geert Uytterhoeven
Acked-by: Rusty Russell
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
---
 init/main.c | 2 --
 1 file changed, 2 deletions(-)

(limited to 'init')

diff --git a/init/main.c b/init/main.c
index 89ec862da2d4..800a0daede7e 100644
--- a/init/main.c
+++ b/init/main.c
@@ -501,7 +501,6 @@ asmlinkage __visible void __init start_kernel(void)
 {
 	char *command_line;
 	char *after_dashes;
-	extern const struct kernel_param __start___param[], __stop___param[];
 
 	/*
 	 * Need to run as early as possible, to initialize the
@@ -844,7 +843,6 @@ static char *initcall_level_names[] __initdata = {
 
 static void __init do_initcall_level(int level)
 {
-	extern const struct kernel_param __start___param[], __stop___param[];
 	initcall_t *fn;
 
 	strcpy(initcall_command_line, saved_command_line);
-- 
cgit v1.2.1

From f89b7755f517cdbb755d7543eef986ee9d54e654 Mon Sep 17 00:00:00 2001
From: Alexei Starovoitov
Date: Thu, 23 Oct 2014 18:41:08 -0700
Subject: bpf: split eBPF out of NET

introduce two configs:
- hidden CONFIG_BPF to select eBPF interpreter that classic socket filters
  depend on
- visible CONFIG_BPF_SYSCALL (default off) that tracing and sockets can use

that solves several problems:
- tracing and others that wish to use eBPF don't need to depend on NET.
  They can use BPF_SYSCALL to allow loading from userspace or select BPF
  to use it directly from kernel in NET-less configs.
- in 3.18 programs cannot be attached to events yet, so don't force it on
- when the rest of eBPF infra is there in 3.19+, it's still useful to
  switch it off to minimize kernel size

bloat-o-meter on x64 shows:
add/remove: 0/60 grow/shrink: 0/2 up/down: 0/-15601 (-15601)

tested with many different config combinations.
Hopefully didn't miss anything.

Signed-off-by: Alexei Starovoitov
Acked-by: Daniel Borkmann
Signed-off-by: David S. Miller
---
 init/Kconfig | 14 ++++++++++++++
 1 file changed, 14 insertions(+)

(limited to 'init')

diff --git a/init/Kconfig b/init/Kconfig
index 3ee28ae02cc8..2081a4d3d917 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -1341,6 +1341,10 @@ config SYSCTL_ARCH_UNALIGN_ALLOW
 config HAVE_PCSPKR_PLATFORM
 	bool
 
+# interpreter that classic socket filters depend on
+config BPF
+	bool
+
 menuconfig EXPERT
 	bool "Configure standard kernel features (expert users)"
 	# Unhide debug options, to make the on-by-default options visible
@@ -1521,6 +1525,16 @@ config EVENTFD
 
 	  If unsure, say Y.
 
+# syscall, maps, verifier
+config BPF_SYSCALL
+	bool "Enable bpf() system call" if EXPERT
+	select ANON_INODES
+	select BPF
+	default n
+	help
+	  Enable the bpf() system call that allows to manipulate eBPF
+	  programs and maps via file descriptors.
+
 config SHMEM
 	bool "Use full shmem filesystem" if EXPERT
 	default y
-- 
cgit v1.2.1

From 3438cf549d2f3ee8e52c82acc8e2a9710ac21a5b Mon Sep 17 00:00:00 2001
From: Daniel Thompson
Date: Tue, 11 Nov 2014 16:29:46 +1030
Subject: param: fix crash on bad kernel arguments

Currently if the user passes an invalid value on the kernel command line
then the kernel will crash during argument parsing. On most systems this
is very hard to debug because the console hasn't been initialized yet.

This is a regression due to commit 51e158c12aca ("param: hand arguments
after -- straight to init") which, in response to the systemd debug
controversy, made it possible to explicitly pass arguments to init. To
achieve this parse_args() was extended from simply returning an error
code to returning a pointer. Regrettably the new init args logic does
not perform a proper validity check on the pointer resulting in a crash.

This patch fixes the validity check. Should the check fail then no
arguments will be passed to init.
This is reasonable and matches how the kernel treats
its own arguments (i.e. no error recovery).

Signed-off-by: Daniel Thompson
Cc: stable@vger.kernel.org
Signed-off-by: Rusty Russell
---
 init/main.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

(limited to 'init')

diff --git a/init/main.c b/init/main.c
index 800a0daede7e..321d0ceb26d3 100644
--- a/init/main.c
+++ b/init/main.c
@@ -544,7 +544,7 @@ asmlinkage __visible void __init start_kernel(void)
 				  static_command_line, __start___param,
 				  __stop___param - __start___param,
 				  -1, -1, &unknown_bootoption);
-	if (after_dashes)
+	if (!IS_ERR_OR_NULL(after_dashes))
 		parse_args("Setting init args", after_dashes, NULL, 0, -1, -1,
 			   set_init_arg);
 
-- 
cgit v1.2.1
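
Note on the last patch above: the fix matters because a function that returns either a valid pointer, NULL, or an error code encoded as a pointer cannot be checked with a plain truth test. The following is a minimal userspace sketch of that idea, not the kernel source. The ERR_PTR/IS_ERR_OR_NULL helpers here are simplified stand-ins written for illustration (the real ones live in the kernel's err.h), and should_parse_init_args() and old_check() are hypothetical names that only mirror the new and old conditions in start_kernel().

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Simplified userspace stand-ins for the kernel's error-pointer helpers.
 * Assumption: errno values up to MAX_ERRNO are encoded as the topmost
 * addresses of the pointer range, as the kernel does.
 */
#define MAX_ERRNO 4095

/* Encode a negative errno value as a pointer. */
static inline void *ERR_PTR(long error)
{
	return (void *)error;
}

/* True if the value falls in the range reserved for encoded errors. */
static inline bool IS_ERR_VALUE(unsigned long x)
{
	return x >= (unsigned long)-MAX_ERRNO;
}

/* True for NULL and for any encoded error pointer. */
static inline bool IS_ERR_OR_NULL(const void *ptr)
{
	return !ptr || IS_ERR_VALUE((unsigned long)ptr);
}

/* Hypothetical helper: the condition used after the fix. */
static bool should_parse_init_args(const char *after_dashes)
{
	return !IS_ERR_OR_NULL(after_dashes);
}

/* Hypothetical helper: the condition used before the fix, for contrast. */
static bool old_check(const char *after_dashes)
{
	return after_dashes != NULL;
}
```

With these definitions, old_check(ERR_PTR(-EINVAL)) is true, so the pre-fix code would go on to treat an encoded error as a string, while should_parse_init_args() rejects both NULL and the encoded error and accepts an ordinary string pointer.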