author Oleksandr Natalenko <oleksandr@redhat.com> 2020-04-06 09:54:35 +1000
committer Stephen Rothwell <sfr@canb.auug.org.au> 2020-04-14 11:51:06 +1000
commit 5fd7a3e71cef1e1b04f79a3b61b01808752b5f36
tree a045e5269f100cf9e8adcb88f608062aeacbaf06
parent 28c3d428cea3b2e66212a5758d2b5adac01e4395
mm/madvise: allow KSM hints for remote API

It all began with the fact that KSM works only on memory that is marked
by madvise(). The only ways to get around that are to either:

  * use LD_PRELOAD; or
  * patch the kernel with something like UKSM or PKSM.

(I intentionally skip the ptrace can of worms here.)

To overcome this restriction, let's employ the new remote madvise API.
It can be used by a small userspace helper daemon that does the auto-KSM
job for us.

I think of two major consumers of remote KSM hints:

  * hosts that run containers, especially similar ones and especially in
    a trusted environment, sharing the same runtime like Node.js;
  * heavy applications that can be run in multiple instances, not
    limited to open source ones like Firefox, but also those that
    cannot be modified since they are binary-only and, maybe, statically
    linked.

Speaking of statistics, more numbers can be found in the very first
submission that is related to this one [1]. For my current setup with
two Firefox instances I get 100 to 200 MiB saved for the second instance,
depending on the number of tabs.

1 FF instance with 15 tabs:

    $ echo "$(cat /sys/kernel/mm/ksm/pages_sharing) * 4 / 1024" | bc
    410

2 FF instances, second one has 12 tabs (all the tabs are different):

    $ echo "$(cat /sys/kernel/mm/ksm/pages_sharing) * 4 / 1024" | bc
    592

At the moment I do not have specific numbers for a containerised
workload, but those should be comparable in case the containers share a
similar or the same runtime.

[1] https://lore.kernel.org/patchwork/patch/1012142/

Link: http://lkml.kernel.org/r/20200302193630.68771-8-minchan@kernel.org
Signed-off-by: Oleksandr Natalenko <oleksandr@redhat.com>
Signed-off-by: Minchan Kim <minchan@kernel.org>
Reviewed-by: SeongJae Park <sjpark@amazon.de>
Cc: Alexander Duyck <alexander.h.duyck@linux.intel.com>
Cc: Brian Geffon <bgeffon@google.com>
Cc: Christian Brauner <christian@brauner.io>
Cc: Daniel Colascione <dancol@google.com>
Cc: Jann Horn <jannh@google.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Joel Fernandes <joel@joelfernandes.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: John Dias <joaodias@google.com>
Cc: Kirill Tkhai <ktkhai@virtuozzo.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Sandeep Patil <sspatil@google.com>
Cc: SeongJae Park <sj38.park@gmail.com>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Sonny Rao <sonnyrao@google.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Tim Murray <timmurray@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: <linux-man@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>

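The bc one-liners in the message turn the pages_sharing counter into MiB by multiplying by 4 (which assumes 4 KiB pages) and dividing by 1024. A minimal sketch of the same arithmetic follows; the function name ksm_saved_mib and its page_size_kib parameter are hypothetical and not part of the patch.

```c
/*
 * Convert a KSM pages_sharing count into whole MiB.
 * Integer division truncates, which matches bc with its default scale of 0.
 * page_size_kib is 4 in the commit message; other page sizes are an
 * assumption of this sketch.
 */
unsigned long ksm_saved_mib(unsigned long pages_sharing,
			    unsigned long page_size_kib)
{
	return pages_sharing * page_size_kib / 1024;
}
```

For instance, 256 shared pages of 4 KiB each come to exactly 1 MiB, and any remainder below a full MiB is dropped.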
-rw-r--r--  mm/madvise.c  4
1 files changed, 4 insertions, 0 deletions
diff --git a/mm/madvise.c b/mm/madvise.c
index 59ae5804c479..097506466fdc 100644
--- a/mm/madvise.c
+++ b/mm/madvise.c
@@ -1011,6 +1011,10 @@ process_madvise_behavior_valid(int behavior)
 	switch (behavior) {
 	case MADV_COLD:
 	case MADV_PAGEOUT:
+#ifdef CONFIG_KSM
+	case MADV_MERGEABLE:
+	case MADV_UNMERGEABLE:
+#endif
 		return true;
 	default:
 		return false;