From 24512ed303663863d15ff84aa96d183e8405bcd8 Mon Sep 17 00:00:00 2001 From: Vladimir Davydov Date: Sat, 25 Jun 2016 10:10:10 +1000 Subject: af_unix: charge buffers to kmemcg Unix sockets can consume a significant amount of system memory, hence they should be accounted to kmemcg. Since unix socket buffers are always allocated from process context, all we need to do to charge them to kmemcg is set __GFP_ACCOUNT in sock->sk_allocation mask. Eric asked: > 1) What happens when a buffer, allocated from socket lands in a > different socket , maybe owned by another user/process. > > Who owns it now, in term of kmemcg accounting ? We never move memcg charges. E.g. if two processes from different cgroups are sharing a memory region, each page will be charged to the process which touched it first. Or if two processes are working with the same directory tree, inodes and dentries will be charged to the first user. The same is fair for unix socket buffers - they will be charged to the sender. > 2) Has performance impact been evaluated ? I ran netperf STREAM_STREAM with default options in a kmemcg on a 4 core x 2 HT box. The results are below: # clients bandwidth (10^6bits/sec) base patched 1 67643 +- 725 64874 +- 353 - 4.0 % 4 193585 +- 2516 186715 +- 1460 - 3.5 % 8 194820 +- 377 187443 +- 1229 - 3.7 % So the accounting doesn't come for free - it takes ~4% of performance. I believe we could optimize it by using per cpu batching not only on charge, but also on uncharge in memcg core, but that's beyond the scope of this patch set - I'll take a look at this later. Anyway, if performance impact is found to be unacceptable, it is always possible to disable kmem accounting at boot time (cgroup.memory=nokmem) or not use memory cgroups at runtime at all (thanks to jump labels there'll be no overhead even if they are compiled in). Link: http://lkml.kernel.org/r/fcfe6cae27a59fbc5e40145664b3cf085a560c68.1464079538.git.vdavydov@virtuozzo.com Signed-off-by: Vladimir Davydov Cc: "David S. Miller" Cc: Johannes Weiner Cc: Michal Hocko Cc: Eric Dumazet Cc: Minchan Kim Signed-off-by: Andrew Morton --- net/unix/af_unix.c | 1 + 1 file changed, 1 insertion(+) (limited to 'net') diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c index 735362c26c8e..f1dffe84f0d5 100644 --- a/net/unix/af_unix.c +++ b/net/unix/af_unix.c @@ -769,6 +769,7 @@ static struct sock *unix_create1(struct net *net, struct socket *sock, int kern) lockdep_set_class(&sk->sk_receive_queue.lock, &af_unix_sk_receive_queue_lock_key); + sk->sk_allocation = GFP_KERNEL_ACCOUNT; sk->sk_write_space = unix_write_space; sk->sk_max_ack_backlog = net->unx.sysctl_max_dgram_qlen; sk->sk_destruct = unix_sock_destructor; -- cgit v1.2.1