summaryrefslogtreecommitdiff
Commit message (Collapse)AuthorAgeFilesLines
* ipv6: Replace spinlock with seqlock and rcu in ip6_tunnelMartin KaFai Lau2015-09-153-28/+36
| | | | | | | | | This patch uses a seqlock to ensure consistency between idst->dst and idst->cookie. It also makes dst freeing from fib tree to undergo a rcu grace period. Signed-off-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* ipv6: Avoid double dst_freeMartin KaFai Lau2015-09-153-9/+16
| | | | | | | | | | | | | | | | | | | | | | | | | | | It is a prep work to get dst freeing from fib tree undergo a rcu grace period. The following is a common paradigm: if (ip6_del_rt(rt)) dst_free(rt) which means, if rt cannot be deleted from the fib tree, dst_free(rt) now. 1. We don't know the ip6_del_rt(rt) failure is because it was not managed by fib tree (e.g. DST_NOCACHE) or it had already been removed from the fib tree. 2. If rt had been managed by the fib tree, ip6_del_rt(rt) failure means dst_free(rt) has been called already. A second dst_free(rt) is not always obviously safe. The rt may have been destroyed already. 3. If rt is a DST_NOCACHE, dst_free(rt) should not be called. 4. It is a stopper to make dst freeing from fib tree undergo a rcu grace period. This patch is to use a DST_NOCACHE flag to indicate a rt is not managed by the fib tree. Signed-off-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* ipv6: Fix dst_entry refcnt bugs in ip6_tunnelMartin KaFai Lau2015-09-153-48/+123
| | | | | | | | | | | | | | | | | | | | | | | | | Problems in the current dst_entry cache in the ip6_tunnel: 1. ip6_tnl_dst_set is racy. There is no lock to protect it: - One major problem is that the dst refcnt gets messed up. F.e. the same dst_cache can be released multiple times and then triggering the infamous dst refcnt < 0 warning message. - Another issue is the inconsistency between dst_cache and dst_cookie. It can be reproduced by adding and removing the ip6gre tunnel while running a super_netperf TCP_CRR test. 2. ip6_tnl_dst_get does not take the dst refcnt before returning the dst. This patch: 1. Create a percpu dst_entry cache in ip6_tnl 2. Use a spinlock to protect the dst_cache operations 3. ip6_tnl_dst_get always takes the dst refcnt before returning Signed-off-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* ipv6: Rename the dst_cache helper functions in ip6_tunnelMartin KaFai Lau2015-09-153-10/+10
| | | | | | | | | | | | | | It is a prep work to fix the dst_entry refcnt bugs in ip6_tunnel. This patch rename: 1. ip6_tnl_dst_check() to ip6_tnl_dst_get() to better reflect that it will take a dst refcnt in the next patch. 2. ip6_tnl_dst_store() to ip6_tnl_dst_set() to have a more conventional name matching with ip6_tnl_dst_get(). Signed-off-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* ipv6: Refactor common ip6gre_tunnel_init codesMartin KaFai Lau2015-09-151-13/+24
| | | | | | | | | | It is a prep work to fix the dst_entry refcnt bugs in ip6_tunnel. This patch refactors some common init codes used by both ip6gre_tunnel_init and ip6gre_tap_init. Signed-off-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* irda: ali-ircc: Fix deadlock in ali_ircc_sir_change_speed()Alexey Khoroshilov2015-09-111-6/+0
| | | | | | | | | | | | | | | | | | | | | | | | ali_ircc_sir_change_speed() is always called with self->lock held, so acquiring the lock inside it leads to unavoidable deadlock. Call graph: ali_ircc_sir_change_speed() is called from ali_ircc_change_speed() ali_ircc_fir_hard_xmit() under spin_lock_irqsave(&self->lock, flags); ali_ircc_sir_hard_xmit() under spin_lock_irqsave(&self->lock, flags); ali_ircc_net_ioctl() under spin_lock_irqsave(&self->lock, flags); ali_ircc_dma_xmit_complete() ali_ircc_fir_interrupt() ali_ircc_interrupt() under spin_lock(&self->lock); ali_ircc_sir_write_wakeup() ali_ircc_sir_interrupt() ali_ircc_interrupt() under spin_lock(&self->lock); The patch removes spin_lock/unlock from ali_ircc_sir_change_speed(). Found by Linux Driver Verification project (linuxtesting.org). Signed-off-by: Alexey Khoroshilov <khoroshilov@ispras.ru> Signed-off-by: David S. Miller <davem@davemloft.net>
* openvswitch: Fix dependency on IPv6 defrag.Joe Stringer2015-09-111-1/+2
| | | | | | | | | | | | | | When NF_CONNTRACK is built-in, NF_DEFRAG_IPV6 is a module, and OPENVSWITCH is built-in, the following build error would occur: net/built-in.o: In function `ovs_ct_execute': (.text+0x10f587): undefined reference to `nf_ct_frag6_gather' Fixes: 7f8a436eaa2c ("openvswitch: Add conntrack action") Reported-by: Jim Davis <jim.epost@gmail.com> Signed-off-by: Joe Stringer <joestringer@nicira.com> Acked-by: Pravin B Shelar <pshelar@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* bridge: fix igmpv3 / mldv2 report parsingLinus Lüssing2015-09-111-2/+2
| | | | | | | | | | | | | | | | | | | With the newly introduced helper functions the skb pulling is hidden in the checksumming function - and undone before returning to the caller. The IGMPv3 and MLDv2 report parsing functions in the bridge still assumed that the skb is pointing to the beginning of the IGMP/MLD message while it is now kept at the beginning of the IPv4/6 header, breaking the message parsing and creating packet loss. Fixing this by taking the offset between IP and IGMP/MLD header into account, too. Fixes: 9afd85c9e455 ("net: Export IGMP/MLD message validation code") Reported-by: Tobias Powalowski <tobias.powalowski@googlemail.com> Tested-by: Tobias Powalowski <tobias.powalowski@googlemail.com> Signed-off-by: Linus Lüssing <linus.luessing@c0d3.blue> Signed-off-by: David S. Miller <davem@davemloft.net>
* bnx2x: use ktime_get_seconds() for timestampArnd Bergmann2015-09-111-4/+2
| | | | | | | | | | | | | | | | | | | | | | commit c48f350ff5e7 "bnx2x: Add MFW dump support" added the bnx2x_update_mfw_dump() function that reads the current time and stores it in a 32-bit field that gets passed into a buffer in a fixed format. This is potentially broken when the epoch overflows in 2038, and otherwise overflows in 2106. As we're trying to avoid uses of struct timeval for this reason, I noticed the addition of this function, and tried to rewrite it in a way that is more explicit about the overflow and that will keep working once we deprecate struct timeval. I assume that it is not possible to change the ABI any more, otherwise we should try to use a 64-bit field for the seconds right away. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Cc: Yuval Mintz <Yuval.Mintz@qlogic.com> Cc: Ariel Elior <Ariel.Elior@qlogic.com> Acked-by: Yuval Mintz <Yuval.Mintz@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* sctp: fix race on protocol/netns initializationMarcelo Ricardo Leitner2015-09-111-23/+41
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Consider sctp module is unloaded and is being requested because an user is creating a sctp socket. During initialization, sctp will add the new protocol type and then initialize pernet subsys: status = sctp_v4_protosw_init(); if (status) goto err_protosw_init; status = sctp_v6_protosw_init(); if (status) goto err_v6_protosw_init; status = register_pernet_subsys(&sctp_net_ops); The problem is that after those calls to sctp_v{4,6}_protosw_init(), it is possible for userspace to create SCTP sockets like if the module is already fully loaded. If that happens, one of the possible effects is that we will have readers for net->sctp.local_addr_list list earlier than expected and sctp_net_init() does not take precautions while dealing with that list, leading to a potential panic but not limited to that, as sctp_sock_init() will copy a bunch of blank/partially initialized values from net->sctp. The race happens like this: CPU 0 | CPU 1 socket() | __sock_create | socket() inet_create | __sock_create list_for_each_entry_rcu( | answer, &inetsw[sock->type], | list) { | inet_create /* no hits */ | if (unlikely(err)) { | ... | request_module() | /* socket creation is blocked | * the module is fully loaded | */ | sctp_init | sctp_v4_protosw_init | inet_register_protosw | list_add_rcu(&p->list, | last_perm); | | list_for_each_entry_rcu( | answer, &inetsw[sock->type], sctp_v6_protosw_init | list) { | /* hit, so assumes protocol | * is already loaded | */ | /* socket creation continues | * before netns is initialized | */ register_pernet_subsys | Simply inverting the initialization order between register_pernet_subsys() and sctp_v4_protosw_init() is not possible because register_pernet_subsys() will create a control sctp socket, so the protocol must be already visible by then. Deferring the socket creation to a work-queue is not good specially because we loose the ability to handle its errors. So, as suggested by Vlad, the fix is to split netns initialization in two moments: defaults and control socket, so that the defaults are already loaded by when we register the protocol, while control socket initialization is kept at the same moment it is today. Fixes: 4db67e808640 ("sctp: Make the address lists per network namespace") Signed-off-by: Vlad Yasevich <vyasevich@gmail.com> Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* ebpf: emit correct src_reg for conditional jumpsTycho Andersen2015-09-111-1/+1
| | | | | | | | | | | | | | | Instead of always emitting BPF_REG_X, let's emit BPF_REG_X only when the source actually is BPF_X. This causes programs generated by the classic converter to not be importable via bpf(), as the eBPF verifier checks that the src_reg is correct or 0. While not a problem yet, this will be a problem when BPF_PROG_DUMP lands, and we can potentially dump and re-import programs generated by the converter. Signed-off-by: Tycho Andersen <tycho.andersen@canonical.com> CC: Alexei Starovoitov <ast@kernel.org> CC: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>
* netlink, mmap: transform mmap skb into full skb on tapsDaniel Borkmann2015-09-112-7/+32
| | | | | | | | | | | | | | | | | | | | | | | | | | Ken-ichirou reported that running netlink in mmap mode for receive in combination with nlmon will throw a NULL pointer dereference in __kfree_skb() on nlmon_xmit(), in my case I can also trigger an "unable to handle kernel paging request". The problem is the skb_clone() in __netlink_deliver_tap_skb() for skbs that are mmaped. I.e. the cloned skb doesn't have a destructor, whereas the mmap netlink skb has it pointed to netlink_skb_destructor(), set in the handler netlink_ring_setup_skb(). There, skb->head is being set to NULL, so that in such cases, __kfree_skb() doesn't perform a skb_release_data() via skb_release_all(), where skb->head is possibly being freed through kfree(head) into slab allocator, although netlink mmap skb->head points to the mmap buffer. Similarly, the same has to be done also for large netlink skbs where the data area is vmalloced. Therefore, as discussed, make a copy for these rather rare cases for now. This fixes the issue on my and Ken-ichirou's test-cases. Reference: http://thread.gmane.org/gmane.linux.network/371129 Fixes: bcbde0d449ed ("net: netlink: virtual tap device management") Reported-by: Ken-ichirou MATSUZAWA <chamaken@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Tested-by: Ken-ichirou MATSUZAWA <chamaken@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* Merge tag 'sound-fix-4.3-rc1' of ↵Linus Torvalds2015-09-113-5/+18
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound Pull sound fixes from Takashi Iwai: "A collection of small fixes since the last update: the HD-audio quirks as usual with a USB-audio fix and a trivial fix for the old sparc driver" * tag 'sound-fix-4.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: ALSA: usb-audio: Change internal PCM order ALSA: hda - Fix white noise on Dell M3800 ALSA: hda - Use ALC880_FIXUP_FUJITSU for FSC Amilo M1437 ALSA: hda - Enable headphone jack detect on old Fujitsu laptops ALSA: sparc: amd7930: Fix module autoload for OF platform driver ALSA: hda - Add some FIXUP quirks for white noise on Dell laptop.
| * ALSA: usb-audio: Change internal PCM orderJohan Rastén2015-09-071-1/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | New PCMs will now be added to the end of the chip's PCM list instead of to the front. This changes the way streams are combined so that the first capture stream will now be merged with the first playback stream instead of the last. This fixes a problem with ASUS U7. Cards with one playback stream and cards without capture streams should be unaffected by this change. Exception added for M-Audio Audiophile USB (tm) since it seems to have a fix to swap capture stream numbering in alsa-lib conf/cards/USB-audio.conf Signed-off-by: Johan Rastén <johan@oljud.se> Signed-off-by: Takashi Iwai <tiwai@suse.de>
| * ALSA: hda - Fix white noise on Dell M3800Niranjan Sivakumar2015-09-051-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | The M3800 is very minor workstation variant of the XPS 15 which has already been patched for this issue. I figured it's probably more important for this version of the laptop to be patched than the regular XPS as Dell sells is pre-configured with Ubuntu to be used as a Linux workstation. I have tested the patch on my the hardware on Linux 4.2.0. Signed-off-by: Niranjan Sivakumar <ns253@cornell.edu> Cc: <stable@vger.kernel.org> # v4.1+ Signed-off-by: Takashi Iwai <tiwai@suse.de>
| * ALSA: hda - Use ALC880_FIXUP_FUJITSU for FSC Amilo M1437Takashi Iwai2015-09-051-1/+1
| | | | | | | | | | | | | | | | | | It turned out that the machine has a bass speaker, so take a correct fixup entry. Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=102501 Cc: <stable@vger.kernel.org> Signed-off-by: Takashi Iwai <tiwai@suse.de>
| * ALSA: hda - Enable headphone jack detect on old Fujitsu laptopsTakashi Iwai2015-09-051-2/+2
| | | | | | | | | | | | | | | | | | | | | | According to the bug report, FSC Amilo laptops with ALC880 can detect the headphone jack but currently the driver disables it. It's partly intentionally, as non-working jack detect was reported in the past. Let's enable now. Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=102501 Cc: <stable@vger.kernel.org> Signed-off-by: Takashi Iwai <tiwai@suse.de>
| * ALSA: sparc: amd7930: Fix module autoload for OF platform driverLuis de Bethencourt2015-09-041-0/+1
| | | | | | | | | | | | | | | | This platform driver has a OF device ID table but the OF module alias information is not created so module autoloading won't work. Signed-off-by: Luis de Bethencourt <luis@debethencourt.com> Signed-off-by: Takashi Iwai <tiwai@suse.de>
| * ALSA: hda - Add some FIXUP quirks for white noise on Dell laptop.Woodrow Shen2015-09-041-1/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Dell laptop has a series model to use the same codec but different subsystem ID. At the same time they happens the white noise by login screen and headphone; for fixing them together, I only can add these IDs to FIXUP function ALC292_FIXUP_DISABLE_AAMIX, then try to solve such the similar issues. Codec: Realtek ALC3235 Vendor Id: 0x10ec0293 Subsystem Id: 0x102806dd Subsystem Id: 0x102806df Subsystem Id: 0x102806e0 Cc: <stable@vger.kernel.org> BugLink: https://bugs.launchpad.net/bugs/1492132 Signed-off-by: Woodrow Shen <woodrow.shen@canonical.com> Signed-off-by: Takashi Iwai <tiwai@suse.de>
* | Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linuxLinus Torvalds2015-09-1116-83/+218
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull drm fixes from Dave Airlie: "Just a bunch of fixes to squeeze in before -rc1: - three nouveau regression fixes - one qxl regression fix - a bunch of i915 fixes ... and some core displayport/atomic fixes" * 'drm-fixes' of git://people.freedesktop.org/~airlied/linux: drm/nouveau/device: enable c800 quirk for tecra w50 drm/nouveau/clk/gt215: Unbreak engine pausing for GT21x/MCP7x drm/nouveau/gr/nv04: fix big endian setting on gr context drm/qxl: validate monitors config modes drm/i915: Allow DSI dual link to be configured on any pipe drm/i915: Don't try to use DDR DVFS on CHV when disabled in the BIOS drm/i915: Fix CSR MMIO address check drm/i915: Limit the number of loops for reading a split 64bit register drm/i915: Fix broken mst get_hw_state. drm/i915: Pass hpd_status_i915[] to intel_get_hpd_pins() in pre-g4x uapi/drm/i915_drm.h: fix userspace compilation. drm/i915: Always mark the object as dirty when used by the GPU drm/dp: Add dp_aux_i2c_speed_khz module param to set the assume i2c bus speed drm/dp: Adjust i2c-over-aux retry count based on message size and i2c bus speed drm/dp: Define AUX_RETRY_INTERVAL as 500 us drm/atomic: Fix bookkeeping with TEST_ONLY, v3.
| * \ Merge branch 'linux-4.3' of ↵Dave Airlie2015-09-113-4/+5
| |\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://anongit.freedesktop.org/git/nouveau/linux-2.6 into drm-next three nouveau regression fixes. * 'linux-4.3' of git://anongit.freedesktop.org/git/nouveau/linux-2.6: drm/nouveau/device: enable c800 quirk for tecra w50 drm/nouveau/clk/gt215: Unbreak engine pausing for GT21x/MCP7x drm/nouveau/gr/nv04: fix big endian setting on gr context
| | * | drm/nouveau/device: enable c800 quirk for tecra w50Ben Skeggs2015-09-111-0/+1
| | | | | | | | | | | | | | | | Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
| | * | drm/nouveau/clk/gt215: Unbreak engine pausing for GT21x/MCP7xRoy Spliet2015-09-111-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Typo that snuck in with commit 6979c6303a4abf263753cd9d577d79f05c6e8c47 Signed-off-by: Roy Spliet <rspliet@eclipso.eu> Reported-by: Pierre Moreau <pierre.morrow@free.fr> Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
| | * | drm/nouveau/gr/nv04: fix big endian setting on gr contextIlia Mirkin2015-09-111-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Broken since "gr: convert user classes to new-style nvkm_object" Tested on a PPC64 G5 + NV34 Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
| * | | Merge tag 'topic/drm-fixes-2015-09-09' of ↵Dave Airlie2015-09-112-21/+117
| |\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://anongit.freedesktop.org/drm-intel into drm-next bunch of drm fixes. * tag 'topic/drm-fixes-2015-09-09' of git://anongit.freedesktop.org/drm-intel: drm/dp: Add dp_aux_i2c_speed_khz module param to set the assume i2c bus speed drm/dp: Adjust i2c-over-aux retry count based on message size and i2c bus speed drm/dp: Define AUX_RETRY_INTERVAL as 500 us drm/atomic: Fix bookkeeping with TEST_ONLY, v3.
| | * | | drm/dp: Add dp_aux_i2c_speed_khz module param to set the assume i2c bus speedVille Syrjälä2015-09-021-3/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | To help with debugging i2c-over-aux issues, add a module parameter than can be used to tweak the assumed i2c bus speed, and thus the maximum number of retries we will do for each aux message. Cc: Simon Farnsworth <simon.farnsworth@onelan.com> Cc: moosotc@gmail.com Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Simon Farnsworth <simon.farnsworth@onelan.com> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Jani Nikula <jani.nikula@intel.com>
| | * | | drm/dp: Adjust i2c-over-aux retry count based on message size and i2c bus speedVille Syrjälä2015-09-021-2/+82
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Calculate the number of retries we should do for each i2c-over-aux message based on the time it takes to perform the i2c transfer vs. the aux transfer. We assume the shortest possible length for the aux transfer, and the longest possible (exluding clock stretching) for the i2c transfer. The DP spec has some examples on how to calculate this, but we don't calculate things quite the same way. The spec doesn't account for the retry interval (assumes immediate retry on defer), and doesn't assume the best/worst case behaviour as we do. Note that currently we assume 10 kHz speed for the i2c bus. Some real world devices (eg. some Apple DP->VGA dongle) fails with less than 16 retries. and that would correspond to something close to 15 kHz (with our method of calculating things) But let's just go for 10 kHz to be on the safe side. Ideally we should query/set the i2c bus speed via DPCD but for now this should at leaast remove the regression from the 1->16 byte trasnfer size change. And of course if the sink completes the transfer quicker this shouldn't slow things down since we don't change the interval between retries. I did a few experiments with a DP->DVI dongle I have that allows you to change the i2c bus speed. Here are the results of me changing the actual bus speed and the assumed bus speed and seeing when we start to fail the operation: actual i2c khz assumed i2c khz max retries 1 1 ok -> 2 fail 211 ok -> 106 fail 5 8 ok -> 9 fail 27 ok -> 24 fail 10 17 ok -> 18 fail 13 ok -> 12 fail 100 210 ok -> 211 fail 2 ok -> 1 fail So based on that we have a fairly decent safety margin baked into the formula to calculate the max number of retries. Fixes a regression with some DP dongles from: commit 1d002fa720738bcd0bddb9178e9ea0773288e1dd Author: Simon Farnsworth <simon.farnsworth@onelan.co.uk> Date: Tue Feb 10 18:38:08 2015 +0000 drm/dp: Use large transactions for I2C over AUX v2: Use best case for AUX and worst case for i2c (Simon Farnsworth) Add a define our AUX retry interval and account for it v3: Make everything usecs to avoid confusion about units (Daniel) Add a comment reminding people about the AUX bitrate (Daniel) Use DIV_ROUND_UP() since we're after the "worst" case for i2c Cc: Simon Farnsworth <simon.farnsworth@onelan.com> Cc: moosotc@gmail.com Tested-by: moosotc@gmail.com Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91451 Reviewed-by: Simon Farnsworth <simon.farnsworth@onelan.com> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Jani Nikula <jani.nikula@intel.com>
| | * | | drm/dp: Define AUX_RETRY_INTERVAL as 500 usVille Syrjälä2015-09-021-3/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently we react to native and i2c defers by waiting either 400-500 us or 500-600 us, depending on which code path we take. Consolidate them all to one define AUX_RETRY_INTERVAL which defines the minimum interval. Since we've been using two different intervals pick the longer of them and define AUX_RETRY_INTERVAL as 500 us. For the maximum just use AUX_RETRY_INTERVAL+100 us. I want to have a define for this so that I can use it when calculating the estimated duration of i2c-over-aux transfers. Without a define it would be very easy to change the sleep duration and neglect to update the i2c-over-aux estimates. Cc: Simon Farnsworth <simon.farnsworth@onelan.com> Cc: moosotc@gmail.com Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Simon Farnsworth <simon.farnsworth@onelan.com> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Jani Nikula <jani.nikula@intel.com>
| | * | | drm/atomic: Fix bookkeeping with TEST_ONLY, v3.Maarten Lankhorst2015-09-011-16/+23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit ec9f932ed41622d120de52a5b525e4d77b9ef17e "drm/atomic: Cleanup on error properly in the atomic ioctl." cleaned up some error paths, but didn't fix the TEST_ONLY path. In the check only case plane->fb shouldn't be updated, and the vblank events should be cleared as on failure. Changes since v1: - Fix -EDEADLK handling of vblank events too. - Free state last with CHECK_ONLY. Changes since v2: - Add comment about freeing crtc_state->event with TEST_ONLY. (Daniel Stone) Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Reviewed-by: Daniel Stone <daniels@collabora.com> Signed-off-by: Jani Nikula <jani.nikula@intel.com>
| * | | | Merge tag 'drm-intel-next-fixes-2015-09-10' of ↵Dave Airlie2015-09-119-32/+54
| |\ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://anongit.freedesktop.org/drm-intel into drm-next Fixes headed for v4.3-rc1, including Maarten's DP MST state checker fix you requested. * tag 'drm-intel-next-fixes-2015-09-10' of git://anongit.freedesktop.org/drm-intel: drm/i915: Allow DSI dual link to be configured on any pipe drm/i915: Don't try to use DDR DVFS on CHV when disabled in the BIOS drm/i915: Fix CSR MMIO address check drm/i915: Limit the number of loops for reading a split 64bit register drm/i915: Fix broken mst get_hw_state. drm/i915: Pass hpd_status_i915[] to intel_get_hpd_pins() in pre-g4x uapi/drm/i915_drm.h: fix userspace compilation. drm/i915: Always mark the object as dirty when used by the GPU
| | * | | | drm/i915: Allow DSI dual link to be configured on any pipeGaurav K Singh2015-09-101-5/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Just like single link MIPI panels, similarly for dual link panels, pipe to be configured is based on the DVO port from VBT Block 2. In hardware, Port A is mapped with Pipe A and Port C is mapped with Pipe B. This issue got introduced in - commit 7e9804fdcffc650515c60f524b8b2076ee59e710 Author: Jani Nikula <jani.nikula@intel.com> Date: Fri Jan 16 14:27:23 2015 +0200 drm/i915/dsi: add drm mipi dsi host support Cc: stable@vger.kernel.org # v4.0 Signed-off-by: Gaurav K Singh <gaurav.k.singh@intel.com> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Jani Nikula <jani.nikula@intel.com>
| | * | | | drm/i915: Don't try to use DDR DVFS on CHV when disabled in the BIOSVille Syrjälä2015-09-102-13/+31
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If one disables DDR DVFS in the BIOS, Punit will apparently ignores all DDR DVFS request. Currently we assume that DDR DVFS is always operational, which leads to errors in dmesg when the DDR DVFS requests time out. Fix the problem by gently prodding Punit during driver load to find out whether it will respond to DDR DVFS requests. If the request times out, we assume that DDR DVFS has been permanenly disabled in the BIOS and no longer perster the Punit about it. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91629 Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Clint Taylor <Clinton.A.Taylor@intel.com> Tested-by: Clint Taylor <Clinton.A.Taylor@intel.com> Signed-off-by: Jani Nikula <jani.nikula@intel.com>
| | * | | | drm/i915: Fix CSR MMIO address checkTakashi Iwai2015-09-101-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fix a wrong logical AND (&&) used for the range check of CSR MMIO. Spotted nicely by gcc -Wlogical-op flag: drivers/gpu/drm/i915/intel_csr.c: In function ‘finish_csr_load’: drivers/gpu/drm/i915/intel_csr.c:353:41: warning: logical ‘and’ of mutually exclusive tests is always false [-Wlogical-op] Fixes: eb805623d8b1 ('drm/i915/skl: Add support to load SKL CSR firmware.') Cc: <stable@vger.kernel.org> # v4.2 Signed-off-by: Takashi Iwai <tiwai@suse.de> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Reviewed-by: Animesh Manna <animesh.manna@intel.com> Signed-off-by: Jani Nikula <jani.nikula@intel.com>
| | * | | | drm/i915: Limit the number of loops for reading a split 64bit registerChris Wilson2015-09-091-5/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In I915_READ64_2x32 we attempt to read a 64bit register using 2 32bit reads. Due to the nature of the registers we try to read in this manner, they may increment between the two instruction (e.g. a timestamp counter). To keep the result accurate, we repeat the read if we detect an overflow (i.e. the upper value varies). However, some hardware is just plain flaky and may endless loop as the the upper 32bits are not stable. Just give up after a couple of tries and report whatever we read last. v2: Use the most recent values when erring out on an unstable register. Reported-by: russianneuromancer@ya.ru Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91906 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Michał Winiarski <michal.winiarski@intel.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Jani Nikula <jani.nikula@linux.intel.com> Cc: stable@vger.kernel.org Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Jani Nikula <jani.nikula@intel.com>
| | * | | | drm/i915: Fix broken mst get_hw_state.Maarten Lankhorst2015-09-082-5/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | connector->encoder is initialized as NULL. Fix this by setting it in during pre enable. MST connectors are not read out during initial hw readout, and have no fixed encoder mappings. So it's harmless to return false when the connector has never been assigned to an encoder. Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Reviewed-by: Ander Conselvan de Oliveira <conselvan2@gmail.com> Signed-off-by: Jani Nikula <jani.nikula@intel.com>
| | * | | | drm/i915: Pass hpd_status_i915[] to intel_get_hpd_pins() in pre-g4xVille Syrjälä2015-09-021-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pass the correct hpd[] array to intel_get_hpd_pins() on pre-g4x platforms. This got broken in the following commit: commit fd63e2a972c670887e5e8a08440111d3812c0996 Author: Imre Deak <imre.deak@intel.com> Date: Tue Jul 21 15:32:44 2015 -0700 drm/i915: combine i9xx_get_hpd_pins and pch_get_hpd_pins Cc: Imre Deak <imre.deak@intel.com> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Reviewed-by: Egbert Eich <eich@suse.de> Signed-off-by: Jani Nikula <jani.nikula@intel.com>
| | * | | | uapi/drm/i915_drm.h: fix userspace compilation.Artem Savkov2015-09-021-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit 346add7834557b5b9628b9bf2387106d42e631d4 Author: Daniel Vetter <daniel.vetter@ffwll.ch> Date: Tue Jul 14 18:07:30 2015 +0200 drm/i915: Use expcitly fixed type in compat32 structs changed the type of param field in drm_i915_getparam from int to s32. This header is exported to userspace and needs to use userspace type __s32 instead. This fixes userspace compilation errors like the following: include/drm/i915_drm.h:361:2: error: unknown type name 's32' s32 param; Signed-off-by: Artem Savkov <asavkov@redhat.com> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Jani Nikula <jani.nikula@intel.com>
| | * | | | drm/i915: Always mark the object as dirty when used by the GPUChris Wilson2015-09-021-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There have been many hard to track down bugs whereby userspace forgot to flag a write buffer and then cause graphics corruption or a hung GPU when that buffer was later purged under memory pressure (as the buffer appeared clean, its pages would have been evicted rather than preserved and any changes more recent than in the backing storage would be lost). In retrospect this is a rare optimisation against memory pressure, already the slow path. If we always mark the buffer as dirty when accessed by the GPU, anything not used can still be evicted cheaply (ideal behaviour for mark-and-sweep eviction) but we do not run the risk of corruption. For correct read serialisation, userspace still has to notify when the GPU writes to an object. However, there are certain situations under which userspace may wish to tell white lies to the kernel... Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Kristian Høgsberg <krh@bitplanet.net> Cc: Jesse Barnes <jbarnes@virtuousgeek.org> Cc: "Goel, Akash" <akash.goel@intel.co> Cc: Michał Winiarski <michal.winiarski@intel.com> Cc: stable@vger.kernel.org Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Jani Nikula <jani.nikula@intel.com>
| * | | | | drm/qxl: validate monitors config modesJonathon Jongsma2015-09-112-26/+42
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Due to some recent changes in drm_helper_probe_single_connector_modes_merge_bits(), old custom modes were not being pruned properly. In current kernels, drm_mode_validate_basic() is called to sanity-check each mode in the list. If the sanity-check passes, the mode's status gets set to to MODE_OK. In older kernels this check was not done, so old custom modes would still have a status of MODE_UNVERIFIED at this point, and would therefore be pruned later in the function. As a result of this new behavior, the list of modes for a device always includes every custom mode ever configured for the device, with the largest one listed first. Since desktop environments usually choose the first preferred mode when a hotplug event is emitted, this had the result of making it very difficult for the user to reduce the size of the display. The qxl driver did implement the mode_valid connector function, but it was empty. In order to restore the old behavior where old custom modes are pruned, we implement a proper mode_valid function for the qxl driver. This function now checks each mode against the last configured custom mode and the list of standard modes. If the mode doesn't match any of these, its status is set to MODE_BAD so that it will be pruned as expected. Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com> Cc: stable@vger.kernel.org Signed-off-by: Dave Airlie <airlied@redhat.com>
* | | | | | Merge branch 'for-4.3/blkcg' of git://git.kernel.dk/linux-blockLinus Torvalds2015-09-1017-1078/+1422
|\ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull blk-cg updates from Jens Axboe: "A bit later in the cycle, but this has been in the block tree for a a while. This is basically four patchsets from Tejun, that improve our buffered cgroup writeback. It was dependent on the other cgroup changes, but they went in earlier in this cycle. Series 1 is set of 5 patches that has cgroup writeback updates: - bdi_writeback iteration fix which could lead to some wb's being skipped or repeated during e.g. sync under memory pressure. - Simplification of wb work wait mechanism. - Writeback tracepoints updated to report cgroup. Series 2 is is a set of updates for the CFQ cgroup writeback handling: cfq has always charged all async IOs to the root cgroup. It didn't have much choice as writeback didn't know about cgroups and there was no way to tell who to blame for a given writeback IO. writeback finally grew support for cgroups and now tags each writeback IO with the appropriate cgroup to charge it against. This patchset updates cfq so that it follows the blkcg each bio is tagged with. Async cfq_queues are now shared across cfq_group, which is per-cgroup, instead of per-request_queue cfq_data. This makes all IOs follow the weight based IO resource distribution implemented by cfq. - Switched from GFP_ATOMIC to GFP_NOWAIT as suggested by Jeff. - Other misc review points addressed, acks added and rebased. Series 3 is the blkcg policy cleanup patches: This patchset contains assorted cleanups for blkcg_policy methods and blk[c]g_policy_data handling. - alloc/free added for blkg_policy_data. exit dropped. - alloc/free added for blkcg_policy_data. - blk-throttle's async percpu allocation is replaced with direct allocation. - all methods now take blk[c]g_policy_data instead of blkcg_gq or blkcg. And finally, series 4 is a set of patches cleaning up the blkcg stats handling: blkcg's stats have always been somwhat of a mess. This patchset tries to improve the situation a bit. - The following patches added to consolidate blkcg entry point and blkg creation. This is in itself is an improvement and helps colllecting common stats on bio issue. - per-blkg stats now accounted on bio issue rather than request completion so that bio based and request based drivers can behave the same way. The issue was spotted by Vivek. - cfq-iosched implements custom recursive stats and blk-throttle implements custom per-cpu stats. This patchset make blkcg core support both by default. - cfq-iosched and blk-throttle keep track of the same stats multiple times. Unify them" * 'for-4.3/blkcg' of git://git.kernel.dk/linux-block: (45 commits) blkcg: use CGROUP_WEIGHT_* scale for io.weight on the unified hierarchy blkcg: s/CFQ_WEIGHT_*/CFQ_WEIGHT_LEGACY_*/ blkcg: implement interface for the unified hierarchy blkcg: misc preparations for unified hierarchy interface blkcg: separate out tg_conf_updated() from tg_set_conf() blkcg: move body parsing from blkg_conf_prep() to its callers blkcg: mark existing cftypes as legacy blkcg: rename subsystem name from blkio to io blkcg: refine error codes returned during blkcg configuration blkcg: remove unnecessary NULL checks from __cfqg_set_weight_device() blkcg: reduce stack usage of blkg_rwstat_recursive_sum() blkcg: remove cfqg_stats->sectors blkcg: move io_service_bytes and io_serviced stats into blkcg_gq blkcg: make blkg_[rw]stat_recursive_sum() to be able to index into blkcg_gq blkcg: make blkcg_[rw]stat per-cpu blkcg: add blkg_[rw]stat->aux_cnt and replace cfq_group->dead_stats with it blkcg: consolidate blkg creation in blkcg_bio_issue_check() blk-throttle: improve queue bypass handling blkcg: move root blkg lookup optimization from throtl_lookup_tg() to __blkg_lookup() blkcg: inline [__]blkg_lookup() ...
| * | | | | | blkcg: use CGROUP_WEIGHT_* scale for io.weight on the unified hierarchyTejun Heo2015-08-184-16/+64
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | cgroup is trying to make interface consistent across different controllers. For weight based resource control, the knob should have the range [1, 10000] and default to 100. This patch updates cfq-iosched so that the weight range conforms. The internal calculations have enough range and the widening of the weight range shouldn't cause any problem. * blkcg_policy->cpd_bind_fn() is added. If present, this is invoked when blkcg is attached to a hierarchy. * cfq_cpd_init() is updated to use the new default value on the unified hierarchy. * cfq_cpd_bind() callback is implemented to clear per-blkg configs and apply the default config matching the hierarchy type. * cfqd->root_group->[leaf_]weight initialization in cfq_init_queue() is moved into !CONFIG_CFQ_GROUP_IOSCHED block. cfq_cpd_bind() is now responsible for initializing the initial weights when blkcg is enabled. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Vivek Goyal <vgoyal@redhat.com> Cc: Arianna Avanzini <avanzini.arianna@gmail.com> Signed-off-by: Jens Axboe <axboe@fb.com>
| * | | | | | blkcg: s/CFQ_WEIGHT_*/CFQ_WEIGHT_LEGACY_*/Tejun Heo2015-08-181-11/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | blkcg is gonna switch to cgroup common weight range as defined by CGROUP_WEIGHT_* on the unified hierarchy. In preparation, rename CFQ_WEIGHT_* constants to CFQ_WEIGHT_LEGACY_*. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Vivek Goyal <vgoyal@redhat.com> Cc: Arianna Avanzini <avanzini.arianna@gmail.com> Signed-off-by: Jens Axboe <axboe@fb.com>
| * | | | | | blkcg: implement interface for the unified hierarchyTejun Heo2015-08-185-9/+279
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | blkcg interface grew to be the biggest of all controllers and unfortunately most inconsistent too. The interface files are inconsistent with a number of cloes duplicates. Some files have recursive variants while others don't. There's distinction between normal and leaf weights which isn't intuitive and there are a lot of stat knobs which don't make much sense outside of debugging and expose too much implementation details to userland. In the unified hierarchy, everything is always hierarchical and internal nodes can't have tasks rendering the two structural issues twisting the current interface. The interface has to be updated in a significant anyway and this is a good chance to revamp it as a whole. This patch implements blkcg interface for the unified hierarchy. * (from a previous patch) blkcg is identified by "io" instead of "blkio" on the unified hierarchy. Given that the whole interface is updated anyway, the rename shouldn't carry noticeable conversion overhead. * The original interface consisted of 27 files is replaced with the following three files. blkio.stat : per-blkcg stats blkio.weight : per-cgroup and per-cgroup-queue weight settings blkio.max : per-cgroup-queue bps and iops max limits Documentation/cgroups/unified-hierarchy.txt updated accordingly. v2: blkcg_policy->dfl_cftypes wasn't removed on blkcg_policy_unregister() corrupting the cftypes list. Fixed. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jens Axboe <axboe@fb.com>
| * | | | | | blkcg: misc preparations for unified hierarchy interfaceTejun Heo2015-08-183-5/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * Export blkg_dev_name() * Drop unnecessary @cft from __cfq_set_weight(). Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jens Axboe <axboe@fb.com>
| * | | | | | blkcg: separate out tg_conf_updated() from tg_set_conf()Tejun Heo2015-08-181-28/+32
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | tg_set_conf() is largely consisted of parsing and setting the new config and the follow-up application and propagation. This patch separates out the latter part into tg_conf_updated(). This will be used to implement interface for the unified hierarchy. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jens Axboe <axboe@fb.com>
| * | | | | | blkcg: move body parsing from blkg_conf_prep() to its callersTejun Heo2015-08-184-22/+39
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently, blkg_conf_prep() expects input to be of the following form MAJ:MIN NUM and reads the NUM part into blkg_conf_ctx->v. This is quite restrictive and gets in the way in implementing blkcg interface for the unified hierarchy. This patch updates blkg_conf_prep() so that it expects MAJ:MIN BODY_STR where BODY_STR is an arbitrary string. blkg_conf_ctx->v is replaced with ->body which is a char pointer pointing to the start of BODY_STR. Parsing of the body is moved to blkg_conf_prep()'s callers. To allow using, for example, strsep() on blkg_conf_ctx->val, it is a non-const pointer and to accommodate that const is dropped from @input too. This doesn't cause any behavior changes. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jens Axboe <axboe@fb.com>
| * | | | | | blkcg: mark existing cftypes as legacyTejun Heo2015-08-184-11/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | blkcg is about to grow interface for the unified hierarchy. Add legacy to existing cftypes. * blkcg_policy->cftypes -> blkcg_policy->legacy_cftypes * blk-cgroup.c:blkcg_files -> blkcg_legacy_files * cfq-iosched.c:cfq_blkcg_files -> cfq_blkcg_legacy_files * blk-throttle.c:throtl_files -> throtl_legacy_files Pure renames. No functional change. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jens Axboe <axboe@fb.com>
| * | | | | | blkcg: rename subsystem name from blkio to ioTejun Heo2015-08-186-10/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | blkio interface has become messy over time and is currently the largest. In addition to the inconsistent naming scheme, it has multiple stat files which report more or less the same thing, a number of debug stat files which expose internal details which shouldn't have been part of the public interface in the first place, recursive and non-recursive stats and leaf and non-leaf knobs. Both recursive vs. non-recursive and leaf vs. non-leaf distinctions don't make any sense on the unified hierarchy as only leaf cgroups can contain processes. cgroups is going through a major interface revision with the unified hierarchy involving significant fundamental usage changes and given that a significant portion of the interface doesn't make sense anymore, it's a good time to reorganize the interface. As the first step, this patch renames the external visible subsystem name from "blkio" to "io". This is more concise, matches the other two major subsystem names, "cpu" and "memory", and better suited as blkcg will be involved in anything writeback related too whether an actual block device is involved or not. As the subsystem legacy_name is set to "blkio", the only userland visible change outside the unified hierarchy is that blkcg is reported as "io" instead of "blkio" in the subsystem initialized message during boot. On the unified hierarchy, blkcg now appears as "io". Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Li Zefan <lizefan@huawei.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: cgroups@vger.kernel.org Signed-off-by: Jens Axboe <axboe@fb.com>
| * | | | | | blkcg: refine error codes returned during blkcg configurationTejun Heo2015-08-182-7/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | blkcg currently returns -EINVAL for most errors which can be pretty confusing given that the failure modes are quite varied. Update the error returns so that * -EINVAL only for syntactic errors. * -ERANGE if the value is out of range. * -ENODEV if the target device can't be found. * -EOPNOTSUPP if the policy is not enabled on the target device. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jens Axboe <axboe@fb.com>
| * | | | | | blkcg: remove unnecessary NULL checks from __cfqg_set_weight_device()Tejun Heo2015-08-181-4/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | blkg_to_cfqg() and blkcg_to_cfqgd() on a valid blkg with the policy enabled are guaranteed to return non-NULL and the counterpart in blk-throttle doesn't have these checks either. Remove the spurious NULL checks. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jens Axboe <axboe@fb.com>