From 2988509dd8a0e9c2b64192a46ec2fe8211af6d3c Mon Sep 17 00:00:00 2001 From: Vladimir Murzin Date: Wed, 2 Nov 2016 11:55:34 +0000 Subject: ARM: KVM: Support vGICv3 ITS This patch allows to build and use vGICv3 ITS in 32-bit mode. Signed-off-by: Vladimir Murzin Reviewed-by: Andre Przywara Reviewed-by: Marc Zyngier Signed-off-by: Marc Zyngier --- Documentation/virtual/kvm/api.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) (limited to 'Documentation/virtual') diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt index 739db9ab16b2..2feeae6a4c3f 100644 --- a/Documentation/virtual/kvm/api.txt +++ b/Documentation/virtual/kvm/api.txt @@ -2198,7 +2198,7 @@ after pausing the vcpu, but before it is resumed. 4.71 KVM_SIGNAL_MSI Capability: KVM_CAP_SIGNAL_MSI -Architectures: x86 arm64 +Architectures: x86 arm arm64 Type: vm ioctl Parameters: struct kvm_msi (in) Returns: >0 on delivery, 0 if guest blocked the MSI, and -1 on error -- cgit v1.2.1 From 0d808df06a44200f52262b6eb72bcb6042f5a7c5 Mon Sep 17 00:00:00 2001 From: Paul Mackerras Date: Mon, 7 Nov 2016 15:09:58 +1100 Subject: KVM: PPC: Book3S HV: Save/restore XER in checkpointed register state When switching from/to a guest that has a transaction in progress, we need to save/restore the checkpointed register state. Although XER is part of the CPU state that gets checkpointed, the code that does this saving and restoring doesn't save/restore XER. This fixes it by saving and restoring the XER. To allow userspace to read/write the checkpointed XER value, we also add a new ONE_REG specifier. The visible effect of this bug is that the guest may see its XER value being corrupted when it uses transactions. Fixes: e4e38121507a ("KVM: PPC: Book3S HV: Add transactional memory support") Fixes: 0a8eccefcb34 ("KVM: PPC: Book3S HV: Add missing code for transaction reclaim on guest exit") Cc: stable@vger.kernel.org # v3.15+ Signed-off-by: Paul Mackerras Reviewed-by: Thomas Huth Signed-off-by: Paul Mackerras --- Documentation/virtual/kvm/api.txt | 1 + 1 file changed, 1 insertion(+) (limited to 'Documentation/virtual') diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt index 739db9ab16b2..a7596e9fdf06 100644 --- a/Documentation/virtual/kvm/api.txt +++ b/Documentation/virtual/kvm/api.txt @@ -2039,6 +2039,7 @@ registers, find a list below: PPC | KVM_REG_PPC_TM_VSCR | 32 PPC | KVM_REG_PPC_TM_DSCR | 64 PPC | KVM_REG_PPC_TM_TAR | 64 + PPC | KVM_REG_PPC_TM_XER | 64 | | MIPS | KVM_REG_MIPS_R0 | 64 ... -- cgit v1.2.1 From e9cf1e085647b433ccd98582681b17121ecfdc21 Mon Sep 17 00:00:00 2001 From: Paul Mackerras Date: Fri, 18 Nov 2016 13:11:42 +1100 Subject: KVM: PPC: Book3S HV: Add new POWER9 guest-accessible SPRs This adds code to handle two new guest-accessible special-purpose registers on POWER9: TIDR (thread ID register) and PSSCR (processor stop status and control register). They are context-switched between host and guest, and the guest values can be read and set via the one_reg interface. The PSSCR contains some fields which are guest-accessible and some which are only accessible in hypervisor mode. We only allow the guest-accessible fields to be read or set by userspace. Signed-off-by: Paul Mackerras --- Documentation/virtual/kvm/api.txt | 2 ++ 1 file changed, 2 insertions(+) (limited to 'Documentation/virtual') diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt index a7596e9fdf06..8a5ebd118313 100644 --- a/Documentation/virtual/kvm/api.txt +++ b/Documentation/virtual/kvm/api.txt @@ -2023,6 +2023,8 @@ registers, find a list below: PPC | KVM_REG_PPC_WORT | 64 PPC | KVM_REG_PPC_SPRG9 | 64 PPC | KVM_REG_PPC_DBSR | 32 + PPC | KVM_REG_PPC_TIDR | 64 + PPC | KVM_REG_PPC_PSSCR | 64 PPC | KVM_REG_PPC_TM_GPR0 | 64 ... PPC | KVM_REG_PPC_TM_GPR31 | 64 -- cgit v1.2.1 From 6ccad8cea5bcb0660f56677a5fdc52265f8ddf76 Mon Sep 17 00:00:00 2001 From: Suraj Jitindar Singh Date: Fri, 14 Oct 2016 11:53:24 +1100 Subject: KVM: Add halt polling documentation There is currently no documentation about the halt polling capabilities of the kvm module. Add some documentation describing the mechanism as well as the module parameters to all better understanding of how halt polling should be used and the effect of tuning the module parameters. Signed-off-by: Suraj Jitindar Singh Signed-off-by: Paul Mackerras --- Documentation/virtual/kvm/00-INDEX | 2 + Documentation/virtual/kvm/halt-polling.txt | 127 +++++++++++++++++++++++++++++ 2 files changed, 129 insertions(+) create mode 100644 Documentation/virtual/kvm/halt-polling.txt (limited to 'Documentation/virtual') diff --git a/Documentation/virtual/kvm/00-INDEX b/Documentation/virtual/kvm/00-INDEX index fee9f2bf9c64..69fe1a8b7ad1 100644 --- a/Documentation/virtual/kvm/00-INDEX +++ b/Documentation/virtual/kvm/00-INDEX @@ -6,6 +6,8 @@ cpuid.txt - KVM-specific cpuid leaves (x86). devices/ - KVM_CAP_DEVICE_CTRL userspace API. +halt-polling.txt + - notes on halt-polling hypercalls.txt - KVM hypercalls. locking.txt diff --git a/Documentation/virtual/kvm/halt-polling.txt b/Documentation/virtual/kvm/halt-polling.txt new file mode 100644 index 000000000000..4a8418318769 --- /dev/null +++ b/Documentation/virtual/kvm/halt-polling.txt @@ -0,0 +1,127 @@ +The KVM halt polling system +=========================== + +The KVM halt polling system provides a feature within KVM whereby the latency +of a guest can, under some circumstances, be reduced by polling in the host +for some time period after the guest has elected to no longer run by cedeing. +That is, when a guest vcpu has ceded, or in the case of powerpc when all of the +vcpus of a single vcore have ceded, the host kernel polls for wakeup conditions +before giving up the cpu to the scheduler in order to let something else run. + +Polling provides a latency advantage in cases where the guest can be run again +very quickly by at least saving us a trip through the scheduler, normally on +the order of a few micro-seconds, although performance benefits are workload +dependant. In the event that no wakeup source arrives during the polling +interval or some other task on the runqueue is runnable the scheduler is +invoked. Thus halt polling is especially useful on workloads with very short +wakeup periods where the time spent halt polling is minimised and the time +savings of not invoking the scheduler are distinguishable. + +The generic halt polling code is implemented in: + + virt/kvm/kvm_main.c: kvm_vcpu_block() + +The powerpc kvm-hv specific case is implemented in: + + arch/powerpc/kvm/book3s_hv.c: kvmppc_vcore_blocked() + +Halt Polling Interval +===================== + +The maximum time for which to poll before invoking the scheduler, referred to +as the halt polling interval, is increased and decreased based on the perceived +effectiveness of the polling in an attempt to limit pointless polling. +This value is stored in either the vcpu struct: + + kvm_vcpu->halt_poll_ns + +or in the case of powerpc kvm-hv, in the vcore struct: + + kvmppc_vcore->halt_poll_ns + +Thus this is a per vcpu (or vcore) value. + +During polling if a wakeup source is received within the halt polling interval, +the interval is left unchanged. In the event that a wakeup source isn't +received during the polling interval (and thus schedule is invoked) there are +two options, either the polling interval and total block time[0] were less than +the global max polling interval (see module params below), or the total block +time was greater than the global max polling interval. + +In the event that both the polling interval and total block time were less than +the global max polling interval then the polling interval can be increased in +the hope that next time during the longer polling interval the wake up source +will be received while the host is polling and the latency benefits will be +received. The polling interval is grown in the function grow_halt_poll_ns() and +is multiplied by the module parameter halt_poll_ns_grow. + +In the event that the total block time was greater than the global max polling +interval then the host will never poll for long enough (limited by the global +max) to wakeup during the polling interval so it may as well be shrunk in order +to avoid pointless polling. The polling interval is shrunk in the function +shrink_halt_poll_ns() and is divided by the module parameter +halt_poll_ns_shrink, or set to 0 iff halt_poll_ns_shrink == 0. + +It is worth noting that this adjustment process attempts to hone in on some +steady state polling interval but will only really do a good job for wakeups +which come at an approximately constant rate, otherwise there will be constant +adjustment of the polling interval. + +[0] total block time: the time between when the halt polling function is + invoked and a wakeup source received (irrespective of + whether the scheduler is invoked within that function). + +Module Parameters +================= + +The kvm module has 3 tuneable module parameters to adjust the global max +polling interval as well as the rate at which the polling interval is grown and +shrunk. These variables are defined in include/linux/kvm_host.h and as module +parameters in virt/kvm/kvm_main.c, or arch/powerpc/kvm/book3s_hv.c in the +powerpc kvm-hv case. + +Module Parameter | Description | Default Value +-------------------------------------------------------------------------------- +halt_poll_ns | The global max polling interval | KVM_HALT_POLL_NS_DEFAULT + | which defines the ceiling value | + | of the polling interval for | (per arch value) + | each vcpu. | +-------------------------------------------------------------------------------- +halt_poll_ns_grow | The value by which the halt | 2 + | polling interval is multiplied | + | in the grow_halt_poll_ns() | + | function. | +-------------------------------------------------------------------------------- +halt_poll_ns_shrink | The value by which the halt | 0 + | polling interval is divided in | + | the shrink_halt_poll_ns() | + | function. | +-------------------------------------------------------------------------------- + +These module parameters can be set from the debugfs files in: + + /sys/module/kvm/parameters/ + +Note: that these module parameters are system wide values and are not able to + be tuned on a per vm basis. + +Further Notes +============= + +- Care should be taken when setting the halt_poll_ns module parameter as a +large value has the potential to drive the cpu usage to 100% on a machine which +would be almost entirely idle otherwise. This is because even if a guest has +wakeups during which very little work is done and which are quite far apart, if +the period is shorter than the global max polling interval (halt_poll_ns) then +the host will always poll for the entire block time and thus cpu utilisation +will go to 100%. + +- Halt polling essentially presents a trade off between power usage and latency +and the module parameters should be used to tune the affinity for this. Idle +cpu time is essentially converted to host kernel time with the aim of decreasing +latency when entering the guest. + +- Halt polling will only be conducted by the host when no other tasks are +runnable on that cpu, otherwise the polling will cease immediately and +schedule will be invoked to allow that other task to run. Thus this doesn't +allow a guest to denial of service the cpu. -- cgit v1.2.1