Hyper-V Enlightenments
======================


Description
-----------

In some cases when implementing a hardware interface in software is slow, KVM
implements its own paravirtualized interfaces. This works well for Linux as
guest support for such features is added simultaneously with the feature itself.
It may, however, be hard-to-impossible to add support for these interfaces to
proprietary OSes, namely, Microsoft Windows.

KVM on x86 implements Hyper-V Enlightenments for Windows guests. These features
make Windows and Hyper-V guests think they're running on top of a Hyper-V
compatible hypervisor and use Hyper-V specific features.

Setup
-----

No Hyper-V enlightenments are enabled by default by either KVM or QEMU. In
QEMU, individual enlightenments can be enabled through CPU flags, e.g.:

.. parsed-literal::

  |qemu_system| --enable-kvm --cpu host,hv_relaxed,hv_vpindex,hv_time, ...

Some enlightenments depend on others; QEMU is supposed to check that the
supplied configuration is sane.

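As a rough illustration of the kind of check involved, the sketch below
validates a set of flags against the ``Requires:`` lines listed for each
enlightenment in this document. It is not QEMU's implementation; the
``REQUIRES`` table and the ``missing_dependencies`` helper are hypothetical
names introduced here for illustration only.

```python
# Dependencies transcribed from the "Requires:" lines in this document.
# Hypothetical helper for illustration; QEMU performs its own checks.
REQUIRES = {
    "hv-synic": {"hv-vpindex"},
    "hv-stimer": {"hv-vpindex", "hv-synic", "hv-time"},
    "hv-tlbflush": {"hv-vpindex"},
    "hv-ipi": {"hv-vpindex"},
    "hv-evmcs": {"hv-vapic"},
    "hv-stimer-direct": {"hv-vpindex", "hv-synic", "hv-time", "hv-stimer"},
    "hv-syndbg": {"hv-relaxed", "hv-time", "hv-vapic", "hv-vpindex",
                  "hv-synic", "hv-runtime", "hv-stimer"},
    "hv-tlbflush-ext": {"hv-tlbflush"},
    "hv-tlbflush-direct": {"hv-vapic"},
}


def missing_dependencies(enabled):
    """Map each enabled flag to the required flags that are not enabled."""
    enabled = set(enabled)
    problems = {}
    for flag in sorted(enabled):
        missing = REQUIRES.get(flag, set()) - enabled
        if missing:
            problems[flag] = sorted(missing)
    return problems
```

For example, ``missing_dependencies({"hv-stimer", "hv-vpindex"})`` reports that
``hv-stimer`` still needs ``hv-synic`` and ``hv-time``, while an empty result
means every listed dependency is satisfied.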

When any set of the Hyper-V enlightenments is enabled, QEMU changes hypervisor
identification (CPUID 0x40000000..0x4000000A) to Hyper-V. KVM identification
and features are kept in leaves 0x40000100..0x40000101.

Existing enlightenments
-----------------------

``hv-relaxed``
  This feature tells the guest OS to disable watchdog timeouts as it is running
  on a hypervisor. It is known that some Windows versions will do this even when
  they see the 'hypervisor' CPU flag.

``hv-vapic``
  Provides the so-called VP Assist page MSR to the guest, allowing it to work
  with the APIC more efficiently. In particular, this enlightenment allows
  paravirtualized (exit-less) EOI processing.

``hv-spinlocks`` = xxx
  Enables paravirtualized spinlocks. The parameter indicates how many times
  spinlock acquisition should be attempted before indicating the situation to the
  hypervisor. A special value 0xffffffff indicates "never notify".

``hv-vpindex``
  Provides the HV_X64_MSR_VP_INDEX (0x40000002) MSR to the guest, which holds
  the Virtual Processor index. This enlightenment makes sense in conjunction
  with hv-synic, hv-stimer and other enlightenments which require the guest to
  know its Virtual Processor indices (e.g. when a VP index needs to be passed in
  a hypercall).

``hv-runtime``
  Provides the HV_X64_MSR_VP_RUNTIME (0x40000010) MSR to the guest. The MSR
  keeps the virtual processor run time in 100ns units. This gives the guest
  operating system an idea of how much time was 'stolen' from it (when the
  virtual CPU was preempted to perform some other work).

``hv-crash``
  Provides HV_X64_MSR_CRASH_P0..HV_X64_MSR_CRASH_P4 (0x40000100..0x40000104) and
  HV_X64_MSR_CRASH_CTL (0x40000105) MSRs to the guest. These MSRs are written to
  by the guest when it crashes; the HV_X64_MSR_CRASH_P0..HV_X64_MSR_CRASH_P4
  MSRs contain additional crash information. This information is written to the
  QEMU log and exposed through QAPI.

  Note: unlike under genuine Hyper-V, a write to HV_X64_MSR_CRASH_CTL causes the
  guest to shut down. This effectively blocks crash dump generation by Windows.

``hv-time``
  Enables two Hyper-V-specific clocksources available to the guest: the
  MSR-based Hyper-V clocksource (HV_X64_MSR_TIME_REF_COUNT, 0x40000020) and the
  Reference TSC page (enabled via MSR HV_X64_MSR_REFERENCE_TSC, 0x40000021).
  Both clocksources are per-guest; the Reference TSC page clocksource allows for
  exit-less time stamp readings. Using this enlightenment leads to a significant
  speedup of all timestamp related operations.

``hv-synic``
  Enables the Hyper-V Synthetic interrupt controller, an extension of the local
  APIC. When enabled, this enlightenment provides additional communication
  facilities to the guest: SynIC messages and Events. This is a prerequisite for
  implementing VMBus devices (not yet in QEMU). Additionally, this enlightenment
  is needed to enable Hyper-V synthetic timers. SynIC is controlled through MSRs
  HV_X64_MSR_SCONTROL..HV_X64_MSR_EOM (0x40000080..0x40000084) and
  HV_X64_MSR_SINT0..HV_X64_MSR_SINT15 (0x40000090..0x4000009F).

  Requires: ``hv-vpindex``

``hv-stimer``
  Enables Hyper-V synthetic timers. There are four synthetic timers per virtual
  CPU controlled through HV_X64_MSR_STIMER0_CONFIG..HV_X64_MSR_STIMER3_COUNT
  (0x400000B0..0x400000B7) MSRs. These timers can work either in single-shot or
  periodic mode. It is known that certain Windows versions revert to using HPET
  (or even RTC when HPET is unavailable) extensively when this enlightenment is
  not provided; this can lead to significant CPU consumption, even when the
  virtual CPU is idle.

  Requires: ``hv-vpindex``, ``hv-synic``, ``hv-time``

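The MSR range quoted for the synthetic timers can be unpacked per timer. The
sketch below assumes that the CONFIG and COUNT registers of each timer are
adjacent and that timers follow each other in order, which matches the
0x400000B0..0x400000B7 range for four timers; ``stimer_msrs`` is a hypothetical
helper name, not a QEMU or KVM function.

```python
HV_X64_MSR_STIMER0_CONFIG = 0x400000B0  # first MSR of the range quoted above
STIMER_COUNT = 4                        # four synthetic timers per virtual CPU


def stimer_msrs(timer):
    """Return (config_msr, count_msr) for synthetic timer 0..3.

    Assumes each timer owns two consecutive MSRs: CONFIG, then COUNT.
    """
    if not 0 <= timer < STIMER_COUNT:
        raise ValueError(f"timer index out of range: {timer}")
    config = HV_X64_MSR_STIMER0_CONFIG + 2 * timer
    return config, config + 1
```

Under that assumption, timer 3 maps to 0x400000B6 (CONFIG) and 0x400000B7
(COUNT), the last MSR in the documented range.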

``hv-tlbflush``
  Enables paravirtualized TLB shoot-down mechanism. On x86 architecture, the
  remote TLB flush procedure requires sending IPIs and waiting for other CPUs to
  perform a local TLB flush. In a virtualized environment some virtual CPUs may
  not even be scheduled at the time of the call and may not require flushing
  (or, flushing may be postponed until the virtual CPU is scheduled). The
  hv-tlbflush enlightenment implements TLB shoot-down through the hypervisor,
  enabling this optimization.

  Requires: ``hv-vpindex``

``hv-ipi``
  Enables paravirtualized IPI send mechanism. The HvCallSendSyntheticClusterIpi
  hypercall may target more than 64 virtual CPUs simultaneously; doing the same
  through the APIC requires more than one access (and thus more than one exit
  to the hypervisor).

  Requires: ``hv-vpindex``

``hv-vendor-id`` = xxx
  This changes Hyper-V identification in CPUID 0x40000000.EBX-EDX from the default
  "Microsoft Hv". The parameter should be no longer than 12 characters. According
  to the specification, guests shouldn't use this information, and it is unknown
  whether any Windows version acts differently.

  Note: hv-vendor-id is not an enlightenment and thus doesn't enable Hyper-V
  identification when specified without some other enlightenment.

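The 12-character limit follows from the three 32-bit registers that carry the
string. The sketch below shows one way the vendor ID could be laid out across
EBX, ECX and EDX; it assumes little-endian packing and NUL padding for shorter
strings, and ``vendor_id_registers`` is a hypothetical helper name used only
for illustration.

```python
import struct


def vendor_id_registers(vendor_id="Microsoft Hv"):
    """Pack a vendor ID into (EBX, ECX, EDX) for CPUID 0x40000000."""
    raw = vendor_id.encode("ascii")
    if len(raw) > 12:
        raise ValueError("vendor ID must be no longer than 12 characters")
    # Assumption: shorter IDs are padded with NUL bytes. The 12 bytes are
    # then read as three little-endian 32-bit words.
    return struct.unpack("<3I", raw.ljust(12, b"\0"))
```

With the default string this yields ``(0x7263694D, 0x666F736F, 0x76482074)``,
i.e. "Micr", "osof" and "t Hv" read as little-endian words.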

``hv-reset``
  Provides the HV_X64_MSR_RESET (0x40000003) MSR to the guest, allowing it to
  reset itself by writing to it. Even when this MSR is enabled, it is not a
  recommended way for Windows to perform a system reboot and thus it may not be
  used.

``hv-frequencies``
  Provides HV_X64_MSR_TSC_FREQUENCY (0x40000022) and HV_X64_MSR_APIC_FREQUENCY
  (0x40000023) MSRs, allowing the guest to get its TSC/APIC frequencies without
  doing measurements.

``hv-reenlightenment``
  This enlightenment is nested specific: it targets Hyper-V on KVM guests. When
  enabled, it provides HV_X64_MSR_REENLIGHTENMENT_CONTROL (0x40000106),
  HV_X64_MSR_TSC_EMULATION_CONTROL (0x40000107) and HV_X64_MSR_TSC_EMULATION_STATUS
  (0x40000108) MSRs, allowing the guest to get notified when the TSC frequency
  changes (only happens on migration) and to keep using the old frequency
  (through emulation in the hypervisor) until it is ready to switch to the new
  one. This, in conjunction with ``hv-frequencies``, allows Hyper-V on KVM to
  pass a stable clocksource (Reference TSC page) to its own guests.

  Note: KVM doesn't fully support re-enlightenment notifications and doesn't
  emulate TSC accesses after migration, so the 'tsc-frequency=' CPU option also
  has to be specified to make migration succeed. The destination host has to
  either have the same TSC frequency or support the TSC scaling CPU feature.

  Recommended: ``hv-frequencies``

``hv-evmcs``
  This enlightenment is nested specific: it targets Hyper-V on KVM guests. When
  enabled, it provides the Enlightened VMCS version 1 feature to the guest. The
  feature implements a paravirtualized protocol between the L0 (KVM) and L1
  (Hyper-V) hypervisors, making L2 exits to the hypervisor faster. The feature
  is Intel-only.

  Note: some virtualization features (e.g. Posted Interrupts) are disabled when
  hv-evmcs is enabled. It may make sense to measure your nested workload with and
  without the feature to find out if enabling it is beneficial.

  Requires: ``hv-vapic``

``hv-stimer-direct``
  The Hyper-V specification allows synthetic timer operation in two modes:
  "classic", when the expiration event is delivered as a SynIC message, and
  "direct", when the event is delivered via a normal interrupt. It is known that
  nested Hyper-V can only use synthetic timers in direct mode and thus
  ``hv-stimer-direct`` needs to be enabled.

  Requires: ``hv-vpindex``, ``hv-synic``, ``hv-time``, ``hv-stimer``

``hv-avic`` (``hv-apicv``)
  This enlightenment allows using Hyper-V SynIC with hardware APICv/AVIC enabled.
  Normally, Hyper-V SynIC disables these hardware features and suggests that the
  guest use the paravirtualized AutoEOI feature.

  Note: enabling this feature on old hardware (without APICv/AVIC support) may
  have a negative effect on the guest's performance.

``hv-no-nonarch-coresharing`` = on/off/auto
  This enlightenment tells the guest OS that virtual processors will never share
  a physical core unless they are reported as sibling SMT threads. This
  information is required by Windows and Hyper-V guests to properly mitigate SMT
  related CPU vulnerabilities.

  When the option is set to 'auto', QEMU will enable the feature only when KVM
  reports that non-architectural coresharing is impossible; this means that
  hyper-threading is not supported or completely disabled on the host. This
  setting also prevents migration as SMT settings on the destination may differ.
  When the option is set to 'on', QEMU will always enable the feature, regardless
  of host setup. To keep guests secure, this can only be used in conjunction with
  exposing correct vCPU topology and vCPU pinning.

``hv-version-id-build``, ``hv-version-id-major``, ``hv-version-id-minor``, ``hv-version-id-spack``, ``hv-version-id-sbranch``, ``hv-version-id-snumber``
  This changes Hyper-V version identification in CPUID 0x40000002.EAX-EDX from the
  default (WS2016).

  - ``hv-version-id-build`` sets 'Build Number' (32 bits)
  - ``hv-version-id-major`` sets 'Major Version' (16 bits)
  - ``hv-version-id-minor`` sets 'Minor Version' (16 bits)
  - ``hv-version-id-spack`` sets 'Service Pack' (32 bits)
  - ``hv-version-id-sbranch`` sets 'Service Branch' (8 bits)
  - ``hv-version-id-snumber`` sets 'Service Number' (24 bits)

  Note: hv-version-id-* are not enlightenments and thus don't enable Hyper-V
  identification when specified without any other enlightenments.

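The field widths above add up to four 32-bit registers. The sketch below shows
how the six values could be combined; the exact bit positions (major version in
the upper half of EBX, service branch in the top byte of EDX) are an assumption
based on the Hyper-V TLFS layout rather than something stated in this document,
and ``version_registers`` is a hypothetical helper name.

```python
def version_registers(build=0, major=0, minor=0, spack=0, sbranch=0, snumber=0):
    """Pack hv-version-id-* values into (EAX, EBX, ECX, EDX)."""
    # Field widths as listed above.
    fields = {
        "build": (build, 32), "major": (major, 16), "minor": (minor, 16),
        "spack": (spack, 32), "sbranch": (sbranch, 8), "snumber": (snumber, 24),
    }
    for name, (value, bits) in fields.items():
        if not 0 <= value < (1 << bits):
            raise ValueError(f"hv-version-id-{name} does not fit in {bits} bits")
    eax = build                       # Build Number
    ebx = (major << 16) | minor       # assumed: major high, minor low
    ecx = spack                       # Service Pack
    edx = (sbranch << 24) | snumber   # assumed: branch high, number low
    return eax, ebx, ecx, edx
```

For instance, ``version_registers(build=14393, major=10)`` gives
``(0x3839, 0x000A0000, 0, 0)``; build 14393 and version 10.0 are used here only
as illustrative WS2016-like values.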

``hv-syndbg``
  Enables the Hyper-V synthetic debugger interface, a special interface used by
  the Windows Kernel debugger to send packets through, rather than sending them
  via serial/network. When enabled, this enlightenment provides additional
  communication facilities to the guest: SynDbg messages. This channel offers a
  significant performance boost over the other communication channels.

  This enlightenment requires a VMBus device (-device vmbus-bridge,irq=15).

  Requires: ``hv-relaxed``, ``hv-time``, ``hv-vapic``, ``hv-vpindex``, ``hv-synic``, ``hv-runtime``, ``hv-stimer``

``hv-emsr-bitmap``
  This enlightenment is nested specific: it targets Hyper-V on KVM guests. When
  enabled, it allows the L0 (KVM) and L1 (Hyper-V) hypervisors to collaborate to
  avoid unnecessary updates to the L2 MSR-Bitmap upon vmexits. While the
  protocol is supported for both VMX (Intel) and SVM (AMD), the VMX
  implementation requires the Enlightened VMCS (``hv-evmcs``) feature to also be
  enabled.

  Recommended: ``hv-evmcs`` (Intel)

``hv-xmm-input``
  The Hyper-V specification allows passing parameters for certain hypercalls in
  XMM registers ("XMM Fast Hypercall Input"). When the feature is in use, it
  allows for faster hypercall processing as KVM can avoid reading the guest's
  memory.

``hv-tlbflush-ext``
  Allows extended GVA ranges to be passed to Hyper-V TLB flush hypercalls
  (HvFlushVirtualAddressList/HvFlushVirtualAddressListEx).

  Requires: ``hv-tlbflush``

``hv-tlbflush-direct``
  This enlightenment is nested specific: it targets Hyper-V on KVM guests. When
  enabled, it allows L0 (KVM) to directly handle TLB flush hypercalls from the
  L2 guest without the need to exit to the L1 (Hyper-V) hypervisor. While the
  feature is supported for both VMX (Intel) and SVM (AMD), the VMX
  implementation requires the Enlightened VMCS (``hv-evmcs``) feature to also be
  enabled.

  Requires: ``hv-vapic``

  Recommended: ``hv-evmcs`` (Intel)

Supplementary features
----------------------

``hv-passthrough``
  In some cases (e.g. during development) it may make sense to use QEMU in
  'pass-through' mode and give Windows guests all enlightenments currently
  supported by KVM.

  Note: the ``hv-passthrough`` flag only enables enlightenments which are known
  to QEMU (have a corresponding 'hv-' flag) and copies ``hv-spinlocks`` and
  ``hv-vendor-id`` values from KVM to QEMU. ``hv-passthrough`` overrides all
  other 'hv-' settings on the command line.

  Note: ``hv-passthrough`` does not enable ``hv-syndbg``, which can prevent
  certain Windows guests from booting when used without proper configuration.
  If needed, ``hv-syndbg`` can be enabled additionally.

  Note: ``hv-passthrough`` effectively prevents migration as the list of enabled
  enlightenments may differ between source and destination hosts.

``hv-enforce-cpuid``
  By default, KVM allows the guest to use all currently supported Hyper-V
  enlightenments when the Hyper-V CPUID interface is exposed, even if some
  features were not announced in guest visible CPUIDs. The ``hv-enforce-cpuid``
  feature alters this behavior and only allows the guest to use exposed Hyper-V
  enlightenments.

Recommendations
---------------

To achieve the best performance of Windows and Hyper-V guests and unless there
are any specific requirements (e.g. migration to older QEMU/KVM versions,
emulating specific Hyper-V version, ...), it is recommended to enable all
currently implemented Hyper-V enlightenments with the following exceptions:

- ``hv-syndbg``, ``hv-passthrough``, ``hv-enforce-cpuid`` should not be enabled
  in production configurations as these are debugging/development features.
- ``hv-reset`` can be avoided as modern Hyper-V versions don't expose it.
- ``hv-evmcs`` can (and should) be enabled on Intel CPUs only. While the feature
  is only used in nested configurations (Hyper-V, WSL2), enabling it for regular
  Windows guests should not have any negative effects.
- ``hv-no-nonarch-coresharing`` must only be enabled if vCPUs are properly pinned
  so no non-architectural core sharing is possible.
- ``hv-vendor-id``, ``hv-version-id-build``, ``hv-version-id-major``,
  ``hv-version-id-minor``, ``hv-version-id-spack``, ``hv-version-id-sbranch``,
  ``hv-version-id-snumber`` can be left unchanged; guests are not supposed to
  behave differently when a different Hyper-V version is presented to them.
- ``hv-crash`` must only be enabled if the crash information is consumed via
  QAPI by higher levels of the virtualization stack. Enabling this feature
  effectively prevents Windows from creating dumps upon crashes.
- ``hv-reenlightenment`` can only be used on hardware which supports TSC
  scaling or when guest migration is not needed.
- ``hv-spinlocks`` should be set to e.g. 0xfff when host CPUs are overcommitted
  (meaning there are other scheduled tasks or guests) and can be left unchanged
  from the default value (0xffffffff) otherwise.
- ``hv-avic``/``hv-apicv`` should not be enabled if the hardware does not
  support APIC virtualization (Intel APICv, AMD AVIC).

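The list above can be read as a small decision procedure. The sketch below is
one possible reading of it, not an official tool: the function name
``recommended_cpu_option`` and its keyword arguments are hypothetical, and the
caller is assumed to know the relevant host properties. It only assembles the
value for the ``-cpu`` option from flags named in this document.

```python
def recommended_cpu_option(*, intel_cpu, apic_virtualization, vcpus_pinned,
                           crash_info_consumed, tsc_scaling_or_no_migration,
                           host_overcommitted):
    """Return a '-cpu' value that follows the recommendations above."""
    # Enlightenments that carry no caveat in the list above.
    flags = [
        "hv-relaxed", "hv-vapic", "hv-vpindex", "hv-runtime", "hv-time",
        "hv-synic", "hv-stimer", "hv-stimer-direct", "hv-tlbflush",
        "hv-tlbflush-ext", "hv-tlbflush-direct", "hv-ipi", "hv-frequencies",
        "hv-emsr-bitmap", "hv-xmm-input",
    ]
    if intel_cpu:
        flags.append("hv-evmcs")              # Intel only
    if apic_virtualization:
        flags.append("hv-avic")               # needs APICv/AVIC hardware
    if vcpus_pinned:
        # Also requires a correct vCPU topology to be exposed.
        flags.append("hv-no-nonarch-coresharing=on")
    if crash_info_consumed:
        flags.append("hv-crash")              # prevents Windows crash dumps
    if tsc_scaling_or_no_migration:
        flags.append("hv-reenlightenment")
    if host_overcommitted:
        flags.append("hv-spinlocks=0xfff")    # example value from the text
    # hv-syndbg, hv-passthrough, hv-enforce-cpuid, hv-reset, hv-vendor-id and
    # hv-version-id-* are deliberately left out, as recommended.
    return ",".join(["host"] + flags)
```

The returned string is meant to be passed after ``-cpu``; the base model
``host`` mirrors the example in the Setup section and can be replaced by any
other CPU model.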

Useful links
------------

Hyper-V Top Level Functional specification and other information:

- https://github.com/MicrosoftDocs/Virtualization-Documentation
- https://docs.microsoft.com/en-us/virtualization/hyper-v-on-windows/tlfs/tlfs