PowerNV family boards (``powernv8``, ``powernv9``, ``powernv10``)
==================================================================


PowerNV (for Non-Virtualized) is the "bare metal" platform using the
OPAL firmware. It runs Linux on IBM and OpenPOWER systems and it can
be used as a hypervisor OS, running KVM guests, or simply as a host
OS.

The PowerNV QEMU machine tries to emulate a PowerNV system at the
level of the skiboot firmware, which loads the OS and provides some
runtime services. Power Systems have a lower firmware (HostBoot) that
does low level system initialization, like DRAM training. This is
beyond the scope of what QEMU addresses today.


Supported devices
-----------------

* Multi processor support for POWER8, POWER8NVL and POWER9.
* XSCOM, the serial communication sideband bus used to configure chiplets.
* Simple LPC Controller.
* Processor Service Interface (PSI) Controller.
* Interrupt Controller: XICS (POWER8), XIVE (POWER9) and XIVE2 (POWER10).
* POWER8 PHB3 PCIe Host bridge and POWER9 PHB4 PCIe Host bridge.
* Simple OCC, an on-chip micro-controller used for power management tasks.
* iBT device to handle BMC communication, with the internal BMC simulator
  provided by QEMU or an external BMC such as an Aspeed QEMU machine.
* PNOR containing the different firmware partitions.


Missing devices
---------------

Many devices are still missing, among which:

* I2C controllers (yet to be merged).
* NPU/NPU2/NPU3 controllers.
* EEH support for PCIe Host bridge controllers.
* NX controller.
* VAS controller.
* chipTOD (Time Of Day).
* Self Boot Engine (SBE).
* FSI bus.


Firmware
--------

The OPAL firmware (OpenPower Abstraction Layer) for OpenPower systems
includes the runtime services ``skiboot`` and the bootloader kernel and
initramfs ``skiroot``. Source code can be found on the `OpenPOWER account at
GitHub <https://github.com/open-power>`_.

Prebuilt images of ``skiboot`` and ``skiroot`` are made available on the
`OpenPOWER <https://github.com/open-power/op-build/releases/>`__ site.

QEMU includes a prebuilt image of ``skiboot`` which is updated when a
more recent version is required by the models.


Current acceleration status
---------------------------

KVM acceleration on Linux Power hosts is provided by the kvm-hv and
kvm-pr modules. kvm-hv adheres to PAPR and is not compatible with
powernv. kvm-pr could in theory be used as a valid accel option, but
kvm-pr does not support this at the moment.

To spare users from dealing with uninformative errors when attempting
to use accel=kvm, the powernv machine reports an error stating that
KVM is not supported. This can be revisited in the future if kvm-pr (or
any other KVM alternative) becomes usable as a KVM accel for this machine.


Boot options
------------

Here is a simple setup with one e1000e NIC:

.. code-block:: bash

  $ qemu-system-ppc64 -m 2G -machine powernv9 -smp 2,cores=2,threads=1 \
      -accel tcg,thread=single \
      -device e1000e,netdev=net0,mac=C0:FF:EE:00:00:02,bus=pcie.0,addr=0x0 \
      -netdev user,id=net0,hostfwd=::20022-:22,hostname=pnv \
      -kernel ./zImage.epapr \
      -initrd ./rootfs.cpio.xz \
      -nographic

To add a SATA disk, append these options to the command line:

.. code-block:: bash

      -device ich9-ahci,id=sata0,bus=pcie.1,addr=0x0 \
      -drive file=./ubuntu-ppc64le.qcow2,if=none,id=drive0,format=qcow2,cache=none \
      -device ide-hd,bus=sata0.0,unit=0,drive=drive0,id=ide,bootindex=1

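
In the NIC example above, ``hostfwd=::20022-:22`` follows the QEMU
``hostfwd=[tcp|udp]:[hostaddr]:hostport-[guestaddr]:guestport`` syntax:
it forwards TCP port 20022 on the host to port 22 in the guest. The
sketch below only builds that ``-netdev`` argument for a chosen host
port. ``pnv_user_netdev`` is a hypothetical helper written for this
illustration, not something QEMU provides, and it assumes the guest
service listens on port 22 unless told otherwise.

.. code-block:: bash

  # Hypothetical helper (not part of QEMU): print the value of a
  # "-netdev" option that forwards a host TCP port to a guest port.
  # The guest port defaults to 22 (SSH), which is an assumption.
  pnv_user_netdev() {
      local host_port=$1 guest_port=${2:-22}
      echo "user,id=net0,hostfwd=::${host_port}-:${guest_port},hostname=pnv"
  }

  # Reproduces the option used in the example above.
  pnv_user_netdev 20022

The output can be passed as ``-netdev "$(pnv_user_netdev 20022)"``.
Assuming the guest runs an SSH server, ``ssh -p 20022 <user>@localhost``
on the host should then reach it.
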

Complex PCIe configuration
~~~~~~~~~~~~~~~~~~~~~~~~~~

Six PHBs are defined per chip (POWER9) but no default PCI layout is
provided (to be compatible with libvirt). One PCI device can be added
on any of the available PCIe slots using command line options. The two
examples below, a network adapter and a SCSI controller with a disk,
both target the same slot (``bus=pcie.0,addr=0x0``), so use one or the
other, or move one of them to a different PHB:

.. code-block:: bash

  # A network adapter attached to a host bridge
  -device e1000e,netdev=net0,mac=C0:FF:EE:00:00:02,bus=pcie.0,addr=0x0
  -netdev bridge,id=net0,helper=/usr/libexec/qemu-bridge-helper,br=virbr0

  # A SCSI controller with one disk
  -device megasas,id=scsi0,bus=pcie.0,addr=0x0
  -drive file=./ubuntu-ppc64le.qcow2,if=none,id=drive-scsi0-0-0-0,format=qcow2,cache=none
  -device scsi-hd,bus=scsi0.0,channel=0,scsi-id=0,lun=0,drive=drive-scsi0-0-0-0,id=scsi0-0-0-0,bootindex=2


Here is a full example with two different storage controllers on
different PHBs, each with a disk. The second PHB hosts a PCIe-to-PCI
bridge, to which the AHCI controller, the NIC and a USB controller are
attached:

.. code-block:: bash

  $ qemu-system-ppc64 -m 2G -machine powernv9 -smp 2,cores=2,threads=1 -accel tcg,thread=single \
      -kernel ./zImage.epapr -initrd ./rootfs.cpio.xz -bios ./skiboot.lid \
      \
      -device megasas,id=scsi0,bus=pcie.0,addr=0x0 \
      -drive file=./rhel7-ppc64le.qcow2,if=none,id=drive-scsi0-0-0-0,format=qcow2,cache=none \
      -device scsi-hd,bus=scsi0.0,channel=0,scsi-id=0,lun=0,drive=drive-scsi0-0-0-0,id=scsi0-0-0-0,bootindex=2 \
      \
      -device pcie-pci-bridge,id=bridge1,bus=pcie.1,addr=0x0 \
      \
      -device ich9-ahci,id=sata0,bus=bridge1,addr=0x1 \
      -drive file=./ubuntu-ppc64le.qcow2,if=none,id=drive0,format=qcow2,cache=none \
      -device ide-hd,bus=sata0.0,unit=0,drive=drive0,id=ide,bootindex=1 \
      -device e1000e,netdev=net0,mac=C0:FF:EE:00:00:02,bus=bridge1,addr=0x2 \
      -netdev bridge,helper=/usr/libexec/qemu-bridge-helper,br=virbr0,id=net0 \
      -device nec-usb-xhci,bus=bridge1,addr=0x7 \
      \
      -serial mon:stdio -nographic


You can also use VIRTIO devices:

.. code-block:: bash

      -drive file=./fedora-ppc64le.qcow2,if=none,snapshot=on,id=drive0 \
      -device virtio-blk-pci,drive=drive0,id=blk0,bus=pcie.0 \
      \
      -netdev tap,helper=/usr/lib/qemu/qemu-bridge-helper,br=virbr0,id=netdev0 \
      -device virtio-net-pci,netdev=netdev0,id=net0,bus=pcie.1 \
      \
      -fsdev local,id=fsdev0,path=$HOME,security_model=passthrough \
      -device virtio-9p-pci,fsdev=fsdev0,mount_tag=host,bus=pcie.2


Multi sockets
~~~~~~~~~~~~~

The number of sockets is deduced from the number of CPUs and the
number of cores. ``-smp 2,cores=1`` will define a machine with 2
sockets of 1 core, whereas ``-smp 2,cores=2`` will define a machine
with 1 socket of 2 cores, and ``-smp 8,cores=2`` will define a machine
with 4 sockets of 2 cores.

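
The arithmetic behind these examples can be sketched as a small shell
function. ``pnv_sockets`` is a hypothetical helper written for this
illustration, not a QEMU tool. It assumes that the socket count is the
CPU count divided by ``cores * threads`` and that ``threads`` is 1 when
omitted, which matches the three cases above.

.. code-block:: bash

  # Hypothetical helper (not part of QEMU): print the number of sockets
  # implied by "-smp <cpus>,cores=<cores>[,threads=<threads>]".
  # Assumption: sockets = cpus / (cores * threads), threads defaults to 1.
  pnv_sockets() {
      local cpus=$1 cores=$2 threads=${3:-1}
      local per_socket=$(( cores * threads ))
      # Reject layouts that do not divide into whole sockets.
      if (( per_socket <= 0 || cpus % per_socket != 0 )); then
          echo "error: ${cpus} CPUs do not fill whole sockets of ${per_socket}" >&2
          return 1
      fi
      echo $(( cpus / per_socket ))
  }

  pnv_sockets 2 1   # -smp 2,cores=1  -> 2
  pnv_sockets 2 2   # -smp 2,cores=2  -> 1
  pnv_sockets 8 2   # -smp 8,cores=2  -> 4

A quick check like this makes it easier to see how many chips a given
``-smp`` line will create before adding devices to specific PHBs.
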

BMC configuration
~~~~~~~~~~~~~~~~~

OpenPOWER systems negotiate the shutdown and reboot with their
BMC. The QEMU PowerNV machine embeds an IPMI BMC simulator using the
iBT interface and should offer the same power features.

If you want to define your own BMC, use ``-nodefaults`` and specify
one on the command line:

.. code-block:: bash

  -device ipmi-bmc-sim,id=bmc0 -device isa-ipmi-bt,bmc=bmc0,irq=10


The files `palmetto-SDR.bin <http://www.kaod.org/qemu/powernv/palmetto-SDR.bin>`__
and `palmetto-FRU.bin <http://www.kaod.org/qemu/powernv/palmetto-FRU.bin>`__
define a Sensor Data Record repository and a Field Replaceable Unit
inventory for a Palmetto BMC. They can be used to extend the QEMU BMC
simulator.

.. code-block:: bash

  -device ipmi-bmc-sim,sdrfile=./palmetto-SDR.bin,fruareasize=256,frudatafile=./palmetto-FRU.bin,id=bmc0 \
  -device isa-ipmi-bt,bmc=bmc0,irq=10


The PowerNV machine can also be run with an external IPMI BMC device
connected to a remote QEMU machine acting as BMC, using these options:

.. code-block:: bash

  -chardev socket,id=ipmi0,host=localhost,port=9002,reconnect-ms=10000 \
  -device ipmi-bmc-extern,id=bmc0,chardev=ipmi0 \
  -device isa-ipmi-bt,bmc=bmc0,irq=10 \
  -nodefaults


NVRAM
~~~~~

Use an MTD drive to add a PNOR to the machine, and get an NVRAM:

.. code-block:: bash

  -drive file=./witherspoon.pnor,format=raw,if=mtd


Maintainer contact information
------------------------------

Cédric Le Goater <clg@kaod.org>