			Signed-off-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Klaus Jensen <k.jensen@samsung.com> Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
==============
NVMe Emulation
==============

QEMU provides NVMe emulation through the ``nvme``, ``nvme-ns`` and
``nvme-subsys`` devices.

See the following sections for specific information on

  * `Adding NVMe Devices`_, `additional namespaces`_ and `NVM subsystems`_.
  * Configuration of `Optional Features`_ such as `Controller Memory Buffer`_,
    `Simple Copy`_, `Zoned Namespaces`_, `metadata`_ and `End-to-End Data
    Protection`_.

Adding NVMe Devices
===================

Controller Emulation
--------------------

The QEMU emulated NVMe controller implements version 1.4 of the NVM Express
specification. All mandatory features are implemented with a couple of
exceptions and limitations:

  * Accounting numbers in the SMART/Health log page are reset when the device
    is power cycled.
  * Interrupt Coalescing is not supported and is disabled by default.

The simplest way to attach an NVMe controller on the QEMU PCI bus is to add the
following parameters:

.. code-block:: console

    -drive file=nvm.img,if=none,id=nvm
    -device nvme,serial=deadbeef,drive=nvm

There are a number of optional general parameters for the ``nvme`` device. Some
are mentioned here, but see ``-device nvme,help`` to list all possible
parameters.

``max_ioqpairs=UINT32`` (default: ``64``)
  Set the maximum number of allowed I/O queue pairs. This replaces the
  deprecated ``num_queues`` parameter.

``msix_qsize=UINT16`` (default: ``65``)
  The number of MSI-X vectors that the device should support.

``mdts=UINT8`` (default: ``7``)
  Set the Maximum Data Transfer Size of the device. The value is specified as
  a power of two (2^n) and is in units of the minimum memory page size
  (CAP.MPSMIN).

``use-intel-id`` (default: ``off``)
  Since QEMU 5.2, the device uses a QEMU allocated "Red Hat" PCI Device and
  Vendor ID. Set this to ``on`` to revert to the unallocated Intel ID
  previously used.

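As a rough sketch of how the general parameters combine, the following fragment
configures a controller with fewer I/O queue pairs and a smaller transfer
limit. The values are illustrative assumptions, not recommendations;
``msix_qsize`` is simply kept one above ``max_ioqpairs``, mirroring the
relationship between the two defaults. Because ``mdts`` is a power of two in
units of the minimum memory page size, ``mdts=5`` would correspond to 2^5 pages,
or 128 KiB if the minimum page size is assumed to be 4 KiB.

.. code-block:: console

    -drive file=nvm.img,if=none,id=nvm
    -device nvme,serial=deadbeef,drive=nvm,max_ioqpairs=8,msix_qsize=9,mdts=5
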
Additional Namespaces
---------------------

In the simplest possible invocation sketched above, the device only supports a
single namespace with the namespace identifier ``1``. To support multiple
namespaces and additional features, the ``nvme-ns`` device must be used.

.. code-block:: console

   -device nvme,id=nvme-ctrl-0,serial=deadbeef
   -drive file=nvm-1.img,if=none,id=nvm-1
   -device nvme-ns,drive=nvm-1
   -drive file=nvm-2.img,if=none,id=nvm-2
   -device nvme-ns,drive=nvm-2

The namespaces defined by the ``nvme-ns`` device will attach to the most
recently defined ``nvme-bus`` that is created by the ``nvme`` device. Namespace
identifiers are allocated automatically, starting from ``1``.

There are a number of parameters available:

``nsid`` (default: ``0``)
  Explicitly set the namespace identifier. If left at the default (``0``), the
  identifier is allocated automatically.

``uuid`` (default: *autogenerated*)
  Set the UUID of the namespace. This will be reported as a "Namespace UUID"
  descriptor in the Namespace Identification Descriptor List.

``eui64``
  Set the EUI-64 of the namespace. This will be reported as an "IEEE Extended
  Unique Identifier" descriptor in the Namespace Identification Descriptor List.
  Since machine type 6.1, a non-zero default value is used if the parameter
  is not provided. For earlier machine types the field defaults to 0.

``bus``
  If more than one ``nvme`` device is defined, this parameter may be used to
  attach the namespace to a specific ``nvme`` device (identified by an ``id``
  parameter on the controller device).

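To illustrate the ``bus`` and ``nsid`` parameters, the following sketch defines
two controllers and attaches one namespace to each by referring to the
controller ``id``. The image file names, the second controller's ``id`` and
its serial number are hypothetical and chosen only for this example.

.. code-block:: console

   -device nvme,id=nvme-ctrl-0,serial=deadbeef
   -device nvme,id=nvme-ctrl-1,serial=cafebabe
   -drive file=nvm-1.img,if=none,id=nvm-1
   -device nvme-ns,drive=nvm-1,bus=nvme-ctrl-0,nsid=1
   -drive file=nvm-2.img,if=none,id=nvm-2
   -device nvme-ns,drive=nvm-2,bus=nvme-ctrl-1,nsid=1

Without ``bus``, both namespaces would attach to the most recently defined
controller, which here would be ``nvme-ctrl-1``.
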
NVM Subsystems
--------------

Additional features become available if the controller device (``nvme``) is
linked to an NVM Subsystem device (``nvme-subsys``).

The NVM Subsystem emulation allows features such as shared namespaces and
multipath I/O.

.. code-block:: console

   -device nvme-subsys,id=nvme-subsys-0,nqn=subsys0
   -device nvme,serial=a,subsys=nvme-subsys-0
   -device nvme,serial=b,subsys=nvme-subsys-0

This will create an NVM subsystem with two controllers. Having controllers
linked to an ``nvme-subsys`` device allows additional ``nvme-ns`` parameters:

``shared`` (default: ``off``)
  Specifies that the namespace will be attached to all controllers in the
  subsystem. If set to ``off`` (the default), the namespace will remain a
  private namespace and may only be attached to a single controller at a time.

``detached`` (default: ``off``)
  If set to ``on``, the namespace will be available in the subsystem, but
  not attached to any controllers initially.

Thus, adding

.. code-block:: console

   -drive file=nvm-1.img,if=none,id=nvm-1
   -device nvme-ns,drive=nvm-1,nsid=1,shared=on
   -drive file=nvm-2.img,if=none,id=nvm-2
   -device nvme-ns,drive=nvm-2,nsid=3,detached=on

will cause NSID 1 to be a shared namespace (due to ``shared=on``) that is
initially attached to both controllers. NSID 3 will be a private namespace
(i.e. only attachable to a single controller at a time) and will not be
attached to any controller initially (due to ``detached=on``).

Optional Features
=================

Controller Memory Buffer
------------------------

The following ``nvme`` device parameters configure Controller Memory Buffer
support:

``cmb_size_mb=UINT32`` (default: ``0``)
  This adds a Controller Memory Buffer of the given size at offset zero in BAR
  2.

``legacy-cmb`` (default: ``off``)
  By default, the device uses the "v1.4 scheme" for the Controller Memory
  Buffer support (i.e., the CMB is initially disabled and must be explicitly
  enabled by the host). Set this to ``on`` to behave as a v1.3 device with
  respect to the CMB.

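A minimal sketch of a controller with a Controller Memory Buffer might look
like the following. The 64 MiB size is an arbitrary value picked for
illustration.

.. code-block:: console

   -drive file=nvm.img,if=none,id=nvm
   -device nvme,serial=deadbeef,drive=nvm,cmb_size_mb=64

Appending ``legacy-cmb=on`` to the ``nvme`` device would select the v1.3
behavior instead of the default v1.4 scheme.
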
Simple Copy
-----------

The device includes support for TP 4065 ("Simple Copy Command"). A number of
additional ``nvme-ns`` device parameters may be used to control the Copy
command limits:

``mssrl=UINT16`` (default: ``128``)
  Set the Maximum Single Source Range Length (``MSSRL``). This is the maximum
  number of logical blocks that may be specified in each source range.

``mcl=UINT32`` (default: ``128``)
  Set the Maximum Copy Length (``MCL``). This is the maximum number of logical
  blocks that may be specified in a Copy command (the total for all source
  ranges).

``msrc=UINT8`` (default: ``127``)
  Set the Maximum Source Range Count (``MSRC``). This is the maximum number of
  source ranges that may be used in a Copy command. This is a 0's based value.

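As an illustration of the Copy command limits, the sketch below restricts each
source range to 64 logical blocks and a whole Copy command to 256 logical
blocks. Since ``msrc`` is a 0's based value, ``msrc=15`` permits up to 16
source ranges. All three values are assumptions chosen for the example.

.. code-block:: console

   -device nvme,id=nvme-ctrl-0,serial=deadbeef
   -drive file=nvm-1.img,if=none,id=nvm-1
   -device nvme-ns,drive=nvm-1,mssrl=64,mcl=256,msrc=15
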
Zoned Namespaces
----------------

A namespace may be "Zoned" as defined by TP 4053 ("Zoned Namespaces"). Set
``zoned=on`` on an ``nvme-ns`` device to configure it as a zoned namespace.

The namespace may be configured with additional parameters:

``zoned.zone_size=SIZE`` (default: ``128MiB``)
  Define the zone size (``ZSZE``).

``zoned.zone_capacity=SIZE`` (default: ``0``)
  Define the zone capacity (``ZCAP``). If left at the default (``0``), the zone
  capacity will equal the zone size.

``zoned.descr_ext_size=UINT32`` (default: ``0``)
  Set the Zone Descriptor Extension Size (``ZDES``). Must be a multiple of 64
  bytes.

``zoned.cross_read=BOOL`` (default: ``off``)
  Set to ``on`` to allow reads to cross zone boundaries.

``zoned.max_active=UINT32`` (default: ``0``)
  Set the maximum number of active resources (``MAR``). The default (``0``)
  allows all zones to be active.

``zoned.max_open=UINT32`` (default: ``0``)
  Set the maximum number of open resources (``MOR``). The default (``0``)
  allows all zones to be open. If ``zoned.max_active`` is specified, this value
  must be less than or equal to that.

``zoned.zasl=UINT8`` (default: ``0``)
  Set the maximum data transfer size for the Zone Append command. Like
  ``mdts``, the value is specified as a power of two (2^n) and is in units of
  the minimum memory page size (CAP.MPSMIN). The default value (``0``)
  has this property inherit the ``mdts`` value.

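The zoned parameters can be combined on a single ``nvme-ns`` device. The
following sketch assumes a hypothetical backing image ``zns-1.img`` and assumes
that ``SIZE`` values accept the usual ``M`` suffix. It configures 64 MiB zones
with a 48 MiB zone capacity and keeps ``zoned.max_open`` below
``zoned.max_active``, as required above.

.. code-block:: console

   -device nvme,id=nvme-ctrl-0,serial=deadbeef
   -drive file=zns-1.img,if=none,id=zns-1
   -device nvme-ns,drive=zns-1,zoned=on,zoned.zone_size=64M,zoned.zone_capacity=48M,zoned.max_active=16,zoned.max_open=8
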
Metadata
--------

The virtual namespace device supports LBA metadata in the form of separate
metadata (``MPTR``-based) and extended LBAs.

``ms=UINT16`` (default: ``0``)
  Defines the number of metadata bytes per LBA.

``mset=UINT8`` (default: ``0``)
  Set to ``1`` to enable extended LBAs.

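For example, a namespace with 16 bytes of metadata per LBA, transferred as part
of an extended LBA, could be sketched as follows. The metadata size is an
arbitrary choice for illustration. Leaving out ``mset=1`` would select separate
(``MPTR``-based) metadata instead.

.. code-block:: console

   -device nvme,id=nvme-ctrl-0,serial=deadbeef
   -drive file=nvm-1.img,if=none,id=nvm-1
   -device nvme-ns,drive=nvm-1,ms=16,mset=1
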
End-to-End Data Protection
--------------------------

The virtual namespace device supports DIF- and DIX-based protection information
(depending on ``mset``).

``pi=UINT8`` (default: ``0``)
  Enable protection information of the specified type (type ``1``, ``2`` or
  ``3``).

``pil=UINT8`` (default: ``0``)
  Controls the location of the protection information within the metadata. Set
  to ``1`` to transfer protection information as the first eight bytes of
  metadata. Otherwise, the protection information is transferred as the last
  eight bytes.
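
As a sketch of how these parameters fit together with the metadata settings,
the fragment below enables Type 1 protection information and places it in the
first eight bytes of a 16-byte metadata area. The choice of ``ms=16`` is an
assumption made for the example; since the protection information occupies
eight bytes, the metadata size presumably needs to be at least that large.

.. code-block:: console

   -device nvme,id=nvme-ctrl-0,serial=deadbeef
   -drive file=nvm-1.img,if=none,id=nvm-1
   -device nvme-ns,drive=nvm-1,ms=16,pi=1,pil=1
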