intel-iommu: Document iova_tree

It is not very clear when iova_tree is used, or why. Add a detailed
comment above iova_tree explaining why it is needed and when.

Also comment on the map/unmap messages: how they are used and what that
implies (e.g. an unmap can be larger than the mapped ranges).

Suggested-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20230109193727.1360190-1-peterx@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

parent bad9c5a516
commit 8a7c606016
@@ -129,6 +129,32 @@ struct IOMMUTLBEntry {
 /*
  * Bitmap for different IOMMUNotifier capabilities. Each notifier can
  * register with one or multiple IOMMU Notifier capability bit(s).
+ *
+ * Normally there're two use cases for the notifiers:
+ *
+ *   (1) When the device needs accurate synchronizations of the vIOMMU page
+ *       tables, it needs to register with both MAP|UNMAP notifies (which
+ *       is defined as IOMMU_NOTIFIER_IOTLB_EVENTS below).
+ *
+ *       Regarding to accurate synchronization, it's when the notified
+ *       device maintains a shadow page table and must be notified on each
+ *       guest MAP (page table entry creation) and UNMAP (invalidation)
+ *       events (e.g. VFIO). Both notifications must be accurate so that
+ *       the shadow page table is fully in sync with the guest view.
+ *
+ *   (2) When the device doesn't need accurate synchronizations of the
+ *       vIOMMU page tables, it needs to register only with UNMAP or
+ *       DEVIOTLB_UNMAP notifies.
+ *
+ *       It's when the device maintains a cache of IOMMU translations
+ *       (IOTLB) and is able to fill that cache by requesting translations
+ *       from the vIOMMU through a protocol similar to ATS (Address
+ *       Translation Service).
+ *
+ *       Note that in this mode the vIOMMU will not maintain a shadowed
+ *       page table for the address space, and the UNMAP messages can cover
+ *       more than the pages that used to get mapped. The IOMMU notifiee
+ *       should be able to take care of over-sized invalidations.
  */
 typedef enum {
     IOMMU_NOTIFIER_NONE = 0,
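To illustrate use case (2) from the comment in this hunk, here is a minimal sketch of an UNMAP-only notifiee that tolerates over-sized invalidations. It is not QEMU code: `DevIotlb`, `CacheEntry`, `dev_iotlb_fill`, `dev_iotlb_unmap`, `dev_iotlb_count` and the fixed 4K page size are all hypothetical names and assumptions chosen for the example. The point it tries to show is that the handler drops whatever overlaps the notified range and silently ignores the parts of the range that were never cached.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define CACHE_SLOTS  8          /* hypothetical cache size */
#define PAGE_MASK_4K 0xfffULL   /* assumption: every entry is one 4K page */

/* One cached translation; illustrative only, not a QEMU structure. */
typedef struct {
    bool     valid;
    uint64_t iova;        /* page-aligned start of the cached range */
    uint64_t addr_mask;   /* size - 1, e.g. 0xfff for a 4K page */
    uint64_t translated;  /* translated address for that page */
} CacheEntry;

typedef struct {
    CacheEntry slot[CACHE_SLOTS];
} DevIotlb;

/* Fill a free slot, as if after an ATS-like translation request. */
static bool dev_iotlb_fill(DevIotlb *c, uint64_t iova, uint64_t translated)
{
    assert((iova & PAGE_MASK_4K) == 0);
    for (size_t i = 0; i < CACHE_SLOTS; i++) {
        if (!c->slot[i].valid) {
            c->slot[i] = (CacheEntry){ true, iova, PAGE_MASK_4K, translated };
            return true;
        }
    }
    return false; /* cache full */
}

/*
 * Handle an UNMAP notification covering [iova, iova + addr_mask].
 * The range may be far larger than anything that was ever cached, so
 * every overlapping entry is dropped and the rest of the range is ignored.
 * Returns the number of entries dropped.
 */
static size_t dev_iotlb_unmap(DevIotlb *c, uint64_t iova, uint64_t addr_mask)
{
    uint64_t last = iova + addr_mask;
    size_t dropped = 0;

    if (last < iova) {
        last = UINT64_MAX; /* clamp on wrap-around */
    }
    for (size_t i = 0; i < CACHE_SLOTS; i++) {
        CacheEntry *e = &c->slot[i];
        if (e->valid && e->iova <= last && e->iova + e->addr_mask >= iova) {
            e->valid = false;
            dropped++;
        }
    }
    return dropped;
}

/* Count the entries that are still valid. */
static size_t dev_iotlb_count(const DevIotlb *c)
{
    size_t n = 0;
    for (size_t i = 0; i < CACHE_SLOTS; i++) {
        n += c->slot[i].valid ? 1 : 0;
    }
    return n;
}
```

With this shape, a whole-address-space invalidation such as `dev_iotlb_unmap(&cache, 0, UINT64_MAX)` is just an expensive flush, which is why no shadow state is needed on the vIOMMU side in this mode.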
@@ -109,7 +109,43 @@ struct VTDAddressSpace {
     QLIST_ENTRY(VTDAddressSpace) next;
     /* Superset of notifier flags that this address space has */
     IOMMUNotifierFlag notifier_flags;
-    IOVATree *iova_tree; /* Traces mapped IOVA ranges */
+    /*
+     * @iova_tree traces mapped IOVA ranges.
+     *
+     * The tree is not needed if no MAP notifier is registered with current
+     * VTD address space, because all guest invalidate commands can be
+     * directly passed to the IOMMU UNMAP notifiers without any further
+     * reshuffling.
+     *
+     * The tree OTOH is required for MAP typed iommu notifiers for a few
+     * reasons.
+     *
+     * Firstly, there's no way to identify whether an PSI (Page Selective
+     * Invalidations) or DSI (Domain Selective Invalidations) event is an
+     * MAP or UNMAP event within the message itself. Without having prior
+     * knowledge of existing state vIOMMU doesn't know whether it should
+     * notify MAP or UNMAP for a PSI message it received when caching mode
+     * is enabled (for MAP notifiers).
+     *
+     * Secondly, PSI messages received from guest driver can be enlarged in
+     * range, covers but not limited to what the guest driver wanted to
+     * invalidate. When the range to invalidates gets bigger than the
+     * limit of a PSI message, it can even become a DSI which will
+     * invalidate the whole domain. If the vIOMMU directly notifies the
+     * registered device with the unmodified range, it may confuse the
+     * registered drivers (e.g. vfio-pci) on either:
+     *
+     *   (1) Trying to map the same region more than once (for
+     *       VFIO_IOMMU_MAP_DMA, -EEXIST will trigger), or,
+     *
+     *   (2) Trying to UNMAP a range that is still partially mapped.
+     *
+     * That accuracy is not required for UNMAP-only notifiers, but it is a
+     * must-to-have for notifiers registered with MAP events, because the
+     * vIOMMU needs to make sure the shadow page table is always in sync
+     * with the guest IOMMU pgtables for a device.
+     */
+    IOVATree *iova_tree;
 };
 
 struct VTDIOTLBEntry {
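The two reasons given in the iova_tree comment can be sketched in a few lines of C. This is a simplified model, not the intel-iommu implementation: `ShadowState`, `sync_range`, the `NPAGES` page-indexed arrays and the counters are hypothetical stand-ins (the `tracked` array plays the role the comment assigns to the iova_tree, and `guest_present` stands in for walking the guest page tables). It also assumes a page is either present or not, and ignores the case where a present page is remapped to a different target.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define NPAGES 16   /* hypothetical tiny IOVA space, one slot per 4K page */

typedef struct {
    bool guest_present[NPAGES]; /* stand-in for the guest page table walk */
    bool tracked[NPAGES];       /* stand-in for the iova_tree contents */
    unsigned maps_sent;         /* MAP notifications delivered so far */
    unsigned unmaps_sent;       /* UNMAP notifications delivered so far */
} ShadowState;

/*
 * Replay an invalidation covering pages [first, first + count).
 * The message itself does not say whether it means MAP or UNMAP, and the
 * range may be larger than what the guest changed, so each page is
 * compared against the tracked state and only real differences are
 * notified.
 */
static void sync_range(ShadowState *s, size_t first, size_t count)
{
    assert(first <= NPAGES && count <= NPAGES - first);

    for (size_t pg = first; pg < first + count; pg++) {
        if (s->guest_present[pg] && !s->tracked[pg]) {
            /* Newly mapped by the guest: notify MAP exactly once. */
            s->tracked[pg] = true;
            s->maps_sent++;
        } else if (!s->guest_present[pg] && s->tracked[pg]) {
            /* Previously mapped, now gone: notify UNMAP for this page. */
            s->tracked[pg] = false;
            s->unmaps_sent++;
        }
        /* Otherwise the shadow already matches the guest: send nothing. */
    }
}
```

In this model an enlarged, domain-wide invalidation (`sync_range(&s, 0, NPAGES)`) over an unchanged address space sends no notifications at all, so a consumer in the style of vfio-pci would neither see a duplicate map nor an unmap of a range that is still in use.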