==========================
Vector Predication Roadmap
==========================

.. contents:: Table of Contents
  :depth: 3
  :local:

Motivation
==========

This proposal defines a roadmap towards native vector predication in LLVM,
specifically for vector instructions with a mask and/or an explicit vector
length. LLVM currently has no target-independent means to model predicated
vector instructions for modern SIMD ISAs such as AVX512, ARM SVE, the RISC-V V
extension and NEC SX-Aurora. Only some predicated vector operations, such as
masked loads and stores, are available through intrinsics [MaskedIR]_.

The Vector Predication (VP) extension is a concrete RFC and prototype
implementation of native vector predication in LLVM. The VP prototype and all
related discussion can be found in the VP patch on Phabricator [VPRFC]_.

Roadmap
=======

1. IR-level VP intrinsics
-------------------------

- There is consensus on the semantics and instruction set of VP.
- VP intrinsics and attributes are available at the IR level.
- TTI has capability flags for VP (``supportsVP()``?,
  ``haveActiveVectorLength()``?).

Result: VP is usable by IR-level vectorizers (LV, VPlan, RegionVectorizer),
with potential integration in Clang through builtins.
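
The mask and explicit vector length semantics that this stage aims to settle
can be illustrated with a scalar model. The following is a minimal sketch, not
LLVM code: ``vpAddModel`` is a hypothetical name, and the sketch assumes that a
lane is active only when its index is below the explicit vector length and its
mask bit is set, and that inactive lanes carry no defined value (modelled here
as ``std::nullopt``).

```cpp
#include <array>
#include <cstddef>
#include <optional>

// Hypothetical scalar model of a predicated lane-wise add; not an LLVM API.
// Assumption: lane i is active only if i < evl and mask[i] is true.
template <std::size_t N>
std::array<std::optional<int>, N>
vpAddModel(const std::array<int, N> &a, const std::array<int, N> &b,
           const std::array<bool, N> &mask, std::size_t evl) {
  // Every lane starts out as std::nullopt, i.e. without a defined value.
  std::array<std::optional<int>, N> result{};
  for (std::size_t i = 0; i < N; ++i) {
    if (i < evl && mask[i])
      result[i] = a[i] + b[i]; // only active lanes are computed
  }
  return result;
}
```

For four lanes with mask ``{1, 0, 1, 1}`` and an explicit vector length of 3,
only lanes 0 and 2 hold a sum: lane 1 is masked off and lane 3 lies beyond the
vector length.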

2. CodeGen support
------------------

- VP intrinsics translate to first-class SDNodes
  (e.g. ``llvm.vp.fdiv.* -> vp_fdiv``).
- VP legalization: legalize the explicit vector length to a mask (AVX512) and
  legalize VP SDNodes to pre-existing ones (SSE, NEON).

Result: Backend development based on VP SDNodes.
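
Legalizing the explicit vector length to a mask, as listed for mask-only
targets above, amounts to switching off every lane at or beyond the vector
length. The sketch below shows that folding step on plain arrays. It is an
illustration under stated assumptions rather than the SelectionDAG
implementation: ``foldEVLIntoMask`` is a hypothetical helper, and the explicit
vector length is assumed never to exceed the number of lanes.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical helper, not an LLVM API: fold an explicit vector length (evl)
// into the mask so that a mask-only target needs no separate length operand.
template <std::size_t N>
std::array<bool, N> foldEVLIntoMask(const std::array<bool, N> &mask,
                                    std::size_t evl) {
  // Assumption of this sketch: evl never exceeds the vector width.
  assert(evl <= N && "explicit vector length exceeds the vector width");
  std::array<bool, N> folded{};
  for (std::size_t i = 0; i < N; ++i)
    folded[i] = mask[i] && i < evl; // a lane survives only below evl
  return folded;
}
```

With mask ``{1, 1, 0, 1}`` and an explicit vector length of 2, the folded mask
is ``{1, 1, 0, 0}``. The operation can then run over the full vector width
with the folded mask alone.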

3. Lift InstSimplify/InstCombine/DAGCombiner to VP
--------------------------------------------------

- Introduce PredicatedInstruction, PredicatedBinaryOperator, ... helper classes
  that match standard vector IR and VP intrinsics.
- Add a matcher context to PatternMatch and context-aware IR Builder APIs.
- Incrementally lift DAGCombiner to work on VP SDNodes as well as on regular
  vector instructions.
- Incrementally lift InstCombine/InstSimplify to operate on VP as well as
  regular IR instructions.

Result: Optimization of VP intrinsics on par with standard vector instructions.
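
The idea behind helper classes that match both forms can be sketched as a
uniform view: a regular vector operation is treated as a predicated one with
an all-true mask and a full vector length, so a single pattern can cover both.
All types below (``PlainVectorOp``, ``VPOp``, ``PredicatedView``) are
hypothetical stand-ins for illustration and do not reflect the interface of
the proposed classes.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

enum class Opcode { Add, FDiv };

// Hypothetical stand-ins for a regular vector instruction and a VP intrinsic.
struct PlainVectorOp {
  Opcode op;
  std::size_t numLanes;
};
struct VPOp {
  Opcode op;
  std::size_t numLanes;
  std::vector<bool> mask;
  std::size_t evl;
};

// Uniform view over both forms; illustration only, not the proposed API.
struct PredicatedView {
  Opcode op;
  std::vector<bool> mask;
  std::size_t evl;

  // A regular operation is viewed as all lanes active at full length.
  static PredicatedView from(const PlainVectorOp &inst) {
    return {inst.op, std::vector<bool>(inst.numLanes, true), inst.numLanes};
  }
  static PredicatedView from(const VPOp &inst) {
    return {inst.op, inst.mask, inst.evl};
  }

  // True when no lane is disabled by either the mask or the vector length.
  bool isUnpredicated() const {
    return evl == mask.size() &&
           std::all_of(mask.begin(), mask.end(), [](bool b) { return b; });
  }
};
```

A transformation written against such a view can check the opcode once and
consult ``mask`` and ``evl`` only where predication affects its validity.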

4. Deprecate llvm.masked.* / llvm.experimental.reduce.*
-------------------------------------------------------

- Modernize ``llvm.masked.*`` / ``llvm.experimental.reduce.*`` by translating
  them to VP.
- Remove (DCE) the transitional APIs.

Result: VP has superseded earlier vector intrinsics.
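
One way such a translation could look for a masked load with a pass-through
operand is a predicated load at full vector length followed by a lane-wise
select. The scalar model below is a sketch of that idea only. The chosen
decomposition is an assumption made for illustration, and ``vpLoadModel`` and
``maskedLoadViaVP`` are hypothetical names, not LLVM functions.

```cpp
#include <array>
#include <cstddef>
#include <optional>

template <std::size_t N> using Lanes = std::array<std::optional<int>, N>;

// Hypothetical model of a predicated load: only active lanes touch memory,
// all other lanes stay std::nullopt (no defined value).
template <std::size_t N>
Lanes<N> vpLoadModel(const int *mem, const std::array<bool, N> &mask,
                     std::size_t evl) {
  Lanes<N> out{};
  for (std::size_t i = 0; i < N; ++i)
    if (i < evl && mask[i])
      out[i] = mem[i];
  return out;
}

// Masked load with pass-through, expressed through the predicated model:
// load at full vector length, then select between loaded and pass-through.
template <std::size_t N>
std::array<int, N> maskedLoadViaVP(const int *mem,
                                   const std::array<bool, N> &mask,
                                   const std::array<int, N> &passthru) {
  Lanes<N> loaded = vpLoadModel<N>(mem, mask, N);
  std::array<int, N> out{};
  for (std::size_t i = 0; i < N; ++i)
    out[i] = mask[i] ? *loaded[i] : passthru[i]; // loaded[i] is set if mask[i]
  return out;
}
```

With memory ``{5, 6, 7, 8}``, mask ``{1, 0, 0, 1}`` and pass-through
``{-1, -2, -3, -4}``, the result is ``{5, -2, -3, 8}``, and the masked-off
memory locations are never read.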

5. Predicated IR Instructions
-----------------------------

- Vector instructions have an optional mask and vector length parameter. These
  lower to VP SDNodes (from Stage 2).
- Phase out VP intrinsics, only keeping those that are not equivalent to
  vectorized scalar instructions (reduce, shuffles, ...).
- InstCombine/InstSimplify expect predication in regular Instructions (Stage 3
  has laid the groundwork).

Result: Native vector predication in IR.

References
==========

.. [MaskedIR] ``llvm.masked.*`` intrinsics,
   https://llvm.org/docs/LangRef.html#masked-vector-load-and-store-intrinsics

.. [VPRFC] RFC: Prototype & Roadmap for vector predication in LLVM,
   https://reviews.llvm.org/D57504