3222 Commits

Author SHA1 Message Date
Richard Henderson
2b7b695757 tcg/optimize: Use fold_masks_s in fold_nor
Avoid the use of the OptContext slots.

Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2024-12-24 08:32:15 -08:00
Richard Henderson
d151fd34b0 tcg/optimize: Use fold_masks_z in fold_neg_no_const
Avoid the use of the OptContext slots.

Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2024-12-24 08:32:15 -08:00
Richard Henderson
fa3168ee93 tcg/optimize: Use fold_masks_s in fold_nand
Avoid the use of the OptContext slots.

Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2024-12-24 08:32:15 -08:00
Richard Henderson
cd9c5834d8 tcg/optimize: Use finish_folding in fold_mul*
Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2024-12-24 08:32:15 -08:00
Richard Henderson
322027841f tcg/optimize: Use fold_masks_zs in fold_movcond
Avoid the use of the OptContext slots.  Find TempOptInfo once.

Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2024-12-24 08:32:15 -08:00
Richard Henderson
08abe2908f tcg/optimize: Use fold_masks_z in fold_extu
Avoid the use of the OptContext slots.

Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2024-12-24 08:32:15 -08:00
Richard Henderson
a96219204f tcg/optimize: Use fold_masks_zs in fold_exts
Avoid the use of the OptContext slots.  Find TempOptInfo once.
Explicitly sign-extend z_mask instead of doing that manually.

Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2024-12-24 08:32:15 -08:00
Richard Henderson
c9df99ee8d tcg/optimize: Use finish_folding in fold_extract2
Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2024-12-24 08:32:15 -08:00
Richard Henderson
b6cd00f1ef tcg/optimize: Use fold_masks_z in fold_extract
Avoid the use of the OptContext slots.  Find TempOptInfo once.

Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2024-12-24 08:32:15 -08:00
Richard Henderson
ef6be624f6 tcg/optimize: Use fold_masks_s in fold_eqv
Add fold_masks_s as a trivial wrapper around fold_masks_zs.
Avoid the use of the OptContext slots.

Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2024-12-24 08:32:15 -08:00
Richard Henderson
e089d694e1 tcg/optimize: Use finish_folding in fold_dup, fold_dup2
Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2024-12-24 08:32:15 -08:00
Richard Henderson
3d5ec804da tcg/optimize: Use finish_folding in fold_divide
Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2024-12-24 08:32:14 -08:00
Richard Henderson
edb832cb51 tcg/optimize: Compute sign mask in fold_deposit
The input which overlaps the sign bit of the output can
have its input s_mask propagated to the output s_mask.

Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2024-12-24 08:32:14 -08:00
Richard Henderson
c7739ab83e tcg/optimize: Use fold_and and fold_masks_z in fold_deposit
Avoid the use of the OptContext slots.  Find TempOptInfo once.
When we fold to and, use fold_and.

Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2024-12-24 08:32:14 -08:00
Richard Henderson
81be07f905 tcg/optimize: Use fold_masks_z in fold_ctpop
Add fold_masks_z as a trivial wrapper around fold_masks_zs.
Avoid the use of the OptContext slots.

Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2024-12-24 08:32:14 -08:00
Richard Henderson
ce1d663ff8 tcg/optimize: Use fold_masks_zs in fold_count_zeros
Avoid the use of the OptContext slots. Find TempOptInfo once.
Compute s_mask from the union of the maximum count and the
op2 fallback for op1 being zero.

Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2024-12-24 08:32:14 -08:00
Richard Henderson
c1e7b989c8 tcg/optimize: Use fold_masks_zs in fold_bswap
Avoid the use of the OptContext slots.  Find TempOptInfo once.
Always set s_mask along the BSWAP_OS path, since the result is
being explicitly sign-extended.

Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2024-12-24 08:32:14 -08:00
Richard Henderson
21e2b5f9fa tcg/optimize: Use fold_masks_zs in fold_andc
Avoid the use of the OptContext slots.  Find TempOptInfo once.
Avoid double inversion of the value of second const operand.

Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2024-12-24 08:32:14 -08:00
Richard Henderson
1ca7372c03 tcg/optimize: Use fold_masks_zs in fold_and
Avoid the use of the OptContext slots.  Find TempOptInfo once.
Sink mask computation below fold_affected_mask early exit.

Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2024-12-24 08:32:14 -08:00
Richard Henderson
e1b6c141e9 tcg/optimize: Introduce const value accessors for TempOptInfo
Introduce ti_is_const, ti_const_val, ti_is_const_val.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2024-12-24 08:32:14 -08:00
Richard Henderson
f3ed3cffb9 tcg/optimize: Use finish_folding in fold_add, fold_add_vec, fold_addsub2
Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2024-12-24 07:32:55 -08:00
Richard Henderson
6d70ddc635 tcg/optimize: Change representation of s_mask
Change the representation from sign bit repetitions to all bits equal
to the sign bit, including the sign bit itself.

The previous format has a problem in that it is difficult to recreate
a valid sign mask after a shift operation: the "repetitions" part of
the previous format meant that applying the same shift as for the value
lead to an off-by-one value.

The new format, including the sign bit itself, means that the sign mask
can be manipulated in exactly the same way as the value, canonicalization
is easier.

Canonicalize the s_mask in fold_masks_zs, rather than requiring callers
to do so.  Treat 0 as a non-canonical but typeless input for no sign
information, which will be reset as appropriate for the data type.
We can easily fold in the data from z_mask while canonicalizing.

Temporarily disable optimizations using s_mask while each operation is
converted to use fold_masks_zs and to the new form.

Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2024-12-24 07:32:51 -08:00
Richard Henderson
75c3bf324d tcg/optimize: Augment s_mask from z_mask in fold_masks_zs
Consider the passed s_mask to be a minimum deduced from
either existing s_mask or from a sign-extension operation.
We may be able to deduce more from the set of known zeros.
Remove identical logic from several opcode folders.

Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2024-12-24 07:32:51 -08:00
Richard Henderson
d582b14d80 tcg/optimize: Split out fold_masks_zs
Add a routine to which masks can be passed directly, rather than
storing them into OptContext.  To be used in upcoming patches.

Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2024-12-24 07:32:50 -08:00
Richard Henderson
56e06ecfa5 tcg/optimize: Copy mask writeback to fold_masks
Use of fold_masks should be restricted to those opcodes that
can reliably make use of it -- those with a single output,
and from higher-level folders that set up the masks.
Prepare for conversion of each folder in turn.

Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2024-12-24 07:32:50 -08:00
Richard Henderson
045ace35a8 tcg/optimize: Split out fold_affected_mask
There are only a few logical operations which can compute
an "affected" mask.  Split out handling of this optimization
to a separate function, only to be called when applicable.

Remove the a_mask field from OptContext, as the mask is
no longer stored anywhere.

Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2024-12-24 07:32:50 -08:00
Richard Henderson
1526855c01 tcg/optimize: Split out finish_bb, finish_ebb
Call them directly from the opcode switch statement in tcg_optimize,
rather than in finish_folding based on opcode flags.  Adjust folding
of conditional branches to match.

Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2024-12-24 07:32:50 -08:00
Philippe Mathieu-Daudé
069ea4c825 tcg/tci: Include missing 'disas/dis-asm.h' header
"disas/dis-asm.h" defines bfd_vma and disassemble_info,
include it in order to avoid (when refactoring other
headers):

  tcg/tci.c:1066:20: error: unknown type name 'bfd_vma'
  int print_insn_tci(bfd_vma addr, disassemble_info *info)
                     ^
  tcg/tci.c:1066:34: error: unknown type name 'disassemble_info'
  int print_insn_tci(bfd_vma addr, disassemble_info *info)
                                   ^

Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20241218155202.71931-3-philmd@linaro.org>
2024-12-20 17:44:56 +01:00
Roman Artemev
242376e872 tcg/riscv: Fix StoreStore barrier generation
On RISC-V to StoreStore barrier corresponds
`fence w, w` not `fence r, r`

Cc: qemu-stable@nongnu.org
Fixes: efbea94c76b ("tcg/riscv: Add slowpath load and store instructions")
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Denis Tomashev <denis.tomashev@syntacore.com>
Signed-off-by: Roman Artemev <roman.artemev@syntacore.com>
Message-ID: <e2f2131e294a49e79959d4fa9ec02cf4@syntacore.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
(cherry picked from commit b438362a142527b97b638b7f0f35ebe11911a8d5)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2024-12-13 15:54:32 +03:00
Richard Henderson
f838a7e365 tcg: Reset free_temps before tcg_optimize
When allocating new temps during tcg_optmize, do not re-use
any EBB temps that were used within the TB.  We do not have
any idea what span of the TB in which the temp was live.

Introduce tcg_temp_ebb_reset_freed and use before tcg_optimize,
as well as replacing the equivalent in plugin_gen_inject and
tcg_func_start.

Cc: qemu-stable@nongnu.org
Fixes: fb04ab7ddd8 ("tcg/optimize: Lower TCG_COND_TST{EQ,NE} if unsupported")
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2711
Reported-by: wannacu <wannacu2049@gmail.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
(cherry picked from commit 04e006ab36a8565b92d4e21dd346367fbade7d74)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2024-12-13 15:51:03 +03:00
Roman Artemev
b438362a14 tcg/riscv: Fix StoreStore barrier generation
On RISC-V to StoreStore barrier corresponds
`fence w, w` not `fence r, r`

Cc: qemu-stable@nongnu.org
Fixes: efbea94c76b ("tcg/riscv: Add slowpath load and store instructions")
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Denis Tomashev <denis.tomashev@syntacore.com>
Signed-off-by: Roman Artemev <roman.artemev@syntacore.com>
Message-ID: <e2f2131e294a49e79959d4fa9ec02cf4@syntacore.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2024-12-12 14:28:38 -06:00
Richard Henderson
04e006ab36 tcg: Reset free_temps before tcg_optimize
When allocating new temps during tcg_optmize, do not re-use
any EBB temps that were used within the TB.  We do not have
any idea what span of the TB in which the temp was live.

Introduce tcg_temp_ebb_reset_freed and use before tcg_optimize,
as well as replacing the equivalent in plugin_gen_inject and
tcg_func_start.

Cc: qemu-stable@nongnu.org
Fixes: fb04ab7ddd8 ("tcg/optimize: Lower TCG_COND_TST{EQ,NE} if unsupported")
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2711
Reported-by: wannacu <wannacu2049@gmail.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
2024-12-12 14:28:38 -06:00
Peter Maydell
8377e3fb85 tcg: Allow top bit of SIMD_DATA_BITS to be set in simd_desc()
In simd_desc() we create a SIMD descriptor from various pieces
including an arbitrary data value from the caller.  We try to
sanitize these to make sure everything will fit: the 'data' value
needs to fit in the SIMD_DATA_BITS (== 22) sized field.  However we
do that sanitizing with:
   tcg_debug_assert(data == sextract32(data, 0, SIMD_DATA_BITS));

This works for the case where the data is supposed to be considered
as a signed integer (which can then be returned via simd_data()).
However, some callers want to treat the data value as unsigned.

Specifically, for the Arm SVE operations, make_svemte_desc()
assembles a data value as a collection of fields, and it needs to use
all 22 bits.  Currently if MTE is enabled then its MTEDESC SIZEM1
field may have the most significant bit set, and then it will trip
this assertion.

Loosen the assertion so that we only check that the data value will
fit into the field in some way, either as a signed or as an unsigned
value.  This means we will fail to detect some kinds of bug in the
callers, but we won't spuriously assert for intentional use of the
data field as unsigned.

Cc: qemu-stable@nongnu.org
Fixes: db432672dc50e ("tcg: Add generic vector expanders")
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2601
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-ID: <20241115172515.1229393-1-peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2024-11-16 08:42:25 -08:00
Romain Malmain
b01a0bc334
Fix helper function calls & support for new x86 decoder (#92)
* fix helper function calls

* cmp hooks: support for new x86 decoder
2024-10-31 16:31:54 +01:00
Romain Malmain
67dabac1ed v9.1.1 release
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEZKoqtTHVaQM2a/75gqpKJDselHgFAmcScB0ACgkQgqpKJDse
 lHgQ7g/7BIWV/LC7MqFmHlXl9S0S7ZHVsDc2x6Bx97Sk4sKAUKLvRsLFMa5F40Fn
 xY8v/aLsqOTmzWz38hdtgJR0rrv8DykWw9ft9nta2tFg20tilL/LaakT8TLKmjK2
 StZFzk7iijnY78Z3RcVliBTStLoPbOx9WCUs2evCV/qTxQDec1A7u4ukG9cAztGn
 ea8pNnKNgk+BN805w1uMMZ1wnh3FTVs9kdXVh7CzXlRAHHkVHQ47C9ZN6vh6N3xs
 3qj/Obi4k1N81NNRJFA4gR02t82LdPhg/WV33/q9TxSmHyZEmNXg0lRlDyIeSbpw
 bqYY+dsBbGyMJgN/LUZMNjPAfQL4S5VicFJcfKTXr6xYtkhqtlCun1kmI7O+ZIY5
 kGQYbAAhyPkFIOU6XedyKxM+0eUDqrr9fyzyn5NfISzETQiGFccYjfk/4fsHGfS8
 nOBTNtYBpnEXFeUk/jvv6OPOsh2L+K0PKbGefFbCjNng9Ix3Kz5zEY8xhtlv7C6m
 9YyGGAS1zwcWapwq8URy01GWkiKT2Ia/gD7c89oGY1bJmQKYf9lrLX5YtP+d/NYs
 UqWmk046ViapiKDF7VXWtF0f5axYpeaMMhkNM5RtkOq57nez4LuKPaKs1emRC6W9
 LE2om+28dyGJqHeJp5fqigM+wPxRJlecR57sDIuq4n0bJcvzLEA=
 =240n
 -----END PGP SIGNATURE-----
gpgsig -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQSq9xYmtep25y1RrMYC5KE/dBVGigUCZxv7TAAKCRAC5KE/dBVG
 isCPAP43SCLPw/W/su5jPShfNn4fvHHiY1f0a6t3Kf6414aqvQD/XKmYGFGl4V5k
 XYnW/9D6Bp/k8gBSjKzYeIt0+Mt/AAQ=
 =cRil
 -----END PGP SIGNATURE-----

Merge tag 'v9.1.1' into update_qemu_9_1_0

v9.1.1 release
2024-10-25 22:10:51 +02:00
Dani Szebenyi
9a2a5f1b63 tcg/ppc: Fix tcg_out_rlw_rc
The TCG IR sequence:

  mov_i32 tmp97,$0xc4240000             dead: 1  pref=0xffffffff
  mov_i32 tmp98,$0x0                    pref=0xffffffff
  rotr_i32 tmp97,tmp97,tmp98            dead: 1 2  pref=0xffffffff

was translated to `slwi r15, r14, 0` instead of `slwi r14, r14, 0`
due to SH field overflow.  SH field is 5 bits, and tcg_out_rlw is called
in some situations with `32-n`, when `n` is 0 it results in an overflow
to RA field.

This commit prevents overflow of that field and adds debug assertions
for the other fields

Acked-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Dani Szebenyi <szedani@linux.ibm.com>
Message-ID: <20241022133535.69351-2-szedani@linux.ibm.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2024-10-22 13:45:03 -07:00
TANG Tiancheng
4b7868f8c2 tcg/riscv: Enable native vector support for TCG host
Signed-off-by: TANG Tiancheng <tangtiancheng.ttc@alibaba-inc.com>
Reviewed-by: Liu Zhiwei <zhiwei_liu@linux.alibaba.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-ID: <20241007025700.47259-13-zhiwei_liu@linux.alibaba.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2024-10-22 11:57:25 -07:00
TANG Tiancheng
d1843219a1 tcg/riscv: Implement vector roti/v/x ops
Signed-off-by: TANG Tiancheng <tangtiancheng.ttc@alibaba-inc.com>
Reviewed-by: Liu Zhiwei <zhiwei_liu@linux.alibaba.com>
Message-ID: <20241007025700.47259-12-zhiwei_liu@linux.alibaba.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2024-10-22 11:57:25 -07:00
TANG Tiancheng
cbde22f18b tcg/riscv: Implement vector shi/s/v ops
Signed-off-by: TANG Tiancheng <tangtiancheng.ttc@alibaba-inc.com>
Reviewed-by: Liu Zhiwei <zhiwei_liu@linux.alibaba.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-ID: <20241007025700.47259-11-zhiwei_liu@linux.alibaba.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2024-10-22 11:57:25 -07:00
TANG Tiancheng
1631f19b04 tcg/riscv: Implement vector min/max ops
Signed-off-by: TANG Tiancheng <tangtiancheng.ttc@alibaba-inc.com>
Reviewed-by: Liu Zhiwei <zhiwei_liu@linux.alibaba.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-ID: <20241007025700.47259-10-zhiwei_liu@linux.alibaba.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2024-10-22 11:57:25 -07:00
TANG Tiancheng
101c1ef562 tcg/riscv: Implement vector sat/mul ops
Signed-off-by: TANG Tiancheng <tangtiancheng.ttc@alibaba-inc.com>
Reviewed-by: Liu Zhiwei <zhiwei_liu@linux.alibaba.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-ID: <20241007025700.47259-9-zhiwei_liu@linux.alibaba.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2024-10-22 11:57:25 -07:00
Richard Henderson
dc9cd4ec12 tcg/riscv: Accept constant first argument to sub_vec
Use vrsub.vi to subtract from a constant.

Reviewed-by: LIU Zhiwei <zhiwei_liu@linux.alibaba.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2024-10-22 11:57:25 -07:00
TANG Tiancheng
c283c0748a tcg/riscv: Implement vector neg ops
Signed-off-by: TANG Tiancheng <tangtiancheng.ttc@alibaba-inc.com>
Reviewed-by: Liu Zhiwei <zhiwei_liu@linux.alibaba.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-ID: <20241007025700.47259-8-zhiwei_liu@linux.alibaba.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2024-10-22 11:57:25 -07:00
TANG Tiancheng
a31768c019 tcg/riscv: Implement vector cmp/cmpsel ops
Extend comparison results from mask registers to SEW-width elements,
following recommendations in The RISC-V SPEC Volume I (Version 20240411).
This aligns with TCG's cmp_vec behavior by expanding compare results to
full element width: all 1s for true, all 0s for false.

Signed-off-by: TANG Tiancheng <tangtiancheng.ttc@alibaba-inc.com>
Reviewed-by: Liu Zhiwei <zhiwei_liu@linux.alibaba.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-ID: <20241007025700.47259-7-zhiwei_liu@linux.alibaba.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2024-10-22 11:57:25 -07:00
TANG Tiancheng
5a63f59987 tcg/riscv: Add support for basic vector opcodes
Signed-off-by: TANG Tiancheng <tangtiancheng.ttc@alibaba-inc.com>
Reviewed-by: Liu Zhiwei <zhiwei_liu@linux.alibaba.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-ID: <20241007025700.47259-6-zhiwei_liu@linux.alibaba.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2024-10-22 11:57:25 -07:00
TANG Tiancheng
d4be6ee111 tcg/riscv: Implement vector mov/dup{m/i}
Signed-off-by: TANG Tiancheng <tangtiancheng.ttc@alibaba-inc.com>
Reviewed-by: Liu Zhiwei <zhiwei_liu@linux.alibaba.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-ID: <20241007025700.47259-5-zhiwei_liu@linux.alibaba.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2024-10-22 11:57:25 -07:00
Huang Shiyuan
f63e7089b4 tcg/riscv: Add basic support for vector
The RISC-V vector instruction set utilizes the LMUL field to group
multiple registers, enabling variable-length vector registers. This
implementation uses only the first register number of each group while
reserving the other register numbers within the group.

In TCG, each VEC_IR can have 3 types (TCG_TYPE_V64/128/256), and the
host runtime needs to adjust LMUL based on the type to use different
register groups.

This presents challenges for TCG's register allocation. Currently, we
avoid modifying the register allocation part of TCG and only expose the
minimum number of vector registers.

For example, when the host vlen is 64 bits and type is TCG_TYPE_V256, with
LMUL equal to 4, we use 4 vector registers as one register group. We can
use a maximum of 8 register groups, but the V0 register number is reserved
as a mask register, so we can effectively use at most 7 register groups.
Moreover, when type is smaller than TCG_TYPE_V256, only 7 registers are
forced to be used. This is because TCG cannot yet dynamically constrain
registers with type; likewise, when the host vlen is 128 bits and
TCG_TYPE_V256, we can use at most 15 registers.

There is not much pressure on vector register allocation in TCG now, so
using 7 registers is feasible and will not have a major impact on code
generation.

This patch:
1. Reserves vector register 0 for use as a mask register.
2. When using register groups, reserves the additional registers within
   each group.

Signed-off-by: Huang Shiyuan <swung0x48@outlook.com>
Co-authored-by: TANG Tiancheng <tangtiancheng.ttc@alibaba-inc.com>
Signed-off-by: TANG Tiancheng <tangtiancheng.ttc@alibaba-inc.com>
Reviewed-by: Liu Zhiwei <zhiwei_liu@linux.alibaba.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-ID: <20241007025700.47259-3-zhiwei_liu@linux.alibaba.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2024-10-22 11:57:25 -07:00
Richard Henderson
a7cfd751fb tcg: Reset data_gen_ptr correctly
This pointer needs to be reset after overflow just like
code_buf and code_ptr.

Cc: qemu-stable@nongnu.org
Fixes: 57a269469db ("tcg: Infrastructure for managing constant pools")
Acked-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Reviewed-by: LIU Zhiwei <zhiwei_liu@linux.alibaba.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2024-10-22 11:57:25 -07:00
Paolo Bonzini
615586cb35 tcg/s390x: fix constraint for 32-bit TSTEQ/TSTNE
32-bit TSTEQ and TSTNE is subject to the same constraints as
for 64-bit, but setcond_i32 and negsetcond_i32 were incorrectly
using TCG_CT_CONST ("i") instead of TCG_CT_CONST_CMP ("C").

Adjust the constraint and make tcg_target_const_match use the
same sequence as tgen_cmp2: first check if the constant is a
valid operand for TSTEQ/TSTNE, then accept everything for 32-bit
non-test comparisons, finally check if the constant is a valid
operand for 64-bit non-test comparisons.

Reported-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Cc: qemu-stable@nongnu.org

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-10-17 19:41:22 +02:00
Richard Henderson
c5809eee45 include/exec/memop: Rename get_alignment_bits
Rename to use "memop_" prefix, like other functions
that operate on MemOp.

Reviewed-by: Helge Deller <deller@gmx.de>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2024-10-13 11:27:03 -07:00