No. Unless I'm missing something clever, `ptest` with two unknown registers is generally not useful for checking some property of both of them (other than the obvious cases where you'd already want a bitwise AND, like testing whether two bitmaps intersect).
To test two registers for both being all-zero, OR them together and PTEST that against itself.
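A minimal sketch of that OR-then-PTEST idea with intrinsics. The function name `both_all_zero` is made up for illustration, and the `target` attribute is a GCC/clang extension used here only so the snippet builds without `-msse4.1`; drop it if you compile with that flag or use another compiler.

```c
#include <smmintrin.h>  /* SSE4.1: _mm_testz_si128 (also pulls in SSE2 _mm_or_si128) */

/* Hypothetical helper: returns 1 if every bit of both a and b is zero, else 0. */
__attribute__((target("sse4.1")))
int both_all_zero(__m128i a, __m128i b)
{
    __m128i either = _mm_or_si128(a, b);     /* POR: bit set if set in a or in b */
    return _mm_testz_si128(either, either);  /* PTEST: ZF = ((either & either) == 0) */
}
```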
----
`ptest xmm0, xmm1` produces two results:
* ZF = is `xmm0 & xmm1` all-zero?
* CF = is `(~xmm0) & xmm1` all-zero?
**If the second vector is all-zero, the flags don't depend at all on the bits in the first vector.**
It may be useful to think of the "is-all-zero" checks as a `NOT(bitwise horizontal-OR())` of the AND and ANDNOT results, although that's a lot of steps for my brain to think through easily. Still, that sequence of a vertical AND followed by a horizontal OR does maybe make it easier to understand why PTEST doesn't tell you much about a combination of two unknown registers, just like the integer TEST instruction.
Here's a truth table for a 2-bit `ptest a,mask`. Hopefully this helps in thinking about mixes of zeros and ones with 128b inputs.
Note that `CF(a,mask) == ZF(~a,mask)`.

    a    mask    ZF    CF
    00   00      1     1
    01   00      1     1
    10   00      1     1
    11   00      1     1
    00   01      1     0
    01   01      0     1
    10   01      1     0
    11   01      0     1
    00   10      1     0
    01   10      1     0
    10   10      0     1
    11   10      0     1
    00   11      1     0
    01   11      0     0
    10   11      0     0
    11   11      0     1

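If you'd rather regenerate the table than trust it, here is a small scalar model of the two flag definitions. It is only a sketch in plain C, not the real instruction: `ptest_model`, `struct ptest_flags` and `print_truth_table` are names invented for this example, and `bits` is assumed to be smaller than the width of `unsigned`.

```c
#include <stdio.h>

/* Flags of a modelled `ptest a, mask` restricted to the low `bits` bits. */
struct ptest_flags { int zf, cf; };

struct ptest_flags ptest_model(unsigned a, unsigned mask, unsigned bits)
{
    unsigned width = (1u << bits) - 1;   /* keep only the modelled bits */
    struct ptest_flags f;
    f.zf = ((a & mask & width) == 0);    /* ZF: (a AND mask) is all-zero */
    f.cf = ((~a & mask & width) == 0);   /* CF: ((NOT a) AND mask) is all-zero */
    return f;
}

/* Prints the 2-bit table, mask in the outer loop, one row per (a, mask) pair. */
void print_truth_table(void)
{
    puts("a  mask ZF CF");
    for (unsigned mask = 0; mask < 4; mask++) {
        for (unsigned a = 0; a < 4; a++) {
            struct ptest_flags f = ptest_model(a, mask, 2);
            printf("%u%u %u%u   %d  %d\n",
                   a >> 1, a & 1, mask >> 1, mask & 1, f.zf, f.cf);
        }
    }
}
```

Calling `print_truth_table()` from `main` prints the 16 rows; changing the `2` to a larger width is an easy way to play with mixes of zeros and ones.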
---
[Intel's intrinsics guide lists 2 interesting intrinsics for it][1]. Note the naming of the args: `a` and `mask` are a clue that they tell you about the parts of `a` selected by a known AND-mask.
* `_mm_test_mix_ones_zeros (__m128i a, __m128i mask)`: returns `(ZF == 0 && CF == 0)`
* `_mm_test_all_zeros (__m128i a, __m128i mask)`: returns `ZF`
There are also the more simply named versions:
* `int _mm_testc_si128 (__m128i a, __m128i b)`: returns `CF`
* `int _mm_testnzc_si128 (__m128i a, __m128i b)`: returns `(ZF == 0 && CF == 0)`
* `int _mm_testz_si128 (__m128i a, __m128i b)`: returns `ZF`
There are 256-bit `__m256i` versions of those intrinsics (e.g. `_mm256_testz_si256`); note that VPTEST on YMM registers only requires AVX, not AVX2. The guide only lists the all_zeros and mix_ones_zeros alternate-name versions for `__m128i` operands, though.
If you want to test some other condition from C or C++, you should use `testc` and `testz` with the same operands, and hope that your compiler realizes that it only needs to do one PTEST, and hopefully even uses a single JCC, SETCC, or CMOVCC to implement your logic. (I'd recommend checking the asm, at least for the compiler you care about most.)
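As a sketch of what "`testc` and `testz` with the same operands" can look like, here is a three-way classification of the bits of `a` selected by `mask`. The enum and the function name `classify_selected` are hypothetical, the `target` attribute is a GCC/clang extension (drop it when building with `-msse4.1`), and whether this compiles to a single PTEST is up to your compiler, so check the asm.

```c
#include <smmintrin.h>  /* SSE4.1: _mm_testz_si128, _mm_testc_si128 */

/* Hypothetical result type for this example. */
enum selected_bits { SELECTED_ALL_ZERO, SELECTED_ALL_ONE, SELECTED_MIXED };

__attribute__((target("sse4.1")))
enum selected_bits classify_selected(__m128i a, __m128i mask)
{
    int zf = _mm_testz_si128(a, mask);  /* ZF: (a & mask) is all-zero */
    int cf = _mm_testc_si128(a, mask);  /* CF: (~a & mask) is all-zero */

    if (zf) return SELECTED_ALL_ZERO;   /* also taken when mask == 0 (ZF and CF both set) */
    if (cf) return SELECTED_ALL_ONE;    /* every selected bit of a is set */
    return SELECTED_MIXED;              /* ZF == 0 && CF == 0, same as testnzc */
}
```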
----
Note that `_mm_testz_si128(v, set1(0xff))` is always the same as `_mm_testz_si128(v,v)`, because that's how AND works. But that's not true for the CF result.
**You can check for a vector being all-ones** using:

    bool is_all_ones = _mm_testc_si128(v, _mm_set1_epi8(0xff));

This is probably no faster than a PCMPEQB against a vector of all-ones followed by the usual movemask + cmp, but it is smaller in code size. It doesn't avoid the need for a vector constant.
PTEST does have the advantage that it doesn't destroy either input operand, even without AVX.
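For comparison, the two all-ones checks discussed above might be written like this. Treat it as a sketch: the function names are invented, x86-64 is assumed (so SSE2 is baseline), the `target` attribute is a GCC/clang extension, and `_mm_set1_epi8(-1)` is used for the same byte pattern as `0xff`.

```c
#include <smmintrin.h>  /* SSE4.1 _mm_testc_si128, plus the SSE2 intrinsics used below */

/* PTEST version: CF is set when (~v & all_ones) == 0, i.e. v has no zero bit. */
__attribute__((target("sse4.1")))
int all_ones_ptest(__m128i v)
{
    return _mm_testc_si128(v, _mm_set1_epi8(-1));
}

/* SSE2-only version: PCMPEQB against all-ones, then PMOVMSKB + compare. */
int all_ones_cmpeq(__m128i v)
{
    __m128i eq = _mm_cmpeq_epi8(v, _mm_set1_epi8(-1));  /* 0xFF in each byte that equals 0xFF */
    return _mm_movemask_epi8(eq) == 0xFFFF;             /* all 16 bytes matched */
}
```

Both need the all-ones constant; compilers typically materialize it with a PCMPEQD of a register against itself rather than loading it from memory.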
[1]: