Intel Intrinsic SIMD translation

Some Intel intrinsic SIMD code translates directly into Jai inline assembly. Other cases are harder, especially when an intrinsic has no single corresponding SIMD instruction. This section lists some intrinsics that are difficult to translate, along with an effective way to translate each.

_mm256_set1_epi16

The Intel intrinsic _mm256_set1_epi16 initializes every 16-bit element of a 256-bit vector with the same scalar integer value. It does not correspond to any single Intel AVX instruction; it has to be built from a short instruction sequence.

The following C++ SIMD code snippet:

#include <immintrin.h>
int value = 1;
auto vector = _mm256_set1_epi16(value);

can be translated into:

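#asm AVX, AVX2 {
  // Move the 32-bit scalar into the low bits of xmm0.
  movd xmm0: vec, value;
  // Copy the low 16-bit element of xmm0 into every 16-bit element of vector.
  pbroadcastw vector: vec, xmm0;
}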

The movd instruction moves value into the low 32 bits of the xmm0 vector register, and pbroadcastw then copies the low 16-bit element of xmm0 into every 16-bit element of vector.
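
To make the data flow of the two instructions easier to follow, here is a minimal scalar sketch in C++ that models the registers as plain arrays of 16-bit lanes. It is not SIMD code and not part of the Jai translation; the names Xmm16, Ymm16, model_movd, model_pbroadcastw and model_set1_epi16 are hypothetical and exist only for this illustration.

```cpp
#include <array>
#include <cstdint>

// Scalar model of a 128-bit register viewed as eight 16-bit lanes (hypothetical type).
using Xmm16 = std::array<std::uint16_t, 8>;
// Scalar model of a 256-bit register viewed as sixteen 16-bit lanes (hypothetical type).
using Ymm16 = std::array<std::uint16_t, 16>;

// Models movd: the 32-bit value lands in the low 32 bits, all other bits are zero.
Xmm16 model_movd(std::uint32_t value) {
    Xmm16 r{};  // every lane starts at zero
    r[0] = static_cast<std::uint16_t>(value & 0xFFFF);  // low half of the scalar
    r[1] = static_cast<std::uint16_t>(value >> 16);     // high half of the scalar
    return r;
}

// Models pbroadcastw: lane 0 of the source is copied into every destination lane.
Ymm16 model_pbroadcastw(const Xmm16& src) {
    Ymm16 r;
    r.fill(src[0]);
    return r;
}

// Hypothetical helper combining both steps, mirroring what _mm256_set1_epi16 produces.
Ymm16 model_set1_epi16(int value) {
    return model_pbroadcastw(model_movd(static_cast<std::uint32_t>(value)));
}
```

Note that only the low 16 bits of the scalar survive the broadcast, because pbroadcastw reads a single 16-bit element. The same two-step shape should carry over to other element widths; for 32-bit elements the corresponding AVX2 broadcast is pbroadcastd.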