Windows ARM runner and build fixes #5979

Xarbirus · 2024-03-10T18:46:58Z

I would like to add several fixes for building the repo on Windows ARM, along with a CI runner so that this build is checked automatically.

fix error C2078: too many initializers with ggml_vld1q_u32 with a macro for MSVC ARM64
fix error C2065: '__fp16': undeclared identifier
suppress warning C4146: unary minus operator applied to unsigned type, result still unsigned
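
To illustrate the kind of macro the first item refers to, here is a portable sketch of packing four 32-bit lanes into two 64-bit initializers. It assumes (this is not stated in the PR text) that C2078 appears because the vector type accepts only two 64-bit initializers on MSVC ARM64. The names `vec128_sketch`, `PACK_U32X4` and `lane_u32` are hypothetical stand-ins, not the PR's actual macro, and the lane order assumes a little-endian target.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a 128-bit vector that is initialized
 * through two 64-bit halves (an assumption about the MSVC ARM64 type). */
typedef struct { uint64_t half[2]; } vec128_sketch;

/* Pack four 32-bit lanes into two 64-bit initializers.
 * On a little-endian target the low 32 bits of each half
 * hold the lower-numbered lane. */
#define PACK_U32X4(w, x, y, z)                      \
    { { (uint64_t)(w) | ((uint64_t)(x) << 32),      \
        (uint64_t)(y) | ((uint64_t)(z) << 32) } }

/* Read lane i (0..3) back out by copying the raw bytes.
 * Assumes little-endian byte order. */
static uint32_t lane_u32(const vec128_sketch *v, int i) {
    uint32_t lanes[4];
    assert(i >= 0 && i < 4);
    memcpy(lanes, v, sizeof lanes);
    return lanes[i];
}
```

With `vec128_sketch v = PACK_U32X4(1u, 2u, 3u, 4u);`, the four lanes read back in the order they were written, which is the property a two-initializer replacement for a four-element brace list has to preserve.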

ggerganov · 2024-03-10T20:43:45Z

ggml.c

@@ -857,7 +857,7 @@ inline static float vaddvq_f32(float32x4_t v) {
    #define GGML_F16x8              float16x8_t
    #define GGML_F16x8_ZERO         vdupq_n_f16(0.0f)
    #define GGML_F16x8_SET1(x)      vdupq_n_f16(x)
-    #define GGML_F16x8_LOAD(x)      vld1q_f16((const __fp16 *)(x))
+    #define GGML_F16x8_LOAD(x)      vld1q_f16((const ggml_fp16_internal_t *)(x))


It would call vld1q_f16 with a uint16_t * argument.
Does this produce correct results?

> Does this produce correct results?

Hm, yes - no reason not to.
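
As a portable illustration of why the pointer's element type should not matter here: a half-precision value is fully determined by its 16-bit pattern, so storage typed as `uint16_t` carries the same information. The sketch below decodes IEEE 754 binary16 bit patterns in plain C. It is not the NEON code path from the diff, and `fp16_bits_to_float` and `sum_fp16_bits` are hypothetical helper names.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Decode one IEEE 754 binary16 bit pattern into a float. */
static float fp16_bits_to_float(uint16_t h) {
    uint32_t sign = ((uint32_t)h & 0x8000u) << 16;
    uint32_t exp  = ((uint32_t)h >> 10) & 0x1Fu;
    uint32_t frac = (uint32_t)h & 0x3FFu;
    uint32_t bits;
    float out;

    if (exp == 0) {
        /* Zero or subnormal: the value is frac * 2^-24, exact in float. */
        float mag = (float)frac * (1.0f / 16777216.0f);
        return sign ? -mag : mag;
    }
    if (exp == 31) {
        bits = sign | 0x7F800000u | (frac << 13);          /* infinity or NaN */
    } else {
        bits = sign | ((exp + 112u) << 23) | (frac << 13); /* rebias 15 -> 127 */
    }
    memcpy(&out, &bits, sizeof out);
    return out;
}

/* Sum n half-precision values stored as raw 16-bit patterns. */
static float sum_fp16_bits(const uint16_t *x, size_t n) {
    float acc = 0.0f;
    assert(x != NULL || n == 0);
    for (size_t i = 0; i < n; ++i) {
        acc += fp16_bits_to_float(x[i]);
    }
    return acc;
}
```

For example, the patterns `0x3C00`, `0x4000` and `0x3800` decode to 1.0, 2.0 and 0.5, so the values are recovered from the stored bits alone, independent of the C type used to hold them.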

Wouldn't it be better if instead of introducing ggml_fp16_internal_t, we simply change these to:

diff --git a/ggml.c b/ggml.c
index 80efa6f2..ac0d15b2 100644
--- a/ggml.c
+++ b/ggml.c
@@ -857,7 +857,7 @@ inline static float vaddvq_f32(float32x4_t v) {
     #define GGML_F16x8              float16x8_t
     #define GGML_F16x8_ZERO         vdupq_n_f16(0.0f)
     #define GGML_F16x8_SET1(x)      vdupq_n_f16(x)
-    #define GGML_F16x8_LOAD(x)      vld1q_f16((const __fp16 *)(x))
+    #define GGML_F16x8_LOAD(x)      vld1q_f16((const uint16_t *)(x))
     #define GGML_F16x8_STORE        vst1q_f16
     #define GGML_F16x8_FMA(a, b, c) vfmaq_f16(a, b, c)
     #define GGML_F16x8_ADD          vaddq_f16
@@ -900,7 +900,7 @@ inline static float vaddvq_f32(float32x4_t v) {
     #define GGML_F32Cx4              float32x4_t
     #define GGML_F32Cx4_ZERO         vdupq_n_f32(0.0f)
     #define GGML_F32Cx4_SET1(x)      vdupq_n_f32(x)
-    #define GGML_F32Cx4_LOAD(x)      vcvt_f32_f16(vld1_f16((const __fp16 *)(x)))
+    #define GGML_F32Cx4_LOAD(x)      vcvt_f32_f16(vld1_f16((const uint16_t *)(x)))
     #define GGML_F32Cx4_STORE(x, y)  vst1_f16(x, vcvt_f16_f32(y))
     #define GGML_F32Cx4_FMA(a, b, c) vfmaq_f32(a, b, c)
     #define GGML_F32Cx4_ADD          vaddq_f32

Xarbirus · 2024-03-11

We'll get some warnings (such as `incompatible pointer types assigning to 'const __fp16 *' from 'const uint16_t *'`) from GGML_F16_VEC_LOAD in this case.

But if it's ok, I'll commit the change.

ggerganov

Let's keep it as proposed.

* windows arm ci
* fix `error C2078: too many initializers` with ggml_vld1q_u32 macro for MSVC ARM64
* fix `warning C4146: unary minus operator applied to unsigned type, result still unsigned`
* fix `error C2065: '__fp16': undeclared identifier`
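
For context on the C4146 item, the sketch below shows one generic way to negate an unsigned value without the unary minus operator that triggers the warning. This is an illustration only: the PR text does not say how the warning was handled, and `neg_u32` and `align_up_u32` are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* Two's-complement negation of an unsigned value, written as a
 * subtraction instead of a unary minus. The result wraps modulo 2^32. */
static uint32_t neg_u32(uint32_t x) {
    return 0u - x;
}

/* Round x up to a multiple of a, where a is a power of two.
 * neg_u32(a) has every bit set from a's single set bit upwards, so it
 * works as the mask that is often written as -a. */
static uint32_t align_up_u32(uint32_t x, uint32_t a) {
    assert(a != 0u && (a & (a - 1u)) == 0u);
    return (x + a - 1u) & neg_u32(a);
}
```

For instance, `align_up_u32(13u, 8u)` yields 16, the same value the `-a` mask form would give.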

Xarbirus added 4 commits March 8, 2024 14:22

windows arm ci

11dcc41

fix error C2078: too many initializers with ggml_vld1q_u32 macro for MSVC ARM64

bbde9e2

fix `warning C4146: unary minus operator applied to unsigned type, result still unsigned`

8dd3905

fix error C2065: '__fp16': undeclared identifier

2498f6a

ggerganov reviewed Mar 10, 2024

ggerganov approved these changes Mar 11, 2024

ggerganov merged commit 3202361 into ggerganov:master Mar 11, 2024
62 checks passed

Xarbirus deleted the windows-arm-runner branch April 17, 2024 10:11
