Description
Describe the Issue
Unable to compile with ROCm 7.0.1 on Ubuntu 24.04.3. This looks a bit like #136, though this is a released version, not a release candidate.
/opt/rocm-7.0.1/lib/llvm/bin/../../../include/hip/amd_detail/hip_assert.h:29:6: error: declaration of 'abort' has a different language linkage
/opt/rocm-7.0.1/lib/llvm/bin/../../../include/hip/amd_detail/hip_assert.h:46:6: error: redefinition of '__assert_fail'
/opt/rocm-7.0.1/lib/llvm/bin/../../../include/hip/amd_detail/hip_assert.h:88:6: error: redefinition of '__assertfail'
/opt/rocm-7.0.1/lib/llvm/bin/../../../include/hip/amd_detail/amd_hip_fp8.h:496:12: error: no template named 'conditional' in namespace '__hip_internal'; did you mean 'rocwmma::conditional'?
/opt/rocm-7.0.1/lib/llvm/bin/../../../include/hip/amd_detail/amd_hip_fp8.h:498:16: error: no template named 'conditional' in namespace '__hip_internal'; did you mean 'rocwmma::conditional'?
/opt/rocm-7.0.1/lib/llvm/bin/../../../include/hip/amd_detail/amd_hip_fp8.h:817:36: error: unknown type name '__hip_bfloat16_raw'
/opt/rocm-7.0.1/lib/llvm/bin/../../../include/hip/amd_detail/amd_hip_fp8.h:824:9: error: no viable conversion from '__hip_bfloat16' to 'float'
/opt/rocm-7.0.1/lib/llvm/bin/../../../include/hip/amd_detail/amd_hip_fp8.h:843:39: error: unknown type name '__hip_bfloat162_raw'; did you mean '__hip_bfloat162'?
/opt/rocm-7.0.1/lib/llvm/bin/../../../include/hip/amd_detail/amd_hip_fp8.h:850:10: error: no viable conversion from '__hip_bfloat162' to 'float2' (aka 'HIP_vector_type<float, 2>')
/opt/rocm-7.0.1/lib/llvm/bin/../../../include/hip/amd_detail/amd_hip_fp8.h:1052:36: error: cannot convert 'const __hip_bfloat16' to 'float' without a conversion operator
/opt/rocm-7.0.1/lib/llvm/bin/../../../include/hip/amd_detail/amd_hip_fp8.h:1089:12: error: no matching conversion for functional-style cast from 'float' to '__hip_bfloat16'
/opt/rocm-7.0.1/lib/llvm/bin/../../../include/hip/amd_detail/amd_hip_fp8.h:1610:36: error: cannot convert 'const __hip_bfloat16' to 'float' without a conversion operator
/opt/rocm-7.0.1/lib/llvm/bin/../../../include/hip/amd_detail/amd_hip_fp8.h:1660:12: error: no matching conversion for functional-style cast from 'float' to '__hip_bfloat16'
/opt/rocm-7.0.1/lib/llvm/bin/../../../include/hip/amd_detail/amd_hip_fp8.h:2167:36: error: cannot convert 'const __hip_bfloat16' to 'float' without a conversion operator
/opt/rocm-7.0.1/lib/llvm/bin/../../../include/hip/amd_detail/amd_hip_fp8.h:2205:12: error: no matching conversion for functional-style cast from 'float' to '__hip_bfloat16'
/opt/rocm-7.0.1/lib/llvm/bin/../../../include/hip/amd_detail/amd_hip_fp8.h:2729:36: error: cannot convert 'const __hip_bfloat16' to 'float' without a conversion operator
/opt/rocm-7.0.1/lib/llvm/bin/../../../include/hip/amd_detail/amd_hip_fp8.h:2779:12: error: no matching conversion for functional-style cast from 'float' to '__hip_bfloat16'
ggml/src/ggml-cuda/fattn-wmma-f16.cu:418:45: error: call to '__hadd' is ambiguous
ggml/src/ggml-cuda/fattn-wmma-f16.cu:418:45: error: call to '__hadd' is ambiguous
fatal error: too many errors emitted, stopping now [-ferror-limit=]
20 errors generated when compiling for gfx1100.
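The two `__hadd` diagnostics say that overload resolution could not choose between an integer candidate and a `__half` candidate. Below is a minimal C++ sketch of how such an ambiguity can arise. The `Half` type and the `hadd` overloads are hypothetical stand-ins, not the real HIP declarations. It also assumes the failing call mixes a half value with a value of another arithmetic type, which the log does not confirm.

```cpp
// Hypothetical stand-in for a half-precision type; NOT the real __half.
struct Half {
    float v;
    Half(float f) : v(f) {}               // implicit float -> Half
    operator float() const { return v; }  // implicit Half -> float
};

// Two overloads sharing one name, loosely mirroring the two candidates
// named in the log (an integer version and a half-precision version).
unsigned int hadd(int x, int y) { return static_cast<unsigned int>((x + y) / 2); }
Half hadd(Half x, Half y) { return Half(x.v + y.v); }

// Adds a Half and a float by making both arguments the same type first.
float add_half_and_float(Half a, float b) {
    // hadd(a, b);  // would not compile: the first argument favours the
    //              // Half overload, the second favours the int overload
    return hadd(a, Half(b));  // both arguments are Half: exact match
}
```

With both arguments converted explicitly, only the `Half` overload is an exact match, so the call resolves. Whether an explicit cast is the right fix for `fattn-wmma-f16.cu` depends on the actual argument types there.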
Additional Information:
$ cat /etc/lsb-release
DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=24.04
DISTRIB_CODENAME=noble
DISTRIB_DESCRIPTION="Ubuntu 24.04.3 LTS"
$ head /proc/cpuinfo
processor : 0
vendor_id : AuthenticAMD
cpu family : 25
model : 33
model name : AMD Ryzen 9 5900XT 16-Core Processor
stepping : 2
microcode : 0xa201213
cpu MHz : 2188.078
cache size : 512 KB
physical id : 0
$ rocm-smi --showhw
====================================== ROCm System Management Interface ======================================
=========================================== Concise Hardware Info ============================================
GPU NODE DID GUID GFX VER GFX RAS SDMA RAS UMC RAS VBIOS BUS PARTITION ID
0 1 0x744c 10429 gfx1100 N/A N/A N/A 113-D70401XT-P11 0000:2D:00.0 0
==============================================================================================================
============================================ End of ROCm SMI Log =============================================
$ git log | head -5
commit 03eabefd615ad23d65924eb2e341f3ed61f02c1f
Author: YellowRoseCx <80486540+YellowRoseCx@users.noreply.github.com>
Date: Wed Aug 27 12:01:38 2025 -0500
workflow
Installed ROCm 7.0.1 for Ubuntu 24.04 as described at https://rocm.docs.amd.com/projects/install-on-linux/en/latest/install/quick-start.html
Build Log
$ make LLAMA_HIPBLAS=1
I koboldcpp build info:
I UNAME_S: Linux
I UNAME_P: x86_64
I UNAME_M: x86_64
I UNAME_O: GNU/Linux
I CFLAGS: -I. -Iggml/include -Iggml/src -Iggml/src/ggml-cpu -Iinclude -Isrc -I./common -I./vendor -I./vendor/stb -I./include -I./include/CL -I./otherarch -I./otherarch/tools -I./otherarch/sdcpp -I./otherarch/ttscpp/include -I./otherarch/ttscpp/src -I./otherarch/sdcpp/thirdparty -I./include/vulkan -O3 -fno-finite-math-only -std=c11 -fPIC -DLOG_DISABLE_LOGS -D_GNU_SOURCE -DGGML_USE_CPU -DGGML_USE_CPU_REPACK -DNDEBUG -s -DGGML_USE_LLAMAFILE -pthread -Wno-deprecated -Wno-deprecated-declarations -Wno-unused-variable -pthread -march=native -mtune=native
I CXXFLAGS: -I. -Iggml/include -Iggml/src -Iggml/src/ggml-cpu -Iinclude -Isrc -I./common -I./vendor -I./vendor/stb -I./include -I./include/CL -I./otherarch -I./otherarch/tools -I./otherarch/sdcpp -I./otherarch/ttscpp/include -I./otherarch/ttscpp/src -I./otherarch/sdcpp/thirdparty -I./include/vulkan -O3 -fno-finite-math-only -std=c++17 -fPIC -DLOG_DISABLE_LOGS -D_GNU_SOURCE -DGGML_USE_CPU -DGGML_USE_CPU_REPACK -DNDEBUG -s -DGGML_USE_LLAMAFILE -pthread -Wno-multichar -Wno-write-strings -Wno-deprecated -Wno-deprecated-declarations -Wno-unused-variable -pthread
I LDFLAGS: -ldl
I CC: cc (Ubuntu 13.3.0-6ubuntu2~24.04) 13.3.0
I CXX: g++ (Ubuntu 13.3.0-6ubuntu2~24.04) 13.3.0
I HIP CC: AMD clang version 20.0.0git (https://github.com/RadeonOpenCompute/llvm-project roc-7.0.1 25314 f4087f6b428f0e6f575ebac8a8a724dab123d06e)
I HIP CXX: AMD clang version 20.0.0git (https://github.com/RadeonOpenCompute/llvm-project roc-7.0.1 25314 f4087f6b428f0e6f575ebac8a8a724dab123d06e)
g++ -I. -Iggml/include -Iggml/src -Iggml/src/ggml-cpu -Iinclude -Isrc -I./common -I./vendor -I./vendor/stb -I./include -I./include/CL -I./otherarch -I./otherarch/tools -I./otherarch/sdcpp -I./otherarch/ttscpp/include -I./otherarch/ttscpp/src -I./otherarch/sdcpp/thirdparty -I./include/vulkan -O3 -fno-finite-math-only -std=c++17 -fPIC -DLOG_DISABLE_LOGS -D_GNU_SOURCE -DGGML_USE_CPU -DGGML_USE_CPU_REPACK -DNDEBUG -s -DGGML_USE_LLAMAFILE -pthread -Wno-multichar -Wno-write-strings -Wno-deprecated -Wno-deprecated-declarations -Wno-unused-variable -pthread ggml.o ggml-cpu.o ggml-ops.o ggml-vec.o ggml-binops.o ggml-unops.o ggml_v3.o ggml_v2.o ggml_v1.o expose.o gpttype_adapter.o sdcpp_default.o whispercpp_default.o tts_default.o embeddings_default.o llavaclip_default.o llava.o ggml-backend_default.o ggml-backend-reg_default.o ggml-repack.o ggml-alloc.o ggml-cpu-traits.o ggml-quants.o ggml-cpu-quants.o kcpp-quantmapper.o kcpp-repackmapper.o unicode.o unicode-data.o ggml-threading.o ggml-cpu-cpp.o gguf.o sgemm.o common.o sampling.o kcpputils.o mtmdaudio.o -shared -o koboldcpp_default.so -ldl
/opt/rocm/llvm/bin/clang++ -I. -Iggml/include -Iggml/src -Iinclude -Isrc -I./common -I./include -I./include/CL -I./otherarch -I./otherarch/tools -I./otherarch/sdcpp -I./otherarch/sdcpp/thirdparty -I./include/vulkan -O3 -fno-finite-math-only -DNDEBUG -std=c++17 -fPIC -DLOG_DISABLE_LOGS -D_GNU_SOURCE -DGGML_USE_LLAMAFILE -DGGML_HIP_ROCWMMA_FATTN -I/opt/rocm/include/rocwmma/ -DGGML_USE_HIPBLAS -DGGML_USE_HIP -DGGML_HIP_NO_VMM -DGGML_USE_CUDA -DSD_USE_CUDA -DSD_USE_CUBLAS -D__HIP_PLATFORM_HCC__= -D__HIP_PLATFORM_AMD__= -I/opt/rocm-7.0.1/include -I/include --offload-arch=gfx1100 -DGGML_CUDA_DMMV_X=32 -DGGML_CUDA_MMV_Y=2 -DK_QUANTS_PER_ITERATION=2 -x hip -c -o ggml/src/ggml-cuda/fattn-wmma-f16.o ggml/src/ggml-cuda/fattn-wmma-f16.cu
In file included from ggml/src/ggml-cuda/fattn-wmma-f16.cu:18:
In file included from /opt/rocm-7.0.1/lib/llvm/bin/../../../include/rocwmma/rocwmma.hpp:29:
In file included from /opt/rocm-7.0.1/lib/llvm/bin/../../../include/rocwmma/internal/accessors.hpp:29:
In file included from /opt/rocm-7.0.1/lib/llvm/bin/../../../include/rocwmma/internal/accessors_impl.hpp:31:
In file included from /opt/rocm-7.0.1/lib/llvm/bin/../../../include/rocwmma/internal/fragment_traits.hpp:30:
In file included from /opt/rocm-7.0.1/lib/llvm/bin/../../../include/rocwmma/internal/fragment_traits_impl.hpp:30:
In file included from /opt/rocm-7.0.1/lib/llvm/bin/../../../include/rocwmma/internal/layout/layout.hpp:30:
In file included from /opt/rocm-7.0.1/lib/llvm/bin/../../../include/rocwmma/internal/layout/../mapping_util.hpp:29:
In file included from /opt/rocm-7.0.1/lib/llvm/bin/../../../include/rocwmma/internal/layout/../types.hpp:88:
In file included from /opt/rocm-7.0.1/lib/llvm/bin/../../../include/rocwmma/internal/layout/../float8.hpp:118:
In file included from /opt/rocm-7.0.1/lib/llvm/bin/../../../include/hip/hip_fp8.h:30:
In file included from /opt/rocm-7.0.1/lib/llvm/bin/../../../include/hip/amd_detail/amd_hip_fp8.h:77:
/opt/rocm-7.0.1/lib/llvm/bin/../../../include/hip/amd_detail/hip_assert.h:29:6: error: declaration of 'abort' has a different language linkage
29 | void abort() {
| ^
/usr/include/hip/amd_detail/amd_device_functions.h:805:6: note: previous definition is here
805 | void abort() {
| ^
In file included from ggml/src/ggml-cuda/fattn-wmma-f16.cu:18:
In file included from /opt/rocm-7.0.1/lib/llvm/bin/../../../include/rocwmma/rocwmma.hpp:29:
In file included from /opt/rocm-7.0.1/lib/llvm/bin/../../../include/rocwmma/internal/accessors.hpp:29:
In file included from /opt/rocm-7.0.1/lib/llvm/bin/../../../include/rocwmma/internal/accessors_impl.hpp:31:
In file included from /opt/rocm-7.0.1/lib/llvm/bin/../../../include/rocwmma/internal/fragment_traits.hpp:30:
In file included from /opt/rocm-7.0.1/lib/llvm/bin/../../../include/rocwmma/internal/fragment_traits_impl.hpp:30:
In file included from /opt/rocm-7.0.1/lib/llvm/bin/../../../include/rocwmma/internal/layout/layout.hpp:30:
In file included from /opt/rocm-7.0.1/lib/llvm/bin/../../../include/rocwmma/internal/layout/../mapping_util.hpp:29:
In file included from /opt/rocm-7.0.1/lib/llvm/bin/../../../include/rocwmma/internal/layout/../types.hpp:88:
In file included from /opt/rocm-7.0.1/lib/llvm/bin/../../../include/rocwmma/internal/layout/../float8.hpp:118:
In file included from /opt/rocm-7.0.1/lib/llvm/bin/../../../include/hip/hip_fp8.h:30:
In file included from /opt/rocm-7.0.1/lib/llvm/bin/../../../include/hip/amd_detail/amd_hip_fp8.h:77:
/opt/rocm-7.0.1/lib/llvm/bin/../../../include/hip/amd_detail/hip_assert.h:46:6: error: redefinition of '__assert_fail'
46 | void __assert_fail(const char *assertion,
| ^
/usr/include/hip/amd_detail/amd_device_functions.h:822:6: note: previous definition is here
822 | void __assert_fail(const char *assertion,
| ^
In file included from ggml/src/ggml-cuda/fattn-wmma-f16.cu:18:
In file included from /opt/rocm-7.0.1/lib/llvm/bin/../../../include/rocwmma/rocwmma.hpp:29:
In file included from /opt/rocm-7.0.1/lib/llvm/bin/../../../include/rocwmma/internal/accessors.hpp:29:
In file included from /opt/rocm-7.0.1/lib/llvm/bin/../../../include/rocwmma/internal/accessors_impl.hpp:31:
In file included from /opt/rocm-7.0.1/lib/llvm/bin/../../../include/rocwmma/internal/fragment_traits.hpp:30:
In file included from /opt/rocm-7.0.1/lib/llvm/bin/../../../include/rocwmma/internal/fragment_traits_impl.hpp:30:
In file included from /opt/rocm-7.0.1/lib/llvm/bin/../../../include/rocwmma/internal/layout/layout.hpp:30:
In file included from /opt/rocm-7.0.1/lib/llvm/bin/../../../include/rocwmma/internal/layout/../mapping_util.hpp:29:
In file included from /opt/rocm-7.0.1/lib/llvm/bin/../../../include/rocwmma/internal/layout/../types.hpp:88:
In file included from /opt/rocm-7.0.1/lib/llvm/bin/../../../include/rocwmma/internal/layout/../float8.hpp:118:
In file included from /opt/rocm-7.0.1/lib/llvm/bin/../../../include/hip/hip_fp8.h:30:
In file included from /opt/rocm-7.0.1/lib/llvm/bin/../../../include/hip/amd_detail/amd_hip_fp8.h:77:
/opt/rocm-7.0.1/lib/llvm/bin/../../../include/hip/amd_detail/hip_assert.h:88:6: error: redefinition of '__assertfail'
88 | void __assertfail()
| ^
/usr/include/hip/amd_detail/amd_device_functions.h:864:6: note: previous definition is here
864 | void __assertfail()
| ^
In file included from ggml/src/ggml-cuda/fattn-wmma-f16.cu:18:
In file included from /opt/rocm-7.0.1/lib/llvm/bin/../../../include/rocwmma/rocwmma.hpp:29:
In file included from /opt/rocm-7.0.1/lib/llvm/bin/../../../include/rocwmma/internal/accessors.hpp:29:
In file included from /opt/rocm-7.0.1/lib/llvm/bin/../../../include/rocwmma/internal/accessors_impl.hpp:31:
In file included from /opt/rocm-7.0.1/lib/llvm/bin/../../../include/rocwmma/internal/fragment_traits.hpp:30:
In file included from /opt/rocm-7.0.1/lib/llvm/bin/../../../include/rocwmma/internal/fragment_traits_impl.hpp:30:
In file included from /opt/rocm-7.0.1/lib/llvm/bin/../../../include/rocwmma/internal/layout/layout.hpp:30:
In file included from /opt/rocm-7.0.1/lib/llvm/bin/../../../include/rocwmma/internal/layout/../mapping_util.hpp:29:
In file included from /opt/rocm-7.0.1/lib/llvm/bin/../../../include/rocwmma/internal/layout/../types.hpp:88:
In file included from /opt/rocm-7.0.1/lib/llvm/bin/../../../include/rocwmma/internal/layout/../float8.hpp:118:
In file included from /opt/rocm-7.0.1/lib/llvm/bin/../../../include/hip/hip_fp8.h:30:
/opt/rocm-7.0.1/lib/llvm/bin/../../../include/hip/amd_detail/amd_hip_fp8.h:496:12: error: no template named 'conditional' in namespace '__hip_internal'; did you mean 'rocwmma::conditional'?
496 | typename __hip_internal::conditional<
| ^~~~~~~~~~~~~~~~
/opt/rocm-7.0.1/lib/llvm/bin/../../../include/rocwmma/internal/layout/../utility/type_traits.hpp:109:16: note: 'rocwmma::conditional' declared here
109 | using std::conditional;
| ^
In file included from ggml/src/ggml-cuda/fattn-wmma-f16.cu:18:
In file included from /opt/rocm-7.0.1/lib/llvm/bin/../../../include/rocwmma/rocwmma.hpp:29:
In file included from /opt/rocm-7.0.1/lib/llvm/bin/../../../include/rocwmma/internal/accessors.hpp:29:
In file included from /opt/rocm-7.0.1/lib/llvm/bin/../../../include/rocwmma/internal/accessors_impl.hpp:31:
In file included from /opt/rocm-7.0.1/lib/llvm/bin/../../../include/rocwmma/internal/fragment_traits.hpp:30:
In file included from /opt/rocm-7.0.1/lib/llvm/bin/../../../include/rocwmma/internal/fragment_traits_impl.hpp:30:
In file included from /opt/rocm-7.0.1/lib/llvm/bin/../../../include/rocwmma/internal/layout/layout.hpp:30:
In file included from /opt/rocm-7.0.1/lib/llvm/bin/../../../include/rocwmma/internal/layout/../mapping_util.hpp:29:
In file included from /opt/rocm-7.0.1/lib/llvm/bin/../../../include/rocwmma/internal/layout/../types.hpp:88:
In file included from /opt/rocm-7.0.1/lib/llvm/bin/../../../include/rocwmma/internal/layout/../float8.hpp:118:
In file included from /opt/rocm-7.0.1/lib/llvm/bin/../../../include/hip/hip_fp8.h:30:
/opt/rocm-7.0.1/lib/llvm/bin/../../../include/hip/amd_detail/amd_hip_fp8.h:498:16: error: no template named 'conditional' in namespace '__hip_internal'; did you mean 'rocwmma::conditional'?
498 | typename __hip_internal::conditional<sizeof(T) == 4, unsigned int,
| ^~~~~~~~~~~~~~~~
/opt/rocm-7.0.1/lib/llvm/bin/../../../include/rocwmma/internal/layout/../utility/type_traits.hpp:109:16: note: 'rocwmma::conditional' declared here
109 | using std::conditional;
| ^
In file included from ggml/src/ggml-cuda/fattn-wmma-f16.cu:18:
In file included from /opt/rocm-7.0.1/lib/llvm/bin/../../../include/rocwmma/rocwmma.hpp:29:
In file included from /opt/rocm-7.0.1/lib/llvm/bin/../../../include/rocwmma/internal/accessors.hpp:29:
In file included from /opt/rocm-7.0.1/lib/llvm/bin/../../../include/rocwmma/internal/accessors_impl.hpp:31:
In file included from /opt/rocm-7.0.1/lib/llvm/bin/../../../include/rocwmma/internal/fragment_traits.hpp:30:
In file included from /opt/rocm-7.0.1/lib/llvm/bin/../../../include/rocwmma/internal/fragment_traits_impl.hpp:30:
In file included from /opt/rocm-7.0.1/lib/llvm/bin/../../../include/rocwmma/internal/layout/layout.hpp:30:
In file included from /opt/rocm-7.0.1/lib/llvm/bin/../../../include/rocwmma/internal/layout/../mapping_util.hpp:29:
In file included from /opt/rocm-7.0.1/lib/llvm/bin/../../../include/rocwmma/internal/layout/../types.hpp:88:
In file included from /opt/rocm-7.0.1/lib/llvm/bin/../../../include/rocwmma/internal/layout/../float8.hpp:118:
In file included from /opt/rocm-7.0.1/lib/llvm/bin/../../../include/hip/hip_fp8.h:30:
/opt/rocm-7.0.1/lib/llvm/bin/../../../include/hip/amd_detail/amd_hip_fp8.h:817:36: error: unknown type name '__hip_bfloat16_raw'
817 | __hip_cvt_bfloat16raw_to_fp8(const __hip_bfloat16_raw hr, const __hip_saturation_t sat,
| ^
/opt/rocm-7.0.1/lib/llvm/bin/../../../include/hip/amd_detail/amd_hip_fp8.h:824:9: error: no viable conversion from '__hip_bfloat16' to 'float'
824 | float fval = __hip_bfloat16(hr);
| ^ ~~~~~~~~~~~~~~~~~~
/opt/rocm-7.0.1/lib/llvm/bin/../../../include/hip/amd_detail/amd_hip_fp8.h:843:39: error: unknown type name '__hip_bfloat162_raw'; did you mean '__hip_bfloat162'?
843 | __hip_cvt_bfloat16raw2_to_fp8x2(const __hip_bfloat162_raw hr, const __hip_saturation_t sat,
| ^
/usr/include/hip/amd_detail/amd_hip_bf16.h:113:8: note: '__hip_bfloat162' declared here
113 | struct __hip_bfloat162 {
| ^
In file included from ggml/src/ggml-cuda/fattn-wmma-f16.cu:18:
In file included from /opt/rocm-7.0.1/lib/llvm/bin/../../../include/rocwmma/rocwmma.hpp:29:
In file included from /opt/rocm-7.0.1/lib/llvm/bin/../../../include/rocwmma/internal/accessors.hpp:29:
In file included from /opt/rocm-7.0.1/lib/llvm/bin/../../../include/rocwmma/internal/accessors_impl.hpp:31:
In file included from /opt/rocm-7.0.1/lib/llvm/bin/../../../include/rocwmma/internal/fragment_traits.hpp:30:
In file included from /opt/rocm-7.0.1/lib/llvm/bin/../../../include/rocwmma/internal/fragment_traits_impl.hpp:30:
In file included from /opt/rocm-7.0.1/lib/llvm/bin/../../../include/rocwmma/internal/layout/layout.hpp:30:
In file included from /opt/rocm-7.0.1/lib/llvm/bin/../../../include/rocwmma/internal/layout/../mapping_util.hpp:29:
In file included from /opt/rocm-7.0.1/lib/llvm/bin/../../../include/rocwmma/internal/layout/../types.hpp:88:
In file included from /opt/rocm-7.0.1/lib/llvm/bin/../../../include/rocwmma/internal/layout/../float8.hpp:118:
In file included from /opt/rocm-7.0.1/lib/llvm/bin/../../../include/hip/hip_fp8.h:30:
/opt/rocm-7.0.1/lib/llvm/bin/../../../include/hip/amd_detail/amd_hip_fp8.h:850:10: error: no viable conversion from '__hip_bfloat162' to 'float2' (aka 'HIP_vector_type<float, 2>')
850 | float2 f2 = __hip_bfloat162(hr);
| ^ ~~~~~~~~~~~~~~~~~~~
/usr/include/hip/amd_detail/amd_hip_vector_types.h:470:9: note: candidate constructor not viable: no known conversion from '__hip_bfloat162' to 'const HIP_vector_type<float, 2> &' for 1st argument
470 | HIP_vector_type(const HIP_vector_type&) = default;
| ^ ~~~~~~~~~~~~~~~~~~~~~~
/usr/include/hip/amd_detail/amd_hip_vector_types.h:473:9: note: candidate constructor not viable: no known conversion from '__hip_bfloat162' to 'HIP_vector_type<float, 2> &&' for 1st argument
473 | HIP_vector_type(HIP_vector_type&&) = default;
| ^ ~~~~~~~~~~~~~~~~~
/usr/include/hip/amd_detail/amd_hip_vector_types.h:465:9: note: candidate template ignored: requirement 'sizeof...(Us) == 2U' was not satisfied [with Us = <__hip_bfloat162>]
465 | HIP_vector_type(Us... xs) noexcept
| ^
/usr/include/hip/amd_detail/amd_hip_vector_types.h:456:9: note: explicit constructor is not a candidate
456 | HIP_vector_type(U x) noexcept
| ^
In file included from ggml/src/ggml-cuda/fattn-wmma-f16.cu:18:
In file included from /opt/rocm-7.0.1/lib/llvm/bin/../../../include/rocwmma/rocwmma.hpp:29:
In file included from /opt/rocm-7.0.1/lib/llvm/bin/../../../include/rocwmma/internal/accessors.hpp:29:
In file included from /opt/rocm-7.0.1/lib/llvm/bin/../../../include/rocwmma/internal/accessors_impl.hpp:31:
In file included from /opt/rocm-7.0.1/lib/llvm/bin/../../../include/rocwmma/internal/fragment_traits.hpp:30:
In file included from /opt/rocm-7.0.1/lib/llvm/bin/../../../include/rocwmma/internal/fragment_traits_impl.hpp:30:
In file included from /opt/rocm-7.0.1/lib/llvm/bin/../../../include/rocwmma/internal/layout/layout.hpp:30:
In file included from /opt/rocm-7.0.1/lib/llvm/bin/../../../include/rocwmma/internal/layout/../mapping_util.hpp:29:
In file included from /opt/rocm-7.0.1/lib/llvm/bin/../../../include/rocwmma/internal/layout/../types.hpp:88:
In file included from /opt/rocm-7.0.1/lib/llvm/bin/../../../include/rocwmma/internal/layout/../float8.hpp:118:
In file included from /opt/rocm-7.0.1/lib/llvm/bin/../../../include/hip/hip_fp8.h:30:
/opt/rocm-7.0.1/lib/llvm/bin/../../../include/hip/amd_detail/amd_hip_fp8.h:1052:36: error: cannot convert 'const __hip_bfloat16' to 'float' without a conversion operator
1052 | : __x(__hip_cvt_float_to_fp8(static_cast<float>(f), __default_saturation,
| ^~~~~~~~~~~~~~~~~~~~~
/opt/rocm-7.0.1/lib/llvm/bin/../../../include/hip/amd_detail/amd_hip_fp8.h:1089:12: error: no matching conversion for functional-style cast from 'float' to '__hip_bfloat16'
1089 | return __hip_bfloat16(f);
| ^~~~~~~~~~~~~~~~~
/usr/include/hip/amd_detail/amd_hip_bf16.h:108:8: note: candidate constructor (the implicit copy constructor) not viable: no known conversion from 'float' to 'const __hip_bfloat16' for 1st argument
108 | struct __hip_bfloat16 {
| ^~~~~~~~~~~~~~
/usr/include/hip/amd_detail/amd_hip_bf16.h:108:8: note: candidate constructor (the implicit move constructor) not viable: no known conversion from 'float' to '__hip_bfloat16' for 1st argument
108 | struct __hip_bfloat16 {
| ^~~~~~~~~~~~~~
/usr/include/hip/amd_detail/amd_hip_bf16.h:108:8: note: candidate constructor (the implicit default constructor) not viable: requires 0 arguments, but 1 was provided
In file included from ggml/src/ggml-cuda/fattn-wmma-f16.cu:18:
In file included from /opt/rocm-7.0.1/lib/llvm/bin/../../../include/rocwmma/rocwmma.hpp:29:
In file included from /opt/rocm-7.0.1/lib/llvm/bin/../../../include/rocwmma/internal/accessors.hpp:29:
In file included from /opt/rocm-7.0.1/lib/llvm/bin/../../../include/rocwmma/internal/accessors_impl.hpp:31:
In file included from /opt/rocm-7.0.1/lib/llvm/bin/../../../include/rocwmma/internal/fragment_traits.hpp:30:
In file included from /opt/rocm-7.0.1/lib/llvm/bin/../../../include/rocwmma/internal/fragment_traits_impl.hpp:30:
In file included from /opt/rocm-7.0.1/lib/llvm/bin/../../../include/rocwmma/internal/layout/layout.hpp:30:
In file included from /opt/rocm-7.0.1/lib/llvm/bin/../../../include/rocwmma/internal/layout/../mapping_util.hpp:29:
In file included from /opt/rocm-7.0.1/lib/llvm/bin/../../../include/rocwmma/internal/layout/../types.hpp:88:
In file included from /opt/rocm-7.0.1/lib/llvm/bin/../../../include/rocwmma/internal/layout/../float8.hpp:118:
In file included from /opt/rocm-7.0.1/lib/llvm/bin/../../../include/hip/hip_fp8.h:30:
/opt/rocm-7.0.1/lib/llvm/bin/../../../include/hip/amd_detail/amd_hip_fp8.h:1610:36: error: cannot convert 'const __hip_bfloat16' to 'float' without a conversion operator
1610 | : __x(__hip_cvt_float_to_fp8(static_cast<float>(f), __default_saturation,
| ^~~~~~~~~~~~~~~~~~~~~
/opt/rocm-7.0.1/lib/llvm/bin/../../../include/hip/amd_detail/amd_hip_fp8.h:1660:12: error: no matching conversion for functional-style cast from 'float' to '__hip_bfloat16'
1660 | return __hip_bfloat16(f);
| ^~~~~~~~~~~~~~~~~
/usr/include/hip/amd_detail/amd_hip_bf16.h:108:8: note: candidate constructor (the implicit copy constructor) not viable: no known conversion from 'float' to 'const __hip_bfloat16' for 1st argument
108 | struct __hip_bfloat16 {
| ^~~~~~~~~~~~~~
/usr/include/hip/amd_detail/amd_hip_bf16.h:108:8: note: candidate constructor (the implicit move constructor) not viable: no known conversion from 'float' to '__hip_bfloat16' for 1st argument
108 | struct __hip_bfloat16 {
| ^~~~~~~~~~~~~~
/usr/include/hip/amd_detail/amd_hip_bf16.h:108:8: note: candidate constructor (the implicit default constructor) not viable: requires 0 arguments, but 1 was provided
In file included from ggml/src/ggml-cuda/fattn-wmma-f16.cu:18:
In file included from /opt/rocm-7.0.1/lib/llvm/bin/../../../include/rocwmma/rocwmma.hpp:29:
In file included from /opt/rocm-7.0.1/lib/llvm/bin/../../../include/rocwmma/internal/accessors.hpp:29:
In file included from /opt/rocm-7.0.1/lib/llvm/bin/../../../include/rocwmma/internal/accessors_impl.hpp:31:
In file included from /opt/rocm-7.0.1/lib/llvm/bin/../../../include/rocwmma/internal/fragment_traits.hpp:30:
In file included from /opt/rocm-7.0.1/lib/llvm/bin/../../../include/rocwmma/internal/fragment_traits_impl.hpp:30:
In file included from /opt/rocm-7.0.1/lib/llvm/bin/../../../include/rocwmma/internal/layout/layout.hpp:30:
In file included from /opt/rocm-7.0.1/lib/llvm/bin/../../../include/rocwmma/internal/layout/../mapping_util.hpp:29:
In file included from /opt/rocm-7.0.1/lib/llvm/bin/../../../include/rocwmma/internal/layout/../types.hpp:88:
In file included from /opt/rocm-7.0.1/lib/llvm/bin/../../../include/rocwmma/internal/layout/../float8.hpp:118:
In file included from /opt/rocm-7.0.1/lib/llvm/bin/../../../include/hip/hip_fp8.h:30:
/opt/rocm-7.0.1/lib/llvm/bin/../../../include/hip/amd_detail/amd_hip_fp8.h:2167:36: error: cannot convert 'const __hip_bfloat16' to 'float' without a conversion operator
2167 | : __x(__hip_cvt_float_to_fp8(static_cast<float>(f), __default_saturation,
| ^~~~~~~~~~~~~~~~~~~~~
/opt/rocm-7.0.1/lib/llvm/bin/../../../include/hip/amd_detail/amd_hip_fp8.h:2205:12: error: no matching conversion for functional-style cast from 'float' to '__hip_bfloat16'
2205 | return __hip_bfloat16(f);
| ^~~~~~~~~~~~~~~~~
/usr/include/hip/amd_detail/amd_hip_bf16.h:108:8: note: candidate constructor (the implicit copy constructor) not viable: no known conversion from 'float' to 'const __hip_bfloat16' for 1st argument
108 | struct __hip_bfloat16 {
| ^~~~~~~~~~~~~~
/usr/include/hip/amd_detail/amd_hip_bf16.h:108:8: note: candidate constructor (the implicit move constructor) not viable: no known conversion from 'float' to '__hip_bfloat16' for 1st argument
108 | struct __hip_bfloat16 {
| ^~~~~~~~~~~~~~
/usr/include/hip/amd_detail/amd_hip_bf16.h:108:8: note: candidate constructor (the implicit default constructor) not viable: requires 0 arguments, but 1 was provided
In file included from ggml/src/ggml-cuda/fattn-wmma-f16.cu:18:
In file included from /opt/rocm-7.0.1/lib/llvm/bin/../../../include/rocwmma/rocwmma.hpp:29:
In file included from /opt/rocm-7.0.1/lib/llvm/bin/../../../include/rocwmma/internal/accessors.hpp:29:
In file included from /opt/rocm-7.0.1/lib/llvm/bin/../../../include/rocwmma/internal/accessors_impl.hpp:31:
In file included from /opt/rocm-7.0.1/lib/llvm/bin/../../../include/rocwmma/internal/fragment_traits.hpp:30:
In file included from /opt/rocm-7.0.1/lib/llvm/bin/../../../include/rocwmma/internal/fragment_traits_impl.hpp:30:
In file included from /opt/rocm-7.0.1/lib/llvm/bin/../../../include/rocwmma/internal/layout/layout.hpp:30:
In file included from /opt/rocm-7.0.1/lib/llvm/bin/../../../include/rocwmma/internal/layout/../mapping_util.hpp:29:
In file included from /opt/rocm-7.0.1/lib/llvm/bin/../../../include/rocwmma/internal/layout/../types.hpp:88:
In file included from /opt/rocm-7.0.1/lib/llvm/bin/../../../include/rocwmma/internal/layout/../float8.hpp:118:
In file included from /opt/rocm-7.0.1/lib/llvm/bin/../../../include/hip/hip_fp8.h:30:
/opt/rocm-7.0.1/lib/llvm/bin/../../../include/hip/amd_detail/amd_hip_fp8.h:2729:36: error: cannot convert 'const __hip_bfloat16' to 'float' without a conversion operator
2729 | : __x(__hip_cvt_float_to_fp8(static_cast<float>(f), __default_saturation,
| ^~~~~~~~~~~~~~~~~~~~~
/opt/rocm-7.0.1/lib/llvm/bin/../../../include/hip/amd_detail/amd_hip_fp8.h:2779:12: error: no matching conversion for functional-style cast from 'float' to '__hip_bfloat16'
2779 | return __hip_bfloat16(f);
| ^~~~~~~~~~~~~~~~~
/usr/include/hip/amd_detail/amd_hip_bf16.h:108:8: note: candidate constructor (the implicit copy constructor) not viable: no known conversion from 'float' to 'const __hip_bfloat16' for 1st argument
108 | struct __hip_bfloat16 {
| ^~~~~~~~~~~~~~
/usr/include/hip/amd_detail/amd_hip_bf16.h:108:8: note: candidate constructor (the implicit move constructor) not viable: no known conversion from 'float' to '__hip_bfloat16' for 1st argument
108 | struct __hip_bfloat16 {
| ^~~~~~~~~~~~~~
/usr/include/hip/amd_detail/amd_hip_bf16.h:108:8: note: candidate constructor (the implicit default constructor) not viable: requires 0 arguments, but 1 was provided
ggml/src/ggml-cuda/fattn-wmma-f16.cu:418:45: error: call to '__hadd' is ambiguous
418 | KQ_rowsum_h2[j0/nwarps].x = __hadd(KQ_rowsum_h2[j0/nwarps].x, val);
| ^~~~~~
ggml/src/ggml-cuda/fattn-wmma-f16.cu:526:24: note: in instantiation of function template specialization 'flash_attn_ext_f16<64, 16, 4, 64, float, false>' requested here
526 | fattn_kernel = flash_attn_ext_f16<
| ^
ggml/src/ggml-cuda/fattn-wmma-f16.cu:548:21: note: in instantiation of function template specialization 'ggml_cuda_flash_attn_ext_wmma_f16_case<64, 16, float>' requested here
548 | ggml_cuda_flash_attn_ext_wmma_f16_case< 64, cols_per_block, float>(ctx, dst);
| ^
/usr/include/hip/amd_detail/amd_device_functions.h:257:39: note: candidate function
257 | __device__ static inline unsigned int __hadd(int x, int y) {
| ^
/usr/include/hip/amd_detail/amd_hip_fp16.h:1368:20: note: candidate function
1368 | __half __hadd(__half x, __half y)
| ^
ggml/src/ggml-cuda/fattn-wmma-f16.cu:418:45: error: call to '__hadd' is ambiguous
418 | KQ_rowsum_h2[j0/nwarps].x = __hadd(KQ_rowsum_h2[j0/nwarps].x, val);
| ^~~~~~
ggml/src/ggml-cuda/fattn-wmma-f16.cu:530:24: note: in instantiation of function template specialization 'flash_attn_ext_f16<64, 16, 4, 64, float, true>' requested here
530 | fattn_kernel = flash_attn_ext_f16<
| ^
ggml/src/ggml-cuda/fattn-wmma-f16.cu:548:21: note: in instantiation of function template specialization 'ggml_cuda_flash_attn_ext_wmma_f16_case<64, 16, float>' requested here
548 | ggml_cuda_flash_attn_ext_wmma_f16_case< 64, cols_per_block, float>(ctx, dst);
| ^
/usr/include/hip/amd_detail/amd_device_functions.h:257:39: note: candidate function
257 | __device__ static inline unsigned int __hadd(int x, int y) {
| ^
/usr/include/hip/amd_detail/amd_hip_fp16.h:1368:20: note: candidate function
1368 | __half __hadd(__half x, __half y)
| ^
fatal error: too many errors emitted, stopping now [-ferror-limit=]
20 errors generated when compiling for gfx1100.
make: *** [Makefile:303: ggml/src/ggml-cuda/fattn-wmma-f16.o] Error 1
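In the log above, the errors point at headers under /opt/rocm-7.0.1/include, while the "previous definition" and "declared here" notes point at /usr/include/hip. This suggests, but does not prove, that two HIP header trees are visible to the compiler. Here is a rough C++17 sketch for checking that, assuming those two include roots. `roots_with_hip_headers` is a hypothetical helper name, not part of ROCm or koboldcpp.

```cpp
#include <filesystem>
#include <string>
#include <system_error>
#include <vector>

namespace fs = std::filesystem;

// Returns those candidate include roots that contain a "hip" subdirectory.
std::vector<std::string> roots_with_hip_headers(const std::vector<std::string>& roots) {
    std::vector<std::string> found;
    for (const auto& root : roots) {
        std::error_code ec;  // non-throwing overload; missing paths yield false
        if (fs::is_directory(fs::path(root) / "hip", ec)) {
            found.push_back(root);
        }
    }
    return found;
}
```

Calling `roots_with_hip_headers({"/usr/include", "/opt/rocm-7.0.1/include"})` and getting back more than one entry would be consistent with the mixed-header reading.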
rocminfo
ROCk module version 6.14.14 is loaded
HSA System Attributes
Runtime Version: 1.18
Runtime Ext Version: 1.11
System Timestamp Freq.: 1000.000000MHz
Sig. Max Wait Duration: 18446744073709551615 (0xFFFFFFFFFFFFFFFF) (timestamp count)
Machine Model: LARGE
System Endianness: LITTLE
Mwaitx: DISABLED
XNACK enabled: NO
DMAbuf Support: YES
VMM Support: YES
==========
HSA Agents
Agent 1
Name: AMD Ryzen 9 5900XT 16-Core Processor
Uuid: CPU-XX
Marketing Name: AMD Ryzen 9 5900XT 16-Core Processor
Vendor Name: CPU
Feature: None specified
Profile: FULL_PROFILE
Float Round Mode: NEAR
Max Queue Number: 0(0x0)
Queue Min Size: 0(0x0)
Queue Max Size: 0(0x0)
Queue Type: MULTI
Node: 0
Device Type: CPU
Cache Info:
L1: 32768(0x8000) KB
Chip ID: 0(0x0)
ASIC Revision: 0(0x0)
Cacheline Size: 64(0x40)
Max Clock Freq. (MHz): 3300
BDFID: 0
Internal Node ID: 0
Compute Unit: 32
SIMDs per CU: 0
Shader Engines: 0
Shader Arrs. per Eng.: 0
WatchPts on Addr. Ranges:1
Memory Properties:
Features: None
Pool Info:
Pool 1
Segment: GLOBAL; FLAGS: FINE GRAINED
Size: 63664968(0x3cb7348) KB
Allocatable: TRUE
Alloc Granule: 4KB
Alloc Recommended Granule:4KB
Alloc Alignment: 4KB
Accessible by all: TRUE
Pool 2
Segment: GLOBAL; FLAGS: EXTENDED FINE GRAINED
Size: 63664968(0x3cb7348) KB
Allocatable: TRUE
Alloc Granule: 4KB
Alloc Recommended Granule:4KB
Alloc Alignment: 4KB
Accessible by all: TRUE
Pool 3
Segment: GLOBAL; FLAGS: KERNARG, FINE GRAINED
Size: 63664968(0x3cb7348) KB
Allocatable: TRUE
Alloc Granule: 4KB
Alloc Recommended Granule:4KB
Alloc Alignment: 4KB
Accessible by all: TRUE
Pool 4
Segment: GLOBAL; FLAGS: COARSE GRAINED
Size: 63664968(0x3cb7348) KB
Allocatable: TRUE
Alloc Granule: 4KB
Alloc Recommended Granule:4KB
Alloc Alignment: 4KB
Accessible by all: TRUE
ISA Info:
Agent 2
Name: gfx1100
Uuid: GPU-d8607413b0e3a90b
Marketing Name: Radeon RX 7900 XT
Vendor Name: AMD
Feature: KERNEL_DISPATCH
Profile: BASE_PROFILE
Float Round Mode: NEAR
Max Queue Number: 128(0x80)
Queue Min Size: 64(0x40)
Queue Max Size: 131072(0x20000)
Queue Type: MULTI
Node: 1
Device Type: GPU
Cache Info:
L1: 32(0x20) KB
L2: 6144(0x1800) KB
L3: 81920(0x14000) KB
Chip ID: 29772(0x744c)
ASIC Revision: 0(0x0)
Cacheline Size: 128(0x80)
Max Clock Freq. (MHz): 2075
BDFID: 11520
Internal Node ID: 1
Compute Unit: 84
SIMDs per CU: 2
Shader Engines: 6
Shader Arrs. per Eng.: 2
WatchPts on Addr. Ranges:4
Coherent Host Access: FALSE
Memory Properties:
Features: KERNEL_DISPATCH
Fast F16 Operation: TRUE
Wavefront Size: 32(0x20)
Workgroup Max Size: 1024(0x400)
Workgroup Max Size per Dimension:
x 1024(0x400)
y 1024(0x400)
z 1024(0x400)
Max Waves Per CU: 32(0x20)
Max Work-item Per CU: 1024(0x400)
Grid Max Size: 4294967295(0xffffffff)
Grid Max Size per Dimension:
x 2147483647(0x7fffffff)
y 65535(0xffff)
z 65535(0xffff)
Max fbarriers/Workgrp: 32
Packet Processor uCode:: 542
SDMA engine uCode:: 24
IOMMU Support:: None
Pool Info:
Pool 1
Segment: GLOBAL; FLAGS: COARSE GRAINED
Size: 20955136(0x13fc000) KB
Allocatable: TRUE
Alloc Granule: 4KB
Alloc Recommended Granule:2048KB
Alloc Alignment: 4KB
Accessible by all: FALSE
Pool 2
Segment: GLOBAL; FLAGS: EXTENDED FINE GRAINED
Size: 20955136(0x13fc000) KB
Allocatable: TRUE
Alloc Granule: 4KB
Alloc Recommended Granule:2048KB
Alloc Alignment: 4KB
Accessible by all: FALSE
Pool 3
Segment: GROUP
Size: 64(0x40) KB
Allocatable: FALSE
Alloc Granule: 0KB
Alloc Recommended Granule:0KB
Alloc Alignment: 0KB
Accessible by all: FALSE
ISA Info:
ISA 1
Name: amdgcn-amd-amdhsa--gfx1100
Machine Models: HSA_MACHINE_MODEL_LARGE
Profiles: HSA_PROFILE_BASE
Default Rounding Mode: NEAR
Default Rounding Mode: NEAR
Fast f16: TRUE
Workgroup Max Size: 1024(0x400)
Workgroup Max Size per Dimension:
x 1024(0x400)
y 1024(0x400)
z 1024(0x400)
Grid Max Size: 4294967295(0xffffffff)
Grid Max Size per Dimension:
x 2147483647(0x7fffffff)
y 65535(0xffff)
z 65535(0xffff)
FBarrier Max Size: 32
ISA 2
Name: amdgcn-amd-amdhsa--gfx11-generic
Machine Models: HSA_MACHINE_MODEL_LARGE
Profiles: HSA_PROFILE_BASE
Default Rounding Mode: NEAR
Default Rounding Mode: NEAR
Fast f16: TRUE
Workgroup Max Size: 1024(0x400)
Workgroup Max Size per Dimension:
x 1024(0x400)
y 1024(0x400)
z 1024(0x400)
Grid Max Size: 4294967295(0xffffffff)
Grid Max Size per Dimension:
x 2147483647(0x7fffffff)
y 65535(0xffff)
z 65535(0xffff)
FBarrier Max Size: 32
*** Done ***