
b3215 v2 #186

Merged · 18 commits · Jun 24, 2024

Commits (showing changes from all commits)
c5a8d4b
JSON Schema to GBNF integration tests (#7790)
HanClinto Jun 22, 2024
5b48cd5
Update llama-quantize ppl/file size output from LLaMA-v1 to Llama-3 v…
ddh0 Jun 22, 2024
3aa184a
convert-hf : change assert to exception (#8015)
0xspringtime Jun 22, 2024
adf480c
cvector-generator: Moe Moe Fixie-Fixie for Lots of Formats~! ♡(ᐢ ᴥ ᐢ)…
HatsuneMikuUwU33 Jun 22, 2024
3e58b0e
cvector: fix CI + correct help message (#8064)
ngxson Jun 22, 2024
b5a5f34
Removing extra blank lines that were breaking Lint. (#8067)
HanClinto Jun 22, 2024
45c0e2e
Refactor Vulkan backend to allow multiple contexts (#7961)
0cc4m Jun 23, 2024
b6b9a8e
fix CI failures (#8066)
slaren Jun 23, 2024
11318d9
Fix typo in llama_set_embeddings comment (#8077)
danbev Jun 23, 2024
6a2f298
server : fix JSON-Scheme typo (#7975)
akx Jun 23, 2024
e112b61
llama : add support for BitnetForCausalLM (#7931)
Eddie-Wang1120 Jun 23, 2024
95f57bb
ggml : remove ggml_task_type and GGML_PERF (#8017)
slaren Jun 24, 2024
de0d6a6
gguf-py, convert-hf : model conversion support for T5 and FLAN-T5 mod…
fairydreaming Jun 24, 2024
646ef4a
embedding : more cli arguments (#7458)
YannFollet Jun 24, 2024
8cb508d
disable publishing the full-rocm docker image (#8083)
slaren Jun 24, 2024
52fc870
Option to split during conversion (#6942)
christianazinn Jun 24, 2024
9a590c8
CUDA: optimize MMQ int8 tensor core performance (#8062)
JohannesGaessler Jun 24, 2024
d62e4aa
gguf-py : fix tensor groups for encoder-decoder models in gguf-dump.p…
fairydreaming Jun 24, 2024
1 change: 1 addition & 0 deletions .editorconfig
@@ -28,4 +28,5 @@ indent_size = 2
indent_style = tab

[examples/cvector-generator/*.txt]
trim_trailing_whitespace = unset
insert_final_newline = unset
6 changes: 2 additions & 4 deletions .github/workflows/docker.yml
@@ -33,15 +33,13 @@ jobs:
- { tag: "light", dockerfile: ".devops/llama-cli.Dockerfile", platforms: "linux/amd64,linux/arm64" }
- { tag: "server", dockerfile: ".devops/llama-server.Dockerfile", platforms: "linux/amd64,linux/arm64" }
- { tag: "full", dockerfile: ".devops/full.Dockerfile", platforms: "linux/amd64,linux/arm64" }
- # NOTE(canardletter): The CUDA builds on arm64 are very slow, so I
- # have disabled them for now until the reason why
- # is understood.
- { tag: "light-cuda", dockerfile: ".devops/llama-cli-cuda.Dockerfile", platforms: "linux/amd64" }
- { tag: "server-cuda", dockerfile: ".devops/llama-server-cuda.Dockerfile", platforms: "linux/amd64" }
- { tag: "full-cuda", dockerfile: ".devops/full-cuda.Dockerfile", platforms: "linux/amd64" }
- { tag: "light-rocm", dockerfile: ".devops/llama-cli-rocm.Dockerfile", platforms: "linux/amd64,linux/arm64" }
- { tag: "server-rocm", dockerfile: ".devops/llama-server-rocm.Dockerfile", platforms: "linux/amd64,linux/arm64" }
- { tag: "full-rocm", dockerfile: ".devops/full-rocm.Dockerfile", platforms: "linux/amd64,linux/arm64" }
# Note: the full-rocm image is failing due to a "no space left on device" error. It is disabled for now to allow the workflow to complete.
#- { tag: "full-rocm", dockerfile: ".devops/full-rocm.Dockerfile", platforms: "linux/amd64,linux/arm64" }
- { tag: "light-intel", dockerfile: ".devops/llama-cli-intel.Dockerfile", platforms: "linux/amd64" }
- { tag: "server-intel", dockerfile: ".devops/llama-server-intel.Dockerfile", platforms: "linux/amd64" }
steps:
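For context, only the CI publishing entry for the full-rocm image is disabled here; the Dockerfile itself remains in the tree. If that image is still needed, building it locally should work along these lines (an illustrative command, not part of this workflow; the image tag is made up):

    docker build -f .devops/full-rocm.Dockerfile -t llama.cpp:full-rocm .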
2 changes: 1 addition & 1 deletion .github/workflows/server.yml
@@ -30,7 +30,7 @@ jobs:

strategy:
matrix:
- sanitizer: [ADDRESS, THREAD, UNDEFINED]
+ sanitizer: [ADDRESS, UNDEFINED] # THREAD is broken
build_type: [RelWithDebInfo]
include:
- build_type: Release
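The dropped THREAD entry corresponds to the ThreadSanitizer build; ADDRESS and UNDEFINED builds still run. A rough local equivalent of one matrix entry, assuming the LLAMA_SANITIZE_* CMake options and the llama-server target this workflow uses (both are assumptions, not shown in this diff):

    cmake -B build -DCMAKE_BUILD_TYPE=RelWithDebInfo -DLLAMA_SANITIZE_ADDRESS=ON
    cmake --build build --target llama-server -j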
7 changes: 0 additions & 7 deletions CMakeLists.txt
@@ -144,9 +144,6 @@ option(LLAMA_BUILD_SERVER "llama: build server example"
option(LLAMA_LASX "llama: enable lasx" ON)
option(LLAMA_LSX "llama: enable lsx" ON)

- # add perf arguments
- option(LLAMA_PERF "llama: enable perf" OFF)
-
# Required for relocatable CMake package
include(${CMAKE_CURRENT_SOURCE_DIR}/scripts/build-info.cmake)

@@ -870,10 +867,6 @@ if (LLAMA_CPU_HBM)
target_link_libraries(ggml PUBLIC memkind)
endif()

- if (LLAMA_PERF)
- add_compile_definitions(GGML_PERF)
- endif()
-
function(get_flags CCID CCVER)
set(C_FLAGS "")
set(CXX_FLAGS "")
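Since commit 95f57bb removes the LLAMA_PERF option along with GGML_PERF, configuring with -DLLAMA_PERF=ON now has no effect beyond setting an unused cache variable (CMake may warn about it). A minimal configure-and-build sketch after this change, with the flag simply dropped:

    cmake -B build
    cmake --build build --config Release -j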
5 changes: 1 addition & 4 deletions Makefile
@@ -344,9 +344,6 @@ ifdef LLAMA_GPROF
MK_CFLAGS += -pg
MK_CXXFLAGS += -pg
endif
- ifdef LLAMA_PERF
- MK_CPPFLAGS += -DGGML_PERF
- endif

# Architecture specific
# TODO: probably these flags need to be tweaked on some architectures
@@ -1051,7 +1048,7 @@ tests/test-grammar-parser: tests/test-grammar-parser.cpp ggml.o llama.o grammar-
$(CXX) $(CXXFLAGS) -c $< -o $(call GET_OBJ_FILE, $<)
$(CXX) $(CXXFLAGS) $(filter-out %.h $<,$^) $(call GET_OBJ_FILE, $<) -o $@ $(LDFLAGS)

- tests/test-grammar-integration: tests/test-grammar-integration.cpp ggml.o llama.o grammar-parser.o $(OBJS)
+ tests/test-grammar-integration: tests/test-grammar-integration.cpp json-schema-to-grammar.o ggml.o llama.o grammar-parser.o $(OBJS)
$(CXX) $(CXXFLAGS) -c $< -o $(call GET_OBJ_FILE, $<)
$(CXX) $(CXXFLAGS) $(filter-out %.h $<,$^) $(call GET_OBJ_FILE, $<) -o $@ $(LDFLAGS)

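Linking json-schema-to-grammar.o into test-grammar-integration matches the new JSON-schema-to-GBNF integration tests from commit c5a8d4b. A quick build-and-run sketch, assuming the usual in-tree Make workflow:

    make tests/test-grammar-integration
    ./tests/test-grammar-integration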