Pull requests: ggerganov/llama.cpp
#7013 · Update Server's README with undocumented options for RoPE, YaRN, and KV cache quantization · opened Apr 30, 2024 by K-Mistele
#6988 · New tokenizer-verifier tool to check GGUF tokenizer parameters · opened Apr 29, 2024 by anisse
#6950 · Server: add test for num slots, fails on master · opened Apr 27, 2024 by JohannesGaessler
#6941 · Update server_queue to delete tasks from the queue when the server is shut down (feature request #6421) · opened Apr 27, 2024 by rahsuri · label: demo (demonstrates a concept or idea, not intended to be merged)
#6940 · Implemented basic interface for llamacheck and link to weights, adapt… · opened Apr 27, 2024 by Ferruolo
#6921 · Fix off-by-one error when context shifting in the main.cpp example · opened Apr 26, 2024 by l3utterfly
#6915 · Draft idea: CPU inference; this seems to perform better? · opened Apr 26, 2024 by kunnis
#6869 · ggml-qnn: add Qualcomm QNN (Qualcomm Neural Network, a.k.a. Qualcomm AI Engine Direct) backend · opened Apr 24, 2024 by zhouwg