Pull requests: ggerganov/llama.cpp
#7013 · Update Server's README with undocumented options for RoPE, YaRN, and KV cache quantization · opened Apr 30, 2024 by K-Mistele
#6988 · New tokenizer-verifier tool to check GGUF tokenizer parameters · opened Apr 29, 2024 by anisse
#6950 · Server: add test for num slots, fails on master · opened Apr 27, 2024 by JohannesGaessler
#6941 · Update server_queue to delete tasks from the queue when the server is shut down (feature request #6421) · opened Apr 27, 2024 by rahsuri · label: demo (demonstrates a concept or idea, not intended to be merged)
#6940 · Implemented basic interface for llamacheck and link to weights, adapt… · opened Apr 27, 2024 by Ferruolo
#6921 · Fix off-by-one error when context shifting in the main.cpp example · opened Apr 26, 2024 by l3utterfly
#6915 · Draft idea: CPU inference; this seems to perform better? · opened Apr 26, 2024 by kunnis
#6869 · ggml-qnn: add Qualcomm QNN (Qualcomm Neural Network, a.k.a. Qualcomm AI Engine Direct) backend · opened Apr 24, 2024 by zhouwg