- Spain
Popular repositories
- llama.cpp (Public, forked from ggerganov/llama.cpp)
  Port of Facebook's LLaMA model in C/C++
  C · 6 stars
- Hearthstone-Deck-Tracker (Public, forked from HearthSim/Hearthstone-Deck-Tracker)
  C#
1,137 contributions in the last year
Contribution activity
March 2024
Created 57 commits in 3 repositories
Created a pull request in ggerganov/llama.cpp that received 39 comments
backend : offload large batches to GPU
Moves the logic of auto-offloading to the GPU when processing large batches to ggml_backend_sched. Currently only CUDA and Vulkan support this, thi… (a rough sketch of the general idea follows below)
+349 −396 lines changed • 39 comments
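Since the PR description above is truncated, here is only a minimal C sketch of the general idea of batch-size-based auto-offload, not the PR's actual implementation. The names `backend_t`, `sched_pick_backend`, and `LARGE_BATCH_THRESHOLD` are hypothetical and are not part of the real ggml_backend_sched API.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for a backend handle; the real ggml_backend_sched
 * types differ. This only illustrates the decision being made in one
 * central place (the scheduler) instead of inside each backend. */
typedef enum { BACKEND_CPU, BACKEND_GPU } backend_t;

/* Assumed threshold: batches at least this large are presumed worth the
 * cost of copying weights to the GPU. The PR's real heuristic is not
 * recoverable from this page. */
#define LARGE_BATCH_THRESHOLD 32

/* Choose where to run an op for a given batch size. Backends that cannot
 * accept offloaded work (per the PR, only CUDA and Vulkan could at the
 * time) simply report supports_offload = false. */
static backend_t sched_pick_backend(int n_batch, bool supports_offload) {
    if (supports_offload && n_batch >= LARGE_BATCH_THRESHOLD) {
        return BACKEND_GPU; /* large batch: auto-offload to the GPU */
    }
    return BACKEND_CPU;     /* small batch: stay on the CPU */
}

int main(void) {
    printf("batch 8   -> %s\n", sched_pick_backend(8,   true) == BACKEND_GPU ? "GPU" : "CPU");
    printf("batch 512 -> %s\n", sched_pick_backend(512, true) == BACKEND_GPU ? "GPU" : "CPU");
    return 0;
}
```

With a rule like this, per-backend code no longer decides when to offload; it only reports whether it can, which matches the PR's note that only CUDA and Vulkan supported the feature at the time.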
Opened 22 other pull requests in 1 repository
ggerganov/llama.cpp (21 merged, 1 open)
- ggml : fix bounds checking of zero size views (Mar 27)
- cuda : rename build flag to LLAMA_CUDA (Mar 25)
- cuda : fix LLAMA_CUDA_F16 build (Mar 25)
- cuda : refactor into multiple files (Mar 24)
- move BLAS to a separate backend (Mar 21)
- cuda : add LLAMA_CUDA_NO_PEER_COPY to workaround broken ROCm p2p copy (Mar 21)
- cuda : disable host register by default (Mar 21)
- cuda : fix LLAMA_CUDA_F16 build (Mar 21)
- cuda : fix conflict with std::swap (Mar 20)
- cuda : print the returned error when CUDA initialization fails (Mar 20)
- cuda : refactor to remove global resources (Mar 19)
- ci : exempt some labels from being tagged as stale (Mar 18)
- backend : set max split inputs to GGML_MAX_SRC (Mar 18)
- llama : fix Baichuan2 13B (Mar 15)
- cuda : disable unused cudaLaunchHostFunc code (Mar 15)
- llama-bench : use random tokens to improve accuracy with mixtral (Mar 14)
- test-backend-ops : skip CPU backend by default (Mar 12)
- ci : remove tidy-review (Mar 12)
- llama : add pipeline parallelism support (Mar 12)
- perplexity : support using multiple sequences to allow larger batch sizes (Mar 8)
- compare-llama-bench.py : remove mul_mat_q (Mar 5)
- cuda : fix data race in soft max (Mar 3)
Reviewed 44 pull requests in 3 repositories
ggerganov/llama.cpp (25 pull requests)
- [Model] Add support for xverse (Mar 27)
- split: allow --split-max-size option (Mar 27)
- Vulkan k-quant mmq and ggml-backend offload functionality (Mar 27)
- Control vectors in server (Mar 26)
- llama : greatly reduce output buffer memory usage (Mar 26)
- IQ1_M: 1.75 bpw quantization (Mar 25)
- Fix heap corruption from wmode out-of-bound writes on windows (Mar 24)
- imatrix : fix wname for mul_mat_id ops (Mar 24)
- Fixed lookup compilation issues on Windows (Mar 24)
- [SYCL] offload op (Mar 24)
- server: docs: `--threads` and `--threads-batch`, `--ubatch-size`, `--log-disable` (Mar 23)
- split: add gguf-split in the make build target (Mar 23)
- lookup: complement data from context with general text statistics (Mar 22)
- sampling: remove duplicated code for probability distribution access (Mar 22)
- quantize: be able to explicitly specify quantization type of output and token embedding tensors (Mar 22)
- llama_model_loader: support multiple split/shard GGUFs (Mar 22)
- common : add HF arg helpers (Mar 22)
- Fix params underscore convert to dash. (Mar 21)
- Add ability to use Q5_0, Q5_1, and IQ4_NL for quantized K cache (Mar 20)
- mpt : implement backwards compatibility with duped output tensor (Mar 18)
- Fix memory leak in clip.cpp (Mar 18)
- common: llama_load_model_from_url using --model-url (Mar 16)
- backend : offload large batches to GPU (Mar 16)
- Add qwen2moe (Mar 15)
- [SYCL] fix set main gpu error, support single/mul gpu mode (Mar 15)
- Some pull request reviews not shown.
ggerganov/ggml (3 pull requests)
- sync : llama.cpp (Mar 14)
- ggml : add RPC backend (Mar 11)
- add some new ops, fix some operators and add batch operations to certain operators. (Mar 3)
ggerganov/whisper.cpp (1 pull request)
- whisper : allocate encoder results in dedicated buffer (Mar 16)
Created an issue in ggerganov/llama.cpp that received 3 comments
Opened 1 other issue in 1 repository
ggerganov/llama.cpp (1 open)
- Improve handling of connectivity failures when loading a model from URL (Mar 24)
Answered 6 discussions in 2 repositories
ggerganov/llama.cpp
- How bcast works (Mar 25)
- How bcast works (Mar 25)
- Force 4-bit weight quantization only? (Mar 19)
- Where to insert code to deploy to custom accelerator? (Mar 18)
ggerganov/ggml
- Should there be a matrix check in `ggml_transpose`? (Mar 3)
- Is there a slice operator in ggml? If not, how do I use it / implement it? (Mar 3)