Trending
See what the GitHub community is most excited about today.
Language: Cuda
Date range: Today
NVIDIA / cuda-checkpoint
CUDA checkpoint and restore utility
flashinfer-ai / flashinfer
FlashInfer: Kernel Library for LLM Serving
NVIDIA / nvbench
CUDA Kernel Benchmarking Library
sbip-sg / CuEVM
CUDA implementation of an EVM bytecode executor
NVIDIA / cub
[ARCHIVED] Cooperative primitives for CUDA C++. See https://github.com/NVIDIA/cccl
nerfstudio-project / gsplat
CUDA accelerated rasterization of gaussian splatting
rapidsai / raft
RAFT contains fundamental widely-used algorithms and primitives for machine learning and information retrieval. The algorithms are CUDA-accelerated and form building blocks for more easily writing high performance applications.
NVIDIA / nccl-tests
NCCL Tests
Dao-AILab / causal-conv1d
Causal depthwise conv1d in CUDA, with a PyTorch interface
ashawkey / diff-gaussian-rasterization
BBuf / how-to-optim-algorithm-in-cuda
How to optimize some algorithms in CUDA.
DefTruth / CUDA-Learn-Notes
🎉 CUDA notes / hand-written CUDA kernels for large models / C++ notes, updated irregularly: flash_attn, sgemm, sgemv, warp reduce, block reduce, dot product, elementwise, softmax, layernorm, rmsnorm, hist, etc.
JonathonLuiten / diff-gaussian-rasterization-w-depth
rapidsai / cugraph
cuGraph - RAPIDS Graph Analytics Library
karpathy / llm.c
LLM training in simple, raw C/CUDA