NVIDIA® TensorRT™ is an SDK for high-performance deep learning inference on NVIDIA GPUs. This repository contains the open source components of TensorRT. (C++, updated May 24, 2024)
A WebGL accelerated JavaScript library for training and deploying ML models.
WebGL-accelerated ML // linear algebra // automatic differentiation for JavaScript.
A highly efficient implementation of Gaussian Processes in PyTorch
Generative AI reference workflows optimized for accelerated infrastructure and microservice architecture.
HugeCTR is a high-efficiency GPU framework designed for training Click-Through-Rate (CTR) estimation models
Deep learning toolkit-enabled VLSI placement
Sionna: An Open-Source Library for Next-Generation Physical Layer Research
BlazingSQL is a lightweight, GPU accelerated, SQL engine for Python. Built on RAPIDS cuDF.
Run, compile, and execute JavaScript for scientific computing and data visualization entirely in your browser: an open-source scientific computing environment for JavaScript with GPU-accelerated matrix operations, TeX support, data visualization, and symbolic computation.
A high performance anime upscaler
Stretching GPU performance for GEMMs and tensor contractions.
Fast Neural Machine Translation in C++ - development repository
A series of GPU optimization topics covering in detail how to optimize CUDA kernels, including several basic kernels: elementwise, reduce, sgemv, sgemm, etc. The performance of these kernels is at or near the theoretical limit.
NVIDIA Merlin is an open source library providing end-to-end GPU-accelerated recommender systems, from feature engineering and preprocessing to training deep learning models and running inference in production.
CUDA C++ Core Libraries
TornadoVM: A practical and efficient heterogeneous programming framework for managed languages
GPU-accelerated Deep Learning on Windows 10 native
Deep learning in Rust, with shape checked tensors and neural networks
A hardware-accelerated GPU terminal emulator focused on running on desktops and in browsers.