backend : offload large batches to GPU #6083

Merged · 9 commits · Mar 18, 2024
cuda : fix memset without set_device
slaren committed Mar 16, 2024
commit 9cba8a183d147158009c6b57803378aa9962c521
1 change: 1 addition & 0 deletions in ggml-cuda.cu
slaren marked this conversation as resolved.
@@ -10618,6 +10618,7 @@ GGML_CALL static void ggml_backend_cuda_buffer_init_tensor(ggml_backend_buffer_t
     size_t padded_size = ggml_backend_buft_get_alloc_size(buffer->buft, tensor);

     if (padded_size > original_size && tensor->view_src == nullptr) {
+        ggml_cuda_set_device(ctx->device);
         CUDA_CHECK(cudaMemset((char *)tensor->data + original_size, 0, padded_size - original_size));
     }
 }
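
For context on the one-line fix: CUDA runtime memory operations such as cudaMemset are issued against the currently selected device, so when a tensor's buffer lives on a GPU other than the one currently active, the owning device must be made current before zeroing the padding. Below is a minimal standalone sketch of that pattern, not the ggml implementation; the set_device_cached helper is hypothetical, written on the assumption that ggml_cuda_set_device similarly wraps cudaSetDevice and skips redundant switches.

    #include <cuda_runtime.h>
    #include <cstdio>
    #include <cstdlib>

    #define CUDA_CHECK(call)                                            \
        do {                                                            \
            cudaError_t err_ = (call);                                  \
            if (err_ != cudaSuccess) {                                  \
                fprintf(stderr, "CUDA error: %s at %s:%d\n",            \
                        cudaGetErrorString(err_), __FILE__, __LINE__);  \
                exit(EXIT_FAILURE);                                     \
            }                                                           \
        } while (0)

    // Hypothetical helper mirroring what ggml_cuda_set_device is assumed
    // to do: switch the current device only when it actually changes.
    static void set_device_cached(int device) {
        static int current = -1;
        if (device != current) {
            CUDA_CHECK(cudaSetDevice(device));
            current = device;
        }
    }

    int main() {
        int n_devices = 0;
        CUDA_CHECK(cudaGetDeviceCount(&n_devices));
        const int device = n_devices - 1; // allocate on the last device

        set_device_cached(device);
        char * data = nullptr;
        CUDA_CHECK(cudaMalloc((void **) &data, 1024));

        // Unrelated work may switch the current device in the meantime.
        set_device_cached(0);

        // Re-select the owning device before issuing the memset; without
        // this, the operation would be issued against whichever device
        // happens to be current, which is the situation the commit fixes.
        set_device_cached(device);
        CUDA_CHECK(cudaMemset(data, 0, 1024));

        CUDA_CHECK(cudaFree(data));
        return 0;
    }

With a single GPU the two set_device_cached calls are no-ops, but on a multi-GPU system the re-selection before cudaMemset is what keeps the operation on the device that owns the buffer.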