Issues: mlc-ai/mlc-llm
[Bug] TVMError: Check failed: (result) is false: Failed to allocate 99121664 bytes with alignment 16 bytes (bug) #2243, opened Apr 28, 2024 by zhjunqin
[Bug] Unexpected Error: The model weight size may be larger than GPU memory size (bug) #2239, opened Apr 27, 2024 by ahz-r3v
[Model Request] Microsoft Phi-3 mini Instruct (faster and better than Llama 3 8B) (new-models) #2238, opened Apr 27, 2024 by sebastienbo
[Bug] libc++abi: terminating due to uncaught exception of type tvm::runtime::InternalError: [14:02:26] (bug) #2233, opened Apr 26, 2024 by ash-rk
[Question] Support for Custom Attention Mask (question) #2232, opened Apr 26, 2024 by Peng-YM
[Question] Are the Apple Silicon Neural Engine (ANE) and the Core ML model package format supported? (question) #2230, opened Apr 26, 2024 by qdrddr
[Question] Is there an embeddings model in MLC format? (question) #2229, opened Apr 26, 2024 by qdrddr
[Question] Can I serve multiple models with the same instance? (question) #2228, opened Apr 26, 2024 by qdrddr
[Question] Is the GGUF model package format supported with quantized models? (question) #2227, opened Apr 26, 2024 by qdrddr
[Bug] Token IDs not accepted by JSON grammar (bug) #2223, opened Apr 25, 2024 by dtkettler
[Bug] Failed to compile because the correct code page is not set (bug) #2219, opened Apr 25, 2024 by MeroZemory
[Question] Rust SDK + WebAssembly + GPU? (question) #2218, opened Apr 25, 2024 by louis030195
[NOTICE] Transition from ChatModule to MLCEngine (status: tracking) #2217, opened Apr 25, 2024 by tqchen
How can I deploy an MLC-LLM model on a single card? I want model inference to run on only one card, not distributed. (documentation) #2213, opened Apr 25, 2024 by 137591
[Bug] Mistral-7B-Instruct-v0.2 model generates garbage responses to prompts (bug) #2207, opened Apr 23, 2024 by kalyan-nakka
Prefill rate degradation between the old and new MLC LLM stacks with Windows Vulkan (question) #2204, opened Apr 23, 2024 by zhongzhenluo
[Question] How to use one's own Polyglot model? (question) #2200, opened Apr 22, 2024 by Moon-Ahn
[Doc] Add info about Git and Git LFS to documentation (documentation) #2199, opened Apr 22, 2024 by AleksanderObuchowski
[Feature Request] Nightly or weekly Android APK build (feature request) #2194, opened Apr 22, 2024 by EwoutH
[Question] Can PagedKVCache support different KV cache sizes in different layers? (question) #2193, opened Apr 22, 2024 by BenchuYee