Force 4-bit weight quantization only? #6150
Asked by wilderfield in Q&A · Answered by slaren
I notice the Q4_K_M quantized llama2 models have some weight tensors that are 4-bit and some that are 6-bit. Is there a way to force 4-bit only?
Answered by slaren (Mar 19, 2024); answer selected by wilderfield:
`./quantize --pure` should do it.
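For context, a full invocation might look like the sketch below. The positional arguments (input GGUF, output GGUF, quantization type) follow the `quantize` tool's usual order; the file paths and model name are placeholders, not from this discussion. The `--pure` flag disables the per-tensor type overrides that k-quant mixes like Q4_K_M normally apply, so every quantizable tensor gets the same base type:

```shell
# Hypothetical paths: quantize an f16 GGUF so that all quantizable
# tensors use Q4_K, with no 6-bit (Q6_K) tensors mixed in.
./quantize --pure \
    ./models/llama-2-7b/ggml-model-f16.gguf \
    ./models/llama-2-7b/ggml-model-Q4_K.gguf \
    Q4_K
```

Note that without `--pure`, Q4_K_M intentionally keeps a few sensitive tensors at higher precision to limit quality loss, so a pure 4-bit model may show somewhat worse perplexity.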