
[Model] Add support for xverse #6301

Merged
merged 12 commits into from Mar 29, 2024

Conversation

hxer7963
Contributor

[Model] Add support for xverse

llama.cpp: add support for the Xverse model architecture

  • init_mappings: turn off prefetching when loading the xverse model, to avoid getting stuck when calling mmap on Linux.

gguf-py: add conversion support for the Xverse model architecture
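
For orientation, the gguf-py side of a change like this hooks the new architecture into the converter's class registry; the diff excerpt further down shows the from_model_architecture lookup against that registry. A minimal, hypothetical sketch of the pattern (the decorator and the XverseModel body are illustrative, not the PR's actual code):

from typing import Callable

class Model:
    _model_classes: dict[str, type] = {}

    @classmethod
    def register(cls, *names: str) -> Callable[[type], type]:
        # Map one or more HF architecture names to the subclass
        # that knows how to convert them to GGUF.
        def func(model_class: type) -> type:
            for name in names:
                cls._model_classes[name] = model_class
            return model_class
        return func

@Model.register("XverseForCausalLM")
class XverseModel(Model):
    # Tensor renaming and GGUF metadata for Xverse would go here.
    ...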

willhe and others added 4 commits March 13, 2024 11:35
@slaren
Collaborator

slaren commented Mar 25, 2024

  • init_mappings: turn off prefetching when loading the xverse model, to avoid getting stuck when calling mmap on Linux.

Can you elaborate on why this is necessary specifically for this model and not others?

@@ -212,6 +212,7 @@ def from_model_architecture(cls, arch):
         try:
             return cls._model_classes[arch]
         except KeyError:
+            print(f"{cls._model_classes}")
Collaborator
I don't think we need to leave this in. Unless you think the exception message should mention the full list of supported models.
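
For illustration, the alternative floated here could make the method raise with the list embedded in the message, roughly like this (a hypothetical sketch that would slot into the same class, not code proposed in the PR):

@classmethod
def from_model_architecture(cls, arch):
    try:
        return cls._model_classes[arch]
    except KeyError:
        # Surface the supported architectures in the error itself
        # instead of printing them as a side effect.
        raise NotImplementedError(
            f"Architecture {arch!r} not supported; "
            f"known architectures: {sorted(cls._model_classes)}"
        ) from None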

Contributor Author

I don't think we need to leave this in. Unless you think the exception message should mention the full list of supported models.

I will remove the redundant logs.

root and others added 2 commits March 26, 2024 08:32
* llama: remove the init_mapping_prefetch custom parameter
@hxer7963
Contributor Author

  • init_mappings: turn off prefetching when loading the xverse model, to avoid getting stuck when calling mmap on Linux.

Can you elaborate on why this is necessary specifically for this model and not others?

Good question!

I turned the prefetch switch back on and retested the xverse model loading locally, and found no problem.
During earlier testing the model was probably just so large (a 65B model) that it loaded slowly, and I mistook the slow load for the program being stuck.

The prefetch setting has been restored in the new commit.
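
For context, the prefetch being discussed is a loader hint that asks the kernel to page the mapped model file in ahead of first use; on a very large file the resulting disk I/O can make startup look hung even while it is progressing. A minimal sketch of the pattern in Python (hypothetical, not llama.cpp's actual C++ loader; madvise needs Python 3.8+ and MADV_WILLNEED is Linux-specific):

import mmap
import os

def map_model(path: str, prefetch: bool = True) -> mmap.mmap:
    # Map the whole file read-only, as a model loader would.
    fd = os.open(path, os.O_RDONLY)
    try:
        size = os.fstat(fd).st_size
        mm = mmap.mmap(fd, size, prot=mmap.PROT_READ)
    finally:
        os.close(fd)  # the mapping stays valid after the fd is closed
    if prefetch and hasattr(mmap, "MADV_WILLNEED"):
        # Hint the kernel to read the file in ahead of use. On a huge
        # file (e.g. a 65B model) this can make the first load phase
        # look "stuck" while the data streams in from disk.
        mm.madvise(mmap.MADV_WILLNEED)
    return mm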

@hxer7963
Contributor Author

Hello @cebtenzzre, @compilade,

I've addressed all the feedback in the pull request.

The EditorConfig Checker and flake8 Lint checks failed; how can I fix them?

Could you please take another look at the code changes and provide your review?

Thank you for your time and assistance.

Best regards,

hxer7963.

@slaren
Collaborator

slaren commented Mar 27, 2024

The EditorConfig Checker and flake8 Lint checks failed; how can I fix them?

You have to fix the formatting issues. Click on "Details" and it will tell you the reason.

 convert-hf-to-gguf.py:
	885: Trailing whitespace
	921: Trailing whitespace
./convert-hf-to-gguf.py:781:1: E302 expected 2 blank lines, found 1
./convert-hf-to-gguf.py:885:1: W293 blank line contains whitespace
./convert-hf-to-gguf.py:921:1: W293 blank line contains whitespace
./convert-hf-to-gguf.py:922:1: E302 expected 2 blank lines, found 1
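
For reference, these are whitespace-style findings: W293 flags a "blank" line that still contains spaces or tabs, and E302 requires two blank lines before a top-level definition. A hypothetical fragment (not the PR's actual code) that satisfies both:

def set_vocab():    # E302 fires when fewer than two blank lines
    pass            # separate top-level definitions


def set_gguf_parameters():  # the two blank lines above satisfy E302
    pass
# W293 is cleared by deleting the stray spaces/tabs from blank lines;
# the EditorConfig Checker flags the same trailing whitespace.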

@hxer7963 requested a review from slaren on March 28, 2024
@ggerganov
Owner

The macOS runner is occasionally started without access to the GPU, so this is not related to this PR. The Benchmark CI was added recently, so we can expect some instability at the start; nothing to worry about.

@hxer7963
Contributor Author

The macOS runner is occasionally started without access to the GPU, so this is not related to this PR. The Benchmark CI was added recently, so we can expect some instability at the start; nothing to worry about.

Thank you very much, @ggerganov, for the patient reply.

May I ask how long it usually takes to merge into the main branch?

@ggerganov
Owner

We can merge after @slaren's approval

@slaren merged commit 0695747 into ggerganov:master on Mar 29, 2024
51 of 57 checks passed
hodlen pushed a commit to hodlen/llama.cpp that referenced this pull request Apr 1, 2024
* Support converting xverse models to GGUF format.

* 1. Convert xverse models to GGUF;
2. Add LLM_ARCH_XVERSE inference in llama.cpp;
3. Add an xverse entry to Supported models in README.md;

* * gguf-py: remove redundant logs
* llama: remove the init_mapping_prefetch custom parameter

* llama.cpp: include the changes from ggerganov#6122 to exclude the unused outputs of the last layers.

* - Fix format issues
- Remove the duplicate setting of kqv_out in llm_build_kv

* Update llama.cpp

---------

Co-authored-by: willhe <willhe@xverse.cn>
Co-authored-by: willhe <hexin@xverse.cn>
hodlen pushed a commit to hodlen/llama.cpp that referenced this pull request Apr 3, 2024
tybalex pushed a commit to tybalex/function.cpp that referenced this pull request Apr 17, 2024