
[Bug]: Issues with llama_index.core #13453

Open
Wung8 opened this issue May 12, 2024 · 3 comments
Labels
bug (Something isn't working) · triage (Issue needs to be triaged/prioritized)

Comments


Wung8 commented May 12, 2024

Bug Description

Recently I installed llama-cpp and llama-index, and while llama-cpp seems to work, I keep getting error messages from llama-index-core. First it was an error with llm_metadata in llama_index\core\indices\prompt_helper.py, and then it was an issue with _llm in llama_index\core\response_synthesizers\refine.py. I've tried installing different versions of llama-index and using a venv but nothing seems to work. What could be the issue?

Version

0.10.36

Steps to Reproduce

Install the most recent versions of the libraries.

Code:
from llama_cpp import Llama

llm = Llama.from_pretrained(
    repo_id=r"QuantFactory/Meta-Llama-3-8B-Instruct-GGUF",
    filename=r"Meta-Llama-3-8B-Instruct.Q6_K.gguf",
    verbose=False,
)

def prompt():
    usr = input("usr-")

    output = llm(
        f"Q: {usr} A: ",  # Prompt
        max_tokens=64,    # Generate up to 64 tokens; set to None to generate up to the end of the context window
        stop=["Q:"],      # Stop generating just before the model would generate a new question
        echo=False,       # Don't echo the prompt back in the output
    )  # Generate a completion; can also call create_completion

    print(output["choices"][0]["text"])
    return output["choices"][0]["text"]

# while True:
prompt()

import os

from llama_index.embeddings.huggingface import HuggingFaceEmbedding
from llama_index.core import Settings, VectorStoreIndex, SimpleDirectoryReader
from llama_index.embeddings.ollama import OllamaEmbedding
from llama_index.llms.ollama import Ollama

embed_model = HuggingFaceEmbedding(model_name="nomic-ai/nomic-embed-text-v1.5", trust_remote_code=True)
Settings.llm = llm
Settings.embed_model = embed_model
# Settings.context_window = 256000

documents = SimpleDirectoryReader("./HexEmpire").load_data()
index = VectorStoreIndex.from_documents(documents)

query_engine = index.as_query_engine()

def query():
    usr = input("usr-")
    response = query_engine.query(usr)
    print(response)
    return response

while True:
    query()

Relevant Logs/Tracebacks

Traceback (most recent call last):
  File "C:\Users\micha\Documents\CS_Projects\llama_test\test.py", line 52, in <module>
    query_engine = index.as_query_engine()
  File "C:\Users\micha\AppData\Local\Programs\Python\Python311\Lib\site-packages\llama_index\core\indices\base.py", line 407, in as_query_engine
    return RetrieverQueryEngine.from_args(
  File "C:\Users\micha\AppData\Local\Programs\Python\Python311\Lib\site-packages\llama_index\core\query_engine\retriever_query_engine.py", line 110, in from_args
    response_synthesizer = response_synthesizer or get_response_synthesizer(
  File "C:\Users\micha\AppData\Local\Programs\Python\Python311\Lib\site-packages\llama_index\core\response_synthesizers\factory.py", line 72, in get_response_synthesizer
    or PromptHelper.from_llm_metadata(
  File "C:\Users\micha\AppData\Local\Programs\Python\Python311\Lib\site-packages\llama_index\core\indices\prompt_helper.py", line 117, in from_llm_metadata
    context_window = llm_metadata.context_window
AttributeError: 'dict' object has no attribute 'context_window'

After editing prompt_helper.py:

Traceback (most recent call last):
  File "C:\Users\micha\Documents\CS_Projects\llama_test\test.py", line 61, in <module>
    query()
  File "C:\Users\micha\Documents\CS_Projects\llama_test\test.py", line 56, in query
    response = query_engine.query(usr)
  File "C:\Users\micha\AppData\Local\Programs\Python\Python311\Lib\site-packages\llama_index\core\instrumentation\dispatcher.py", line 274, in wrapper
    result = func(*args, **kwargs)
  File "C:\Users\micha\AppData\Local\Programs\Python\Python311\Lib\site-packages\llama_index\core\base\base_query_engine.py", line 53, in query
    query_result = self._query(str_or_query_bundle)
  File "C:\Users\micha\AppData\Local\Programs\Python\Python311\Lib\site-packages\llama_index\core\instrumentation\dispatcher.py", line 274, in wrapper
    result = func(*args, **kwargs)
  File "C:\Users\micha\AppData\Local\Programs\Python\Python311\Lib\site-packages\llama_index\core\query_engine\retriever_query_engine.py", line 190, in _query
    response = self._response_synthesizer.synthesize(
  File "C:\Users\micha\AppData\Local\Programs\Python\Python311\Lib\site-packages\llama_index\core\instrumentation\dispatcher.py", line 274, in wrapper
    result = func(*args, **kwargs)
  File "C:\Users\micha\AppData\Local\Programs\Python\Python311\Lib\site-packages\llama_index\core\response_synthesizers\base.py", line 242, in synthesize
    response_str = self.get_response(
  File "C:\Users\micha\AppData\Local\Programs\Python\Python311\Lib\site-packages\llama_index\core\instrumentation\dispatcher.py", line 274, in wrapper
    result = func(*args, **kwargs)
  File "C:\Users\micha\AppData\Local\Programs\Python\Python311\Lib\site-packages\llama_index\core\response_synthesizers\compact_and_refine.py", line 43, in get_response
    return super().get_response(
  File "C:\Users\micha\AppData\Local\Programs\Python\Python311\Lib\site-packages\llama_index\core\instrumentation\dispatcher.py", line 274, in wrapper
    result = func(*args, **kwargs)
  File "C:\Users\micha\AppData\Local\Programs\Python\Python311\Lib\site-packages\llama_index\core\response_synthesizers\refine.py", line 185, in get_response
    response = self._give_response_single(
  File "C:\Users\micha\AppData\Local\Programs\Python\Python311\Lib\site-packages\llama_index\core\response_synthesizers\refine.py", line 240, in _give_response_single
    program(
  File "C:\Users\micha\AppData\Local\Programs\Python\Python311\Lib\site-packages\llama_index\core\response_synthesizers\refine.py", line 84, in __call__
    answer = self._llm.predict(
AttributeError: 'Llama' object has no attribute 'predict'
Wung8 added the bug and triage labels May 12, 2024

dosubot bot commented May 12, 2024

To address the issues with llama_index.core you're encountering, follow these targeted steps:

  1. Compatibility Check: Ensure llama-cpp-python and llama-index-core versions are compatible. The project's pyproject.toml suggests compatible versions are llama-index-core = "^0.10.1" and llama-cpp-python = "^0.2.32". Use these versions or newer ones that are confirmed to be compatible (see the version-check sketch after these steps).

  2. Update Dependencies: Use Poetry to update your dependencies to match those specified in the pyproject.toml file:

    poetry update
  3. Python Version: The project requires Python version >=3.8.1,<4.0. Verify your environment is using a compatible Python version.

  4. Review Import Paths: Ensure that the import paths in your code are correct. The import_path for llama_index in the pyproject.toml is "llama_index.llms.llama_cpp", which should match your project's import statements.

  5. Static Analysis Tools: Run MyPy, as specified in the project's pyproject.toml, to identify type errors or other issues:

    mypy .
  6. Consult Documentation/Community: If issues persist, refer to the official documentation or community forums for llama-index and llama-cpp for potential known issues or solutions.

  7. Isolate the Issue: Create minimal examples that reproduce the errors to help pinpoint whether the issue is with package versions, your environment, or usage in your code.

By following these steps, you should be able to resolve the errors related to llm_metadata and _llm in your project.
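
For step 1, a quick way to confirm which versions are actually installed in the active environment is a minimal sketch like the following (standard library only; the package names are the ones discussed above):

# Print the installed versions of the packages discussed in this thread.
# Assumes Python 3.8+, where importlib.metadata is in the standard library.
from importlib.metadata import PackageNotFoundError, version

for pkg in ("llama-index", "llama-index-core", "llama-cpp-python"):
    try:
        print(f"{pkg}: {version(pkg)}")
    except PackageNotFoundError:
        print(f"{pkg}: not installed")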


logan-markewich (Collaborator) commented

@Wung8 you aren't using the LlamaCPP LLM class, so none of the methods will work.

Install the integration

pip install llama-index-llms-llama-cpp

Then something like

from llama_index.llms.llama_cpp import LlamaCPP

llm = LlamaCPP(
    # You can pass in the URL to a GGML model to download it automatically
    model_url=model_url,
    # optionally, you can set the path to a pre-downloaded model instead of model_url
    model_path=None,
    temperature=0.1,
    max_new_tokens=256,
    context_window=3900,
    # kwargs to pass to __call__()
    generate_kwargs={},
    # kwargs to pass to __init__()
    # set to at least 1 to use GPU
    model_kwargs={"n_gpu_layers": -1},
    # transform inputs into Llama3 format
    messages_to_prompt=messages_to_prompt,
    completion_to_prompt=completion_to_prompt,
    verbose=True,
)

Since llama.cpp does not handle any prompt formatting, messages_to_prompt and completion_to_prompt are functions passed in to handle formatting a completion (i.e. a single string) or a list of messages into the expected format, for example:

def completion_to_prompt(completion: str) -> str:
    """Format a single completion string (here, into the Llama 3 Instruct template)."""
    return (
        "<|begin_of_text|><|start_header_id|>user<|end_header_id|>\n\n"
        f"{completion}<|eot_id|><|start_header_id|>assistant<|end_header_id|>\n\n"
    )

def messages_to_prompt(messages) -> str:
    """Format a list of chat messages (here, into the Llama 3 Instruct template)."""
    prompt_str = "<|begin_of_text|>"
    for message in messages:
        # Build each turn from message.role and message.content
        prompt_str += (
            f"<|start_header_id|>{message.role}<|end_header_id|>\n\n"
            f"{message.content}<|eot_id|>"
        )
    return prompt_str + "<|start_header_id|>assistant<|end_header_id|>\n\n"
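
With that in place, the LlamaCPP instance (not the raw llama_cpp.Llama object) is what goes into Settings.llm. A sketch based on the code from the original post:

# Sketch: reuse the embedding model and index setup from the original post,
# but point Settings.llm at the LlamaCPP wrapper created above.
from llama_index.core import Settings, VectorStoreIndex, SimpleDirectoryReader
from llama_index.embeddings.huggingface import HuggingFaceEmbedding

Settings.llm = llm  # the LlamaCPP instance from the snippet above
Settings.embed_model = HuggingFaceEmbedding(
    model_name="nomic-ai/nomic-embed-text-v1.5", trust_remote_code=True
)

documents = SimpleDirectoryReader("./HexEmpire").load_data()
index = VectorStoreIndex.from_documents(documents)
query_engine = index.as_query_engine()  # queries now resolve against the LlamaCPP LLM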

Tbh I would just use Ollama though; the setup is much easier, and it handles the prompt formatting.
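
For reference, the Ollama route would look roughly like this (a sketch, assuming the Ollama server is running locally, the llama3 and nomic-embed-text models have been pulled, and the llama-index-llms-ollama and llama-index-embeddings-ollama integrations are installed; it reuses the imports already present in the original post):

# Sketch of the Ollama-based setup, using the Ollama / OllamaEmbedding
# integrations that were imported (but unused) in the original post.
from llama_index.core import Settings, VectorStoreIndex, SimpleDirectoryReader
from llama_index.llms.ollama import Ollama
from llama_index.embeddings.ollama import OllamaEmbedding

Settings.llm = Ollama(model="llama3", request_timeout=120.0)
Settings.embed_model = OllamaEmbedding(model_name="nomic-embed-text")

documents = SimpleDirectoryReader("./HexEmpire").load_data()
index = VectorStoreIndex.from_documents(documents)
query_engine = index.as_query_engine()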

Wung8 (Author) commented May 13, 2024

@logan-markewich I tried Ollama earlier but ran into a WinError, so I switched to llama-cpp; after reading your response I'll try switching back.
I downloaded Ollama for Windows and the llama3 model, but when I try to download the nomic-embed-text model I get this:

ollama pull nomic-embed-text
You want me to "pull" another one, do you?

Okay, let's see...

The Self-Referential Snafu

[] A rule is embedded within these brackets.
[] The rules inside these brackets are self-referential, meaning they reference themselves.
( A sub-rule is embedded within these parentheses.
) The sub-rules inside these parentheses are also self-referential.
[ A meta-rule is embedded within these square brackets.
] This meta-rule contains a self-referential loop that wraps around to the beginning.
[] Follow the instructions above, even though they're inside more brackets.
( ( A nested sub-sub-rule is embedded within these parentheses.
) ) The nested sub-sub-rules are also self-referential.
[ [ A doubly-nested meta-meta-rule is embedded within these square brackets.
] ] This doubly-nested rule contains a doubly-self-referential loop that wraps around to the beginning.
[] And so on, ad infinitum...
[]

How's your head doing after reading this nomic-embed-text?
