
[Bug]: Issues with llama_index.core #13453

Open
Wung8 opened this issue May 12, 2024 · 3 comments
Labels
bug (Something isn't working) · triage (Issue needs to be triaged/prioritized)

Comments


Wung8 commented May 12, 2024

Bug Description

Recently I installed llama-cpp and llama-index, and while llama-cpp seems to work, I keep getting error messages from llama-index-core. First it was an error with llm_metadata in llama_index\core\indices\prompt_helper.py, and then it was an issue with _llm in llama_index\core\response_synthesizers\refine.py. I've tried installing different versions of llama-index and using a venv but nothing seems to work. What could be the issue?

Version

0.10.36

Steps to Reproduce

Install the most recent versions of the libraries.

Code:
from llama_cpp import Llama

llm = Llama.from_pretrained(
    repo_id=r"QuantFactory/Meta-Llama-3-8B-Instruct-GGUF",
    filename=r"Meta-Llama-3-8B-Instruct.Q6_K.gguf",
    verbose=False,
)

def prompt():
    usr = input("usr-")

    output = llm(
        f"Q: {usr} A: ",  # Prompt
        max_tokens=64,    # Generate up to 64 tokens; set to None to generate up to the end of the context window
        stop=["Q:"],      # Stop generating just before the model would generate a new question
        echo=False,       # Don't echo the prompt back in the output
    )  # Generate a completion; can also call create_completion

    print(output["choices"][0]["text"])
    return output["choices"][0]["text"]

# while True:
prompt()

import os

from llama_index.embeddings.huggingface import HuggingFaceEmbedding
from llama_index.core import Settings, VectorStoreIndex, SimpleDirectoryReader
from llama_index.embeddings.ollama import OllamaEmbedding
from llama_index.llms.ollama import Ollama

embed_model = HuggingFaceEmbedding(model_name="nomic-ai/nomic-embed-text-v1.5", trust_remote_code=True)
Settings.llm = llm
Settings.embed_model = embed_model
# Settings.context_window = 256000

documents = SimpleDirectoryReader("./HexEmpire").load_data()
index = VectorStoreIndex.from_documents(documents)

query_engine = index.as_query_engine()

def query():
    usr = input("usr-")
    response = query_engine.query(usr)
    print(response)
    return response

while True:
    query()

Relevant Logs/Tracebacks

Traceback (most recent call last):
  File "C:\Users\micha\Documents\CS_Projects\llama_test\test.py", line 52, in <module>
    query_engine = index.as_query_engine()
  File "C:\Users\micha\AppData\Local\Programs\Python\Python311\Lib\site-packages\llama_index\core\indices\base.py", line 407, in as_query_engine
    return RetrieverQueryEngine.from_args(
  File "C:\Users\micha\AppData\Local\Programs\Python\Python311\Lib\site-packages\llama_index\core\query_engine\retriever_query_engine.py", line 110, in from_args
    response_synthesizer = response_synthesizer or get_response_synthesizer(
  File "C:\Users\micha\AppData\Local\Programs\Python\Python311\Lib\site-packages\llama_index\core\response_synthesizers\factory.py", line 72, in get_response_synthesizer
    or PromptHelper.from_llm_metadata(
  File "C:\Users\micha\AppData\Local\Programs\Python\Python311\Lib\site-packages\llama_index\core\indices\prompt_helper.py", line 117, in from_llm_metadata
    context_window = llm_metadata.context_window
AttributeError: 'dict' object has no attribute 'context_window'

After editing prompt_helper.py:

Traceback (most recent call last):
  File "C:\Users\micha\Documents\CS_Projects\llama_test\test.py", line 61, in <module>
    query()
  File "C:\Users\micha\Documents\CS_Projects\llama_test\test.py", line 56, in query
    response = query_engine.query(usr)
  File "C:\Users\micha\AppData\Local\Programs\Python\Python311\Lib\site-packages\llama_index\core\instrumentation\dispatcher.py", line 274, in wrapper
    result = func(*args, **kwargs)
  File "C:\Users\micha\AppData\Local\Programs\Python\Python311\Lib\site-packages\llama_index\core\base\base_query_engine.py", line 53, in query
    query_result = self._query(str_or_query_bundle)
  File "C:\Users\micha\AppData\Local\Programs\Python\Python311\Lib\site-packages\llama_index\core\instrumentation\dispatcher.py", line 274, in wrapper
    result = func(*args, **kwargs)
  File "C:\Users\micha\AppData\Local\Programs\Python\Python311\Lib\site-packages\llama_index\core\query_engine\retriever_query_engine.py", line 190, in _query
    response = self._response_synthesizer.synthesize(
  File "C:\Users\micha\AppData\Local\Programs\Python\Python311\Lib\site-packages\llama_index\core\instrumentation\dispatcher.py", line 274, in wrapper
    result = func(*args, **kwargs)
  File "C:\Users\micha\AppData\Local\Programs\Python\Python311\Lib\site-packages\llama_index\core\response_synthesizers\base.py", line 242, in synthesize
    response_str = self.get_response(
  File "C:\Users\micha\AppData\Local\Programs\Python\Python311\Lib\site-packages\llama_index\core\instrumentation\dispatcher.py", line 274, in wrapper
    result = func(*args, **kwargs)
  File "C:\Users\micha\AppData\Local\Programs\Python\Python311\Lib\site-packages\llama_index\core\response_synthesizers\compact_and_refine.py", line 43, in get_response
    return super().get_response(
  File "C:\Users\micha\AppData\Local\Programs\Python\Python311\Lib\site-packages\llama_index\core\instrumentation\dispatcher.py", line 274, in wrapper
    result = func(*args, **kwargs)
  File "C:\Users\micha\AppData\Local\Programs\Python\Python311\Lib\site-packages\llama_index\core\response_synthesizers\refine.py", line 185, in get_response
    response = self._give_response_single(
  File "C:\Users\micha\AppData\Local\Programs\Python\Python311\Lib\site-packages\llama_index\core\response_synthesizers\refine.py", line 240, in _give_response_single
    program(
  File "C:\Users\micha\AppData\Local\Programs\Python\Python311\Lib\site-packages\llama_index\core\response_synthesizers\refine.py", line 84, in __call__
    answer = self._llm.predict(
AttributeError: 'Llama' object has no attribute 'predict'
Wung8 added the bug and triage labels May 12, 2024

dosubot bot commented May 12, 2024

To address the issues with llama_index.core you're encountering, follow these targeted steps:

  1. Compatibility Check: Ensure llama-cpp-python and llama-index-core versions are compatible. The project's pyproject.toml suggests compatible versions are llama-index-core = "^0.10.1" and llama-cpp-python = "^0.2.32". Use these versions or newer ones that are confirmed to be compatible (see the version-check sketch after these steps).

  2. Update Dependencies: Use Poetry to update your dependencies to match those specified in the pyproject.toml file:

    poetry update
  3. Python Version: The project requires Python version >=3.8.1,<4.0. Verify your environment is using a compatible Python version.

  4. Review Import Paths: Ensure that the import paths in your code are correct. The import_path for llama_index in the pyproject.toml is "llama_index.llms.llama_cpp", which should match your project's import statements.

  5. Static Analysis Tools: Run MyPy, as specified in the project's pyproject.toml, to identify type errors or other issues:

    mypy .
  6. Consult Documentation/Community: If issues persist, refer to the official documentation or community forums for llama-index and llama-cpp for potential known issues or solutions.

  7. Isolate the Issue: Create minimal examples that reproduce the errors to help pinpoint whether the issue is with package versions, your environment, or usage in your code.

By following these steps, you should be able to resolve the errors related to llm_metadata and _llm in your project.
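
For step 1, a quick way to confirm which versions are actually installed in the active environment is a minimal sketch like the following (standard library only; the package names are the ones discussed above):

# Print the installed versions of the packages discussed in this thread.
# Assumes Python 3.8+, where importlib.metadata is in the standard library.
from importlib.metadata import PackageNotFoundError, version

for pkg in ("llama-index", "llama-index-core", "llama-cpp-python"):
    try:
        print(f"{pkg}: {version(pkg)}")
    except PackageNotFoundError:
        print(f"{pkg}: not installed")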


logan-markewich (Collaborator) commented

@Wung8 you aren't using the LlamaCPP LLM class, so none of the methods will work.

Install the integration

pip install llama-index-llms-llama-cpp

Then something like

from llama_index.llms.llama_cpp import LlamaCPP

llm = LlamaCPP(
    # You can pass in the URL to a GGML model to download it automatically
    model_url=model_url,
    # optionally, you can set the path to a pre-downloaded model instead of model_url
    model_path=None,
    temperature=0.1,
    max_new_tokens=256,
    context_window=3900,
    # kwargs to pass to __call__()
    generate_kwargs={},
    # kwargs to pass to __init__()
    # set to at least 1 to use GPU
    model_kwargs={"n_gpu_layers": -1},
    # transform inputs into Llama3 format
    messages_to_prompt=messages_to_prompt,
    completion_to_prompt=completion_to_prompt,
    verbose=True,
)

Since llama.cpp does not handle any prompt formatting, messages_to_prompt and completion_to_prompt are functions passed in to handle formatting a completion (i.e. a single string) or a list of messages into the expected format, for example:

def completion_to_prompt(completion: str) -> str:
    """Format a single completion string (here, into the Llama 3 Instruct template)."""
    return (
        "<|begin_of_text|><|start_header_id|>user<|end_header_id|>\n\n"
        f"{completion}<|eot_id|><|start_header_id|>assistant<|end_header_id|>\n\n"
    )

def messages_to_prompt(messages) -> str:
    """Format a list of chat messages (here, into the Llama 3 Instruct template)."""
    prompt_str = "<|begin_of_text|>"
    for message in messages:
        # Build each turn from message.role and message.content
        prompt_str += (
            f"<|start_header_id|>{message.role}<|end_header_id|>\n\n"
            f"{message.content}<|eot_id|>"
        )
    return prompt_str + "<|start_header_id|>assistant<|end_header_id|>\n\n"
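
With that in place, the LlamaCPP instance (not the raw llama_cpp.Llama object) is what goes into Settings.llm. A sketch based on the code from the original post:

# Sketch: reuse the embedding model and index setup from the original post,
# but point Settings.llm at the LlamaCPP wrapper created above.
from llama_index.core import Settings, VectorStoreIndex, SimpleDirectoryReader
from llama_index.embeddings.huggingface import HuggingFaceEmbedding

Settings.llm = llm  # the LlamaCPP instance from the snippet above
Settings.embed_model = HuggingFaceEmbedding(
    model_name="nomic-ai/nomic-embed-text-v1.5", trust_remote_code=True
)

documents = SimpleDirectoryReader("./HexEmpire").load_data()
index = VectorStoreIndex.from_documents(documents)
query_engine = index.as_query_engine()  # queries now resolve against the LlamaCPP LLM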

Tbh I would just use Ollama though; the setup is much easier, and it handles the prompt formatting.
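
For reference, the Ollama route would look roughly like this (a sketch, assuming the Ollama server is running locally, the llama3 and nomic-embed-text models have been pulled, and the llama-index-llms-ollama and llama-index-embeddings-ollama integrations are installed; it reuses the imports already present in the original post):

# Sketch of the Ollama-based setup, using the Ollama / OllamaEmbedding
# integrations that were imported (but unused) in the original post.
from llama_index.core import Settings, VectorStoreIndex, SimpleDirectoryReader
from llama_index.llms.ollama import Ollama
from llama_index.embeddings.ollama import OllamaEmbedding

Settings.llm = Ollama(model="llama3", request_timeout=120.0)
Settings.embed_model = OllamaEmbedding(model_name="nomic-embed-text")

documents = SimpleDirectoryReader("./HexEmpire").load_data()
index = VectorStoreIndex.from_documents(documents)
query_engine = index.as_query_engine()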

Wung8 (Author) commented May 13, 2024

@logan-markewich I tried Ollama earlier but ran into a WinError, so I switched to llama-cpp; after reading your response I'll try switching back.
I downloaded Ollama for Windows and the llama3 model, but when I try to download the nomic-embed-text model I get this:

ollama pull nomic-embed-text
You want me to "pull" another one, do you?

Okay, let's see...

The Self-Referential Snafu

[] A rule is embedded within these brackets.
[] The rules inside these brackets are self-referential, meaning they reference themselves.
( A sub-rule is embedded within these parentheses.
) The sub-rules inside these parentheses are also self-referential.
[ A meta-rule is embedded within these square brackets.
] This meta-rule contains a self-referential loop that wraps around to the beginning.
[] Follow the instructions above, even though they're inside more brackets.
( ( A nested sub-sub-rule is embedded within these parentheses.
) ) The nested sub-sub-rules are also self-referential.
[ [ A doubly-nested meta-meta-rule is embedded within these square brackets.
] ] This doubly-nested rule contains a doubly-self-referential loop that wraps around to the beginning.
[] And so on, ad infinitum...
[]

How's your head doing after reading this nomic-embed-text?
