[Question]: How can i get all nodes from the PGVectorDB? #13363

Open
1 task done
doganaktarr opened this issue May 8, 2024 · 2 comments
Labels
question Further information is requested

Comments

doganaktarr commented May 8, 2024

Question Validation

  • I have searched both the documentation and discord for an answer.

Question

Hi, I have stored a collection of nodes in a PGVector DB, and I'd like to retrieve all of them so I can run a BM25 Retriever over them. I don't want to re-read the source documents every time I want to retrieve with BM25.

I've already tried to get it with:

from llama_index.core import VectorStoreIndex

# vector_store is an existing PGVectorStore instance
index = VectorStoreIndex.from_vector_store(vector_store=vector_store)
nodes = index.docstore.docs.values()
print("nodes: ", nodes)

nodes: dict_values([])

It returns an empty dict_values even though the table contains rows.

doganaktarr added the question label May 8, 2024

dosubot bot commented May 8, 2024

To retrieve all nodes from your PGVectorDB for use with a BM25 Retriever, ensure your document store (docstore) is correctly initialized with documents from PGVectorDB. The BM25Retriever can be initialized with this document store using the from_defaults method, which will automatically handle the retrieval of nodes if the docstore parameter is provided.

Here's how you can initialize the BM25Retriever with a document store from PGVectorDB:

from llama_index.retrievers.bm25.base import BM25Retriever
from llama_index.vector_stores.postgres import PGVectorStore

# Assuming you have a PGVectorStore instance
vector_store = PGVectorStore.from_params(
    database="vector_db",
    host="localhost",
    password="password",
    port=5432,
    user="postgres",
    table_name="your_table_name",
    embed_dim=1536  # Dimension of your embeddings
)

# Initialize BM25Retriever with the document store
bm25_retriever = BM25Retriever.from_defaults(docstore=vector_store, tokenizer=None, verbose=True)

Make sure your PGVectorDB instance or wrapper correctly implements the BaseDocumentStore interface, especially the .docs attribute or method that returns documents in a format compatible with BaseNode. This will allow the BM25Retriever to access and use the documents stored in PGVectorDB for retrieval tasks efficiently.
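As an alternative, here is a minimal sketch that reads the stored rows directly from Postgres and rebuilds nodes for BM25, under assumptions not confirmed in this thread. It relies on the observation that an index created with `VectorStoreIndex.from_vector_store` usually starts with an empty in-memory docstore, because the node text is kept in the vector store table itself. The table name `data_your_table_name` and the columns `node_id`, `text`, and `metadata_` are assumptions about how PGVectorStore lays out its table, so check your own schema first. The helper names `fetch_rows`, `rows_to_node_dicts`, and `build_bm25_retriever` are hypothetical.

```python
import json
from typing import Any, Dict, Iterable, List, Tuple

# One stored node as (node_id, text, metadata); metadata may be a dict,
# a JSON string, or None depending on the database driver.
Row = Tuple[str, str, Any]


def fetch_rows(conn, table: str = "data_your_table_name") -> List[Row]:
    """Read every node row through a DB-API connection (e.g. psycopg2).

    ASSUMPTION: table and column names follow the layout described above.
    The table name is interpolated into the SQL, so pass a trusted constant.
    """
    cur = conn.cursor()
    try:
        cur.execute(f"SELECT node_id, text, metadata_ FROM {table}")
        return cur.fetchall()
    finally:
        cur.close()


def rows_to_node_dicts(rows: Iterable[Row]) -> List[Dict[str, Any]]:
    """Convert raw rows into plain dicts, skipping rows with no text."""
    nodes = []
    for node_id, text, metadata in rows:
        if not text:
            continue
        if isinstance(metadata, str):
            metadata = json.loads(metadata)
        nodes.append({"id_": node_id, "text": text, "metadata": metadata or {}})
    return nodes


def build_bm25_retriever(node_dicts: List[Dict[str, Any]], top_k: int = 2):
    """Build a BM25Retriever from the rebuilt nodes."""
    # Imported lazily so the helpers above work without llama_index installed.
    from llama_index.core.schema import TextNode
    from llama_index.retrievers.bm25 import BM25Retriever

    nodes = [
        TextNode(id_=d["id_"], text=d["text"], metadata=d["metadata"])
        for d in node_dicts
    ]
    return BM25Retriever.from_defaults(nodes=nodes, similarity_top_k=top_k)
```

With a psycopg2 connection `conn`, usage would look like `build_bm25_retriever(rows_to_node_dicts(fetch_rows(conn)))`. The stored metadata may include internal keys written by the vector store, so you may want to filter it before building the nodes.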

About Dosu
This response is meant to be useful and save you time. It is not meant to be a precise solution, but rather a starting point for your own research.


yzgrfsy commented May 23, 2024

@doganaktarr Has this problem been resolved for you? I'm running into the same issue when using an ES database: nodes: dict_values([])
