
How to load an existing db for similarity search?

weaviate-client==4.7.1
langchain-weaviate==0.0.2
langchain==0.2.11

I am able to create a simple example that builds a ‘db’ and uses it for inference in a single flow:

import weaviate
from langchain_community.document_loaders import TextLoader
from langchain_text_splitters import CharacterTextSplitter
from langchain_weaviate.vectorstores import WeaviateVectorStore

from bge import bge_m3_embedding

print('Read in text ...')
loader = TextLoader('state_of_the_union.txt')
documents = loader.load()

print('Split text ...')
text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)
docs = text_splitter.split_documents(documents)

print('Load embedding model ...')
embedding_model = bge_m3_embedding

print('Embed docs and build the collection ...')
weaviate_client = weaviate.connect_to_local()
db = WeaviateVectorStore.from_documents(docs, embedding_model, client=weaviate_client, index_name='test')

print('Perform search ...')
query = 'What did the president say about Ketanji Brown Jackson'
results = db.similarity_search_with_score(query, alpha=1)  # alpha=1 -> pure vector search
for i, (doc, score) in enumerate(results):
    print(f'{i}--->{score:.3f}')
print(results[0])

weaviate_client.close()

This all works fine: the db is created and similar docs are retrieved. However, if I now try to reuse this existing ‘db’ to run the same query, I get an IndexError:

import weaviate
from langchain_weaviate.vectorstores import WeaviateVectorStore

from bge import bge_m3_embedding

print('Load embedding model ...')
embedding_model = bge_m3_embedding

print('Load embedded docs ...')
weaviate_client = weaviate.connect_to_local()

# Attempt to attach to the existing 'test' collection by passing no documents
db = WeaviateVectorStore.from_documents([], embedding_model, client=weaviate_client, index_name='test')

print('Perform search ...')
query = 'What did the president say about Ketanji Brown Jackson'
results = db.similarity_search_with_score(query, alpha=1)
for i, (doc, score) in enumerate(results):
    print(f'{i}--->{score:.3f}')
print(results[0])

The full traceback is below:

Traceback (most recent call last):
  File "/Users/I747411/ai/lc_weaviate.py", line 22, in <module>
    db = WeaviateVectorStore.from_documents([], embedding_model, client=weaviate_client, index_name='test')
  File "/Users/I747411/ai/venv/lib/python3.10/site-packages/langchain_core/vectorstores/base.py", line 1058, in from_documents
    return cls.from_texts(texts, embedding, metadatas=metadatas, **kwargs)
  File "/Users/I747411/ai/venv/lib/python3.10/site-packages/langchain_weaviate/vectorstores.py", line 487, in from_texts
    weaviate_vector_store.add_texts(texts, metadatas, tenant=tenant, **kwargs)
  File "/Users/I747411/ai/venv/lib/python3.10/site-packages/langchain_weaviate/vectorstores.py", line 165, in add_texts
    embeddings = self._embedding.embed_documents(list(texts))
  File "/Users/I747411/ai/venv/lib/python3.10/site-packages/langchain_community/embeddings/huggingface.py", line 331, in embed_documents
    embeddings = self.client.encode(
  File "/Users/I747411/ai/venv/lib/python3.10/site-packages/sentence_transformers/SentenceTransformer.py", line 565, in encode
    if all_embeddings[0].dtype == torch.bfloat16:
IndexError: list index out of range
/Users/I747411/ai/venv/lib/python3.10/site-packages/weaviate/warnings.py:303: ResourceWarning: Con004: The connection to Weaviate was not closed properly. This can lead to memory leaks.
            Please make sure to close the connection using `client.close()`.

Note the key line: “IndexError: list index out of range”. From the traceback, it looks like from_documents([]) still calls embed_documents on the empty list, and this sentence-transformers version then indexes all_embeddings[0] unconditionally, which fails on an empty result.

What’s the proper way to connect to an existing vector db and run inference against it? Please help!
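
For reference, this is the kind of thing I was expecting to work: a minimal sketch that attaches to the existing collection directly instead of calling from_documents([]). The direct WeaviateVectorStore(...) constructor and the default text_key of 'text' are my reading of langchain-weaviate 0.0.2, so treat those parameters as assumptions:

import weaviate
from langchain_weaviate.vectorstores import WeaviateVectorStore

from bge import bge_m3_embedding

weaviate_client = weaviate.connect_to_local()
try:
    # Assumption: the constructor can attach to an existing collection
    # without adding any documents.
    db = WeaviateVectorStore(
        client=weaviate_client,
        index_name='test',           # collection created earlier by from_documents
        text_key='text',             # property holding the chunk text (assumed default)
        embedding=bge_m3_embedding,  # still needed to embed the query
    )
    query = 'What did the president say about Ketanji Brown Jackson'
    results = db.similarity_search_with_score(query, alpha=1)
    for i, (doc, score) in enumerate(results):
        print(f'{i}--->{score:.3f}')
finally:
    weaviate_client.close()  # also avoids the Con004 ResourceWarning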

