opened 08:15PM - 18 Dec 24 UTC
bug
### How to reproduce this bug?
Here is code that reproduces the issue:
```python
import os
import weaviate
from weaviate import classes as wvc
import weaviate.error_msgs
client = weaviate.connect_to_local(
    headers={
        "X-OpenAI-Api-Key": os.getenv("OPENAI_APIKEY", "CHANGE_ME"),
    }
)
print(f"Client: {weaviate.__version__}, Server: {client.get_meta().get('version')}")
client.collections.delete("Test")
collection = client.collections.create(
    name="Test",
    vectorizer_config=wvc.config.Configure.Vectorizer.text2vec_openai(
        model="text-embedding-3-large",
        dimensions=1024,
        #type_="text",
        vectorize_collection_name=False
    ),
    properties=[
        wvc.config.Property(
            name="text",
            data_type=wvc.config.DataType.TEXT,
            tokenization=wvc.config.Tokenization.WORD
        )
    ]
)
# Create a single object
response = collection.data.insert(
    properties={"text": "COVID-19 has many symptoms."}
)
# the stored object does have a 1024-dimension vector
response = collection.query.fetch_objects(
    limit=5,
    include_vector=True
)
for obj in response.objects:
    print(
        f"fetch_objects: {obj.uuid} ({len(obj.vector['default'])}) | Properties: {obj.properties}")
# a near_text query works
response = collection.query.near_text(
    query="hybrid query with 1024 dimensions",
    #alpha=0.75,
    limit=5,
    include_vector=True
)
for obj in response.objects:
    print(
        f"near text query: {obj.uuid} ({len(obj.vector['default'])}) | Properties: {obj.properties}")
# but a hybrid query fails
try:
    response = collection.query.hybrid(
        query="hybrid query with 1024 dimensions",
        alpha=0.75,
        limit=5,
        include_vector=True
    )
    for obj in response.objects:
        print(
            f"hybrid query: {obj.uuid} ({len(obj.vector['default'])}) | Properties: {obj.properties}")
except Exception as e:
    print("ERROR!!!", e)
# if we close the client
client.close()
# and point the OpenAI base URL at a request-catching endpoint
client = weaviate.connect_to_local(
    headers={
        "X-OpenAI-Api-Key": os.getenv("OPENAI_APIKEY", "CHANGE_ME"),
        "X-OpenAI-BaseUrl": "https://webhook.site/beef60de-4d45-4c61-9928-b20fa619f91e",
    }
)
collection = client.collections.get("Test")
response = collection.query.hybrid(
    query="hybrid query with 1024 dimensions",
    #alpha=0.75,
    limit=5,
    include_vector=True
)
for obj in response.objects:
    print(
        f"hybrid query: {obj.uuid} ({len(obj.vector['default'])}) | Properties: {obj.properties}")
# the endpoint receives this payload
payload = {
    "input": [
        "hybrid query with 1024 dimensions"
    ],
    "model": "text-embedding-3-small",
    "dimensions": 1536
}
```
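If webhook.site is not an option, the same kind of capture can be done locally. Below is a minimal sketch using only the Python standard library; `captured`, `CaptureHandler`, and `start_capture_server` are hypothetical names introduced here, not part of Weaviate or its client. It assumes the Weaviate server can reach the machine running it (a Docker-based setup will likely need a different bind address and hostname), and it replies with an empty JSON object, so the query itself is expected to fail after the request has been recorded.

```python
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

# Parsed JSON bodies of every POST request received so far (hypothetical helper state).
captured = []


class CaptureHandler(BaseHTTPRequestHandler):
    """Records the body of any POST request, whatever the path."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        body = self.rfile.read(length)
        try:
            captured.append(json.loads(body))
        except ValueError:
            # Not valid JSON: keep the raw text so nothing is lost.
            captured.append({"raw": body.decode("utf-8", "replace")})
        # Reply with an empty JSON object. This is NOT a valid embeddings
        # response; only the incoming request is of interest here.
        reply = b"{}"
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(reply)))
        self.end_headers()
        self.wfile.write(reply)

    def log_message(self, format, *args):
        pass  # keep the console quiet


def start_capture_server(host="127.0.0.1", port=0):
    """Start the capture server in a background thread; port=0 picks a free port."""
    server = HTTPServer((host, port), CaptureHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

To use it, call `server = start_capture_server()`, pass `http://<host>:<port>` (the port is `server.server_address[1]`) as the `X-OpenAI-BaseUrl` header, run the hybrid query, and then inspect `captured`.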
### What is the expected behavior?
The hybrid search should work, and the payload sent to vectorize the query should use the model and dimensions configured on the collection:
```json
{
  "input": [
    "hybrid query with 1024 dimensions"
  ],
  "model": "text-embedding-3-large",
  "dimensions": 1024
}
```
### What is the actual behavior?
The payload generated to vectorize the hybrid query carries the wrong model and dimensions (`text-embedding-3-small` and 1536 instead of `text-embedding-3-large` and 1024):
```json
{
  "input": [
    "hybrid query with 1024 dimensions"
  ],
  "model": "text-embedding-3-small",
  "dimensions": 1536
}
```
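To make the mismatch explicit, here is a small sketch that compares the expected and the observed payloads key by key. `payload_mismatches` is a hypothetical helper written for this report, not a Weaviate function; the two dictionaries are copied from the payloads shown above.

```python
def payload_mismatches(expected, actual):
    """Return {key: (expected_value, actual_value)} for every key that differs."""
    keys = expected.keys() | actual.keys()
    return {
        key: (expected.get(key), actual.get(key))
        for key in sorted(keys)
        if expected.get(key) != actual.get(key)
    }


# Payload implied by the collection configuration.
expected = {
    "input": ["hybrid query with 1024 dimensions"],
    "model": "text-embedding-3-large",
    "dimensions": 1024,
}

# Payload actually received by the catch endpoint.
actual = {
    "input": ["hybrid query with 1024 dimensions"],
    "model": "text-embedding-3-small",
    "dimensions": 1536,
}

print(payload_mismatches(expected, actual))
# {'dimensions': (1024, 1536), 'model': ('text-embedding-3-large', 'text-embedding-3-small')}
```

The `input` field matches, so only the model and dimensions are affected.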
### Supporting information
Client: 4.10.2, Server: 1.28.1
### Server Version
1.28.1
### Weaviate Setup
Single Node
### Nodes count
1
### Code of Conduct
- [X] I have read and agree to Weaviate's [Contributor Guide](https://weaviate.io/developers/contributor-guide) and [Code of Conduct](https://weaviate.io/service/code-of-conduct)