Query Latency
Hi folks!
I’m using Weaviate Cloud, and I’m seeing query latencies like this… does anyone have suggestions for what I could try? (A rough sketch of the query I’m timing is below the setup list.)
I mostly just stuck with the “bare minimum” (i.e., tutorial-level) setup to see what would happen!
- My collection is just a bunch of text files (jfk_files/jfk_text at main · amasad/jfk_files · GitHub)
- I generated a summary of each text item (~1000 tokens per doc)
- I just used the default embeddings (the 1536-dimensional OpenAI model, text-embedding-3-small)
- There are only ~1000 documents (~2–3k if I break them up into chunks)
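For context, here’s a minimal sketch of the kind of query I’m timing – just a plain near_text semantic search with client-side timing (the query string is a placeholder, not my exact code):

import time

docs = client.collections.get("DocSummaries7_hnsw")

start = time.perf_counter()
response = docs.query.near_text(
    query="what did the memo say about the ballistics report",  # placeholder query
    limit=5,
)
print(f"{len(response.objects)} results in {time.perf_counter() - start:.2f}s")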
Debugging details
Cluster size & region
- Sandbox cluster (I tried US East and US West); it didn’t really make a difference (connection sketch below)
- I upgraded to “Serverless”, but that didn’t seem to improve things either
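In case it matters, I connect to the cluster the standard way (the env var names here are just placeholders):

import os
import weaviate
from weaviate.classes.init import Auth

# connect to the Sandbox / Serverless cluster on Weaviate Cloud
client = weaviate.connect_to_weaviate_cloud(
    cluster_url=os.environ["WEAVIATE_URL"],  # the US East or US West cluster URL
    auth_credentials=Auth.api_key(os.environ["WEAVIATE_API_KEY"]),
    headers={"X-OpenAI-Api-Key": os.environ["OPENAI_API_KEY"]},  # for text2vec-openai
)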
Things I tried
- Switched to “flat” indexing (vector_index_config=Configure.VectorIndex.flat()) – the latency is about the same, though (full creation call sketched just below)
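Concretely, the flat-index variant was created like this – the same call as the HNSW collection shown further down, just with the flat index config switched on (the collection name here is illustrative):

from weaviate.classes.config import Configure, DataType, Property

client.collections.create(
    "DocSummaries7_flat",  # illustrative name for the flat-index copy
    vectorizer_config=Configure.Vectorizer.text2vec_openai(),
    vector_index_config=Configure.VectorIndex.flat(),
    properties=[
        Property(name="title", data_type=DataType.TEXT),
        Property(name="content", data_type=DataType.TEXT),
    ],
)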
Collection Config
Collection found: <weaviate.Collection config={
  "name": "DocSummaries7_hnsw",
  "description": null,
  "generative_config": null,
  "inverted_index_config": {
    "bm25": {
      "b": 0.75,
      "k1": 1.2
    },
    "cleanup_interval_seconds": 60,
    "index_null_state": false,
    "index_property_length": false,
    "index_timestamps": false,
    "stopwords": {
      "preset": "en",
      "additions": null,
      "removals": null
    }
  },
  "multi_tenancy_config": {
    "enabled": false,
    "auto_tenant_creation": false,
    "auto_tenant_activation": false
  },
  "properties": [
    {
      "name": "title",
      "description": null,
      "data_type": "text",
      "index_filterable": true,
      "index_range_filters": false,
      "index_searchable": true,
      "nested_properties": null,
      "tokenization": "word",
      "vectorizer_config": {
        "skip": false,
        "vectorize_property_name": true
      },
      "vectorizer": "text2vec-openai",
      "vectorizer_configs": null
    },
    {
      "name": "content",
      "description": null,
      "data_type": "text",
      "index_filterable": true,
      "index_range_filters": false,
      "index_searchable": true,
      "nested_properties": null,
      "tokenization": "word",
      "vectorizer_config": {
        "skip": false,
        "vectorize_property_name": true
      },
      "vectorizer": "text2vec-openai",
      "vectorizer_configs": null
    }
  ],
  "references": [],
  "replication_config": {
    "factor": 1,
    "async_enabled": false,
    "deletion_strategy": "NoAutomatedResolution"
  },
  "reranker_config": null,
  "sharding_config": {
    "virtual_per_physical": 128,
    "desired_count": 1,
    "actual_count": 1,
    "desired_virtual_count": 128,
    "actual_virtual_count": 128,
    "key": "_id",
    "strategy": "hash",
    "function": "murmur3"
  },
  "vector_index_config": {
    "multi_vector": null,
    "quantizer": null,
    "cleanup_interval_seconds": 300,
    "distance_metric": "cosine",
    "dynamic_ef_min": 100,
    "dynamic_ef_max": 500,
    "dynamic_ef_factor": 8,
    "ef": -1,
    "ef_construction": 128,
    "filter_strategy": "sweeping",
    "flat_search_cutoff": 40000,
    "max_connections": 32,
    "skip": false,
    "vector_cache_max_objects": 1000000000000
  },
  "vector_index_type": "hnsw",
  "vectorizer_config": {
    "vectorizer": "text2vec-openai",
    "model": {
      "baseURL": "https://api.openai.com",
      "isAzure": false,
      "model": "text-embedding-3-small"
    },
    "vectorize_collection_name": true
  },
  "vectorizer": "text2vec-openai",
  "vector_config": null
}>
I tried to keep it as simple as possible:
from weaviate.classes.config import Configure, DataType, Property

self.client.collections.create(
    name,
    # default OpenAI vectorizer (text-embedding-3-small, 1536 dims)
    vectorizer_config=Configure.Vectorizer.text2vec_openai(),
    # vector_index_config=Configure.VectorIndex.flat(),
    properties=[  # properties configuration is optional
        Property(name="title", data_type=DataType.TEXT),
        Property(name="content", data_type=DataType.TEXT),
    ],
)
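The import is just a straightforward batch insert of the summaries (the summaries list below is a placeholder for however the docs are held in memory):

docs = self.client.collections.get(name)

# summaries: list of {"title": ..., "content": ...} dicts built from the JFK text files
with docs.batch.dynamic() as batch:
    for doc in summaries:
        batch.add_object(properties={
            "title": doc["title"],
            "content": doc["content"],  # ~1000-token summary, embedded by text2vec-openai
        })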
Other tags:
slow, semantic search, hybrid search