Channel: Weaviate Community Forum - Latest posts

How to handle error for Batch Import (add_object) when weaviate instance becomes unavailable

Description

I am trying to use the Weaviate v4 Python client to batch import data into my Weaviate instance. This is the code setup:

import weaviate
from tqdm import tqdm

client = weaviate.connect_to_local(WEAVIATE_HOST, WEAVIATE_PORT)

data_jsons = ...  # a list of dicts whose keys/values match the collection schema
collection = client.collections.get('my_collection')
try:
    with collection.batch.dynamic() as batch:
        for a_json in tqdm(data_jsons[:10000]):
            key = create_key(a_json)       # could be a hash of the data
            vector = a_json.pop('vector')  # bring-my-own-vector use case
            batch.add_object(properties=a_json,
                             uuid=key,
                             vector=vector)

    # Checked only after the batch context has exited
    failed_objects = collection.batch.failed_objects
    if len(failed_objects) > 0:
        raise Exception(f"Failed to insert {len(failed_objects)} objects")
except Exception as e:
    print(f"Error: {e}")

When there is an intermittent failure, the import completes and failed_objects is indeed non-empty, so I can raise the error to the caller.

However, if the Weaviate instance is permanently down (I simulate this by pausing it), the code above takes a long time to complete and slowly prints something like:

UserWarning: Bat003: The dynamic batch-size could not be refreshed successfully: error WeaviateTimeoutError('The request to Weaviate timed out while awaiting a response. Try adjusting the timeout config for your client. Details: ')
  warnings.warn(
{'message': 'Failed to send 260 objects in a batch of 260. Please inspect client.batch.failed_objects or collection.batch.failed_objects for the failed objects.'}
{'message': 'Failed to send 260 objects in a batch of 260. Please inspect client.batch.failed_objects or collection.batch.failed_objects for the failed objects.'}
{'message': 'Failed to send 260 objects in a batch of 260. Please inspect client.batch.failed_objects or collection.batch.failed_objects for the failed objects.'}
{'message': 'Failed to send 260 objects in a batch of 260. Please inspect client.batch.failed_objects or collection.batch.failed_objects for the failed objects.'}
{'message': 'Failed to send 260 objects in a batch of 260. Please inspect client.batch.failed_objects or collection.batch.failed_objects for the failed objects.'}
{'message': 'Failed to send 260 objects in a batch of 260. Please inspect client.batch.failed_objects or collection.batch.failed_objects for the failed objects.'}
{'message': 'Failed to send 260 objects in a batch of 260. Please inspect client.batch.failed_objects or collection.batch.failed_objects for the failed objects.'}
{'message': 'Failed to send 110 objects in a batch of 110. Please inspect client.batch.failed_objects or collection.batch.failed_objects for the failed objects.'}
Error: Failed to insert 1930 objects

I think these messages come from Weaviate's logger, and it seems add_object never throws an exception along the way (so the try/except above is effectively useless for this case). What I want is this: once 3 of these messages have been triggered, it should quit and throw an exception. Right now it seems to wait for a timeout, do something, trigger that message, then time out again, which results in this code running for a very long time before it reaches my raise Exception.

Is there a proper way to handle a connection error (e.g. if the Weaviate instance has just died)? My goal is to keep a very large batch import job from getting stuck forever.
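One possible way to get the "quit after N failures" behaviour is to check an error counter inside the loop and raise once it crosses a threshold. The sketch below is a rough illustration, not a confirmed fix. It assumes the batch object exposes a running `number_errors` counter (to my knowledge the v4 Python client provides this inside the batch context, but please verify against your client version). It also assumes the counter tracks failed objects rather than failed batches, so the threshold is expressed in objects. `add_with_error_budget`, `TooManyBatchErrors`, and `make_key` are hypothetical names introduced only for this example.

```python
from typing import Any, Callable, Iterable, Mapping, Optional


class TooManyBatchErrors(RuntimeError):
    """Hypothetical exception: raised when the error budget is used up."""


def add_with_error_budget(
    batch: Any,
    rows: Iterable[Mapping[str, Any]],
    max_errors: int = 3,
    make_key: Optional[Callable[[Mapping[str, Any]], Any]] = None,
) -> int:
    """Queue rows on `batch`, aborting once `batch.number_errors` reaches `max_errors`.

    Assumption: `batch` has an `add_object(properties=..., uuid=..., vector=...)`
    method and an integer `number_errors` attribute that grows as sends fail.
    Returns the number of rows queued.
    """
    queued = 0
    for row in rows:
        # Check the counter before queueing more work for a dead server.
        if batch.number_errors >= max_errors:
            raise TooManyBatchErrors(
                f"Aborting import: {batch.number_errors} errors "
                f"(limit {max_errors}) after queueing {queued} objects"
            )
        props = dict(row)                   # copy so the caller's dict is untouched
        vector = props.pop("vector", None)  # bring-your-own-vector use case
        key = make_key(props) if make_key is not None else None
        batch.add_object(properties=props, uuid=key, vector=vector)
        queued += 1
    return queued


# Intended usage with the setup from the question (requires a live client):
#
#   with collection.batch.dynamic() as batch:
#       add_with_error_budget(batch, data_jsons[:10000],
#                             max_errors=3, make_key=create_key)
```

Because sends appear to happen in the background, the counter can only grow after a request has actually failed or timed out, so the loop may still run for roughly one timeout period before it aborts. Leaving the `with` block may also try to flush whatever is still queued. Lowering the client's insert timeout would likely shorten both waits, though that is a separate setting from what is shown here.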

Server Setup Information

  • Weaviate Server Version: 1.27.0
  • Deployment Method: Docker on macOS
  • Multi Node? Number of Running Nodes: 1 (no multi tenancy, no replication, no cluster)
  • Client Language and Version: En
  • Multitenancy?: No

Any additional Information

I didn't specify any timeout in the client. It's just a plain connect_to_local(WEAVIATE_HOST, WEAVIATE_PORT).

