opened 02:16AM - 12 Jun 24 UTC
closed 04:35PM - 07 Aug 24 UTC
bug
needs more info
### What is the issue?
I am using embeddings with AnythingLLM for RAG. I found that the embedding service always fails once calls run for several minutes. The error log shows the following every time. I am not sure why the context was canceled. Please help.
Below is the debug log:
```
[GIN] 2024/06/11 - 11:09:07 | 200 | 2m28s | 10.100.34.236 | POST "/api/embeddings"
time=2024-06-11T11:09:07.830+08:00 level=DEBUG source=sched.go:304 msg="context for request finished"
time=2024-06-11T11:09:07.830+08:00 level=DEBUG source=sched.go:255 msg="after processing request finished event" modelPath=C:\Users\admin_env\.ollama\models\blobs\sha256-ada9f88e89df0ea53c31fabf8b1e7c8c0c22fa95ab3a3cad4cdd86103ce9f3d3 refCount=119
DEBUG [update_slots] kv cache rm [p0, end) | p0=0 slot_id=0 task_id=14500 tid="17876" timestamp=1718075347
DEBUG [update_slots] slot released | n_cache_tokens=52 n_ctx=2048 n_past=52 n_system_tokens=0 slot_id=0 task_id=14500 tid="17876" timestamp=1718075350 truncated=false
DEBUG [log_server_request] request | method="POST" params={} path="/embedding" remote_addr="127.0.0.1" remote_port=51069 status=200 tid="16400" timestamp=1718075350
DEBUG [process_single_task] slot data | n_idle_slots=1 n_processing_slots=0 task_id=14503 tid="17876" timestamp=1718075350
DEBUG [launch_slot_with_data] slot is processing task | slot_id=0 task_id=14504 tid="17876" timestamp=1718075350
[GIN] 2024/06/11 - 11:09:10 | 200 | 2m30s | 10.100.34.236 | POST "/api/embeddings"
time=2024-06-11T11:09:10.289+08:00 level=DEBUG source=sched.go:304 msg="context for request finished"
time=2024-06-11T11:09:10.290+08:00 level=DEBUG source=sched.go:255 msg="after processing request finished event" modelPath=C:\Users\admin_env\.ollama\models\blobs\sha256-ada9f88e89df0ea53c31fabf8b1e7c8c0c22fa95ab3a3cad4cdd86103ce9f3d3 refCount=118
DEBUG [update_slots] kv cache rm [p0, end) | p0=0 slot_id=0 task_id=14504 tid="17876" timestamp=1718075350
time=2024-06-11T11:09:11.910+08:00 level=ERROR source=server.go:836 msg="Failed to acquire semaphore" error="context canceled"
time=2024-06-11T11:09:11.910+08:00 level=DEBUG source=sched.go:304 msg="context for request finished"
time=2024-06-11T11:09:11.911+08:00 level=INFO source=routes.go:401 msg="embedding generation failed: context canceled"
time=2024-06-11T11:09:11.911+08:00 level=DEBUG source=sched.go:255 msg="after processing request finished event" modelPath=C:\Users\admin_env\.ollama\models\blobs\sha256-ada9f88e89df0ea53c31fabf8b1e7c8c0c22fa95ab3a3cad4cdd86103ce9f3d3 refCount=117
[GIN] 2024/06/11 - 11:09:11 | 500 | 2m32s | 10.100.34.236 | POST "/api/embeddings"
time=2024-06-11T11:09:11.911+08:00 level=ERROR source=server.go:836 msg="Failed to acquire semaphore" error="context canceled"
time=2024-06-11T11:09:11.911+08:00 level=DEBUG source=sched.go:304 msg="context for request finished"
time=2024-06-11T11:09:11.911+08:00 level=INFO source=routes.go:401 msg="embedding generation failed: context canceled"
time=2024-06-11T11:09:11.911+08:00 level=ERROR source=server.go:836 msg="Failed to acquire semaphore" error="context canceled"
[GIN] 2024/06/11 - 11:09:11 | 500 | 2m32s | 10.100.34.236 | POST "/api/embeddings"
time=2024-06-11T11:09:11.911+08:00 level=DEBUG source=sched.go:304 msg="context for request finished"
time=2024-06-11T11:09:11.911+08:00 level=DEBUG source=sched.go:255 msg="after processing request finished event" modelPath=C:\Users\admin_env\.ollama\models\blobs\sha256-ada9f88e89df0ea53c31fabf8b1e7c8c0c22fa95ab3a3cad4cdd86103ce9f3d3 refCount=116
time=2024-06-11T11:09:11.911+08:00 level=DEBUG source=sched.go:255 msg="after processing request finished event" modelPath=C:\Users\admin_env\.ollama\models\blobs\sha256-ada9f88e89df0ea53c31fabf8b1e7c8c0c22fa95ab3a3cad4cdd86103ce9f3d3 refCount=115
time=2024-06-11T11:09:11.911+08:00 level=INFO source=routes.go:401 msg="embedding generation failed: context canceled"
```
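For triage, a log like the one above can be summarized with a short script. This is a minimal sketch, not part of Ollama: the `summarize` function and its result keys are hypothetical names, and the regular expressions assume the GIN access-line layout and the `refCount=` and `Failed to acquire semaphore` messages exactly as they appear in this excerpt.

```python
import re
from collections import Counter

# Assumed layout of a GIN access line, taken from the excerpt above:
# [GIN] 2024/06/11 - 11:09:11 | 500 | 2m32s | 10.100.34.236 | POST "/api/embeddings"
GIN_LINE = re.compile(
    r'^\[GIN\]\s+\S+\s+-\s+\S+\s*\|\s*(?P<status>\d{3})\s*\|\s*(?P<latency>\S+)'
    r'\s*\|\s*(?P<client>\S+)\s*\|\s*(?P<method>\w+)\s+"(?P<path>[^"]+)"'
)
REF_COUNT = re.compile(r"refCount=(\d+)")


def summarize(log_text):
    """Tally HTTP statuses, semaphore cancellations, and the highest refCount."""
    status_counts = Counter()
    ref_counts = []
    semaphore_cancellations = 0

    for raw_line in log_text.splitlines():
        line = raw_line.strip()

        # Count responses per HTTP status code for the embeddings endpoint.
        access = GIN_LINE.match(line)
        if access and access.group("path") == "/api/embeddings":
            status_counts[int(access.group("status"))] += 1

        # Track how many requests were still referencing the model.
        ref = REF_COUNT.search(line)
        if ref:
            ref_counts.append(int(ref.group(1)))

        # Count requests cancelled while waiting for a slot.
        if "Failed to acquire semaphore" in line and "context canceled" in line:
            semaphore_cancellations += 1

    return {
        "status_counts": dict(status_counts),
        "semaphore_cancellations": semaphore_cancellations,
        "max_ref_count": max(ref_counts) if ref_counts else None,
    }
```

Passing the contents of the server log to `summarize` gives a quick view of how many `/api/embeddings` calls returned 200 versus 500, how often a request was cancelled while waiting for the semaphore, and how many requests were outstanding at the peak.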
### OS
Windows
### GPU
Nvidia
### CPU
Intel
### Ollama version
0.1.41