I have built a multimodal similarity search that uses a CLIP model for both text-based and image-based search. CLIP models are old at this point, so I would like to try out different models and compare the results. Is there any example code showing how to integrate PaliGemma, or any other multimodal model, in place of CLIP?
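For context, the kind of setup described above can be sketched roughly as follows. This is only a minimal illustration under stated assumptions, not the actual implementation: `SimilarityIndex`, `toy_embed`, and `VOCAB` are hypothetical names, and the toy embedder is a stand-in for real model calls. The model enters only through the two embedding functions, so a replacement for CLIP would need to supply text and image embeddings that live in the same vector space.

```python
import math
from typing import Callable, List, Sequence, Tuple

Vector = Sequence[float]
# Hypothetical interface: an embedder maps a text query or an image
# reference (for example a file path) to a fixed-length vector.
Embedder = Callable[[str], Vector]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity between two vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class SimilarityIndex:
    """Brute-force index over image embeddings with pluggable embedders."""

    def __init__(self, embed_text: Embedder, embed_image: Embedder) -> None:
        self.embed_text = embed_text
        self.embed_image = embed_image
        self._items: List[Tuple[str, Vector]] = []

    def add_image(self, image_id: str, image_ref: str) -> None:
        # Embed once at indexing time and keep the vector.
        self._items.append((image_id, self.embed_image(image_ref)))

    def _rank(self, query: Vector, k: int) -> List[Tuple[str, float]]:
        scored = [(image_id, cosine(query, vec)) for image_id, vec in self._items]
        scored.sort(key=lambda pair: pair[1], reverse=True)
        return scored[:k]

    def search_by_text(self, text: str, k: int = 5) -> List[Tuple[str, float]]:
        return self._rank(self.embed_text(text), k)

    def search_by_image(self, image_ref: str, k: int = 5) -> List[Tuple[str, float]]:
        return self._rank(self.embed_image(image_ref), k)


# Toy stand-in for a real model (assumption, for illustration only):
# counts how often each vocabulary word occurs in the input string.
VOCAB = ["cat", "dog", "car"]


def toy_embed(text_or_ref: str) -> Vector:
    lowered = text_or_ref.lower()
    return [float(lowered.count(word)) for word in VOCAB]


if __name__ == "__main__":
    index = SimilarityIndex(embed_text=toy_embed, embed_image=toy_embed)
    for name in ["cat_on_sofa.jpg", "dog_in_park.jpg", "red_car.jpg"]:
        index.add_image(name, name)
    print(index.search_by_text("a photo of a dog", k=2))
```

With this structure, comparing models would mean constructing one `SimilarityIndex` per model, each with its own pair of embedding functions, and running the same queries against every index.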