Hey both, thanks so much for sharing this notebook @DudaNogueira!
Hey @JK_Rider, could you please point me to a more specific passage where this is mentioned? As far as I understand, ColBERT / v2 / PLAID variants all pad queries (with [MASK] tokens) to a fixed input length, and pad or truncate documents to a maximum length.
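To make the fixed-length point concrete, here's a minimal sketch of ColBERT-style query padding in plain Python. The token ids and the 32-token query length follow the original ColBERT setup; the helper name and example ids are hypothetical, not from any library.

```python
MAX_QUERY_LEN = 32   # ColBERT fixes the query length (32 in the original paper)
MASK_ID = 103        # BERT's [MASK] token id; ColBERT pads queries with [MASK]s

def pad_query(token_ids):
    """Truncate or pad a query's token ids so every query has the same length."""
    ids = token_ids[:MAX_QUERY_LEN]
    return ids + [MASK_ID] * (MAX_QUERY_LEN - len(ids))

# Hypothetical ids for a short tokenized query:
padded = pad_query([101, 2054, 2003, 102])
print(len(padded))  # always 32, regardless of the raw query length
```

The upshot is that every query produces exactly the same number of token vectors, which is what makes the MaxSim scoring shape predictable.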
→ I think the key innovation in subsequent ColBERT work is compressing the vectors along the embedding dimension, with forms of low-rank decomposition or maybe PCA – I think the discrete, PQ-style quantization methods won out in PLAID.
It does make sense that variable-length decoding ideas could make their way into embedding models, but I haven't seen many examples of this outside of maybe Cohere Compass.