

instructor

INSTRUCTOR is a state-of-the-art (SOTA) text embedding model: it generates task- and domain-aware embeddings using simple instructions, eliminating the need for further finetuning.

INSTRUCTOR is an instruction-finetuned text embedding model designed to work across diverse NLP tasks (e.g., classification, retrieval). It was trained on a multitask mixture of 330 annotated tasks with a contrastive loss. As a result, it generates specialized, fixed-length embeddings on the fly: users simply provide an instruction (e.g., 'Represent the Science title:') alongside the input text. Evaluated on 70 diverse embedding datasets, it achieved SOTA results, with an average improvement of 3.4% over the previous best models, often with an order of magnitude fewer parameters. This makes INSTRUCTOR a general, efficient option for embedding generation.
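To make the instruction-plus-text idea concrete, here is a minimal sketch in Python of how such embeddings might be used for retrieval. It is an illustration under stated assumptions, not the model itself. The encoder `toy_encode` is a hypothetical hash-based stand-in, and the helper names `cosine` and `rank_documents` are invented for this example. The only parts taken from the description above are the input shape (an instruction paired with a text) and the fixed-length output.

```python
import hashlib
import math


def toy_encode(pairs, dim=16):
    """Hypothetical stand-in encoder, NOT the INSTRUCTOR model.

    Hashes every token of instruction + text into one of `dim` buckets
    and L2-normalises the result, so each pair maps to a fixed-length
    vector that depends on both the instruction and the text.
    """
    vectors = []
    for instruction, text in pairs:
        vec = [0.0] * dim
        for token in f"{instruction} {text}".lower().split():
            digest = hashlib.md5(token.encode("utf-8")).digest()
            vec[digest[0] % dim] += 1.0
        norm = math.sqrt(sum(v * v for v in vec)) or 1.0
        vectors.append([v / norm for v in vec])
    return vectors


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_documents(encode, query_pair, doc_pairs):
    """Return document indices sorted by similarity to the query.

    `encode` is any callable that maps a list of [instruction, text]
    pairs to a list of vectors. Assumption: a real embedding model
    could be passed here in place of `toy_encode`.
    """
    query_vec = encode([query_pair])[0]
    doc_vecs = encode(doc_pairs)
    scores = [cosine(query_vec, d) for d in doc_vecs]
    return sorted(range(len(doc_pairs)), key=lambda i: scores[i], reverse=True)


# Each input is an [instruction, text] pair; the instruction states the
# task and domain. The texts below are made up for illustration.
query = ["Represent the Science title:", "Contrastive training for text embeddings"]
docs = [
    ["Represent the Science title:", "Instruction-finetuned text embeddings"],
    ["Represent the Science title:", "Soil chemistry in alpine meadows"],
]
print(rank_documents(toy_encode, query, docs))
```

Because `rank_documents` takes the encoder as an argument, the same ranking logic would apply unchanged to any encoder that accepts instruction and text pairs and returns fixed-length vectors. Changing only the instruction string changes the resulting embedding, which is the behavior the description attributes to INSTRUCTOR.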

https://instructor-embedding.github.io/
