instructor
INSTRUCTOR is a state-of-the-art (SOTA) text embedding model that generates task- and domain-aware embeddings from simple natural-language instructions, with no further finetuning needed.
INSTRUCTOR is instruction-finetuned for versatility across diverse NLP tasks such as classification and retrieval. It was trained with a contrastive loss on a multitask mixture of 330 annotated tasks. As a result, it produces specialized, fixed-length embeddings on the fly: users simply provide an instruction (e.g., 'Represent the Science title:') alongside the input text. Evaluated on 70 diverse embedding datasets, it achieved state-of-the-art results, improving on the previous best models by 3.4% on average, often with an order of magnitude fewer parameters. This combination of accuracy and small size makes INSTRUCTOR a general-purpose choice for embedding generation.
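The instruction-plus-text usage described above can be sketched in Python. This is a minimal illustration, not an official recipe: it assumes the `InstructorEmbedding` package is installed and that the `hkunlp/instructor-large` checkpoint is the one you want. The helper names `build_pairs`, `cosine_similarity`, `rank_by_similarity`, and `embed` are hypothetical and exist only for this example.

```python
import math
from typing import List, Sequence


def build_pairs(instruction: str, texts: Sequence[str]) -> List[List[str]]:
    """Pair every text with the same instruction: [[instruction, text], ...]."""
    return [[instruction, text] for text in texts]


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_by_similarity(query_vec: Sequence[float],
                       doc_vecs: Sequence[Sequence[float]]) -> List[int]:
    """Return document indices ordered from most to least similar to the query."""
    return sorted(
        range(len(doc_vecs)),
        key=lambda i: cosine_similarity(query_vec, doc_vecs[i]),
        reverse=True,
    )


def embed(pairs: List[List[str]], model_name: str = "hkunlp/instructor-large"):
    """Encode [instruction, text] pairs into fixed-length embeddings.

    Assumption: the InstructorEmbedding package is installed. The import is
    done lazily so the helpers above work without it. The first call
    downloads the model weights.
    """
    from InstructorEmbedding import INSTRUCTOR

    model = INSTRUCTOR(model_name)
    return model.encode(pairs)
```

With the package installed, `embed(build_pairs("Represent the Science title:", titles))` should return one vector per title, and `rank_by_similarity` can then order candidate vectors against a query vector. Changing only the instruction string is how the same model is steered toward a different task or domain.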