Copilot Evaluations
Copilot Evaluations, a core feature in Microsoft Copilot Studio, uses automated test sets and LLM-driven criteria to measure the accuracy and performance of custom agents.
Copilot Evaluations provide a robust framework for assessing agent performance within Microsoft Copilot Studio. The technology employs automated test sets to simulate real-world scenarios, measuring response accuracy, relevancy, and quality against defined standards. Key evaluation methods include *Exact Match* for precise data (e.g., specific codes or numbers), *Similarity* (using cosine metrics, scored from 0 to 1) for assessing alignment of meaning and intent, and *Quality* tests that leverage large language models (LLMs) to judge relevance and completeness. This structured approach helps developers efficiently optimize agent behavior, validate performance, and ensure solutions meet strict business and user expectations.