Technology

SWE-bench-sonnet

Claude Sonnet: Anthropic's high-performance AI model, optimized for agentic software engineering tasks and coding workflows.

SWE-bench-sonnet refers to the performance of the Claude Sonnet model line (e.g., Sonnet 4.5) on the SWE-bench Verified benchmark. The benchmark measures whether a model can resolve real GitHub issues from open-source projects: the model receives a repository and an issue description, and its patch counts as a success only if the project's tests pass. Sonnet 4.5 scored 77.2% on SWE-bench Verified, reported as a record at the time of its release. The result reflects the model's ability to sustain multi-step reasoning and execute code over long-horizon tasks, which makes it a strong foundation for developer-focused AI agents.

https://docs.anthropic.com/claude/docs

1 project · 1 city

Related technologies

Anthropic (34) · AWS (29) · Sandboxing (2) · SWE-agent benchmark (1)

Recent Talks & Demos

Self-modifying code

Seattle Feb 21

AWS Anthropic