Anthropic has officially launched its Claude Sonnet 4.5 artificial intelligence (AI) model, calling it the “best coding model in the world.” The company says the model improves on its predecessors in coding, agentic operations, computer use, reasoning, and specialized domain knowledge. The new model is available on Anthropic’s Claude website, its mobile applications, the Claude Code platform, its application programming interface (API), and the Claude for Chrome browser extension. Anthropic also says the model can operate autonomously on a single task for up to 30 hours.
Claude Sonnet 4.5: Key Features and Capabilities
In a comprehensive announcement, Anthropic detailed the core advancements of its latest AI model. Claude Sonnet 4.5 is engineered to deliver a substantial leap in both coding and agentic performance, though other aspects of the model have also received notable enhancements. Despite these significant upgrades, the company notes that the new model offers incremental improvements rather than introducing entirely new core capabilities or modalities.
According to Anthropic’s internal evaluations, the large language model (LLM) scored 77.2 percent on the SWE-bench Verified benchmark, which measures a model’s agentic coding abilities. That score is higher than those recorded by competitors such as OpenAI’s GPT-5 and Google’s Gemini 2.5 Pro, as well as Anthropic’s own previous flagship, Opus 4.1. The article also included a visual aid demonstrating Claude Sonnet 4.5.
Gadgets 360 staff members tested Claude Sonnet 4.5’s coding abilities in a brief hands-on session. The model was asked to develop a WhatsApp-style messaging chatbot with individual and group chats, along with audio and video call support. Within two minutes, it generated 436 lines of React code and provided a functional preview of the user interface (excluding server connectivity).
Beyond SWE-bench Verified, Claude Sonnet 4.5 also led on several other benchmarks. These include Terminal-Bench for agentic terminal coding, OSWorld for computer use, AIME 2025 for advanced high-school mathematics, and Finance Agent for financial analysis. It did not lead everywhere, however: Google’s Gemini 2.5 Pro showed a slight edge on the reasoning-based GPQA Diamond, and OpenAI’s GPT-5 scored higher on the MMMU benchmark for visual reasoning and the MMMLU benchmark for multilingual performance.
Anthropic further asserted that Claude Sonnet 4.5 outperforms all its predecessors in terms of domain-specific knowledge and sophisticated reasoning across diverse fields such as finance, law, medicine, and STEM disciplines.
Regarding safety, Anthropic states that Claude Sonnet 4.5 is its “most aligned frontier model.” The company says it has implemented measures to reduce undesirable behaviors such as sycophancy, deception, power-seeking tendencies, and the encouragement of delusional thinking. Safeguards have also been added to protect the model against prompt injection attacks. An accompanying video also detailed aspects of the model.