2025-03-03 04:00 PM UTC+9:00

Claude 3.7 Sonet released: ChatGPT is now goodbye!

vvd.im/claude-37-sonet
List
https://vvd.im/claude-37-sonet
Anthropic has released Claude Sonnet 3.7, the latest and most advanced AI model to date.

This new version introduces groundbreaking features that improve inference capabilities, coding proficiency, and user interaction.
With hybrid inference, enhanced software development support, and command-line tools for coding agents, Claude Sonet 3.7 will redefine AI-assisted workflows.
Claude 3.7 Sonet released: ChatGPT is now goodbye!

Claude 3.7 Sonet was released on February 25, 2025.

Claude 3.7 Sonnet is now available to all customers on a paid Copilot plan. This new Sonnet model supports both thinking and non-thinking modes of Copilot. Initial testing has shown particularly strong improvements in agent scenarios.
In our internal evaluation on GitHub, the model showed improvements over previous models in its ability to follow instructions, break down complex tasks, and build new human reviews (UIs).

We spent months using Sonnet 3.5 and 3.6 to improve the code on several Java, JavaScript-based projects, and 3.7 immediately delivered better-looking, more modern, and improved code.

Previously, it provided small snippets of code with shorter responses and we were constantly reminded to provide full code, longer responses, no missing responses, etc. 3.7's responses walked us through folder structure, dependency installation, initial project setup, and how to create libraries, and then it works with each of our JSX pages, each with thousands of lines of code, and it works without bugs or reference or library issues.

If you're not a developer, never fear, it now takes less than 3 minutes to create a beautiful website with proper CSS, animations, colors, and a modern UI.

This article has been rewritten from an article originally published on the Anthropic website.

To read the previously written article on “ChatGPT vs Claude,” click here.

 

Claude Code

Claude 3.7 Sonnet is a big step forward, especially for coding and front-end web development. Along with this model, we're also introducing Claude Code, a command-line tool for coding agents. Claude Code is available as a limited research preview and allows developers to delegate significant engineering tasks to Claude directly from the terminal.

Claude 3.7 Sonnet is now available on all Claude plans - Free, Pro, Team, and Enterprise - and on the Anthropic API, Amazon Bedrock, and Vertex AI on Google Cloud. Extended thinking modes are available on all plans except the free Claude tier.

For both standard and extended thinking modes, Claude 3.7 Sonnet is priced the same as previous versions: $3 per 1 million input tokens and $15 per 1 million output tokens (including incident tokens).

Claude 3.7 Sonnet: Frontier reasoning made practical

Claude developed Claude 3.7 Sonnet with a different philosophy than other inference models on the market: just as humans use one brain for quick reactions and another for deep contemplation, we believe that inference should be an integrated feature of the Frontier model, rather than a completely separate model. This integrated approach provides a more seamless experience for users.

Claude 3.7 Sonnet implements this philosophy in several ways.

  • First, Claude 3.7 Sonnet has the functionality of both a regular LLM and an inference model. You can choose when you want the model to answer normally and when you want it to think longer before answering. In standard mode, Claude 3.7 Sonnet is an upgraded version of Claude 3.5 Sonnet. In extended thinking mode, you can do better in math, physics, following directions, coding, and many other tasks because you reflect before answering. In general, prompts for models work similarly in both modes.
     
  • Second, when using Claude 3.7 Sonnet via the API, users can control the budget for their thinking: They can tell Claude not to exceed N tokens, where N can be any value up to the output limit of 128,000 tokens. This allows you to trade off speed and cost for quality of answers.
     
  • Third, in developing the inference model, we didn't optimize as much for math and computer science competition questions, instead focusing on real-world tasks that better reflect how companies actually use LLMs.

    Initial testing showed that Claude's coding skills were generally good. Cursor confirmed that Claude was once again at the top of his game on real-world coding tasks, showing significant improvement in areas ranging from handling complex codebases to using advanced tools. Cognition found it to be far superior to any other model in terms of code change planning and handling full-stack updates; Vercel highlighted Claude's superior accuracy for complex agent workflows; and Replit successfully deployed Claude where other models stalled, building sophisticated web apps and dashboards from scratch. In Canva's evaluation, Claude consistently produced production-ready code with great design flair and dramatically reduced errors.

SWE-bench Verified

Claude 3.7 Sonnet achieved state-of-the-art performance in SWE-bench Verified, which evaluates the ability of AI models to solve real-world software problems.

TAU-bench

Claude 3.7 Sonnet achieves state-of-the-art performance on TAU-bench, a framework for testing AI agents on complex real-world tasks with user and tool interaction.

Claude 3.7 Sonnet

Claude 3.7 Sonnet excels in following instructions, general reasoning, multi-modal capabilities, and agent coding, while extended thinking delivers notable gains in math and science. Beyond traditional benchmarks, it outperformed all previous models in playtesting Pokémon games.

The Claude Code

Since June 2024, Sonnet has become the preferred model for developers around the world. To empower developers even more, we released Claude Code, our first agent coding tool, as a limited research preview.
Claude Code is an active collaboration tool that allows you to search and read code, edit files, write and run tests, commit and push code to GitHub, and use command line tools.

Claude Code is an early product, but it will become indispensable, especially for test-driven development, debugging complex issues, and large-scale refactoring.

In early testing, Claude Code has reduced development time and overhead by completing tasks in one pass that would normally take 45 minutes or more of manual labor.
In the coming weeks, they'll continue to make improvements based on usage (improving the reliability of tool calls, adding support for long-running commands, improving in-app rendering, and expanding their own understanding of Claude's capabilities).

Claude's goal with Claude Code is to better understand how developers use Claude for coding, which will inform future improvements to the model.

Availability and pricing

For developers looking to build custom AI solutions using Claude 3.7 Sonnet, it is available on the Anthropic API, Amazon Bedrock, and Vertex AI on Google Cloud.

For business users and consumers who want to collaborate with Claude 3.7 Sonnet through a simple chat experience, Claude 3.7 Sonnet is available on Claude.ai for all users on web, iOS, and Android.

Pricing for Claude 3.7 Sonnet starts at $3 per 1 million input tokens and $15 per 1 million output tokens, with savings of up to 90% with instant caching and 50% with batch processing. See our pricing page for more details.

Working on your codebase with Claude

We've also improved the coding experience on Claude.ai: GitHub integration is now available on all Claude plans. Developers can connect their code repositories directly to Claude.

Claude 3.7 Sonnet is the best coding model ever developed. As it deepens its understanding of personal, professional, and open source projects, it will become an even stronger partner in bug fixing, feature development, and documentation across your most important GitHub projects.

Responsible development

Claude has worked with external experts to conduct extensive testing and evaluation of Claude 3.7 Sonnet to ensure that it meets security, safety, and reliability standards. Claude 3.7 Sonnet also makes a finer distinction between harmful and harmless requests, resulting in 45% fewer unnecessary rejections than previous versions.

The system card covers new safety results in several categories and provides a detailed analysis of the responsible scaling policy evaluation that other AI labs and researchers can apply to their work. The card also addresses new risks associated with computer use, particularly prompt injection attacks, and describes how we assessed these vulnerabilities and trained Claude to resist and mitigate them. We also investigate the potential safety benefits of inference models, namely the ability to understand how a model makes decisions and whether its reasoning is truly trustworthy and reliable.

Looking to the future

Claude 3.7 Sonet and Claude take an important step toward AI systems that can truly augment human capabilities. With their ability to think deeply, work autonomously, and collaborate effectively, they bring us closer to a future where AI enriches and extends what humans can accomplish.

The future of Claude AI

I'm excited to explore the new capabilities and see what we can create with them. Claude is always looking for feedbackfrom users to continue to improve and evolve the model.

Thank you.

List

By Tags:

JaeDeok Park
Quality Manager
JaeDeok Park is a quality manager at Vivoldi, working to solve users′ problems and strive for great service.
In his spare time, he likes to read books and enjoy shopping, albeit occasionally.