Model: "claude-4-sonnet"
not much happened today
claude-4 claude-4-opus claude-4-sonnet gemini-2.5-pro gemma-3n imagen-4-ultra anthropic google-deepmind openai codebase-understanding coding agentic-performance multimodality text-to-speech video-generation model-integration benchmarking memory-optimization cline amanrsanger ryanpgreenblatt johnschulman2 alexalbert__ nearcyan mickeyxfriedman jeremyphoward gneubig teortaxesTex scaling01 artificialanlys philschmid
Anthropic's Claude 4 models (Opus 4 and Sonnet 4) demonstrate strong coding abilities, with Sonnet 4 scoring 72.7% on SWE-bench and Opus 4 scoring 72.5%. Claude Sonnet 4 excels at codebase understanding and is considered SOTA on large codebases. Anthropic drew criticism over its handling of ASL-3 security requirements. Demand for Claude 4 is high, with integration into IDEs and support from Cherry Studio and FastHTML. Google DeepMind introduced Gemini 2.5 Pro Deep Think and Gemma 3n, a mobile multimodal model that reduces RAM usage by nearly 3x. Google's Imagen 4 Ultra, available on Vertex AI Studio, ranks third in the Artificial Analysis Image Arena. Google also promoted Google Beam, an AI video model for immersive 3D experiences, and new text-to-speech models with multi-speaker support. On the GAIA benchmark, Claude Opus 4 and Sonnet 4 lead in agentic performance.
Anthropic releases Claude 4 Sonnet and Opus: Memory, Agent Capabilities, Claude Code, Redteam Drama
claude-4 claude-4-opus claude-4-sonnet claude-3.5-sonnet anthropic instruction-following token-accounting pricing-models sliding-window-attention inference-techniques open-sourcing model-accessibility agent-capabilities-api extended-context model-deployment
Anthropic has officially released Claude 4 in two variants: Claude Opus 4, a high-capability model for complex tasks priced at $15/$75 per million input/output tokens, and Claude Sonnet 4, optimized for efficient everyday use. The release emphasizes instruction following and extended work sessions of up to 7 hours. Community discussions raise concerns about token pricing and the transparency of token accounting, along with calls to open-source the Claude 3.5 Sonnet weights to support local model development. The news also covers the general availability of Claude Code, a new Agent Capabilities API, and various livestreams and reports detailing these updates. There is also notable debate around sliding window attention and advanced inference techniques for local deployment.