Topic: "knowledge-work"
Anthropic Claude Fable 5
claude-fable-5 claude-mythos-5 claude-opus-4.8 gpt-5.5 anthropic cursor_ai cognition benchmarking software-engineering knowledge-work scientific-research vision context-windows model-pricing sdk rate-limiting mikeyk scaling01
Anthropic released two major models: Claude Fable 5 for general availability and Claude Mythos 5 for restricted access, with fallback to Claude Opus 4.8 for sensitive queries. Fable 5 features a 1M-token context window and is priced at $10 per million input tokens and $50 per million output tokens. It leads benchmarks in software engineering, knowledge work, scientific research, and vision, outperforming GPT-5.5 and setting new state-of-the-art scores on CursorBench, FrontierCode, Terminal-Bench 2.1, and the Artificial Analysis Intelligence Index. The rollout covers the Pro, Max, Team, and Enterprise plans, with temporary usage credits offered because of capacity constraints. Middleware SDK support is available in Python, TypeScript, Go, Java, and C#.
Claude Sonnet 4.6: clean upgrade of 4.5, mostly better with some caveats
claude-3-sonnet-4.6 claude-3-sonnet-4.5 claude-3-opus-4.5 claude-3-opus-4.6 anthropic cursor microsoft perplexity-ai cognition long-context agent-planning knowledge-work benchmarking tokenization model-integration code-execution model-updates aesthetic-quality alexalbert__ scaling01 rishdotblog claudeai kimmonismus artificialanlys
Anthropic launched Claude Sonnet 4.6, an upgrade over Sonnet 4.5, featuring broad improvements in coding, long-context reasoning, agent planning, knowledge work, and design, plus a 1M-token context window (beta). Benchmarks show Sonnet 4.6 leading GDPval-AA with an Elo score of 1633, alongside significantly higher token usage and improved output aesthetics. Integrations include Cursor, Windsurf, Microsoft Foundry, and Perplexity Pro/Max. Early user feedback noted some regressions that were later fixed. Pricing remains the same as Sonnet 4.5. Tooling enhancements include code execution for filtering results, which improves accuracy and efficiency.
GPT-5.2 (Instant/Thinking/Pro): 74% on GDPval, 1.4x cost of GPT-5.1, on OpenAI's 10-Year Anniversary
gpt-5.2 openai scientific-reasoning knowledge-work long-context benchmarking performance-optimization pricing software-engineering vision sama yanndubs polynoamial scaling01
OpenAI celebrates its 10-year anniversary with the launch of GPT-5.2, which brings significant across-the-board improvements alongside a rare 40% price increase. GPT-5.2 shows strong performance gains in scientific reasoning, knowledge work, and economic value tasks, reaching human-expert parity on over 70.9% of GDPval tasks and scoring 90.5% on ARC-AGI-1 with a large efficiency gain. Despite mixed results on coding benchmarks and vision capabilities, GPT-5.2 has been well received as a major update with extended context and tiered reasoning controls. Pricing is set at $1.75/M input tokens and $14/M output tokens, with a 90% cache discount. The update is live in ChatGPT and the API, marking a significant milestone for OpenAI's LLM development.
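To make the GPT-5.2 pricing figures above concrete, here is a minimal sketch of a per-request cost estimate. The function name and parameters are hypothetical, and it assumes the 90% cache discount applies only to cached input tokens, which the summary does not spell out.

```python
def request_cost_usd(
    input_tokens: int,
    output_tokens: int,
    cached_input_tokens: int = 0,
    input_per_m: float = 1.75,     # $ per million input tokens (from the summary)
    output_per_m: float = 14.00,   # $ per million output tokens (from the summary)
    cache_discount: float = 0.90,  # assumed to apply to cached input tokens only
) -> float:
    """Estimate the dollar cost of one request under the quoted rates."""
    if not 0 <= cached_input_tokens <= input_tokens:
        raise ValueError("cached_input_tokens must be between 0 and input_tokens")

    uncached_tokens = input_tokens - cached_input_tokens
    cached_rate = input_per_m * (1.0 - cache_discount)

    total = (
        uncached_tokens * input_per_m
        + cached_input_tokens * cached_rate
        + output_tokens * output_per_m
    )
    return total / 1_000_000


if __name__ == "__main__":
    # One million uncached input tokens plus one million output tokens.
    print(round(request_cost_usd(1_000_000, 1_000_000), 2))  # 15.75
```

Under these assumptions, a request whose input is served entirely from cache pays one tenth of the normal input rate, so output tokens dominate the bill for cache-heavy workloads.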