Model: "sonnet-4"
not much happened today
gpt-5 gpt-oss-120b opus-4.1 sonnet-4 openai anthropic minimax context-windows model-routing model-hosting multi-tool-pipelines prompt-caching model-extraction model-pairing cost-efficiency model-optimization sama jeremyphoward jxmnop _catwu
OpenAI continues to roll out small updates to GPT-5, introducing "Auto/Fast/Thinking" mode selection, with Thinking mode getting a 196k-token context window and a 3,000-messages/week cap, plus dynamic routing to cheaper models for cost efficiency. The MiniMax AI Agent Challenge offers $150,000 in prizes for AI agent development, with submissions due August 25. The community discusses extracting a base model from GPT-OSS-120B, hosting options, and tooling improvements, including multi-tool pipelines and flex-attention. Anthropic announces model pairing in Claude Code, using Opus 4.1 for planning and Sonnet 4 for execution, alongside a 1M-token context window for Sonnet 4 and prompt caching. Key voices include @sama, @jeremyphoward, @jxmnop, and @_catwu.
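As a rough illustration of the pairing pattern, the sketch below uses the Anthropic Python SDK to plan with Opus 4.1 and execute with Sonnet 4, marking a shared system prompt with an ephemeral `cache_control` block so identical prefixes are served from the prompt cache on repeat calls. This is a minimal sketch of the pattern, not Claude Code's actual implementation; the model IDs and prompts are assumptions to check against the current docs.

```python
# Sketch: Opus 4.1 plans, Sonnet 4 executes, with a cacheable system prompt.
# Model IDs are assumptions -- verify against the current Anthropic docs.
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

SYSTEM = [
    {
        "type": "text",
        "text": "You are a careful coding agent. Be concise.",
        # Mark the prefix as cacheable; later requests with the same prefix
        # are served from the prompt cache at reduced cost. (In practice the
        # cached prefix must exceed a minimum token count to be stored.)
        "cache_control": {"type": "ephemeral"},
    }
]

def plan_then_execute(task: str) -> str:
    # Planning pass on the stronger, pricier model.
    plan = client.messages.create(
        model="claude-opus-4-1-20250805",   # assumed Opus 4.1 ID
        max_tokens=512,
        system=SYSTEM,
        messages=[{"role": "user", "content": f"Write a numbered plan for: {task}"}],
    ).content[0].text

    # Execution pass on the cheaper, faster model.
    result = client.messages.create(
        model="claude-sonnet-4-20250514",   # assumed Sonnet 4 ID
        max_tokens=2048,
        system=SYSTEM,
        messages=[{"role": "user", "content": f"Task: {task}\n\nFollow this plan:\n{plan}"}],
    ).content[0].text
    return result

if __name__ == "__main__":
    print(plan_then_execute("Add retry logic to the HTTP client module"))
```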
Kimi K2 - SOTA Open MoE proves that Muon can scale to 15T tokens/1T params
kimi-k2 kimi-k2-1t deepseek-v3 grok-4 devstral-2507 gpt-4.1 sonnet-4 moonshot-ai alibaba tencent deepseek x-ai mistral-ai weights-biases hugging-face mixture-of-experts model-training model-optimization optimizer benchmarking long-context model-performance open-weights model-release yuchenj_uw andrew_n_carr scaling01 novita_labs teknium1 aravsrinivas mparakhin simonw
Moonshot AI has released Kimi K2, a 1-trillion-parameter Mixture-of-Experts model (32B active parameters) trained on 15.5 trillion tokens with the new MuonClip optimizer, achieving state-of-the-art results on benchmarks such as SWE-Bench Verified (65.8%) and TAU2 (58.4%). The model is competitive with GPT-4.1 and Sonnet 4 on non-thinking tasks and is released under a modified MIT license. Meanwhile, xAI announced Grok-4, billed as the "LEAST censored frontier model" and noted for strong long-context performance, though criticized for rushed post-training. Mistral AI updated its Devstral 2507 models with improved performance and cost efficiency. The community is particularly excited about MuonClip, which may finally challenge AdamW's long run as the default optimizer for large-scale training.
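For readers wondering what is different about Muon: instead of AdamW's per-coordinate second-moment scaling, Muon orthogonalizes the momentum update of each 2-D weight matrix, approximated with a few Newton-Schulz iterations; MuonClip adds a QK-clip step that rescales query/key projections to keep attention logits bounded over long runs. Below is a minimal single-tensor PyTorch sketch of the core Muon update, with coefficients following the public Muon reference implementation; the QK-clip step and distributed details are omitted.

```python
import torch

def newton_schulz(G: torch.Tensor, steps: int = 5) -> torch.Tensor:
    """Approximate the orthogonal factor U V^T of G's SVD using the
    quintic Newton-Schulz iteration from the Muon reference code."""
    a, b, c = 3.4445, -4.7750, 2.0315
    X = G / (G.norm() + 1e-7)          # bound the spectral norm by ~1
    transpose = G.size(0) > G.size(1)  # iterate on the wide orientation
    if transpose:
        X = X.T
    for _ in range(steps):
        A = X @ X.T
        X = a * X + (b * A + c * A @ A) @ X
    return X.T if transpose else X

@torch.no_grad()
def muon_step(param, grad, momentum, lr=0.02, beta=0.95):
    """One Muon update for a single 2-D weight matrix."""
    momentum.mul_(beta).add_(grad)            # momentum accumulation
    update = grad.add(momentum, alpha=beta)   # Nesterov-style lookahead
    update = newton_schulz(update)            # orthogonalize the step
    # Shape-dependent scale keeps update RMS roughly constant across layers.
    update *= max(1.0, param.size(0) / param.size(1)) ** 0.5
    param.add_(update, alpha=-lr)
```

Per Moonshot's report, it was reportedly the QK-clip addition that kept the 15.5T-token run stable, with no loss spikes.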