Person: "_catwu"
DeepSeek V3.1: 840B token continued pretrain, beating Claude 4 Sonnet at 11% of its cost
deepseek-v3.1 seed-oss-36b computerrl gemini-2.5-pro gpt-5 claude-code gpt-oss-120b gpt-oss-20b deepseek bytedance zhipu-ai github microsoft anthropic together-ai baseten huggingface token-efficiency coding agentic-benchmarks long-context reinforcement-learning developer-tools fine-tuning multinode-training model-release teortaxestex rasbt lukehoban burkeholland _catwu cline winglian
DeepSeek released DeepSeek V3.1, a quietly rolled-out open model with a 128K context window and improvements in token efficiency, coding, and agentic benchmarks. ByteDance launched the permissive Seed-OSS 36B model on Hugging Face, noted for long-context and reasoning capabilities. Zhipu AI introduced ComputerRL, a reinforcement learning framework for computer-use agents, achieving strong benchmark results. In developer tooling, GitHub Copilot expanded globally, Microsoft VS Code integrated Gemini 2.5 Pro and updated GPT-5 agent prompts, and Anthropic launched Claude Code seats with spend controls. Open-source fine-tuning advances include Together AI adding SFT for gpt-oss-120B/20B and Baseten enabling multinode 120B training with Truss CLI. The community noted mixed performance and ongoing post-training adjustments for DeepSeek V3.1.
not much happened today
gpt-5 gpt-oss-120b opus-4.1 sonnet-4 openai anthropic minimax context-windows model-routing model-hosting multi-tool-pipelines prompt-caching model-extraction model-pairing cost-efficiency model-optimization sama jeremyphoward jxmnop _catwu
OpenAI continues small updates to GPT-5, introducing "Auto/Fast/Thinking" modes with 196k token context, 3,000 messages/week, and dynamic routing to cheaper models for cost efficiency. The MiniMax AI Agent Challenge offers $150,000 in prizes for AI agent development by August 25. The community discusses GPT-OSS-120B base model extraction, hosting, and tooling improvements, including multi-tool pipelines and flex-attention. Anthropic announces model pairing in Claude Code with Opus 4.1 for planning and Sonnet 4 for execution, expanding context to 1M tokens and introducing prompt caching. Key figures include @sama, @jeremyphoward, @jxmnop, and @_catwu.