Company: "block"
Anthropic launches the MCP Apps open spec, in Claude.ai
claude-ai toolorchestra-8b qwen3-max-thinking anthropic openai block vs-code antigravity jetbrains aws nvidia alibaba claude-ai agent-orchestration reinforcement-learning recursive-language-models context-management user-experience security prompt-injection reasoning adaptive-tool-use model-evaluation benchmarking
Anthropic has officially absorbed the independent MCP UI project and, in collaboration with OpenAI, Block, VS Code, Antigravity, JetBrains, and AWS, released the MCP Apps spec along with official support in Claude.ai. The standard aims to enable an ecosystem of interoperable applications with rich UI, addressing the proliferation of subscription services. Meanwhile, NVIDIA introduced ToolOrchestra, which uses an 8B orchestrator model trained via scalable reinforcement learning for efficient agent orchestration. The concept of Recursive Language Models (RLMs) is gaining traction for efficient context management in agent stacks. The “Clawdbot” UX pattern emphasizes outcome-first assistant design with tight context and tool integration, which has sparked security concerns around prompt injection. Alibaba launched Qwen3-Max-Thinking, a flagship reasoning and agent model with adaptive tool use and strong benchmark scores, now available on public evaluation platforms such as LM Arena and Yupp.
MCP -> Agentic AI Foundation, Mistral Devstral 2
devstral-2 devstral-small-2 sonnet-4.3 deepseek-v3.2 qwen3-vl openai anthropic block mistral-ai alibaba linux-foundation deepseek agentic-ai coding-models reinforcement-learning model-performance model-optimization open-weights cli-tools multi-file-code-automation data-decontamination moe reward-models rl-stability guillaumelample b_roziere qtnx_ charliermarsh omarsar0 eliebakouch justinwaugh cwolferesearch pan
The Agentic AI Foundation launched under the Linux Foundation, a significant collaborative milestone that unites projects from Anthropic, OpenAI, and Block. Mistral released Devstral 2, a 123B-parameter coding model with open weights, positioned as a cost-effective alternative to Sonnet 4.3 and competitive with DeepSeek v3.2. The new Mistral Vibe CLI supports agentic coding workflows and is seeing rapid ecosystem integration. Alibaba introduced Soft Adaptive Policy Optimization (SAPO) for reinforcement learning tuning, improving stability and performance in Qwen3-VL across multiple tasks. Research highlights include the importance of data decontamination in RL and ongoing discussions of MoE RL stability and reward hacking mitigation.