Model: "glm-4.6"
Air Street's State of AI 2025 Report
glm-4.6 jamba-1.5 rnd1 claude-code reflection mastra datacurve spellbook kernel figure softbank abb radicalnumerics zhipu-ai ai21-labs anthropic humanoid-robots mixture-of-experts diffusion-models open-weight-models reinforcement-learning benchmarking small-language-models plugin-systems developer-tools agent-stacks adcock_brett achowdhery clementdelangue
Reflection raised $2B to build frontier open-weight models with a focus on safety and evaluation, led by a team whose members previously worked on AlphaGo, PaLM, and Gemini. Figure launched its next-gen humanoid robot, Figure 03, emphasizing non-teleoperated capabilities for home and large-scale use. Radical Numerics released RND1, a 30B-parameter sparse MoE diffusion language model with open weights and code to advance diffusion LM research. Zhipu posted strong results with GLM-4.6 on the Design Arena benchmark, while AI21 Labs' Jamba Reasoning 3B leads tiny reasoning models. Anthropic introduced a plugin system for Claude Code to enhance developer tools and agent stacks. The report also highlights SoftBank's acquisition of ABB's robotics unit for $5.4B and the growing ecosystem around open frontier modeling and small-model reasoning.
Gemini 2.5 Computer Use preview beats Sonnet 4.5 and OAI CUA
gemini-2.5 gpt-5-pro glm-4.6 codex google-deepmind openai microsoft anthropic zhipu-ai llamaindex mongodb agent-frameworks program-synthesis security multi-agent-systems computer-use-models open-source moe developer-tools workflow-automation api vision reasoning swyx demishassabis philschmid assaf_elovic hwchase17 jerryjliu0 skirano fabianstelzer blackhc andrewyng
Google DeepMind released a new Gemini 2.5 Computer Use model for browser and Android UI control, evaluated by Browserbase. At DevDay, OpenAI showcased GPT-5 Pro and new developer tools, including Codex with Slack integration and agent-building SDKs. Google DeepMind's CodeMender automates security patching for large codebases. Microsoft introduced an open-source Agent Framework for multi-agent enterprise systems. AI community discussions highlighted advances in agent orchestration, program synthesis, and UI control. Zhipu's GLM-4.6 update features a large Mixture-of-Experts model with 355B parameters.