All tags
Company: "moonshot"
not much happened today
kimi-k2.6 qwen-3.6-max-preview moonshot alibaba vllm openrouter cloudflare baseten mlx nous-research opencode ollama mixture-of-experts multimodality int4-quantization long-context agentic-coding multi-agent-systems model-orchestration memory-consolidation llm-driven-replanning dynamic-context-injection
Moonshot's Kimi K2.6 is a major open-weight 1T-parameter MoE model featuring 32B active parameters, 384 experts, MLA attention, 256K context window, native multimodality, and INT4 quantization. It supports day-0 integration with platforms like vLLM, OpenRouter, Cloudflare Workers AI, and others, showcasing state-of-the-art performance on benchmarks such as HLE w/ tools 54.0, SWE-Bench Pro 58.6, and Math Vision w/ python 93.2. The model excels in long-horizon execution with over 4,000 tool calls, 12+ hour continuous runs, and 300 parallel sub-agents. Meanwhile, Alibaba's Qwen3.6-Max-Preview previewed enhanced agentic coding, improved world knowledge, and instruction following, with notable performance on AIME 2026 #15 and ranking in Code Arena. Hermes Agent is rapidly expanding its ecosystem, surpassing 100K GitHub stars and integrating with tools like Ollama and Copilot CLI, while pioneering advanced multi-agent orchestration techniques such as stateless ephemeral units, LLM-driven replanning, and dynamic context injection. These developments highlight the competitive momentum of Chinese open and semi-open labs in coding and agent models.
not much happened today
kimi-linear-48b codex gpt-5.4 claude-code moonshot openai assemblyai langchain attention-mechanisms model-architecture inference-speed agent-feedback agent-skills multi-agent-systems knowledge-transfer cli-tools coding-agents model-deployment kimi_moonshot elonmusk yuchenj_uw nathancgy4 eliebakouch tokenbender behrouz_ali cloneofsimo fidjissimo sama gdb andrewyng itsafiz simplifyinai
Moonshot's Attention Residuals paper introduced an input-dependent attention mechanism over prior layers with a 1.25x compute advantage and less than 2% inference latency overhead, validated on Kimi Linear 48B total / 3B active. The paper sparked debate on novelty versus prior art like DeepCrossAttention and Googleโs earlier work, highlighting tensions in idea novelty, citation quality, and frontier-scale validation. OpenAI's Codex showed strong momentum with over 2M weekly active users, nearly 4x growth YTD, and GPT-5.4 hitting 5T tokens/day and a $1B annualized run-rate. Codex added subagents supporting multi-agent coding workflows. Infrastructure for coding agents matured with tools like Context Hub / chub supporting agent feedback loops, AssemblyAI's skill for Claude Code and Codex, and automated skill extraction from GitHub repos yielding 40% knowledge-transfer gains. LangChain launched LangGraph CLI and open-sourced Deep Agents, recreating top coding agent workflows with planning, filesystem ops, shell access, and sub-agents.
Qwen-Image 2.0 and Seedance 2.0
gpt-5.2 gpt-5.3-codex claude-opus-4.6 gemini-3-pro qwen-image-2.0 seedance-2.0 openai langchain-ai anthropic google-deepmind mistral-ai alibaba bytedance moonshot agentic-sandboxes multi-model-orchestration server-side-compaction coding-agent-ux long-running-agents model-release text-to-video image-generation parallel-execution funding git-compatible-database token-efficiency workflow-optimization hwchase17 nabbilkhan sydneyrunkle joecuevasjr pierceboggan reach_vb gdb ashtom
OpenAI advances its Responses API for multi-hour agent workflows with features like server-side compaction, hosted containers, and Skills API, alongside upgrading Deep Research to GPT-5.2 and adding connectors. Discussions around sandbox design highlight a shift towards sandbox-as-a-tool architectures, with LangChain enhancing its deepagents v0.4 with pluggable sandbox backends. Coding agent UX evolves with multi-model orchestration involving Claude Opus 4.6, GPT-5.3-Codex, and Gemini 3 Pro. EntireHQ raised $60M seed funding for a Git-compatible database capturing code intent and agent context. In model releases, Alibaba Qwen launched Qwen-Image-2.0 emphasizing 2K resolution and 1K-token prompts for unified generation and editing. ByteDance's Seedance 2.0 marks a significant leap in text-to-video quality, while Moonshot's Kimi introduces an Agent Swarm with up to 100 sub-agents and 4.5ร faster parallel execution.
not much happened today
Poolside raised $1B at a $12B valuation. Eric Zelikman raised $1B after leaving Xai. Weavy joined Figma. New research highlights FP16 precision reduces training-inference mismatch in reinforcement-learning fine-tuning compared to BF16. Kimi AI introduced a hybrid KDA (Kimi Delta Attention) architecture improving long-context throughput and RL stability, alongside a new Kimi CLI for coding with agent protocol support. OpenAI previewed Agent Mode in ChatGPT enabling autonomous research and planning during browsing.