Company: "minimax"

MiniMax M2.7: GLM-5 at 1/3 cost, SOTA Open Model

minimax-m2.7 sonnet-4.6 glm-5 mimo-v2-pro mamba-3 qwen-3.5 kimi-k2.5 gpt-5.4-mini minimax xiaomi artificial-analysis ollama trae yupp openrouter vercel zo opencode kilocode cartesia self-evolving-agents reasoning cost-efficiency token-efficiency hybrid-architecture harness-engineering agent-harnesses skills memory-optimization architecture feedback-loops api inference execution-environment

MiniMax M2.7 is the headline model release, described as a "self-evolving agent" with reported scores of 56.22% on SWE-Pro and 57.0% on Terminal Bench 2, and claimed parity with Sonnet 4.6. It is said to recursively improve its own skills, memory, and architecture. Artificial Analysis places M2.7 on the cost/performance frontier with an Intelligence Index score of 50, matching GLM-5 (Reasoning) at roughly one-third of the cost. The model is distributed via platforms such as Ollama cloud and OpenRouter. Xiaomi’s MiMo-V2-Pro is noted as a serious Chinese API-only reasoning model, scoring 49 on the Intelligence Index with favorable token efficiency. Cartesia’s Mamba-3 is highlighted as an SSM optimized for inference-heavy use, with early reactions focusing on hybrid transformer architectures like Qwen3.5 and Kimi Linear. The report emphasizes a shift from prompting to harness engineering, where the execution environment and agent harnesses, including skills and MCP, are becoming key differentiators in AI system design. Related discussion covers tools, repo legibility, constraints, and feedback loops, and mentions DSPy and GPT-5.4 mini.

Feb 24

Anthropic accuses DeepSeek, Moonshot, and MiniMax of "industrial-scale distillation attacks".

claude claude-3 codex claude-code anthropic deepseek moonshot-ai minimax openai ollama api-abuse-resistance model-security agentic-engineering coding-agents model-distillation workflow-automation sandboxing realtime-communication simon_willison

Anthropic alleges industrial-scale distillation attacks on its Claude model by DeepSeek, Moonshot AI, and MiniMax, involving ~24,000 fraudulent accounts and >16M Claude exchanges to extract capabilities, raising concerns about competitive risks and safety. The community debates the difference between scraping and API-output extraction, highlighting a shift toward protecting models via API abuse resistance techniques. Meanwhile, coding agents like Codex and Claude Code see real adoption and failures, with emerging best practices in "agentic engineering" led by Simon Willison. The OpenClaw ecosystem expands with alternatives like NanoClaw and integrations such as Ollama 0.17 simplifying open model usage.

Feb 16

Qwen3.5-397B-A17B: the smallest Open-Opus class, very efficient model

qwen3.5-397b-a17b qwen3.5-plus qwen3-max qwen3-vl kimi alibaba openai deepseek z-ai minimax kimi unsloth ollama vllm native-multimodality spatial-intelligence sparse-moe long-context model-quantization model-architecture model-deployment inference-optimization apache-2.0-license pete_steinberger justinlin610

Alibaba released Qwen3.5-397B-A17B, an open-weight model featuring native multimodality, spatial intelligence, and a hybrid linear attention + sparse MoE architecture supporting 201 languages and context windows up to 256K tokens. The model improves on previous versions such as Qwen3-Max and Qwen3-VL, with a sparsity ratio of about 4.3% (active parameters as a share of total parameters). Community discussions highlighted Gated Delta Networks as the mechanism enabling efficient inference despite the large model size (~800GB in BF16), and reported successful local runs on Apple Silicon using quantization. The hosted API version, Qwen3.5-Plus, supports 1M context and integrates search and code interpreter features. The release follows recent large-model refreshes from other Chinese labs, including Z.ai, MiniMax, and Kimi. The model is licensed under Apache-2.0 and is expected to be the last major release before DeepSeek v4. The news also notes Pete Steinberger joining OpenAI.
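The sparsity and size figures quoted above can be checked with a little arithmetic. The sketch below assumes the model name encodes 397B total and 17B active parameters, and that BF16 stores each weight in 2 bytes; the 4-bit line is a hypothetical illustration of why quantization makes local runs feasible, not a figure from the release. It counts weights only and ignores activations and KV cache.

```python
# Back-of-the-envelope check of the Qwen3.5-397B-A17B figures.
# Assumption: "397B-A17B" means 397B total parameters, 17B active per token.
TOTAL_PARAMS = 397e9
ACTIVE_PARAMS = 17e9
BYTES_PER_BF16 = 2      # bfloat16 uses 16 bits per weight
BYTES_PER_4BIT = 0.5    # hypothetical 4-bit quantization, for illustration only


def active_ratio(active: float, total: float) -> float:
    """Share of parameters used per token in a sparse MoE model."""
    return active / total


def weight_footprint_gb(params: float, bytes_per_param: float) -> float:
    """Weight storage in decimal gigabytes (weights only)."""
    return params * bytes_per_param / 1e9


ratio = active_ratio(ACTIVE_PARAMS, TOTAL_PARAMS)
bf16_gb = weight_footprint_gb(TOTAL_PARAMS, BYTES_PER_BF16)
q4_gb = weight_footprint_gb(TOTAL_PARAMS, BYTES_PER_4BIT)

print(f"active ratio: {ratio:.1%}")        # about 4.3%
print(f"BF16 weights: {bf16_gb:.0f} GB")   # 794 GB, i.e. roughly 800GB
print(f"4-bit weights: {q4_gb:.1f} GB")
```

Under these assumptions the numbers line up with the summary: 17/397 is about 4.3%, and 397B weights at 2 bytes each come to 794 GB.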

Dec 29, 2025

Meta Superintelligence Labs acquires Manus AI for over $2B, at $100M ARR, 9 months after launch

glm-4.7 minimax-m2.1 vllm manus benchmark meta-ai-fair vllm amd sglang weaviate teknim baseten alphaxiv minimax performance-optimization inference-frameworks model-benchmarking model-deployment open-source-models multimodality api code-generation community-building alex_wang nat_friedman

Manus grew rapidly in 2025, raising funding from Benchmark at a reported valuation of roughly $500M and reaching $100M ARR before being acquired by Meta for a reported price of over $2B. The vLLM team launched a dedicated community site with new resources, while performance issues with FP8 on the AMD MI300X were noted in vLLM and sglang benchmarks. Weaviate released operational features including Object TTL, a Java v6 client (GA), and multimodal document embeddings. Teknium raised concerns about API fragmentation and advocated for unified SDK wrappers. In open-weight models, GLM-4.7 gained recognition as a reliable coding model with faster throughput on Baseten, and MiniMax-M2.1 emerged as a leading open agentic coding model, topping WebDev leaderboards.

Oct 30, 2025

not much happened today

kimi-linear kimi-delta-attention minimax-m2 looped-llms aardvark-gpt-5 moonshot-ai minimax bytedance princeton mila openai cursor cognition hkust long-context attention-mechanisms agentic-ai tool-use adaptive-compute coding-agents performance-optimization memory-optimization reinforcement-learning model-architecture kimi_moonshot scaling01 uniartisan omarsar0 aicodeking songlinyang4 iscienceluvr nrehiew_ gdb embeddedsec auchenberg simonw

Moonshot AI released Kimi Linear, built on Kimi Delta Attention (KDA), with day-0 infrastructure support and strong long-context metrics, achieving up to 75% KV cache reduction and up to 6x decoding throughput. MiniMax M2 pivoted to full attention for multi-hop reasoning, maintaining strong agentic coding performance with 200k context and ~100 TPS. ByteDance, Princeton, and Mila introduced Looped LLMs, reporting performance comparable to larger transformers. OpenAI's Aardvark (GPT-5) entered private beta as an agentic security researcher for scalable vulnerability discovery. Cursor launched faster cloud coding agents, though transparency concerns arose regarding base-model provenance. Cognition released a public beta for a desktop/mobile tool-use agent named Devin. The community discussed advanced attention mechanisms and adaptive compute techniques.

Aug 13, 2025

not much happened today

gpt-5 gpt-oss-120b opus-4.1 sonnet-4 openai anthropic minimax context-windows model-routing model-hosting multi-tool-pipelines prompt-caching model-extraction model-pairing cost-efficiency model-optimization sama jeremyphoward jxmnop _catwu

OpenAI continues small updates to GPT-5, introducing "Auto/Fast/Thinking" modes with 196k token context, 3,000 messages/week, and dynamic routing to cheaper models for cost efficiency. The MiniMax AI Agent Challenge offers $150,000 in prizes for AI agent development by August 25. The community discusses GPT-OSS-120B base model extraction, hosting, and tooling improvements, including multi-tool pipelines and flex-attention. Anthropic announces model pairing in Claude Code with Opus 4.1 for planning and Sonnet 4 for execution, expanding context to 1M tokens and introducing prompt caching. Key figures include @sama, @jeremyphoward, @jxmnop, and @_catwu.

Jun 18, 2025

Zuck goes Superintelligence Founder Mode: $100M bonuses + $100M+ salaries + NFDG Buyout?

llama-4 maverick scout minimax-m1 afm-4.5b chatgpt midjourney-v1 meta-ai-fair openai deeplearning-ai essential-ai minimax arcee midjourney long-context multimodality model-release foundation-models dataset-release model-training video-generation enterprise-ai model-architecture moe prompt-optimization sama nat dan ashvaswani clementdelangue amit_sangani andrewyng _akhaliq

Meta AI is reportedly offering 8-9 figure signing bonuses and salaries to top AI talent, a claim confirmed by Sam Altman. Meta is also reportedly targeting key figures such as Nat Friedman and Daniel Gross (NFDG) of the AI Grant fund as strategic hires. Essential AI released the massive 24-trillion-token Essential-Web v1.0 dataset with rich metadata and a 12-category taxonomy. DeepLearning.AI and Meta AI launched a course on Llama 4, featuring the new MoE models Maverick (400B) and Scout (109B) with context windows up to 10M tokens. MiniMax open-sourced MiniMax-M1, a long-context LLM with a 1M-token window, and introduced the Hailuo 02 video model. OpenAI rolled out "Record mode" for ChatGPT Pro, Enterprise, and Edu on macOS. Arcee launched the AFM-4.5B foundation model for enterprise. Midjourney released its V1 video model enabling image animation. These developments highlight major advances in model scale, long-context reasoning, multimodality, and enterprise AI applications.

Mar 10, 2025

not much happened today

gpt-4.5 claude-3.7-sonnet deepseek-r1 smolagents-codeagent gpt-4o llama-3-8b tinyr1-32b-preview r1-searcher forgetting-transformer nanomoe openai deepseek hugging-face mixture-of-experts reinforcement-learning kv-cache-compression agentic-ai model-distillation attention-mechanisms model-compression minimax model-pretraining andrej-karpathy cwolferesearch aymericroucher teortaxestex jonathanross321 akhaliq

The AI news recap highlights several key developments: nanoMoE, a PyTorch implementation of a mid-sized Mixture-of-Experts (MoE) model inspired by Andrej Karpathy's nanoGPT, enables pretraining on commodity hardware within a week. An agentic leaderboard ranks LLMs powering smolagents CodeAgent, with GPT-4.5 leading, followed by Claude-3.7-Sonnet. Discussions around DeepSeek-R1 emphasize AI model commoditization, with DeepSeek dubbed the "OpenAI of China." Q-Filters offer a training-free method for KV cache compression in autoregressive models, achieving 32x compression with minimal perplexity loss. The PokéChamp minimax language agent, powered by GPT-4o and Llama-3-8b, demonstrates strong performance in Pokémon battles. Other notable models include TinyR1-32B-Preview with Branch-Merge Distillation, R1-Searcher incentivizing search capability via reinforcement learning, and the Forgetting Transformer using a Forget Gate in softmax attention. These advancements reflect ongoing innovation in model architectures, compression, reinforcement learning, and agentic AI.