Company: "cartesia"
MiniMax M2.7: GLM-5 Performance at 1/3 the Cost, SOTA Open Model
minimax-m2.7 sonnet-4.6 glm-5 mimo-v2-pro mamba-3 qwen-3.5 kimi-k2.5 gpt-5.4-mini minimax xiaomi artificial-analysis ollama trae yupp openrouter vercel zo opencode kilocode cartesia self-evolving-agents reasoning cost-efficiency token-efficiency hybrid-architecture harness-engineering agent-harnesses skills memory-optimization architecture feedback-loops api inference execution-environment
MiniMax M2.7 is the headline model release, described as a "self-evolving agent" with strong benchmark results: 56.22% on SWE-Pro, 57.0% on Terminal Bench 2, and parity with Sonnet 4.6. It features recursive self-improvement across skills, memory, and architecture. Artificial Analysis places M2.7 on the cost/performance frontier with an Intelligence Index score of 50, matching GLM-5 (Reasoning) at a fraction of the cost. The model is distributed via platforms such as Ollama Cloud and OpenRouter. Xiaomi's MiMo-V2-Pro is noted as a serious Chinese API-only reasoning model, scoring 49 on the Intelligence Index with favorable token efficiency. Cartesia's Mamba-3 is highlighted as an SSM optimized for inference-heavy use, with early reactions focusing on hybrid transformer architectures like Qwen 3.5 and Kimi Linear. The report emphasizes a shift from prompting to harness engineering: the execution environment and agent harnesses, including skills and MCP, are becoming key differentiators in AI system design. This includes discussions of tools, repo legibility, constraints, and feedback loops, with DSPy and GPT-5.4 mini mentioned as notable components of this evolving landscape.
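The "harness engineering" idea above can be sketched as a minimal tool-calling loop in which the harness, not the model, owns tool dispatch, the execution environment, and the feedback loop. Everything here (the `run_harness` signature, the action schema, the toy model and `echo` tool) is a hypothetical illustration, not any vendor's actual harness:

```python
def run_harness(model, tools, task, max_steps=8):
    """Minimal illustrative agent loop: the harness mediates between the
    model's proposed actions and the execution environment."""
    history = [{"role": "user", "content": task}]
    for _ in range(max_steps):
        action = model(history)                 # model proposes the next step
        if action["type"] == "final":           # model signals it is done
            return action["content"]
        result = tools[action["tool"]](**action["args"])      # execute the tool
        history.append({"role": "tool", "content": result})   # feed result back
    return None  # step budget exhausted

# Toy model and tool to exercise the loop (purely illustrative).
def toy_model(history):
    if history[-1]["role"] == "user":
        return {"type": "tool", "tool": "echo", "args": {"text": "hi"}}
    return {"type": "final", "content": history[-1]["content"]}

answer = run_harness(toy_model, {"echo": lambda text: text.upper()}, "say hi")
```

The differentiators the report names (tools, constraints, feedback loops, the execution environment) all live in this outer loop rather than in the prompt itself.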
1 TRILLION token context, real time, on device?
gemini-1.5-pro gemini-1.5 cartesia mistral-ai scale-ai state-space-models voice-models multimodality model-performance on-device-ai long-context evaluation-leaderboards learning-rate-optimization scientific-publishing research-vs-engineering yann-lecun elon-musk
Cartesia, a startup specializing in state space models (SSMs), launched a low-latency voice model that outperforms transformer-based models with 20% lower perplexity, 2x lower word error rate, and 1 point higher NISQA quality. The breakthrough points toward models that can continuously process and reason over massive streams of multimodal data (text, audio, video) with a trillion-token context window, on-device. The news also covers Mistral's Codestral weights release, the Schedule-Free optimizers paper, and Scale AI's new Elo-style eval leaderboards. Additionally, a debate between Yann LeCun and Elon Musk on the importance of publishing AI research versus engineering achievements was noted, and the Gemini 1.5 Pro/Advanced models were mentioned for their strong performance.
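Elo-style leaderboards like the one mentioned above rank models from pairwise human preferences rather than absolute scores. A minimal sketch of the standard Elo update (a generic illustration, not Scale AI's actual implementation; the K-factor of 32 and starting rating of 1000 are illustrative constants):

```python
def elo_update(r_a, r_b, score_a, k=32):
    """One Elo update after a pairwise comparison.
    score_a: 1.0 if A is preferred, 0.0 if B is, 0.5 for a tie."""
    expected_a = 1 / (1 + 10 ** ((r_b - r_a) / 400))  # P(A beats B) under Elo
    delta = k * (score_a - expected_a)                # zero-sum rating shift
    return r_a + delta, r_b - delta

# Two models start at 1000; model A wins one head-to-head comparison.
a, b = elo_update(1000, 1000, 1.0)  # a rises to 1016.0, b falls to 984.0
```

Because the update is zero-sum and scales with how surprising the outcome was, ratings converge toward a ranking consistent with the observed win rates.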