All tags
Company: "slime-framework"
Terminal-Bench 2.0 and Harbor
kimi-k2-thinking moonshot-ai anthropic hugging-face ollama slime-framework benchmarking agentic-ai quantization model-optimization inference model-deployment moe context-windows cost-efficiency clementdelangue dbreunig awnihannun crystalsssup kimi_moonshot
Terminal-Bench has fixed task issues and launched version 2.0 with cloud container support via the Harbor framework, gaining recognition from models like Claude 4.5 and Kimi K2 Thinking. Moonshot AI's Kimi K2 Thinking is a 1 trillion parameter MoE reasoning model with ~32B active parameters, running natively in INT4 quantization and featuring a 256K context window. It leads open-weights benchmarks with an Artificial Analysis Intelligence Index score of 67 and strong agentic performance, running efficiently on consumer Apple silicon and 2× M3 Ultra hardware. The model is broadly available on Hugging Face, Ollama Cloud, and integrated into frameworks like slime. Serving bottlenecks were traced to network bandwidth rather than GPU limits, highlighting infrastructure considerations for LLM deployment.