Person: "dbreunig"
Terminal-Bench 2.0 and Harbor
kimi-k2-thinking moonshot-ai anthropic hugging-face ollama slime-framework benchmarking agentic-ai quantization model-optimization inference model-deployment moe context-windows cost-efficiency clementdelangue dbreunig awnihannun crystalsssup kimi_moonshot
Terminal-Bench fixed issues with its tasks and launched version 2.0, which adds cloud container support via the Harbor framework; the benchmark has gained recognition from models such as Claude 4.5 and Kimi K2 Thinking. Moonshot AI's Kimi K2 Thinking is a 1 trillion parameter MoE reasoning model with ~32B active parameters that runs natively in INT4 quantization and has a 256K context window. It leads open-weights models with an Artificial Analysis Intelligence Index score of 67 and strong agentic performance, and it runs efficiently on consumer Apple silicon such as 2× M3 Ultra hardware. The model is broadly available on Hugging Face and Ollama Cloud and is integrated into frameworks like slime. Serving bottlenecks were traced to network bandwidth rather than GPU limits, highlighting infrastructure considerations for LLM deployment.
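As a rough illustration of why INT4 matters at this scale, the sketch below estimates weight storage from the parameter counts quoted in the summary. It is a back-of-the-envelope calculation only: the helper name `weight_memory_gb` is hypothetical, and the estimate assumes every weight is stored at the stated bit width, ignoring quantization scales, any layers kept in higher precision, the KV cache, and activations.

```python
def weight_memory_gb(num_params: float, bits_per_param: int) -> float:
    """Approximate weight storage in gigabytes (10^9 bytes).

    Hypothetical helper: assumes all parameters use the same bit width.
    """
    return num_params * bits_per_param / 8 / 1e9


total_params = 1e12    # 1 trillion total parameters (from the summary)
active_params = 32e9   # ~32B active parameters per token (from the summary)

int4_total_gb = weight_memory_gb(total_params, 4)    # INT4: 4 bits per weight
bf16_total_gb = weight_memory_gb(total_params, 16)   # 16-bit baseline for comparison
active_fraction = active_params / total_params       # share of weights used per token

print(f"INT4 weights:  ~{int4_total_gb:.0f} GB")
print(f"16-bit weights: ~{bf16_total_gb:.0f} GB")
print(f"Active per token: ~{active_fraction:.1%} of parameters")
```

Under these assumptions the weights alone are about a quarter the size of a 16-bit copy, and only a small fraction of them is touched per token, which is consistent with the summary's point that the model can run on a small multi-machine Apple silicon setup.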
OpenAI Dev Day: Apps SDK, AgentKit, Codex GA, GPT‑5 Pro and Sora 2 APIs
gpt-5-pro gpt-realtime-mini-2025-10-06 gpt-audio-mini-2025-10-06 gpt-image-1-mini sora-2 sora-2-pro openai canva figma zillow coursera api model-release fine-tuning agentic-ai code-generation model-deployment pricing prompt-optimization software-development multimodality sama edwinarbus gdb dbreunig stevenheidel
OpenAI showcased major product launches at its DevDay, including the Apps SDK, AgentKit, and Codex, which is now generally available with an SDK and enterprise features. It introduced new models such as gpt-5-pro, gpt-realtime-mini-2025-10-06, gpt-audio-mini-2025-10-06, gpt-image-1-mini, and sora-2 with a pro variant. The Apps SDK enables embedding interactive apps inside ChatGPT with partners like Canva, Figma, Zillow, and Coursera. AgentKit offers a full stack for building and deploying production agents with tools like ChatKit and Guardrails. Codex supports speech- and controller-driven coding and is credited with high internal shipping velocity. GPT-5 Pro pricing was revealed at $15 per million input tokens and $120 per million output tokens. Highlights included "OpenAI turned ChatGPT into an application platform" and "AgentKit built a working agent in under 8 minutes".
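To make the quoted GPT-5 Pro pricing concrete, here is a minimal cost sketch using the $15 input and $120 output per-million-token rates from the summary. The function name `request_cost_usd` and the token counts are illustrative assumptions, and the calculation ignores any discounts, caching, or other billing details not mentioned above.

```python
# Rates quoted in the summary, in USD per one million tokens.
INPUT_USD_PER_MILLION = 15.0
OUTPUT_USD_PER_MILLION = 120.0


def request_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request (hypothetical helper, flat rates only)."""
    total = (input_tokens * INPUT_USD_PER_MILLION
             + output_tokens * OUTPUT_USD_PER_MILLION)
    return total / 1_000_000


# Illustrative request: 10,000 input tokens and 2,000 output tokens.
example_cost = request_cost_usd(10_000, 2_000)
print(f"Estimated cost: ${example_cost:.2f}")
```

Because output tokens cost eight times as much as input tokens at these rates, the 2,000 output tokens in this example account for more of the bill than the 10,000 input tokens.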