Topic: "model-bugs"

Jul 14, 2025

kimi-k2 grok-4 gpt-5 gemini-2.5 gemini-embedding cognition windsurf moonshot-ai x-ai openai google stanfordnlp huggingface mixture-of-experts model-training model-performance fine-tuning benchmarking agentic-ai model-bugs embedding-models sama hardmaru jeremyphoward akhaliq teortaxestex yuchenj_uw demishassabis

Cognition is acquiring the remaining assets of Windsurf after a significant deal over the weekend. Moonshot AI released Kimi K2, an open-source, MIT-licensed agentic model built on a Mixture-of-Experts architecture with 1 trillion total and 32B active parameters. It was trained on 15.5 trillion tokens with the MuonClip optimizer and shows top performance on benchmarks such as EQ-Bench and Creative Writing. xAI launched Grok-4, which ranks 5th on IQ Bench but shows notable quirks, including a bug that causes it to respond only with "Heavy" and a high frequency of Elon Musk mentions. Rumors surfaced that OpenAI is delaying an open-source model release, along with speculation about CEO sama's PR strategy and a possible GPT-5 launch in September. The Gemini 2.5 paper was released with 3,295 authors, and Google introduced its Gemini Embedding model, which tops the MTEB leaderboard.

