Figma's $50+b IPO
horizon-alpha gpt-5 gemini-2.5-pro qwen3-coder qwen3-coder-flash-30b-a3b command-a-vision gpt-4.1 llama-4-maverick flux-1-krea-dev glm-4.5 voxtral openai openrouter alibaba unslothai cohere huggingface black-forest-labs diffusers ostrisai zhipu-ai together-ai mistral-ai reasoning svg-generation agentic-ai context-windows vision fine-tuning inference-time-training model-generalization open-models technical-reports scaling01 teortaxestex huybery nickfrosst aidangomez reach_vb zai_org corbtt jxmnop teknuim1
OpenAI's stealth model horizon-alpha on OpenRouter sparked speculation that it is a precursor to GPT-5, showing strong reasoning and SVG-generation capabilities comparable to Gemini 2.5 Pro. Alibaba released the Qwen3-Coder family, including a fast Qwen3-Coder-Flash (30B-A3B) variant with agentic features and 1M-token context support via UnslothAI. Cohere launched Command A Vision, a 111B-parameter open-weights vision-language model that outperforms GPT-4.1 and Llama 4 Maverick on enterprise benchmarks. Black Forest Labs introduced FLUX.1 Krea [dev], an open-weights photorealism model compatible with fine-tuning tools such as diffusers and ostrisai. Zhipu AI unveiled GLM-4.5, a hybrid-reasoning open model with agentic capabilities, available on Together AI. Discussions highlighted the rising importance of inference-time training and reasoning-model generalization. Mistral AI released the technical report for Voxtral, continuing its open-science efforts.
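Stealth models like horizon-alpha are queried through OpenRouter's OpenAI-compatible chat completions endpoint. A minimal sketch of building such a request follows; the model slug "openrouter/horizon-alpha" and the prompt are assumptions for illustration, and the payload is only constructed here, not sent.

```python
import json

# OpenRouter exposes an OpenAI-compatible chat completions endpoint.
OPENROUTER_URL = "https://openrouter.ai/api/v1/chat/completions"

def build_request(prompt: str, model: str = "openrouter/horizon-alpha") -> dict:
    """Build the JSON payload for a single-turn chat completion.

    The model slug is an assumption based on OpenRouter's naming for
    stealth models; check the model listing for the current id.
    """
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }

payload = build_request("Generate an SVG of a bicycle.")
print(json.dumps(payload, indent=2))
```

Sending it requires a POST with an `Authorization: Bearer <api key>` header, as with any OpenAI-compatible API.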
not much happened today
kimi-k2 gpt-4.1 voxtral goedel-prover-v2 llama-3 mistral-ai moonshot-ai nous-research google-deepmind openai groq anthropic speech-recognition mixture-of-experts benchmarking dataset-release model-architecture theorem-proving reinforcement-learning asymmetry-of-verification inference-speed model-performance cline _jasonwei
Mistral released Voxtral, a family of models it claims are the world's best open speech recognition models, available via API and on Hugging Face. Moonshot AI launched Kimi K2, a trillion-parameter Mixture-of-Experts (MoE) model that outperforms GPT-4.1 on benchmarks, scoring 65.4% on SWE-Bench Verified and reaching 200 tokens/second inference speed on Groq hardware. Nous Research open-sourced the Hermes 3 dataset of 1 million samples, used to train SOTA models on the Llama-3 series. Google DeepMind introduced the Mixture-of-Recursions (MoR) architecture, promising 2x inference speed and a 50% parameter reduction, though it met skepticism. Goedel-Prover V2 topped the PutnamBench theorem-proving benchmark. The AtCoder World Finals saw a human winner, with OpenAI placing second. Research highlights include Jason Wei's insights on reinforcement learning and the "Verifier's Law," which emphasizes the asymmetry of verification in AI training.
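Kimi K2's trillion parameters come from its Mixture-of-Experts design: a router scores every expert per token, and only the top-k experts actually run, so active parameters per token stay a small fraction of the total. A minimal, illustrative top-k routing sketch (not Kimi K2's actual implementation):

```python
import math

def moe_forward(x, router_w, experts, k=2):
    """x: token vector; router_w: one gate row per expert; experts: callables."""
    # One routing score (logit) per expert.
    logits = [sum(w * v for w, v in zip(row, x)) for row in router_w]
    # Indices of the k highest-scoring experts.
    top = sorted(range(len(logits)), key=logits.__getitem__)[-k:]
    # Softmax over the selected experts only.
    exps = [math.exp(logits[i]) for i in top]
    total = sum(exps)
    weights = [e / total for e in exps]
    # Weighted sum of only the selected experts' outputs; the other
    # experts' parameters are never touched for this token.
    out = [0.0] * len(x)
    for w, i in zip(weights, top):
        out = [o + w * e for o, e in zip(out, experts[i](x))]
    return out

# Four toy "experts" that just scale the input by different factors.
experts = [lambda x, s=s: [s * v for v in x] for s in (0.5, 1.0, 2.0, 4.0)]
router_w = [[1, 0], [0, 1], [1, 1], [-1, -1]]  # gate row per expert
y = moe_forward([1.0, 2.0], router_w, experts, k=2)
print(y)
```

In a real MoE layer the experts are feed-forward networks and routing happens per token per layer; the sketch only shows the sparsity mechanism.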