All tags
Company: "unslothai"
Figma's $50+b IPO
horizon-alpha gpt-5 gemini-2.5-pro qwen3-coder qwen3-coder-flash-30b-a3b command-a-vision gpt-4.1 llama-4-maverick flux-1-krea-dev glm-4.5 voxtral openai openrouter alibaba unslothai cohere huggingface black-forest-labs diffusers ostrisai zhipu-ai together-ai mistral-ai reasoning svg-generation agentic-ai context-windows vision fine-tuning inference-time-training model-generalization open-models technical-reports scaling01 teortaxestex huybery nickfrosst aidangomez reach_vb zai_org corbtt jxmnop teknuim1
OpenAI's stealth model horizon-alpha on OpenRouter sparks speculation as a precursor to GPT-5, showing strong reasoning and SVG generation capabilities, comparable to Gemini 2.5 Pro. Alibaba released the Qwen3-Coder family, including a fast Qwen3-Coder-Flash (30B-A3B) variant with agentic features and 1M context length support via UnslothAI. Cohere launched Command A Vision, a 111B parameter open-weights vision-language model outperforming GPT-4.1 and Llama 4 Maverick on enterprise benchmarks. Black Forest Labs introduced FLUX.1 Krea [dev], an open-weights photorealism model compatible with fine-tuning tools like diffusers and ostrisai. Zhipu AI unveiled GLM-4.5, a hybrid reasoning open model with agentic capabilities available on Together AI. Discussions highlight the rising importance of inference-time training and reasoning model generalization. Mistral AI released the technical report for Voxtral continuing its open science efforts.
not much happened today
qwen3-coder-480b-a35b-instruct kimi-k2 alibaba openrouterai togethercompute vllm_project unslothai white-house code-generation benchmarking model-integration context-windows open-source national-security infrastructure ai-policy fchollet clementdelangue scaling01 aravsrinivas rasbt gregkamradt yuchenj_uw
Alibaba announced the release of Qwen3-Coder-480B-A35B-Instruct, an open agentic code model with 480B parameters and 256K context length, praised for rapid development and strong coding performance. Benchmark claims of 41.8% on ARC-AGI-1 faced skepticism from Fran ois Chollet and others due to reproducibility issues. The model quickly integrated into ecosystems like vLLM, Dynamic GGUFs, and OpenRouterAI. The White House unveiled a new AI Action Plan emphasizing Innovation, Infrastructure, and International Diplomacy, linking AI leadership to national security and prioritizing compute access for the Department of Defense. The plan sparked debate on open vs. closed-source AI, with calls from Clement Delangue to embrace open science to maintain US AI competitiveness.
not much happened today
gemma-3n glm-4.1v-thinking deepseek-r1t2 mini-max-m1 o3 claude-4-opus claude-sonnet moe-72b meta scale-ai unslothai zhipu-ai deepseek huawei minimax-ai allenai sakana-ai-labs openai model-performance vision conv2d float16 training-loss open-source model-benchmarks moe load-balancing scientific-literature-evaluation code-generation adaptive-tree-search synthesis-benchmarks alexandr_wang natfriedman steph_palazzolo thegregyang teortaxes_tex denny_zhou agihippo danielhanchen osanseviero reach_vb scaling01 ndea
Meta has hired Scale AI CEO Alexandr Wang as its new Chief AI Officer, acquiring a 49% non-voting stake in Scale AI for $14.3 billion, doubling its valuation to ~$28 billion. This move is part of a major talent shuffle involving Meta, OpenAI, and Scale AI. Discussions include the impact on Yann LeCun's influence at Meta and potential responses from OpenAI. In model news, Gemma 3N faces technical issues like vision NaNs and FP16 overflows, with fixes from UnslothAI. Chinese open-source models like GLM-4.1V-Thinking by Zhipu AI and DeepSeek R1T2 show strong performance and speed improvements. Huawei open-sourced a 72B MoE model with a novel load balancing solution. The MiniMax-M1 hybrid MoE model leads math benchmarks on the Text Arena leaderboard. AllenAI launched SciArena for scientific literature evaluation, where o3 outperforms others. Research from Sakana AI Labs introduces AB-MCTS for code generation, improving synthesis benchmarks.
not much happened today
gemma-3n hunyuan-a13b flux-1-kontext-dev mercury fineweb2 qwen-vlo o3-mini o4-mini google-deepmind tencent black-forest-labs inception-ai qwen kyutai-labs openai langchain langgraph hugging-face ollama unslothai nvidia amd multimodality mixture-of-experts context-windows tool-use coding image-generation diffusion-models dataset-release multilinguality speech-to-text api prompt-engineering agent-frameworks open-source model-release demishassabis reach_vb tri_dao osanseviero simonw clementdelangue swyx hwchase17 sydneyrunkle
Google released Gemma 3n, a multimodal model for edge devices available in 2B and 4B parameter versions, with support across major frameworks like Transformers and Llama.cpp. Tencent open-sourced Hunyuan-A13B, a Mixture-of-Experts (MoE) model with 80B total parameters and a 256K context window, optimized for tool calling and coding. Black Forest Labs released FLUX.1 Kontext [dev], an open image AI model gaining rapid Hugging Face adoption. Inception AI Labs launched Mercury, the first commercial-scale diffusion LLM for chat. The FineWeb2 multilingual pre-training dataset paper was released, analyzing data quality impacts. The Qwen team released Qwen-VLo, a unified visual understanding and generation model. Kyutai Labs released a top-ranked open-source speech-to-text model running on Macs and iPhones. OpenAI introduced Deep Research API with o3/o4-mini models and open-sourced prompt rewriter methodology, integrated into LangChain and LangGraph. The open-source Gemini CLI gained over 30,000 GitHub stars as an AI terminal agent.