Model: "claude-4.1"
OpenAI's gpt-oss 20B and 120B, Claude Opus 4.1, DeepMind Genie 3
gpt-oss-120b gpt-oss-20b gpt-oss claude-4.1-opus claude-4.1 genie-3 openai anthropic google-deepmind mixture-of-experts model-architecture agentic-ai model-training model-performance reasoning hallucination-detection gpu-optimization open-weight-models realtime-simulation sama rasbt sebastienbubeck polynoamial kaicathyc finbarrtimbers vikhyatk scaling01 teortaxestex
OpenAI released the gpt-oss family, gpt-oss-120b and gpt-oss-20b, its first open-weight models since GPT-2, designed for agentic tasks and licensed under Apache 2.0. Both models use a Mixture-of-Experts (MoE) architecture with a wide (rather than deep) design and some unusual features, including bias units in attention and a nonstandard SwiGLU variant. The 120B model was trained with about 2.1 million H100 GPU-hours. Meanwhile, Anthropic launched claude-4.1-opus, touted as the best coding model currently available. DeepMind showcased genie-3, a real-time world simulation model with minute-long consistency. Together, the releases highlight advances in open-weight models, reasoning capabilities, and world simulation. @sama, @rasbt, and @SebastienBubeck, among others, offered technical analysis and performance evaluations, noting both the models' strengths and their hallucination risks.
SmolLM3: the SOTA 3B reasoning open source LLM
smollm3-3b olmo-3 grok-4 claude-4 claude-4.1 gemini-nano hunyuan-a13b gemini-2.5 gemma-3n qwen2.5-vl-3b huggingface allenai openai anthropic google-deepmind mistral-ai tencent gemini alibaba open-source small-language-models model-releases model-performance benchmarking multimodality context-windows precision-fp8 api batch-processing model-scaling model-architecture licensing ocr elonmusk mervenoyann skirano amandaaskell clementdelangue loubnabenallal1 awnihannun swyx artificialanlys officiallogank osanseviero cognitivecompai aravsrinivas
Hugging Face released SmolLM3-3B, a fully open-source small reasoning model with open pretraining code and data, marking a high point for open-source models until Olmo 3 arrives. Grok 4 launched to mixed reactions, while concerns surfaced about Claude 4 being nerfed and about an imminent Claude 4.1. Gemini Nano is now shipping in Chrome 137+, enabling local LLM access for 3.7 billion users. Tencent introduced Hunyuan-A13B, an 80B-parameter model with a 256K context window that runs on a single H200 GPU. The Gemini API added a batch mode with 50% discounts on 2.5 models. MatFormer Lab launched tools for building custom-sized Gemma 3n models. Open-source OCR models derived from Qwen2.5-VL-3B, such as Nanonets-OCR-s and ChatDOC/OCRFlux-3B, were highlighted, along with licensing discussions involving Alibaba.