not much happened today
gpt-oss-120b gpt-oss-20b kimi-k2 deepseek-r1 qwen-3-32b openai huggingface microsoft llamaindex ollama baseten fireworksai cerebras groq together anthropic google uk-aisi sliding-window-attention mixture-of-experts rope context-length mxfp4-format synthetic-data reasoning-core-hypothesis red-teaming benchmarking coding-benchmarks model-performance fine-tuning woj_zaremba sama huybery drjimfan jxmnop scaling01 arunv30 kevinweil xikun_zhang_ jerryjliu0 ollama basetenco reach_vb gneubig shxf0072 _lewtun
OpenAI released its first open-weight models since GPT-2, gpt-oss-120b and gpt-oss-20b, which quickly trended on Hugging Face. Microsoft supports the models via Azure AI Foundry and Windows Foundry Local. Key architectural choices include sliding window attention, mixture of experts (MoE), a RoPE variant, and a 128k context length. The weights ship in the new MXFP4 format, which llama.cpp already supports. One hypothesis holds that gpt-oss was trained largely on synthetic data to improve safety and performance, lending support to the Reasoning Core Hypothesis. OpenAI announced a $500K red-teaming bounty with partners including Anthropic, Google, and the UK AISI. Performance critiques highlight inconsistent benchmark results: gpt-oss-120b scored 41.8% on the Aider Polyglot coding benchmark, trailing competitors like Kimi-K2 and DeepSeek-R1. Some users note the model excels at math and reasoning but lacks common sense and practical utility.
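Sliding window attention, one of the architectural features mentioned above, restricts each token to attending over a fixed-size window of recent positions rather than the full causal prefix, which bounds the cost of long contexts. A minimal sketch of the attention mask (illustrative only, not OpenAI's implementation; the function name and window size are made up):

```python
import numpy as np

def sliding_window_mask(seq_len: int, window: int) -> np.ndarray:
    """Boolean mask: token i may attend to token j iff j is one of the
    last `window` positions up to and including i (causal + windowed)."""
    i = np.arange(seq_len)[:, None]
    j = np.arange(seq_len)[None, :]
    return (j <= i) & (j > i - window)

# With window=3, token 5 sees only positions 3, 4, and 5.
mask = sliding_window_mask(seq_len=8, window=3)
```

In practice, models like gpt-oss reportedly alternate windowed and full-attention layers so that information can still propagate across the whole context.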
The Quiet Rise of Claude Code vs Codex
mistral-small-3.2 qwen3-0.6b llama-3-1b gemini-2.5-flash-lite gemini-app magenta-real-time apple-3b-on-device mistral-ai hugging-face google-deepmind apple artificial-analysis kuaishou instruction-following function-calling model-implementation memory-efficiency 2-bit-quantization music-generation video-models benchmarking api reach_vb guillaumelample qtnx_ shxf0072 rasbt demishassabis artificialanlys osanseviero
Claude Code is gaining mass adoption, inspiring derivative projects like OpenCode and ccusage, with discussions ongoing in AI communities. Mistral AI released Mistral Small 3.2, a 24B parameter model update improving instruction following and function calling, available on Hugging Face and supported by vLLM. Sebastian Raschka implemented Qwen3 0.6B from scratch, noting its deeper architecture and memory efficiency compared to Llama 3 1B. Google DeepMind showcased Gemini 2.5 Flash-Lite's UI code generation from visual context and added video upload support in the Gemini App. Apple's new 3B parameter on-device foundation model was benchmarked, showing slower speed but efficient memory use via 2-bit quantization, suitable for background tasks. Google DeepMind also released Magenta Real-time, an 800M parameter music generation model licensed under Apache 2.0, marking Google's 1000th model on Hugging Face. Kuaishou launched KLING 2.1, a new video model accessible via API.
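To put the 2-bit quantization figure in perspective, a back-of-envelope estimate of raw weight storage for a ~3B-parameter model (the function and exact parameter count are illustrative; real quantized formats add per-block scale metadata, so actual sizes run somewhat larger):

```python
def model_storage_gb(n_params: float, bits_per_weight: int) -> float:
    # Raw weight bytes only; ignores scales, zero points, and activations.
    return n_params * bits_per_weight / 8 / 1e9

params = 3e9                          # assumed ~3B parameters
fp16 = model_storage_gb(params, 16)   # 6.0 GB at 16-bit
q2 = model_storage_gb(params, 2)      # 0.75 GB at 2-bit
```

An 8x reduction like this is what makes a 3B model plausible as an always-resident background service on a phone, at the cost of the slower, lower-fidelity inference the benchmarks note.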