All tags
Model: "smollm2"
s1: Simple test-time scaling (and Kyutai Hibiki)
qwen-2.5-32b gemini-2.0-flash smollm2 granite-vision-3.1-2b google-deepmind qwen gemini hugging-face ibm deepseek reasoning fine-tuning scaling-laws open-source-models data-centric-training vision multilingual-models language-model-reasoning niklas-muennighoff
"Wait" is all you need introduces a novel reasoning model finetuned from Qwen 2.5 32B using just 1000 questions with reasoning traces distilled from Gemini 2.0 Flash Thinking, enabling controllable test-time compute by appending "Wait" to extend reasoning. Lead author Niklas Muennighoff, known for work on Bloom, StarCoder, and BIG-bench, highlights this method's efficiency and its reproduction of the famous o1 scaling chart. Additionally, Kyutai Moshi's Hibiki project demonstrates impressive offline French-English live translation on iPhone. Recent AI model releases include DeepSeek R1 and R3 open source models, potentially marking a major open-source milestone, Hugging Face's SmolLM2 emphasizing data-centric training for small LMs, and IBM's Granite-Vision-3.1-2B, a small vision-language model with strong performance. Key research papers spotlight LIMO for minimal demonstration reasoning achieving high accuracy on AIME and MATH benchmarks, and Token-Assisted Reasoning mixing latent and text tokens to improve language model reasoning.
not much happened today
smollm2 llama-3-2 stable-diffusion-3.5 claude-3.5-sonnet gemini openai anthropic google meta-ai-fair suno-ai perplexity-ai on-device-ai model-performance robotics multimodality ai-regulation model-releases natural-language-processing prompt-engineering agentic-ai ai-application model-optimization sam-altman akhaliq arav-srinivas labenz loubnabenallal1 alexalbert fchollet stasbekman svpino rohanpaul_ai hamelhusain
ChatGPT Search was launched by Sam Altman, who called it his favorite feature since ChatGPT's original launch, doubling his usage. Comparisons were made between ChatGPT Search and Perplexity with improvements noted in Perplexity's web navigation. Google introduced a "Grounding" feature in the Gemini API & AI Studio enabling Gemini models to access real-time web information. Despite Gemini's leaderboard performance, developer adoption lags behind OpenAI and Anthropic. SmolLM2, a new small, powerful on-device language model, outperforms Meta's Llama 3.2 1B. A Claude desktop app was released for Mac and Windows. Meta AI announced robotics advancements including Meta Sparsh, Meta Digit 360, and Meta Digit Plexus. Stable Diffusion 3.5 Medium, a 2B parameter model with a permissive license, was released. Insights on AGI development suggest initial inferiority but rapid improvement. Anthropic advocates for early targeted AI regulation. Discussions on ML specialization predict training will concentrate among few companies, while inference becomes commoditized. New AI tools include Suno AI Personas for music creation, PromptQL for natural language querying over data, and Agent S for desktop task automation. Humor was shared about Python environment upgrades.