Topic: "language-model-reasoning"
s1: Simple test-time scaling (and Kyutai Hibiki)
qwen-2.5-32b gemini-2.0-flash smollm2 granite-vision-3.1-2b google-deepmind qwen gemini hugging-face ibm deepseek reasoning fine-tuning scaling-laws open-source-models data-centric-training vision multilingual-models language-model-reasoning niklas-muennighoff
"Wait" is all you need introduces a novel reasoning model finetuned from Qwen 2.5 32B using just 1000 questions with reasoning traces distilled from Gemini 2.0 Flash Thinking, enabling controllable test-time compute by appending "Wait" to extend reasoning. Lead author Niklas Muennighoff, known for work on Bloom, StarCoder, and BIG-bench, highlights this method's efficiency and its reproduction of the famous o1 scaling chart. Additionally, Kyutai Moshi's Hibiki project demonstrates impressive offline French-English live translation on iPhone. Recent AI model releases include DeepSeek R1 and R3 open source models, potentially marking a major open-source milestone, Hugging Face's SmolLM2 emphasizing data-centric training for small LMs, and IBM's Granite-Vision-3.1-2B, a small vision-language model with strong performance. Key research papers spotlight LIMO for minimal demonstration reasoning achieving high accuracy on AIME and MATH benchmarks, and Token-Assisted Reasoning mixing latent and text tokens to improve language model reasoning.