All tags
Company: "mila"
not much happened today
gpt-5-pro gemini-2.5 vllm deepseek-v3.1 openai google-deepmind microsoft epoch-ai-research togethercompute nvidia mila reasoning reinforcement-learning inference speculative-decoding sparse-attention kv-cache-management throughput-optimization compute-efficiency tokenization epochairesearch yitayml _philschmid jiqizhixin cvenhoff00 neelnanda5 lateinteraction mgoin_ blackhc teortaxestex
FrontierMath Tier 4 results show GPT-5 Pro narrowly outperforming Gemini 2.5 Deep Think in reasoning accuracy, with concerns about problem leakage clarified by Epoch AI Research. Mila and Microsoft propose Markovian Thinking to improve reasoning efficiency, enabling models to reason over 24K tokens with less compute. New research suggests base models inherently contain reasoning mechanisms, with "thinking models" learning to invoke them effectively. In systems, NVIDIA Blackwell combined with vLLM wins InferenceMAX with significant throughput gains, while Together AI's ATLAS adaptive speculative decoding achieves 4× speed improvements and reduces RL training time by over 60%. SparseServe introduces dynamic sparse attention with KV tiering, drastically improving throughput and latency in GPU memory management.
a quiet weekend
o1 datagemma aloha demostart firefly-ai-video-model pixtral-12b gamegen-o openai google-deepmind adobe mistral-ai tencent supermaven 11x cohere anthropic latent-space-university stanford microsoft mila notre-dame reinforcement-learning chain-of-thought reasoning robotics diffusion-models multimodality video-generation model-training reflection-tuning mathematical-reasoning model-benchmarking fine-tuning george-hotz terence-tao adcock_brett rohanpaul_ai bindureddy fchollet philschmid
OpenAI released the new o1 model, leveraging reinforcement learning and chain-of-thought prompting to excel in reasoning benchmarks, achieving an IQ-like score of 120. Google DeepMind introduced DataGemma to reduce hallucinations by connecting LLMs with real-world data, and unveiled ALOHA and DemoStart for robot dexterity using diffusion methods. Adobe previewed its Firefly AI Video Model with text-to-video and generative extend features. Mistral launched the multimodal Pixtral 12B model, and Tencent presented the GameGen-O open-world video game generation model. Several research papers from Stanford, OpenAI, Microsoft, Mila, and Notre Dame focus on advanced reasoning, self-verification, and reflection tuning techniques. Experts like Terence Tao and George Hotz have shared mixed but optimistic views on o1's capabilities. Seed funding rounds include Supermaven ($12M) and 11x ($24M).