Topic: "distributed-training"
Thinking Machines' Tinker: LoRA-based LLM fine-tuning API
qwen-235b-a22b sora-2 thinking-machines openai fine-tuning lora model-training api model-optimization distributed-training post-training-methods research-productivity video-generation content-moderation engagement-patterns karpathy lilianweng sama
Thinking Machines, which recently raised $2 billion before shipping any product, has launched its first product, Tinker: a managed-service API for fine-tuning large and mixture-of-experts models like Qwen-235B-A22B with LoRA for cost-efficient training. The Tinker API exposes low-level primitives for post-training methods and is accompanied by the open-source Tinker Cookbook library. Influential AI figures such as Andrej Karpathy and Lilian Weng praised its design for reducing complexity and boosting research productivity. Meanwhile, OpenAI launched Sora 2, a video+audio model integrated into its consumer social app, sparking viral engagement along with concerns over misuse and content moderation. Sam Altman emphasized the product's dual focus on delight and revenue alongside AGI research.
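The post does not include code, but the underlying technique is standard LoRA fine-tuning. A minimal sketch using Hugging Face's transformers and peft libraries, not Tinker's own API; the small Qwen2.5 checkpoint, placeholder data, and hyperparameters here are illustrative assumptions:

```python
# Minimal LoRA fine-tuning sketch (illustrative; not the Tinker API).
# Assumes transformers, peft, and datasets are installed; model id is a stand-in.
from datasets import Dataset
from peft import LoraConfig, get_peft_model
from transformers import (AutoModelForCausalLM, AutoTokenizer, Trainer,
                          TrainingArguments, DataCollatorForLanguageModeling)

model_id = "Qwen/Qwen2.5-7B-Instruct"  # Tinker targets much larger MoE models
tokenizer = AutoTokenizer.from_pretrained(model_id)
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

# LoRA: train small low-rank adapter matrices instead of the full weights.
lora = LoraConfig(r=16, lora_alpha=32, lora_dropout=0.05,
                  target_modules=["q_proj", "v_proj"], task_type="CAUSAL_LM")
model = get_peft_model(model, lora)
model.print_trainable_parameters()  # typically well under 1% of total parameters

texts = ["Example instruction and response pair ..."]  # placeholder data
ds = Dataset.from_dict({"text": texts}).map(
    lambda ex: tokenizer(ex["text"], truncation=True, max_length=512),
    remove_columns=["text"])

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="lora-out", per_device_train_batch_size=1,
                           num_train_epochs=1, learning_rate=1e-4),
    train_dataset=ds,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
model.save_pretrained("lora-out")  # saves only the adapter weights
```

The appeal of a managed service like Tinker is that the distributed-training plumbing behind a loop like this is handled server-side, while the user keeps control of the post-training logic.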
Oracle jumps +36% in a day after winning $300B OpenAI contract
qwen3-235b qwen3-4b qwen2.5-7b vllm oracle openai microsoft moonshot-ai vllm-project thinking-machines-lab meta reinforcement-learning model-weight-updates deterministic-inference benchmarking long-context model-optimization cuda distributed-training kimi_moonshot arankomatsuzaki qgallouedec cHHillee woosuk_k stasbekman
Oracle's OCI division reported stunning +359% growth in revenue bookings, to $455B, and guided to $144B of cloud revenue by 2030, driven significantly by a large deal with OpenAI amid tensions with Microsoft. On AI infrastructure, Moonshot AI released Kimi’s checkpoint-engine, which enables rapid weight updates on 1T-parameter models across thousands of GPUs and integrates with vLLM. RLFactory introduced a plug-and-play reinforcement learning framework for tool-using agents, showing smaller models outperforming larger ones. TRL v0.23 added context parallelism for long-context training. Thinking Machines Lab published research on deterministic inference pipelines, making vLLM deterministic for Qwen models. Meta launched BackendBench, a benchmarking tool for PyTorch backends.
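The determinism result is easiest to appreciate as a reproducibility check: with standard kernels, even greedy decoding can change when the same prompt is batched differently, because floating-point reduction order shifts with batch size. A rough sketch of such a check with vLLM (the model id and prompts are illustrative; this only surfaces the problem, it does not implement the batch-invariant kernels described in the research):

```python
# Check whether greedy decoding is reproducible across batch compositions.
# Illustrative sketch; model id and prompts are assumptions, not from the post.
from vllm import LLM, SamplingParams

llm = LLM(model="Qwen/Qwen3-4B")
greedy = SamplingParams(temperature=0.0, max_tokens=64, seed=0)

prompt = "Explain LoRA fine-tuning in one sentence."
filler = ["Write a haiku about GPUs."] * 7  # changes how the target prompt is batched

solo = llm.generate([prompt], greedy)[0].outputs[0].text
batched = llm.generate([prompt] + filler, greedy)[0].outputs[0].text

# With standard kernels these two completions can differ; batch-invariant
# kernels are what make them match regardless of batch size.
print("identical across batch sizes:", solo == batched)
```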
Prime Intellect's INTELLECT-2 and PRIME-RL advance distributed reinforcement learning
intellect-2 dreamo qwen gemini-2.5-pro dynamic-byte-latent-transformer gen-4-references mistral-medium-3 le-chat-enterprise primeintellect bytedance qwen gemma meta-ai-fair runwayml mistral-ai google distributed-training reinforcement-learning gpu-clusters model-optimization quantization multimodality agentic-ai video-understanding fine-tuning _akhaliq reach_vb osanseviero aiatmeta c_valenzuelab lmarena_ai adcock_brett
Prime Intellect released INTELLECT-2, a decentralized GPU training and reinforcement learning framework with a vision of distributed AI training that overcomes colocation limits. ByteDance launched DreamO, a unified image customization model on Hugging Face. Qwen released models optimized for GPTQ, GGUF, and AWQ quantization. Gemma surpassed 150 million downloads on Hugging Face. Meta released weights for the Dynamic Byte Latent Transformer and the Collaborative Reasoner framework to improve language model efficiency and reasoning. RunwayML introduced Gen-4 References, a near-realtime model requiring no fine-tuning. Mistral AI released Mistral Medium 3, a strong multimodal model, and Le Chat Enterprise, an agentic AI assistant for business. Google updated Gemini 2.5 Pro Preview with video understanding and UI improvements. The pitch of an "Airbnb for spare GPUs from all over the world" captures both the potential and the ongoing challenges of distributed GPU training.
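On the quantization item, running one of the pre-quantized checkpoints is mostly a matter of pointing an inference engine at it. A minimal sketch with vLLM; the AWQ repo id follows Qwen's usual naming convention but should be treated as an assumption:

```python
# Load a pre-quantized (AWQ) Qwen checkpoint for inference.
# The repo id follows Qwen's naming convention but is an assumption here.
from vllm import LLM, SamplingParams

llm = LLM(model="Qwen/Qwen2.5-7B-Instruct-AWQ", quantization="awq")
params = SamplingParams(temperature=0.7, max_tokens=128)

out = llm.generate(["What does AWQ quantization trade off?"], params)
print(out[0].outputs[0].text)
```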