Topic: "high-performance-computing"
How To Scale Your Model, by DeepMind
qwen-0.5 google-deepmind deepseek hugging-face transformers inference high-performance-computing robotics sim2real mixture-of-experts reinforcement-learning bias-mitigation rust text-generation open-source omarsar0 drjimfan tairanhe99 guanyashi lioronai _philschmid awnihannun clementdelangue
Researchers at Google DeepMind released a comprehensive "little textbook" titled "How To Scale Your Model," covering modern Transformer architectures, inference optimizations beyond O(N^2) attention, and high-performance-computing concepts such as roofline analysis. The resource includes practice problems and live comment threads. On AI Twitter, key updates include: the open-sourced humanoid robotics model ASAP, which reproduces agile motions inspired by athletes such as Cristiano Ronaldo, LeBron James, and Kobe Bryant; a new Mixture-of-Agents paper proposing the Self-MoA method for improved aggregation of LLM outputs; a demonstration of training reasoning LLMs on Qwen 0.5 with DeepSeek's GRPO algorithm; findings on bias in LLMs used as judges, highlighting the need for multiple independent evaluations; and the release of mlx-rs, a Rust library for machine learning whose examples include Mistral text generation. Additionally, Hugging Face launched an AI app store featuring over 400,000 apps, with 2,000 new apps added daily and 2.5 million weekly visits, enabling AI-powered app search and categorization.
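The roofline concept mentioned above can be illustrated with a short calculation: a kernel's attainable performance is capped either by peak compute or by memory bandwidth times its arithmetic intensity. A minimal sketch (the peak-FLOP/s and bandwidth figures below are made-up accelerator numbers for illustration, not any specific chip):

```python
def matmul_intensity(m, k, n, bytes_per_el=2):
    """Arithmetic intensity (FLOPs per byte) of an (m,k) @ (k,n) matmul,
    assuming each operand and the result move through memory exactly once."""
    flops = 2 * m * k * n  # one multiply + one add per output term
    bytes_moved = bytes_per_el * (m * k + k * n + m * n)
    return flops / bytes_moved

def attainable_flops(intensity, peak_flops, mem_bw):
    """Roofline model: performance is the lesser of the compute roof
    and the memory roof (bandwidth * intensity)."""
    return min(peak_flops, mem_bw * intensity)

# Hypothetical accelerator: 200 TFLOP/s peak, 800 GB/s memory bandwidth.
PEAK, BW = 200e12, 800e9
ridge = PEAK / BW  # 250 FLOPs/byte; kernels below this are memory-bound

small = matmul_intensity(64, 64, 64)        # ~21 FLOPs/byte -> memory-bound
large = matmul_intensity(4096, 4096, 4096)  # ~1365 FLOPs/byte -> compute-bound
print(small < ridge, large > ridge)         # True True
```

The ridge point (peak FLOP/s divided by bandwidth) is why batching and larger matmul shapes matter so much on modern accelerators: below it, faster compute units sit idle waiting on memory.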
Gemini Ultra is out, to mixed reviews
gemini-ultra gemini-advanced solar-10.7b openhermes-2.5-mistral-7b subformer billm google openai mistral-ai hugging-face multi-gpu-support training-data-contamination model-merging model-alignment listwise-preference-optimization high-performance-computing parameter-sharing post-training-quantization dataset-viewer gpu-scheduling fine-tuning vram-optimization
Google released Gemini Ultra as the paid "Gemini Advanced with Ultra 1.0" tier, retiring the Bard brand in the process. Reviews called it "slightly faster/better than ChatGPT" but noted gaps in reasoning. The Steam Deck was highlighted as a surprisingly capable AI workstation, able to run models such as Solar 10.7B. Discussions in AI communities covered multi-GPU support for the open-source Unsloth library, training-data contamination from OpenAI outputs, ethical concerns over model merging, and new alignment techniques such as Listwise Preference Optimization (LiPO). The Mojo programming language was praised for high-performance computing. In research, the Subformer model combines sandwich-style parameter sharing with SAFE for efficiency, and BiLLM introduced 1-bit post-training quantization to cut resource use. The OpenHermes dataset viewer tool was launched, and GPU scheduling with Slurm was discussed. Fine-tuning challenges for models like OpenHermes-2.5-Mistral-7B, and their VRAM requirements, were also topics of interest.
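The VRAM requirements mentioned above are usually estimated with a bytes-per-parameter rule of thumb. A hedged back-of-envelope sketch (the byte counts are the common mixed-precision Adam accounting; it ignores activations, KV cache, and framework overhead, which add substantially on top):

```python
def vram_gb(n_params, bytes_per_param, gb=1e9):
    """Back-of-envelope VRAM for model state alone (no activations/KV cache)."""
    return n_params * bytes_per_param / gb

# Common accounting for full fine-tuning with mixed-precision Adam:
# bf16 weights (2) + bf16 grads (2) + fp32 master weights (4)
# + fp32 Adam first/second moments (4 + 4) = 16 bytes per parameter.
INFER_BF16 = 2
FULL_FINETUNE_ADAM = 16

n = 7.24e9  # roughly the parameter count of a Mistral-7B-class model
print(round(vram_gb(n, INFER_BF16)))          # ~14 GB just to hold bf16 weights
print(round(vram_gb(n, FULL_FINETUNE_ADAM)))  # ~116 GB of state for full fine-tuning
```

The 8x gap between the two figures is why parameter-efficient methods (LoRA, quantized training) dominate consumer-GPU fine-tuning discussions: full-precision optimizer state alone exceeds any single consumer card's VRAM.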