Person: "liang-wenfeng"
Project Stargate: $500b datacenter (1.7% of US GDP) and Gemini 2 Flash Thinking 2
gemini-2.0-flash deepseek-r1 qwen-32b openai softbank oracle arm microsoft nvidia huggingface deepseek-ai long-context quantization code-interpretation model-distillation open-source agi-research model-performance memory-optimization noam-shazeer liang-wenfeng
Project Stargate, a US "AI Manhattan project" led by OpenAI and SoftBank and supported by Oracle, Arm, Microsoft, and NVIDIA, was announced as a $500B datacenter effort, at a scale compared to the original Manhattan Project, which cost about $35B in inflation-adjusted dollars. Although Microsoft's role as exclusive compute partner has been reduced, the project is considered serious, if not immediately practical. Meanwhile, Noam Shazeer revealed a second major update to Gemini 2.0 Flash Thinking, which brings a 1M-token long context that is usable immediately. AI Studio also introduced a new code interpreter feature. On Reddit, a DeepSeek R1 distillation built on Qwen 32B was released for free on HuggingChat, sparking discussions of self-hosting, performance issues, and quantization techniques. DeepSeek's CEO Liang Wenfeng highlighted the company's focus on fundamental AGI research, its efficient MLA architecture, and its commitment to open-source development despite export restrictions, positioning DeepSeek as a potential alternative to closed-source AI trends.
Mini, Nemo, Turbo, Lite - Smol models go brrr (GPT4o-mini version)
gpt-4o-mini deepseek-v2-0628 mistral-nemo llama-8b openai deepseek-ai mistral-ai nvidia meta-ai-fair hugging-face langchain keras cost-efficiency context-windows open-source benchmarking neural-networks model-optimization text-generation fine-tuning developer-tools gpu-support parallelization cuda-integration multilinguality long-context article-generation liang-wenfeng
OpenAI launched GPT-4o mini, a cost-efficient small model priced at $0.15 per million input tokens and $0.60 per million output tokens, which aims to replace GPT-3.5 Turbo with enhanced intelligence but has some performance limitations. DeepSeek open-sourced DeepSeek-V2-0628, which topped the LMSYS Chatbot Arena Leaderboard, and emphasized its commitment to contributing to the AI ecosystem. Mistral AI and NVIDIA released Mistral NeMo, a 12B-parameter multilingual model with a record 128k-token context window under an Apache 2.0 license, sparking debates about the accuracy of its benchmarks against models like Meta's Llama 8B. Research breakthroughs include the TextGrad framework, which optimizes compound AI systems by differentiating through textual feedback, and the STORM system, which improves article writing by 25% by simulating diverse perspectives and addressing source bias. Developer tooling trends include LangChain's evolving context-aware reasoning applications and the Modular ecosystem's new official GPU support, along with discussions of Mojo and Keras 3.0 integration.