All tags
Person: "jack_w_rae"
Google I/O: new Gemini native voice, Flash, DeepThink, AI Mode (DeepSearch+Mariner+Astra)
gemini-2.5-pro gemini-2.5 google google-deepmind ai-assistants reasoning generative-ai developer-tools ai-integration model-optimization ai-application model-updates ai-deployment model-performance demishassabis philschmid jack_w_rae
Google I/O 2024 showcased significant advancements with Gemini 2.5 Pro and Deep Think reasoning mode from google-deepmind, emphasizing AI-driven transformations and developer opportunities. GeminiApp aims to become a universal AI assistant on the path to AGI, with new features like AI Mode in Google Search expanding generative AI access. The event included multiple keynotes and updates on over a dozen models and 20+ AI products, highlighting Google's leadership in AI innovation. Influential voices like demishassabis and philschmid provided insights and recaps, while the launch of Jules as a competitor to Codex/Devin was noted.
not much happened today
gpt-2 r1 gemma-3 gemmacoder3-12b qwen2.5-omni openai deepseek berkeley alibaba togethercompute nvidia azure runway langchain bmw amazon open-source function-calling benchmarking code-reasoning multimodality inference-speed image-generation voice-generation animation robotics realtime-transcription webrtc sama clémentdelangue lioronai scaling01 cognitivecompai osanseviero jack_w_rae ben_burtenshaw theturingpost vipulved kevinweil tomlikesrobots adcock_brett juberti
OpenAI plans to release its first open-weight language model since GPT-2 in the coming months, signaling a move towards more open AI development. DeepSeek launched its open-source R1 model earlier this year, challenging perceptions of China's AI progress. Gemma 3 has achieved function calling capabilities and ranks on the Berkeley Function-Calling Leaderboard, while GemmaCoder3-12b improves code reasoning performance on LiveCodeBench. Alibaba_Qwen's Qwen2.5-Omni introduces a novel Thinker-Talker system and TMRoPE for multimodal input understanding. The TogetherCompute team achieved 140 TPS on a 671B parameter model, outperforming Azure and DeepSeek API on Nvidia GPUs. OpenAI also expanded ChatGPT features with image generation for all free users and a new voice release. Runway Gen-4 enhances animation for miniature dioramas, and LangChain launched a chat-based generative UI agent. Commercial deployment of Figure 03 humanoid robots at BMW highlights advances in autonomy and manufacturing scaling. New tools include OpenAI's realtime transcription API with WebRTC support and Amazon's Nova Act AI browser agent.