All tags
Company: "deutsche-telekom"
not much happened today
trillium gemini-2.5-pro gemini-deepthink google huawei epoch-ai deutsche-telekom nvidia anthropic reka-ai weaviate deepmind energy-efficiency datacenters mcp context-engineering instruction-following embedding-models math-reasoning benchmarking code-execution sundarpichai yuchenj_uw teortaxestex epochairesearch scaling01 _avichawla rekaailabs anthropicai douwekiela omarsar0 nityeshaga goodside iscienceluvr lmthang
Google's Project Suncatcher prototypes scalable ML compute systems in orbit using solar energy with Trillium-generation TPUs surviving radiation, aiming for prototype satellites by 2027. China's 50% electricity subsidies for datacenters may offset chip efficiency gaps, with Huawei planning gigawatt-scale SuperPoDs for DeepSeek by 2027. Epoch launched an open data center tracking hub, and Deutsche Telekom and NVIDIA announced a $1.1B Munich facility with 10k GPUs. In agent stacks, MCP (Model-Compute-Platform) tools gain traction with implementations like LitServe, Claude Desktop, and Reka's MCP server for VS Code. Anthropic emphasizes efficient code execution with MCP. Context engineering shifts focus from prompt writing to model input prioritization, with reports and tools from Weaviate, Anthropic, and practitioners highlighting instruction-following rerankers and embedding approaches. DeepMind's IMO-Bench math reasoning suite shows Gemini DeepThink achieving high scores, with a ProofAutoGrader correlating strongly with human grading. Benchmarks and governance updates include new tasks and eval sharing in lighteval.
Anthropic's $61.5B Series E
gpt-4.5 claude-3.7-sonnet deepseek-r1 anthropic openai deepseek lmsys perplexity-ai deutsche-telekom model-performance benchmarking style-control coding multi-turn funding partnerships workflow lmarena_ai teortaxestex casper_hansen_ omarsar0 aidan_mclau willdepue vikhyatk teknim1 reach_vb _aidan_clark_ cto_junior aravsrinivas
Anthropic raised a $3.5 billion Series E funding round at a $61.5 billion valuation, signaling strong financial backing for the Claude AI model. GPT-4.5 achieved #1 rank across all categories on the LMArena leaderboard, excelling in multi-turn conversations, coding, math, creative writing, and style control. DeepSeek R1 tied with GPT-4.5 for top performance on hard prompts with style control. Discussions highlighted comparisons between GPT-4.5 and Claude 3.7 Sonnet in coding and workflow applications. The importance of the LMSYS benchmark was emphasized, though some questioned the relevance of benchmarks versus user acquisition. Additionally, Perplexity AI partnered with Deutsche Telekom to integrate the Perplexity Assistant into a new AI phone.