Person: "deredleritt3r"
OpenEvidence, the “ChatGPT for doctors,” raises $250m at $12B valuation, 12x from $1b last Feb
claude claude-3 claude-opus gpt-5.2 gemini-3-flash-high openevidence anthropic podium openai google gemini agentic-ai model-alignment performance-evaluation memory-optimization long-context benchmarking multi-agent-systems reinforcement-learning daniel_nadler amanda_askell eric_rea tom_loverro garry_tan omarsar0 brendanfoody deredleritt3r
OpenEvidence raised $250 million at a $12 billion valuation, a 12x increase from its $1 billion valuation last February, with usage by 40% of U.S. physicians and over $100 million in annual revenue. Anthropic released a new Claude model constitution under CC0 1.0, framing it as a living document for alignment and training. Podium reported over $100 million ARR from 10,000+ AI agents, shifting from software sales to AI operators. Innovations in agent memory and reliability include the Agent Cognitive Compressor (ACC) and multi-agent scientific workflows via MCP-SIM. Agentic benchmarking shows challenges in long-horizon tasks, with models like Gemini 3 Flash High, GPT-5.2 High, and Claude Opus 4.5 High scoring modestly on professional services and legal research benchmarks.
Claude Haiku 4.5
claude-3.5-sonnet claude-3-haiku claude-3-haiku-4.5 gpt-5 gpt-4.1 gemma-2.5 gemma o3 anthropic google yale artificial-analysis shanghai-ai-lab model-performance fine-tuning reasoning agent-evaluation memory-optimization model-efficiency open-models cost-efficiency foundation-models agentic-workflows swyx sundarpichai osanseviero clementdelangue deredleritt3r azizishekoofeh vikhyatk mirrokni pdrmnvd akhaliq sayashk gne
Anthropic released Claude Haiku 4.5, a model that is over 2x faster and 3x cheaper than Claude Sonnet 4.5, significantly improving iteration speed and user experience. Pricing comparisons highlight Haiku 4.5's competitive cost against models like GPT-5 and GLM-4.6. Google and Yale introduced the open-weight Cell2Sentence-Scale 27B (Gemma) model, which generated a novel, experimentally validated cancer hypothesis. Early evaluations show the GPT-5 and o3 models outperforming GPT-4.1 in agentic reasoning tasks while balancing cost and performance. Agent evaluation challenges and memory-based learning advances were also discussed, with contributions from Shanghai AI Lab and others. Key highlights were "Haiku 4.5 materially improves iteration speed and UX" and "Cell2Sentence-Scale yielded validated cancer hypothesis."