OpenRouter's State of AI - An Empirical 100 Trillion Token Study
grok-code-fast gemini-3 gemini-3-deep-think gpt-5.1-codex-max openrouter deepseek anthropic google google-deepmind reasoning coding tokenization long-context model-architecture benchmarking agentic-ai prompt-engineering quocleix noamshazeer mirrokni
OpenRouter released its first State of AI survey, drawing on roughly 7 trillion tokens proxied weekly and highlighting that roleplay accounts for about 52% of usage. DeepSeek's share of open-model traffic has declined sharply as coding-model usage has risen, and reasoning-model token usage surged from 0% to over 50%. Grok Code Fast sees heavy usage, while Anthropic leads in tool calling and coding requests with roughly a 60% share. Input tokens quadrupled and output tokens tripled this year, driven mainly by programming use cases, which dominate both spending and volume. Separately, Google launched Gemini 3 Deep Think, which features parallel thinking and scores 45.1% on ARC-AGI-2, and previewed Titans, a long-context neural memory architecture that scales beyond 2 million tokens. These advances were announced by Google DeepMind and Google AI on Twitter.
Claude Haiku 4.5
claude-3.5-sonnet claude-3-haiku claude-3-haiku-4.5 gpt-5 gpt-4.1 gemma-2.5 gemma o3 anthropic google yale artificial-analysis shanghai-ai-lab model-performance fine-tuning reasoning agent-evaluation memory-optimization model-efficiency open-models cost-efficiency foundation-models agentic-workflows swyx sundarpichai osanseviero clementdelangue deredleritt3r azizishekoofeh vikhyatk mirrokni pdrmnvd akhaliq sayashk gne
Anthropic released Claude Haiku 4.5, a model that is more than 2x faster and 3x cheaper than Claude Sonnet 4.5, significantly improving iteration speed and user experience. Pricing comparisons highlight Haiku 4.5's competitive cost against models such as GPT-5 and GLM-4.6. Google and Yale introduced Cell2Sentence-Scale 27B, an open-weight Gemma-based model that generated a novel, experimentally validated cancer hypothesis; its weights are open-sourced for community use. Early evaluations show GPT-5 and o3 outperforming GPT-4.1 on agentic reasoning tasks while balancing cost and performance. Agent-evaluation challenges and memory-based learning advances were also discussed, with contributions from Shanghai AI Lab and others. Key quotes included "Haiku 4.5 materially improves iteration speed and UX" and "Cell2Sentence-Scale yielded validated cancer hypothesis."
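To make the "3x cheaper" claim concrete, here is a minimal sketch of how such a pricing comparison works out for a sample workload. The per-million-token prices and the workload sizes below are illustrative assumptions for this example, not official figures; consult the providers' pricing pages for current rates.

```python
# Illustrative per-1M-token prices in USD (input, output).
# ASSUMED values for demonstration only -- not official pricing.
PRICES = {
    "claude-haiku-4.5": (1.00, 5.00),
    "claude-sonnet-4.5": (3.00, 15.00),
}

def workload_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Total USD cost of a workload at the assumed prices above."""
    in_price, out_price = PRICES[model]
    return (input_tokens * in_price + output_tokens * out_price) / 1_000_000

# Hypothetical workload: 2M input tokens, 500k output tokens.
haiku_cost = workload_cost("claude-haiku-4.5", 2_000_000, 500_000)
sonnet_cost = workload_cost("claude-sonnet-4.5", 2_000_000, 500_000)
print(f"Haiku 4.5: ${haiku_cost:.2f}, Sonnet 4.5: ${sonnet_cost:.2f}, "
      f"ratio: {sonnet_cost / haiku_cost:.1f}x")
```

Under these assumed prices, the workload costs scale linearly with token counts, so the cost ratio between the two models stays fixed regardless of workload size.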