Frozen AI News archive

Claude Haiku 4.5

**Anthropic** released **Claude Haiku 4.5**, a model that is over 2x faster and 3x cheaper than **Claude Sonnet 4.5**, improving iteration speed and user experience significantly. Pricing comparisons highlight Haiku 4.5's competitive cost against models like **GPT-5** and **GLM-4.6**. **Google** and **Yale** introduced the open-weight **Cell2Sentence-Scale 27B (Gemma)** model, which generated a novel, experimentally validated cancer hypothesis, with open-sourced weights for community use. Early evaluations show **GPT-5** and **o3** models outperform **GPT-4.1** in agentic reasoning tasks, balancing cost and performance. Agent evaluation challenges and memory-based learning advances were also discussed, with contributions from Shanghai AI Lab and others. *"Haiku 4.5 materially improves iteration speed and UX,"* and *"Cell2Sentence-Scale yielded validated cancer hypothesis"* were key highlights.

Canonical issue URL

yay fast Claude

AI News for 10/14/2025-10/15/2025. We checked 12 subreddits, 544 Twitters and 23 Discords (197 channels, and 6317 messages) for you. Estimated reading time saved (at 200wpm): 479 minutes. Our new website is now up with full metadata search and beautiful vibe coded presentation of all past issues. See https://news.smol.ai/ for the full news breakdowns and give us feedback on @smol_ai!

There was a time when entire model families launched in one go, but now different sizes ship at different times, presumably as soon as they are ready and without regard to the storytelling needs of the daily AI news newsletter writer. Anyway, Anthropic has followed up Claude Sonnet 4.5 with Haiku 4.5 (system card here), entirely skipping Haiku 4.0 and 4.1. It's meant to be almost as good as Sonnet 4.5, but more than 2x as fast and 3x cheaper.

For those keeping track, here's the pricing vs peer models (per million input/output tokens):

| Model | Input | Output |
|---|---|---|
| Haiku 3 | $0.25/M | $1.25/M |
| Haiku 4.5 | $1.00/M | $5.00/M |
| GPT-5 | $1.25/M | $10.00/M |
| GPT-5-mini | $0.25/M | $2.00/M |
| GPT-5-nano | $0.05/M | $0.40/M |
| GLM-4.6 | $0.60/M | $2.20/M |
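As a back-of-envelope check on those list prices, here's a minimal Python helper; the model keys and the example token counts are illustrative, not official identifiers:

```python
# Per-million-token list prices from the comparison above: (input $/M, output $/M).
PRICES = {
    "haiku-3": (0.25, 1.25),
    "haiku-4.5": (1.00, 5.00),
    "gpt-5": (1.25, 10.00),
    "gpt-5-mini": (0.25, 2.00),
    "gpt-5-nano": (0.05, 0.40),
    "glm-4.6": (0.60, 2.20),
}

def cost_usd(model: str, input_tokens: int, output_tokens: int) -> float:
    """Dollar cost of a request at list prices (no caching or batch discounts)."""
    in_price, out_price = PRICES[model]
    return (input_tokens * in_price + output_tokens * out_price) / 1_000_000

# e.g. a hypothetical agentic run with 100k input and 10k output tokens:
print(f"{cost_usd('haiku-4.5', 100_000, 10_000):.3f}")  # 0.150
```

At these rates, the same 100k-in / 10k-out run would cost $0.225 on GPT-5 versus $0.150 on Haiku 4.5, which is where the "competitive cost" framing comes from.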

AI Twitter Recap

AI for Science: Open-weight C2S-Scale 27B (Gemma) yields validated cancer hypothesis

Small Models, Speed, and Agentic Cost-Performance

Agents: evaluations, memory, and orchestration

Training, optimization, and infrastructure notes

Product and multimodal releases

Top tweets (by engagement)


AI Reddit Recap

/r/LocalLlama + /r/localLLM Recap

1. Apple M5 AI accelerator launch + DGX Spark hands-on benchmarks

Less Technical AI Subreddit Recap

/r/Singularity, /r/Oobabooga, /r/MachineLearning, /r/OpenAI, /r/ClaudeAI, /r/StableDiffusion, /r/ChatGPT, /r/ChatGPTCoding, /r/aivideo

1. Claude Haiku 4.5 launch and Google model demos (Gemini 3.0 Pro Nintendo sim, Veo 3.1)

2. OpenAI “Adult Mode” rollout: memes and hypocrisy callouts

3. AI social adoption vs IP rights: companions normalization and Japan’s anime/manga training pushback


AI Discord Recap

A summary of Summaries of Summaries by gpt-5

1. Claude Haiku 4.5 Model Rollout & Benchmarks

2. DGX Spark: Hype vs Throughput

3. Qwen3-VL Compact Models & Finetuning

4. Agentic Reasoning Research Heats Up

5. AI Infra & APIs Scale Up


Discord: High level Discord summaries

LMArena Discord


Perplexity AI Discord


OpenAI Discord


Unsloth AI (Daniel Han) Discord


OpenRouter Discord


Cursor Community Discord


HuggingFace Discord


Nous Research AI Discord


Latent Space Discord


LM Studio Discord


GPU MODE Discord


DSPy Discord


Yannick Kilcher Discord


Moonshot AI (Kimi K-2) Discord


Eleuther Discord


tinygrad (George Hotz) Discord


Modular (Mojo 🔥) Discord


aider (Paul Gauthier) Discord


MCP Contributors (Official) Discord


Windsurf Discord


Manus.im Discord Discord


MLOps @Chipro Discord


The LLM Agents (Berkeley MOOC) Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.




Discord: Detailed by-Channel summaries and links

LMArena ▷ #general (1409 messages🔥🔥🔥):

LM Arena Bugs, Video Generation Issues, Gemini 3.0 Pro, Veo 3.1, Automation for A/B testing


LMArena ▷ #announcements (1 messages):

Claude Haiku, Video Arena Models


Perplexity AI ▷ #general (1251 messages🔥🔥🔥):

Perplexity Pro Benefits, Comet Browser Issues, Grok 4 Reasoning Toggle, ChatGPT vs. Perplexity, Claude Haiku 4.5 model released


Perplexity AI ▷ #sharing (3 messages):

NVIDIA DGX Spark, AI-generated voice fraud, FCC regulations on AI robocalls, call-screening


Perplexity AI ▷ #pplx-api (4 messages):

Spaces, Student Pro, API Key, n8n Node, Authorization Error


OpenAI ▷ #annnouncements (2 messages):

Expert Council on Well-Being and AI, ChatGPT Saved Memories Update


OpenAI ▷ #ai-discussions (822 messages🔥🔥🔥):

Permanent Memory LLMs, Vision Robot Capabilities, AI Event in Vegas, GPT6 Release with Memory, Stair Climbing Roomba


OpenAI ▷ #gpt-4-discussions (5 messages):

GPT-5 for STEM, Custom GPT Memory Issues


OpenAI ▷ #prompt-engineering (61 messages🔥🔥):

Dyscalculia and AI error checking, Prompt engineering learning, Reporting Messages, Harm fantasies


OpenAI ▷ #api-discussions (61 messages🔥🔥):

Image-related issues with models, Crossword solving limitations, Image generation struggles, Prompt engineering tips, Reporting messages on Discord


Unsloth AI (Daniel Han) ▷ #general (346 messages🔥🔥):

Qwen3-VL finetuning, Civitai content removal, New LLM architecture for context issues, QAT docs, Hugging Face rate limits


Unsloth AI (Daniel Han) ▷ #introduce-yourself (3 messages):

AI Project Development, NLP Tasks, Model Deployment, AI Agent Development


Unsloth AI (Daniel Han) ▷ #off-topic (213 messages🔥🔥):

Ryzen 10 series RAM, Unix OS in Rust, LLM OS, NVlabs' QeRL, Apple M5 Chip


Unsloth AI (Daniel Han) ▷ #help (17 messages🔥):

MOE Layers, T4 vs B200 speed, Qwen 2.5 Coder 14b for autocomplete, Training and hosting Qwen3 30B


Unsloth AI (Daniel Han) ▷ #showcase (8 messages🔥):

Anthropic, Copyright, Reasoning traces


Unsloth AI (Daniel Han) ▷ #research (7 messages):

arxiv.org/abs/2506.10943, Company Hack Week event, unsloths fastinference blog


OpenRouter ▷ #announcements (1 messages):

Claude Haiku 4.5, SWE-bench Verified, Sonnet 4.5, Frontier-class reasoning at scale


OpenRouter ▷ #general (401 messages🔥🔥):

Ling-1T issues and potential disabling, Caching in OpenRouter chats, CYOA games with AI, FP4 quality concerns, Claude Haiku 4.5 release


OpenRouter ▷ #new-models (1 messages):

Readybot.io: OpenRouter - New Models


OpenRouter ▷ #discussion (14 messages🔥):

OpenRouter 3.0, Anthropic payments, Sambanova Status, Google Deepmind praise


Cursor Community ▷ #general (372 messages🔥🔥):

Cursor outage, Plan Downgrades and pricing mishaps, Windsurf, Model Ensemble, Open Router


Cursor Community ▷ #background-agents (5 messages):

Background Agents vs Normal Agents, Async work with Background Agents, Customizing dev workflows with BAs, Project Management tool summons BAs


HuggingFace ▷ #general (215 messages🔥🔥):

Open Source LLM preference, Deepseek vs Qwen, Civitai content removal, Discord bot debugging, AMD Radeon GPUs with Stable Diffusion


HuggingFace ▷ #today-im-learning (1 messages):

GenAI techniques, Workflows for learning


HuggingFace ▷ #i-made-this (23 messages🔥):

MIT Licensed Dataset Usage, Phone Addiction Data Analysis, Nanochat Model Training, Discord conflict


HuggingFace ▷ #computer-vision (4 messages):

Object Identification, Contouring, Pixel Intensity Removal


HuggingFace ▷ #NLP (1 messages):

jazzco0151: https://discord.com/api/oauth2/token


HuggingFace ▷ #gradio-announcements (1 messages):

Agents & MCP Hackathon, Winter 2025 Hackathon


HuggingFace ▷ #smol-course (2 messages):

PEFT Configuration Issues, trackio Dependency Problem


HuggingFace ▷ #agents-course (3 messages):

Agent Course, Study Group, Course Progress


Nous Research AI ▷ #general (159 messages🔥🔥):

Long Context Datasets, Strix Halo vs DGX, Threadripper for Local Inference, HP Z2 Mini G1a, Meta Early Experience


Nous Research AI ▷ #ask-about-llms (1 messages):

Hermes-4-8-2, Psyche Network Runs, Model Card Discrepancy


Nous Research AI ▷ #research-papers (2 messages):

Agent Learning, Reinforcement Learning Challenges, META's Early Experience, Implicit world modeling, Self-reflection


Latent Space ▷ #ai-general-chat (126 messages🔥🔥):

OpenAI Codex Slow Inference, Qwen3-VL Models, Nvidia DGX Spark vs M4 Max MacBook Pro, OpenAI gpt-5-search-api, Karina Nguyen's AI Drop


Latent Space ▷ #private-agents (8 messages🔥):

AI Freelancing for Teenagers, NPU vs GPU for MoE Inference, NVIDIA DGX Spark as a Devkit


LM Studio ▷ #general (68 messages🔥🔥):

Qwen3 VL issue, MCP Servers in LM Studio, Structure Output Discussion, AMD 9070XT vs Nvidia 5070, System prompt token limits


LM Studio ▷ #hardware-discussion (63 messages🔥🔥):

MacBook Pro Battery Life, Nvidia Spark, Windows 11 vs Linux, LM Studio and normies, Wikipedia edits


GPU MODE ▷ #general (11 messages🔥):

DSA efficiency vs NSA paper, GPU programming trend, vLLM and SGLang batch invariance tests, Category theory in ML, Profiling rented GPUs with vLLM


GPU MODE ▷ #triton (5 messages):

Triton Algorithm Replacement, Triton IR Design, Triton's Layout Algebra


GPU MODE ▷ #cuda (5 messages):

PTX and SASS code for cluster sync, Tensor descriptor's L2Promotion argument, Async pipelined persistent cuda kernels, NCU timeline view


GPU MODE ▷ #announcements (1 messages):

torch.compile, Kernel programming DSL, Helion, compiler cache, diagrams for deep learning


GPU MODE ▷ #beginner (5 messages):

Pearson Correlation Kernel, Floating Point Precision, Online Course PPC Assignment


GPU MODE ▷ #irl-meetup (4 messages):

Multi-node kernel hackathon, NYC Meetup, Sweden, London


GPU MODE ▷ #triton-puzzles (1 messages):

codingmasterp: Do flashattention


GPU MODE ▷ #intel (8 messages🔥):

Crescent Island, LPDDR5X memory choice, Rubin CPX competition, Intel's Number Format Support, CXL-capability


GPU MODE ▷ #self-promotion (1 messages):

AlphaFold 3, MegaFold, Triton Optimizations


GPU MODE ▷ #🍿 (1 messages):

Agent Hacking, Kernelbench v0.1, Sakana AI


GPU MODE ▷ #thunderkittens (1 messages):

ROCm Support Timeline


GPU MODE ▷ #submissions (19 messages🔥):

amd-gemm-rs Leaderboard Updates, amd-ag-gemm Leaderboard Updates, amd-all2all Leaderboard Updates, MI300x8 Performance


GPU MODE ▷ #amd-competition (12 messages🔥):

MI300x Kernel, Distributed Comms, AMD GPUs, Competition Stats


GPU MODE ▷ #singularity-systems (3 messages):

infra changes, nanochat training, lambda H100 clusters


GPU MODE ▷ #general (3 messages):

Carl Bot, Reference Kernels


GPU MODE ▷ #multi-gpu (17 messages🔥):

Multi-GPU systems in HPC, Data movement research in multi-GPU HPC, RTX 6000 Pro and Blackwell architecture


GPU MODE ▷ #low-bit-training (1 messages):

kitsu5116: https://arxiv.org/pdf/2510.08757


GPU MODE ▷ #irl-accel-hackathon (1 messages):

Comet-style MoE kernels, fine grained overlapping, comms and compute


GPU MODE ▷ #llmq (1 messages):

llmq, unit tests


GPU MODE ▷ #helion (1 messages):

Helion, GPU Mode Talk


DSPy ▷ #general (93 messages🔥🔥):

Recursive Language Models (RLMs), Claude code vs RLMs, Tiny Recursive Models (TRMs), OpenAI's memory operations and DSPy, Sub-agents in DSPy


Yannick Kilcher ▷ #general (21 messages🔥):

Codex addon for vscode, GitHub pull requests with git hub agents, win+h dictation


Yannick Kilcher ▷ #paper-discussion (18 messages🔥):

Coding AI Completions, Tooling Affordance, DIAYN Paper Discussion, Mutual Information in RL


Yannick Kilcher ▷ #ml-news (3 messages):

Gemma Models, AI in Cancer Therapy


Moonshot AI (Kimi K-2) ▷ #general-chat (39 messages🔥):

Trickle vibe coding website, Aspen's Bitcoin Leverage Story, Gemini 2.5 is too old, Kimi K2 Update, Thinking vs Non-Thinking Models


Eleuther ▷ #general (7 messages):

Compute Funding for Research Group, LLM Situational Awareness Benchmarks


Eleuther ▷ #research (17 messages🔥):

SEAL optimizer, AdamW optimizer, AI/ML research, tensor logic, MAE training


Eleuther ▷ #interpretability-general (1 messages):

Devinterp.com, Neural Network Development


tinygrad (George Hotz) ▷ #learn-tinygrad (14 messages🔥):

Freezing parts of a matrix for training, Implementing LeNet-5 in tinygrad, Debugging optimizer issues, Nested TinyJit calls


Modular (Mojo 🔥) ▷ #general (7 messages):

ARM Linux Support, DGX Spark Compatibility, CUDA 13 Update, Jetson Thor Support, Mojo and MAX on ARM


Modular (Mojo 🔥) ▷ #mojo (5 messages):

Querying type in Mojo, get_type_name(), __type_of(a)


aider (Paul Gauthier) ▷ #general (2 messages):

OpenCode + GLM 4.6, aider.chat with Sonnet 4.5, OpenRouter/x-ai/grok-code-fast-1 integration


aider (Paul Gauthier) ▷ #questions-and-tips (2 messages):

Qwen2.5-Coder:7B Metadata, Ollama Integration Issues, Model Output Problems, Troubleshooting Model Errors


aider (Paul Gauthier) ▷ #links (2 messages):

Chinese provider with free tokens, Claude 4.5 available


MCP Contributors (Official) ▷ #general (4 messages):

Model Context Protocol, MCP Feature Support Matrix, SEP Document, Hierarchical Groups


Windsurf ▷ #announcements (2 messages):

Windsurf Patch 1.12.18 Release, Claude Haiku 4.5 Availability


Manus.im Discord ▷ #general (2 messages):

AI Tool Innovation, Subscription Model for AI Agents, Service Expectations in AI Communities, Project Mistakes and Learnings


MLOps @Chipro ▷ #events (1 messages):

Domain-Centric GenAI, Autonomous Data Products, RAG Systems, Domain-Specific Models