Frozen AI News archive

DeepSeek-OCR finds vision models can decode text ~10x more efficiently at ~97% of text-only accuracy, 33M pages/day on 20 nodes (~200k pages/day per A100)

As **ICCV 2025** begins, **DeepSeek** releases **DeepSeek-OCR**, a novel 3B MoE vision-language model that compresses long text as visual context with high accuracy and efficiency, challenging traditional tokenization approaches. The model achieves ~97% decoding precision at <10× compression and processes up to ~33M pages/day on 20 A100-40G nodes, outperforming baselines such as GOT-OCR2.0. Discussions highlight the potential for unlimited context windows and tokenization-free inputs, with contributions from **@karpathy**, **@teortaxesTex**, and others. In video generation, **Google DeepMind**'s **Veo 3.1** leads community benchmarks with advanced precision editing and scene blending, while **Krea** open-sources a 14B autoregressive video model enabling realtime long-form generation at ~11 FPS on a single B200 GPU.
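To put the throughput and compression figures above on a common footing, here is a minimal arithmetic sketch. The helper names are hypothetical, the 8-GPUs-per-node figure is an assumption not stated in this issue, and the token counts in the second example are made up for illustration.

```python
def pages_per_gpu_per_day(total_pages_per_day: float, nodes: int, gpus_per_node: int) -> float:
    """Average daily page throughput of one GPU in a homogeneous cluster."""
    if nodes <= 0 or gpus_per_node <= 0:
        raise ValueError("nodes and gpus_per_node must be positive")
    return total_pages_per_day / (nodes * gpus_per_node)


def compression_ratio(text_tokens: int, vision_tokens: int) -> float:
    """Number of text tokens represented by each vision token."""
    if vision_tokens <= 0:
        raise ValueError("vision_tokens must be positive")
    return text_tokens / vision_tokens


# From the summary: ~33M pages/day on 20 A100-40G nodes.
# ASSUMPTION: 8 GPUs per node (not stated in this issue).
per_gpu = pages_per_gpu_per_day(33_000_000, nodes=20, gpus_per_node=8)
print(f"{per_gpu:,.0f} pages/day per GPU")  # 206,250 pages/day per GPU

# Hypothetical page: 1,000 text tokens rendered into 100 vision tokens.
print(compression_ratio(1000, 100))  # 10.0
```

Under that assumption the cluster figure works out to roughly 200k pages/day per A100, which is consistent with the per-GPU number in the headline.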


Vision is all you need?

AI News for 10/17/2025-10/20/2025. We checked 12 subreddits, 544 Twitters and 23 Discords (198 channels, and 14010 messages) for you. Estimated reading time saved (at 200wpm): 1097 minutes. Our new website is now up with full metadata search and beautiful vibe coded presentation of all past issues. See https://news.smol.ai/ for the full news breakdowns and give us feedback on @smol_ai!

As ICCV kicks off in Hawaii, DeepSeek continues to show signs of life. This is a relatively small paper, with 3 authors and a small 3B model, but it contributes a SAM+CLIP+compressor encoder named DeepEncoder, and the headline findings are sound.

The significance of a very good OCR model, beyond liberating a lot of data from books and PDFs, is the opportunity to always consume rich text and get rid of the tokenizer.


AI Twitter Recap

DeepSeek’s “Optical Context Compression” OCR and the end of text-only context?

Video generation: Veo 3.1 leaps ahead; Krea Realtime goes OSS

Agentic coding stacks, governance, and enterprise posture

Infra resilience and performance tooling

Evals and benchmarks: real money, real leaderboards, and structured reasoning

Domain tools: Life sciences, data pipelines, and structured extraction

Top tweets (by engagement)


AI Reddit Recap

/r/LocalLlama + /r/localLLM Recap

1. DeepSeek OCR Release

Less Technical AI Subreddit Recap

/r/Singularity, /r/Oobabooga, /r/MachineLearning, /r/OpenAI, /r/ClaudeAI, /r/StableDiffusion, /r/ChatGPT, /r/ChatGPTCoding, /r/aivideo

1. Robotics Innovations

2. AGI Predictions and History


AI Discord Recap

A summary of Summaries of Summaries by gpt-5

1. AI Video Generation Showdown

2. Kernel DSLs and Quantization Updates

3. New Models, Datasets, and Agent Tooling

4. Portable GPU Compute on Macs

5. Research & Evaluation Highlights


Discord: High level Discord summaries

Perplexity AI Discord


LMArena Discord


Unsloth AI (Daniel Han) Discord


OpenAI Discord


OpenRouter Discord


LM Studio Discord


HuggingFace Discord


GPU MODE Discord


Latent Space Discord


Eleuther Discord


Moonshot AI (Kimi K-2) Discord


Nous Research AI Discord


Manus.im Discord Discord


Modular (Mojo 🔥) Discord


DSPy Discord


Yannic Kilcher Discord


aider (Paul Gauthier) Discord


tinygrad (George Hotz) Discord


MCP Contributors (Official) Discord


The LLM Agents (Berkeley MOOC) Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


The MLOps @Chipro Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


The Windsurf Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.




Discord: Detailed by-Channel summaries and links

Perplexity AI ▷ #general (1099 messages🔥🔥🔥):

Comet Referral Program, AWS Outage, GPT-5 vs Claude, Scientific Method, MCP


Perplexity AI ▷ #sharing (11 messages🔥):

Perplexity AI, TikTok Video, Claude Sonnet 4.5, Shareable Threads, AWS Dashboard


Perplexity AI ▷ #pplx-api (8 messages🔥):

API pricing, ZTT server


LMArena ▷ #general (1182 messages🔥🔥🔥):

Lithiumflow's Coding Capabilities, Gemini 3 Speculation, AI Video Generation Quality, Claude Sonnet 4.5


LMArena ▷ #announcements (1 message):

Text-to-Video Leaderboard, Image-to-Video Leaderboard, Veo-3.1 ranking


Unsloth AI (Daniel Han) ▷ #general (1106 messages🔥🔥🔥):

Apple hardware support in Unsloth, AMD x PyTorch Hackathon, Training reasoning models for legal systems, Synthetic data generation with synonyms, Choosing models for coding tasks


Unsloth AI (Daniel Han) ▷ #introduce-yourself (8 messages🔥):

Software Engineer Introduction, AI Bot Development, New Tricks for Old Dogs


Unsloth AI (Daniel Han) ▷ #off-topic (426 messages🔥🔥🔥):

Qwen 2.5 VL Issues, Hackathon Challenges and Synthetic Data, GPU vram burden, Diff2flow, RP Stat thing


Unsloth AI (Daniel Han) ▷ #help (73 messages🔥🔥):

FailOnRecompileLimitHit, Gemma3-270m decoding, TTS and ASR model training, GRPO recipe for gpt oss 20b, QWEN2.5 7B chat template


Unsloth AI (Daniel Han) ▷ #showcase (14 messages🔥):

Luau-Qwen3-4B-FIM-v0.1, Training Configurations, Luau-Devstral-24B-Instruct-v0.2, Brainstorm adapter


Unsloth AI (Daniel Han) ▷ #research (12 messages🔥):

AI model thinking modes, xLLMs Dataset Collection, Double descent history and papers


OpenAI ▷ #ai-discussions (569 messages🔥🔥🔥):

Comcast data selling, Stealth model, Grok video, Sora 2 video quality, Sora 2 invites


OpenAI ▷ #gpt-4-discussions (42 messages🔥):

Agentic AI Hackathon, VPN use with ChatGPT/Sora, DALL-E private image generation, ChatGPT access in universities, Sora Access and "k0d3"


OpenAI ▷ #prompt-engineering (51 messages🔥):

Sora's ability to distinguish real vs fictional images, Prompt engineering for Sora, Controlling ChatGPT output, Sora screenshake VFX, Learning prompt engineering


OpenAI ▷ #api-discussions (51 messages🔥):

Sora recognizing real vs fictional images, Sora account for movie trailers, Prompt engineering resources, Screenshakes or other VFX in Sora 2


OpenRouter ▷ #app-showcase (125 messages🔥🔥):

TLRAG framework, Deterministic AI, Model Agnostic Framework, OpenRouter user demographic, AI Slop


OpenRouter ▷ #general (428 messages🔥🔥🔥):

SambaNova latency in DeepSeek v3.1, AI SDK Anthropics models in OpenRouter, Default LLM for market research - gemini 2.5 pro?, Is llama doing anything good?, Support Agent SKILLS from Claude?


OpenRouter ▷ #new-models (1 message):

Readybot.io: OpenRouter - New Models


OpenRouter ▷ #discussion (93 messages🔥🔥):

Fake AI Product Success Rates, AI Art in Corporate Branding, Qwen3 235B-A22B API Pricing, Liquid stopped hosting LFM 7b, AWS Status Page


LM Studio ▷ #general (295 messages🔥🔥):

Llama Bench support, MCP servers, OpenHands agentic framework, System Prompts, Jinja Template issue


LM Studio ▷ #hardware-discussion (339 messages🔥🔥):

GPU Cooling Solutions, 3090 Hotspot Issues, Auxiliary GPU Usage, EPYC vs Threadripper for LLMs, ROCm Support for Older GPUs


HuggingFace ▷ #general (535 messages🔥🔥🔥):

Fine-tuning LLMs, HF Inference API & text generation, MCPs vs Agents, Image analysis and labeling, AWS outage


HuggingFace ▷ #today-im-learning (6 messages):

NivasaAI Dynamic Routing, Google ADK, Agent + Tools, Max limitations


HuggingFace ▷ #cool-finds (4 messages):

Qwen3 Vision Model, Microsoft Protein Functionality Prediction


HuggingFace ▷ #i-made-this (14 messages🔥):

Agent Building Tutorial, LLM Token Hallucination, New Architecture for Decentralized Tokenizers, Self-Hosted Tracing for OpenAI Agents, Amiko: Social Identity Platform for AI Twins


HuggingFace ▷ #computer-vision (2 messages):



HuggingFace ▷ #NLP (1 message):

Chat Template Conversion, Tokenizer Execution, Fine-Tuning Script


HuggingFace ▷ #smol-course (16 messages🔥):

Leaderboard Submission Delay, CUDA Out of Memory Errors in SMOL Course, Lighteval Bug Fix, DPO Exploration Lacking


HuggingFace ▷ #agents-course (13 messages🔥):

Course Starting, SmolAgents Framework, DeepFabric for SLMs


GPU MODE ▷ #general (19 messages🔥):

Triton distributed talk, Helion talk, Category Theory and AI resources, Impossible Cloud Network (ICN) collab, AMD event


GPU MODE ▷ #triton (15 messages🔥):

TMA performance, Triton host TensorDescriptor, algebraic shuffle


GPU MODE ▷ #cuda (21 messages🔥):

Thread Block vs CTA, Distributed Shared Memory Latency/Bandwidth, TMA vs cp.async, CUDA Learning Resources, Device-Agnostic TMA Logic


GPU MODE ▷ #torch (15 messages🔥):

Pytorch profiler CUDA issues, AlphaZero Compute Requirements, Matmul Fusion, Process scheduling issues


GPU MODE ▷ #cool-links (2 messages):

GPU Engineering Resources, PMPP-Eval Journey


GPU MODE ▷ #jobs (1 message):

Seed Stage SF Startup, GPU Performance Engineers, Herdora Hiring


GPU MODE ▷ #pmpp-book (1 message):

mannythecreator: Could you please share some url that talks about these.


GPU MODE ▷ #torchao (8 messages🔥):

vLLM quantization, SGLang quantization support, Online quantization, ModuleFqnToConfig, torchao_utils.py


GPU MODE ▷ #off-topic (13 messages🔥):

geohot, GPUs go brrr, DGX Spark impressions, Blackwell instructions


GPU MODE ▷ #irl-meetup (2 messages):

San Diego Meetup, Orange County Meetup


GPU MODE ▷ #triton-puzzles (2 messages):

Triton Kernels, Fused Kernels, Kernel Assistance


GPU MODE ▷ #rocm (10 messages🔥):

AMD Warp Sizes vs NVIDIA, MI300x Cache Coherency, Warp Tiling, GEMM Occupancy


GPU MODE ▷ #tilelang (1 message):

TileLang, DeepSeek V3.2, Sparse MLA


GPU MODE ▷ #self-promotion (5 messages):

Nvidia DGX, Petaflop Compute, MMA Atoms in CuTe, CUTLASS docs, Blogpost on MMA Atoms


GPU MODE ▷ #🍿 (1 message):

LLM for Kernel Generation, LLM for Bottleneck Identification


GPU MODE ▷ #thunderkittens (1 message):

Thread Execution Clarification, Collective Launch Behavior, TMA Operations in New TK, Prefix Meaning in New TK


GPU MODE ▷ #submissions (3 messages):

VectorAdd Leaderboard, H100 Results, B200 Results, A100 Results


GPU MODE ▷ #ppc (4 messages):

CP5, coalescing memory accesses, tiling, Transpose


GPU MODE ▷ #hardware (20 messages🔥):

H100 server prices, RTX 6000 Ada TFLOPs variance, Benchmarking nuances and thermal throttling, NVLink bridge prices, cuBLAS autotuning


GPU MODE ▷ #factorio-learning-env (1 message):


GPU MODE ▷ #amd-competition (2 messages):

Competition Submissions, Winner Write-ups


GPU MODE ▷ #cutlass (22 messages🔥):

CUTLASS tile size tuning, CuTe in non-CMake projects, MoE Grouped GEMM throughput, CUTLASS naming conventions, PTX code generation


GPU MODE ▷ #singularity-systems (7 messages):

SITP, picograd, Karpathy's Eureka Course, MLSys 2026 tutorial, Tinygrad


GPU MODE ▷ #multi-gpu (28 messages🔥):

Expert Parallelism (EP) for MoE with AllToAll, Combining EP with DP, Parallel folding, Multi-GPU training with Triton, Iris for Multi-GPU training on AMD GPUs


GPU MODE ▷ #irl-accel-hackathon (14 messages🔥):

Synthetic Data AI Agents Challenge, Nvidia DGX Spark, Disaggregated prefill/decode, Speculative decoding, kernel optimization


GPU MODE ▷ #opencl-vulkan (3 messages):

CUDA, OpenCL, Vulkan


GPU MODE ▷ #cluster-management (2 messages):

Fault Tolerant Llama Training, Node Failure Prediction


GPU MODE ▷ #llmq (2 messages):

Qutlass integration


GPU MODE ▷ #helion (3 messages):

Helion, PyTorch Conference, Triton Developer Conference, Helion 0.2


Latent Space ▷ #ai-general-chat (133 messages🔥🔥):

ManusAI v1.5, AI sentiment shift, Cursor Git worktree support, Grok 4.20, GPT-4o transcribe diarize


Latent Space ▷ #ai-announcements (6 messages):

Lightning Pods, X-Ware.v0, Elie's 2025 State of Pre-training Podcast


Latent Space ▷ #private-agents (14 messages🔥):

NPU programming, AMD's NPU approach, eGPUs, tinygrad's eGPU support, RTX 3090 buying guide


Latent Space ▷ #genmedia-creative-ai (8 messages🔥):

AI-Generated Luxe Escapism, Endless Summer AI photobooth


Eleuther ▷ #general (95 messages🔥🔥):

LLM Training Speed on RTX 3090, Minimum Model Size for Coherence, Pretraining vs Fine-tuning, EleutherAI Discord Server Tag


Eleuther ▷ #research (37 messages🔥):

Attribution graphs, Diffusion models vs LLMs, Continuous and binary rewards in RL, Evaluation-awareness experiments, NorMuon optimizer


Eleuther ▷ #interpretability-general (13 messages🔥):

Anthropic's Biology of LLMs paper, Cross layer transcoders with attribution graphs for diffusion models, Finetuning Llama-3 to count words, Subtracting the space_direction, Decoupling data complexity from model complexity


Eleuther ▷ #lm-thunderdome (1 message):

Eval Harness, lm-evaluation-harness, MMLU, repeats


Moonshot AI (Kimi K-2) ▷ #general-chat (139 messages🔥🔥):

DeepSeek vs Moonshot, Groq's Kimi Implementation, Kimi K2 Troubleshooting, Prediction Markets, Quant Trading


Nous Research AI ▷ #general (94 messages🔥🔥):

GLM 4.6, Claude's Coding Monopoly, AI Learning Resources, LLM Reasoning, OS Model Development


Nous Research AI ▷ #research-papers (8 messages🔥):

Sampling method, AI Safety, Clinical AI, Healthcare AI


Nous Research AI ▷ #interesting-links (2 messages):

ScaleRL for Sparse Models, Trajectory vs Sample Level Loss Aggregation, Iterative RL Reasoning


Nous Research AI ▷ #research-papers (8 messages🔥):

Sampling Method, International AI Safety Report 2025, Healthcare AI Safety


Manus.im Discord ▷ #general (107 messages🔥🔥):

Manus infrastructure outage, Manus credits disappearing, Free perplexity pro, Open Sourcing Manus, Manus google drive connection


Modular (Mojo 🔥) ▷ #general (54 messages🔥):

Mojo compiler self-hosted?, Mojo and MLIR, MAX Kernels Backend, Advantages of MAX Dynamic Shapes, Mojo vs Python


Modular (Mojo 🔥) ▷ #mojo (12 messages🔥):

UDP Sockets in Mojo, Mojo vs Rust vs C++, itertools package in Mojo


Modular (Mojo 🔥) ▷ #max (4 messages):

Max Backend for PyTorch, PyTorch Nightly Use


DSPy ▷ #show-and-tell (1 message):

LM Studio, DSPy framework, llms.txt generator


DSPy ▷ #general (64 messages🔥🔥):

Claude Agents, Clojure REPL environments, Typing DSPy, Gemini models in DSPy, RLM implementation


Yannic Kilcher ▷ #general (38 messages🔥):

Weekly tautological counter, Reinforcement learning framework for PyTorch, Graph neural networks, AI engineer qualifications, ML Debugging interviews


Yannic Kilcher ▷ #paper-discussion (18 messages🔥):

Lip Movement Algorithm, Paper Machine Unlearning, Backscatter IoT Applications, RL Training-Free Sampling Method


Yannic Kilcher ▷ #agents (2 messages):

AI Agent Movie Inspiration, Voice integration


Yannic Kilcher ▷ #ml-news (5 messages):

Qwen3 Vision Model, aidaw.com, Unitree Robotics, DeepSeek-OCR


aider (Paul Gauthier) ▷ #general (32 messages🔥):

Aider Status and Roadmap, Integrating Agentic Extensions into Aider, Devstral Small Model Feedback, aider-ce vs Codex CLI


aider (Paul Gauthier) ▷ #questions-and-tips (5 messages):

Commit Message Reasoning, Read-Only Files, Aider Style Guidelines


tinygrad (George Hotz) ▷ #general (12 messages🔥):

nanochat, SHRINK and PAD, CI for external PR, usb gpu, MFLOPS and MB/s


tinygrad (George Hotz) ▷ #learn-tinygrad (5 messages):

Gradient Accumulation, TinyJit, Manual Gradient Division


MCP Contributors (Official) ▷ #general (5 messages):

MCP Access, Webrix MCP Gateway, Docker MCP Gateway Multi-Tenant, MCP auth extension, oauth scope granularity