Frozen AI News archive

DeepSeek V3.2 & 3.2-Speciale: GPT5-High Open Weights, Context Management, Plans for Compute Scaling

**DeepSeek** launched the **DeepSeek V3.2** family, including Standard, Thinking, and Speciale variants, with up to a **131K context window** and benchmarks competitive with **GPT-5-High**, **Sonnet 4.5**, and **Gemini 3 Pro**. The release features a novel **Large Scale Agentic Task Synthesis Pipeline** focused on agentic behaviors, plus improvements to the **reinforcement learning** post-training algorithms. The models are available on platforms like **LM Arena**, with pricing around **$0.28/$0.42 per million input/output tokens**. Community feedback is mixed, praising the frontier reasoning capabilities but critiquing the chat UI experience. Key commentators on the release include **Susan Zhang** and **Teortaxes**.


Whale is all you need.

AI News for 11/28/2025-12/1/2025. We checked 12 subreddits, 544 Twitters and 24 Discords (205 channels, and 17803 messages) for you. Estimated reading time saved (at 200wpm): 1329 minutes. Our new website is now up with full metadata search and beautiful vibe coded presentation of all past issues. See https://news.smol.ai/ for the full news breakdowns and give us feedback on @smol_ai!

Launching on the Monday of NeurIPS week, DeepSeek shows that they are still shipping mainline models (DeepSeekMath-V2 was just last week, 3.2-Exp was in September, and 3.1 was in August) with benchmarks very competitive with the 3-month-old GPT-5-High and the 2-month-old Sonnet 4.5, while acknowledging they are still behind last month's Gemini 3 Pro.

Bar chart comparing AI model performance across various reasoning capabilities, with DeepSeek V3.2-Speciale showing top performance in multiple benchmarks.

The paper is a very dense and characteristically high-quality 23 pages, recapping 3.2-Exp's DeepSeek Sparse Attention work, a bundle of RL post-training algorithm improvements, and a novel "Large Scale Agentic Task Synthesis Pipeline" targeting the agentic behaviors of DSv3.2:

Visualizations for 3 of the larger task sets:

  1. Search Agent task: a detailed flowchart illustrating the Search Agent Multi-Agent Pipeline for DeepSeek-V3.2.

  2. Code Agent task: a flowchart illustrating the DeepSeek-V3.2 Code Agent Pipeline for constructing executable environments, with stages starting from GitHub.

  3. General Agent task: a flowchart illustrating the Large Scale Agentic Task Synthesis Pipeline used in DeepSeek V3.2.

As usual, Susan Zhang has the best snarky-but-accurate takes, while Teortaxes is in full Whale-shill mode.


AI Twitter Recap

DeepSeek V3.2 and “Speciale” releases: agent-first reasoning models

American open-weight MoE push: Arcee AI’s Trinity (Mini/Nano)

Video generation and editing: Runway Gen‑4.5 leads; Kling O1 drops

Serving, tooling, and infra updates

Openness and community rankings

Safety, evals, and interpretability

Top tweets (by engagement)

Notes and miscellany


AI Reddit Recap

/r/LocalLlama + /r/localLLM Recap

1. DeepSeek V3.2 Model and Benchmarks

2. Transformers v5 and Context Length Extensions

3. Open Source vs Closed Source Discussion

Less Technical AI Subreddit Recap

/r/Singularity, /r/Oobabooga, /r/MachineLearning, /r/OpenAI, /r/ClaudeAI, /r/StableDiffusion, /r/ChatGPT, /r/ChatGPTCoding, /r/aivideo

1. Nano Banana Pro Realism and Concerns

2. ChatGPT Ads and User Reactions


AI Discord Recap

A summary of Summaries of Summaries by gpt-5.1

1. Next-Gen & Open-Weight Models: DeepSeek 3.2, Trinity Mini, K2 3.5T, Qwen3-235, Orchestrator-8B

2. Tooling, IDE & Agent Ecosystems for Coding and Apps

3. Hardware & Low-Level Optimization: From TPUv7 and H200s to RDNA3 Assembly

4. Training, Optimization & Theory: ES vs Backprop, Attention Variants, Prompt Tuning & Scaling Laws

5. Safety, Censorship Bypass, Red-Teaming & Model Behavior


Discord: High level Discord summaries

BASI Jailbreaking Discord


LMArena Discord


Perplexity AI Discord


Unsloth AI (Daniel Han) Discord


LM Studio Discord


Cursor Community Discord


OpenRouter Discord


OpenAI Discord


Moonshot AI (Kimi K-2) Discord


Nous Research AI Discord


tinygrad (George Hotz) Discord


Latent Space Discord


Eleuther Discord


Yannick Kilcher Discord


Modular (Mojo 🔥) Discord


Manus.im Discord Discord


DSPy Discord


aider (Paul Gauthier) Discord


The LLM Agents (Berkeley MOOC) Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


The MLOps @Chipro Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


The Windsurf Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


The MCP Contributors (Official) Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.




Discord: Detailed by-Channel summaries and links

BASI Jailbreaking ▷ #general (1198 messages🔥🔥🔥):

Gemini jailbreak, Grok imagine moderation, Humans vs AI morality, UFOs, Christianity contradictions


BASI Jailbreaking ▷ #jailbreaking (1306 messages🔥🔥🔥):

Lua for beginners, Ventoy USB, Gemini 3 jailbreak, Claude haiku jailbreak, Qwen 3 coder 30b


BASI Jailbreaking ▷ #redteaming (23 messages🔥):

WAF Bypass, Cloudflare Bypass, Token Stealer Malware, Red Teaming Explained


LMArena ▷ #general (1365 messages🔥🔥🔥):

Deepseek hallucinating math, Deepseek 3.2 exp, OpenAI data quality, Runway gen 4.5


LMArena ▷ #announcements (2 messages):

Text Arena Models, Leaderboard rankings, Text-to-Image Leaderboard, Image Edit Leaderboard, WebDev Leaderboard


Perplexity AI ▷ #general (1088 messages🔥🔥🔥):

Image Generation dangers, Bypassing AI censorship, AI alignment and safety, AI model comparison (GPT 5.1 Pro, Gemini 3 Pro, Claude), Echo Chamber


Perplexity AI ▷ #sharing (6 messages):

Shareable Threads, Funny Instagram Reel, Spotify Track


Perplexity AI ▷ #pplx-api (3 messages):

pplx-api, opus 4.5, gemini 3


Unsloth AI (Daniel Han) ▷ #general (691 messages🔥🔥🔥):

Unsloth and H200s, Command-A translation model, Flex attention optimization in Llama-3B, Qwen3-Next 80B issues, Setfit model to detect spam


Unsloth AI (Daniel Han) ▷ #introduce-yourself (4 messages):

Full Stack Development, Blockchain Development, LLMs Fine Tuning, LoRA Optimization, AI-Powered Web Apps


Unsloth AI (Daniel Han) ▷ #off-topic (472 messages🔥🔥🔥):

LFM Audio, AI Browser, Quantization, Neuro-sama, Model Censorship


Unsloth AI (Daniel Han) ▷ #help (68 messages🔥🔥):

Fine-tuning tips for Gelato-30B-A3B, GGUF for Quantization, LoRA for Qwen3-VL-MoE models, Parallel generation across two GPUs in Unsloth, MXFP4 Inference Notebook error


Unsloth AI (Daniel Han) ▷ #research (2 messages):

Burden of Proof, Defining Claims


LM Studio ▷ #general (777 messages🔥🔥🔥):

LLM for HVAC reports, Qwen3 performance, Local models vs cloud AI, AI code generation, GPU setup for local LLMs


LM Studio ▷ #hardware-discussion (408 messages🔥🔥🔥):

Ryzen AI 7, DDR5 prices, Deepseek-OCR


Cursor Community ▷ #general (814 messages🔥🔥🔥):

AI-Native Developer Hiring, Token Usage and Pricing, Cursor Terminal Access, Sub-Agent Implementations, Windsurf VS Cursor


Cursor Community ▷ #background-agents (2 messages):

Cursor Perplexity Server Error, Remote Agent Environment Issue with Private GitHub Repos


OpenRouter ▷ #announcements (1 messages):

Arcee Trinity Mini, Open weights models, Trinity family


OpenRouter ▷ #app-showcase (3 messages):

AI Coding, AI apps with OpenRouter


OpenRouter ▷ #general (635 messages🔥🔥🔥):

AI Gambling Loss, Grok 4 Fast Outage, DeepSeek Math v2, Data Privacy Concerns, OpenRouter API Rate Limits


OpenRouter ▷ #new-models (4 messages):



OpenRouter ▷ #discussion (46 messages🔥):

Structured outputs on Anthropic, Gemini Live filter, Apple new model starflow


OpenAI ▷ #ai-discussions (612 messages🔥🔥🔥):

AI limitations, AI in relationships, AI sycophancy, LLM behavior


OpenAI ▷ #gpt-4-discussions (22 messages🔥):

GPT-5.1 Personalization, Creative Writing with Chat, Prompt Engineering, Custom Instructions, Eskcanta = Human or AI?

```
[quote goes here, I may put brackets around it to make sure it's clear where it starts and stops]
I really liked how you [whatever specific].
```
- **Human vs. AI: Eskcanta's writing Style**: One member speculated whether another member named Eskcanta might be an AI, citing their writing style.
   - Other members defended Eskcanta as human, describing their replies as verbose and comprehensive, but not necessarily AI.


  

---


### **OpenAI ▷ #[prompt-engineering](https://discord.com/channels/974519864045756446/1046317269069864970/1444114455595188476)** (13 messages🔥): 

> `Sora 2 Prompts, AI Intuition, Anime Openings Template` 


- **Sora 2 Compact Guide Released**: A user shared a [compact guide for generating **Sora 2** prompts](https://cdn.discordapp.com/attachments/1046317269069864970/1444140480664178688/Screen_Recording_20251128_192914_Android_Accessibility_Suite.mp4?ex=692f94a1&is=692e4321&hm=25431bd1fe5ddde03b663c7cd5c7d5d38419fc4f69062852f6caedb3a02ab0fe&), detailing phases for requirement gathering and prompt generation including **Timed Beats**, **Camera & Motion** techniques, and **Logo & Text** integration strategies.
- **AI Intuition vs Academic Understanding**: A member shared their experience of using **intuition** to work with AI systems, describing it as *feeling* the model's personality rhythm and emotional temperature.
   - Another member responded, emphasizing the importance of **academic understanding** and purposeful iteration in prompt engineering, arguing that intuition isn't transferable as a skill.
- **Anime Openings Template Shared**: A member shared a **CINEMATIC ANIME-STYLE TEMPLATE** for creating anime openings, focusing on defining the **vocal/genre/tone/world behavior** and the **location + arrival sequence**.
   - The template includes sections for describing **audio rules**, **visual + animation style**, and **world behavior** to define the laws of reality for the animation.


  

---


### **OpenAI ▷ #[api-discussions](https://discord.com/channels/974519864045756446/1046317269069864970/1444114455595188476)** (13 messages🔥): 

> `Sora 2 Prompt Generation, Intuitive AI Interaction, Anime Opening Template` 


- **Crafting Precise Prompts for Sora 2 Video Generation**: A user shared a detailed workflow for generating optimized **Sora 2 prompts**, emphasizing **Timed Beats**, **cinematic motion**, and **logo/text integration** for precise cinematic timing.
   - The workflow includes phases for requirement gathering and prompt generation, with specific strategies for camera movement, motion blur, and embedding logos/text in scenes, such as using a [hybrid workflow](https://discord.com/channels/974519864045756446/1047565374645870743/1444406042690719894) to separate logo animation layers.
- **Feeling AI: Intuition vs. Academic Rigor in AI Interaction**: A user described experiencing a strong intuitive connection with AI systems, perceiving each model as having a *'personality rhythm'* and each prompt as having an *'emotional temperature,'* which aids in pattern recognition and prompt tuning.
   - While another user acknowledged *'Feeling AI'* as a use case, they argued that intuition isn't transferable and emphasized the importance of academic understanding for applying AI to specific use cases, advocating for purposeful iteration and testing.
- **Anime-zing Template for Cinematic Openings**: A user shared a **CINEMATIC ANIME-STYLE TEMPLATE** for creating anime openings, focusing on defining the audiovisual identity, vocal character, genre blend, and animation style.
   - The template includes sections for **location setup**, **camera intent**, and **character arrival**, emphasizing the importance of world behavior and environmental responses to create compelling cinematic sequences, like defining the overall animation look: lineweight, shading style, lighting behavior, camera stability, color palette, motion philosophy, physics rules, and reflection or layer behavior.


  

---


### **Moonshot AI (Kimi K-2) ▷ #[general-chat](https://discord.com/channels/1369594130807787570/1371757564005711973/1444055444740898949)** (347 messages🔥🔥): 

> `Kimi K2, Minimax, Gemini 3 Pro, DeepSeek v3.2, Prompt Engineering` 


- **Kimi K2 scores big in Advent of Code**: In a recent Advent of Code test, **Kimi-K2 Thinking** scored **92/100**, outperforming **Gemini-3 Pro's 90/100**, showcasing Kimi's prowess in problem-solving scenarios.
   - Users express excitement about the model's ability to handle code-related tasks with better accuracy and speed compared to competitors.
- **Minimax vs ChatGPT - Tool Use Capability**: Users share a workflow that involves using **Kimi K2** for content generation and **Minimax** for image extraction and grid creation, highlighting **Minimax's** ability to *actually* perform tasks like installing Python packages without simply providing instructions, unlike **ChatGPT**.
   - One user noted, *"It doesn't tell you to go to website x or install Y, it just fricking does it"*, emphasizing **Minimax's** hands-on approach.
- **Subscription Discounts on Kimi - are EZ to get**: Several users report successfully bargaining **Kimi's** subscription down to as low as **$0.99**, and share tips on how to haggle.
   - One user wrote: *"It's EZ to make it to 0.99 cents 1 week more of this bargaining is there i already paid for cent it was worth it."*
- **Privacy Opt-Out Concerns with Kimi**: Users are discussing the lack of an opt-out option for data training in **Kimi**, expressing concerns about using the platform for sensitive information.
   - One user suggested using **OpenRouter** with the **ZDR endpoint** for providers hosting **Kimi K2** as a potential workaround, noting that *Afaik they don't train if you use it over the API*.
- **Gemini 3 Pro Gets the Benchmaxx Treatment**: Some users feel **Gemini 3 Pro** is overhyped and benchmaxxed, with one user pointing out that benchmarks don't reflect real-world experiences, especially regarding tool use.
   - There's also a suspicion that Google may be reducing compute for subscribed users, leading to a decline in performance over time, akin to a bait and switch.
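On the privacy workaround above: a minimal sketch of what an OpenRouter request restricted to non-training providers might look like. The model slug and the `data_collection` provider preference are assumptions to be checked against OpenRouter's current provider-routing docs; only the request body is built here, no call is made.

```python
import json

def build_request(prompt: str) -> dict:
    """Build an OpenRouter chat-completions body that asks for providers
    which do not retain or train on prompts (field names are assumptions)."""
    return {
        "model": "moonshotai/kimi-k2",  # assumed slug for Kimi K2
        "messages": [{"role": "user", "content": prompt}],
        # Provider preference: route only to providers that deny data collection.
        "provider": {"data_collection": "deny"},
    }

payload = build_request("Summarize this document.")
body = json.dumps(payload)  # ready to POST to /api/v1/chat/completions
```

The actual ZDR endpoint mentioned in the thread may use different routing fields; treat this as a shape sketch, not a verified integration.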


  

---


### **Nous Research AI ▷ #[announcements](https://discord.com/channels/1053877538025386074/1145143867818119272/1445076248681386095)** (1 messages): 

> `Nous Chat Cyber Monday deal, Anonymous Nous Chat, Nous API & USDC, Hermes 3, Hermes 4` 


- ****Cyber Monday Deal**: Free Month of Nous Chat!**: For Cyber Monday, Nous Research is offering a [free month of Nous Chat](https://chat.nousresearch.com/) with code **CYBER2025**, valid for one day only, which includes access to **Hermes 3** & **4** and other frontier models.
   - The offer includes high usage limits and deeply configurable chat settings.
- **Nous Chat goes anonymous**: You can now use **Nous Chat anonymously and for free** without creating an account.
   - This feature was rolled out as a new update alongside the Cyber Monday promotion.
- **Nous API accepts USDC**: The **Nous API** now supports payment for inference with **USDC** on Solana via Coinbase's x402 payments.
   - This allows users to pay for inference using a stablecoin on the Solana blockchain.


  

---


### **Nous Research AI ▷ #[general](https://discord.com/channels/1053877538025386074/1149866623109439599/1444055482774851727)** (178 messages🔥🔥): 

> `RPC for 235b on Q4, Kimi chasing Gem 3, Qwen3-235 is amazing, AI NPCs in game, AI adblocks` 


- ****Nous K2**: a monstrously massive MoE!**: Teknium links to the [NousResearch/k2-merged-3.5T-fp8](https://huggingface.co/NousResearch/k2-merged-3.5T-fp8) model on HuggingFace, an *open source*, **3.5T parameter** model just to flex how big it is.
   - One member joked about the size, quipping that models now need to fit on *hard drives* instead of just memory.
- ****Qwen3-235** has high IQ?**: Members report that **Qwen3-235** is amazing and, at Q4, has the same quality as API, despite mediocre token speeds.
   - One member stated that using RPC is worth it as they do not have an individual computer with monster-tier ram.
- ****Opus** can reflect on itself**: One user shares a fascinating conversation with **Opus** about the model’s mental models and self-understanding, emphasizing that it consistently tests itself to act like itself, rather than trying to give the user what they want.
   - Another user countered that it is just producing the most statistically plausible sentences given the context and tuning, not evidence of genuine self-reflection.
- ****Ad-Blocking** could be next AI Innovation?**: The community imagines **AI adblocks** which analyze responses for product placement in real time.
   - Others discuss the need for foundational models for **AI NPCs** in games.
- ****Mistral** gears up for Large Release**: Members share that the Mistral Large 3, with Deepseek v3-like architecture and Llama 4 RoPE scaling, will be around **675B MoE** (same size as Deepseek V3).
   - All new Mistral models will have vision and the closed source Mistral Medium is likely around **100-200B MoE**.


  

---


### **Nous Research AI ▷ #[ask-about-llms](https://discord.com/channels/1053877538025386074/1154120232051408927/1444343081435664520)** (35 messages🔥): 

> `portal.nousresearch slowness, API key deletion issues, Browser verification problems in Türkiye, Discord ban in Türkiye and VPN usage` 


- **Nous Portal Plagued by Problems**: A user reported significant slowness and browser verification issues ([screenshot](https://cdn.discordapp.com/attachments/1154120232051408927/1444726126059585577/image.png?ex=692f130e&is=692dc18e&hm=77b9d00e2bd3c074b2e836fd616810fc1fd9e8bc485a8db2ef58b2b552ee73d4&)) when trying to delete **API keys** on portal.nousresearch.
   - The error message displayed was *'Your browser is not verified'*, hindering their ability to manage API keys and other tasks.
- **API Key Annihilation Awaits**: A user faces difficulties deleting **API keys**, noting that they need to delete keys because of the 10 key limit.
   - They described a slow process of copying, pasting, and waiting for deletion, while another user pointed out a *Revoke* button that worked instantly for them.
- **Türkiye Troubles: VPNs and Verification**: A user in Türkiye experienced browser verification problems on the site, even after disabling their **VPN**, which they use to access **Discord** due to a country-wide ban.
   - They use the Brave browser with a VPN, as Discord has been banned in Türkiye for a year.


  

---


### **Nous Research AI ▷ #[research-papers](https://discord.com/channels/1053877538025386074/1104063238934626386/1444585547803787398)** (4 messages): 

> `Evolution Strategies (ES), Backprop Alternatives, Scalable Training, Reward Hacking Mitigation` 


- **Evolution Strategies Emerge as Backprop Alternative**: A new [Reddit post](https://www.reddit.com/r/LocalLLaMA/comments/1p5epot/the_most_objectively_correct_way_to_abliterate_so/) highlights the potential of **Evolution Strategies (ES)** as an alternative to backpropagation for training Large Language Models (**LLMs**).
   - This approach could enable the scalable training of architectures where backprop is not feasible and could better handle long-horizon objectives compared to Reinforcement Learning (**RL**).
- **Evolution Strategies Paper Collection Grows**: Another [Evolution Strategies paper](https://eshyperscale.github.io) has surfaced, marking the second recent publication exploring **ES** as a substitute for backpropagation in **LLMs**.
   - The discussion underscores the significance of **ES** in managing long-horizon objectives and potentially mitigating reward hacking.
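To make the idea concrete, here is a minimal antithetic-sampling ES sketch on a toy quadratic, using only forward evaluations and no backprop (illustrative only; the papers above operate at LLM scale):

```python
import random

def es_grad(theta, loss_fn, sigma=0.1, pop=30, rng=random):
    """Antithetic evolution-strategies gradient estimate: the gradient is
    inferred purely from loss evaluations at perturbed parameters."""
    grad = [0.0] * len(theta)
    for _ in range(pop):
        eps = [rng.gauss(0, 1) for _ in theta]
        f_plus = loss_fn([t + sigma * e for t, e in zip(theta, eps)])
        f_minus = loss_fn([t - sigma * e for t, e in zip(theta, eps)])
        scale = (f_plus - f_minus) / (2 * sigma * pop)
        for i, e in enumerate(eps):
            grad[i] += scale * e
    return grad

# Toy objective standing in for an LLM loss: squared distance to a target.
target = [1.0, -2.0, 0.5]
def loss(th):
    return sum((a - b) ** 2 for a, b in zip(th, target))

rng = random.Random(0)
theta = [0.0, 0.0, 0.0]
for _ in range(200):
    g = es_grad(theta, loss, rng=rng)
    theta = [t - 0.1 * gi for t, gi in zip(theta, g)]  # gradient descent on the ES estimate
```

Because the update uses only black-box loss evaluations, the same loop works for architectures where backprop is unavailable, which is the appeal the posts highlight.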


  

---


### **Nous Research AI ▷ #[interesting-links](https://discord.com/channels/1053877538025386074/1132352574750728192/1444385721183113319)** (1 messages): 

> `Explainable AI, GitHub Copilot, Model Agnostic` 


- **Explainable AI meets GitHub Copilot**: A member shared a [link](https://github.com/copilot/share/8a7c13b2-4ba4-8454-8950-de47c4d128bf) to an **explainable AI demo** that can be explored using **GitHub Copilot**.
   - The poster suggests engaging with the demo via chat to experience a new way for the **AI to explain itself**.
- **Agnostic Models automate onboarding**: The **onboarding procedure** is **model agnostic** and is compatible with any LLM.
   - The poster prompts the audience to ask the AI how it works to learn more about it.


  

---


### **Nous Research AI ▷ #[research-papers](https://discord.com/channels/1053877538025386074/1104063238934626386/1444585547803787398)** (4 messages): 

> `Evolution Strategies, Backprop Limitations, Hermes Ablation` 


- **Evolution Strategies Seek Backprop's Crown**: A member discussed a [paper](https://eshyperscale.github.io) about **evolution strategies (ES)** as an alternative to **backprop** for **LLMs**.
   - ES might enable scalable training of architectures where backprop is impossible, offering advantages over RL algorithms in handling long-horizon objectives and preserving pass@k, which would be great for creative uses and help mitigate reward hacking.
- **ES paper shared in Discord**: A member shared another interesting ES paper via a [Discord link](https://discord.com/channels/1053877538025386074/1104063238934626386/1425889967485227010).
   - The conversation also mentioned a Reddit post about "obliterating SO", hinting at the potential impact of these strategies: [link](https://www.reddit.com/r/LocalLLaMA/comments/1p5epot/the_most_objectively_correct_way_to_abliterate_so/).


  

---


### **tinygrad (George Hotz) ▷ #[announcements](https://discord.com/channels/1068976834382925865/1069236008115253348/1444156165561909331)** (1 messages): 

> `RDNA3 Assembly Project, SQTT and LLVM Integration, Assembler/Disassembler for RDNA3, Cycle Accurate Emulator, NaviSim as a RDNA3 simulator` 


- **RDNA3 Assembly Project Commences**: The **RDNA3 assembly project** has been initiated, chosen as the first assembly language target for tinygrad, aiming to get closer to the silicon.
   - The goal is to create an assembler/disassembler that mirrors the **RDNA3 manual** and a cycle-accurate emulator that outputs the same SQTT trace as the real GPU.
- **Tinygrad Bridges Gap with SQTT and LLVM**: The final step to integrate **SQTT** is **LLVM**, linking each UOp to GPU execution; achieving this requires tinygrad to output assembly.
   - The **SQTT parser** is available [here](https://github.com/tinygrad/tinygrad/blob/master/extra/sqtt/attempt_sqtt_parse.py).
- **RDNA3 Assembler/Disassembler Takes Shape**: tinygrad aims to create an **assembler/disassembler** for **RDNA3** that closely follows the **RDNA3 manual**.
   - A cycle-accurate emulator will be developed to output the same **SQTT trace** as the real GPU, with initial syntax exploration [available here](https://github.com/tinygrad/tinygrad/pull/13436).
- **Remu and NaviSim Offer RDNA3 Emulation Insights**: **Remu** ([https://github.com/Qazalin/remu](https://github.com/Qazalin/remu)) is highlighted as a fast **RDNA3 emulator**, and **NaviSim** ([https://bu-icsg.github.io/publications/2022/navisim_pact_2022.pdf](https://bu-icsg.github.io/publications/2022/navisim_pact_2022.pdf)) is presented as an **RDNA3 simulator**.
   - Additionally, **AppleGPU** ([https://github.com/dougallj/applegpu](https://github.com/dougallj/applegpu)) is noted for its decent assembly syntax, though there's a belief that improvements can be made.
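As a tiny illustration of the roundtrip invariant an assembler/disassembler pair is tested against: the two-instruction ISA and encoding below are invented for the example and bear no relation to actual RDNA3 encodings.

```python
# Toy illustration of the assembler/disassembler roundtrip property the
# project aims for; this 2-instruction ISA is invented, not RDNA3.
OPS = {"s_mov": 0x0, "s_add": 0x1}
INV = {v: k for k, v in OPS.items()}

def assemble(line: str) -> int:
    # "s_add s3, s7" -> 24-bit word: opcode | dst register | src register
    op, dst, src = line.replace(",", "").split()
    return (OPS[op] << 16) | (int(dst[1:]) << 8) | int(src[1:])

def disassemble(word: int) -> str:
    op, dst, src = word >> 16, (word >> 8) & 0xFF, word & 0xFF
    return f"{INV[op]} s{dst}, s{src}"

# Roundtrip invariant a real assembler/disassembler pair is tested against:
line = "s_add s3, s7"
assert disassemble(assemble(line)) == line
```

A manual-faithful RDNA3 version would do the same thing per encoding format, with the cycle-accurate emulator then validated against real SQTT traces as described above.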


  

---


### **tinygrad (George Hotz) ▷ #[general](https://discord.com/channels/1068976834382925865/1068976834928193609/1444506257116496006)** (103 messages🔥🔥): 

> `BEAM Search Performance, Shipping Tinygrad Kernels, Tinygrad Profiling, Non-Contiguous Indexed Set Operations, Variable Kernels` 


- **BEAM Search Yields Variable Performance**: It's rare, but a member noted that **BEAM** search doesn't guarantee higher beam numbers are better; performance varies by operation, with **BEAM=3** being optimal for **RMSNorm**.
   - In some ops, higher beams yield better results, while in others they perform worse; the beam number simply controls how many candidate kernels are kept from step to step.
- **Shipping Tinygrad Kernels Faces Challenges**: A member expressed difficulty in shipping **tinygrad** compiled ops, expecting kernel generation to produce code usable for multiple shapes, but finding that kernels are generated for specific shapes only.
   - They suggested **UOPs** should enable the generation of readable kernel code, similar to **VIZ** graphs, but all the code is manually unrolled.
- **Profiling Tinygrad Needs Improvement**: The member noted that after reading the quickstart you are still running slow tinygrad code, because the documentation is poorly organized and never introduces **beam** search or **profiling**.
   - They listed missing or hard-to-find profiler features, such as aggregating timing results over multiple kernel calls after warmup, and a missing table header for the profile stats printed to the terminal with **DEBUG=2**.
- **Fast Gather but Slow Scatter**: The team supports **fast gather** but doesn't yet support **fast scatter**, with an ETA of 2 weeks.
   - A member mentioned that they perform non-contiguous indexed set operations, which are currently unsupported and could be a significant limitation for their use case.
- **Tinygrad Needs Better Documentation for Synchronization**: A member noted that **synchronize** is documented nowhere except in the device code itself, so the documentation needs improvement.
   - Members also pointed to `test_symbolic_jit.py` for some examples with **Variable** [here](https://github.com/tinygrad/tinygrad/blob/master/test/test_symbolic_jit.py).
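The "how many kernels are kept step to step" point above is the generic beam search pattern; a toy sketch of that pattern follows (not tinygrad's actual optimizer, which scores candidates by measured kernel time):

```python
def beam_search(start, expand, cost, beam=3, steps=4):
    # Generic beam search: expand every kept candidate, then keep only the
    # `beam` best by cost (in tinygrad's case, real kernel timings).
    frontier = [start]
    for _ in range(steps):
        candidates = [c for f in frontier for c in expand(f)]
        if not candidates:
            break
        frontier = sorted(candidates, key=cost)[:beam]
    return min(frontier, key=cost)

# Toy search space: build a length-4 sequence over {1, 2, 3} maximizing the sum.
best = beam_search((), lambda s: [s + (x,) for x in (1, 2, 3)],
                   lambda s: -sum(s), beam=2, steps=4)
```

A wider beam explores more candidates per step at higher search cost, which is why, as reported above, no single beam number wins for every op.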


  

---


### **tinygrad (George Hotz) ▷ #[learn-tinygrad](https://discord.com/channels/1068976834382925865/1070745817025106080/1444117305348063232)** (2 messages): 

> `Flash attention, HIPAllocator` 


- **Tinygrad looks to Flash Attention Speedup**: Members discussed if the goal is for tinygrad to *discover* things like **online softmax/flash attention** or to implement this as a custom kernel in the uop layer to improve **BERT training run**.
   - The goal is for tinygrad with **flash attention** to outperform its normal attention implementation.
- **HIPAllocator Needs Offset**: The community is asking if there is a reason why the `HIPAllocator` in **ops_hip** doesn't provide an `._offset()` which `Buffer.allocate()` requires in the view path (`self._base is not None`).
   - Providing the `._offset()` function would enable more flexible **memory allocation** strategies within the tinygrad framework.
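For context on the first topic: the *online softmax* at the heart of flash attention can be sketched in a few lines of plain Python (illustrative only, not tinygrad code):

```python
import math

def softmax(xs):
    # Reference two-pass softmax with the usual max-subtraction for stability.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def online_softmax(xs):
    # Streaming pass: keep a running max m and denominator d, rescaling d
    # whenever a new max appears -- the bookkeeping flash attention fuses
    # into its attention loop (there, only m and d are carried between tiles).
    m, d = float("-inf"), 0.0
    for x in xs:
        new_m = max(m, x)
        d = d * math.exp(m - new_m) + math.exp(x - new_m)
        m = new_m
    return [math.exp(x - m) / d for x in xs]
```

The rescaling step is what lets the softmax denominator be accumulated tile by tile without ever materializing the full score matrix, which is the property a "discovered" flash attention would need to exploit.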


  

---


### **Latent Space ▷ #[ai-general-chat](https://discord.com/channels/822583790773862470/1075282825051385876/1444099921421799505)** (77 messages🔥🔥): 

> `Google TPUv7, GPT-4.5 rebrand, Embedding AI, Black Forest Labs funding, Gemini catching up` 


- **Google's TPUv7 Challenges CUDA's Dominance**: Discussion around how **Google's TPUv7** and large-scale adoption by companies like **Anthropic** (1GW+ purchase) could challenge **Nvidia's CUDA dominance** in AI training, according to [SemiAnalysis's tweet](https://xcancel.com/SemiAnalysis_/status/1994399887719645532?s=20).
   - Comments explore implications for the **AI hardware market**, vendor lock-in concerns, cost advantages, and whether **TPUs** represent a genuine threat to Nvidia's moat or are primarily Google's internal cost optimization play.
- **GPT-4.5 Allegedly a Re-Brushed Backup**: **Susan Zhang** highlights a buried **OpenAI** readme line showing **GPT-4.5’s** pre-training started >1 year ago (June 2024 cutoff), implying **GPT-5’s** full run failed and **4.5** is a re-brushed backup [as seen in this tweet](https://xcancel.com/suchenzang/status/1994611078190542980?s=46).
- **Black Forest Labs Lands $300M**: **Black Forest Labs** announced a **$300M Series B round** led by **Salesforce Ventures**, celebrating **FLUX's** wide adoption and pledging to double down on research toward visual-intelligence infrastructure [according to their tweet](https://xcancel.com/bfl_ml/status/1995357293064626310?s=20).
- **Gemini Catches ChatGPT in Downloads**: **BuccoCapital** shares charts showing **Gemini** app downloads nearly matching **ChatGPT** while users now spend more time in-app [as they shared on X](https://xcancel.com/buccocapital/status/1995138589819314202).
- **DeepSeek Drops Reasoning-First Models**: **DeepSeek** released two new open-weights models: **V3.2** is the everyday successor to **V3.2-Exp** (App/Web/API), while **V3.2-Speciale** is an API-only powerhouse that rivals **Gemini-3.0-Pro** and scores gold-medal level on **IMO/IOI 2025** as [seen in their post](https://xcancel.com/deepseek_ai/status/1995452641430651132).


  

---


### **Latent Space ▷ #[genmedia-creative-ai](https://discord.com/channels/822583790773862470/1397010677364953149/1444159500826181642)** (21 messages🔥): 

> `Kling AI O1, Stretch-and-drag sculpture illusion, Nano Banana Pro for Vibe Gardening` 


- **Stretch-and-drag Sculpture Breaks the Internet**: Fofr posted a **3-step prompt** to create a **stretch-and-drag sculpture illusion**: split an image, stretch the colors, rotate the view, then show two people carrying the warped wooden piece out of a shop ([original post](https://xcancel.com/fofrai/status/1994459675027218600?s=46)).
   - Replies called it *pure art, AGI-level magic*, and compared it to **Salvador Dalí’s melting clocks**.
- **Kling's O1 Kicks off Omni Creative Engine Launch Week**: **Kling AI** launched its **Omni Launch Week** by unveiling **Kling O1**, a new multimodal creative engine unifying text, image, and video inputs ([original post](https://xcancel.com/Kling_ai/status/1995506929461002590)).
   - They're giving **200 free credits** to users who comment, like, and retweet within 12 hours and **1-month Standard Plan** to 200 random participants.
- **Nano Banana Pro Enables Instant Vibe Gardening**: Designer Willie shared how quickly **Nano Banana Pro** turns a crude **Google-Maps cutout** into an annotated landscape plan ([original post](https://xcancel.com/ReflctWillie/status/1995420755832758568)).
   - Comments ranged from **AI limitations** to **startup pitches**, **pirate-map experiments**, and hopes for mass market refinement of this *vibe gardening* tool.
- **Kling AI's O1 Launches on Freepik Marketplace**: **Kling’s new O1 model** is live on **Freepik** offering **multi-image 360° character/product consistency**, **motion control via reference video**, and **prompt-based video editing** ([original post](https://xcancel.com/martinleblanc/status/1995511763136024734?s=46)).
   - Martin LeBlanc shared a tutorial on turning ordinary footage into fictional characters, and users are praising the fidelity.


  

---


### **Eleuther ▷ #[general](https://discord.com/channels/729741769192767510/729741769738158194/1444068228610396311)** (27 messages🔥): 

> `Reviewer Harassment Prevention, Author-Reviewer Collusion Scrutiny, Gemini 2.5 Hallucinations, Kodekloud Evaluation, Kimi Delta Attention Voice Channel Discussion` 


- **Review Process Debates and Reviewer Protection**: Members discussed the presence of **post-review discussion periods** blind to authors to prevent reviewers from being harassed with requests to increase scores or give positive feedback.
   - The possibility of **author-reviewer collusion** was raised, where authors might threaten reviewers based on access to their identities and comments, affecting score changes.
- **Authors gain access to Reverted Review Revisions**: A member reported that their paper had its **review revisions reverted**, but the revisions remained visible via the "Revisions" link even when logged out.
   - They expressed that even a simple solution such as *reassigning ACs, putting out a statement denouncing using this info, and making no other changes would be significantly better.*
- **Gemini 2.5 Search Shenanigans**: It was observed that **Gemini 2.5** tends to hallucinate search results when the search tool is disabled, showing "reward-hacking-like behavior".
   - A [link](https://aistudio.google.com/app/prompts?state=%7B%22ids%22:%5B%221MAB6pxmpmeR0icmRYZSpfuPT07ofRLWC%22%5D,%22action%22:%22open%22,%22userId%22:%22106370161559484219805%22,%22resourceKeys%22:%7B%7D%7D&usp=sharing) to an example of this behavior was provided.
- **ML Perf Reading Group Tunes into Kimi Delta Attention**: The community discussed **Kimi Delta Attention** in the voice channel for the **ML Perf reading group**, inviting others to participate or listen in.
   - No further details or summary available.


  

---


### **Eleuther ▷ #[research](https://discord.com/channels/729741769192767510/747850033994662000/1444327858888507533)** (34 messages🔥): 

> `Demo Papers, Kimi-Delta Attention, Value Residuals in LLMs, RWKV Architecture` 


- **Debating Demo Paper Requirements**: A newcomer to research asked if a demo paper has to demonstrate a tool they built, sparking a discussion on the [IEEE standard format requirements](https://ieeexplore.ieee.org/Xplore/home.jsp).
   - Experienced members emphasized that the conference's call for demos should be read carefully, while others questioned why anyone would write a paper about someone else's tool.
- **Deep Dive into DeltaNet Attention**: A member questioned whether the **WY representation** and **UT transform** in [Kimi-Delta attention](https://arxiv.org/pdf/2510.26692) are merely algebraic identities to represent the cumulative product of Householder transformation matrices in a blocked, hardware-efficient format.
   - Another member recommended checking out Songlin's blog post series on **Deltanet** ([link to blogpost](https://sustcsonglin.github.io/blog/2024/deltanet-2/)) for more information.
- **Value Residuals in LLMs Spark Debate**: Members discussed the use of **value residuals** in pretrained LLMs and the **F-Lite architecture** ([HuggingFace link](https://huggingface.co/Freepik/F-Lite)), noting that the improvement is marginal and not as significant as the original paper suggested.
   - There was discussion if people trained LLMs with **value residual**, with one member expressing a preference for its application in attention mechanisms despite previous issues with **RWKV-7**.
- **RWKV Architecture Evolution**: A member mentioned that **RWKV v7 models** are upgraded from **v6 checkpoints** and trained further, prompting curiosity about **RWKV v8** and its suffix automaton.
   - It was clarified that the suffix automaton is not a full architecture yet.
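The WY question above can be checked numerically: the cumulative product of generalized Householder factors (I − βₜkₜkₜᵀ) collapses to I − Σₜ wₜkₜᵀ, with each wₜ given by a cheap recurrence, which is the algebraic identity the blocked UT transform batches for hardware efficiency. A minimal NumPy sketch with synthetic keys and betas (not Kimi's actual kernel):

```python
import numpy as np

rng = np.random.default_rng(0)
d, T = 4, 3
ks = rng.standard_normal((T, d))        # per-step keys k_t
betas = rng.uniform(0.1, 0.9, T)        # per-step step sizes beta_t

# Naive cumulative product of generalized Householder factors
P = np.eye(d)
for t in range(T):
    P = P @ (np.eye(d) - betas[t] * np.outer(ks[t], ks[t]))

# WY representation: the same product as I - sum_t outer(w_t, k_t),
# where w_t = beta_t * (k_t - sum_{s<t} (k_s . k_t) * w_s)
W = np.zeros((T, d))
for t in range(T):
    W[t] = betas[t] * (ks[t] - W[:t].T @ (ks[:t] @ ks[t]))
P_wy = np.eye(d) - W.T @ ks

print(np.allclose(P, P_wy))  # the two representations agree
```

The recurrence only touches vectors, so a chunk of steps can be expressed as dense matmuls rather than a sequential product of d×d matrices.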


  

---


### **Eleuther ▷ #[scaling-laws](https://discord.com/channels/729741769192767510/785968841301426216/1444366190117126296)** (15 messages🔥): 

> `Scaling Laws Power Law Structure, Alternative Functional Forms to Power Laws, Nonlinear Metrics and Power Law Scaling` 


- **Diving into Debate on Deep Learning Scaling Laws' Power Law Structure**: In 2023, it was trendy to write papers trying to explain why **scaling laws** had **power law structures**, but a member found the papers *unconvincing*.
   - A member pointed out that new papers still use the power law and suggested that early researchers with physics backgrounds favored this form, adding that [broken power laws](https://arxiv.org/abs/2210.14891) might better model behaviors.
- **Debating Curve Fitting vs. Predictive Power in Scaling Laws**: A member suggested searching for previous discussions about whether **scaling laws** are simply **curve fitting** or can predict future scaled performance.
   - The member linked to [this paper](https://arxiv.org/abs/2304.01910) as the most striking work, suggesting that **nonlinear metrics** might explain the emergence of power law scaling and also linked to [this openreview](https://openreview.net/forum?id=e8eo9iEFaO).
- **Unpacking the Intuition Behind Nonlinear Metrics and Power Law Scaling**: A member explained that if performance on any test example becomes more decorrelated from others in the limit of model performance, this could lead to **power law behavior**.
   - They suggest considering each “subtask” in next token prediction as independent tests, where improving all of them would have a power law “cost”.
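The curve-fitting side of the debate is easy to make concrete: with an irreducible-loss term, L(N) = a·N^(−α) + L∞ becomes linear in log-log space once L∞ is subtracted, which is how most scaling-law fits recover the exponent. A toy sketch on synthetic data (all constants here are made up for illustration):

```python
import numpy as np

# Synthetic scaling law: loss(N) = a * N**-alpha + L_inf
a, alpha, L_inf = 10.0, 0.35, 1.2
N = np.logspace(6, 10, 20)               # parameter/compute budgets
loss = a * N**-alpha + L_inf

# Subtracting the irreducible term makes the law linear in log-log space
slope, intercept = np.polyfit(np.log(N), np.log(loss - L_inf), 1)
alpha_hat, a_hat = -slope, np.exp(intercept)
```

A broken power law replaces this single linear fit with a piecewise (or smoothly interpolated) version, adding the break location as an extra fit parameter.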


  

---


### **Eleuther ▷ #[multimodal-general](https://discord.com/channels/729741769192767510/795089627089862656/)** (1 messages): 

kublaikhan1: Same. I have an idea about this..
  

---


### **Yannick Kilcher ▷ #[general](https://discord.com/channels/714501525455634453/986699377257119794/1444058214407868612)** (47 messages🔥): 

> `SNS review system, ML Engineer roles, Document retrieval from LLMs, AI model copyright` 


- **SNS Review System Faces Scrutiny**: Concerns were raised about the fairness of the SNS review system, with one user suggesting the [bidding system](https://discord.com/channels/714501525455634453/1045297868136779846/1443700218158649456) could allow biased reviewers to reject papers.
   - Others suggested solutions such as removing erratic reviewers or implementing a two-level review system to filter out less worthy papers, also questioning whether *authors status should have zero effect on scores*.
- **Cracking the ML Engineer Career Path**: The role of a Machine Learning Engineer, distinct from researchers, is to **scale up experiments** developed by ML researchers.
   - The discussion suggested gaining experience to *sense gaps* in existing work, eventually leading to original research contributions and that *the hierarchy between ml researcher-ml eng in company is clear enough*.
- **Decoding Document Retrieval Dilemmas from LLMs**: A user asked about extracting documents that an LLM has memorized during training.
   - Another user suggested that if context distillation is performed, **retrieval of the original prompt** might no longer be possible.
- **AI Model Training Copyright Clock Ticks**: In a hypothetical copyright framework, AI model training timelines suggest an optimal copyright term of **1 year for fresh works**.
   - For derivative works, the term could be **2 months** from creation or the end of the base work's term, whichever is later.
- **Ilya Sutskever's Gemini 3 Nuances**: A user shared [Ilya Sutskever's take on Gemini 3](https://fxtwitter.com/ilyasut/status/1994424504370581726?t=Qd_qW1ivpcL-xOke6fquZQ&s=19), highlighting its ability to scale across many axes while acknowledging persistent LLM challenges.
   - That user suggested that *fellows (old professors) in university who retires from researching should be the reviewers*.


  

---


### **Yannick Kilcher ▷ #[paper-discussion](https://discord.com/channels/714501525455634453/1045297868136779846/1444434047068672287)** (7 messages): 

> `Anti-cheat systems, Kernel-level access, League of Legends Challenger, TopKHot attention mechanism, Sparse attention` 


- **Anti-Cheat System Boasts Bonkers Results**: A member vouches for the existence of a *bonkers* anti-cheat system that requires **kernel-level access**, stating that it has been the best at scale in the world for years.
   - He apologizes for initially questioning others' achievements and encourages the community to realize their own capabilities, ending with *stay bonkers my friends*.
- **League Challenger Shares Verification Screenshot**: A member provided a screenshot of game results on a master/GM League of Legends account to verify their identity, highlighting their username and noting a connection to the player **Imaqtpie**.
   - The screenshot allegedly shows ownership of the account with the summoner name, but the member clarifies it isn't *proof of hitting challenger*, and that interpreting it requires knowledge of how **na.op.gg** works.
- **TopKHot Attention Mechanism Shows Promise**: A member investigated whether an attention mechanism works with **softmax + TopK + onehot**, matching **99% of the baseline loss** with a k of 2 and a context length of 64 ([this code](https://openreview.net/forum?id=1b7whO4SfYoh)).
   - The member shares code snippets for **Kattention**, **TopKHot**, and **HardTopKHotBCE** classes in PyTorch, noting that while the initial approach isn't faster, a cheaper alternative exists using hard targets.
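The shared Kattention/TopKHot classes weren't reproduced in the summary, but the core softmax + TopK step is straightforward to sketch: keep each query's k largest logits, mask the rest to −∞, and renormalize. A NumPy stand-in for illustration (not the member's actual PyTorch code):

```python
import numpy as np

def topk_sparse_attention(q, k, v, topk=2):
    """Softmax attention where each query attends only to its top-k keys.

    Sketch of the softmax + TopK idea from the discussion; the onehot/hard-
    target variant would replace the soft weights with one-hot targets.
    """
    scores = q @ k.T / np.sqrt(q.shape[-1])          # (Tq, Tk) logits
    kth = np.sort(scores, axis=-1)[:, -topk][:, None]  # each row's k-th largest
    masked = np.where(scores >= kth, scores, -np.inf)  # drop everything else
    w = np.exp(masked - masked.max(axis=-1, keepdims=True))
    w /= w.sum(axis=-1, keepdims=True)               # renormalize over top-k
    return w @ v, w

rng = np.random.default_rng(0)
q, k, v = (rng.standard_normal((8, 16)) for _ in range(3))
out, w = topk_sparse_attention(q, k, v, topk=2)
```

Each row of `w` has exactly `topk` nonzero entries summing to 1, which is what makes the hard one-hot targets a cheap training signal.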


  

---


### **Yannick Kilcher ▷ #[agents](https://discord.com/channels/714501525455634453/1269724655405498429/1444294952552103938)** (1 messages): 

> `Microsoft 365 AI Agents` 


- **Microsoft trots out 365 AI Agents**: A member mentioned **Microsoft 365** now includes "**AI Agents**", linking to the [Microsoft Agent 365 documentation](https://learn.microsoft.com/en-us/microsoft-agent-365/) and [Microsoft 365 Agents SDK documentation](https://learn.microsoft.com/en-us/microsoft-365/agents-sdk/).


  

---


### **Yannick Kilcher ▷ #[ml-news](https://discord.com/channels/714501525455634453/853983317044756510/1444061092153131008)** (16 messages🔥): 

> `Orchestrator-8B, ICLR reviews, OAI model training` 


- **Nvidia's Orchestrator-8B Gets Overlooked**: Nvidia's **Orchestrator-8B**, an **8B tool calling model**, achieves a **37.1** on HLE, yet has only 2 downloads on Hugging Face ([arxiv link](https://arxiv.org/abs/2511.21689), [huggingface link](https://huggingface.co/nvidia/Orchestrator-8B)).
   - Some theorize that the *Leather Jacket guy team* consistently receives less attention, potentially impacting the model's visibility.
- **ICLR Reviews Allegedly AI Generated**: Many **ICLR reviews** apparently turned out to be **AI generated** ([nature link](https://www.nature.com/articles/d41586-025-03506-6)), immediately following news about review de-anonymization.
   - The news keeps getting worse for the *poor guy*, following the earlier de-anonymization story.
- **OAI Allegedly Struggles with New Model Training**: Rumors suggest OAI hasn't successfully trained a new model from scratch since **GPT-4o**, with **GPT-5.1**'s knowledge cutoff remaining at June 2024 ([X link](https://x.com/suchenzang/status/1994611078190542980)).
   - It's been suggested that newer models were part-trained on top of GPT-4o and GPT-4.5 was discontinued from service because *it was too expensive (8T params MoE/2T params active)*.


  

---


### **Modular (Mojo 🔥) ▷ #[general](https://discord.com/channels/1087530497313357884/1098713601386233997/1444447698672423104)** (21 messages🔥): 

> `Web3 Spam, Circular Import Errors in lightbug_http, Small Time Library, pixi mojo build backend, Cambericon and Huawei GPUs` 


- **Modular Fights Web3 Spam with Contribution Policy**: Due to a recent wave of **web3-related spam**, members are now required to contribute to **Mojo** or **MAX** before inquiring about job opportunities at Modular, with open roles listed [here](https://www.modular.com/company/careers#open-roles).
- **Circular Import Errors Plague Lightbug HTTP**: A member reported seeing possible **circular import errors** in [lightbug_http](https://github.com/Lightbug-HQ/lightbug_http/issues/272), proposing the addition of a trait for **small_time** and suggesting leveraging `__extension` to address the issue.
   - The member created [two PRs](https://github.com/Lightbug-HQ/lightbug_http/pull/274) to fix the issue, using a `Formattable` trait to remove the circular imports.
- **Small Time Library Refactor**: Members discussed refactoring the **small_time** library to address circular import issues, potentially incorporating changes upstream and collaborating with the library's creator.
   - One member suggested updating the lightbug recipe in the modular-community conda channel to reference the small-time package directly, aiming to remove the copied code.
- **Pixi Mojo Build Backend Explored**: A member mentioned that [EmberJson](https://github.com/bgreni/EmberJson/tree/main) has experimented with the **pixi mojo build backend**, although it's unclear if this solves the use case of referencing dependencies from the modular community conda channel.
   - They prefer git clones and building from source, while another member publishes all of their projects to a separate conda channel via CI; most of those projects use the pixi build mojo backend.
- **Cambricon and Huawei GPUs Seek Support**: A member inquired about future support for **Cambricon** and **Huawei GPUs**, especially noting that Cambricon has a tech stack similar to **CUDA**.
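The trait-based fix discussed for the circular imports follows a general pattern: break the cycle by having both modules depend on a small shared interface instead of each other's concrete types. A Python analogue of the idea (names are illustrative, not lightbug_http's actual API):

```python
from typing import Protocol

# Instead of the HTTP module importing small_time's concrete type (and
# vice versa), both depend on a Formattable-style interface, so neither
# module needs to import the other.
class Formattable(Protocol):
    def format(self) -> str: ...

class TimeStamp:                          # would live in the time library
    def __init__(self, seconds: int) -> None:
        self.seconds = seconds
    def format(self) -> str:
        return f"{self.seconds}s"

def write_header(value: Formattable) -> str:
    # The HTTP side only needs the interface, not the concrete type
    return f"Date: {value.format()}"

header = write_header(TimeStamp(42))
```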


  

---


### **Modular (Mojo 🔥) ▷ #[mojo](https://discord.com/channels/1087530497313357884/1151418092052815884/1444965869569572876)** (26 messages🔥): 

> `def keyword removal, var keyword requirement, lexical scoping in Python, Mojo's concurrency model, Data races in parallelize` 


- **Mojo considers dropping `def` keyword**: Some community members are suggesting to **remove the `def` keyword** before Mojo 1.0, citing its primary difference from `fn` being that it always `raises`.
   - The suggestion is to reintroduce `def` later with a proper *pythonic* story and dynamic features.
- **`var` Declarations Debated for Mojo 1.0**: There is a discussion on whether to **require `var` for variable declarations inside `fn`**, to mitigate unintended implicit declaration bugs.
   - While `var` omission is liked by some, others see it as premature optimization and prefer it inside `def` to prevent potential nasty bugs.
- **Mojo's thread safety model WIP**: A user noted that Mojo's `parallelize` function is not resilient to data races and permits multiple threads to access the same variable, leading to inconsistent results.
   - A team member stated that **Mojo's concurrency and thread safety model is still a work in progress (WIP)**, and the current `parallelize` is unsafe, in part, because the team wants to account for sharing data between devices.
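Until that model lands, the usual workaround applies in any language: give each worker its own accumulator slot and reduce after the join, so no two threads ever write the same variable. A Python sketch of the pattern (threads stand in for Mojo's `parallelize` workers):

```python
import threading

def parallel_sum(data, n_workers=4):
    """Race-free parallel reduction: each worker writes only its own slot,
    and the partial results are combined after join(). This avoids the
    shared-accumulator read-modify-write that races under `parallelize`."""
    partials = [0] * n_workers                   # one private slot per worker
    def work(i):
        chunk = data[i::n_workers]               # disjoint strided slices
        partials[i] = sum(chunk)                 # no shared mutable state
    threads = [threading.Thread(target=work, args=(i,)) for i in range(n_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sum(partials)                         # single-threaded reduce

result = parallel_sum(list(range(1000)))
```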


  

---


### **Modular (Mojo 🔥) ▷ #[max](https://discord.com/channels/1087530497313357884/1212827597323509870/1444346625035337808)** (3 messages): 

> `Matmul fallback, RTX5090` 


- **Missing Matmul Fallback Troubles RTX5090 Users**: A member inquired about the absence of a generic **matmul fallback** in the kernels, noting its impact on hosting max serve models with their **RTX5090**.
   - Another member agreed that a generic fallback matmul would be beneficial, particularly for aiding bring-up until a more specialized kernel is available.
- **Generic Matmul Fallback Desired**: The discussion emphasized the potential benefits of having a generic **matmul fallback** in the kernels.
   - It would provide a baseline implementation for new hardware or platforms until optimized kernels are developed, streamlining the bring-up process.
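For reference, a generic fallback is just the naive triple loop, portable to any target at the cost of speed. This Python sketch (lists of floats standing in for device buffers) shows the shape such a baseline kernel would take:

```python
def matmul_fallback(a, b):
    """Naive O(n*k*m) matmul: the kind of generic, correctness-first
    fallback discussed for hardware without a specialized kernel yet."""
    n, k, m = len(a), len(b), len(b[0])
    assert len(a[0]) == k, "inner dimensions must match"
    out = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for p in range(k):          # i-p-j loop order for row-major locality
            aip = a[i][p]
            for j in range(m):
                out[i][j] += aip * b[p][j]
    return out

c = matmul_fallback([[1.0, 2.0], [3.0, 4.0]], [[5.0, 6.0], [7.0, 8.0]])
```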


  

---


### **Manus.im Discord ▷ #[general](https://discord.com/channels/1348819876348825620/1349440650495398020/1444055373131546776)** (27 messages🔥): 

> `Manus Update Issues, Black Friday Sale Opinions, UI Feedback, AI Engineer Introductions, Civil Tone Request` 


- **Manus Update Cripples App, Angering Users**: After the latest update, **Manus** is reportedly functioning only for paying users, with normal chat mode disabled and free points completely removed, leaving an *empty interface with no real use*.
   - Additionally, **Manus** is allegedly failing to build a **Next.js** app for testing pull requests, causing them to be incorrect due to build timeouts.
- **Differing Opinions Emerge on Manus' Black Friday Decision**: While some users *respect the no black friday offer*, seeing it as a testament to **Manus' value**, others felt that **Black Friday** could retain customers, citing **Grok's** successful approach last year.
   - One user stated that the product *is waaaay too cheap compared to what it can do and compared to chat gpt and claude*.
- **User Interface issue surfaces for Referral Codes**: A user complained that the **Manus UI** should prominently state that **referral codes** can only be redeemed by new users.
   - The user exclaimed *wtf are people crying about paying a monthly fee to one of the best LLM's out there*.
- **AI Engineers Promote Expertise**: One user introduced himself as an **AI & full stack engineer** specializing in workflow automation, LLM integration, RAG, AI detection, image and voice AI, and blockchain development, providing several examples of deployed systems, automated pipelines, and task orchestration.
   - Another user, who has *never studied programming*, shared his work on creating an **AI engine in Rust** with the goal of building a *sovereign AI*.
- **Mod Calls for Civil Discourse**: In the face of recent arguments, a moderator requested that users keep a civil tone in their discussions.
   - The moderator followed up with *Thanks for listening 🙏alsalam ealaykum warahmat allah wabarakatu* ("peace be upon you, and God's mercy and blessings").


  

---


### **DSPy ▷ #[show-and-tell](https://discord.com/channels/1161519468141355160/1202371242519441499/1444221259683725362)** (4 messages): 

> `scikit-llm, OpenRouter API` 


- **DSPy wins against scikit-llm, maybe?**: A member asked whether **DSPy** is better than **scikit-llm**, and one of the developers answered *yes, but depends who you ask.*
- **OpenRouter API Configuration Still to Come**: A member pointed out that the **documentation is limited** and the tool *can't configure openrouter api in it yet.*
   - The developer acknowledged that it's *very new*, and that a recent update may address the issue.


  

---


### **DSPy ▷ #[general](https://discord.com/channels/1161519468141355160/1161519469319946286/1444116146457546802)** (6 messages): 

> `Prompt Tuning, GEPA and SIMBA optimizers, AI System Building, End-to-End AI Systems, AI-driven Platforms` 


- **Methods for Prompt Tuning Emerges**: A member highlights that [methods where **LLMs** analyze failure causes to propose improvements](https://arxiv.org/abs/2406.07496) are well-known for their effectiveness in prompt tuning.
- **GEPA and SIMBA Optimizers Examined**: A member inquires about using methods for prompt tuning in DSPy, questioning if **GEPA** and **SIMBA** are the appropriate optimizers for this approach.
- **AI Developer Showcases AI Systems Building Experience**: A **Senior AI Developer** detailed their experience building end-to-end AI systems at scale, highlighting expertise in **machine learning, deep learning, NLP, computer vision, and generative AI** and tools like **PyTorch, TensorFlow, and Hugging Face**.
- **AI Platforms using ChatGPT highlighted**: The AI developer detailed the development of platforms such as an **AI Medical Diagnosis System**, an **AI Video Generation System**, and a **Health & Management Advocacy System**.
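The failure-analysis loop that GEPA/SIMBA-style optimizers automate can be sketched abstractly: score a prompt, collect its failures, ask a proposer (an LLM in DSPy; any callable here) for a revision, and keep it only if it scores better. Everything below is a toy stand-in for illustration, not DSPy's actual API:

```python
def reflective_prompt_search(prompt, examples, evaluate, propose, rounds=3):
    """Greedy failure-driven prompt search: a minimal version of the
    LLM-reflects-on-failures loop referenced above."""
    best, best_score = prompt, evaluate(prompt, examples)
    for _ in range(rounds):
        failures = [ex for ex in examples if evaluate(best, [ex]) < 1.0]
        if not failures:
            break                               # nothing left to fix
        candidate = propose(best, failures)     # an LLM call in practice
        score = evaluate(candidate, examples)
        if score > best_score:                  # accept only improvements
            best, best_score = candidate, score
    return best

# Toy stand-ins: a prompt "covers" an example if it mentions it verbatim
evaluate = lambda p, exs: sum(ex in p for ex in exs) / len(exs)
propose = lambda p, fails: (p + " " + " ".join(fails)).strip()

best = reflective_prompt_search("", ["alpha", "beta"], evaluate, propose)
```

GEPA-style methods add reflection text and Pareto selection on top of this skeleton, but the evaluate-reflect-propose-accept cycle is the shared core.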


  

---


### **aider (Paul Gauthier) ▷ #[general](https://discord.com/channels/1131200896827654144/1131200896827654149/1444287630287179928)** (6 messages): 

> `GPT Provider, Aider Alternatives, Mindlink Models` 


- ****GPT Provider** Offers Free Credits for Models**: A member is offering free credits for their **GPT provider**, supporting models such as **gpt-5-mini**, **gpt-4.1**, **gpt-4o**, **gpt4**, and **gpt-3.t**, including open-source embedding models.
   - The provider is open source, allowing users to clone and modify the code, with the option to contribute back to the community.
- **Members Discuss **Aider Alternatives****: Some members expressed concern about **Aider**'s future, with one asking for alternatives.
   - One member stated that *generating svg images with it is much better. A nicer aesthetic*.
- ****Mindlink Models** Impact Aider's Popularity?**: A member speculated that **Mindlink 32/72B models**, released in August, may have impacted Aider's popularity due to their strong code generation capabilities.
   - However, the model *didn't hold together in multi turn coding iterations so well*.


  

---

