Frozen AI News archive

Cohere's Command A claims #3 open model spot (after DeepSeek and Gemma)

**Cohere's Command A** has solidified its position on the LMArena leaderboard. It is an open-weight **111B**-parameter model with an unusually long **256K context window** and competitive pricing. **Mistral AI** released the lightweight, multilingual, and multimodal **Mistral Small 3.1** model, optimized to run on a single RTX 4090 or a Mac with 32GB of RAM, with strong performance on instruct and multimodal benchmarks. The new OCR model **SmolDocling** offers fast document reading with low VRAM usage, outperforming larger models like Qwen2.5VL. Discussions highlight the importance of system-level improvements over raw LLM advancements, and **MCBench** is recommended as a superior AI benchmark for evaluating model capabilities across code, aesthetics, and awareness.


AI News for 3/14/2025-3/17/2025. We checked 7 subreddits, 433 Twitters and 28 Discords (223 channels, and 9014 messages) for you. Estimated reading time saved (at 200wpm): 990 minutes. You can now tag @smol_ai for AINews discussions!

We briefly mentioned Cohere's Command A launch last week, but the announcement was comparatively light on broadly comparable benchmarks. There were some, but the selective, self-reported comparisons to DeepSeek V3 and GPT-4o couldn't really contextualize Command A among either SOTA open source or overall SOTA-for-size. As a result, it was hard to tell where it would rank in terms of lasting impact.

With today's LMArena result, that is no longer in question:

image.png

As Aidan Gomez points out, Command A actually climbs 2 spots in the rankings with the Style Control modifier (explored on their LS podcast).

Several other subtle points make Command A a particularly attractive candidate to include in one's open-models arsenal, including the unusually long 256k context window, multilingual capabilities, and a focus on optimizing for a 2-H100 serving footprint.

image.png
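To give a rough sense of what a 2-H100 serving footprint implies for a 111B-parameter model, here is a minimal back-of-the-envelope sketch. It is not a description of how Cohere actually serves Command A. The 80 GB per-GPU figure and the bytes-per-parameter table are assumptions for illustration, and the estimate covers weights only, ignoring KV cache, activations, and runtime overhead.

```python
# Back-of-the-envelope weight-memory estimate. All constants except the
# parameter count are illustrative assumptions, not vendor-confirmed figures.
PARAMS = 111e9        # parameter count from the announcement (111B)
GPU_MEMORY_GB = 80    # assumption: 80 GB of memory per H100
NUM_GPUS = 2          # the 2-H100 footprint mentioned above

# Assumption: bytes per parameter for common weight precisions.
BYTES_PER_PARAM = {"16-bit": 2.0, "8-bit": 1.0, "4-bit": 0.5}


def weight_memory_gb(params: float, bytes_per_param: float) -> float:
    """Memory needed for the weights alone, in GB (1 GB = 1e9 bytes)."""
    return params * bytes_per_param / 1e9


def fits(params: float, bytes_per_param: float,
         num_gpus: int = NUM_GPUS, gpu_gb: float = GPU_MEMORY_GB) -> bool:
    """True if the weights alone fit in the combined GPU memory."""
    return weight_memory_gb(params, bytes_per_param) <= num_gpus * gpu_gb


if __name__ == "__main__":
    for name, bpp in BYTES_PER_PARAM.items():
        gb = weight_memory_gb(PARAMS, bpp)
        print(f"{name:>6}: {gb:6.1f} GB of weights, "
              f"fits in {NUM_GPUS}x{GPU_MEMORY_GB} GB: {fits(PARAMS, bpp)}")
```

Under these assumptions, 16-bit weights alone (about 222 GB) would exceed the combined 160 GB, while 8-bit or smaller weights would leave headroom for the long context window. The real deployment may differ in GPU variant, precision, and memory layout.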


{% if medium == 'web' %}

Table of Contents

[TOC]

{% else %}

The Table of Contents and Channel Summaries have been moved to the web version of this email: [{{ email.subject }}]({{ email_url }})!

{% endif %}


AI Twitter Recap

Large Language Models (LLMs) and Model Releases

Model Performance, Benchmarks, and Evaluations

AI Agents, Tool Use, and Applications

AI Safety, Alignment, and Auditing

Meme/Humor


AI Reddit Recap

/r/LocalLlama Recap

Theme 1. Advanced AI Video Generation with SDXL, Wan2.1, and Long Context Tuning

Theme 2. OpenAI's Sora: Transforming Cityscapes into Dystopias

Theme 3. OpenAI and DeepSeek: The Open Source Showdown

Other AI Subreddit Recap

/r/Singularity, /r/Oobabooga, /r/MachineLearning, /r/OpenAI, /r/ClaudeAI, /r/StableDiffusion, /r/ChatGPT, /r/ChatGPTCoding

Theme 1. Criticism of 'Gotcha' tests to determine LLM intelligence

Theme 2. Reactions to Google DeepMind CEO's predictions of AGI in 5-10 years

Theme 3. OpenAI's controversial request to use copyrighted content under U.S. Government consideration

Theme 4. ReCamMaster releases new camera angle changing tool


AI Discord Recap

A summary of Summaries of Summaries by Gemini 2.0 Flash Thinking

Theme 1. Mistral and Google Battle for Small Model Supremacy

Theme 2. Training and Optimization Techniques Get Hot and Heavy

Theme 3. AI Agents and IDEs Vie for Developer Hearts

Theme 4. Hardware Heats Up: AMD APUs and Chinese RTX 4090s Turn Heads

Theme 5. Copyright, Community, and Ethical AI Debates Rage On


PART 1: High level Discord summaries

Unsloth AI (Daniel Han) Discord


Cursor IDE Discord


OpenAI Discord


Nous Research AI Discord


aider (Paul Gauthier) Discord


LM Studio Discord


OpenRouter (Alex Atallah) Discord


Perplexity AI Discord


Yannic Kilcher Discord


HuggingFace Discord


Interconnects (Nathan Lambert) Discord


MCP (Glama) Discord


Latent Space Discord


Notebook LM Discord


GPU MODE Discord


Eleuther Discord


tinygrad (George Hotz) Discord


LlamaIndex Discord


Nomic.ai (GPT4All) Discord


Cohere Discord


DSPy Discord


LLM Agents (Berkeley MOOC) Discord


Modular (Mojo 🔥) Discord


MLOps @Chipro Discord


AI21 Labs (Jamba) Discord


Torchtune Discord


The Codeium (Windsurf) Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


The Gorilla LLM (Berkeley Function Calling) Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


PART 2: Detailed by-Channel summaries and links

{% if medium == 'web' %}

Unsloth AI (Daniel Han) ▷ #general (923 messages🔥🔥🔥):

Gradient steps, Gemma 3 fine tuning, Tokenizer issues, MattBCool's Twitter hack, Unsloth speed

Links mentioned:


Unsloth AI (Daniel Han) ▷ #off-topic (34 messages🔥):

llama-server vision support, RWKV-7 support, Q4 vs Q8, bnb library limitations, QLoRA NF4 quantized weights

Link mentioned: QLoRA Weight Dequantizing in Triton: no description found


Unsloth AI (Daniel Han) ▷ #help (480 messages🔥🔥🔥):

Gemma 3 Finetuning Issues, Unsloth GPU Support, RAG Data Formatting for Unsloth, lora upload issue, text dataset formatting

Links mentioned:


Unsloth AI (Daniel Han) ▷ #showcase (20 messages🔥):

Gemma-3-think model, Qwen 2.5 3B instruct, Gemma-3-27b pruned vocab

Links mentioned:


Unsloth AI (Daniel Han) ▷ #research (18 messages🔥):

Context Length vs. Model Size, Fine-tuning and Hosting Alternatives to Unsloth, Continued Pre-training and Tokenizer Updates, LLM Scoring on the Political Spectrum, Legal Q&A with Tree-Based Retrieval

Links mentioned:


Cursor IDE ▷ #general (1100 messages🔥🔥🔥):

Cursor vs Windsurf, Claude 3.7 pricing, Linux better than Windows for MCP and dev, vibe coding

Links mentioned:


OpenAI ▷ #ai-discussions (694 messages🔥🔥🔥):

AI Mastery Debate, AI Replacing Humans, Gemini Image Generation, AI-driven OS, LLMs for Finance

Links mentioned:


OpenAI ▷ #gpt-4-discussions (9 messages🔥):

Loveable, Bolt.new, Image-to-code, GPT PRO issues, Deep Research Limit


OpenAI ▷ #prompt-engineering (61 messages🔥🔥):

GPT-4o impressions, AI Self-Reflection, AI Team of Experts, Business Guidance with AI, AI personalities


OpenAI ▷ #api-discussions (61 messages🔥🔥):

GPT-4o usage, Custom GPT improvements, AI self-reflection, AI personalities, AI expert teams for business


Nous Research AI ▷ #general (729 messages🔥🔥🔥):

Scalable AI, Mixture of Experts, Mistral Small 3.1, LLM Copyright issues, LLM Training

Links mentioned:


Nous Research AI ▷ #ask-about-llms (1 message):

john0galt: Pretty impressive


Nous Research AI ▷ #research-papers (5 messages):

Curse of Depth in LLMs, LayerNorm Scaling, LLMs competing in text-only games, Differentiable Hebbian Consolidation Model

Links mentioned:


Nous Research AI ▷ #interesting-links (21 messages🔥):

Acoustic STS Model, Tool-Integrated Reasoning, Gemma Abliterated

Links mentioned:


Nous Research AI ▷ #research-papers (5 messages):

Curse of Depth, LayerNorm Scaling, LLM Text-Based Game Competition, Differentiable Hebbian Consolidation model

Links mentioned:


aider (Paul Gauthier) ▷ #general (691 messages🔥🔥🔥):

Aider screen recordings, Claude 3.7 Sonnet issues, MCP server value, Baidu ERNIE 4.5 & X1 Models, Aider Custom Commands

Links mentioned:


aider (Paul Gauthier) ▷ #questions-and-tips (74 messages🔥🔥):

aider with agents.json, Sluggish v0.77.0, AWS Bedrock Claude 3.7 sonnet error, deepseek r1 slow, learn an API with aider

Links mentioned:


aider (Paul Gauthier) ▷ #links (25 messages🔥):

Refact.ai Leaderboard Claim, Claude Harmony Feature, Qwen Models Hype

Links mentioned:


LM Studio ▷ #general (458 messages🔥🔥🔥):

GPU support on Llama.cpp, GPU Upgrade Recommendations, Parallel inference Possibilities, OCR Model Recommendation for Mac M3, Gemma 3

Links mentioned:


LM Studio ▷ #hardware-discussion (197 messages🔥🔥):

RTX 8000 vs A6000 for LLM inference, Multiple GPUs for running multiple LLMs, 48GB RTX 4090 from China, AMD Strix Halo APU vs RTX 5080 in AI, Mobo/RAM Choice for AI PC build

Links mentioned:


OpenRouter (Alex Atallah) ▷ #announcements (2 messages):

Anthropic Incident, Claude 3.7 Sonnet, Endpoint Quality Measurement

Link mentioned: Elevated errors for requests to Claude 3.7 Sonnet: no description found


OpenRouter (Alex Atallah) ▷ #app-showcase (4 messages):

Personality.gg Launch, RP Sites and OpenRouter API, Chub and Sillytavern Recommendation

Links mentioned:


OpenRouter (Alex Atallah) ▷ #general (443 messages🔥🔥🔥):

Gemma 3, RP models, Mistral Small 3.1, OpenRouter OpenAPI spec, Reasoning Tokens

Links mentioned:


OpenRouter (Alex Atallah) ▷ #beta-feedback (1 message):

eofr: Scam


Perplexity AI ▷ #announcements (1 message):

Perplexity Accuracy, Perplexity Video Ad


Perplexity AI ▷ #general (409 messages🔥🔥🔥):

Perplexity Pro Oyster Game, Discord Pro Role, Gemini 2 Flash Context, Claude 3.7 Sonnet Limit, AI Coding Models

Links mentioned:


Perplexity AI ▷ #sharing (32 messages🔥):

Quantum Chip, Willow, Vibe Coding, Lunar Lander, Dark Matter


Perplexity AI ▷ #pplx-api (5 messages):

Transferring Credits, API Pay-as-you-go Limits, Sonar Reasoning Pro Limits, French Translation


Yannic Kilcher ▷ #general (356 messages🔥🔥):

Rust Community Toxicity, C vs C++, Optimization vs Search, Stochastic Differential Equations

Links mentioned:


Yannic Kilcher ▷ #paper-discussion (4 messages):

LLM Literature Review, Gemma 3 Model

Link mentioned: 🥇Top AI Papers of the Week: The Top AI Papers of the Week (Mar 10 - 16)


Yannic Kilcher ▷ #ml-news (59 messages🔥🔥):

AI Safety Institute ideological bias, Deepseek R2 release and its issues, SesameAILabs CSM model disappointment, Hallucination in AI search engines, Mistral Small 3.1 release

Links mentioned:


HuggingFace ▷ #announcements (1 message):

SmolVLM2, Gradio Sketch 2.0, DCLM-Edu Dataset, huggingface.js GGUF metadata, Robot Arms for $299

Links mentioned:


HuggingFace ▷ #general (141 messages🔥🔥):

Dou Shou Qi AI, Stable Diffusion Model, CSM Streaming Generator, Gemini 2.0 Flash Experimental, Hunyuan 3D-2 API

Links mentioned:


HuggingFace ▷ #today-im-learning (3 messages):

ML for 3D, HuggingFace Agents course, Retrieval agent


HuggingFace ▷ #cool-finds (2 messages):

Cross-posting


HuggingFace ▷ #i-made-this (5 messages):

Awesome Vibe Coding, Local LLMs setup, FluxHands-FingerCount Dataset

Links mentioned:


HuggingFace ▷ #reading-group (1 message):

coldbreeze.: Free fire


HuggingFace ▷ #computer-vision (4 messages):

Autonomous Driving blogpost, VLMs Research Hub, HF DETR model, Meta's Segment Anything Model (SAM)

Link mentioned: GitHub - thubZ09/vision-language-model-hub: Hub for researchers exploring VLMs and Multimodal Learning:)


HuggingFace ▷ #NLP (2 messages):

SetFit with LoRA, SmolLM as teacher model


HuggingFace ▷ #smol-course (3 messages):

smol-course, HuggingFace Agents course, HF inference credits


HuggingFace ▷ #agents-course (134 messages🔥🔥):

Agentic AI team building, Smolagents and Gemma3 issues, Ollama Context Length, HF Course Verification problems, MCP and Smolagent framework

Links mentioned:


HuggingFace ▷ #open-r1 (2 messages):

Open-R1 Reasoning Distillation, grpo code, distributed grpo


Interconnects (Nathan Lambert) ▷ #news (74 messages🔥🔥):

Long Context Evals, 3D Generation Upgrade, DeepSeek Engineer passports, Figure's BotQ humanoid robots, Nvidia Blackwell GPUs and Together AI

Links mentioned:


Interconnects (Nathan Lambert) ▷ #ml-questions (7 messages):

R1 inference costs, Deepseek free service, Hosting models locally, Fireworks alternative


Interconnects (Nathan Lambert) ▷ #ml-drama (32 messages🔥):

OpenAI vs Elon Musk legal battle, Zochi AI Scientist, ICLR conference spam, AI reviewers, Liam Fedus leaving OpenAI

Links mentioned:


Interconnects (Nathan Lambert) ▷ #random (101 messages🔥🔥):

Claude Code Vim mode, Gemma 3 License, Deepseek integrated in Chinese food delivery, LLMs as copy editors, Free Speech Eval

Links mentioned:


Interconnects (Nathan Lambert) ▷ #memes (3 messages):

Azure AI Agents API vs OpenAi Assistants API, Mistral Meow

Links mentioned:


Interconnects (Nathan Lambert) ▷ #rl (39 messages🔥):

GRPO implementation trick, Applying KL penalty in the loss, DAPO algorithm, Zero-shot RL

Links mentioned:


Interconnects (Nathan Lambert) ▷ #reads (4 messages):

Noam Chomsky, Nicholas Carlini, Future of LLMs, AI risks

Link mentioned: My Thoughts on the Future of "AI": no description found


Interconnects (Nathan Lambert) ▷ #expensive-queries (36 messages🔥):

RLHF Book, Claude Code vs ChatGPT, Chorus writing checker, ChatGPT Deep Research for teaching websites

Links mentioned:


MCP (Glama) ▷ #general (224 messages🔥🔥):

Swarm vs Mesh vs Sequence for multi-agent systems, OpenSwarm and OpenAI-agents, mycoder.ai vs claude-code, Monetizing MCP services, Glama scans

Links mentioned:


MCP (Glama) ▷ #showcase (25 messages🔥):

Awesome Vibe Coding, Roo Code MCP, MacOS Control MCP, Secretary MCP, Professional Graph MCP

Links mentioned:


Latent Space ▷ #ai-general-chat (34 messages🔥):

Agentic systems multi-threading, Claude's Birthday, GPT-o1 Acing Math Exams, SAE Bench Release, Baidu ERNIE 4.5 & X1

Links mentioned:


Latent Space ▷ #ai-announcements (5 messages):

Snipd Podcast, AI Podcast App, Outdoor Podcast, Tech Stack, Switching from Finance to Tech

Link mentioned: Tweet from Latent.Space (@latentspacepod): 🆕 Snipd: The AI Podcast App for Learning https://youtu.be/FNRO_SYx68Q Our first ever OUTDOOR podcast! @swyx and @KevinBenSmith chat about @aidotengineer NYC, switching from Finance to Tech, how AI can ...


Latent Space ▷ #ai-in-action-club (122 messages🔥🔥):

Claude 3.5 vs 3.7, Vibe Coding, Levelsio Flight Simulator, Auto Git Commits, Enterprise AI Dev Team Enablement

Links mentioned:


Notebook LM ▷ #use-cases (27 messages🔥):

Gemini-integrated Android, Deepseek R1 Impact, Audio Overview Length, NotebookLM Use Cases, Hyperbolic Tapering Schedule

Link mentioned: Google Agentspace: Google Agentspace is the launch point for enterprise-ready AI agents, helping increase employee productivity for complex tasks with one single prompt.


Notebook LM ▷ #general (132 messages🔥🔥):

Extracting Google Sheets for LM, Gemini for data analysis, Public sharing of NotebookLM, Using NotebookLM to prevent errors, NotebookLM limitations and solutions

Links mentioned:


GPU MODE ▷ #general (6 messages):

Jake Cannell hiring GPU devs, sm90 kernels, GPU performance counters, nebius.ai, Datacrunch


GPU MODE ▷ #triton (13 messages🔥):

Embedded Python Pip Usage, Triton Windows PyPI Release, tl.multiple_of usage in Triton, Efficient Pointer Chasing in Triton, Triton and Sparse Computations


GPU MODE ▷ #cuda (5 messages):

SASS compatibility with NVIDIA architectures, LD/ST unit sharing in SM microarchitecture, L1-dTLB cache, Cutlass 4.0 Python DSL, CUDA streams concurrency issues

Links mentioned:


GPU MODE ▷ #torch (13 messages🔥):

Torch Compile, Graph Breaks, Stride Issue, Std::variants in schemas

Link mentioned: pytorch/aten/src/ATen/core/op_registration/README.md at c7c3e7732443d7994303499bcb01781c9d59ab58 · pytorch/pytorch: Tensors and Dynamic neural networks in Python with strong GPU acceleration - pytorch/pytorch


GPU MODE ▷ #announcements (1 message):

Consumer GPU Performance, AGI, Neuromorphic hardware, vast.ai


GPU MODE ▷ #algorithms (6 messages):

Transformers without Normalization, LayerNorm, tanh, FA3, exp


GPU MODE ▷ #jobs (4 messages):

GPU Code Generation, ML Compiler, HPC Engineers, Superalignment Framework

Link mentioned: Your connected workspace for wiki, docs & projects | Notion: A new tool that blends your everyday work apps into one. It's the all-in-one workspace for you and your team


GPU MODE ▷ #beginner (9 messages🔥):

GPU coalesced access, Nvidia GPU read operation, GPU programming, CUDA learning resources, Installing Triton

Links mentioned:


GPU MODE ▷ #pmpp-book (1 message):

CUDA kernel, pytorch extension


GPU MODE ▷ #off-topic (3 messages):

AI Agents Hackathon, NVIDIA GTC 2025, SLURM based HPC cluster IDE/Editor

Link mentioned: AI Agents Hackathon - GTC 2025 Edition (1 DAY) · Luma: AI Agents Hackathon - GTC 2025 Edition (1 DAY) As NVIDIA GTC 2025 unites the global AI community, Vertex Ventures US and CreatorsCorner, invite you to turns…


GPU MODE ▷ #irl-meetup (11 messages🔥):

Block Sparse Attention, GEMM, GTC Keynote Missed, GTC Hackathon results, GTC Meetup


GPU MODE ▷ #rocm (5 messages):

MI300X inference optimization, AMD Instinct MI300X workload optimization, DeepSeek-R1 on MI300X, SGLang Optimization

Links mentioned:


GPU MODE ▷ #tilelang (1 message):

leiwang1999_53585: worked on my h100, maybe you should install nightly wheel🤣


GPU MODE ▷ #self-promotion (5 messages):

GTC CUDA, Wen-mei Hwu GTC, Pruna AI Efficiency Framework, Ruff and UV for project management

Links mentioned:


GPU MODE ▷ #🍿 (5 messages):

Distributed Training, Scaling Laws for DiLoCo, GPU kernel modifications

Link mentioned: Tweet from Zachary Charles (@MatharyCharles): We just put out a key step for making distributed training work at larger and larger models: Scaling Laws for DiLoCo TL;DR: We can do LLM training across datacenters in a way that scales incredibly wel...


GPU MODE ▷ #reasoning-gym (10 messages🔥):

Reasoning Gym, nano-R1 Project, Temporal Clue, Group Relative Policy Optimization (GRPO)

Links mentioned:


GPU MODE ▷ #active-leaderboards (1 message):

Xavier Init, User ID Issue


GPU MODE ▷ #general (15 messages🔥):

pip install in popcorn, Looking for GTC 2025 Ticket, Free B200 access, AMD Support coming


GPU MODE ▷ #submissions (29 messages🔥):

Leaderboard Submissions, Benchmark Submissions, Test Submissions, Modal Runners


GPU MODE ▷ #status (1 message):

Leaderboard cleanup, Robust Evaluation


GPU MODE ▷ #hardware (1 message):

NVIDIA thermal ranges, Arithmetic and Memory Bandwidth Degradation


Eleuther ▷ #general (10 messages🔥):

SMILES string encoding, Stereoisomer Generation, Free GPU Platforms, Managed Inference APIs, EleutherAI welcomes Catherine Arnett


Eleuther ▷ #research (46 messages🔥):

Block Diffusion, Globally Shared Experts, Mixture-of-Experts Universal Transformers, Tan et al.'s SUT paper, Visual Geometry Group (VGGT)

Links mentioned:


Eleuther ▷ #lm-thunderdome (22 messages🔥):

Fewshot Split Fallback, Gen Kwargs to JSON, Old vs New LLM Leaderboard

Link mentioned: lm-evaluation-harness/lm_eval/tasks/benchmarks/openllm.yaml at main · EleutherAI/lm-evaluation-harness: A framework for few-shot evaluation of language models. - EleutherAI/lm-evaluation-harness


tinygrad (George Hotz) ▷ #general (30 messages🔥):

SDXL benchmarks, tensor cat speed, parallel BLAKE3, WebGPU integration, Bitonic Sort indices

Link mentioned: Tweet from vincent (@t0kenl1mit): Tried using compare for @tinygrad tensor cat but still its slow. Attached are my whiteboard thoughts on it. I think I might have to fight ELF and link in some custom C but it might be something el...


tinygrad (George Hotz) ▷ #learn-tinygrad (5 messages):

Print Debugging Tinygrad, Lazy Computation and Gradients, Reproducer Code for Debugging, Multiline Code Blocks

Link mentioned: gsoc_2025/ML4SCI/task1 at main · kayo09/gsoc_2025: GSOC 2025! Happy Coding! ☀️. Contribute to kayo09/gsoc_2025 development by creating an account on GitHub.


LlamaIndex ▷ #blog (2 messages):

Agentic Reasoning System, Corrective RAG, LlamaExtract Public Beta


LlamaIndex ▷ #general (31 messages🔥):

AI Agents Hackathon, Vertex Ventures US, CreatorsCorner, gguf fine tuning, LlamaIndex vs Pydantic AI

Links mentioned:


LlamaIndex ▷ #ai-discussion (1 message):

Vision-Language Models (VLMs), Multimodal Learning, GitHub Research Hub

Link mentioned: GitHub - thubZ09/vision-language-model-hub: Hub for researchers exploring VLMs and Multimodal Learning:)


Nomic.ai (GPT4All) ▷ #general (29 messages🔥):

Gemma 3 Integration in GPT4All, LocalDocs Crashing Fix, Gemma 3 Language Comprehension, Model license agreements

Link mentioned: Gemma 3 support · Issue #3540 · nomic-ai/gpt4all: System Info I installed GPT4All, opened it, downloaded the Gemma3 Instruct for hugging face (tried two models https://huggingface.co/Mungert/gemma-3-12b-it-gguf https://huggingface.co/ggml-org/gemm...


Cohere ▷ #「💬」general (20 messages🔥):

Fine-tuning for Command A, Azure Cohere Rerank v3 Terraform, Support Channel for New Models, Channel for Private Deployments of CMD A


Cohere ▷ #【📣】announcements (1 message):

Command A, Developer Office Hours, Enterprise-friendly features, Hardware vs performance


Cohere ▷ #「🔌」api-discussions (3 messages):

Cohere Command A, Vercel SDK integration, Object generation support, Cohere API versioning

Link mentioned: Cohere: Learn how to use the Cohere provider for the AI SDK.


Cohere ▷ #「🤖」bot-cmd (1 message):

.paolo16: Hello


Cohere ▷ #「🤝」introductions (3 messages):

Introductions, Freelance programmers, Community Assistance


DSPy ▷ #general (13 messages🔥):

dspy/MCP Integration, DSPy Assertions / Suggestions removal, DSPy 2.6 Output Refinement, QdrantRM removal in 2.6

Links mentioned:


LLM Agents (Berkeley MOOC) ▷ #mooc-announcements (1 message):

Caiming Xiong, Multimodal Agents, Vision-Language-Action Alignment, OSWorld, AgentTrek


LLM Agents (Berkeley MOOC) ▷ #mooc-questions (7 messages):

Advanced LLM agent course enrollment, Course certification


LLM Agents (Berkeley MOOC) ▷ #mooc-lecture-discussion (4 messages):

Self-reflection and self-refinement in LLMs, System prompts and LLM behavior


Modular (Mojo 🔥) ▷ #general (5 messages):

Modular AI Art, Discord Spam


Modular (Mojo 🔥) ▷ #mojo (6 messages):

Compact Dict, SIMD, stdlib Dict

Link mentioned: GitHub - mzaks/compact-dict: A fast and compact Dict implementation in Mojo 🔥


MLOps @Chipro ▷ #events (2 messages):

AI4Legislation Competition, AI Demo Jam, Silicon Valley Chinese Association Foundation, Dnipro VC, Data Phoenix

Links mentioned:


MLOps @Chipro ▷ #general-ml (2 messages):

AI4Legislation competition, object detection in MRI

Link mentioned: March AI4Legislation Seminar RSVP: Thank you for your interest in SVCAF's AI4Legislation seminar! Silicon Valley Chinese Association Foundation (incorporated in 2015) is holding a competition this summer to develop open-source AI-dr...


AI21 Labs (Jamba) ▷ #jamba (2 messages):

Qdrant


AI21 Labs (Jamba) ▷ #general-chat (2 messages):

API Feature Requests, Repetition Penalty


Torchtune ▷ #general (1 message):

yamashi: https://mistral.ai/news/mistral-small-3-1


Torchtune ▷ #papers (2 messages):

Learnable Scalars, Mitigating Issues in Models, Model Convergence

Link mentioned: Transformers without Normalization | alphaXiv: View 1 comments: Awesome work! Transformers without Normalization podcast



{% else %}

The full channel by channel breakdowns have been truncated for email.

If you want the full breakdown, please visit the web version of this email: [{{ email.subject }}]({{ email_url }})!

If you enjoyed AINews, please share with a friend! Thanks in advance!

{% endif %}