Frozen AI News archive

Project Stargate: $500b datacenter (1.7% of US GDP) and Gemini 2 Flash Thinking 2

**Project Stargate**, a US "AI Manhattan project" led by **OpenAI** and **Softbank** and supported by **Oracle**, **Arm**, **Microsoft**, and **NVIDIA**, was announced at **$500B**; for scale, the original Manhattan Project cost **$35B inflation adjusted**. Microsoft's role as OpenAI's exclusive compute partner is notably reduced. The project is serious, but it is not news readers can act on today. Meanwhile, **Noam Shazeer** announced a second major update to **Gemini 2.0 Flash Thinking**, with a **1M token long context** that is usable immediately. Additionally, **AI Studio** introduced a new **code interpreter** feature. On Reddit, a **DeepSeek R1** distillation based on **Qwen 32B** was released for free on **HuggingChat**, sparking discussions on self-hosting, performance issues, and quantization techniques. DeepSeek's CEO **Liang Wenfeng** highlighted the company's focus on **fundamental AGI research**, its efficient **MLA architecture**, and its commitment to **open-source development** despite export restrictions, positioning DeepSeek as a potential alternative to closed-source AI trends.

AI News for 1/20/2025-1/21/2025. We checked 7 subreddits, 433 Twitters and 34 Discords (225 channels, and 4353 messages) for you. Estimated reading time saved (at 200wpm): 450 minutes. You can now tag @smol_ai for AINews discussions!

Days like these are a conundrum. On one hand, the obvious, earth-shattering news is the announcement of Project Stargate, a US "AI Manhattan project" led by OpenAI and Softbank and supported by Oracle, MGX, Arm, Microsoft, and NVIDIA. For scale, the actual Manhattan Project cost $35B inflation adjusted.
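
To make the scale comparison concrete, here is a minimal back-of-the-envelope sketch using only the figures quoted in this issue: the $500B headline number, the $35B inflation-adjusted Manhattan Project cost, and the "1.7% of US GDP" share from the title. The function and variable names are ours, purely for illustration, and the GDP value is only what those two headline numbers imply, not an official statistic.

```python
# Figures quoted in this issue, in billions of USD.
STARGATE_USD_B = 500.0         # headline Stargate figure
MANHATTAN_USD_B = 35.0         # Manhattan Project cost, inflation adjusted (as quoted)
STARGATE_SHARE_OF_GDP = 0.017  # "1.7% of US GDP" from the issue title


def scale_ratio(project_usd_b: float, baseline_usd_b: float) -> float:
    """Return how many times larger the project is than the baseline."""
    return project_usd_b / baseline_usd_b


def implied_gdp_usd_b(project_usd_b: float, share_of_gdp: float) -> float:
    """Back out the GDP figure implied by a project cost and its GDP share."""
    return project_usd_b / share_of_gdp


ratio = scale_ratio(STARGATE_USD_B, MANHATTAN_USD_B)
gdp_usd_b = implied_gdp_usd_b(STARGATE_USD_B, STARGATE_SHARE_OF_GDP)

print(f"Stargate vs. Manhattan Project: about {ratio:.1f}x")  # about 14.3x
print(f"Implied US GDP: about ${gdp_usd_b / 1000:.1f}T")      # about $29.4T
```

In other words, the quoted numbers put Stargate at roughly fourteen times the inflation-adjusted cost of the original Manhattan Project.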

Although the project has been rumored for about a year, Microsoft's reduced role stands out: it is no longer positioned as the exclusive compute partner to OpenAI. As with any splashy PR stunt, one should beware AI-washing, but the project is very serious and should be treated as such.

However, it's not really news you can use today, which is what we aim to provide here at your local AI newspaper.

Fortunately, Noam Shazeer has you covered with a second Gemini 2.0 Flash Thinking release: another big leap over 2.0 Flash, plus a 1M-token long context that you can use today (we will enable it in AINews and Smol Talk tomorrow).

AI Studio also got a code interpreter.


{% if medium == 'web' %}

Table of Contents

[TOC]

{% else %}

The Table of Contents and Channel Summaries have been moved to the web version of this email: [{{ email.subject }}]({{ email_url }})!

{% endif %}


AI Twitter Recap

all recaps done by Claude 3.5 Sonnet, best of 4 runs.

TO BE COMPLETED


AI Reddit Recap

/r/LocalLlama Recap

Theme 1. DeepSeek R1: Release, Performance, and Strategic Vision

Theme 2. New DeepSeek R1 Tooling Enhances Usability and Speed

Theme 3. Comparison of DeepSeek R1 Efficiency and Performance to Competitors

Theme 4. Criticism of 'Gotcha' Tests in LLMs and Competitive Context

Other AI Subreddit Recap

/r/Singularity, /r/Oobabooga, /r/MachineLearning, /r/OpenAI, /r/ClaudeAI, /r/StableDiffusion, /r/ChatGPT

Theme 1. OpenAI Investment $500B: Partnership with Oracle and Softbank

Theme 2. OpenAI's New Model Operators

Theme 3. Anthropic's ASI Prediction: Implications of 2-3 Year Timeline


AI Discord Recap

A summary of Summaries of Summaries by o1-preview-2024-09-12

Theme 1. DeepSeek R1 Rocks the AI World

Theme 2. OpenAI's Stargate Project Shoots for the Moon

Theme 3. New Models and Techniques Push Boundaries

Theme 4. Users Battle Bugs and Limitations in AI Tools

Theme 5. AI's Expanding Role in Creative and Technical Fields


PART 1: High level Discord summaries

Unsloth AI (Daniel Han) Discord


Cursor IDE Discord


Codeium (Windsurf) Discord


aider (Paul Gauthier) Discord


LM Studio Discord


Nous Research AI Discord


Stackblitz (Bolt.new) Discord


Perplexity AI Discord


Interconnects (Nathan Lambert) Discord


MCP (Glama) Discord


OpenRouter (Alex Atallah) Discord


Cohere Discord


Notebook LM Discord Discord


Stability.ai (Stable Diffusion) Discord


GPU MODE Discord


Eleuther Discord


Latent Space Discord


OpenAI Discord


Yannick Kilcher Discord


Modular (Mojo 🔥) Discord


LlamaIndex Discord


Nomic.ai (GPT4All) Discord


LAION Discord


LLM Agents (Berkeley MOOC) Discord


tinygrad (George Hotz) Discord


Torchtune Discord


DSPy Discord


Mozilla AI Discord


AI21 Labs (Jamba) Discord


The MLOps @Chipro Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


The Axolotl AI Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


The OpenInterpreter Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


The HuggingFace Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


The Gorilla LLM (Berkeley Function Calling) Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


PART 2: Detailed by-Channel summaries and links

{% if medium == 'web' %}

Unsloth AI (Daniel Han) ▷ #general (652 messages🔥🔥🔥):

DeepSeek-R1 model limitations, Fine-tuning strategies for classification tasks, Handling model checkpoints, Model tokenization and embeddings, Challenges in using Unsloth notebooks

Links mentioned:


Unsloth AI (Daniel Han) ▷ #off-topic (1 messages):

Unsloth training, Fine-tuning LLMs, Weights & Biases integration, vLLM for model serving

Link mentioned: Fine-Tuning Llama-3.1-8B for Function Calling using LoRA: Leveraging Unsloth for fine-tuning with Weights & Biases integration for monitoring and vLLM for model serving


Unsloth AI (Daniel Han) ▷ #help (56 messages🔥🔥):

Fine-tuning models, Using Unsloth with different datasets, Models compatibility, Training on reasoning tasks, Handling CUDA memory issues

Links mentioned:


Unsloth AI (Daniel Han) ▷ #showcase (12 messages🔥):

OpenWebUI integration, Synthetic datasets, Free/Open-source solutions, Colab script testing

Link mentioned: Google Colab: no description found


Unsloth AI (Daniel Han) ▷ #research (169 messages🔥🔥):

Chinchilla Optimal training, Synthetic data in AI training, Emotional tracking in AI, Grokking in language models, 3D modeling vs text in AI applications

Links mentioned:


Cursor IDE ▷ #general (467 messages🔥🔥🔥):

DeepSeek R1 integration, Cursor 0.45 updates, OpenAI Stargate Project, AI competition, Claude 3.5 performance

Links mentioned:


Codeium (Windsurf) ▷ #discussion (49 messages🔥):

Windsurf performance issues, DeepSeek model comparisons, Error troubleshooting, Codeium features and requests, User experiences with tools

Links mentioned:


Codeium (Windsurf) ▷ #windsurf (351 messages🔥🔥):

Windsurf performance issues, DeepSeek integration, Flow Actions limit, Quality of suggestions, Bug reporting and troubleshooting

Links mentioned:


aider (Paul Gauthier) ▷ #announcements (1 messages):

Aider v0.72.0 Release, DeepSeek R1 Support, Kotlin Syntax Support, File Handling Enhancements, Bugfixes and Improvements


aider (Paul Gauthier) ▷ #general (297 messages🔥🔥):

DeepSeek R1 performance, Comparison of AI models, OpenAI subscription discussions, Hardware pricing and availability, Data usage and privacy concerns

Links mentioned:


aider (Paul Gauthier) ▷ #questions-and-tips (89 messages🔥🔥):

Using Aider with Sonnet, Updating Aider Versions, Error Handling in Aider, DeepSeek Model Comparisons, Refactoring Python Codebases

Links mentioned:


aider (Paul Gauthier) ▷ #links (1 messages):

Deepseek R1, Live coding experience, Space Invaders game upgrade

Link mentioned: Space Invaders with Deepseek R1 and Aider in Architect mode.: The new R1 model from Deepseek is second only to 01 from OpenAI on the Aider LLM leaderboard. Plus it's a fraction of the cost.Here I test out its capabiliti...


LM Studio ▷ #general (342 messages🔥🔥):

DeepSeek R1 Models, Mathematics Tutoring, Local Model Deployment, OpenAI Compatibility, Community Support for AI

Links mentioned:


LM Studio ▷ #hardware-discussion (31 messages🔥):

AI/ML Linux Box, NVIDIA DIGITS and Compatibility, DIGITS Cost and Performance, DGX OS Insights, GPU Cooling Issues

Link mentioned: NVIDIA DIGITS - NVIDIA Docs: no description found


Nous Research AI ▷ #general (251 messages🔥🔥):

Crypto Discussions in AI Discord, DeepSeek-R1 Distill Model Insights, Challenges with Local Implementation of Smolagents, AI and Reward Functions in Reinforcement Learning, Intel Acquisition Rumors

Links mentioned:


Nous Research AI ▷ #ask-about-llms (8 messages🔥):

DeepSeek-R1 Feedback, Mechanistic Interpretation of Models


Nous Research AI ▷ #research-papers (6 messages):

Mind Evolution, SleepNet and DreamNet models, Deep Learning Algorithm Inspired by Adjacent Possible, Intrinsic Motivation in AI

Links mentioned:


Nous Research AI ▷ #interesting-links (11 messages🔥):

Liquid AI's LFM-7B model, Automated architecture search, Mistral's new models, Importance of business models in AI, Neural architecture search techniques

Links mentioned:


Nous Research AI ▷ #research-papers (6 messages):

Mind Evolution for LLMs, SleepNet and DreamNet Models, Adjacency in Deep Learning, Dreaming in AI, IMOL Workshop Highlights

Links mentioned:


Stackblitz (Bolt.new) ▷ #announcements (2 messages):

Bolt New Configuration Update, Improvement in Setup Accuracy, Enhancements to Code Inclusion

Links mentioned:


Stackblitz (Bolt.new) ▷ #prompting (4 messages):

Prismic CMS Integration, Mobile Web-App Development, Firebase vs Supabase, Netlify Page Routing Issues


Stackblitz (Bolt.new) ▷ #discussions (171 messages🔥🔥):

Token Management Issues, Connecting Stripe, Project Migration Between Accounts, Next.js and Bolt Compatibility, Public vs Private Projects

Links mentioned:


Perplexity AI ▷ #announcements (1 messages):

Sonar API, Generative search capabilities, Benchmark performance, Data security, Affordable pricing

Link mentioned: Sonar by Perplexity: Build with the best AI answer engine API, created by Perplexity. Power your products with the fastest, cheapest offering out there with search grounding. Delivering unparalleled real-time, web-wide re...


Perplexity AI ▷ #general (157 messages🔥🔥):

CloudBank interest rates, Perplexity Pro issues, DeepSeek and O1 model, Claude Opus retirement, API performance and web searches

Links mentioned:


Perplexity AI ▷ #sharing (11 messages🔥):

Post creation help, Using Perplexity AI effectively, ISO27001 and NIS2 controls, Leveraging Co-Pilot, Research on network engineering


Perplexity AI ▷ #pplx-api (8 messages🔥):

Search Domain Filter in Sonar-Pro, Usage Tiers for Sonar and Sonar Pro, Sonar Pro API vs. Browser Pro Search, Token Consumption Monitoring

Links mentioned:


Interconnects (Nathan Lambert) ▷ #news (94 messages🔥🔥):

DeepSeek Performance, Anthropic Developments, Stargate Project Funding, Mistral AI IPO Plans, Market Dynamics in AI

Links mentioned:


Interconnects (Nathan Lambert) ▷ #ml-questions (9 messages🔥):

PPO Clipping Dynamics, RL Stability Techniques, RLVR Application on R1 Models

Link mentioned: open-instruct/docs/tulu3.md at main · allenai/open-instruct: Contribute to allenai/open-instruct development by creating an account on GitHub.


Interconnects (Nathan Lambert) ▷ #ml-drama (3 messages):

AI infrastructure investment, Stargate joint venture, Texas energy generation

Link mentioned: Trump announces up to $500 billion in private sector AI infrastructure investment: President Trump announced billions in private sector investment by OpenAI, Softbank and Oracle to build AI infrastructure in the U.S.


Interconnects (Nathan Lambert) ▷ #random (18 messages🔥):

AI Models, Davos AI News, Grok 3, Tulu 3's RLVR, Robonato

Links mentioned:


Interconnects (Nathan Lambert) ▷ #memes (1 messages):

xeophon.: https://x.com/menhguin/status/1881387910316052723?s=61


Interconnects (Nathan Lambert) ▷ #rl (2 messages):

Reinforcement Learning in Computer Vision, CoT integration with Computer Vision, Verification of Computer Vision Labels

Link mentioned: Tuning computer vision models with task rewards: Misalignment between model predictions and intended usage can be detrimental for the deployment of computer vision models. The issue is exacerbated when the task involves complex structured outputs, a...


Interconnects (Nathan Lambert) ▷ #reads (7 messages):

Davos Interviews, Claude AI advancements, Development of AI tools, Trends in Davos fashion

Links mentioned:


Interconnects (Nathan Lambert) ▷ #lectures-and-projects (1 messages):

RLHF Book, Interconnects utility


Interconnects (Nathan Lambert) ▷ #posts (9 messages🔥):

DeepSeek AI R1 Model, The Retort Podcast on AI Science, Thinking Models Podcast, NeurIPs Talk on Post-Training

Link mentioned: DeepSeek R1's recipe to replicate o1 and the future of reasoning LMs: Yes, ring the true o1 replication bells for DeepSeek R1 🔔🔔🔔. Where we go next.


Interconnects (Nathan Lambert) ▷ #policy (27 messages🔥):

Executive Order on AI, NAIRR Event, Defense Llama, AI Cold War, AI Infrastructure Announcement

Links mentioned:


MCP (Glama) ▷ #general (160 messages🔥🔥):

MCP Server Implementations, Coding Tools and Frameworks, Roo-Clines and Agents, Language Server Integration, MCP Applications in AI

Links mentioned:


MCP (Glama) ▷ #showcase (9 messages🔥):

Librechat Issues, Anthropic Models Compatibility, Sage for macOS and iPhone

Link mentioned: LibreChat: Enhanced ChatGPT with Agents, AI model switching, Code Interpreter, DALL-E 3, OpenAPI Actions, secure multi-user auth, and more. Supports OpenAI, Anthropic, Azure, and self-hosting via open-source.


OpenRouter (Alex Atallah) ▷ #announcements (3 messages):

Llama endpoints discontinuation, DeepSeek R1 censorship-free, DeepSeek R1 web search grounding

Links mentioned:


OpenRouter (Alex Atallah) ▷ #general (152 messages🔥🔥):

DeepSeek R1 and V3 Comparison, Gemini 2.0 Flash Update, API Key Tiers for Gemini Models, Reasoning Content Retrieval, Perplexity's New Sonar Models

Links mentioned:


Cohere ▷ #discussions (81 messages🔥🔥):

Cohere Access and Usability, Learning Rate Adjustment in Training, Model Training Techniques, Pre-training GPT-2, Cohere For AI Community

Link mentioned: Research | Cohere For AI : Cohere For AI (C4AI) is Cohere's research lab that seeks to solve complex machine learning problems.


Cohere ▷ #announcements (1 messages):

RAG Implementation, Tool Use with Models, Live Q&A Session, Builder Community Connection


Cohere ▷ #questions (4 messages):

Cohere iOS application, Cohere macOS application, Cohere beta testing


Cohere ▷ #api-discussions (3 messages):

Dify.ai Issues, Cohere Key Error, IP Block Concerns


Cohere ▷ #cmd-r-bot (12 messages🔥):

AGI Definition, Duplicate Content Issues in Cohere Command R+, Feedback on Cohere Model Performance


Cohere ▷ #projects (4 messages):

Cohere CLI, Community Support, Building Roles

Link mentioned: GitHub - plyght/cohere-cli: Cohere CLI: Effortlessly chat with Cohere's AI directly from your terminal! 🚀: Cohere CLI: Effortlessly chat with Cohere's AI directly from your terminal! 🚀 - plyght/cohere-cli


Cohere ▷ #cohere-toolkit (5 messages):

Cohere's Math Accuracy, LLM Limitations, Improving AI Response Validity


Notebook LM Discord ▷ #use-cases (14 messages🔥):

NotebookLM for college courses, AI-generated video content, Feedback for feature requests, Guidance on source code understanding, NotebookLM for Church services


Notebook LM Discord ▷ #general (89 messages🔥🔥):

NotebookLM Features, Audio Generation Limitations, Sharing Notebooks, Customizing Conversations, Tools and Add-ons

Links mentioned:


Stability.ai (Stable Diffusion) ▷ #general-chat (90 messages🔥🔥):

AI in Comic Book Creation, Image Generation with AI, AI Art Controversy, Stable Diffusion Configuration, Background Editing Tools

Links mentioned:


GPU MODE ▷ #general (10 messages🔥):

GRPO implementations, TRL development, Float64 software for GPUs

Links mentioned:


GPU MODE ▷ #triton (19 messages🔥):

Matrix Multiplication in Triton, Device-side TMA descriptors, Persistent GEMM implementation, Autotuning issues with TMA, Collaborative GPU research

Links mentioned:


GPU MODE ▷ #cuda (14 messages🔥):

Blackwell compute capability, CUDA Toolkit 12.8, CUDA and SFML integration, Audio processing on CUDA, cuFFT library issues

Links mentioned:


GPU MODE ▷ #torch (7 messages):

FSDP fully_shard() behavior, einops alternatives with PyTorch, torch nightly build with Triton 3.2 compatibility, DeepSpeed checkpointing in Torch Lightning

Links mentioned:


GPU MODE ▷ #cool-links (1 messages):

Lindholm's Career, Unified Architecture Design, Nvidia Developments

Link mentioned: ESB 1013 - CPEN 211 101 - 2024W1 on 2024-11-19 (Tue): no description found


GPU MODE ▷ #beginner (7 messages):

CUDA Toolkit Commands, CUDA and C/C++ Compatibility, Using Graphics Cards for AI, 100 Days of CUDA, Speeding Up Hugging Face Generation

Links mentioned:


GPU MODE ▷ #pmpp-book (5 messages):

Revisiting the PMPP Book, CUDA Programming Platforms

Links mentioned:


GPU MODE ▷ #off-topic (11 messages🔥):

CUDA in Poland, SIMD definition, Dining in Warsaw

Link mentioned: CUDA · Warsaw: no description found


GPU MODE ▷ #rocm (1 messages):

leiwang1999_53585: happy to release https://github.com/tile-ai/tilelang , also support rocm 🙂


GPU MODE ▷ #self-promotion (1 messages):

Fluid Numerics, Galapagos cluster, AMD Instinct MI300A


GPU MODE ▷ #arc-agi-2 (5 messages):

Mind Evolution Strategy, Local GRPO Implementation, RL on Maths Datasets, OpenRLHF Framework

Links mentioned:


Eleuther ▷ #general (21 messages🔥):

GGUF vs other quantized formats, Inference backends comparison, Local vs cloud development, New AI services introductions

Links mentioned:


Eleuther ▷ #research (22 messages🔥):

R1 Model Performance, Titans Paper Insights, Adam-like Update Rules, Deepseek Reward Models

Links mentioned:


Eleuther ▷ #interpretability-general (4 messages):

Open Source Steering for LLMs, Current SAE Steering Methods, Open Source Steering Libraries

Links mentioned:


Eleuther ▷ #lm-thunderdome (13 messages🔥):

4bit/3bit vs f16 performance, Qwen R1 models and Q-RWKV conversion, math500 dataset for evaluation, pass@1 estimation method, evaluation templates for R1

Link mentioned: HuggingFaceH4/MATH-500 · Datasets at Hugging Face: no description found


Eleuther ▷ #gpt-neox-dev (3 messages):

Intermediate Dimension Selection, Exporting Model to HF Format, Model Parallelism Issues

Link mentioned: {: "pipe_parallel_size": 0, "model_parallel_size": 4, "make_vocab_size_divisible_by": 1, # model settings "num_layers": 32, &a...


Latent Space ▷ #ai-general-chat (59 messages🔥🔥):

Stargate Project, Gemini 2.0 updates, DeepSeek insights, Ai2 ScholarQA, WandB SWE-Bench

Links mentioned:


Latent Space ▷ #ai-announcements (1 messages):

Last Week in AI, Free AI in Gmail


OpenAI ▷ #ai-discussions (44 messages🔥):

DeepSeek R1 Performance, Generative AI Impact on Creative Industries, AI Models Comparison, Local Model Running Capabilities, AI Output Compliance Issues

Link mentioned: Can an AI automatically create a good knowledge graph as of Jan 2025?: no description found

OpenAI ▷ #gpt-4-discussions (7 messages):

GPT downtime issues, Chat response delays


OpenAI ▷ #prompt-engineering (1 messages):

oneidemaria: <:dallestar:1006520565558956092>


OpenAI ▷ #api-discussions (1 messages):

oneidemaria: <:dallestar:1006520565558956092>


Yannick Kilcher ▷ #general (46 messages🔥):

Neural ODE Applications, Modeling vs Algorithmic Choices in ML, RL Techniques for Small Models, Exploration Strategies in RL, MoE vs Attention Mechanisms

Links mentioned:


Yannick Kilcher ▷ #paper-discussion (4 messages):

DeepSeeks Group Relative Policy Optimization, Review Process Challenges, Collaboration of Authors and Reviewers

Links mentioned:


Yannick Kilcher ▷ #agents (1 messages):

rogerngmd: https://github.com/deepseek-ai/DeepSeek-R1/blob/main/README.md


Yannick Kilcher ▷ #ml-news (1 messages):

Suno AI Music Generator, Copyright Infringement Lawsuit, Music Industry Controversies

Link mentioned: $500m-valued Suno hit with new copyright lawsuit from Germany’s GEMA - Music Business Worldwide: GEMA represents the copyrights of around 95,000 members in Germany (composers, lyricists, music publishers) as well as over two million rightsholders worldwide.


Modular (Mojo 🔥) ▷ #general (14 messages🔥):

Programming Language Preferences, Community Showcase Discussions, Mojo Progress Updates


Modular (Mojo 🔥) ▷ #mojo (8 messages🔥):

Mojo Project .gitignore, Netlify compatibility with Mojo apps, Mojo organization domain discussion

Link mentioned: Available software at build time: Learn about the software and tools that are available for your builds at build time.


LlamaIndex ▷ #blog (2 messages):

LlamaIndex Workflows, Chat2DB GenAI Chatbot


LlamaIndex ▷ #general (18 messages🔥):

LlamaParse document parser, LlamaIndex documentation website bugs, Cached Augmented Generation with Gemini

Links mentioned:


Nomic.ai (GPT4All) ▷ #general (20 messages🔥):

Entity Identification for ModernBert, Jinja Template Insights, LMstudio Inquiries, Adobe Photoshop Support, Nomic Taxes

Link mentioned: Willj Oprah GIF - Willj Oprah Oprah Winfrey - Discover & Share GIFs: Click to view the GIF


LAION ▷ #general (5 messages):

Bud-E language capabilities, Suno Music audio input feature, Project delay on current work

Link mentioned: Tweet from Suno (@SunoMusic): Record yourself singing, playing piano, or tapping your pencil + upload into Suno to make your own song from your own sounds 😱 What have you made with our audio input feature? 🎤: @techguyver shows h...


LAION ▷ #announcements (1 messages):

BUD-E, School-BUD-E, Open Source Voice Assistants, AI Education Assistant Framework

Links mentioned:


LAION ▷ #resources (1 messages):

IPTVPlayer, AtlasVPN, TradingView-Premium, Cʀᴀᴄᴋɪɴɢ Cʟᴀss

Link mentioned: Cʀᴀᴄᴋɪɴɢ Cʟᴀss [ᴄʜᴀᴛʀᴏᴏᴍs]: The best programs are only free


LAION ▷ #learning-ml (1 messages):

IPTVPlayer, AtlasVPN, TradingView-Premium, Cʀᴀᴄᴋɪɴɡ Cʟᴀss, Free Programs

Link mentioned: Cʀᴀᴄᴋɪɴɢ Cʟᴀss [ᴄʜᴀᴛʀᴏᴏᴍs]: The best programs are only free


LAION ▷ #paper-discussion (1 messages):

IPTVPlayer offerings, AtlasVPN promotions, TradingView Premium features

Link mentioned: Cʀᴀᴄᴋɪɴɢ Cʟᴀss [ᴄʜᴀᴛʀᴏᴏᴍs]: The best programs are only free


LLM Agents (Berkeley MOOC) ▷ #mooc-questions (6 messages):

Declaration Form Requirement, Corporate Sponsors and Intern-like Tasks, New MOOC Syllabus Release


tinygrad (George Hotz) ▷ #learn-tinygrad (5 messages):

BEAM performance, WebGPU compatibility, YoloV8 FPS issues


Torchtune ▷ #general (1 messages):

Proposal for tune cat command, TRL help command length

Link mentioned: [RFC] Proposal for tune cat Command · Issue #2281 · pytorch/torchtune: First of all, thank you very much for the wonderful package. I’ve started actively looking at the source code, and I must say it’s an absolute pleasure to read. It was difficult to stop myself from...


Torchtune ▷ #papers (4 messages):

Quantifying Uncertainty in LLMs, Chain of Thought in LLMs, RL-LLM Instruction Prompts, Distillation in RL


DSPy ▷ #general (1 messages):

moresearch_: how does DSPy-based RAG deal with dynamic data?


DSPy ▷ #examples (2 messages):

Open Problem, Syntax Typo


Mozilla AI ▷ #announcements (1 messages):

Open Datasets for LLM Training, Mozilla and EleutherAI Partnership


AI21 Labs (Jamba) ▷ #general-chat (1 messages):

AI in Cybersecurity, Impact of AI on Security Teams






{% else %}

The full channel by channel breakdowns have been truncated for email.

If you want the full breakdown, please visit the web version of this email: [{{ email.subject }}]({{ email_url }})!

If you enjoyed AINews, please share it with a friend! Thanks in advance!

{% endif %}