Frozen AI News archive

AlphaProof + AlphaGeometry 2 fall 1 point short of IMO Gold

**Search+Verifier** highlights advances in neurosymbolic AI at the 2024 Math Olympics. **Google DeepMind**'s combination of **AlphaProof** and **AlphaGeometry 2** solved four of the six IMO problems: AlphaProof is a finetuned **Gemini** model that searches for Lean proofs with an AlphaZero-style approach, while AlphaGeometry 2 was trained on significantly more synthetic data and adds a novel knowledge-sharing mechanism across search trees. Despite the impressive result, one of the human judges noted that the AI needed far more time than the human competitors. Meanwhile, **Meta AI** released **Llama 3.1**, with a 405B-parameter model and smaller variants, and **Mistral AI** launched **Mistral Large 2**, a 123B-parameter model with a 128k context window that it claims outperforms Llama 3.1 on coding tasks and multilingual benchmarks. This marks significant progress in AI mathematical reasoning, model scaling, and multilingual capabilities.

Canonical issue URL

AI News for 7/24/2024-7/25/2024. We checked 7 subreddits, 384 Twitters and 30 Discords (474 channels, and 4280 messages) for you. Estimated reading time saved (at 200wpm): 467 minutes. You can now tag @smol_ai for AINews discussions!

It's been a good month for neurosymbolic AI. As humans gather for the 2024 Summer Olympics, AI has been making great advances in the Math Olympics. Early this month, Numina won the first AIMO Progress Prize, solving 29 of 50 olympiad-level problems in the private test set.

While six teenagers on Team USA won the 65th International Math Olympiad, taking the crown back from China, Google DeepMind announced that their new combination of AlphaProof and a new V2 of AlphaGeometry solved four out of six problems from the IMO (including solving Problem 4 in 19 seconds). Human judges (including the IMO Problem Selection Committee Chair) awarded it 28 points out of a maximum of 42, 1 point short of the cutoff for a Gold medal.


AlphaProof is a finetuned Gemini model combined with AlphaZero (paper) that proves mathematical statements in Lean, using an AlphaZero-style search to find solutions.
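For readers new to Lean, the statements AlphaProof produces are machine-checkable theorems: the Lean kernel verifies every proof step, which is what makes a search-and-verify loop feasible. A minimal, purely illustrative Lean 4 example (not from DeepMind's system) of a formal statement and its proof:

```lean
-- Illustrative only: a trivial formal statement and proof in Lean 4.
-- AlphaProof targets far harder statements, searching over candidate
-- proof steps with an AlphaZero-style loop, with the Lean kernel
-- acting as the verifier for every candidate.
theorem add_comm_example (a b : Nat) : a + b = b + a :=
  Nat.add_comm a b
```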

AlphaGeometry 2 is a neuro-symbolic hybrid system in which the language model was based on Gemini and trained from scratch on an order of magnitude more synthetic data than its predecessor. [It] employs a symbolic engine that is two orders of magnitude faster than its predecessor. When presented with a new problem, a novel knowledge-sharing mechanism is used to enable advanced combinations of different search trees to tackle more complex problems. Before this year’s competition, AlphaGeometry 2 could solve 83% of all historical IMO geometry problems from the past 25 years, compared to the 53% rate achieved by its predecessor.
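The quoted description amounts to a propose-and-deduce loop: a fast symbolic engine exhausts what follows from the current diagram, and when the goal is still out of reach a language model suggests an auxiliary construction to unlock further deductions. A rough Python sketch of that control flow, with `deduce_closure`, `propose_construction`, and `is_solved` as hypothetical stand-ins for the real components:

```python
def solve_geometry(problem, deduce_closure, propose_construction, is_solved,
                   max_rounds=50):
    """Rough sketch (hypothetical interfaces) of a neuro-symbolic
    propose-and-deduce loop in the spirit of AlphaGeometry: the symbolic
    engine computes the deductive closure of the current premises; if the
    goal is not yet derivable, the language model proposes an auxiliary
    construction and deduction starts again."""
    premises = list(problem["premises"])
    for _ in range(max_rounds):
        facts = deduce_closure(premises)            # symbolic engine: fast, exhaustive
        if is_solved(problem["goal"], facts):
            return facts                            # goal derivable: a proof can be read off
        aux = propose_construction(problem, facts)  # neural model: creative but unverified
        if aux is None:
            return None                             # model out of ideas within budget
        premises.append(aux)                        # add the auxiliary point/line/circle
    return None                                     # unsolved within the search budget
```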

However, it's not all roses: Tim Gowers, one of the human IMO judges, noted:

The main qualification is that the program needed a lot longer than the human competitors -- for some of the problems over 60 hours -- and of course much faster processing speed than the poor old human brain. If the human competitors had been allowed that sort of time per problem they would undoubtedly have scored higher.

This is also similar to OpenAI's 2022 work on Lean provers.

How can AI both solve AIMO problems and fail a question as simple as whether 9.11 > 9.9? There are a couple of thoughts on "Jagged Intelligence" that come back to the ever-present problem of generalization.
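The arithmetic itself is trivial, which is exactly the point; a quick Python check of the ground truth:

```python
# The famously fumbled comparison: 9.9 (i.e. 9.90) is the larger number.
print(9.11 > 9.9)  # False
print(9.9 > 9.11)  # True
```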

Nevertheless, it's been a big day for prediction markets and private bets on AI in the IMO.


{% if medium == 'web' %}

Table of Contents

[TOC]

{% else %}

The Table of Contents and Channel Summaries have been moved to the web version of this email: [{{ email.subject }}]({{ email_url }})!

{% endif %}


AI Twitter Recap

all recaps done by Claude 3.5 Sonnet, best of 4 runs.

Llama 3.1 and Mistral Large 2 Release

Open Source AI and Industry Impact

AI Development and Research

Industry Trends and Observations


AI Reddit Recap

/r/LocalLlama Recap

Theme 1. Open Source AI Models Challenging Closed Platforms

Theme 2. Breakthroughs in Specialized AI Capabilities

Theme 3. Uncensored AI Models and Ethical Considerations

All AI Reddit Recap

/r/machinelearning, /r/openai, /r/stablediffusion, /r/ArtificialInteligence, /r/LLMDevs, /r/Singularity

AI Model Releases and Benchmarks

AI Applications and Improvements

AI Generation Challenges


AI Discord Recap

A summary of Summaries of Summaries

1. AI Model Releases and Benchmarks

2. AI Search and Information Retrieval

3. Open Source AI and Community Efforts

4. AI Ethics and Data Usage


PART 1: High level Discord summaries

Unsloth AI (Daniel Han) Discord


LM Studio Discord


HuggingFace Discord


Nous Research AI Discord


Modular (Mojo 🔥) Discord


Perplexity AI Discord


OpenRouter (Alex Atallah) Discord


OpenAI Discord


Stability.ai (Stable Diffusion) Discord


Eleuther Discord


CUDA MODE Discord


Interconnects (Nathan Lambert) Discord


Latent Space Discord


LlamaIndex Discord


Cohere Discord


OpenAccess AI Collective (axolotl) Discord


tinygrad (George Hotz) Discord


OpenInterpreter Discord


Torchtune Discord


LAION Discord


DSPy Discord


LangChain AI Discord


AI Stack Devs (Yoko Li) Discord


The Alignment Lab AI Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


The LLM Finetuning (Hamel + Dan) Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


The LLM Perf Enthusiasts AI Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


The MLOps @Chipro Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


The Mozilla AI Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


The DiscoResearch Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


The AI21 Labs (Jamba) Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


PART 2: Detailed by-Channel summaries and links

{% if medium == 'web' %}

Unsloth AI (Daniel Han) ▷ #general (657 messages🔥🔥🔥):

  • Data Privacy and GDPR
  • Using Discord Logs for AI Training
  • BTEC Education System
  • Value of Software Engineering vs Data Science
  • Impact of AI on Job Security

Links mentioned:


Unsloth AI (Daniel Han) ▷ #off-topic (2 messages):

  • Template Construction for Slack
  • Slack Channel Posting

Unsloth AI (Daniel Han) ▷ #help (104 messages🔥🔥):

  • Max Sequence Length in SFTTrainer
  • Llama 3 Fine-Tuning Issues
  • Inference Challenges with Fine-Tuned Models
  • Multi-Turn Conversation Dataset Formatting
  • Model Implementation on Websites

Links mentioned:


Unsloth AI (Daniel Han) ▷ #research (10 messages🔥):

  • Inference speed comparison
  • Task management with LLMs
  • Batching in inference
  • Autoregressive inference process

LM Studio ▷ #💬-general (298 messages🔥🔥):

  • LM Studio Updates
  • Model Performance
  • GPU vs RAM Usage
  • Coding Models
  • Local Model Limitations

Links mentioned:


LM Studio ▷ #🤖-models-discussion-chat (85 messages🔥🔥):

  • LLaMA Model Data Mix
  • Naming Preferences in AI
  • Model Performance Comparisons
  • GPU Support in LM Studio
  • Dolphin Model Issues

Links mentioned:


LM Studio ▷ #⚙-configs-discussion (1 messages):

melkanea: i got +5600 if you count cuda cores individually


LM Studio ▷ #🎛-hardware-discussion (144 messages🔥🔥):

  • ML Inference with Various Hardware
  • P40 GPU Experience
  • RTX 3090 vs M3 Max for Inference
  • Performance of Apple Silicon for AI
  • Dual GPU Configurations

LM Studio ▷ #🧪-beta-releases-chat (27 messages🔥):

  • Beta 1 CPU issues
  • Renderer crash reports
  • New UI feedback
  • Model comparison
  • Upcoming Beta 2 release

LM Studio ▷ #amd-rocm-tech-preview (17 messages🔥):

  • Linux AppImage updates
  • GPU offloading with ROCm
  • Compatibility with 7800XT
  • Command line for ROCm
  • OpenCL performance

LM Studio ▷ #model-announcements (1 messages):

  • Mistral Large

LM Studio ▷ #🛠-dev-chat (3 messages):

  • Using Llama 3.1
  • VS Code Extensions
  • Codestral Setup

Link mentioned: Tab Autocomplete (beta) | Continue: Continue now provides support for tab autocomplete in VS Code and JetBrains IDEs. We will be greatly improving the experience over the next few releases, and it is always helpful to hear feedback. If ...


HuggingFace ▷ #announcements (1 messages):

  • Llama 3.1 Release

Link mentioned: HuggingChat: Making the community's best AI chat models available to everyone.


HuggingFace ▷ #general (421 messages🔥🔥🔥):

  • Hugging Face Community Discussions
  • Model Performance Comparisons
  • Training and Fine-tuning LLMs
  • Audio Denoising Research
  • China's Regulatory Impact on AI Models

Links mentioned:


HuggingFace ▷ #cool-finds (12 messages🔥):

  • Dolphin 2.9.3 Model Release
  • AI Solves Mathematical Olympiad
  • K-Nearest Neighbors Algorithm
  • AI Job Security Discussion

Links mentioned:


HuggingFace ▷ #i-made-this (5 messages):

  • W2V2-BERT Model for Ukrainian
  • Next Word AutoComplete
  • Community Engagement

Links mentioned:


HuggingFace ▷ #reading-group (8 messages🔥):

  • Open Source Bounty Programs
  • Diffusion Models
  • Finegrain Bounty
  • Tinygrad Bounties

Links mentioned:


HuggingFace ▷ #core-announcements (1 messages):

  • Quantized Diffusers
  • Memory Optimization
  • Orig PixArt Sigma Checkpoint Reduction

Link mentioned: feat: support diffusion models. by sayakpaul · Pull Request #255 · huggingface/optimum-quanto: What does this PR do? Fixes #252


HuggingFace ▷ #computer-vision (7 messages):

  • Labeling Platforms
  • Road Detection from Satellite Images
  • Understanding LLaVa

HuggingFace ▷ #NLP (21 messages🔥):

  • Embedding Model Fine-tuning
  • RAG System Performance
  • Embedding Numerical Data Challenges
  • Collaborative LLM Projects
  • Llama 3.1 with Inf2 Guides

HuggingFace ▷ #diffusion-discussions (2 messages):

  • Diffusion techniques in biological sequence generation
  • Updates on ComfyUI
  • MediaPipe integration
  • TensorRT performance
  • Workflow changes in ComfyUI

Link mentioned: Reddit - Dive into anything: no description found


Nous Research AI ▷ #datasets (1 messages):

jsarnecki: https://github.com/mlfoundations/MINT-1T


Nous Research AI ▷ #interesting-links (9 messages🔥):

  • Hermes 2 Theta 70B
  • Mistral Large 2
  • Reddit's indexing policy change
  • Condé Nast legal action
  • Wiki Phrases Tokenizer

Links mentioned:


Nous Research AI ▷ #announcements (1 messages):

  • Nous Research subreddit
  • Upcoming AMA

Link mentioned: Reddit - Dive into anything: no description found


Nous Research AI ▷ #general (246 messages🔥🔥):

  • Nous Research Updates
  • LLaMA Model Performance
  • Quantization and Precision in AI
  • Synthetic Data Generation
  • OpenAI Features and Releases

Links mentioned:


Nous Research AI ▷ #ask-about-llms (33 messages🔥):

  • Hermes release on Llama 3.1
  • H100 GPUs vs Gaming GPUs
  • Data Synthesis in AI
  • Image-to-Text Finetuning
  • Consumer Grade Models

Link mentioned: AI models collapse when trained on recursively generated data - Nature:  Analysis shows that indiscriminately training generative artificial intelligence on real and generated content, usually done by scraping data from the Internet, can lead to a collap...


Nous Research AI ▷ #rag-dataset (2 messages):

  • Grounded Refusals
  • Meta Team Intelligence

Nous Research AI ▷ #world-sim (1 messages):

kentrid: No code available for it, I guess?


Nous Research AI ▷ #reasoning-tasks-master-list (78 messages🔥🔥):

  • Moral Reasoning Tasks
  • Syllogism Reasoning
  • Task Structuring
  • Dataset Collaboration

Links mentioned:


Modular (Mojo 🔥) ▷ #general (20 messages🔥):

  • Open Source Git Tool - stack-pr
  • Posits and MLIR
  • Game Development and AI Overlap

Links mentioned:


Modular (Mojo 🔥) ▷ #💬︱twitter (2 messages):

  • Modular updates
  • Modular community engagement

Modular (Mojo 🔥) ▷ #✍︱blog (5 messages):

  • stack-pr tool
  • Feedback on stack-pr
  • Benefits of stacked PRs

Link mentioned: Modular: Announcing stack-pr: an open source tool for managing stacked PRs on GitHub: We are building a next-generation AI developer platform for the world. Check out our latest post: Announcing stack-pr: an open source tool for managing stacked PRs on GitHub


Modular (Mojo 🔥) ▷ #ai (1 messages):

  • Meta's commitment to open AI
  • Llama 3.1 model advancements
  • Open intelligence accessibility
  • Synthetic data generation

Link mentioned: no title found: no description found


Modular (Mojo 🔥) ▷ #mojo (97 messages🔥🔥):

  • Mojo regex support
  • Tenka package manager
  • SDL window creation
  • Iterator traits
  • Infrared 2D primitives

Links mentioned:


Modular (Mojo 🔥) ▷ #nightly (198 messages🔥🔥):

  • SIMD Comparisons
  • EqualityComparable Trait
  • SIMD Behavior for Lists
  • Performance and API Design
  • Function Overloading and Return Types

Links mentioned:


Modular (Mojo 🔥) ▷ #mojo-marathons (6 messages):

  • Mojo Implementation
  • Spam Messages

Link mentioned: Discord - Group Chat That’s All Fun & Games: Discord is great for playing games and chilling with friends, or even building a worldwide community. Customize your own space to talk, play, and hang out.


Perplexity AI ▷ #announcements (1 messages):

  • Scheduled Downtime
  • Database Maintenance

Perplexity AI ▷ #general (305 messages🔥🔥):

  • Mistral vs. Llama models
  • Perplexity's API usage
  • SearchGPT's anticipated launch
  • Education system concerns
  • Subscription and discount issues

Links mentioned:


Perplexity AI ▷ #sharing (11 messages🔥):

  • Mistral Large 2 Release
  • Reddit Blocks Unpaid Search Engines
  • Condé Nast Legal Action Against Perplexity
  • Hydrogen vs Atomic Bombs
  • First Nations Funding Opportunities

Links mentioned:


Perplexity AI ▷ #pplx-api (3 messages):

  • Microsoft Copilot Studio
  • Llama 3.1 models API

OpenRouter (Alex Atallah) ▷ #announcements (5 messages):

  • Llama 405B price cut
  • Middle-out transform changes
  • Database traffic surge
  • Llama 3.1 price reduction
  • Database performance issues

Links mentioned:


OpenRouter (Alex Atallah) ▷ #general (215 messages🔥🔥):

  • Llama 3.1 Performance
  • Inference Engine Issues
  • Price Competition Among Providers
  • Model Quantization
  • OpenRouter Provider Accountability

Links mentioned:


OpenRouter (Alex Atallah) ▷ #일반 (1 messages):

  • Mistral Large 2

OpenAI ▷ #annnouncements (1 messages):

  • SearchGPT
  • AI search features

OpenAI ▷ #ai-discussions (177 messages🔥🔥):

  • Mistral Model Download
  • MacBook Pro Performance
  • Internet Speed Upgrades
  • Voice Features in AI
  • Llama 3.1 Accessibility

Links mentioned:


OpenAI ▷ #gpt-4-discussions (8 messages🔥):

  • Feedback on GPT-4o
  • SearchGPT API Availability

OpenAI ▷ #prompt-engineering (7 messages):

  • Memory Function Calls
  • Guidance for Memory Storage
  • Specificity in Events
  • Types of Information to Store

OpenAI ▷ #api-discussions (7 messages):

  • Function calls for chatbot memories
  • Guidance for memory storage
  • Event types for memory saving
  • Specificity in user memory requirements

OpenAI ▷ #api-projects (2 messages):

  • Error uploading files to OpenAI
  • Python code for file upload
  • Vector stores configuration

Stability.ai (Stable Diffusion) ▷ #announcements (1 messages):

  • Stable Video 4D
  • Dynamic multi-angle video generation
  • Technical report release

Link mentioned: Stable Video 4D — Stability AI: We are pleased to announce the availability of Stable Video 4D, an innovative model that allows users to upload a single video and receive dynamic novel-view videos of eight new angles/views, deliveri...


Stability.ai (Stable Diffusion) ▷ #general-chat (147 messages🔥🔥):

  • Updates on Stability AI Projects
  • Usage of Stable Diffusion
  • Discussion on Models and Performance
  • Lora Training Techniques
  • Inpainting Techniques

Links mentioned:


Eleuther ▷ #general (83 messages🔥🔥):

  • Flash Attention vs Traditional Attention
  • VRAM Usage in Inference
  • Chunking in Attention Mechanisms
  • Comparisons of Attention Algorithms
  • Multiple-Choice Datasets and APIs

Links mentioned:


Eleuther ▷ #research (51 messages🔥):

  • Inference Costs for Models
  • MoE Efficiency
  • Meta Research Strategy
  • AlphaProof Breakthrough
  • xAI's Market Position

Links mentioned:


Eleuther ▷ #scaling-laws (9 messages🔥):

  • Meta scaling laws
  • Data scaling functions

Eleuther ▷ #interpretability-general (2 messages):

  • Awesome Interpretability Repository
  • NDIF Llama3-405b Access Opportunity

Links mentioned:


Eleuther ▷ #lm-thunderdome (2 messages):

  • Evaluating MMLU on External APIs
  • Calculating VRAM Requirements

Link mentioned: Refactor API models by baberabb · Pull Request #2008 · EleutherAI/lm-evaluation-harness: This PR introduces a new superclass for API request models, providing: Modularity for downstream classes Overloadable methods for request transformation, API requests and response parsing Tokeniza...


CUDA MODE ▷ #general (2 messages):

  • NCCL Performance
  • Flute Matrix Multiplications

Links mentioned:


CUDA MODE ▷ #triton (1 messages):

  • CUDA profiling tools
  • Nsight Compute
  • Triton testing helpers

Links mentioned:


CUDA MODE ▷ #torch (1 messages):

andreaskoepf: PyTorch 2.4 was released: https://pytorch.org/blog/pytorch2-4/


CUDA MODE ▷ #cool-links (1 messages):

  • AlphaProof
  • AlphaGeometry 2
  • Mathematical reasoning
  • AGI potential in math

Link mentioned: AI achieves silver-medal standard solving International Mathematical Olympiad problems: Breakthrough models AlphaProof and AlphaGeometry 2 solve advanced reasoning problems in mathematics


CUDA MODE ▷ #jobs (3 messages):

  • ML/AI Career Roadmap
  • Programming and Math Background

Link mentioned: ML Roadmap: 3 months - (sept, oct, nov) roadmap Statistics: https://www.youtube.com/watch?v=MXaJ7sa7q-8&list=PL0KQuRyPJoe6KjlUM6iNYgt8d0DwI-IGR&t=11s (1 week) Linear Algebra - https://www.youtube.com/wat...


CUDA MODE ▷ #beginner (6 messages):

  • Quantization techniques for models
  • Memory issues with fp16 execution

Link mentioned: Quantization: no description found


CUDA MODE ▷ #torchao (1 messages):

marksaroufim: <@1213148470664495114>


CUDA MODE ▷ #ring-attention (18 messages🔥):

  • Blockwise Attention Implementation
  • KV Cache Splitting
  • Ring Attention in Llama 3
  • Pipeline Parallelism
  • Llama 3.1 Features

CUDA MODE ▷ #off-topic (6 messages):

  • Slider Game Launch
  • Game Comparison with Baba Is You
  • New Member Introduction
  • Business Model Discussion

CUDA MODE ▷ #irl-meetup (2 messages):

  • ICML Conference
  • Coffee Meet-up

CUDA MODE ▷ #llmdotc (96 messages🔥🔥):

  • FP8 Training Challenges
  • Outlier Detection in Training
  • muP and Unit Scaling
  • Model Performance Improvements
  • GitHub Pull Requests

Links mentioned:


CUDA MODE ▷ #rocm (1 messages):

andreaskoepf: https://x.com/AMD/status/1816168883587538946


CUDA MODE ▷ #lecture-qa (2 messages):

  • Lecture 24 Slides
  • GitHub Repository Updates

Link mentioned: GitHub - cuda-mode/lectures: Material for cuda-mode lectures: Material for cuda-mode lectures. Contribute to cuda-mode/lectures development by creating an account on GitHub.


Interconnects (Nathan Lambert) ▷ #news (11 messages🔥):

  • DeepMind AI achievements
  • Runway AI training data leaks
  • OpenAI's SearchGPT prototype

Links mentioned:


Interconnects (Nathan Lambert) ▷ #ml-questions (9 messages🔥):

  • Books on Modern Architectures
  • LLAMA 3.1 Annealing
  • Foundations of Computer Vision Book

Link mentioned: Understanding Deep Learning: no description found


Interconnects (Nathan Lambert) ▷ #ml-drama (19 messages🔥):

  • Student Open Letter Contest
  • New York Times Opinions
  • B2B Pricing Competition
  • GPT-4 Magnet Link
  • Parker Conrad and Rippling

Link mentioned: Tweet from Alex Cohen 🤠 (@anothercohen): Update: Holy shit Quoting Alex Cohen 🤠 (@anothercohen) Y'all want to see a dead body?


Interconnects (Nathan Lambert) ▷ #random (50 messages🔥):

  • GPT-4o Training Data Insights
  • Importance of Prompt Diversity
  • Galactica LLM Retrospective
  • SearchGPT Testing
  • Challenges in Dataset Diversity

Links mentioned:


Interconnects (Nathan Lambert) ▷ #memes (39 messages🔥):

  • Perplexity's Overhype
  • Zuckerberg vs OpenAI Strategies
  • Web Browsing Capabilities of LLMs
  • Research Queries and Agent Efficiency

Link mentioned: Tweet from kif (@kifleswing): In ChatGPT's recent search engine announcement, they ask for "music festivals in Boone North Carolina in august" There are five results in the example image in the ChatGPT blog post : ...


Interconnects (Nathan Lambert) ▷ #rlhf (1 messages):

  • Pluralistic Alignment
  • Synthetic Personas
  • Persona Hub

Link mentioned: Tweet from SynthLabs (@synth_labs): 🚨New paper🚨 PERSONA: A Reproducible Testbed for Pluralistic Alignment We evaluate how LMs align with diverse user values using 1,586 synthetic personas & 317,200 preference pairs Personas reflect...


Interconnects (Nathan Lambert) ▷ #reads (2 messages):

  • Future of AI Control
  • OpenAI Rule-Based Reward Paper

Link mentioned: Opinion | Sam Altman: AI’s future must be democratic - The Washington…: no description found


Latent Space ▷ #ai-general-chat (127 messages🔥🔥):

  • SearchGPT Launch
  • AI at IMO
  • Rule-Based Rewards
  • LLM as Judge
  • Synthetic Data Concerns

Links mentioned:


LlamaIndex ▷ #blog (2 messages):

  • Structured Data Extraction
  • LlamaExtract
  • Pydantic Integration
  • LLM-powered ETL

LlamaIndex ▷ #general (98 messages🔥🔥):

  • OpenAI Calls with MultiStepQueryEngine
  • RAG Chatbot Development
  • Updating Knowledge Graph Node Embeddings
  • Document Summary Index Errors
  • Chunking and Triple Extraction Modifications

Links mentioned:


LlamaIndex ▷ #ai-discussion (4 messages):

  • Monitoring Llama Agents
  • Route Planning with RAG

Cohere ▷ #general (70 messages🔥🔥):

  • Cohere Overview
  • Writing Research Papers
  • Langchain's ChatPromptTemplate

Links mentioned:


OpenAccess AI Collective (axolotl) ▷ #general (31 messages🔥):

  • Mistral Large 2
  • Multi-token predictions
  • Training data efficiency
  • Perplexity issues
  • Release confusion

Link mentioned: Large Enough: Today, we are announcing Mistral Large 2, the new generation of our flagship model. Compared to its predecessor, Mistral Large 2 is significantly more capable in code generation, mathematics, and reas...


OpenAccess AI Collective (axolotl) ▷ #axolotl-dev (5 messages):

  • AdamW 8-bit optimization
  • FSDP and Zero3 challenges
  • 405B model loading issues
  • QLoRA efficiency

OpenAccess AI Collective (axolotl) ▷ #general-help (2 messages):

  • Training Configurations

tinygrad (George Hotz) ▷ #learn-tinygrad (37 messages🔥):

  • Kernel Sharing Discussion
  • Tinygrad Cache Sharing
  • Multiple Gradients in Tinygrad
  • Random Tensor Generation Issue
  • Optimization in NumPy Conversion

Links mentioned:


OpenInterpreter ▷ #general (14 messages🔥):

  • Mistral-Large-Instruct-2407
  • Llama 3.1 output token max
  • Ubuntu installation instructions
  • GPT-4o-mini fine-tuning
  • Deepseek performance

Link mentioned: Issues · OpenInterpreter/open-interpreter: A natural language interface for computers. Contribute to OpenInterpreter/open-interpreter development by creating an account on GitHub.


OpenInterpreter ▷ #O1 (6 messages):

  • Shipping updates for 01
  • React Native/Expo app development
  • WatchOS custom case for 01
  • Interpreter on Rabbit device

Link mentioned: GitHub: Let’s build from here: GitHub is where over 100 million developers shape the future of software, together. Contribute to the open source community, manage your Git repositories, review code like a pro, track bugs and fea...


OpenInterpreter ▷ #ai-content (5 messages):

  • Database Complexity
  • Business Presentation Needs
  • Solutions by OpenInterpreter
  • Case Studies
  • Implementation Overview

Torchtune ▷ #general (6 messages):

  • Llama 3/3.1 70B Generation Recipe
  • Multi-GPU Inference
  • Quantization Techniques
  • FSDP Integration

Torchtune ▷ #dev (9 messages🔥):

  • Llama 3.1 Updates
  • Memory Management in Fine-Tuning
  • RFC for Cross Attention
  • Memory Optimizations with Snowflake
  • New Transformations in Models

Links mentioned:


LAION ▷ #general (1 messages):

adiptamartu: is whisper speech model support bahasa indonesia language ? @here thanks for the info


LAION ▷ #research (10 messages🔥):

  • Mistral Large 2
  • DFT Vision Transformer Architecture
  • Rotary Position Encoding
  • Complex Number Parameters
  • Normalization Techniques

Links mentioned:


DSPy ▷ #papers (7 messages):

  • SymbolicAgentLearner Development
  • GitHub Sharing Plans

DSPy ▷ #general (1 messages):

  • litellm proxy
  • function calling across models

DSPy ▷ #examples (1 messages):

  • News categorization
  • GPT-3.5-turbo
  • MIPRO
  • ColBERTv2
  • F1 score

LangChain AI ▷ #general (7 messages):

  • LangChain Agents Consistency Issues
  • Working with Multi Agents
  • Using ConversationSummary with Database Agents
  • LangChain and Ollama Video Release
  • LangGraph Persistence Options

Link mentioned: Fully local tool calling with Ollama: Tools are utilities (e.g., APIs or custom functions) that can be called by an LLM, giving the model new capabilities. However, LLMs need to be able to 1) sel...


AI Stack Devs (Yoko Li) ▷ #ai-raspberry-pi (1 messages):

felixultimaforeverromanempire: this is cool, tell us more








{% else %}

The full channel by channel breakdowns have been truncated for email.

If you want the full breakdown, please visit the web version of this email: [{{ email.subject }}]({{ email_url }})!

If you enjoyed AInews, please share with a friend! Thanks in advance!

{% endif %}