Frozen AI News archive

World_sim.exe

**NVIDIA** announced **Project GR00T**, a foundation model for humanoid robot learning using multimodal instructions, built on their tech stack including Isaac Lab, OSMO, and Jetson Thor. They revealed the **DGX Grace-Blackwell GB200** with over **1 exaflop** compute, capable of training **GPT-4 1.8T parameters** in 90 days on 2000 Blackwells. Jensen Huang confirmed GPT-4 has **1.8 trillion parameters**. The new **GB200 GPU** supports float4/6 precision with ~3 bits per parameter and achieves **40,000 TFLOPs** on fp4 with 2x sparsity.

Canonical issue URL

Lots of Nvidia GTC recaps out there - youtube does a better job than we can.

We were accidentally part of the news cycle yesterday, with Karan (CEO of Nous Research) demoing his world_sim.exe explorations. It's purely for fun, but a very interesting exploration of where roleplay prompt engineering can take you.


Table of Contents

[TOC]


PART X: AI Twitter Recap

all recaps done by Claude 3 Opus, best of 4 runs

NVIDIA GTC Announcements

Open Source LLMs and Implementations

Retrieval Augmented Generation (RAG)

Emerging Trends and Opinions


PART 0: Summary of Summaries of Summaries

Since Claude 3 Haiku was released recently, we're adding them to this summary run for you to compare. We'll keep running these side by side for a little longer while we build the AINews platform for a better UX.

Claude 3 Haiku (3B?)

Claude 3 Sonnet (14B?)

Claude 3 Opus (>220B?)

ChatGPT (GPT4T)

These themes collectively capture the dynamic and multifaceted nature of the AI landscape, characterized by rapid technological advancements, ethical and policy debates, community engagement in optimization efforts, and the ongoing quest for enhancing AI training and data management practices.


PART 1: High level Discord summaries

Stability.ai (Stable Diffusion) Discord


Perplexity AI Discord


Unsloth AI (Daniel Han) Discord


LM Studio Discord


Nous Research AI Discord


Eleuther Discord


OpenAI Discord


HuggingFace Discord


LlamaIndex Discord

New Tricks for Old Dogs: Interactive techniques to treat documents as dynamic entities in the Retrieval-Augmented Generation (RAG) pipeline are proposed, potentially improving RAG performance through more sophisticated interactions. The discussion included a step-by-step guide covering effective RAG implementation with tools like LlamaParse and Qdrant.

LlamaIndex 0.10.20 Instrumental for Engineers: The release of LlamaIndex v0.10.20, with its new Instrumentation module, offers enhanced observability features and API call monitoring, illustrated in shared notebooks. The release announcement and resources can be found through their Twitter update.

Search Safari: A novel method termed Search-in-the-Chain, integrating retrieval and planning for ultimate question answering prowess, is showcased - possibly revolutionizing real-time adjustment abilities in QA pipelines. A paper on the matter was highlighted, and community interest seemed piqued by the tweet.

Resume Routing Revolution: A blog post demonstrates a new model that marries LlamaParse and LlamaIndex to facilitate efficient job matching, parsing complex CV formats with relative ease. Kyosuke Morita's post on the subject is findable in this Twitter thread.

Agentic Memory Architecture Arrives: The advent of MemGPT, an architecture designed to enhance memory functions of AI agents, seems to promise significant improvements to assistant APIs, focusing on reliable memory operations. Engineers are directed to a webinar tweet for more enlightenment.


Latent Space Discord


LAION Discord


OpenAccess AI Collective (axolotl) Discord

Axolotl Stepping Up the Model Optimization Game: Axolotl devs have introduced ScatterMoE, an optimization aimed at boosting Huggingface throughput, and users are directed to its GitHub branch for more details. Upgrading to PyTorch 2.2 or above is necessary for compatibility, with some already on PyTorch 2.2.1.

Groking Grok's Gargantuan Size: The release of Grok-1 model weights with 314 Billion parameters was a topic of discussion, with a member commenting on suboptimal performance and the resource-intensiveness of running it. While only the int8 version is released, there's speculation about managing it using Axolotl's qLoRA FSDP, per the Grok GitHub page.

NVIDIA's Hardware Hype Hits New Heights: Expected around 2025, NVIDIA's RTX 5000 series could bring a 50% VRAM increase and 78% bandwidth boost; specifics can be found in articles from Heise and TechPowerUp.

Model Training and Conversion Conundrums: Tokenizer issues were noticed when using <summary> tags, leading one to discover tokenization inconsistencies. Another user struggled with local model and data setups, leading to HFValidationError challenges. Conversational data fine-tuning errors were resolved with reference to Axolotl's readme, addressing empty dataset "role" arrays by mapping additional roles and excluding short conversations.

Datasets Dialogue Drives Discovery: A user showed interest in a Mistral model fine-tuned on math and coding datasets, with a suggestion floated about utilizing merging tactics such as mergekit to handle extensive data without individual training. The compatibility of different model chat formats during merging was also questioned, but not conclusively addressed.


CUDA MODE Discord


OpenRouter (Alex Atallah) Discord

Relevant links of interest from these discussions include OpenRouter and xai-org's Grok open release on GitHub.


LangChain AI Discord

API Confusion in LangChain Land: Members debated the merits of LangChain's astream_log versus astream_events, with concerns about the latter being in beta and potential deprecation. However, there was no clear consensus on whether one API is favoured over the other or if they are meant to serve distinct purposes.

Community to the Documentation Rescue: The call for clarifications and contributions to LangChain documentation resonated, as users faced challenges with navigation and found the materials somewhat lacking, particularly for newcomers to the platform.

Rubik’s AI Assembles Its Beta Testing Squad: An invitation for beta testing a robust research assistant called Rubik's AI was issued, promising access to high-powered models like Claude 3 Opus, GPT-4 Turbo, and Mistral Large. Keen participants are directed to their waitlist.

LangChain AI Showcase: From AI chatbots for data analysis to living bookmarks and personalized nutrition apps, members shared their LangChain-powered innovations with the community. Projects demonstrated integration of advanced features with repositories available on GitHub and demonstrations via YouTube.

Streaming Stuck in Static: Technical issues arose with LangChain's RemoteRunnable when attempting to stream outputs in JavaScript, which diverts to an /invoke call rather than the expected /stream. The matter appears complex, with no recent documentation or changes addressing the JavaScript-specific streaming conundrum.


Interconnects (Nathan Lambert) Discord


Alignment Lab AI Discord


LLM Perf Enthusiasts AI Discord


DiscoResearch Discord


Datasette - LLM (@SimonW) Discord

Unfortunately, the query about recovering the seed used by the OpenAI models for a previous API request did not contain sufficient detail to warrant inclusion in this summary.


Skunkworks AI Discord


PART 2: Detailed by-Channel summaries and links

Stability.ai (Stable Diffusion) ▷ #announcements (1 messages):

Link mentioned: Introducing Stable Video 3D: Quality Novel View Synthesis and 3D Generation from Single Images — Stability AI: When we released Stable Video Diffusion, we highlighted the versatility of our video model across various applications. Building upon this foundation, we are excited to release Stable Video 3D. This n...


Stability.ai (Stable Diffusion) ▷ #general-chat (988 messages🔥🔥🔥):

Links mentioned:


Perplexity AI ▷ #announcements (1 messages):


Perplexity AI ▷ #general (795 messages🔥🔥🔥):

Links mentioned:


Perplexity AI ▷ #sharing (35 messages🔥):


Perplexity AI ▷ #pplx-api (64 messages🔥🔥):

Links mentioned:


Unsloth AI (Daniel Han) ▷ #general (853 messages🔥🔥🔥):

Links mentioned:


Unsloth AI (Daniel Han) ▷ #announcements (1 messages):

Link mentioned: GitHub - unslothai/unsloth: 2-5X faster 70% less memory QLoRA & LoRA finetuning: 2-5X faster 70% less memory QLoRA & LoRA finetuning - unslothai/unsloth


Unsloth AI (Daniel Han) ▷ #random (25 messages🔥):

Links mentioned:


Unsloth AI (Daniel Han) ▷ #help (568 messages🔥🔥🔥):

Links mentioned:


Unsloth AI (Daniel Han) ▷ #suggestions (21 messages🔥):

Links mentioned:


LM Studio ▷ #💬-general (301 messages🔥🔥):

Links mentioned:


LM Studio ▷ #🤖-models-discussion-chat (138 messages🔥🔥):

Links mentioned:


LM Studio ▷ #🧠-feedback (12 messages🔥):

Link mentioned: andrewcanis/c4ai-command-r-v01-GGUF · Hugging Face: no description found


LM Studio ▷ #🎛-hardware-discussion (480 messages🔥🔥🔥):

Links mentioned:


LM Studio ▷ #🧪-beta-releases-chat (4 messages):

Link mentioned: GitHub - lmstudio-ai/configs: LM Studio JSON configuration file format and a collection of example config files.: LM Studio JSON configuration file format and a collection of example config files. - lmstudio-ai/configs


LM Studio ▷ #langchain (1 messages):


LM Studio ▷ #avx-beta (5 messages):


LM Studio ▷ #amd-rocm-tech-preview (5 messages):

Link mentioned: GitHub - brknsoul/ROCmLibs: Prebuild Windows ROCM Libs for gfx1031 and gfx1032: Prebuild Windows ROCM Libs for gfx1031 and gfx1032 - brknsoul/ROCmLibs


LM Studio ▷ #crew-ai (1 messages):


Nous Research AI ▷ #off-topic (56 messages🔥🔥):

Links mentioned:


Nous Research AI ▷ #interesting-links (16 messages🔥):

Links mentioned:


Nous Research AI ▷ #general (656 messages🔥🔥🔥):

Links mentioned:


Nous Research AI ▷ #ask-about-llms (25 messages🔥):

Links mentioned:


Nous Research AI ▷ #bittensor-finetune-subnet (18 messages🔥):


Nous Research AI ▷ #rag-dataset (100 messages🔥🔥):

Link mentioned: scratchTHOUGHTS/commanDUH.py at main · EveryOneIsGross/scratchTHOUGHTS: 2nd brain scratchmemory to avoid overrun errors with self. - EveryOneIsGross/scratchTHOUGHTS


Eleuther ▷ #general (273 messages🔥🔥):

Links mentioned:


Eleuther ▷ #research (245 messages🔥🔥):

Links mentioned:


Eleuther ▷ #scaling-laws (11 messages🔥):


Eleuther ▷ #interpretability-general (13 messages🔥):

Links mentioned:


Eleuther ▷ #lm-thunderdome (31 messages🔥):

Links mentioned:


Eleuther ▷ #gpt-neox-dev (3 messages):


OpenAI ▷ #ai-discussions (193 messages🔥🔥):

Link mentioned: Enterprise privacy: no description found


OpenAI ▷ #gpt-4-discussions (34 messages🔥):


OpenAI ▷ #prompt-engineering (79 messages🔥🔥):


OpenAI ▷ #api-discussions (79 messages🔥🔥):


HuggingFace ▷ #general (96 messages🔥🔥):

Links mentioned:


HuggingFace ▷ #today-im-learning (12 messages🔥):

Links mentioned:


HuggingFace ▷ #reading-group (12 messages🔥):

Links mentioned:


HuggingFace ▷ #NLP (18 messages🔥):

Link mentioned: Introduction - Hugging Face NLP Course: no description found


LlamaIndex ▷ #blog (7 messages):

Links mentioned:


LlamaIndex ▷ #general (303 messages🔥🔥):

Links mentioned:


LlamaIndex ▷ #ai-discussion (4 messages):

Link mentioned: RAG with LlamaParse, Qdrant and Groq | Step By Step: In this video, I will show you how to create a effective RAG with LlamaParse, Qdrant and Groq. I will explain what LlamaParse is and briefly walk you through...


Latent Space ▷ #ai-general-chat (202 messages🔥🔥):

Links mentioned:


Latent Space ▷ #ai-announcements (2 messages):

Link mentioned: Suno, an AI music generator | Hacker News: no description found


Latent Space ▷ #llm-paper-club-west (20 messages🔥):


Latent Space ▷ #ai-in-action-club (36 messages🔥):

Links mentioned:


LAION ▷ #general (168 messages🔥🔥):

Links mentioned:


LAION ▷ #research (13 messages🔥):

Links mentioned:


OpenAccess AI Collective (axolotl) ▷ #general (99 messages🔥🔥):

Links mentioned:


OpenAccess AI Collective (axolotl) ▷ #axolotl-dev (24 messages🔥):

Links mentioned:


OpenAccess AI Collective (axolotl) ▷ #general-help (35 messages🔥):


OpenAccess AI Collective (axolotl) ▷ #datasets (8 messages🔥):

Link mentioned: GitHub - NVIDIA/NeMo-Curator: Scalable toolkit for data curation: Scalable toolkit for data curation. Contribute to NVIDIA/NeMo-Curator development by creating an account on GitHub.


OpenAccess AI Collective (axolotl) ▷ #rlhf (1 messages):

duh_kola: Is it possible to use different lora adapter to do dpo on another model


CUDA MODE ▷ #general (43 messages🔥):

Links mentioned:


CUDA MODE ▷ #triton (7 messages):

Link mentioned: Google Colaboratory: no description found


CUDA MODE ▷ #cuda (68 messages🔥🔥):

Links mentioned:


CUDA MODE ▷ #suggestions (5 messages):

Links mentioned:


CUDA MODE ▷ #jobs (1 messages):

vim410: Depends. But yes.


CUDA MODE ▷ #beginner (5 messages):

Link mentioned: no title found: no description found


CUDA MODE ▷ #pmpp-book (6 messages):


CUDA MODE ▷ #ring-attention (14 messages🔥):

Links mentioned:


CUDA MODE ▷ #off-topic (5 messages):

Link mentioned: MLSys 2024: no description found


CUDA MODE ▷ #gtc-meetup (9 messages🔥):

Link mentioned: I Snuck Into A Secret Arms-Dealer Conference: Get an exclusive video every month at https://www.patreon.com/Boy_BoyWe made this in collaboration with the legendary Australian political satire group The C...


OpenRouter (Alex Atallah) ▷ #general (159 messages🔥🔥):

Links mentioned:


LangChain AI ▷ #general (95 messages🔥🔥):

Links mentioned:


LangChain AI ▷ #langserve (45 messages🔥):

Links mentioned:


LangChain AI ▷ #share-your-work (11 messages🔥):

Links mentioned:


LangChain AI ▷ #tutorials (2 messages):

Links mentioned:


Interconnects (Nathan Lambert) ▷ #other-papers (8 messages🔥):

Link mentioned: Logits of API-Protected LLMs Leak Proprietary Information: The commercialization of large language models (LLMs) has led to the common practice of high-level API-only access to proprietary models. In this work, we show that even with a conservative assumption...


Interconnects (Nathan Lambert) ▷ #ml-drama (19 messages🔥):

Link mentioned: Tweet from Stella Biderman (@BlancheMinerva): @natolambert @felix_red_panda You're wrong though :P


Interconnects (Nathan Lambert) ▷ #random (63 messages🔥🔥):

Links mentioned:


Alignment Lab AI ▷ #general-chat (6 messages):


Alignment Lab AI ▷ #oo (32 messages🔥):

Link mentioned: keirp/hungarian_national_hs_finals_exam · Datasets at Hugging Face: no description found


LLM Perf Enthusiasts AI ▷ #general (1 messages):

Since there is only a single message with an incomplete context provided here, it is not possible to generate a summary. If you provide more of the channel's message history, I'd be able to compile the requested summary for you.


LLM Perf Enthusiasts AI ▷ #claude (7 messages):

Link mentioned: Tweet from roon (@tszzl): anthropic is controlled opposition to put the fear of god in the members of technical staff


LLM Perf Enthusiasts AI ▷ #reliability (16 messages🔥):

Links mentioned:


LLM Perf Enthusiasts AI ▷ #openai (1 messages):

res6969: https://x.com/leopoldasch/status/1768868127138549841?s=46


DiscoResearch ▷ #general (21 messages🔥):

Links mentioned:


DiscoResearch ▷ #discolm_german (4 messages):


Datasette - LLM (@SimonW) ▷ #ai (20 messages🔥):

Links mentioned:


Datasette - LLM (@SimonW) ▷ #llm (1 messages):

obra: Is it possible to recover the seed used by the openai models for a previous api request?


Skunkworks AI ▷ #general (17 messages🔥):


Skunkworks AI ▷ #off-topic (1 messages):

pradeep1148: https://www.youtube.com/watch?v=ZlJbaYQ2hm4