Frozen AI News archive

MM1: Apple's first Large Multimodal Model

**Apple** announced the **MM1** multimodal LLM family with up to **30B parameters**, claiming performance comparable to **Gemini-1** and beating larger, older models on VQA benchmarks. The paper targets researchers and hints at applications in embodied agents and business/education. **Yann LeCun** emphasized that human-level AI requires understanding the physical world, memory, reasoning, and hierarchical planning, while **François Chollet** cautioned that NLP is far from solved despite LLM advances. **Cohere** released **Command-R**, a model built for Retrieval-Augmented Generation, and **Anthropic** highlighted the **Claude 3** family (Opus, Sonnet, Haiku) for varying application needs. The open-source hardware project **DexCap** enables affordable collection of dexterous robot-manipulation data. Tools like **CopilotKit** simplify AI integration into React apps, and migrating to **Keras 3** with the JAX backend offers faster training. New projects improve reranking for retrieval and add financial agents to **LangChain**. The issue covers AI progress, new models, open-source tools, and frameworks.


Apple continues to make moves in AI, announcing (but not releasing) MM1 alongside a paper, claiming it is Gemini-1 level.

The 30B model beats larger, older models on the (flawed) VQA benchmarks.

The paper is aimed at researchers, providing some useful ablations for hyperparameters and architecture.

The appendices hint at use cases for embodied agents and for business/education.

For a selection of competing open VLMs, there is a new HF leaderboard you can reference.


Table of Contents

[TOC]


PART X: AI Twitter Recap

all recaps done by Claude 3 Opus, best of 4 runs

AI Progress and Limitations

New Models and Datasets

Open Source and Reproducibility

Tools and Frameworks

Memes and Humor


PART 0: Summary of Summaries of Summaries

Since Claude 3 Haiku was released recently, we're adding it to this summary run for you to compare. We'll keep running these side by side for a little while longer while we build the AINews platform for a better UX.

Claude 3 Haiku (3B?)

Commentary: We experimented with tweaking the Haiku prompt since it was not doing well. It seems Flow Engineering > Prompt Engineering for Haiku. However, the topic clustering doesn't look great yet.

Positional Encoding and Language Model Capabilities:

Function Calling and JSON Handling:

Fine-Tuning and Model Performance:

Hardware and System Optimizations:

Community Knowledge Sharing and Open-Source Practices:

Claude 3 Sonnet (14B?)

Commentary: Sonnet kinda broke today and didn't follow our instructions as well as it had every single day prior. We manually prompted it back into behaving, but something feels off.

Claude 3 Opus (>220B?)

Commentary: this one comes closest to what was originally prompted (we asked for the top 4-5 themes across everything)... but we actually prefer the output of the other two, despite the length. In this case, adhering too closely to our prompt was not good.

ChatGPT (GPT4T)

Commentary: good list of prompt-engineering tools in there. Our GPT prompt has fallen behind our Claude prompt in readable quality, so we will focus on improving it next.


PART 1: High level Discord summaries

Nous Research AI Discord Summary


Unsloth AI (Daniel Han) Discord Summary


LM Studio Discord Summary

Model Conundrums and Quantization Queries: Users delved into LM Studio intricacies, from advice on improving API inferencing to difficulties using multiple GPUs. Misunderstandings about model support and file formats like .gguf were clarified, with a focus on models like Command-R 35B and Mistral Non-Instruct. Upcoming features like RAG integration in LM Studio v0.2.17 and IQ1 model compression tests also sparked interest, revealing that Q3 (3-bit) quantization or better is needed for stable Mixtral and MoE model performance.

Interdisciplinary Hardware Harmony: Hardware discussions spanned from optimizing Apple Silicon for LLMs to weighing whether NVLink improves Goliath 120B performance. Enthusiasts compared notes on system memory, debating ideal RAM configurations and anticipating Nvidia's RTX 5090 GPU. Concurrently, ROCm beta limitations were highlighted, with reports of GPU-offloading issues, particularly on the AMD 7840HS and integrated GPUs. A Reddit post and a GitHub repository provided additional insight into tweaking VRAM and resolving AMD GPU-offloading dilemmas.

Relevant links for additional context:


Perplexity AI Discord Summary

Haiku for the Technical Mind: Claude 3 Haiku has been unleashed at Perplexity Labs, offering a new poetic twist to AI.

Techies Prefer Claude 3: Users are gravitating towards Claude 3 for an array of tasks, including writing and content creation, citing its strengths over other GPT models.

Perplexing API Quirks and Queries: The Perplexity API is stirring both intrigue and confusion among users with issues around real-time data querying and inconsistent responses when compared to the chat interface.

Firefox Extension Uses Perplexity API: A user is experimenting with a Firefox extension that taps into the Perplexity API, still at a proof of concept stage.

Mind the API Deprecations: Members are puzzled by the operational status of the pplx-70b-online model, noting planned deprecation but observing ongoing responses as of March 15.


Eleuther Discord Summary

Game AI Gets Green Thumbs: Discussions envisioned an AI mastering Animal Crossing, epitomizing the capability of game-playing AIs and highlighting benchmarks for their success. The analyses reflected on AI strategies and fairness, with constraints suggested like action limits or induced latency to level the playing field against human gamers.
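
The fairness constraints floated here (action limits, induced latency) were not pinned down concretely in the discussion; as one hedged sketch, an actions-per-second cap could be enforced with a sliding-window wrapper around any agent. All names below are hypothetical:

```python
import time
from collections import deque

class RateLimitedAgent:
    """Wrap a game-playing agent so it can act at most `max_actions_per_sec`
    times in any trailing one-second window, approximating human APM limits."""

    def __init__(self, agent, max_actions_per_sec=5):
        self.agent = agent
        self.max_aps = max_actions_per_sec
        self.timestamps = deque()  # times of recent permitted actions

    def act(self, observation, now=None):
        """Return the agent's action, or None (a forced no-op) when the
        action budget for the current one-second window is exhausted."""
        now = time.monotonic() if now is None else now
        # Discard timestamps that fell out of the trailing window.
        while self.timestamps and now - self.timestamps[0] >= 1.0:
            self.timestamps.popleft()
        if len(self.timestamps) >= self.max_aps:
            return None
        self.timestamps.append(now)
        return self.agent(observation)
```

The `now` parameter exists only to make the window deterministic in tests; in a live game loop you would omit it and let `time.monotonic()` drive the window.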

Interpreting the Unseen in AI: Engineers examined latent decoding by vector-db-lookup to demystify AI's intermediate representations, employing multilingual embeddings from Llama2 to decode at various layers. They engaged in bilingual tokenizer experiments, pondering the weight of training data on AI biases and exploring text generation from n-gram statistics, citing an implementation on GitHub.
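
The n-gram text-generation idea is the easiest of these to sketch. The stdlib-only toy below (not the GitHub implementation cited in the discussion) counts next-token frequencies per (n-1)-token context and samples continuations from those counts:

```python
import random
from collections import defaultdict, Counter

def build_ngram_model(tokens, n=2):
    """Map each (n-1)-token context to a Counter of next-token frequencies."""
    model = defaultdict(Counter)
    for i in range(len(tokens) - n + 1):
        context = tuple(tokens[i:i + n - 1])
        model[context][tokens[i + n - 1]] += 1
    return model

def generate(model, context, length=10, seed=0):
    """Sample up to `length` tokens, stopping early on an unseen context."""
    rng = random.Random(seed)
    k = len(context)  # context size is n-1
    out = list(context)
    for _ in range(length):
        counts = model.get(tuple(out[-k:]))
        if not counts:
            break
        tokens, weights = zip(*counts.items())
        out.append(rng.choices(tokens, weights=weights)[0])
    return out
```

For example, building a bigram model over a short token list and generating from the context `("the",)` yields text that only chains pairs actually observed in the corpus.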

AI Detection and Authorship Integrity: The limitations of AI content detectors were scrutinized, suggesting reliance on verifiable creation processes as the only substantial proof of human authorship. Cryptographic watermarking debates ensued, centering on its true efficacy and ramifications for model utility, with additional talk regarding innovations such as Quiet-STaR for AI reasoning improvement.
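
As a rough illustration of the watermarking idea debated here, a Kirchenbauer-style red/green-list scheme hashes the previous token to pseudorandomly partition the vocabulary, restricts (or biases) generation toward the "green" half, and detects by measuring the green fraction of token transitions. This toy sketch uses a hypothetical vocabulary and a uniform "model"; it shows the mechanics, not a production scheme:

```python
import hashlib
import random

VOCAB = [f"tok{i}" for i in range(100)]  # toy stand-in vocabulary

def green_list(prev_token, fraction=0.5):
    """Pseudorandomly partition VOCAB, seeded by the previous token,
    and return the 'green' portion."""
    seed = int(hashlib.sha256(prev_token.encode()).hexdigest(), 16)
    rng = random.Random(seed)
    shuffled = VOCAB[:]
    rng.shuffle(shuffled)
    return set(shuffled[: int(len(VOCAB) * fraction)])

def generate_watermarked(length=50, seed=0):
    """Toy 'model': sample uniformly, but only from the green list."""
    rng = random.Random(seed)
    text = ["tok0"]
    for _ in range(length):
        text.append(rng.choice(sorted(green_list(text[-1]))))
    return text

def green_fraction(tokens):
    """Detector: fraction of transitions that land on a green token."""
    hits = sum(t in green_list(p) for p, t in zip(tokens, tokens[1:]))
    return hits / (len(tokens) - 1)
```

Watermarked text scores a green fraction near 1.0 while unwatermarked text hovers near the partition fraction (0.5 here) — which is also where the utility debate summarized above lives, since hard-restricting to the green list distorts the model's distribution.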

Workflow Woes in AI Evaluation: The verbosity of the latest language models poses challenges for extracting useful responses in LLM evaluation tasks. Skepticism arose around vector space models effectively capturing language meaning, fueled by the ungrammatical outputs observed from models like GPT-J. In trying to incorporate custom models into lm-evaluation-harness, new users expressed the need for clearer examples for integrating functions like generate_until.
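
On the `generate_until` question: the real harness expects a subclass of `lm_eval.api.model.LM`, whose `generate_until` receives `Instance` objects carrying more fields than shown here. The library-free mock below only sketches the contract's shape — (context, gen_kwargs) pairs in, one stop-string-truncated continuation per request out — and every name except `generate_until` is illustrative:

```python
class MyModel:
    """Mock of the generate_until contract; the real harness subclasses
    lm_eval.api.model.LM and also implements the loglikelihood methods."""

    def generate_until(self, requests):
        """Each request is (context, gen_kwargs); gen_kwargs['until'] lists
        stop strings. Return one truncated continuation per request."""
        out = []
        for context, gen_kwargs in requests:
            raw = self._model_call(context)
            for stop in gen_kwargs.get("until", []):
                raw = raw.split(stop)[0]  # cut at the first stop string
            out.append(raw)
        return out

    def _model_call(self, context):
        # Placeholder for an actual inference call (API or local model).
        return "answer.\n\nnext question"
```

A custom model would replace `_model_call` with real inference and register the class with the harness; the truncation loop is the part newcomers most often asked about.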

Augmenting AI's Prompt Perspicacity: A link to Brian Fitzgerald's exploration of prompt augmentation was shared (brianfitzgerald.xyz/prompt-augmentation/), possibly alluding to recent advancements or methods in bolstering AI's response generation through enriched input prompts, capturing the interest of those invested in enhancing AI interactions.


HuggingFace Discord Summary


LlamaIndex Discord Summary


Latent Space Discord Summary


OpenAI Discord Summary


OpenAccess AI Collective (axolotl) Discord Summary


OpenRouter (Alex Atallah) Discord Summary


CUDA MODE Discord Summary


LangChain AI Discord Summary


LAION Discord Summary


LLM Perf Enthusiasts AI Discord Summary


Skunkworks AI Discord Summary


Datasette - LLM (@SimonW) Discord Summary


Interconnects (Nathan Lambert) Discord Summary


DiscoResearch Discord Summary


PART 2: Detailed by-Channel summaries and links

Nous Research AI ▷ #ctx-length-research (3 messages):


Nous Research AI ▷ #off-topic (23 messages🔥):

Links mentioned:


Nous Research AI ▷ #interesting-links (10 messages🔥):

Links mentioned:


Nous Research AI ▷ #general (406 messages🔥🔥🔥):

Links mentioned:


Nous Research AI ▷ #ask-about-llms (60 messages🔥🔥):

Links mentioned:


Nous Research AI ▷ #bittensor-finetune-subnet (3 messages):


Unsloth AI (Daniel Han) ▷ #general (151 messages🔥🔥):

Links mentioned:


Unsloth AI (Daniel Han) ▷ #random (17 messages🔥):


Unsloth AI (Daniel Han) ▷ #help (221 messages🔥🔥):

Links mentioned:


Unsloth AI (Daniel Han) ▷ #suggestions (12 messages🔥):


LM Studio ▷ #💬-general (216 messages🔥🔥):

Links mentioned:


LM Studio ▷ #🤖-models-discussion-chat (28 messages🔥):

Links mentioned:


LM Studio ▷ #🧠-feedback (6 messages):

Link mentioned: andrewcanis/c4ai-command-r-v01-GGUF · Hugging Face: no description found


LM Studio ▷ #🎛-hardware-discussion (126 messages🔥🔥):

Links mentioned:


LM Studio ▷ #🧪-beta-releases-chat (1 message):


LM Studio ▷ #amd-rocm-tech-preview (19 messages🔥):

Links mentioned:


Perplexity AI ▷ #announcements (2 messages):


Perplexity AI ▷ #general (325 messages🔥🔥):

Links mentioned:


Perplexity AI ▷ #sharing (12 messages🔥):


Perplexity AI ▷ #pplx-api (31 messages🔥):

Link mentioned: About "return_citations": no description found


Eleuther ▷ #general (132 messages🔥🔥):

Links mentioned:


Eleuther ▷ #research (117 messages🔥🔥):

Links mentioned:


Eleuther ▷ #scaling-laws (1 message):

kerls: are there any resources on scaling laws for video generation models?


Eleuther ▷ #interpretability-general (32 messages🔥):

Links mentioned:


Eleuther ▷ #lm-thunderdome (5 messages):

Links mentioned:


Eleuther ▷ #multimodal-general (1 message):

boneamputee: https://brianfitzgerald.xyz/prompt-augmentation/


HuggingFace ▷ #announcements (1 message):


HuggingFace ▷ #general (115 messages🔥🔥):

Links mentioned:


HuggingFace ▷ #today-im-learning (4 messages):

Link mentioned: GitHub - nvbn/thefuck: Magnificent app which corrects your previous console command.: Magnificent app which corrects your previous console command. - nvbn/thefuck


HuggingFace ▷ #cool-finds (6 messages):

Links mentioned:


HuggingFace ▷ #i-made-this (9 messages🔥):


HuggingFace ▷ #reading-group (11 messages🔥):

Links mentioned:


HuggingFace ▷ #core-announcements (1 message):


HuggingFace ▷ #diffusion-discussions (8 messages🔥):

Link mentioned: Kohya Hires fix · Issue #7265 · huggingface/diffusers: is diffusers possible to support this hires fix? it looks 1.5 work too AUTOMATIC1111/stable-diffusion-webui#13974 https://www.youtube.com/watch?v=SbgMwHDXthU same seed at 1024x1024 with without Thi...


HuggingFace ▷ #computer-vision (8 messages🔥):


HuggingFace ▷ #NLP (9 messages🔥):



LlamaIndex ▷ #blog (4 messages):

Links mentioned:


LlamaIndex ▷ #general (132 messages🔥🔥):

Links mentioned:


Latent Space ▷ #ai-general-chat (61 messages🔥🔥):


Latent Space ▷ #ai-announcements (5 messages):

Link mentioned: Making Transformers Sing - with Mikey Shulman of Suno: Giving computers a voice has always been at the center of sci-fi movies; “I’m sorry Dave, I’m afraid I can’t do that” wouldn’t hit as hard if it just appeare...


Latent Space ▷ #llm-paper-club-west (24 messages🔥):


Latent Space ▷ #ai-in-action-club (36 messages🔥):

Links mentioned:


OpenAI ▷ #ai-discussions (60 messages🔥🔥):

Link mentioned: Enterprise privacy: no description found


OpenAI ▷ #gpt-4-discussions (1 message):

wesego: Hi, having that problem right now.


OpenAI ▷ #prompt-engineering (7 messages):


OpenAI ▷ #api-discussions (7 messages):


OpenAccess AI Collective (axolotl) ▷ #general (47 messages🔥):

Links mentioned:


OpenAccess AI Collective (axolotl) ▷ #axolotl-dev (13 messages🔥):

Links mentioned:


OpenAccess AI Collective (axolotl) ▷ #general-help (9 messages🔥):

Link mentioned: Quickstart — vLLM: no description found


OpenRouter (Alex Atallah) ▷ #announcements (4 messages):

Links mentioned:


OpenRouter (Alex Atallah) ▷ #general (54 messages🔥):

Links mentioned:


CUDA MODE ▷ #general (12 messages🔥):

Links mentioned:


CUDA MODE ▷ #triton (10 messages🔥):

Links mentioned:


CUDA MODE ▷ #cuda (13 messages🔥):

Link mentioned: Programming Massively Parallel Processors: A Hands-on Approach: Hwu, Wen-mei W., Kirk, David B., El Hajj, Izzat: 9780323912310: Amazon.com: Books: no description found


CUDA MODE ▷ #jobs (1 message):

vim410: Depends. But yes.


CUDA MODE ▷ #pmpp-book (8 messages🔥):


CUDA MODE ▷ #ring-attention (7 messages):

Link mentioned: add naive triton kernel for varlen · zhuzilin/ring-flash-attention@10d992c: no description found


CUDA MODE ▷ #off-topic (3 messages):

Link mentioned: Meta sues “brazenly disloyal” former exec over stolen confidential docs: Meta's former exec allegedly shared data center secrets with a shadowy startup.


LangChain AI ▷ #announcements (1 message):

Link mentioned: RFC: Expedited langchain 0.2 release · langchain-ai/langchain · Discussion #19083: Context Currently langchain (the package) depends on langchain-community. This is done only for backwards compatibility with langchain versions that predate the split of langchain and langchain-com...


LangChain AI ▷ #general (34 messages🔥):

Links mentioned:


LangChain AI ▷ #langchain-templates (1 message):


LangChain AI ▷ #share-your-work (6 messages):

Links mentioned:


LangChain AI ▷ #tutorials (1 message):

pradeep1148: https://www.youtube.com/watch?v=PzaidfqDtGI


LAION ▷ #general (27 messages🔥):

Links mentioned:


LAION ▷ #research (13 messages🔥):

Links mentioned:


LLM Perf Enthusiasts AI ▷ #general (1 message):

Since there is only one message provided and no additional context such as previous messages, links, or discussion points, a summary cannot be generated based on the instructions given. Please provide a series of messages or more context to summarize.


LLM Perf Enthusiasts AI ▷ #gpt4 (1 message):


LLM Perf Enthusiasts AI ▷ #claude (18 messages🔥):

Link mentioned: Tweet from roon (@tszzl) (https://x.com/tszzl/status/1768530219378631137?s=20): anthropic is controlled opposition to put the fear of god in the members of technical staff


LLM Perf Enthusiasts AI ▷ #reliability (16 messages🔥):

Links mentioned:


Skunkworks AI ▷ #general (17 messages🔥):


Skunkworks AI ▷ #off-topic (2 messages):

Links mentioned:


Datasette - LLM (@SimonW) ▷ #ai (16 messages🔥):

Links mentioned:


Datasette - LLM (@SimonW) ▷ #llm (1 message):

obra: Is it possible to recover the seed used by the openai models for a previous api request?


Interconnects (Nathan Lambert) ▷ #other-papers (8 messages🔥):

Link mentioned: Logits of API-Protected LLMs Leak Proprietary Information: The commercialization of large language models (LLMs) has led to the common practice of high-level API-only access to proprietary models. In this work, we show that even with a conservative assumption...


Interconnects (Nathan Lambert) ▷ #ml-questions (4 messages):

Link mentioned: Towards Agile Text Classifiers for Everyone: Text-based safety classifiers are widely used for content moderation and increasingly to tune generative language model behavior - a topic of growing concern for the safety of digital assistants and c...


Interconnects (Nathan Lambert) ▷ #random (5 messages):

Link mentioned: Tweet from Teknium (e/λ) (@Teknium1): This explains why Yann is so bearish on LLMs... 😲


DiscoResearch ▷ #general (3 messages):


DiscoResearch ▷ #embedding_dev (1 message):


DiscoResearch ▷ #discolm_german (5 messages):