Frozen AI News archive

Adept Fuyu-Heavy: Multimodal model for Agents

**Adept** launched **Fuyu-Heavy**, a multimodal model focused on UI understanding and visual QA that outperforms **Gemini Pro** on the MMMU benchmark. The model uses **DPO** (Direct Preference Optimization), which is gaining attention as a leading tuning method. Fuyu-Heavy's size is undisclosed but estimated at **20B-170B** parameters, assuming it is 10-20x smaller than the rumored sizes of frontier models like **Claude 2**, **GPT4V**, and **Gemini Ultra**. Meanwhile, **Mamba** was rejected at ICLR over quality concerns. In Discord discussions, **DeepSeek Coder 33B** was claimed to outperform **GPT-4** in coding tasks, and deployment strategies for large models like **Yi-34B-200K** and **Goliath-120B** were explored. Quantization debates highlighted mixed views on **Q8** and **EXL2 quants**. Fine-tuning and instruct-tuning of **Mistral 7B Instruct v0.2** were discussed, alongside insights on RMS optimization and heterogeneous AI architectures combining **Transformers** and **Selective SSM (Mamba)**. The potential of recurrent LLMs like **RWKV** and of techniques like **Contrastive Preference Optimization (CPO)** was also noted.


Adept's turn for a splashy launch:

image.png

The emphasis seems to be UI understanding, which makes sense as a focus given Adept's business. The demo video shows very good, precise visual QA on 7 screenshots of UIs, but it ran on a Gradio interface and so revealed no other part of the Adept product. Fuyu-Heavy also uses DPO, which has suddenly become the presumptive winner of the brief DPO vs IPO vs KTO wars. Fuyu-Heavy beats Gemini Pro on the new MMMU benchmark, but it's unclear where GPT4V lands on it (someone run it?).

A couple of people called out the side comments comparing the size of Fuyu-Heavy with Claude 2, GPT4-V, and Gemini Ultra, given that those details aren't public, and Adept didn't even mention its own model size (it's bigger than Fuyu-8B, that's all we really know). Assuming those frontier models are in the rumored 400B to 1.7T param range, being 10-20x smaller puts Fuyu-Heavy somewhere between roughly 20B (400B / 20) and 170B (1.7T / 10) params.
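The back-of-envelope size bounds can be reproduced with a few lines of arithmetic. This is only a sketch of the estimate described here: the 400B to 1.7T range is the rumored (unconfirmed) frontier-model size, and the function name `fuyu_heavy_bounds` is made up for illustration.

```python
# Rumored (unconfirmed) frontier-model sizes, in billions of parameters.
RUMORED_MIN_B = 400
RUMORED_MAX_B = 1_700

# "10-20x smaller", per the size comparison being discussed.
SHRINK_MIN = 10
SHRINK_MAX = 20


def fuyu_heavy_bounds(rumored_min_b, rumored_max_b, shrink_min, shrink_max):
    """Return (lower, upper) size bounds in billions of parameters.

    The lower bound pairs the smallest rumored model with the largest
    shrink factor; the upper bound pairs the largest rumored model with
    the smallest shrink factor.
    """
    lower = rumored_min_b / shrink_max
    upper = rumored_max_b / shrink_min
    return lower, upper


lower_b, upper_b = fuyu_heavy_bounds(
    RUMORED_MIN_B, RUMORED_MAX_B, SHRINK_MIN, SHRINK_MAX
)
print(f"Estimated range: {lower_b:.0f}B to {upper_b:.0f}B params")
```

The range is wide because both inputs are guesses: the rumored sizes span more than 4x and the shrink factor another 2x, so the bounds differ by about 8.5x.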

In other news, Mamba was rejected for ICLR as "not good enough". Lol?


Table of Contents

[TOC]

PART 1: High level Discord summaries

TheBloke Discord Summary


Nous Research AI Discord Summary


OpenAI Discord Summary


Perplexity AI Discord Summary


LM Studio Discord Summary


Eleuther Discord Summary


OpenAccess AI Collective (axolotl) Discord Summary


Latent Space Discord Summary


Mistral Discord Summary


HuggingFace Discord Summary

AI Study Courts VFX Artists: A survey seeking insights from VFX artists and producers is underway as part of an AI study, soliciting valuable industry input.

Greener Alexa Alternatives on Your Own Terms: @mattbcool addresses electronic waste by retrofitting Alexa hardware, detailing efforts to build a local, open-source personal assistant using Raspberry Pis.

CircleCI Powers LLM Automated Testing Course: DeepLearning.AI and CircleCI have teamed up to offer a course on using continuous integration tools to assess LLM applications effectively.

Breathing Life Into Text with 3DTopia: Discovered by @meatfucker, 3DTopia's GitHub repository promises to transform text into 3D models promptly with downloadable model weights and code.

Python Module Marries Steering Vectors with Hugging Face's Transformers: @mihai4256 created a Python module that integrates steering vectors with transformers, hinting at more details in a tweet.


LAION Discord Summary

Note: usernames cited in the original discussions have been omitted from this summary because their relevance to an AI Engineer audience is unclear.


LlamaIndex Discord Summary


LangChain AI Discord Summary

X/Twitter Account Unblocked: The X/Twitter account has been recovered, and users previously affected have been unblocked. Those still experiencing issues can seek help by posting in the thread.

Streamline Your Apps with LangChain Streaming API: LangChain reveals a new streaming API to support real-time responsiveness in user applications. Resources include API documentation, specific modules for AgentExecutor and LangGraph (AgentExecutor docs, LangGraph Notebook), and a YouTube tutorial on stream_events. Feedback and discussions on the feature are welcomed on GitHub.

Database Dilemmas & Discussions: Queries range from determining if LangChain is open source to how to best integrate vector embeddings in a Postgres Database schema, alongside a call to share preferred vector storage solutions. Helpful references include the PostgreSQL Schemas documentation.

LangServe Learning Curve: Users in the LangServe channel grapple with utilizing agent_executor and understanding the capabilities of LCELs, some wishing for direct guidance from more experienced members in setting up and expanding tool usage.

Innovations and Connections in Shared Work: The launch of AgentHub, a platform aimed at combining RPA with AI, is announced along with a blog post on potential productivity gains (AI and RPA: The Future of Work). Meanwhile, a user calls for collaboration without providing specific context.

Educate with AI-Oriented Courses: A free 9-part AI series including "Building Multimodal AI Applications with LangChain & the OpenAI API" is available at DataCamp, and a new free course on automated testing of AI applications is offered by CircleCI and DeepLearning.AI (Automated Testing with LLMOPS).


DiscoResearch Discord Summary


PART 2: Detailed by-Channel summaries and links

TheBloke ▷ #general (1239 messages🔥🔥🔥):

Links mentioned:


TheBloke ▷ #characters-roleplay-stories (116 messages🔥🔥):

Links mentioned:


TheBloke ▷ #training-and-fine-tuning (67 messages🔥🔥):

Links mentioned:


TheBloke ▷ #model-merging (3 messages):


TheBloke ▷ #coding (4 messages):

Nous Research AI ▷ #off-topic (10 messages🔥):

Links mentioned:

Jordan Batter Looksmaxxing GIF - Jordan batter Looksmaxxing No - Discover & Share GIFs: Click to view the GIF


Nous Research AI ▷ #interesting-links (27 messages🔥):

Links mentioned:


Nous Research AI ▷ #general (231 messages🔥🔥):

Links mentioned:


Nous Research AI ▷ #ask-about-llms (20 messages🔥):


Nous Research AI ▷ #project-obsidian (5 messages):

Links mentioned:

GitHub - nttmdlab-nlp/InstructDoc: InstructDoc: A Dataset for Zero-Shot Generalization of Visual Document Understanding with Instructions (AAAI2024): InstructDoc: A Dataset for Zero-Shot Generalization of Visual Document Understanding with Instructions (AAAI2024) - GitHub - nttmdlab-nlp/InstructDoc: InstructDoc: A Dataset for Zero-Shot Generaliz...


OpenAI ▷ #annnouncements (1 messages):


OpenAI ▷ #ai-discussions (18 messages🔥):

Links mentioned:


OpenAI ▷ #gpt-4-discussions (66 messages🔥🔥):


OpenAI ▷ #prompt-engineering (71 messages🔥🔥):


OpenAI ▷ #api-discussions (71 messages🔥🔥):

Perplexity AI ▷ #general (160 messages🔥🔥):

Links mentioned:


Perplexity AI ▷ #sharing (4 messages):

Links mentioned:


Perplexity AI ▷ #pplx-api (5 messages):

LM Studio ▷ #💬-general (71 messages🔥🔥):

Links mentioned:


LM Studio ▷ #🤖-models-discussion-chat (20 messages🔥):

Links mentioned:


LM Studio ▷ #🎛-hardware-discussion (60 messages🔥🔥):

Links mentioned:

LM Studio Beta Releases: no description found


LM Studio ▷ #🧪-beta-releases-chat (2 messages):

Links mentioned:

Open LLM Leaderboard - a Hugging Face Space by HuggingFaceH4: no description found


LM Studio ▷ #memgpt (1 messages):


LM Studio ▷ #open-interpreter (2 messages):

Eleuther ▷ #announcements (1 messages):


Eleuther ▷ #general (11 messages🔥):


Eleuther ▷ #research (80 messages🔥🔥):

Links mentioned:


Eleuther ▷ #scaling-laws (2 messages):


Eleuther ▷ #interpretability-general (14 messages🔥):

Links mentioned:

Circuits Updates - January 2024: no description found


Eleuther ▷ #gpt-neox-dev (7 messages):

Links mentioned:

Issues · microsoft/DeepSpeed: DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective. - Issues · microsoft/DeepSpeed


OpenAccess AI Collective (axolotl) ▷ #general (59 messages🔥🔥):

Links mentioned:


OpenAccess AI Collective (axolotl) ▷ #general-help (13 messages🔥):


OpenAccess AI Collective (axolotl) ▷ #datasets (15 messages🔥):

Links mentioned:

GitHub - abachaa/Existing-Medical-QA-Datasets: Multimodal Question Answering in the Medical Domain: A summary of Existing Datasets and Systems: Multimodal Question Answering in the Medical Domain: A summary of Existing Datasets and Systems - GitHub - abachaa/Existing-Medical-QA-Datasets: Multimodal Question Answering in the Medical Domain:...


OpenAccess AI Collective (axolotl) ▷ #announcements (1 messages):


OpenAccess AI Collective (axolotl) ▷ #replicate-help (1 messages):

Latent Space ▷ #ai-general-chat (49 messages🔥):

Links mentioned:


Latent Space ▷ #ai-event-announcements (2 messages):

Links mentioned:

Latent Space (Paper Club & Other Events) · Luma: View and subscribe to events from Latent Space (Paper Club & Other Events) on Luma. Latent.Space events. PLEASE CLICK THE RSS LOGO JUST ABOVE THE CALENDAR ON THE RIGHT TO ADD TO YOUR CAL. "Ad...


Latent Space ▷ #llm-paper-club (19 messages🔥):

Links mentioned:


Latent Space ▷ #llm-paper-club-chat (1 messages):

Links mentioned:

GitHub - Yifan-Song793/RestGPT: An LLM-based autonomous agent controlling real-world applications via RESTful APIs: An LLM-based autonomous agent controlling real-world applications via RESTful APIs - GitHub - Yifan-Song793/RestGPT: An LLM-based autonomous agent controlling real-world applications via RESTful APIs


Mistral ▷ #general (62 messages🔥🔥):

Links mentioned:

Cody | AI coding assistant: Cody is the most powerful and accurate AI coding assistant for writing, fixing, and maintaining code.


Mistral ▷ #deployment (1 messages):


Mistral ▷ #ref-implem (1 messages):


Mistral ▷ #finetuning (1 messages):


Mistral ▷ #showcase (2 messages):

Links mentioned:

A-JEPA neural model: Unlocking semantic knowledge from .wav / .mp3 audio file or audio spectrograms: 🌟 Unlock the Power of AI Learning from Audio ! 🔊 Watch a deep dive discussion on the A-JEPA approach with Oliver, Nevil, Ojasvita, Shashank, Srikanth and N...


Mistral ▷ #la-plateforme (3 messages):

Links mentioned:

LLM/llama-cpp-rag - final.ipynb at main · Quad-AI/LLM: Contribute to Quad-AI/LLM development by creating an account on GitHub.


HuggingFace ▷ #general (39 messages🔥):

Links mentioned:


HuggingFace ▷ #today-im-learning (1 messages):


HuggingFace ▷ #cool-finds (2 messages):

Links mentioned:


HuggingFace ▷ #i-made-this (3 messages):


HuggingFace ▷ #reading-group (3 messages):

Links mentioned:


HuggingFace ▷ #diffusion-discussions (5 messages):

Links mentioned:

Video generation using target audio and reference video. · huggingface/diffusers · Discussion #6696: I am working on a personal project which involves : Input a reference video and a target audio, synthesise a target video (lip synced talking head video generation driven by the target audio). I wo...


HuggingFace ▷ #computer-vision (4 messages):

Links mentioned:

Mixins & serialization methods: no description found


HuggingFace ▷ #NLP (5 messages):

Links mentioned:




LAION ▷ #general (41 messages🔥):

Links mentioned:


LAION ▷ #research (15 messages🔥):

Links mentioned:

LlamaIndex ▷ #blog (1 messages):


LlamaIndex ▷ #general (49 messages🔥):

Links mentioned:


LlamaIndex ▷ #ai-discussion (1 messages):

rawwerks: 👋 community question❓ what is your favorite vector store company and why?

LangChain AI ▷ #announcements (2 messages):

Links mentioned:


LangChain AI ▷ #general (21 messages🔥):

Links mentioned:

5.9. Schemas: 5.9. Schemas # 5.9.1. Creating a Schema 5.9.2. The Public Schema 5.9.3. The Schema Search Path 5.9.4. Schemas and Privileges 5.9.5. …


LangChain AI ▷ #langserve (1 messages):


LangChain AI ▷ #langchain-templates (1 messages):

sideways1: Has anyone built a Q&A chatbot that interacts with a database of JSON files?


LangChain AI ▷ #share-your-work (4 messages):

Links mentioned:

AI and RPA: The Future of Work: The marriage of RPA tooling and AI is going to cause a monumental explosion in productivity in the next few years.


LangChain AI ▷ #tutorials (2 messages):

Links mentioned:

DiscoResearch ▷ #disco_judge (1 messages):


DiscoResearch ▷ #general (9 messages🔥):

Links mentioned:

GitHub - jondurbin/airoboros: Customizable implementation of the self-instruct paper.: Customizable implementation of the self-instruct paper. - GitHub - jondurbin/airoboros: Customizable implementation of the self-instruct paper.


DiscoResearch ▷ #embedding_dev (11 messages🔥):


DiscoResearch ▷ #discolm_german (5 messages):

Links mentioned:

DiscoResearch/DiscoLM-70b · Hugging Face: no description found

Alignment Lab AI ▷ #oo (2 messages):

AI Engineer Foundation ▷ #general (2 messages):