Frozen AI News archive

Google Solves Text to Video

**Google Research** introduced **Lumiere**, a text-to-video model featuring advanced inpainting capabilities using a Space-Time diffusion process, surpassing previous models like Pika and Runway. Manveer from UseScholar.org compiled a comprehensive list of code evaluation benchmarks beyond HumanEval, including datasets from **Amazon Science**, **Hugging Face**, and others. Discord communities such as **TheBloke** discussed topics including running **Mistral-7B** via API, GPU rentals, and multimodal model integration with **LLava**. **Nous Research AI** highlighted learning rate strategies for LLM fine-tuning, issues with inference, and benchmarks like HumanEval and MBPP. **RestGPT** gained attention for controlling applications via RESTful APIs, showcasing LLM application capabilities.

Canonical issue URL

Lumiere - text to video

Enter Lumiere from Google Research. Every part of this video is computer generated:

https://www.youtube.com/watch?v=wxLr02Dz2Sc

In particular I would draw your attention to their inpainting capabilities - watch the syrup pour on the cake and stay there:

[image: Lumiere inpainting demo]

This is a step above anything we've yet seen coming out of Pika and Runway. This seems to come from a Space-Time diffusion process:

[image: Space-Time diffusion architecture]

which we think Einstein would particularly enjoy.

Code Evals beyond HumanEval

In other news, Manveer of UseScholar.org is collating a comprehensive list of all evals, including some code ones we haven't heard of:

--

Table of Contents

[TOC]

PART 1: High level Discord summaries

TheBloke Discord Summary


Nous Research AI Discord Summary


Mistral Discord Summary

Key links mentioned:


LM Studio Discord Summary


OpenAccess AI Collective (axolotl) Discord Summary

Logit Distillation's Progress and Voice Synthesis Challenge: Discussions revealed progress in logit distillation using GPT-4 logits, with success in backfilling strategies. However, adding custom tokens for voice synthesis to LLMs (as many as 8k of them) would require extensive pretraining, as shared by participants like @ex3ndr, @le_mess, and @stefangliga.
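
For readers new to the technique, the objective behind the GPT-4-logit discussion can be sketched in a few lines. This is the generic KL-divergence distillation loss under an assumed temperature, not the participants' actual training code:

```python
import math

def softmax(logits, temperature=1.0):
    """Convert raw logits into a probability distribution."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_kl(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) over temperature-softened distributions,
    the standard distillation objective when teacher logits
    (e.g. backfilled GPT-4 logprobs) are available."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

teacher = [2.0, 1.0, 0.1]   # illustrative logits, not real model output
student = [1.5, 1.2, 0.3]
loss = distillation_kl(teacher, student)
```

The loss is zero only when the student's softened distribution matches the teacher's exactly.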

Jupyter SSL Woes and Self-Rewarding Language Models: SSL issues with Jupyter in the Latitude container surfaced without a solution, leading @dctanner to fall back on SSH port forwarding. Interest in Self-Rewarding Language Models sparked discussion, with a PyTorch implementation shared by @caseus_.

DPO Dataset Loading Success, Strategy Struggles, and Local Dataset Queries: Members discussed overcoming DPO dataset loading issues via a PR, with @dangfutures dropping to a micro batch size of 1 amid out-of-memory errors. There was a collaborative effort to address prompt strategies and finetuning with llava models, indicating the Axolotl framework's flexibility, referenced by @caseus_, @noobmaster29, and @gameveloster.
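
The DPO loss at the center of those loading and memory struggles is itself compact. A minimal per-pair sketch, where the beta value and log-prob numbers are illustrative rather than from the thread:

```python
import math

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Direct Preference Optimization loss for one preference pair:
    -log sigmoid(beta * ((pi_c - pi_r) - (ref_c - ref_r))).
    Inputs are summed log-probs of each completion under the
    trainable policy and the frozen reference model."""
    policy_ratio = policy_chosen_logp - policy_rejected_logp
    ref_ratio = ref_chosen_logp - ref_rejected_logp
    margin = beta * (policy_ratio - ref_ratio)
    return -math.log(1.0 / (1.0 + math.exp(-margin)))

# Policy already prefers the chosen answer more than the reference does,
# so the loss falls below -log(0.5).
loss = dpo_loss(-10.0, -14.0, -11.0, -13.0)
```

The memory pressure in practice comes from holding both policy and reference forward passes for chosen and rejected sequences, which is why a micro batch size of 1 is a common escape hatch.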

Insight into Optimal LoRA Hyperparameters and Dataset Overlap Confirmation: A shared Lightning AI article provided insights on effective LoRA hyperparameter usage, as @noobmaster29 and @c.gato discussed alpha, rank, and batch size variations. Dataset overlap concerns between dolphin and openorca datasets were confirmed, signaling data redundancy awareness.
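
The alpha/rank interplay from that discussion can be made concrete. A dependency-free sketch of the scaled low-rank update; the shapes and the alpha = 2*r choice are purely illustrative:

```python
import random

def lora_delta(A, B, alpha, r):
    """Low-rank weight update scaled by alpha/r: delta_W = (alpha/r) * B @ A.
    Keeping the alpha/r ratio fixed (e.g. alpha = 2*r) keeps the update
    magnitude comparable when sweeping the rank."""
    scale = alpha / r
    d_out, d_in = len(B), len(A[0])
    delta = [[0.0] * d_in for _ in range(d_out)]
    for i in range(d_out):
        for j in range(d_in):
            delta[i][j] = scale * sum(B[i][k] * A[k][j] for k in range(r))
    return delta

r, alpha, d_in, d_out = 4, 8, 6, 6   # toy sizes, not a recommendation
random.seed(0)
A = [[random.gauss(0, 0.02) for _ in range(d_in)] for _ in range(r)]  # r x d_in
B = [[0.0] * r for _ in range(d_out)]  # d_out x r, zero-init so delta starts at 0
delta = lora_delta(A, B, alpha, r)
```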

YAML Configuration and Prompt Tokenization for RLHF: RLHF projects encountered a KeyError within YAML configurations, but a resolution via new type formats (chatml.argilla and chatml.intel) was found and shared by @alekseykorshuk. Configurations for local datasets and prompt tokenization strategy updates were also discussed, emphasizing the evolving nature of these components.
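
For context, the chatml.* dataset types all render conversation turns in the ChatML template. A minimal formatter; the example messages are invented:

```python
def to_chatml(messages):
    """Render a list of {role, content} turns in the ChatML template,
    the format behind dataset types like chatml.argilla / chatml.intel."""
    return "".join(
        f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n" for m in messages
    )

example = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What is RLHF?"},
]
print(to_chatml(example))
```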

Cog Configurations for ML Containers: @dangfutures shared a Cog configuration guide detailing the use of CUDA "12.1", Python "3.11", and Python package installation for machine learning containers, as per Cog's documentation. This practical snippet demonstrates active community guidance on infrastructure setup.
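
A cog.yaml matching that description might look like the following. Only the CUDA and Python versions come from the discussion; the package pins and predictor path are illustrative:

```yaml
# cog.yaml -- container spec consumed by `cog build` / `cog predict`
build:
  gpu: true
  cuda: "12.1"
  python_version: "3.11"
  python_packages:
    - "torch==2.1.2"          # illustrative pins, not from the shared snippet
    - "transformers==4.36.2"
predict: "predict.py:Predictor"
```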


Eleuther Discord Summary

Byte-Level BPE Enables Multilingual LLM Responses: The Llama 2 model can generate responses in multiple languages because its byte-level BPE tokenizer falls back to raw bytes, covering scripts such as Hindi, Tamil, and Gujarati even without dedicated vocabulary tokens.
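
The byte-level fallback is easy to demonstrate without loading the Llama 2 tokenizer itself: any script reduces to UTF-8 bytes, and a byte-level vocabulary always covers all 256 byte values, so there is never an out-of-vocabulary symbol:

```python
# Byte-level BPE never hits an out-of-vocabulary symbol: scripts without
# dedicated merges simply fall back to their UTF-8 byte sequence.
def utf8_bytes(text):
    return list(text.encode("utf-8"))

for word in ["hello", "नमस्ते", "வணக்கம்"]:
    b = utf8_bytes(word)
    print(f"{word!r}: {len(word)} chars -> {len(b)} bytes")
```

The cost is efficiency, not coverage: Devanagari and Tamil characters take three bytes each, so such text consumes more tokens than English.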

Mamba's Scalability Questioned: Enthusiastic debate unfolded over Mamba's potential to scale and replace Transformers, with a lack of evidence concerning its performance at larger scales provoking skepticism among technical users.

Google Steals the Show with Lumiere: Google Research's space-time diffusion model for video generation, Lumiere, attracted attention, despite concerns over dataset size and data advantages.

First-of-its-kind Conference on Language Modeling: Excitement buzzed around the announcement of the inaugural Conference on Language Modeling at the University of Pennsylvania, promising to bring deep insights into language modeling research.

MoE Implementation Challenges and Parallelism: A developer shared a pull request implementing Mixture of Experts (MoE) in GPT-NeoX, raising questions about validating MoE under single-GPU limits and seeking insight into parallelism optimizations, while another pull request examines the potential of fused layernorm for performance gains.
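
As background for the routing questions in that PR, here is a toy scalar sketch of top-k expert routing. The expert functions and router logits are invented, and real MoE layers operate on tensors with load-balancing losses on top:

```python
import math

def top_k_route(logits, k=2):
    """Pick the k highest-scoring experts and renormalize their
    gate weights -- the routing step at the heart of an MoE layer."""
    ranked = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    exps = [math.exp(logits[i]) for i in ranked]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(ranked, exps)]

def moe_forward(x, experts, router_logits, k=2):
    """Weighted sum of the selected experts' outputs for one token."""
    out = 0.0
    for idx, weight in top_k_route(router_logits, k):
        out += weight * experts[idx](x)
    return out

# Four toy "experts" that just scale their input.
experts = [lambda x, s=s: s * x for s in (1.0, 2.0, 3.0, 4.0)]
y = moe_forward(10.0, experts, router_logits=[0.1, 2.0, 0.3, 1.5], k=2)
```

Only the k selected experts run, which is why validating a many-expert model on a single GPU is awkward: the full parameter set must fit even though each token touches a fraction of it.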


LAION Discord Summary


HuggingFace Discord Summary


Perplexity AI Discord Summary


LlamaIndex Discord Summary


OpenAI Discord Summary


DiscoResearch Discord Summary


Latent Space Discord Summary


LangChain AI Discord Summary


LLM Perf Enthusiasts AI Discord Summary


Skunkworks AI Discord Summary

Based on the provided messages, there isn't sufficient context or substantial technical content relevant to an engineer audience to generate a summary. Both messages appear to be informal communications without any discernible technical discussion or key points.


YAIG (a16z Infra) Discord Summary


Alignment Lab AI Discord Summary


Datasette - LLM (@SimonW) Discord Summary


PART 2: Detailed by-Channel summaries and links

TheBloke ▷ #general (1398 messages🔥🔥🔥):

Links mentioned:


TheBloke ▷ #characters-roleplay-stories (427 messages🔥🔥🔥):

Links mentioned:


TheBloke ▷ #training-and-fine-tuning (5 messages):


TheBloke ▷ #model-merging (2 messages):


TheBloke ▷ #coding (13 messages🔥):

Nous Research AI ▷ #off-topic (18 messages🔥):

Links mentioned:


Nous Research AI ▷ #interesting-links (38 messages🔥):

Links mentioned:


Nous Research AI ▷ #general (271 messages🔥🔥):

Links mentioned:


Nous Research AI ▷ #ask-about-llms (57 messages🔥🔥):

Links mentioned:

Mistral ▷ #general (225 messages🔥🔥):

Links mentioned:


Mistral ▷ #models (81 messages🔥🔥):

Links mentioned:


Mistral ▷ #deployment (21 messages🔥):

Links mentioned:

Engine Arguments — vLLM: no description found


Mistral ▷ #finetuning (7 messages):


Mistral ▷ #showcase (2 messages):


Mistral ▷ #la-plateforme (11 messages🔥):

Links mentioned:

LM Studio ▷ #💬-general (172 messages🔥🔥):

Links mentioned:


LM Studio ▷ #🤖-models-discussion-chat (15 messages🔥):


LM Studio ▷ #🧠-feedback (16 messages🔥):


LM Studio ▷ #🎛-hardware-discussion (29 messages🔥):

Links mentioned:


LM Studio ▷ #🧪-beta-releases-chat (2 messages):


LM Studio ▷ #autogen (1 messages):

senecalouck: Try it using 127.0.0.1 in the script.


LM Studio ▷ #langchain (1 messages):

gciri001: Is it possible to use Langchain and MySQL with LLAMA 2 without the OpenAI API?


LM Studio ▷ #crew-ai (4 messages):

OpenAccess AI Collective (axolotl) ▷ #general (50 messages🔥):

Links mentioned:


OpenAccess AI Collective (axolotl) ▷ #axolotl-dev (4 messages):

Links mentioned:

GitHub - lucidrains/self-rewarding-lm-pytorch: Implementation of the training framework proposed in Self-Rewarding Language Model, from MetaAI


OpenAccess AI Collective (axolotl) ▷ #general-help (116 messages🔥🔥):

Links mentioned:


OpenAccess AI Collective (axolotl) ▷ #datasets (4 messages):

Links mentioned:

cognitivecomputations/dolphin · Datasets at Hugging Face: no description found


OpenAccess AI Collective (axolotl) ▷ #rlhf (13 messages🔥):


OpenAccess AI Collective (axolotl) ▷ #replicate-help (1 messages):

Links mentioned:

Eleuther ▷ #general (56 messages🔥🔥):

Links mentioned:

Self-Rewarding Language Models: We posit that to achieve superhuman agents, future models require superhuman feedback in order to provide an adequate training signal. Current approaches commonly train reward models from human prefer...


Eleuther ▷ #research (65 messages🔥🔥):

Links mentioned:


Eleuther ▷ #scaling-laws (3 messages):


Eleuther ▷ #interpretability-general (2 messages):


Eleuther ▷ #gpt-neox-dev (3 messages):

Links mentioned:

LAION ▷ #general (124 messages🔥🔥):

Links mentioned:


LAION ▷ #research (2 messages):

Links mentioned:

HuggingFace ▷ #announcements (1 messages):

Links mentioned:


HuggingFace ▷ #general (49 messages🔥):

Links mentioned:


HuggingFace ▷ #today-im-learning (6 messages):


HuggingFace ▷ #cool-finds (9 messages🔥):

Links mentioned:


HuggingFace ▷ #i-made-this (14 messages🔥):

Links mentioned:


HuggingFace ▷ #diffusion-discussions (5 messages):


HuggingFace ▷ #NLP (4 messages):

Links mentioned:

Efficient Training on Multiple GPUs: no description found


HuggingFace ▷ #diffusion-discussions (5 messages):

Perplexity AI ▷ #general (57 messages🔥🔥):

Links mentioned:


Perplexity AI ▷ #sharing (12 messages🔥):


Perplexity AI ▷ #pplx-api (3 messages):

LlamaIndex ▷ #blog (2 messages):

Links mentioned:

MemGPT: no description found


LlamaIndex ▷ #general (65 messages🔥🔥):

Links mentioned:


LlamaIndex ▷ #ai-discussion (1 messages):

Links mentioned:

Empowering Your Chatbot: Unveiling Dynamic Knowledge Sources with Advanced Integration: Explore the next frontier in chatbot development, showing how dynamic knowledge sources are harnessed.

OpenAI ▷ #ai-discussions (7 messages):


OpenAI ▷ #gpt-4-discussions (28 messages🔥):


OpenAI ▷ #prompt-engineering (10 messages🔥):


OpenAI ▷ #api-discussions (10 messages🔥):

DiscoResearch ▷ #mixtral_implementation (10 messages🔥):

Links mentioned:


DiscoResearch ▷ #general (11 messages🔥):

Links mentioned:

In-Context Pretraining: Language Modeling Beyond Document Boundaries: Large language models (LMs) are currently trained to predict tokens given document prefixes, enabling them to directly perform long-form generation and prompting-style tasks which can be reduced to do...


DiscoResearch ▷ #embedding_dev (4 messages):

Links mentioned:


DiscoResearch ▷ #discolm_german (19 messages🔥):

Links mentioned:

Latent Space ▷ #ai-general-chat (28 messages🔥):

Links mentioned:


Latent Space ▷ #llm-paper-club (5 messages):

Links mentioned:

GitHub - lucidrains/self-rewarding-lm-pytorch: Implementation of the training framework proposed in Self-Rewarding Language Model, from MetaAI


LangChain AI ▷ #general (16 messages🔥):

Links mentioned:


LangChain AI ▷ #langserve (2 messages):


LangChain AI ▷ #share-your-work (1 messages):

Links mentioned:

Ollama models - Image Summarization: Ollama models - Image Summarization. GitHub Gist: instantly share code, notes, and snippets.


LangChain AI ▷ #tutorials (2 messages):

Links mentioned:

LLM Perf Enthusiasts AI ▷ #announcements (1 messages):


LLM Perf Enthusiasts AI ▷ #offtopic (2 messages):

Links mentioned:

GitHub - AlibabaResearch/AdvancedLiterateMachinery: A collection of original, innovative ideas and algorithms towards Advanced Literate Machinery. This project is maintained by the OCR Team in the Language Technology Lab, Alibaba DAMO Academy.


LLM Perf Enthusiasts AI ▷ #feedback-meta (6 messages):

Skunkworks AI ▷ #general (1 messages):

far_el: good lad


Skunkworks AI ▷ #off-topic (1 messages):

pradeep1148: https://www.youtube.com/watch?v=n3gkZ_IRwCI

YAIG (a16z Infra) ▷ #ai-ml (2 messages):

Alignment Lab AI ▷ #open-orca-community-chat (1 messages):

Links mentioned:

Open-Orca/SlimOrca · Datasets at Hugging Face: no description found


Datasette - LLM (@SimonW) ▷ #llm (1 messages):

Links mentioned:

Release 0.3 · simonw/llm-gpt4all: Now provides access to model options such as -o max_tokens 3. Thanks, Mauve Signweaver. #3 Models now work without an internet connection. Thanks, Cameron Yick. #10 Documentation now includes the l...