Frozen AI News archive

12/30/2023: Mega List of all LLMs

**Stella Biderman**'s tracking list of **LLMs** is highlighted, with resources shared for browsing. The **Nous Research AI** Discord discussed the **Local Attention Flax** module focusing on computational complexity, debating linear vs quadratic complexity and proposing chunking as a solution. Benchmark logs for various LLMs including **Deita v1.0** with its **SFT+DPO** training method were shared. Discussions covered model merging, graded modal types, function calling in AI models, and data contamination issues in **Mixtral**. Community insights were sought on **Amazon Titan Text Express** and **Amazon Titan Text Lite** LLMs, including a unique training strategy involving bad datasets. Several GitHub repositories and projects like **DRUGS**, **MathPile**, **CL-FoMo**, and **SplaTAM** were referenced for performance and data quality evaluations.
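
To illustrate the chunking idea raised in the Local Attention discussion, here is a minimal sketch in NumPy (not Flax, and not the module discussed in the Discord). It assumes non-overlapping chunks, where each position attends only to positions in its own chunk. The function name `chunked_local_attention` and the `chunk_size` parameter are hypothetical. With a fixed chunk size `w`, the score matrices total `n * w` entries rather than `n * n`, which is where the linear versus quadratic distinction comes from.

```python
import numpy as np

def chunked_local_attention(q, k, v, chunk_size):
    """Block-local attention sketch: each position attends only within its chunk.

    q, k: arrays of shape (n, d); v: array of shape (n, dv).
    Assumes n is divisible by chunk_size (an illustrative simplification).
    """
    n, d = q.shape
    assert n % chunk_size == 0, "sequence length must be a multiple of chunk_size"
    c = n // chunk_size

    # Split the sequence into c independent chunks of length chunk_size.
    qc = q.reshape(c, chunk_size, d)
    kc = k.reshape(c, chunk_size, d)
    vc = v.reshape(c, chunk_size, v.shape[-1])

    # Per-chunk scores: shape (c, w, w), i.e. n * w entries instead of n * n.
    scores = qc @ kc.transpose(0, 2, 1) / np.sqrt(d)

    # Numerically stable softmax over the keys of each chunk.
    scores = scores - scores.max(axis=-1, keepdims=True)
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)

    # Weighted sum of values, then restore the original sequence layout.
    return (weights @ vc).reshape(n, v.shape[-1])
```

The result matches full attention under a block-diagonal mask, so the saving comes purely from never materializing the masked-out scores. Overlapping or sliding windows, which real local-attention modules often use, are left out of this sketch.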

(gist form: https://gist.github.com/veekaybee/f8e589fea42ba7131e4ca0a0f280c0a4?utm_source=ainews&utm_medium=email)

Also, notable image AI activity in Hugging Face land:

https://www.youtube.com/watch?v=ApcJ1UyLQB8&feature=youtu.be

Nous Research AI Discord Summary

Nous Research AI Channel Summaries

▷ #ctx-length-research (7 messages):

▷ #off-topic (26 messages🔥):

▷ #benchmarks-log (2 messages):

Links mentioned:

LLM-Benchmark-Logs/benchmark-logs/deita-v1.0-Mistral-7B.md at main · teknium1/LLM-Benchmark-Logs: Just a bunch of benchmark logs for different LLMs....

▷ #interesting-links (39 messages🔥):

▷ #general (231 messages🔥🔥):

▷ #ask-about-llms (22 messages🔥):

Links mentioned:

Amazon Titan Text models—Express and Lite—now generally available in Amazon Bedrock


LAION Discord Summary

Only 1 channel had activity, so no need to summarize...


OpenAI Discord Summary

OpenAI Channel Summaries

▷ #ai-discussions (14 messages🔥):

▷ #openai-chatter (120 messages🔥🔥):

Links mentioned:

GitHub - guardrails-ai/guardrails: Adding guardrails to large language models.: Adding guardrails to large language models. Contri...

▷ #openai-questions (76 messages🔥🔥):

▷ #gpt-4-discussions (15 messages🔥):

▷ #prompt-engineering (4 messages):

Links mentioned:

Usage policies

▷ #api-discussions (4 messages):

Links mentioned:

Usage policies


OpenAccess AI Collective (axolotl) Discord Summary

OpenAccess AI Collective (axolotl) Channel Summaries

▷ #general (87 messages🔥🔥):

Note: The conversations are ongoing and the discussion topics could be better summarized with more context from future messages.

▷ #axolotl-dev (23 messages🔥):

▷ #general-help (13 messages🔥):

▷ #datasets (9 messages🔥):

▷ #rlhf (24 messages🔥):

▷ #shearedmistral (28 messages🔥):


Eleuther Discord Summary

Eleuther Channel Summaries

▷ #general (35 messages🔥):

▷ #research (48 messages🔥):

▷ #gpt-neox-dev (2 messages):


HuggingFace Discord Summary

HuggingFace Discord Channel Summaries

▷ #general (46 messages🔥):

▷ #today-im-learning (10 messages🔥):

▷ #cool-finds (6 messages):

Links mentioned:

Generate AI Images with Text - Text Diffuser 2, DiffMorpher & SDXL Auto FaceSwap!: A brief video about some of the trending huggingfa...

▷ #i-made-this (5 messages):

▷ #reading-group (1 message):

▷ #diffusion-discussions (4 messages):

Links mentioned:

Introducing Würstchen: Fast Diffusion for Image Generation

▷ #NLP (4 messages):


Mistral Discord Summary

Selected Quotes and Direct Mentions

.tanuj.: "If you can get good reasoning from a small model, you can get pretty powerful agents made in real time by a user, and be as powerful as you'd like them to be! It can be a solution like one prompt -> building a full web app and deploying it, no user input needed in between."

@theledgerluminary: "But applying a similar architectural pattern to a large model could achieve better results. Really the only thing I see smaller models being beneficial for are real-time communication. If the overall goal is a large “long-running” task, it seems like a waste of time to only use a small model."

@poltronsuperstar, posing a question to the channel: "What's your first question to an AGI?"

Links

Google Colaboratory

How to fine tune Mixtral 8x7B Mistral Ai Mixture of Experts (MoE) AI model

Mistral Channel Summaries

▷ #general (35 messages🔥):

▷ #deployment (3 messages):

▷ #showcase (12 messages🔥):

Links mentioned:

Google Colaboratory

▷ #random (4 messages):

▷ #la-plateforme (8 messages🔥):

Links mentioned:

How to fine tune Mixtral 8x7B Mistral Ai Mixture of Experts (MoE) AI model: When it comes to enhancing the capabilities of the...


DiscoResearch Discord Summary

DiscoResearch Channel Summaries

▷ #disco_judge (1 message):

Links mentioned:

What Makes Good Data for Alignment? A Comprehensive Study of Automatic Data Selection in Instruction Tuning: Instruction tuning is a standard technique employe...

▷ #mixtral_implementation (8 messages🔥):

▷ #general (18 messages🔥):


LangChain AI Discord Summary

Only 1 channel had activity, so no need to summarize...

Links mentioned:

langchain_examples/examples/how_to_llm_chain_pass_multiple_inputs_to_prompt.py at main · rajib76/langchain_examples: This repo consists of examples to use langchain. C...


Alignment Lab AI Discord Summary

Only 1 channel had activity, so no need to summarize...


LLM Perf Enthusiasts AI Discord Summary

LLM Perf Enthusiasts AI Channel Summaries

▷ #general (4 messages):

▷ #offtopic (2 messages):

▷ #prompting (3 messages):

Links mentioned:

GitHub - Ayenem/TokenHealer: Contribute to Ayenem/TokenHealer development by cr...


Latent Space Discord Summary

Latent Space Channel Summaries

▷ #ai-general-chat (4 messages):

▷ #ai-event-announcements (1 message):

Links mentioned:

Tweet from Latent Space Podcast (@latentspacepod): 🆕 NeurIPS 2023 Recap — Top Startups! https://www...

▷ #llm-paper-club (1 message):

Links mentioned:

Ten Noteworthy AI Research Papers of 2023: This year has felt distinctly different. I've...


Skunkworks AI Discord Summary

Only 1 channel had activity, so no need to summarize...

caviterginsoy: https://arxiv.org/abs/2305.11243


MLOps @Chipro Discord Summary

Only 1 channel had activity, so no need to summarize...