Frozen AI News archive

AI2 releases OLMo - the 4th open-everything LLM

**AI2** is gaining attention in 2024 with its new **OLMo** models, released in 1B and 7B sizes with a 65B model forthcoming, emphasizing open and reproducible research akin to **Pythia**. The **Miqu-70B** model, especially the Mistral Medium variant, is praised for self-correction and speed optimizations. Discussions in **TheBloke** Discord covered programming language preferences, VRAM constraints for large models, and fine-tuning experiments with **Distilbert-base-uncased**. The **Mistral** Discord highlighted the **GPU shortage** and the semiconductor supply chain behind it (**TSMC**, **ASML**, and **Zeiss**), debates on open-source versus proprietary models, and fine-tuning techniques including **LoRA** for low-resource languages. Community insights also touched on embedding chunking strategies and JSON output improvements.

As teased on Nathan Lambert's Latent Space appearance, we're about to see AI2 come up a lot more this year under new leadership. The first results of that are coming through now with OLMo (Open Language MOdels): a 1B model and a set of 7B models, with a 65B on the way.

Nathan's Substack has the less corpo take if you enjoy that tone (we do) and it is also fun to note that the releasing-models-thru-magnet-link meta still has not yet run out of juice.

In the LS Discord we had the honor of discussing the release with Nathan in more detail, including the odd choice to release a "Twin" AMD model, the exclusion of Mistral 7B from benchmarks, and more.

We happened to cover Pythia (one of the top 10 papers of 2023) in this week's Paper Club, and Nathan agreed that OLMo might be regarded as a spiritual successor to Pythia in its commitment to reproducible and fully open research.

Hopefully the start of more in 2024.


PART 1: High level Discord summaries

TheBloke Discord Summary


Mistral Discord Summary


Nous Research AI Discord Summary


HuggingFace Discord Summary


LAION Discord Summary


LM Studio Discord Summary


Eleuther Discord Summary


Latent Space Discord Summary


OpenAI Discord Summary


OpenAccess AI Collective (axolotl) Discord Summary


Perplexity AI Discord Summary


LlamaIndex Discord Summary


LangChain AI Discord Summary


DiscoResearch Discord Summary


LLM Perf Enthusiasts AI Discord Summary


CUDA MODE (Mark Saroufim) Discord Summary


Datasette - LLM (@SimonW) Discord Summary


PART 2: Detailed by-Channel summaries and links

TheBloke ▷ #general (1279 messages🔥🔥🔥):


TheBloke ▷ #characters-roleplay-stories (665 messages🔥🔥🔥):


TheBloke ▷ #training-and-fine-tuning (26 messages🔥):


TheBloke ▷ #model-merging (2 messages):


TheBloke ▷ #coding (12 messages🔥):


Mistral ▷ #general (178 messages🔥🔥):


Mistral ▷ #models (4 messages):


Mistral ▷ #ref-implem (4 messages):


Mistral ▷ #finetuning (44 messages🔥):


Mistral ▷ #showcase (16 messages🔥):


Mistral ▷ #la-plateforme (26 messages🔥):


Mistral ▷ #office-hour (240 messages🔥🔥):


Nous Research AI ▷ #ctx-length-research (3 messages):


Nous Research AI ▷ #off-topic (12 messages🔥):


Nous Research AI ▷ #interesting-links (14 messages🔥):


Nous Research AI ▷ #general (374 messages🔥🔥):


Nous Research AI ▷ #ask-about-llms (38 messages🔥):


Nous Research AI ▷ #project-obsidian (6 messages):


HuggingFace ▷ #announcements (3 messages):


HuggingFace ▷ #general (241 messages🔥🔥):


HuggingFace ▷ #today-im-learning (12 messages🔥):

Links mentioned:

Introduction to using pretrained LLMs (Hafedh Hichri): Released last year, was SOTA on different tasks, such as image classification, image segmentation


HuggingFace ▷ #cool-finds (2 messages):


HuggingFace ▷ #i-made-this (9 messages🔥):


HuggingFace ▷ #reading-group (28 messages🔥):

Links mentioned:

Eric's Presentation - When2meet


HuggingFace ▷ #core-announcements (1 message):

Links mentioned:

diffusers/examples/advanced_diffusion_training/train_dreambooth_lora_sd15_advanced.py at main · huggingface/diffusers: 🤗 Diffusers: State-of-the-art diffusion models for image and audio generation in PyTorch - huggingface/diffusers


HuggingFace ▷ #diffusion-discussions (1 message):


HuggingFace ▷ #computer-vision (18 messages🔥):

Links mentioned:

Akshay's Personal Website: I am a Machine Learning Enthusiast. Check out my Projects and Blogs


HuggingFace ▷ #NLP (5 messages):


HuggingFace ▷ #gradio-announcements (1 message):

Links mentioned:

gradio_modal V0.0.1 - a Hugging Face Space by aliabid94


LAION ▷ #general (298 messages🔥🔥):


LAION ▷ #research (1 message):

felfri_: https://allenai.org/olmo


LM Studio ▷ #💬-general (137 messages🔥🔥):


LM Studio ▷ #🤖-models-discussion-chat (32 messages🔥):


LM Studio ▷ #🧠-feedback (12 messages🔥):


LM Studio ▷ #🎛-hardware-discussion (44 messages🔥):


LM Studio ▷ #🧪-beta-releases-chat (1 message):


LM Studio ▷ #autogen (3 messages):


LM Studio ▷ #open-interpreter (2 messages):


Eleuther ▷ #general (33 messages🔥):


Eleuther ▷ #research (159 messages🔥🔥):


Eleuther ▷ #interpretability-general (11 messages🔥):


Eleuther ▷ #lm-thunderdome (1 message):

daniellepintz: Nope, no limit


Latent Space ▷ #ai-general-chat (151 messages🔥🔥):


Latent Space ▷ #ai-announcements (3 messages):


OpenAI ▷ #ai-discussions (50 messages🔥):


OpenAI ▷ #gpt-4-discussions (92 messages🔥🔥):


OpenAccess AI Collective (axolotl) ▷ #general (79 messages🔥🔥):


OpenAccess AI Collective (axolotl) ▷ #axolotl-dev (1 message):

caseus_: A fix for this was merged upstream in transformers


OpenAccess AI Collective (axolotl) ▷ #general-help (10 messages🔥):


OpenAccess AI Collective (axolotl) ▷ #runpod-help (17 messages🔥):


OpenAccess AI Collective (axolotl) ▷ #shearedmistral (1 message):

dangfutures: did you guys figure out the configs for mistral


Perplexity AI ▷ #general (66 messages🔥🔥):


Perplexity AI ▷ #sharing (4 messages):

Links mentioned:

The Right Direction for AI: In this blog and in my book, Pandemic of Delusion, I have focused a lot on AI and particularly on its tremendous potential to shape our thinking for better or for worse. While AI represents a frigh…


Perplexity AI ▷ #pplx-api (8 messages🔥):


LlamaIndex ▷ #blog (3 messages):


LlamaIndex ▷ #general (40 messages🔥):


LlamaIndex ▷ #ai-discussion (3 messages):

Links mentioned:

Whisper: How to Create Robust ASR (2 / N): Part 2 of a multi-part series in which we delve deep into Whisper, OpenAI's state-of-the-art automatic speech recognition model


LangChain AI ▷ #general (32 messages🔥):


LangChain AI ▷ #langserve (2 messages):

Links mentioned:

langserve/examples/local_llm/server.py at main · langchain-ai/langserve: LangServe 🦜️🏓. Contribute to langchain-ai/langserve development by creating an account on GitHub.


LangChain AI ▷ #share-your-work (9 messages🔥):


LangChain AI ▷ #tutorials (2 messages):

Links mentioned:

How to Compress LLM Contexts with LangChain: In this tutorial, you will learn to reduce token usage by up to 90% using LangChain.


DiscoResearch ▷ #disco_judge (1 message):


DiscoResearch ▷ #general (15 messages🔥):


DiscoResearch ▷ #discolm_german (2 messages):

Links mentioned:

Google Colaboratory


LLM Perf Enthusiasts AI ▷ #embeddings (3 messages):

Links mentioned:

nomic-ai/nomic-embed-text-v1 · Hugging Face


LLM Perf Enthusiasts AI ▷ #reliability (1 message):

firefox8975: Has anyone tried or is there a guide on how to deploy oss models to aws using vllm?


LLM Perf Enthusiasts AI ▷ #irl (2 messages):


LLM Perf Enthusiasts AI ▷ #openai (2 messages):


LLM Perf Enthusiasts AI ▷ #prompting (5 messages):


CUDA MODE (Mark Saroufim) ▷ #cuda (7 messages):


CUDA MODE (Mark Saroufim) ▷ #beginner (3 messages):


CUDA MODE (Mark Saroufim) ▷ #pmpp-book (1 message):


Datasette - LLM (@SimonW) ▷ #ai (1 message):

dbreunig: https://www.dbreunig.com/2024/02/01/pursuing-quiet-ai.html


Datasette - LLM (@SimonW) ▷ #llm (2 messages):

Links mentioned:

Datasette documentation