Frozen AI News archive

Qwen 1.5 Released

**Chinese AI models Yi, Deepseek, and Qwen** are gaining attention for strong performance, with **Qwen 1.5** offering up to **32k token context** and compatibility with Hugging Face transformers and quantized models. The **TheBloke Discord** discussed topics like quantization of a **70B LLM**, the introduction of the **Sparse MoE model Sparsetral** based on **Mistral**, debates on merging vs fine-tuning, and Direct Preference Optimization (DPO) for character generation. The **Nous Research AI Discord** covered challenges in Japanese Kanji generation, AI scams on social media, and Meta's VR headset prototypes showcased at **SIGGRAPH 2023**. Discussions also included fine-tuning frozen networks and new models like **bagel-7b-v0.4**, **DeepSeek-Math-7b-instruct**, and **Sparsetral-16x7B-v2**.

Canonical issue URL

The Chinese models (Yi, Deepseek, and Qwen, to a lesser extent Zhipu) have been quietly cooking up a storm. Qwen's release this week claims strong performance vs Mistral and Llama2 equivalents:

image.png

with up to 32k token context. The technical report also discusses a number of evals made on multilingual, RAG, agent planning, and code generation capabilities. The Qwen team are also showing serious dedication to the downstream ecosystem, releasing with HF transformers compatibility and official AWQ/GPTQ 4/8bit quantized models.


Table of Contents

[TOC]

PART 1: High level Discord summaries

TheBloke Discord Summary


Nous Research AI Discord Summary


Eleuther Discord Summary


HuggingFace Discord Summary


LM Studio Discord Summary


Mistral Discord Summary


CUDA MODE Discord Summary


OpenAI Discord Summary


LangChain AI Discord Summary


Latent Space Discord Summary


LAION Discord Summary

Call for Collaboration in Foundational Models: A discussion initiated by @pratikk10 invites interested parties to contribute to the creation of foundational models across different media, including text, image, and video, seeking exchanges with serious creators.

Bias Watch in Reinforcement Learning: RLHF's introduction of significant biases is debated, with @pseudoterminalx and @astropulse noting the potentially counterproductive effect on base model development, while also observing a distinctive style in Midjourney's images potentially rooted in such biases.

Tackling Textual Bias in Pixart: Conversations reveal challenges in unlearning textual biases from datasets, specifically version 5.1 of pixart. Critique is directed at the use of the JourneyDB dataset, with suggestions to find more robust alternatives for unbiased text modalities.

Innovative Reading of Ancient Texts: The Vesuvius Challenge 2023 Grand Prize announcement highlighted a successful method for reading 2000-year-old scrolls without unrolling them, using a TimeSformer model and a particle accelerator, although at a high cost of $40,000 per scroll.

Chinese Machine Learning Thrives Despite Restrictions: Discussions ponder the success of Chinese ML entities in light of GPU restrictions, noting their preemptive procurement of NVIDIA's H100s and A100s before restrictions came into play, questioning the overall impact on technological progress.

Critique of Hugging Face's OWLSAM: @SegmentationFault shared and commented on the performance of OWLSAM in a Hugging Face Space, indicating that the model lacked coverage in visual representation and accuracy in object detection.


LlamaIndex Discord Summary


OpenAccess AI Collective (axolotl) Discord Summary


Perplexity AI Discord Summary


DiscoResearch Discord Summary


Alignment Lab AI Discord Summary


Datasette - LLM (@SimonW) Discord Summary


LLM Perf Enthusiasts AI Discord Summary


PART 2: Detailed by-Channel summaries and links

TheBloke ▷ #general (1293 messages🔥🔥🔥):

Links mentioned:


TheBloke ▷ #characters-roleplay-stories (457 messages🔥🔥🔥):

Links mentioned:


TheBloke ▷ #training-and-fine-tuning (2 messages):


TheBloke ▷ #model-merging (3 messages):

Links mentioned:

GitHub - 54rt1n/ComfyUI-DareMerge: ComfyUI powertools for SD1.5 and SDXL model merging: ComfyUI powertools for SD1.5 and SDXL model merging - GitHub - 54rt1n/ComfyUI-DareMerge: ComfyUI powertools for SD1.5 and SDXL model merging


TheBloke ▷ #coding (4 messages):

Links mentioned:


Nous Research AI ▷ #off-topic (28 messages🔥):

Links mentioned:


Nous Research AI ▷ #interesting-links (14 messages🔥):

Links mentioned:


Nous Research AI ▷ #general (514 messages🔥🔥🔥):

Links mentioned:


Nous Research AI ▷ #ask-about-llms (36 messages🔥):


Eleuther ▷ #general (363 messages🔥🔥):

Links mentioned:


Eleuther ▷ #research (61 messages🔥🔥):

Links mentioned:


Eleuther ▷ #scaling-laws (2 messages):


Eleuther ▷ #interpretability-general (10 messages🔥):

Links mentioned:

GitHub - fblgit/model-similarity: Simple Model Similarities Analysis: Simple Model Similarities Analysis. Contribute to fblgit/model-similarity development by creating an account on GitHub.


Eleuther ▷ #lm-thunderdome (16 messages🔥):

Links mentioned:


Eleuther ▷ #gpt-neox-dev (3 messages):

Links mentioned:

Rephrasing the Web: A Recipe for Compute and Data-Efficient Language Modeling: Large language models are trained on massive scrapes of the web, which are often unstructured, noisy, and poorly phrased. Current scaling laws show that learning from such data requires an abundance o...


HuggingFace ▷ #general (321 messages🔥🔥):

Links mentioned:


HuggingFace ▷ #today-im-learning (1 messages):


HuggingFace ▷ #cool-finds (5 messages):

Links mentioned:


HuggingFace ▷ #i-made-this (3 messages):

Links mentioned:

Andyrasika/mistral-ft-optimized-dpo · Hugging Face: no description found


HuggingFace ▷ #reading-group (3 messages):

Links mentioned:

no title found: no description found


HuggingFace ▷ #computer-vision (1 messages):


HuggingFace ▷ #NLP (11 messages🔥):

Links mentioned:

bootcupboard/flair/SentimentalNERD.py at main · CodeAKrome/bootcupboard: It's bigger on the inside than the outside! Contribute to CodeAKrome/bootcupboard development by creating an account on GitHub.


LM Studio ▷ #💬-general (103 messages🔥🔥):

Links mentioned:


LM Studio ▷ #🤖-models-discussion-chat (3 messages):

Links mentioned:

GitHub - deepseek-ai/DeepSeek-Math: Contribute to deepseek-ai/DeepSeek-Math development by creating an account on GitHub.


LM Studio ▷ #announcements (1 messages):

Links mentioned:


LM Studio ▷ #🧠-feedback (9 messages🔥):

Links mentioned:


LM Studio ▷ #🎛-hardware-discussion (40 messages🔥):

Links mentioned:


LM Studio ▷ #🧪-beta-releases-chat (14 messages🔥):

Links mentioned:


LM Studio ▷ #autogen (1 messages):

lowkey9920: Try autogen studio . It's two commands to get started with a ui


LM Studio ▷ #langchain (1 messages):


Mistral ▷ #general (118 messages🔥🔥):

Links mentioned:


Mistral ▷ #deployment (6 messages):


Mistral ▷ #finetuning (1 messages):


Mistral ▷ #showcase (1 messages):

Links mentioned:

GitHub - jakobdylanc/discord-llm-chatbot: Supports OpenAI, Mistral, ollama, oobagooba and more • Multi-user chat • Vision support • Streamed responses • 200 lines of code 🔥: Supports OpenAI, Mistral, ollama, oobagooba and more • Multi-user chat • Vision support • Streamed responses • 200 lines of code 🔥 - GitHub - jakobdylanc/discord-llm-chatbot: Supports OpenAI, Mistr.....


Mistral ▷ #random (5 messages):

Links mentioned:

Sanic The Hedgehob GIF - Sanic The Hedgehob Running - Discover & Share GIFs: Click to view the GIF


Mistral ▷ #la-plateforme (2 messages):


CUDA MODE ▷ #general (29 messages🔥):

Links mentioned:


CUDA MODE ▷ #cuda (27 messages🔥):

Links mentioned:

Dissecting Tensor Cores via Microbenchmarks: Latency, Throughput and Numeric Behaviors: Tensor Cores have been an important unit to accelerate Fused Matrix Multiplication Accumulation (MMA) in all NVIDIA GPUs since Volta Architecture. To program Tensor Cores, users have to use either leg...


CUDA MODE ▷ #torch (18 messages🔥):

Links mentioned:


CUDA MODE ▷ #beginner (3 messages):


CUDA MODE ▷ #pmpp-book (2 messages):


CUDA MODE ▷ #youtube-recordings (4 messages):


CUDA MODE ▷ #jax (5 messages):

Links mentioned:

Pallas Quickstart — JAX documentation: no description found


OpenAI ▷ #ai-discussions (31 messages🔥):


OpenAI ▷ #gpt-4-discussions (50 messages🔥):

Links mentioned:

Brand guidelines: Language and assets for using the OpenAI brand in your marketing and communications.


OpenAI ▷ #prompt-engineering (2 messages):


OpenAI ▷ #api-discussions (2 messages):


LangChain AI ▷ #general (28 messages🔥):

Links mentioned:


LangChain AI ▷ #langserve (33 messages🔥):

Links mentioned:


LangChain AI ▷ #share-your-work (6 messages):

Links mentioned:


Latent Space ▷ #ai-general-chat (66 messages🔥🔥):

Links mentioned:


LAION ▷ #general (61 messages🔥🔥):

Links mentioned:


LAION ▷ #research (5 messages):

Links mentioned:

OWLSAM - a Hugging Face Space by merve: no description found


LlamaIndex ▷ #announcements (1 messages):


LlamaIndex ▷ #blog (6 messages):

Links mentioned:

Notion – The all-in-one workspace for your notes, tasks, wikis, and databases.: A new tool that blends your everyday work apps into one. It's the all-in-one workspace for you and your team


LlamaIndex ▷ #general (33 messages🔥):

Links mentioned:


LlamaIndex ▷ #ai-discussion (25 messages🔥):

Links mentioned:


OpenAccess AI Collective (axolotl) ▷ #general (41 messages🔥):

Links mentioned:

Introducing Qwen1.5: GITHUB HUGGING FACE MODELSCOPE DEMO DISCORD Introduction In recent months, our focus has been on developing a “good” model while optimizing the developer experience. As we progress towards...


OpenAccess AI Collective (axolotl) ▷ #axolotl-dev (14 messages🔥):

Links mentioned:


OpenAccess AI Collective (axolotl) ▷ #rlhf (1 messages):

dangfutures: does anyone know how to the configs for zephyer <@257999024458563585>


Perplexity AI ▷ #general (25 messages🔥):

Links mentioned:


Perplexity AI ▷ #sharing (7 messages):

Links mentioned:


Perplexity AI ▷ #pplx-api (4 messages):

Links mentioned:


DiscoResearch ▷ #general (27 messages🔥):

Links mentioned:


DiscoResearch ▷ #embedding_dev (1 messages):

Links mentioned:

jinaai/jina-embeddings-v2-base-code · Hugging Face: no description found


Alignment Lab AI ▷ #general-chat (14 messages🔥):

Links mentioned:

axolotl/examples/llama-2/fft_optimized.yml at main · OpenAccess-AI-Collective/axolotl: Go ahead and axolotl questions. Contribute to OpenAccess-AI-Collective/axolotl development by creating an account on GitHub.


Datasette - LLM (@SimonW) ▷ #ai (1 messages):

Links mentioned:

Audacity now has free AI-powered sound tools from Intel - CDM Create Digital Music: The free audio editor now gets a suite of free AI tools from Intel, some competing with expensive paid subscription services. That covers useful stuff like noise suppression and transcriptions and mus...


Datasette - LLM (@SimonW) ▷ #llm (2 messages):

Links mentioned:

chatdb/natural-sql-7b · Hugging Face: no description found


LLM Perf Enthusiasts AI ▷ #opensource (2 messages):

Links mentioned:

Introducing Qwen1.5: GITHUB HUGGING FACE MODELSCOPE DEMO DISCORD Introduction In recent months, our focus has been on developing a “good” model while optimizing the developer experience. As we progress towards...