Frozen AI News archive

Mistral Large disappoints

**Mistral** announced **Mistral Large**, a new language model scoring **81.2% on MMLU** and trailing **GPT-4 Turbo** by about 5 percentage points on aggregated benchmarks. Community reception has been mixed, with skepticism about whether the weights will be open-sourced, alongside Mistral's claim that **Mistral Small** outperforms the openly released **Mixtral 8x7B**. Discussions in the **TheBloke** Discord covered performance and cost-efficiency comparisons between **Mistral Large** and **GPT-4 Turbo**, technical challenges with **DeepSpeed** and **DPOTrainer** for training, advances in AI deception for roleplay characters using **DreamGen Opus V1**, and the complexities of model merging via linear interpolation and PEFT methods. There was also enthusiasm for AI-assisted decompilation, with open-source projects proposed as a source of training data.
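The linear-interpolation merging mentioned above reduces to a weighted average of corresponding parameters across two checkpoints. A minimal sketch with plain Python lists standing in for tensors (function and parameter names here are illustrative, not from any merging library):

```python
def lerp_merge(weights_a, weights_b, t=0.5):
    """Linearly interpolate two checkpoints' parameters.

    weights_a / weights_b: dicts mapping parameter names to flat
    lists of floats (toy stand-ins for real tensors); t is the
    mix ratio, 0.0 = all of model A, 1.0 = all of model B.
    """
    assert weights_a.keys() == weights_b.keys(), "checkpoints must share parameters"
    return {
        name: [(1.0 - t) * a + t * b
               for a, b in zip(weights_a[name], weights_b[name])]
        for name in weights_a
    }

# Toy example: two "models" with a single 3-element parameter.
model_a = {"layer.weight": [0.0, 2.0, 4.0]}
model_b = {"layer.weight": [4.0, 2.0, 0.0]}
merged = lerp_merge(model_a, model_b, t=0.5)
```

Real merges operate on full state dicts and often interpolate only compatible fine-tunes of the same base model; PEFT-based merging instead composes adapter deltas rather than full weights.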


Mistral came out swinging today, announcing Mistral Large on La Plateforme and on Azure, trailing GPT-4 by about 5 percentage points on their aggregated benchmarks.

The community reception has been mildly negative.


And hopes are not high for open sourcing. Notably, Mistral is also claiming that the new Mistral Small is "significantly better" than the openly released Mixtral 8x7B.


Table of Contents

[TOC]

PART 0: Summary of Summaries of Summaries

PART 1: High level Discord summaries

TheBloke Discord Summary


Mistral Discord Summary

Mistral Large Takes the Stage: The introduction of Mistral Large, a highly optimized language model scoring 81.2% on MMLU and offering multilingual capabilities and native function calling, stirred interest and discussion across the community. It is available via platforms such as La Plateforme.

Technical Hurdles & Triumphs in LLM Deployment: Members shared experiences and exchanged technical advice on deploying Mistral models, such as Mixtral 8x7B and Mistral-7B-Instruct, on hardware ranging from Tesla V100s to local machines with limited VRAM. Tips on adjusting layer sharing and precision levels and on dealing with freezing issues were exchanged, highlighting the technical nuances of running high-performance models.
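The VRAM trade-off behind those layer-sharing tips comes down to simple arithmetic: each layer's weights cost roughly parameters × bytes-per-parameter, and only as many layers as fit in the memory budget can stay on the GPU. A back-of-the-envelope sketch (all figures are hypothetical round numbers, not measured values; real deployments should measure actual memory use):

```python
def layers_that_fit(vram_gib, n_layers, params_per_layer, bytes_per_param,
                    overhead_gib=1.0):
    """Rough estimate of how many transformer layers fit in VRAM.

    Reserves `overhead_gib` for the KV cache, activations, and runtime;
    any layers that don't fit would be offloaded to CPU RAM.
    """
    budget_bytes = (vram_gib - overhead_gib) * 1024**3
    bytes_per_layer = params_per_layer * bytes_per_param
    return max(0, min(n_layers, int(budget_bytes // bytes_per_layer)))

# E.g. a 7B-class model: 32 layers of ~218M params each at 4-bit (~0.5 B/param).
fit_8gib = layers_that_fit(vram_gib=8, n_layers=32,
                           params_per_layer=218_000_000, bytes_per_param=0.5)
fit_4gib = layers_that_fit(vram_gib=4, n_layers=32,
                           params_per_layer=218_000_000, bytes_per_param=0.5)
```

With these assumed numbers, all 32 layers fit on an 8 GiB card, while a 4 GiB card holds only a partial offload, which matches the kind of tuning members described.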

Fine-Tuning Finesse: The community discussed fine-tuning practices, emphasizing the need for experimentation and adequate data quantities, with suggestions pointing to around 4000 instances for specific tasks. There was also a focus on the right data format for fine-tuning with Mistral models, and the necessity of understanding advanced fine-tuning techniques like LoRA.
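For context on the LoRA technique mentioned above: instead of updating a full weight matrix, it learns a scaled low-rank product added on top of the frozen weights, W' = W + (α/r)·BA. A toy pure-Python sketch of that update (shapes and values are illustrative; real implementations such as PEFT operate on tensors):

```python
def matmul(X, Y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)]
            for row in X]

def lora_update(W, A, B, alpha, r):
    """Apply a LoRA delta: W' = W + (alpha / r) * (B @ A).

    W is d_out x d_in (frozen base weight), B is d_out x r,
    A is r x d_in; only A and B are trained.
    """
    scale = alpha / r
    BA = matmul(B, A)
    return [[w + scale * d for w, d in zip(w_row, d_row)]
            for w_row, d_row in zip(W, BA)]

# Tiny example: 2x2 base weight with rank-1 adapters.
W = [[1.0, 0.0], [0.0, 1.0]]
B = [[1.0], [2.0]]   # d_out x r
A = [[0.5, 0.5]]     # r x d_in
W_new = lora_update(W, A, B, alpha=2, r=1)
```

The point of the low rank: for a d×d weight, the adapters train only 2·d·r parameters instead of d², which is why LoRA pairs well with the modest dataset sizes (around 4000 instances) discussed above.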

Contemplating Commercial Impacts & Open Access: Conversations around Mistral's shift towards more business-oriented, closed-weight models like Mistral Small and Large surfaced concerns about the future of open models. However, many members are hopeful for the continued support of open model development despite big tech partnerships.

Mistral API Insights and Queries: Questions about the Mistral API were numerous, ranging from data-privacy concerns (with confirmation that API data is not used for model training) to practical questions about running Mistral models on local machines without GPUs. There was also discussion of third-party offerings and potential integrations for extending Mistral's capabilities.
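For readers asking about API usage: a minimal sketch of building a chat-completions request body in the shape Mistral's public API documents. Only the payload is constructed here; actually sending it requires an HTTP POST with an `Authorization: Bearer <key>` header, and the model name is illustrative:

```python
import json

# Chat-completions endpoint per Mistral's public API docs.
API_URL = "https://api.mistral.ai/v1/chat/completions"

def build_chat_request(prompt, model="mistral-large-latest", temperature=0.7):
    """Build the JSON body for a Mistral chat-completions call."""
    return {
        "model": model,
        "temperature": temperature,
        "messages": [{"role": "user", "content": prompt}],
    }

body = build_chat_request("What is function calling?")
payload = json.dumps(body)  # ready to POST to API_URL with an API key
```

The same body shape works for the smaller hosted models by swapping the `model` field, which is what makes the cost-vs-quality comparisons discussed in the channels straightforward to run.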

User-Driven Design and Application Ideas: The community actively shared ideas for new applications and enhancements, including plugins and mobile apps that leverage Mistral. One user proposed adding a language-level setting to Mistral's Le Chat, and there is buzz around the simplicity of Mistral-Next within Le Chat, which may indicate a user preference for streamlined AI products.


LM Studio Discord Summary


Perplexity AI Discord Summary


OpenAI Discord Summary


LAION Discord Summary


HuggingFace Discord Summary


Eleuther Discord Summary


LlamaIndex Discord Summary


Latent Space Discord Summary

Sora's Consistency Questioned: Correcting a WSJ video, @swyxio pointed out that OpenAI's Sora maintains consistency across videos longer than one minute by interpolating from a start image.

NVIDIA's GEARing Up: NVIDIA announced a new research group, GEAR (Generalist Embodied Agent Research), co-founded by Dr. Jim Fan, focusing on autonomous machines and general-purpose AI.

AI-Generated Podcasts Hit the Airwaves: Perplexity has launched an AI-generated podcast, drawing content from their Discover feed and employing ElevenLabs' voices for narration.

One Line of AI Code with Cloudflare: Cloudflare's new AI Gateway has been introduced, featuring easy integration via a single line of code for AI analytics and insights.

AI Takes on Data Analysis with GPT-4-ada-v2: A new tool, ChatGPT Data Analysis V2, enhances data analysis with targeted replies and a data-grid overlay editor, possibly implementing interactive charts and leveraging gpt-4-ada-v2.

LLM Paper Club T5 Session Recap: A recent LLM Paper Club session led by @bryanblackbee dissected the T5 paper with discussions encapsulated in shared Notion notes. Open inquiries included model vocabulary, fine-tuning processes, and architecture differences for NLP tasks.

Local Model Enthusiasts Convene in AI in Action Club: The "AI in Action" event highlighted local model exploration, tooling for running models locally, and model fine-tuning with LoRAs using tools like ComfyUI. The Latent Space Final Frontiers event was also announced, inviting teams to push the boundaries of AI with an application link here.


OpenAccess AI Collective (axolotl) Discord Summary


CUDA MODE Discord Summary


LangChain AI Discord Summary


Datasette - LLM (@SimonW) Discord Summary


LLM Perf Enthusiasts AI Discord Summary


DiscoResearch Discord Summary


AI Engineer Foundation Discord Summary


Alignment Lab AI Discord Summary


Skunkworks AI Discord Summary


PART 2: Detailed by-Channel summaries and links

TheBloke ▷ #general (1013 messages🔥🔥🔥):

Links mentioned:


TheBloke ▷ #characters-roleplay-stories (275 messages🔥🔥):

Links mentioned:


TheBloke ▷ #training-and-fine-tuning (71 messages🔥🔥):

Links mentioned:


TheBloke ▷ #model-merging (37 messages🔥):

Links mentioned:


TheBloke ▷ #coding (6 messages):


Mistral ▷ #general (1198 messages🔥🔥🔥):

Links mentioned:


Mistral ▷ #models (209 messages🔥🔥):

Links mentioned:


Mistral ▷ #deployment (56 messages🔥🔥):

Links mentioned:


Mistral ▷ #ref-implem (6 messages):


Mistral ▷ #finetuning (185 messages🔥🔥):

Links mentioned:

Serverless GPUs for AI Inference and Training: no description found


Mistral ▷ #announcements (2 messages):

Links mentioned:


Mistral ▷ #showcase (24 messages🔥):

Links mentioned:

Twitch: no description found


Mistral ▷ #random (17 messages🔥):

Links mentioned:


Mistral ▷ #la-plateforme (66 messages🔥🔥):

Links mentioned:


Mistral ▷ #le-chat (69 messages🔥🔥):


LM Studio ▷ #💬-general (608 messages🔥🔥🔥):

Links mentioned:


LM Studio ▷ #🤖-models-discussion-chat (98 messages🔥🔥):

Links mentioned:


LM Studio ▷ #🧠-feedback (8 messages🔥):


LM Studio ▷ #🎛-hardware-discussion (178 messages🔥🔥):

Links mentioned:


LM Studio ▷ #🧪-beta-releases-chat (27 messages🔥):


LM Studio ▷ #autogen (9 messages🔥):

Links mentioned:

It Problem Phone Call GIF - Have You Tried Turning It Off And On Again


LM Studio ▷ #langchain (1 messages):

bigsuh.eth: Hello, can I use LM Studio and use RAG in langchain?


LM Studio ▷ #open-interpreter (7 messages):


Perplexity AI ▷ #announcements (1 messages):

Links mentioned:


Perplexity AI ▷ #general (348 messages🔥🔥):

Links mentioned:


Perplexity AI ▷ #sharing (23 messages🔥):


Perplexity AI ▷ #pplx-api (339 messages🔥🔥):

Links mentioned:


OpenAI ▷ #ai-discussions (183 messages🔥🔥):

Links mentioned:


OpenAI ▷ #gpt-4-discussions (103 messages🔥🔥):


OpenAI ▷ #prompt-engineering (209 messages🔥🔥):

Links mentioned:


OpenAI ▷ #api-discussions (209 messages🔥🔥):

Links mentioned:


LAION ▷ #general (624 messages🔥🔥🔥):

Links mentioned:


LAION ▷ #research (67 messages🔥🔥):

Links mentioned:


LAION ▷ #learning-ml (2 messages):


LAION ▷ #paper-discussion (1 messages):

said2000: https://arxiv.org/abs/2402.05892


HuggingFace ▷ #general (182 messages🔥🔥):

AI Hardware Endeavors and Speculations: Users discussed the potential of developing proprietary TPUs and the availability of particular nanometer manufacturing processes, suggesting such democratization could bring the kind of freedom seen in the car industry. The conversation drew comparisons to pricing practices in the RAM industry, with skepticism about tech promises from companies like Samsung.

Ongoing AI Debates: Community members voiced opinions on the interplay of AI and capitalism, debating whether open-source efforts could rival giants like Intel or Nvidia. Discussions reflected concerns about job losses and wealth inequality tied to technological advancement, balanced against the practicalities of building AI products to secure individual financial well-being.

Inquiries and Assistance on Model Utilization: Users sought help on a range of topics, including running specific models on certain GPUs, model-size and memory constraints, dataset management, and finding project resources. Suggestions included using llama.cpp for model parallelization and CPU offloading with accelerate for large models.

Exploring Practical Applications and Collaborations: From seeking partners for neural network projects to strategies for working efficiently with open-source models, users exchanged ideas spanning machine learning, object detection, language models, and serverless GPU services for cost-effective research and development.

Technical Support and Problem-Solving: Backend issues with Hugging Face services, such as inference-api serverless timeouts, were discussed, with users reporting fluctuating performance. Members also tackled problems with data serialization, style customization in components, and GPU-support concerns for different models.
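The CPU-offloading suggestion above is what accelerate automates with `device_map="auto"`: it decides, per module, whether the weights live on the GPU or in CPU RAM. As a toy illustration of the underlying idea only (names are illustrative; accelerate's real placement also accounts for measured module sizes and available memory):

```python
def build_device_map(n_layers, gpu_layers):
    """Naive device map: first `gpu_layers` layers on cuda:0, rest on cpu.

    A hand-rolled stand-in for the placement that accelerate's
    device_map="auto" computes from actual module sizes.
    """
    return {f"layer.{i}": ("cuda:0" if i < gpu_layers else "cpu")
            for i in range(n_layers)}

# E.g. a 4-layer toy model where only 2 layers fit on the GPU.
dm = build_device_map(n_layers=4, gpu_layers=2)
```

Layers mapped to `cpu` are swapped in on demand during the forward pass, trading throughput for the ability to run models larger than VRAM.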

Links mentioned:


HuggingFace ▷ #today-im-learning (8 messages🔥):


HuggingFace ▷ #cool-finds (33 messages🔥):

Links mentioned:


HuggingFace ▷ #i-made-this (24 messages🔥):

Links mentioned:


HuggingFace ▷ #reading-group (26 messages🔥):

Links mentioned:


HuggingFace ▷ #diffusion-discussions (15 messages🔥):

Links mentioned:


HuggingFace ▷ #computer-vision (23 messages🔥):

Links mentioned:


HuggingFace ▷ #NLP (109 messages🔥🔥):

Links mentioned:


HuggingFace ▷ #diffusion-discussions (15 messages🔥):

Links mentioned:


Eleuther ▷ #general (190 messages🔥🔥):

Links mentioned:


Eleuther ▷ #research (84 messages🔥🔥):

Links mentioned:


Eleuther ▷ #interpretability-general (18 messages🔥):


Eleuther ▷ #lm-thunderdome (30 messages🔥):

Links mentioned:

Tweet from Alham Fikri Aji (@AlhamFikri): Many LLM evaluations use a restrictive multiple-choice (MCQ) format, but in practice, these LLMs are used in a more open-ended, free-text format 🔎Our new study reveals that their probability-based M...


Eleuther ▷ #gpt-neox-dev (6 messages):


LlamaIndex ▷ #blog (5 messages):


LlamaIndex ▷ #general (234 messages🔥🔥):

Links mentioned:


LlamaIndex ▷ #ai-discussion (9 messages🔥):

Links mentioned:



Latent Space ▷ #ai-general-chat (79 messages🔥🔥):

Links mentioned:


Latent Space ▷ #ai-announcements (9 messages🔥):

Links mentioned:


Latent Space ▷ #llm-paper-club-west (16 messages🔥):

Links mentioned:

Notion – The all-in-one workspace for your notes, tasks, wikis, and databases.: A new tool that blends your everyday work apps into one. It's the all-in-one workspace for you and your team


Latent Space ▷ #ai-in-action-club (136 messages🔥🔥):

Links mentioned:


OpenAccess AI Collective (axolotl) ▷ #general (52 messages🔥):

Links mentioned:


OpenAccess AI Collective (axolotl) ▷ #axolotl-dev (14 messages🔥):

Links mentioned:


OpenAccess AI Collective (axolotl) ▷ #general-help (121 messages🔥🔥):

Links mentioned:


OpenAccess AI Collective (axolotl) ▷ #rlhf (5 messages):


OpenAccess AI Collective (axolotl) ▷ #community-showcase (3 messages):

Links mentioned:


OpenAccess AI Collective (axolotl) ▷ #runpod-help (1 messages):


CUDA MODE ▷ #general (61 messages🔥🔥):

Links mentioned:


CUDA MODE ▷ #triton (6 messages):


CUDA MODE ▷ #cuda (22 messages🔥):

Links mentioned:


CUDA MODE ▷ #torch (9 messages🔥):

Links mentioned:


CUDA MODE ▷ #announcements (1 messages):


CUDA MODE ▷ #algorithms (4 messages):

Links mentioned:

Towards Efficient Generative Large Language Model Serving: A Survey from Algorithms to Systems: In the rapidly evolving landscape of artificial intelligence (AI), generative large language models (LLMs) stand at the forefront, revolutionizing how we interact with our data. However, the computati...


CUDA MODE ▷ #suggestions (1 messages):

Links mentioned:

MIT 6.5940 Fall 2023 TinyML and Efficient Deep Learning Computing: no description found


CUDA MODE ▷ #jobs (3 messages):


CUDA MODE ▷ #beginner (11 messages🔥):

Links mentioned:


CUDA MODE ▷ #pmpp-book (3 messages):


CUDA MODE ▷ #youtube-recordings (3 messages):

Links mentioned:


CUDA MODE ▷ #smol-hw (8 messages🔥):

Links mentioned:


CUDA MODE ▷ #ring-attention (45 messages🔥):

Links mentioned:


LangChain AI ▷ #general (39 messages🔥):

Links mentioned:


LangChain AI ▷ #langserve (3 messages):


LangChain AI ▷ #share-your-work (11 messages🔥):

Links mentioned:


LangChain AI ▷ #tutorials (7 messages):

Links mentioned:


Datasette - LLM (@SimonW) ▷ #ai (4 messages):


Datasette - LLM (@SimonW) ▷ #llm (30 messages🔥):

Links mentioned:


LLM Perf Enthusiasts AI ▷ #general (6 messages):

Links mentioned:


LLM Perf Enthusiasts AI ▷ #finetuning (1 messages):


LLM Perf Enthusiasts AI ▷ #opensource (3 messages):

Links mentioned:

Tweet from Lin Qiao (@lqiao): 🔥 Structure is all you need. 🔥 We’re excited to announce: - FireFunction V1 - our new, open-weights function calling model: - GPT-4-level structured output and decision-routing at 4x lower lat...


LLM Perf Enthusiasts AI ▷ #offtopic (5 messages):

Links mentioned:


LLM Perf Enthusiasts AI ▷ #collaboration (3 messages):

Links mentioned:

Tweet from Niccolò Zanichelli (in SF in May) (@nc_znc): Interesting analysis evaluating the capabilities of different LLMs (GPT-4, GPT-3.5 and some open ones) w.r.t. generating spaced repetition flashcards conditioned on some explanatory text. Clear improv...


LLM Perf Enthusiasts AI ▷ #openai (5 messages):

Links mentioned:


DiscoResearch ▷ #general (4 messages):

Links mentioned:


DiscoResearch ▷ #benchmark_dev (10 messages🔥):

Links mentioned:

GitHub - EQ-bench/EQ-Bench: A benchmark for emotional intelligence in large language models: A benchmark for emotional intelligence in large language models - EQ-bench/EQ-Bench


DiscoResearch ▷ #embedding_dev (2 messages):

Links mentioned:

🪆 Introduction to Matryoshka Embedding Models: no description found


DiscoResearch ▷ #discolm_german (6 messages):


AI Engineer Foundation ▷ #general (12 messages🔥):

Links mentioned:

Fine tuning for function-calling | OpenAI Cookbook: no description found


Alignment Lab AI ▷ #oo (1 messages):

Links mentioned:

imone/gemma-7b-with-it-tokens · Hugging Face: no description found


Skunkworks AI ▷ #general (1 messages):