Frozen AI News archive

Companies being held liable for AI hallucinations is Good Actually for AI Engineers

**Air Canada** faced a legal ruling requiring it to honor refund policies communicated by its AI chatbot, setting a precedent for corporate liability over the accuracy of AI systems. The tribunal ordered a refund of **$650.88 CAD** plus damages after the chatbot misled a customer about bereavement travel refunds. Meanwhile, AI community discussions highlighted innovations in **quantization techniques** for GPU inference, **Retrieval-Augmented Generation (RAG)** and fine-tuning of LLMs, and **CUDA** optimizations for PyTorch models. New prototype models like **Mistral-Next** and the **Large World Model (LWM)** were introduced, showcasing advances in handling large text contexts and in video generation with models like **Sora**. Ethical and legal implications of AI autonomy were debated alongside challenges in dataset management. Community-driven projects such as the open-source TypeScript agent framework **bazed-af** emphasize collaborative AI development. Additionally, benchmarks like **BABILong** for context evaluation up to **10M tokens** and tools from **karpathy** were noted.



This isn't strictly technical news, but not enough engineers are talking about the Air Canada ruling this weekend (summary below):

While the amounts here are small and this is just a tiny Canadian ruling, we think it is significant for engineers because it sets a precedent: courts are increasingly going to hold companies liable for sloppy AI engineering.

Other notables:


Table of Contents

[TOC]

PART 0: Summary of summaries of summaries

PART 1: High level Discord summaries

TheBloke Discord Summary


Eleuther Discord Summary


LM Studio Discord Summary

Boosting Token Generation to Maximize Speed: Engineers optimize GPU utilization to enhance token generation speed, exploring settings that push an RTX 4050 and Ryzen 7 up to 34 tokens/s. A user is looking to exceed this performance by offloading 50 layers and seeks advice on further improvements, while .gguf models are being fine-tuned for more human-like responses and censorship removal.
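When comparing offload settings like the 50-layer experiment above, it helps to measure throughput the same way every time. A minimal sketch of the arithmetic (the helper name is our own, not an LM Studio API):

```python
def tokens_per_second(n_tokens, start, end):
    """Throughput in tokens/s for a generation that emitted n_tokens
    between timestamps start and end (in seconds)."""
    elapsed = end - start
    if elapsed <= 0:
        raise ValueError("end must be after start")
    return n_tokens / elapsed

# 340 tokens generated in 10 seconds -> 34.0 tokens/s,
# the RTX 4050 figure discussed above.
rate = tokens_per_second(340, 0.0, 10.0)
```

Re-running the same prompt at different GPU layer counts and comparing this number is the quickest way to see whether further offloading actually helps.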

Hardware Tweaks and Multi-GPU Musings: Intel cores are being leveraged for KVMs on macOS and Windows, with an eye on upgrading from a 3090 to a 5090 GPU for better performance. The community is also sharing insights on multi-GPU configurations, power, space considerations, and tooling for optimized VRAM utilization across mismatched graphics cards.

LM Studio Model Recommendations and Quantization Insights: For users seeking the best 7b models with 32k context support, check TheBloke's repositories and sort by downloads in LMStudio's Model Explorer. Discussions point to Q5_K_M models for efficiency, and a Reddit post was highlighted for in-depth quantization method comparison.
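For intuition on what schemes like Q5_K_M are doing, here is a toy symmetric absmax quantizer in pure Python. Real llama.cpp K-quants use per-block scales and mixed bit widths; this sketch only illustrates the core round-trip of scale, round, dequantize:

```python
def quantize_absmax_int8(values):
    """Symmetric absmax quantization: map floats to int8 range [-127, 127]."""
    scale = max(abs(v) for v in values) / 127.0
    q = [round(v / scale) for v in values]
    return q, scale

def dequantize(q, scale):
    """Recover approximate floats from quantized ints and the scale."""
    return [x * scale for x in q]

weights = [0.5, -1.27, 0.02, 1.0]
q, scale = quantize_absmax_int8(weights)
approx = dequantize(q, scale)
```

The reconstruction error per weight is bounded by half a quantization step (scale / 2), which is why higher-bit variants like Q5 trade file size for fidelity.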

LM Studio Autogen and CrewAI Starting Points: A beginner's tutorial on using Autogen with Local AI Agent was shared, while a broken link in the autogen channel was reported. The pin regarding the link was successfully removed after a user's suggestion.

LM Studio Integration and Tech Troubleshooting: Discussion on integrating LM Studio with Flowise and LangFlow was initiated, with users sharing attempts to connect using http_client and tackling server connection issues. Users also shared configuration insights, including the manual settings needed to get the integration working.
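For the integration attempts above, the key detail is that LM Studio's local server speaks the OpenAI chat-completions format. A minimal sketch of building a request body for it — the default port 1234 and the placeholder model name are assumptions, adjust to your setup:

```python
import json

# LM Studio's local server exposes an OpenAI-compatible API;
# point Flowise/LangFlow (or any http client) at this base URL.
BASE_URL = "http://localhost:1234/v1"  # assumed default port

def chat_request_body(prompt, model="local-model", temperature=0.7):
    """Build the JSON body for POST {BASE_URL}/chat/completions."""
    return json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "temperature": temperature,
    })

body = chat_request_body("Hello from Flowise")
```

Tools that accept a custom OpenAI base URL usually need only this base URL and any dummy API key to connect.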


Mistral Discord Summary


Nous Research AI Discord Summary


LAION Discord Summary


OpenAI Discord Summary


Perplexity AI Discord Summary


HuggingFace Discord Summary

Fine-Tuning Fervor for Indian Laws: Users discussed approaches for processing Indian IPS laws with @keshav... leaning towards fine-tuning llama 2 while @vishyouluck suggested using a RAG approach instead.
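The RAG approach suggested above boils down to "retrieve relevant passages, then stuff them into the prompt." A toy retriever using word overlap (a real system would use embeddings, but the control flow is identical; the example law texts are illustrative only):

```python
def retrieve(query, documents, k=1):
    """Rank documents by word overlap with the query (toy retriever).
    Production RAG swaps this scoring for embedding similarity."""
    q_words = set(query.lower().split())
    scored = sorted(
        documents,
        key=lambda d: len(q_words & set(d.lower().split())),
        reverse=True,
    )
    return scored[:k]

laws = [
    "Section 302 deals with punishment for murder",
    "Section 420 deals with cheating and dishonesty",
]
hits = retrieve("what is the punishment for murder", laws)
```

The appeal over fine-tuning is that updating the corpus requires no training run — a relevant point for frequently amended legal texts.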

Game Development Gets Competitive: @om7059 probed the community for tips on integrating model evaluation as a scoring mechanism in a multiplayer doodle game.

Geographical Model Mastery Sought: @retonq sought insights on the best model between Mistral medium, pplx, and llama for interpreting geographic information like coordinates and directions.

Persian Language Model Quest: @alifthi was in search of a high-performance, Persian-supporting open-source language model, with @alchemist_17. recommending fine-tuning models such as mistral or llama2 with a custom dataset.

Code Quality via Plagiarism Tools: @brady_kelly shared a method for ensuring documentation completion by using plagiarism detection in software CI/CD pipelines.

Prompt-Driven RAG Innovations: @subham5089 shared a blog post discussing the challenges in prompt-driven RAG systems, adding depth to the conversation on technological advancements in this area. Read the blog post.

Reinforcement Learning Enhancements: A lecture exploring RLHF and alternatives to PPO, including DPO, was shared by @nagaraj_arvind for those looking to refine LLM completions with RL techniques. Watch the lecture video and read the DPO paper.
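For reference, the DPO objective from the paper linked above replaces the PPO reward-model loop with a single supervised loss over preferred completions $y_w$ and rejected completions $y_l$:

```latex
\mathcal{L}_{\mathrm{DPO}}(\pi_\theta; \pi_{\mathrm{ref}})
= -\,\mathbb{E}_{(x,\, y_w,\, y_l) \sim \mathcal{D}} \left[
\log \sigma\!\left(
\beta \log \frac{\pi_\theta(y_w \mid x)}{\pi_{\mathrm{ref}}(y_w \mid x)}
- \beta \log \frac{\pi_\theta(y_l \mid x)}{\pi_{\mathrm{ref}}(y_l \mid x)}
\right)\right]
```

Here $\pi_{\mathrm{ref}}$ is the frozen reference policy and $\beta$ controls how far the fine-tuned policy may drift from it.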

Protein Language Models Deciphered: Limitations of PLMs were discussed with reference to a recent paper shared by @grimsqueaker, highlighting the need for new pretraining methods despite beneficial outcomes from current pretraining practices. (Read the abstract, Discuss on Twitter).

Text to 3D for VR by Intel: @abhinit21 pointed out Intel's LDM3D-VR, which has opened new opportunities within virtual reality development by converting text to 3D models (model on Hugging Face, read the paper).

Deepfake Detection Development: @lucas_selva promoted a web app using XAI to identify deepfakes and expressed intentions of future advancements (try the app).

Databricks Directs Generative AI: @valeriiakuka shared an article outlining Databricks' impact on the generative AI space and its strategy amid recent acquisitions (read the full story).

Creations and Computations Collide: The i-made-this channel buzzed with praise for <@848983314018336809>'s creation, the rollout of new models at FumesAI, the Statricks founder's journey, and ProteinBERT's efficient architecture (GitHub repo, research paper).

PEFT Presentation Locked In: @prateeky2806 set the expectation for an enlightening demo on integrating merging methods in the PEFT library on Friday, March 1st (GitHub PR).

YouTube Explored for Mamba Insights: A compilation of videos explaining Mamba and SSMs was shared to support the community's understanding of the technologies (compiled playlist).

DPO Dynamics Discussed: @maxpappa and @arturzm traded insights on full DPO's impact, while others sought guidance on BitsAndBytes conversions and delved into the mathematical intricacies of diffusion models, with resources shared for bolstering their comprehension.

Discovering Model Compatibility for Varied Tasks: Various users inquired about tools and practices across unique uses such as @corneileous for UI elements in training, @smallcrawler for patching weather prediction models, and @little.stone's curiosity about diffusion models on time series data, showing the versatile application of AI models.


LlamaIndex Discord Summary


Latent Space Discord Summary


CUDA MODE Discord Summary


OpenAccess AI Collective (axolotl) Discord Summary

8-Bit Models Step Up: @nafnlaus00 discussed full finetunes on 8-bit models, reflecting on AI Explained's advancements, while Stability AI's focus was questioned by @dreamgen.

Adam's Replacement: The optimizer shift from Adam to Lion was debated, with a GitHub repository shared for implementation.
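The Lion optimizer debated above differs from Adam in that only the *sign* of the interpolated momentum sets the step direction, so the step magnitude is fixed by the learning rate. A scalar sketch of the published update rule (real implementations operate on tensors):

```python
def sign(x):
    """Sign of a scalar: -1, 0, or 1."""
    return (x > 0) - (x < 0)

def lion_step(theta, m, grad, lr=1e-4, beta1=0.9, beta2=0.99, wd=0.0):
    """One Lion update for a single scalar parameter.
    c interpolates momentum and gradient; only its sign is used,
    plus decoupled weight decay wd."""
    c = beta1 * m + (1 - beta1) * grad        # update direction
    theta = theta - lr * (sign(c) + wd * theta)
    m = beta2 * m + (1 - beta2) * grad        # momentum update
    return theta, m

theta, m = lion_step(theta=1.0, m=0.0, grad=2.0, lr=0.1)
```

Because every step has magnitude lr (per coordinate), Lion is typically run with a smaller learning rate and larger weight decay than Adam.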

Perplexity and Learnability in LLMs: @dreamgen and @suikamelon contemplated using perplexity and learnability for selecting fine-tuning data, alluding to a scientific paper for a deeper dive.

SPIN's Implementation: An official Self-Play Fine-Tuning (SPIN) GitHub Link was provided by @nruaif.
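Perplexity, as used in the data-selection idea above, is just the exponential of the mean negative log-likelihood over a sequence's tokens. A minimal sketch:

```python
import math

def perplexity(token_logprobs):
    """Perplexity = exp(mean negative log-likelihood).
    Lower values mean the model finds the text more predictable;
    selection schemes rank candidate fine-tuning examples by
    scores of this kind."""
    nll = -sum(token_logprobs) / len(token_logprobs)
    return math.exp(nll)

# A sequence where every token had probability 1/2 (logprob = -ln 2)
# has perplexity exactly 2.
ppl = perplexity([math.log(0.5)] * 4)
```

Intuitively, examples with very low perplexity teach the model nothing new, while examples with extreme perplexity may be noise — hence the interest in "learnability" as a complementary signal.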

PyTorch and Torchdistx on the Merge Front: Updates to PyTorch were suggested alongside discussions about Torchdistx integration shown in a GitHub commit, highlighting challenges with non-native optimizers.

Dataset Blues and Consistent Prompts: Frustrations with dataset management were vented by @iatetoomanybeans, and uniformity within dataset system prompts was confirmed. Interest was sparked by the Neural-DPO dataset centered around AI and the Aya initiative found on Huggingface.

DPO Puzzles Players: Users like @filippob82 and @noobmaster29 voiced confusions and challenges surrounding evaluations with DPO, suggesting it's an unresolved issue.

RunPod and Replicate Queries: Brief messages implied a user error in RunPod, mentioned by c.gato, while j_sp_r shared an insight via a link comparing Replicate and Fly.


LangChain AI Discord Summary

Fine-Tuning Tactics sought for Sales LLMs: @david1542 expressed challenges in fine-tuning LLMs for domain-specific tasks, such as sales, due to the agent's lack of understanding of company-specific processes.

Trace Troubles Trouble Pricing: @pasko70 highlighted a cost issue with LangSmith, where trace costs are prohibitively expensive for applications with low-to-medium token throughput; no solution or community response was provided.

Vector DB Confusions Complicate Whisper: @cablecutter delved into issues when processing Whisper transcripts into vector databases for thematic summarization and QA, struggling with integrating short-context segments.
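The short-context problem described above — Whisper segments are too brief to embed meaningfully on their own — is commonly handled by merging segments into overlapping windows before indexing. A sketch (window sizes in segments rather than tokens is a simplification):

```python
def chunk_segments(segments, size=3, overlap=1):
    """Merge short transcript segments into overlapping windows so each
    embedded chunk carries enough context for retrieval and QA.
    size and overlap are counted in segments, not tokens."""
    step = size - overlap
    chunks = []
    for start in range(0, len(segments), step):
        window = segments[start:start + size]
        if window:
            chunks.append(" ".join(window))
        if start + size >= len(segments):
            break
    return chunks

segs = ["a b", "c d", "e f", "g h", "i j"]
chunks = chunk_segments(segs)
```

The overlap means a theme spanning a segment boundary still lands wholly inside at least one chunk, at the cost of some index redundancy.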

Tech Troubles With LangChain Updates: Users encountered errors with LangChain updates, specifically with the TextLoader module and were left seeking fixes, with @dre99899 suggesting workarounds based on GitHub issue #17585.

Seeking RAG API Wisdom: A request was made by @mamo7410 for guidance on implementing a RAG API with langserv, including questions about streaming, runtime IDs, and context document handling, with no clear instructions found.

Multimodal RAG Mingles with PrivateGPT: @zhouql1978 built a Multimodal RAG using Langchain and PrivateGPT, announced in a Twitter post and said to work with various document formats in under 300 lines of code.

Scribe Seeks Scrutiny: @shving90 requested feedback on a writing platform project called Scribe, which can be found here, yet no specific feedback has been mentioned in the conversation.

Memory Mimicking via Open Source: @courtlandleer from Plastic Labs introduced an open-source alternative to OpenAI's 'memory' with Honcho, featuring a demo & discord bot as explained in their blog post.

Whisper Writings: @amgadoz produced a three-part series on OpenAI's Whisper for ASR, exploring architecture, multitasking, and development process, linked to Substack articles.

LangChain Learns Rust: The LangChain library was ported to Rust by @edartru., aiming to simplify writing LLM-based programs, with the GitHub repository available here.

Financial Analyst AI Tutorial: @solo78 shared a Medium article detailing the process to analyze the risk profiles of insurance companies using OpenAI's Assistant API, find the guide here.

YouTube Aids LangChain Apprenticeship: Tutorial videos discussed include creating a Retrieval Augmented Generation UI with ChainLit, adding live stock data to crewAI, and introducing LangSmith for LLM development, found on their respective YouTube channels mentioned above.


DiscoResearch Discord Summary


Skunkworks AI Discord Summary


Alignment Lab AI Discord Summary


AI Engineer Foundation Discord Summary


PART 2: Detailed by-Channel summaries and links

TheBloke ▷ #general (1409 messages🔥🔥🔥):

Links mentioned:


TheBloke ▷ #characters-roleplay-stories (274 messages🔥🔥):

Links mentioned:


TheBloke ▷ #training-and-fine-tuning (26 messages🔥):


TheBloke ▷ #coding (129 messages🔥🔥):

Links mentioned:


Eleuther ▷ #general (339 messages🔥🔥):

Links mentioned:


Eleuther ▷ #research (196 messages🔥🔥):

Links mentioned:


Eleuther ▷ #interpretability-general (3 messages):

Links mentioned:


Eleuther ▷ #lm-thunderdome (38 messages🔥):

Links mentioned:


Eleuther ▷ #gpt-neox-dev (13 messages🔥):

Links mentioned:

Zoology (Blogpost 2): Simple, Input-Dependent, and Sub-Quadratic Sequence Mixers: no description found


LM Studio ▷ #💬-general (328 messages🔥🔥):

Links mentioned:


LM Studio ▷ #🤖-models-discussion-chat (33 messages🔥):

Links mentioned:


LM Studio ▷ #🧠-feedback (14 messages🔥):


LM Studio ▷ #🎛-hardware-discussion (127 messages🔥🔥):

Links mentioned:


LM Studio ▷ #🧪-beta-releases-chat (12 messages🔥):

Links mentioned:

Don't ask to ask, just ask: no description found


LM Studio ▷ #autogen (8 messages🔥):

Links mentioned:


LM Studio ▷ #rivet (1 messages):

mend1440: Dang, this project is fuego!!!


LM Studio ▷ #langchain (7 messages):

Links mentioned:


Mistral ▷ #general (370 messages🔥🔥):

Links mentioned:


Mistral ▷ #models (30 messages🔥):


Mistral ▷ #deployment (43 messages🔥):

Links mentioned:

6freedom | Experts en technologies immersives: 6freedom est une agence experte en technologies immersives. Nous vous accompagnons dans l'élaboration de vos projets sur-mesure.


Mistral ▷ #finetuning (6 messages):

Links mentioned:

SLEB: Streamlining LLMs through Redundancy Verification and Elimination of Transformer Blocks: Large language models (LLMs) have proven to be highly effective across various natural language processing tasks. However, their large number of parameters poses significant challenges for practical d...


Mistral ▷ #random (2 messages):

Links mentioned:

Applied Generative AI Engineer - Elqano - CDI: Elqano recrute un(e) Applied Generative AI Engineer !


Mistral ▷ #la-plateforme (24 messages🔥):

Links mentioned:

GitHub - e-p-armstrong/augmentoolkit at api-branch: Convert Compute And Books Into Instruct-Tuning Datasets - GitHub - e-p-armstrong/augmentoolkit at api-branch


Nous Research AI ▷ #ctx-length-research (16 messages🔥):

Links mentioned:

GitHub - elder-plinius/MYLN: A language compressor for enhanced context length and efficiency of LLM-to-LLM communication.: A language compressor for enhanced context length and efficiency of LLM-to-LLM communication. - elder-plinius/MYLN


Nous Research AI ▷ #off-topic (33 messages🔥):

Links mentioned:


Nous Research AI ▷ #benchmarks-log (1 messages):

Links mentioned:

GitHub - teknium1/LLM-Benchmark-Logs: Just a bunch of benchmark logs for different LLMs: Just a bunch of benchmark logs for different LLMs. Contribute to teknium1/LLM-Benchmark-Logs development by creating an account on GitHub.


Nous Research AI ▷ #interesting-links (26 messages🔥):

Links mentioned:


Nous Research AI ▷ #general (323 messages🔥🔥):

Links mentioned:


Nous Research AI ▷ #ask-about-llms (54 messages🔥):

Links mentioned:


Nous Research AI ▷ #project-obsidian (5 messages):


LAION ▷ #general (368 messages🔥🔥):

Links mentioned:


LAION ▷ #research (27 messages🔥):

Links mentioned:


OpenAI ▷ #ai-discussions (131 messages🔥🔥):

Links mentioned:


OpenAI ▷ #gpt-4-discussions (142 messages🔥🔥):

Links mentioned:


OpenAI ▷ #prompt-engineering (54 messages🔥):

Links mentioned:


OpenAI ▷ #api-discussions (54 messages🔥):

Links mentioned:


Perplexity AI ▷ #general (227 messages🔥🔥):

Links mentioned:


Perplexity AI ▷ #sharing (38 messages🔥):

Links mentioned:


Perplexity AI ▷ #pplx-api (27 messages🔥):

Links mentioned:


HuggingFace ▷ #general (144 messages🔥🔥):

Links mentioned:


HuggingFace ▷ #today-im-learning (8 messages🔥):

Links mentioned:


HuggingFace ▷ #cool-finds (17 messages🔥):

Links mentioned:


HuggingFace ▷ #i-made-this (10 messages🔥):

Links mentioned:


HuggingFace ▷ #reading-group (58 messages🔥🔥):

Links mentioned:


HuggingFace ▷ #diffusion-discussions (13 messages🔥):

Links mentioned:

Quantization primitives: no description found


HuggingFace ▷ #computer-vision (7 messages):

Links mentioned:

Know Your Data: no description found


HuggingFace ▷ #NLP (10 messages🔥):


HuggingFace ▷ #diffusion-discussions (13 messages🔥):

Links mentioned:

Quantization primitives: no description found


LlamaIndex ▷ #announcements (1 messages):

jerryjliu0: webinar happening now!


LlamaIndex ▷ #blog (9 messages🔥):

Links mentioned:

Unboxing Nomic Embed v1.5: Resizable Production Embeddings with Matryoshka Representation Learning: Nomic introduces a truly open text embedding model with resizable embeddings.


LlamaIndex ▷ #general (222 messages🔥🔥):

Links mentioned:


LlamaIndex ▷ #ai-discussion (24 messages🔥):

Links mentioned:


Latent Space ▷ #ai-general-chat (32 messages🔥):

Links mentioned:


Latent Space ▷ #ai-announcements (8 messages🔥):

Links mentioned:


Latent Space ▷ #llm-paper-club-east (32 messages🔥):

Links mentioned:


Latent Space ▷ #ai-in-action-club (182 messages🔥🔥):

Links mentioned:


CUDA MODE ▷ #general (46 messages🔥):

Links mentioned:


CUDA MODE ▷ #cuda (9 messages🔥):

Links mentioned:


CUDA MODE ▷ #announcements (1 messages):


CUDA MODE ▷ #algorithms (11 messages🔥):

Links mentioned:


CUDA MODE ▷ #beginner (23 messages🔥):

Links mentioned:


CUDA MODE ▷ #pmpp-book (4 messages):

Links mentioned:

flash-attention/csrc/flash_attn/flash_api.cpp at 5cdabc2809095b98c311283125c05d222500c8ff · Dao-AILab/flash-attention: Fast and memory-efficient exact attention. Contribute to Dao-AILab/flash-attention development by creating an account on GitHub.


CUDA MODE ▷ #youtube-recordings (1 messages):

Links mentioned:

Lecture 5: Going Further with CUDA for Python Programmers: Material here https://github.com/cuda-mode/lectures


CUDA MODE ▷ #jax (3 messages):

Links mentioned:

Torch.fx: Practical Program Capture and Transformation for Deep Learning in Python: Modern deep learning frameworks provide imperative, eager execution programming interfaces embedded in Python to provide a productive development experience. However, deep learning practitioners somet...


CUDA MODE ▷ #ring-attention (79 messages🔥🔥):

Links mentioned:


OpenAccess AI Collective (axolotl) ▷ #general (54 messages🔥):

Links mentioned:


OpenAccess AI Collective (axolotl) ▷ #axolotl-dev (19 messages🔥):

Links mentioned:


OpenAccess AI Collective (axolotl) ▷ #general-help (8 messages🔥):


OpenAccess AI Collective (axolotl) ▷ #datasets (8 messages🔥):

Links mentioned:

NeuralNovel/Neural-DPO · Datasets at Hugging Face: no description found


OpenAccess AI Collective (axolotl) ▷ #rlhf (4 messages):


OpenAccess AI Collective (axolotl) ▷ #runpod-help (1 messages):

c.gato: User Error.


OpenAccess AI Collective (axolotl) ▷ #replicate-help (1 messages):

j_sp_r: https://venki.dev/notes/replicate-vs-fly


LangChain AI ▷ #general (52 messages🔥):

Links mentioned:


LangChain AI ▷ #langserve (1 messages):


LangChain AI ▷ #langchain-templates (1 messages):

tumultuous_amicable: wow u def don't want to put your api key in a discord channel


LangChain AI ▷ #share-your-work (7 messages):

Links mentioned:


LangChain AI ▷ #tutorials (6 messages):

Links mentioned:


DiscoResearch ▷ #general (8 messages🔥):


DiscoResearch ▷ #benchmark_dev (1 messages):


DiscoResearch ▷ #embedding_dev (4 messages):

Links mentioned:

Tweet from Jina AI (@JinaAI_): Introducing jina-colbert-v1-en. It takes late interactions & token-level embeddings of ColBERTv2 and has better zero-shot performance on many tasks (in and out-of-domain). Now on @huggingface under Ap...


DiscoResearch ▷ #discolm_german (3 messages):


Skunkworks AI ▷ #compute (1 messages):

bluetyson: that is interesting - kickstarter for crunching?


Skunkworks AI ▷ #off-topic (5 messages):

Links mentioned:


Skunkworks AI ▷ #papers (2 messages):

Links mentioned:


Alignment Lab AI ▷ #general-chat (1 messages):



Alignment Lab AI ▷ #oo (3 messages):


AI Engineer Foundation ▷ #general (1 messages):

Links mentioned:

no title found: no description found


AI Engineer Foundation ▷ #events (2 messages):

Links mentioned: