Frozen AI News archive

FineWeb: 15T Tokens, 12 years of CommonCrawl (deduped and filtered, you're welcome)

**2024** has seen a significant jump in dataset sizes for training large language models: **Redpajama 2** offers up to **30T tokens**, **DBRX** was trained on **12T tokens**, **Reka Core/Flash/Edge** on **5T tokens**, and **Llama 3** on **15T tokens**. **Huggingface** released an open dataset containing **15T tokens** from **12 years** of filtered CommonCrawl data, enough to train a model like **Llama 3** given the compute. On Reddit, **WizardLM-2-8x22b** outperformed other open LLMs, including **Llama-3-70b-instruct**, on reasoning and math benchmarks. **Claude Opus** demonstrated strong zero-shot code error spotting, surpassing **Llama 3**. Benchmarks revealed limitations in the **LMSYS chatbot leaderboard**, with instruction-tuned models gaming the system, and a new RAG benchmark showed **Llama 3 70B** underperforming **GPT-4**, while **Mixtral 8x7B** remained strong. Efficient quantized versions of **Llama 3** models are available on **Huggingface**, with users reporting token generation limits around **9600 tokens** on a 3090 GPU. On the safety front, a UK sex offender was banned from using AI tools, and **GPT-4** demonstrated an **87% success rate** at exploiting real vulnerabilities, raising security concerns.


2024 seems to have broken some kind of "4-minute mile" with regard to datasets. Although Redpajama 2 offered up to 30T tokens, most 2023 LLMs were trained on no more than 2.5T tokens - but then DBRX came out with 12T tokens, Reka Core/Flash/Edge with 5T, and Llama 3 with 15T. And now Huggingface has released an open dataset of 12 years of filtered and deduplicated CommonCrawl data, totaling 15T tokens.


Notably, Guilherme was previously on the TII UAE Falcon 40B team, where he was responsible for their RefinedWeb dataset.

One week after Llama 3's release, you now have the data to train your own Llama 3, if you have the compute and code.


Table of Contents

[TOC]


AI Reddit Recap

Across r/LocalLlama, r/machinelearning, r/openai, r/stablediffusion, r/ArtificialInteligence, and r/Singularity. Comment crawling works now but still has lots of room to improve!

AI Models and Capabilities

Benchmarks and Leaderboards

Quantization and Performance

Censorship and Safety

Memes and Humor


AI Twitter Recap

all recaps done by Claude 3 Opus, best of 4 runs. We are working on clustering and flow engineering with Haiku.

Meta Llama 3 Release

Reactions and Implications

Technical Discussions


AI Discord Recap

A summary of Summaries of Summaries


PART 1: High level Discord summaries

Unsloth AI (Daniel Han) Discord

Llama 3 is the Talk of the Town: Unsloth AI's integration of Llama 3 has sparked discussions on its potential for 2x faster training and 60% less memory usage as detailed on their GitHub Release page. The community eagerly explores 4-bit models and the effects of quantization on model quality, highlighted by significant activity in experimenting with various Llama 3 variants, including those optimized for different languages and shared on platforms like Hugging Face.
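For the curious, loading one of those 4-bit Llama 3 variants with Unsloth is only a few lines; this is a minimal sketch following their published examples (the model id and kwargs are assumptions - check their Hugging Face org for current names):

```python
# Minimal sketch: load a pre-quantized 4-bit Llama 3 with Unsloth.
# Model id and kwargs follow Unsloth's published examples (assumed).
from unsloth import FastLanguageModel

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/llama-3-8b-bnb-4bit",  # pre-quantized 4-bit weights
    max_seq_length=2048,                       # context window for training
    load_in_4bit=True,                         # bitsandbytes 4-bit loading
)
```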

Notebook Nudge: AI enthusiasts are encouraged to test Llama 3 via comprehensively prepared notebooks on Google Colab and Kaggle, making way for fine-tuning and experimentation across the board.

Solving Model Mysteries and Sharing Secrets: Candid exchanges revealed struggles and successes, from fine-tuning and inference issues with LLaMA 3 models to hardware discussions about the NVIDIA Jetson Orin Nano. Proposed fixes for looping responses and insights into effective CUDA utilization were shared, indicating a culture of collaborative problem-solving.

Sharing in Showcase: Achievements are on full display with instances such as a LinkedIn post revealing the finesse of fine-tuning Llama3 for Arabic, and the debut of the Swedish model 'bellman.' The Ghost 7B Alpha language model also got attention for its English and Vietnamese optimizations.

Ideas and Input in Suggestions: Dialogue in the #suggestions channel provided valuable takeaways, such as a need for tutorials on model merging and CUDA debugging and the potential for multi-GPU capabilities with Unsloth Studio. Adjustments to server welcome messages for better readability indicated a response to community feedback.


Perplexity AI Discord


Nous Research AI Discord

Puzzling Over Multi-GPU Context Inference: Members are evaluating how to run long-context inference with models like Jamba across multiple GPUs. Tools such as DeepSpeed and Hugging Face's Accelerate haven't yielded much luck, while vllm's tensor-parallel support looks promising, though it does not yet support Jamba.
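As a rough illustration of the tensor-parallel route, here is a minimal vllm sketch, assuming a vllm-supported model (Jamba itself was not supported at the time, so a supported model id stands in):

```python
# Minimal tensor-parallel inference sketch with vLLM; Jamba was not yet
# supported, so a supported model id stands in here.
from vllm import LLM, SamplingParams

llm = LLM(
    model="meta-llama/Meta-Llama-3-8B-Instruct",  # any vLLM-supported model
    tensor_parallel_size=2,                       # shard weights across 2 GPUs
)
params = SamplingParams(temperature=0.7, max_tokens=256)
outputs = llm.generate(["Summarize this long report: ..."], params)
print(outputs[0].outputs[0].text)
```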

Beat-Dropping Dataset Announcements: A latent CIFAR100 dataset has been shared on Hugging Face, surprising community members with an approximate 19% accuracy using a simple FFN despite most latents not decoding accurately.
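For context on what "a simple FFN" means here, a classifier in that spirit might look like the following sketch (the latent dimensionality is an assumption, not from the announcement):

```python
# Hypothetical sketch of the "simple FFN" baseline on CIFAR-100 latents;
# the latent dimensionality is assumed, not taken from the dataset card.
import torch.nn as nn

class LatentFFN(nn.Module):
    def __init__(self, latent_dim: int = 1024, num_classes: int = 100):
        super().__init__()
        self.net = nn.Sequential(
            nn.Flatten(),                 # latents -> flat vectors
            nn.Linear(latent_dim, 512),
            nn.ReLU(),
            nn.Linear(512, num_classes),  # 100 CIFAR-100 classes
        )

    def forward(self, x):
        return self.net(x)
```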

DeepMind Drops Penzai for Network Craft: Penzai, a JAX research toolkit for neural network innovation from DeepMind, has garnered attention; separately, rubiks.ai, an advanced research assistant and search engine offering trial premium access to models like Claude 3 Opus and GPT-4 Turbo, is seeking beta testers.

WorldSim's Feature-Rich Comeback: The relaunch of WorldSim includes features such as WorldClient and Mind Meld, with a new pay-as-you-go model for tokens, and a selection of models (Opus, Sonnet, Haiku) for different cost profiles.

Scrutinizing LLMs Across the Spectrum: Discussions on the slight margin in performance between Llama 3 8B and Mistral 7B, despite Llama's larger dataset, graced the forum. Meanwhile, evaluations of Llama 3 70B show more promise, and there are varied stances on the relevance of the term 'grokking', particularly in reference to LLMs.


LM Studio Discord


Stability.ai (Stable Diffusion) Discord


CUDA MODE Discord


OpenAccess AI Collective (axolotl) Discord

BOS Token Issue Resolved for LLaMa-3: An important fix was addressed with LLaMa-3's fine-tuning process, as a missing BOS token was causing issues; this has been rectified with a PR in the tokenizer configuration.
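A quick way to verify the fix on your own setup is to check that the tokenizer now prepends a BOS token; a hedged sketch (model id assumed):

```python
# Sanity check (not the PR itself): confirm the tokenizer prepends BOS.
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("meta-llama/Meta-Llama-3-8B")  # assumed id
ids = tok("hello world").input_ids
assert ids[0] == tok.bos_token_id, "BOS token still missing - fix not applied"
```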

Fine-Tuning LLaMa-3 Hits a Snag: While trying to fine-tune LLaMa-3, a user hit a mysterious RuntimeError that did not occur with other models like Mistral and LLaMa-2.

Tokenizing Troubles: The LLaMa-3 tokenizer's extensive vocabulary sparked a debate about its necessity and efficiency, some favoring a streamlined approach, others defending its ability to encode large texts with fewer tokens.

VRAM Consumption Detailed for Large LLMs: A clear VRAM usage breakdown was provided for large LLMs, revealing logits and hidden states sizes up to "19.57GiB" and "20GiB" respectively, using a massive "81920 tokens" batch size.
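Those two figures are consistent with Llama-3-8B shapes in bf16, which makes for a quick back-of-envelope check (the shapes are assumed, not stated in the discussion):

```python
# Back-of-envelope check: the quoted 19.57GiB / 20GiB match Llama-3-8B
# shapes (vocab 128256, hidden 4096, 32 layers) in 2-byte bf16.
tokens = 81920
vocab, hidden, layers, bytes_per = 128256, 4096, 32, 2

logits_gib = tokens * vocab * bytes_per / 2**30            # 19.57 GiB
hidden_gib = tokens * hidden * layers * bytes_per / 2**30  # 20.00 GiB
print(f"logits: {logits_gib:.2f} GiB, hidden states: {hidden_gib:.2f} GiB")
```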

Axolotl's Resources for Dataset Customization: A pointer was given to Axolotl's datasets documentation for those seeking to understand custom dataset structures, offering key examples and formatting for various training tasks.
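As one concrete example of the structures those docs describe, an alpaca-style JSONL is among the most common shapes (a sketch only; see the docs for the full list of supported formats):

```python
# One common Axolotl-compatible dataset shape: alpaca-style JSONL.
# Sketch only; the datasets docs list many other supported formats.
import json

rows = [
    {"instruction": "Translate to French.", "input": "Hello", "output": "Bonjour"},
]
with open("train.jsonl", "w") as f:
    for row in rows:
        f.write(json.dumps(row) + "\n")
```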


Eleuther Discord


Modular (Mojo 🔥) Discord

C++ Sneaks Past Python: Discussions revealed a performance advantage for C++ over Python/Mojo interfaces, linked to the bypass of Python runtime calls, potentially impacting inference times.

Frameworks Forge Ahead: Dialogues indicated a bright future for building Mojo frameworks, with anticipation for a time when Python frameworks can be utilized within Mojo, echoing the compatibility seen between JavaScript and TypeScript.

Performance Enigmas and Enhancements: A user reported that a Rust prefix sum computation was significantly slower than Mojo's, spawning a performance mystery. Meanwhile, a separate debate on introducing SIMD aliases in Mojo shows momentum toward refining the language's efficiency and syntax clarity.

Teaser Tweets Tantalize Techies: Modular released a series of teaser tweets suggesting a major announcement. While details remain scarce, anticipation is evident among followers awaiting the revelation.

Video Assistance Request Resonates: A member's request for likes and feedback on their AI evolution video not only seeks community support but also reflects the commitment to AI education and discourse even under tight timelines.


HuggingFace Discord


OpenRouter (Alex Atallah) Discord


Latent Space Discord


LAION Discord

Meta's Mystery Moves: Debate ignited over Meta's unusual decision to hold back the LLaMA-3 paper, signaling a potential shift in their framework for model releases, though no reason for the divergence was cited.

Ethics and Legality in AI Tooling: The group scrutinized the legal and ethical considerations surrounding Nightshade, noting that its ability to interfere with AI training could potentially put it in conflict with the Computer Fraud and Abuse Act (CFAA).

Boosting Diffusion Model Speed: Research by NVIDIA, the University of Toronto, and the Vector Institute introduced "Align Your Steps," an approach to accelerating diffusion models, discussed in their publication; a call to release the training code for full transparency was also noted.

Benchmarking Visual Perception in LLMs: A new benchmark named Blink was introduced for evaluating multimodal language models; it particularly measures visual perception, where models like GPT-4V show a gap when compared to human performance. The Blink benchmark is detailed in the research abstract.

Collaborative Development for NLP Coding Assistant: Interest was shown in developing an NLP coding assistant for JavaScript/Rust, with calls for collaboration and knowledge-sharing, suggesting an ongoing pursuit for improved automation tools among engineers.


OpenAI Discord


LlamaIndex Discord

LlamaParse Automates Code Mastery: A collaboration with TechWithTimm enables setup of local Large Language Models (LLMs) using LlamaParse to construct agents capable of writing code; details and a workflow glimpse are on Twitter.

Local RAG Goes Live: Instructions for crafting a RAG application entirely locally using MetaAI's Llama-3 can be found alongside an informative Twitter post, highlighting the move towards self-hosted AI applications.
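In the same spirit as that recipe (though not taken from it), a fully local RAG pipeline can be sketched in a few lines, assuming Llama-3 is served via Ollama and documents sit in ./data:

```python
# A minimal local-RAG sketch (ours, not the linked recipe): Llama-3 via
# Ollama for generation, a local embedding model, documents from ./data.
from llama_index.core import Settings, SimpleDirectoryReader, VectorStoreIndex
from llama_index.embeddings.huggingface import HuggingFaceEmbedding
from llama_index.llms.ollama import Ollama

Settings.llm = Ollama(model="llama3", request_timeout=120.0)
Settings.embed_model = HuggingFaceEmbedding(model_name="BAAI/bge-small-en-v1.5")

docs = SimpleDirectoryReader("data").load_data()
index = VectorStoreIndex.from_documents(docs)
print(index.as_query_engine().query("What do these documents claim?"))
```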

Tackling AI's Enigma 'Infini Attention': An explainer on Infini Attention’s potential impact on generative AI was introduced along with an insights-rich LinkedIn post.

Geographical AI Data Visualization: The AI Raise Tracking Sheet now includes and displays AI funding by city, inviting community scrutiny via this Google spreadsheet; a celebratory tweet emphasizes the geographical spread of AI companies over the past year.

Enhanced Markdown for LLMs and Knowledge Graph SDK: FireCrawl's integration with LlamaIndex supplies LLMs with clean markdown from crawled pages, while WhyHow.AI's Knowledge Graph SDK now facilitates building schema-controlled automated graphs; further exploration in the respective Medium articles and here.


OpenInterpreter Discord

Fine-Tuning AI with Lightning Speed: Engineers in the guild have been experimenting with quick-learning models such as Mixtral and Llama, noting the small dataset sizes needed for efficient fine-tuning.

Groq's Rocking Performance with Llama3: The Llama3 model shows impressive speed on Groq hardware, sparking interest for its use in practical applications, with discussion on GitHub pinpointing installation bugs specific to OI on Windows.

Bug Hunts and Workarounds in AI Tools: The community discussed various bugs, such as the spacebar issue on M1 Macbooks with O1 and performance issues with Llama 3 70b. Recommended fixes included installing ffmpeg and using conda for alternate Python versions.

Windows Woes and Macbook Mistakes: Issues running Open Interpreter's O1 on Windows signal possible client problems, and voice recognition glitches on M1 Macbooks are causing disruptions when the spacebar is pressed.

Confusions Clarified and Stability Scrutinized: Clarification was made on O1 versus Open Interpreter compatibility with Groq. Stability concerns were raised for Llama 3 70b models, suggesting that larger models may have greater instability issues than their smaller counterparts.


Cohere Discord

MySQL Connector Confusion Cleared: Integration of MySQL with Cohere LLMs sparked questions about the use of Docker and direct database answers. A GitHub repository provides reference code, though issues were reported about outdated documentation and malfunctioning create_connector commands.

No Command R for Profit: It was clarified that Command R (and Command R+) is restricted to non-commercial use under the CC-BY-NC 4.0 license, barring usage on edge devices for commercial purposes.

AI Startup Talent Call: An AI startup founder is actively seeking experts with a strong background in AI research and LLMs to assist with model tuning and voice models. Interested candidates are encouraged to connect via LinkedIn.

Alternative Routes after Internship Setback: Advice was shared for pursuing ML/software engineering roles post-internship rejection at Cohere, which included tapping into university networks, seeking companies with non-public intern opportunities, contributing to open-source initiatives, and attending job fairs.

AI Ethical Dilemmas and Tech Updates: Discussions included concerns over the ethical implications of AI "jailbreaks" and their potential to induce unintended agent behaviors, an open-source matchmaking AI application using @cohere Command R+, and the launch of Prompt Mixer, a new IDE for creating and evaluating prompts, available at www.promptmixer.dev.


tinygrad (George Hotz) Discord


DiscoResearch Discord


LangChain AI Discord

LangChain's Endpoint Elusiveness: Engineers sought guidance on locating their LangChain endpoint, a key aspect of engaging with its capabilities, along with observations of inconsistent firefunction latencies across various devices.

Pirate-Speak Swagger Lost at Sea: A lone message washed ashore in the #langchain-templates channel in quest of the elusive FastAPI route code for pirate-speak, lacking further engagement or treasure maps to its whereabouts.

Community Creations Cruising the High Seas: Innovators hoisted their colors high, presenting diverse projects like Trip-Planner Bot, LLM Scraper, and AllMind AI. Resources ranged from GitHub repositories for bots and scrapers to soliciting broadsides (support) on Product Hunt for AI stock analysts.

Deciphering the Query Scrolls: An AI sage shed light on the process of refining natural language queries into structured ones using Self-querying retrievers, documenting their wisdom in Rental Apartment Search with LangChain Self-Querying Retriever.
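For landlubbers wanting the shape of the spell, a compact self-querying retriever sketch follows (illustrative, not lifted from the linked post; the metadata fields are our own):

```python
# Illustrative self-query sketch: the LLM rewrites a natural-language
# request into a structured filter over metadata fields we declare.
from langchain.chains.query_constructor.base import AttributeInfo
from langchain.retrievers.self_query.base import SelfQueryRetriever
from langchain_community.vectorstores import Chroma
from langchain_openai import ChatOpenAI, OpenAIEmbeddings

fields = [
    AttributeInfo(name="bedrooms", type="integer", description="Number of bedrooms"),
    AttributeInfo(name="rent", type="integer", description="Monthly rent in USD"),
]
vectorstore = Chroma(embedding_function=OpenAIEmbeddings())
retriever = SelfQueryRetriever.from_llm(
    ChatOpenAI(temperature=0),    # query-constructing LLM
    vectorstore,
    "Rental apartment listings",  # what the documents describe
    fields,
)
docs = retriever.invoke("a 2-bedroom under $2000")
```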

Knowledge Graph Armada Upgrade: WhyHow.AI charted a course toward enriched knowledge graphs with upgraded SDKs, beckoning brave pioneers to join the Beta via a Medium article and add wind to the sails of schema-controlled automatons.


Mozilla AI Discord


Interconnects (Nathan Lambert) Discord


LLM Perf Enthusiasts AI Discord


Skunkworks AI Discord


Datasette - LLM (@SimonW) Discord

Blueprint AI Know-How Wanted: An engineer has expressed interest in AI models to analyze blueprints for ductwork in PDF plans, indicating a practical use-case for image recognition within construction.

AI Previews Before Building: The engineering community discussed the emergence of AI as a preflight check in architecture firms to spot issues and code violations before building, though it has yet to permeate the blueprint design process.

Llama 3 Lands on Laptops: SimonW has updated the llm-gpt4all plugin to support Llama 3 8B Instruct on systems with just 8GB of RAM, a boon for users with devices like the M2 MacBook Pro.

Plugin Ready for Install: Version 0.4 of the llm-gpt4all plugin is now available, enabling the interaction with new models like Llama 3 8B Instruct, as detailed in the latest GitHub release.
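For those who prefer the Python API over the CLI, usage looks roughly like this (the model id is an assumption; `llm models` lists the exact names the plugin registers):

```python
# Hedged sketch via llm's Python API; run `llm models` to confirm the
# exact model id that llm-gpt4all 0.4 registers.
import llm

model = llm.get_model("Meta-Llama-3-8B-Instruct")  # assumed plugin model id
response = model.prompt("Five creative names for a pet pelican")
print(response.text())
```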

Diving Deep with Llama 3: SimonW has provided a comprehensive look at the capabilities of Llama 3, characterized as the leading openly licensed model, via a detailed blog post.


Alignment Lab AI Discord


The AI21 Labs (Jamba) Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.


PART 2: Detailed by-Channel summaries and links

Unsloth AI (Daniel Han) ▷ #general (1039 messages🔥🔥🔥):

Links mentioned:


Unsloth AI (Daniel Han) ▷ #announcements (1 messages):

Link mentioned: Google Colaboratory: no description found


Unsloth AI (Daniel Han) ▷ #random (99 messages🔥🔥):

Links mentioned:


Unsloth AI (Daniel Han) ▷ #help (823 messages🔥🔥🔥):

Links mentioned:


Unsloth AI (Daniel Han) ▷ #showcase (54 messages🔥):

Links mentioned:


Unsloth AI (Daniel Han) ▷ #suggestions (67 messages🔥🔥):

Links mentioned:


Perplexity AI ▷ #general (1038 messages🔥🔥🔥):

Links mentioned:


Perplexity AI ▷ #sharing (29 messages🔥):

Links mentioned:


Perplexity AI ▷ #pplx-api (4 messages):


Nous Research AI ▷ #ctx-length-research (7 messages):


Nous Research AI ▷ #off-topic (12 messages🔥):

Links mentioned:


Nous Research AI ▷ #interesting-links (2 messages):

Link mentioned: GitHub - google-deepmind/penzai: A JAX research toolkit for building, editing, and visualizing neural networks.: A JAX research toolkit for building, editing, and visualizing neural networks. - google-deepmind/penzai


Nous Research AI ▷ #announcements (2 messages):

Link mentioned: world_sim: no description found


Nous Research AI ▷ #general (594 messages🔥🔥🔥):

Links mentioned:


Nous Research AI ▷ #ask-about-llms (56 messages🔥🔥):

Links mentioned:


Nous Research AI ▷ #project-obsidian (7 messages):

Links mentioned:


Nous Research AI ▷ #rag-dataset (61 messages🔥🔥):

Links mentioned:


Nous Research AI ▷ #world-sim (660 messages🔥🔥🔥):

Links mentioned:


LM Studio ▷ #💬-general (722 messages🔥🔥🔥):

Links mentioned:


LM Studio ▷ #🤖-models-discussion-chat (358 messages🔥🔥):

Links mentioned:


LM Studio ▷ #announcements (1 messages):

Link mentioned: Tweet from LM Studio (@LMStudioAI): Model search / download within LM Studio may be impacted by this Hugging Face downtime. Stay tuned for updates ↘️ Quoting Hugging Face Status (@hf_status) We're experiencing some downtime on h...


LM Studio ▷ #🧠-feedback (18 messages🔥):


LM Studio ▷ #📝-prompts-discussion-chat (1 messages):


LM Studio ▷ #🎛-hardware-discussion (34 messages🔥):

Links mentioned:


LM Studio ▷ #🧪-beta-releases-chat (24 messages🔥):


LM Studio ▷ #autogen (20 messages🔥):


LM Studio ▷ #rivet (1 messages):


LM Studio ▷ #memgpt (1 messages):


LM Studio ▷ #avx-beta (4 messages):


LM Studio ▷ #amd-rocm-tech-preview (73 messages🔥🔥):

Links mentioned:


LM Studio ▷ #model-announcements (1 messages):

Link mentioned: lmstudio-community/Meta-Llama-3-70B-Instruct-GGUF · Hugging Face: no description found


Stability.ai (Stable Diffusion) ▷ #general-chat (1003 messages🔥🔥🔥):

Links mentioned:


CUDA MODE ▷ #general (24 messages🔥):

Links mentioned:


CUDA MODE ▷ #triton (34 messages🔥):

Links mentioned:


CUDA MODE ▷ #cuda (9 messages🔥):


CUDA MODE ▷ #torch (3 messages):

Link mentioned: GitHub - openai/triton: Development repository for the Triton language and compiler: Development repository for the Triton language and compiler - openai/triton


CUDA MODE ▷ #announcements (1 messages):


CUDA MODE ▷ #algorithms (1 messages):

andreaskoepf: https://x.com/AliHassaniJr/status/1766108184630943832


CUDA MODE ▷ #beginner (25 messages🔥):

Link mentioned: Join the PMPP UI lectures timezones Discord Server!: Check out the PMPP UI lectures timezones community on Discord - hang out with 28 other members and enjoy free voice and text chat.


CUDA MODE ▷ #pmpp-book (2 messages):


CUDA MODE ▷ #youtube-recordings (1 messages):

.bexboy: I suppose that this one session will be uploaded too?


CUDA MODE ▷ #jax (1 messages):

Link mentioned: equinox/equinox/internal/_loop/common.py at main · patrick-kidger/equinox: Elegant easy-to-use neural networks + scientific computing in JAX. https://docs.kidger.site/equinox/ - patrick-kidger/equinox


CUDA MODE ▷ #off-topic (4 messages):


CUDA MODE ▷ #triton-puzzles (1 messages):

stygiansonic: You can also use something like this for relu: z = tl.where(z > 0, z, 0)


CUDA MODE ▷ #hqq (12 messages🔥):

Links mentioned:


CUDA MODE ▷ #llmdotc (615 messages🔥🔥🔥):

Links mentioned:


CUDA MODE ▷ #massively-parallel-crew (23 messages🔥):

Link mentioned: lecture-15.mov: no description found


OpenAccess AI Collective (axolotl) ▷ #general (653 messages🔥🔥🔥):

Links mentioned:


OpenAccess AI Collective (axolotl) ▷ #axolotl-dev (16 messages🔥):

Links mentioned:


OpenAccess AI Collective (axolotl) ▷ #general-help (22 messages🔥):


OpenAccess AI Collective (axolotl) ▷ #runpod-help (37 messages🔥):


OpenAccess AI Collective (axolotl) ▷ #axolotl-phorm-bot (22 messages🔥):

Links mentioned:


Eleuther ▷ #general (326 messages🔥🔥):

Links mentioned:


Eleuther ▷ #research (293 messages🔥🔥):

Debate on "Megalodon" Architecture's Superiority: Discussions involved considerations about Megalodon, a new architecture from Meta boasting efficiency with long contexts, which was noted to outperform Llama-2 in controlled tests. Skepticism remains regarding how it compares to other hybrid attention mechanisms and its potential broad acceptance.

Exploring Task Vectors for Model Steering: A method called task vectors is proposed for steering the behavior of a pre-trained model, allowing modification through arithmetic operations like negation and addition. This could enable the addition of specialized knowledge to models like Llama3 without direct fine-tuning (per https://arxiv.org/abs/2212.04089); see the sketch after this list.

New Benchmark for RAG Models Proposed: Stella Athena shared an idea for a benchmark targeting Retrieval-Augmented Generation (RAG) models, where questions require synthesizing information from multiple documents. The challenge is significant due to potential dataset contamination when choosing sources present in common training collections.

Attention Mechanism Approximation for Inference: Carson Poole's query about approximating attention mechanisms to compress token length during inference sparked references to several papers (e.g., https://arxiv.org/abs/2401.03462, https://arxiv.org/abs/2401.06104) that discuss related concepts like Activation Beacon, TOVA, and dynamic FLOPs allocation.

Potential and Limitations of Transformer Context Extensions: A discussion emerged about the feasibility of extending the context length for transformers, with references to Gemini Pro 1.5's context length and the challenges of quadratic compute scaling, highlighting that enormous context lengths (e.g., 10 million tokens) likely indicate an architecture beyond simple context-length fine-tuning.
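The task-vector arithmetic above is simple enough to show in miniature; this sketch (ours, not the paper's code) assumes two state dicts with identical keys and shapes:

```python
# Task-vector arithmetic per arXiv:2212.04089, in miniature: a sketch,
# assuming base and fine-tuned state_dicts share identical keys/shapes.
def apply_task_vector(base, finetuned, alpha=1.0):
    """new = base + alpha * (finetuned - base); negative alpha negates the task."""
    return {k: base[k] + alpha * (finetuned[k] - base[k]) for k in base}
```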

Links mentioned:


Eleuther ▷ #scaling-laws (47 messages🔥):

Link mentioned: Tweet from Kyo (@kyo_takano): You ARE rounding the original estimate lol Try inspecting the TeX source like you did PDF figures. To be more specific, you rounded: - E from exp(0.5267228) to 1.69 - A from exp(6.0073404) to 406.4 ...


Eleuther ▷ #interpretability-general (2 messages):

Link mentioned: [Summary] Progress Update #1 from the GDM Mech Interp Team — AI Alignment Forum: Introduction This is a progress update from the Google DeepMind mechanistic interpretability team, inspired by the Anthropic team’s excellent monthly…


Eleuther ▷ #lm-thunderdome (5 messages):

Link mentioned: MMLU - Alternative Prompts: MMLU (Prompt Variation) Example Input Prompt Input Prompt,Format 01,{{question.strip}} 02,Q: {{question.strip}}\nA: 03,Question: {{question.strip}}\nAnswer: Llama-2-7b-hf,Mistral-7B-v0.1,falcon-7b,py...


Modular (Mojo 🔥) ▷ #general (77 messages🔥🔥):

Links mentioned:


Modular (Mojo 🔥) ▷ #💬︱twitter (7 messages):


Modular (Mojo 🔥) ▷ #ai (1 messages):

Link mentioned: The Rise of AI: (Turn on the Closed Caption) Join us on a journey through the rapid evolution of Artificial Intelligence, from the emergence...


Modular (Mojo 🔥) ▷ #🔥mojo (279 messages🔥🔥):

Links mentioned:


Modular (Mojo 🔥) ▷ #community-projects (19 messages🔥):

Links mentioned:


Modular (Mojo 🔥) ▷ #performance-and-benchmarks (5 messages):


Modular (Mojo 🔥) ▷ #🏎engine (24 messages🔥):


Modular (Mojo 🔥) ▷ #nightly (37 messages🔥):

Links mentioned:


HuggingFace ▷ #general (324 messages🔥🔥):

Links mentioned:


HuggingFace ▷ #today-im-learning (8 messages🔥):

Links mentioned:


HuggingFace ▷ #cool-finds (11 messages🔥):

Links mentioned:


HuggingFace ▷ #i-made-this (27 messages🔥):

Links mentioned:


HuggingFace ▷ #computer-vision (7 messages):

Links mentioned:


HuggingFace ▷ #NLP (11 messages🔥):

Link mentioned: GitHub - gnp/minbpe-rs: Port of Andrej Karpathy's minbpe to Rust: Port of Andrej Karpathy's minbpe to Rust. Contribute to gnp/minbpe-rs development by creating an account on GitHub.


HuggingFace ▷ #diffusion-discussions (4 messages):


OpenRouter (Alex Atallah) ▷ #announcements (5 messages):

Links mentioned:


OpenRouter (Alex Atallah) ▷ #app-showcase (5 messages):

Link mentioned: no title found: no description found


OpenRouter (Alex Atallah) ▷ #general (353 messages🔥🔥):

Links mentioned:



Latent Space ▷ #ai-general-chat (201 messages🔥🔥):

Links mentioned:


Latent Space ▷ #ai-announcements (3 messages):


Latent Space ▷ #llm-paper-club-west (66 messages🔥🔥):

Links mentioned:


Latent Space ▷ #ai-in-action-club (71 messages🔥🔥):

Links mentioned:


LAION ▷ #general (247 messages🔥🔥):

Links mentioned:


LAION ▷ #research (72 messages🔥🔥):

Links mentioned:


LAION ▷ #learning-ml (6 messages):


OpenAI ▷ #ai-discussions (193 messages🔥🔥):

Links mentioned:


OpenAI ▷ #gpt-4-discussions (32 messages🔥):


OpenAI ▷ #prompt-engineering (29 messages🔥):


OpenAI ▷ #api-discussions (29 messages🔥):


LlamaIndex ▷ #blog (6 messages):


LlamaIndex ▷ #general (205 messages🔥🔥):

Links mentioned:


LlamaIndex ▷ #ai-discussion (4 messages):

Links mentioned:


OpenInterpreter ▷ #general (75 messages🔥🔥):

Links mentioned:


OpenInterpreter ▷ #O1 (18 messages🔥):


Cohere ▷ #general (64 messages🔥🔥):

Links mentioned:


Cohere ▷ #project-sharing (5 messages):

Links mentioned:


Cohere ▷ #collab-opps (1 messages):


tinygrad (George Hotz) ▷ #general (21 messages🔥):


tinygrad (George Hotz) ▷ #learn-tinygrad (38 messages🔥):

Links mentioned:


DiscoResearch ▷ #mixtral_implementation (1 messages):


DiscoResearch ▷ #general (9 messages🔥):


DiscoResearch ▷ #discolm_german (49 messages🔥):

Links mentioned:


LangChain AI ▷ #general (47 messages🔥):

Links mentioned:


LangChain AI ▷ #langchain-templates (1 messages):


LangChain AI ▷ #share-your-work (7 messages):

Links mentioned:


LangChain AI ▷ #tutorials (1 messages):

Link mentioned: Building a Rental Apartment Search with Langchain's Self-Querying Retriever: In this blog post, we delve into the capabilities of Langchain's self-querying retriever, a powerful tool for bridging the gap between natural language and structured data retrieval. This retriev...


Mozilla AI ▷ #llamafile (41 messages🔥):

Links mentioned:


Interconnects (Nathan Lambert) ▷ #news (10 messages🔥):

Link mentioned: Tweet from Dylan Patel (@dylan522p): LLAMA 3 8B was amazing but will be overshadowed Phi-3 mini 4b, small 7b, medium 14b this week, and the benchmarks are fucking insane Synthetic data pipelines are massive improvements over internet dat...


Interconnects (Nathan Lambert) ▷ #ml-questions (9 messages🔥):


Interconnects (Nathan Lambert) ▷ #random (11 messages🔥):

Link mentioned: no title found: no description found


Interconnects (Nathan Lambert) ▷ #rlhf (5 messages):

Link mentioned: From $r$ to $Q^*$: Your Language Model is Secretly a Q-Function: Reinforcement Learning From Human Feedback (RLHF) has been a critical to the success of the latest generation of generative AI models. In response to the complex nature of the classical RLHF pipeline,...


Interconnects (Nathan Lambert) ▷ #sp2024-history-of-open-alignment (3 messages):


LLM Perf Enthusiasts AI ▷ #general (7 messages):

Link mentioned: Falling Falling Down Stairs GIF - Falling Falling Down Stairs Stairs - Discover & Share GIFs: Click to view the GIF


LLM Perf Enthusiasts AI ▷ #speed (3 messages):


Skunkworks AI ▷ #finetuning (6 messages):

Links mentioned:


Datasette - LLM (@SimonW) ▷ #ai (2 messages):


Datasette - LLM (@SimonW) ▷ #llm (3 messages):

Links mentioned:


Alignment Lab AI ▷ #ai-and-ml-discussion (1 messages):

Link mentioned: Learn How LLAMA 3 Works Now: The Complete Beginner’s Guide: Dive into the fascinating world of the LLAMA 3 model, a cutting-edge transformer architecture that is setting new standards in machine learning. This guide i...