Frozen AI News archive

Evals-based AI Engineering

**Hamel Husain** emphasizes the importance of comprehensive evals in AI product development, highlighting evaluation, debugging, and behavior change as the key iterative steps. **OpenAI** released a Voice Engine demo showcasing advanced voice cloning from 15-second samples, raising safety concerns. Reddit discussions introduced new models like **Jamba** (hybrid Transformer-SSM with MoE), **Bamboo** (a 7B LLM with high sparsity based on Mistral), **Qwen1.5-MoE** (efficient parameter activation), and **Grok 1.5** (128k context length, reportedly surpassing GPT-4 in code generation). Advances in quantization include **1-bit Llama2-7B** models outperforming full precision and the **QLLM** quantization toolbox supporting GPTQ/AWQ/HQQ methods.


Evals are the "eat your vegetables" of AI engineering: everyone knows they should do more of them.


Hamel Husain has yet another banger in his blog series: Your AI Product Needs Evals:

Like software engineering, success with AI hinges on how fast you can iterate. You must have processes and tools for:

  1. Evaluating quality (ex: tests).
  2. Debugging issues (ex: logging & inspecting data).
  3. Changing the behavior of the system (prompt engineering, fine-tuning, writing code).

Many people focus exclusively on #3 above, which prevents them from improving their LLM products beyond a demo. Doing all three activities well creates a virtuous cycle differentiating great from mediocre AI products (see the diagram below for a visualization of this cycle).
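To make step 1 concrete, here is a minimal, hypothetical sketch of an eval harness in Python: plain assertion-style checks over LLM outputs, with failures logged so they feed step 2 (debugging). The `generate` stub and the test cases are placeholders for your own pipeline, not Hamel's implementation.

```python
# Minimal eval-harness sketch (hypothetical; adapt to your own pipeline).

def generate(prompt: str) -> str:
    # Replace with your real LLM call; this echo stub keeps the harness runnable.
    return f"[stub output for: {prompt}]"

TEST_CASES = [
    # (prompt, predicate over the output, description)
    ("Summarize in one sentence: ...", lambda out: len(out) < 300, "summary is concise"),
    ("Extract the date: 'Meeting on 2024-03-29'", lambda out: "2024-03-29" in out, "date extracted"),
]

def run_evals():
    failures = []
    for prompt, check, desc in TEST_CASES:
        output = generate(prompt)
        if not check(output):
            failures.append((desc, prompt, output))  # keep these for step 2: debugging
    print(f"{len(TEST_CASES) - len(failures)}/{len(TEST_CASES)} checks passed")
    return failures

if __name__ == "__main__":
    run_evals()
```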

We are guilty of this at AINews: our eval loop is slow, and product improvement has been correspondingly slower than we would like. Hamel proposes a mental model that centers on evals:

[diagram: Hamel's evals-centered mental model]

Excerpts we liked:

The post has a lot of practical advice on how to make these "sensible things" easy, like using spreadsheets for hand labeling or hooking up LangSmith (which doesn't require LangChain).
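For the LangSmith route, here is a hedged sketch of what "hooking it up without LangChain" might look like, using the standalone langsmith SDK's @traceable decorator. It assumes a LangSmith API key is set in the environment (LANGCHAIN_API_KEY); `call_model` is a placeholder for your own client code.

```python
from langsmith import traceable  # pip install langsmith; no LangChain required

@traceable(run_type="llm", name="my-llm-call")
def call_model(prompt: str) -> str:
    # Your actual OpenAI/Anthropic/local call goes here; the decorator records
    # inputs and outputs as a run you can inspect and hand-label later.
    return "stub response"

if __name__ == "__main__":
    call_model("What are evals?")
```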

[screenshot: excerpts from the post]


Obligatory AI Safety PSA: OpenAI today released some samples of their rumored Voice Engine, which takes a 15-second voice sample and successfully clones the voice across different domains and languages. It's a nice demo and great marketing for HeyGen, but more importantly they are trying to warn us that very, very good voice cloning from small samples is here. Take Noam's word for it (he is at OpenAI but not on the voice team):

[screenshot: Noam's tweet]

Alec Radford does not miss. We also enjoyed Dwarkesh's pod with Sholto and Trenton.




AI Reddit Recap

Across r/LocalLlama, r/machinelearning, r/openai, r/stablediffusion, and r/ArtificialInteligence. Comment crawling is still not implemented but coming soon.

New Models and Architectures:

Quantization and Optimization:

Stable Diffusion Enhancements:

Humor and Memes:


AI Twitter Recap

all recaps done by Claude 3 Opus, best of 4 runs. We are working on clustering and flow engineering with Haiku.

AI Models and Architectures

AI Alignment and Factuality

AI Applications and Demos

AI Community and Events


PART 0: Summary of Summaries of Summaries

1) New AI Model Releases and Architectures:

2) Open Source Collaboration and Community Projects:

3) Model Evaluation, Benchmarking and Datasets:

4) Local LLM Deployment and Hardware Optimization:


PART 1: High level Discord summaries

Stability.ai (Stable Diffusion) Discord


Perplexity AI Discord

DBRX Takes the Limelight: DBRX, Databricks' new heavyweight language model, steals the show at Perplexity Labs, outshining GPT-3.5 and running neck and neck with Gemini 1.0 Pro on math and coding tasks. You can test-run DBRX for free at Perplexity Labs, a gauntlet thrown down for AI connoisseurs.

Mark Your Calendars for the Copy AI and Perplexity Alliance: The integration of Copy AI's platform with Perplexity's APIs promises real-time market intelligence for go-to-market teams. Copy AI users get six months of Perplexity Pro, all spelled out in their co-authored blog post.

In Search of Perfection with Perplexity: Users are scratching their heads over the hit-or-miss performance of academic focus mode in Perplexity's search, puzzled by intermittent outages. Discussions centered on improvements to Pro Search and conflicting reports about how uploaded files are handled, with speculation that RAG or GPT-4-32k powers the file processing.

Tuning into Enhanced Scratchpad Tactics: The community exchanged notes on drawing the best out of Perplexity; one user gave a hands-on demo of prompting with <scratchpad> XML tags (see the sketch below), and space enthusiasts flung questions at the AI about Starship and astronautics. Users also threw in finance-flavored queries, probing Amazon's monetary moves and the FTX conundrum.
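A rough sketch of the <scratchpad> tactic against Perplexity's OpenAI-compatible API; the base URL and model id are assumptions based on their public docs at the time, so verify before relying on them.

```python
from openai import OpenAI

client = OpenAI(api_key="pplx-...", base_url="https://api.perplexity.ai")

prompt = (
    "Think step by step inside <scratchpad>...</scratchpad> tags, "
    "then give your final answer outside the tags.\n\n"
    "Roughly how much delta-v does Starship need to reach low Earth orbit?"
)

resp = client.chat.completions.create(
    model="sonar-medium-online",  # assumed model id; check current docs
    messages=[{"role": "user", "content": prompt}],
)
print(resp.choices[0].message.content)
```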

API Adventures and Misadventures: Queries abound regarding the Perplexity AI API's unpredictable behavior, with search results sometimes diverging between the API and the steadier web interface. Those thirsting for beta features, including coveted URL citations, can Apply for beta features, keeping API fanatics at the edge of their seats.


Unsloth AI (Daniel Han) Discord


LM Studio Discord


OpenAI Discord

Voice Cloning Sparks Heated Debate: Discussions emerged around OpenAI's Voice Engine, with some excited by its potential to generate natural-sounding speech from a 15-second audio sample, while others raised ethical concerns about the tech's misuse.

Confused Consumers and Missing Models: Confusion reigns among users regarding different versions of GPT-4 implemented in various applications, with contradictory reports about model stability and cutoff dates. Meanwhile, anticipation for GPT-5 is rife, yet no concrete information is available.

Encounters with Errant Equations: Users across multiple channels grappled with transferring LaTeX equations into Microsoft Word, proposing MathML as a potential solution. The intricacies of proper prompt structuring for specific AI tasks, like translations maintaining HTML tags, also took center stage.
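One hedged way to apply the MathML workaround is the third-party latex2mathml package (not an OpenAI or Microsoft tool): convert the LaTeX source to MathML, then paste the markup into Word's equation editor.

```python
import latex2mathml.converter  # pip install latex2mathml

latex = r"\int_0^\infty e^{-x^2}\,dx = \frac{\sqrt{\pi}}{2}"
mathml = latex2mathml.converter.convert(latex)
print(mathml)  # paste the resulting <math>...</math> markup into Word
```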

Meta-Prompting Under the Microscope: AI enthusiasts debated the merits of metaprompting over direct instructions, with experiences suggesting inconsistent results. Precise prompts were underscored as pivotal for optimized AI performance.

Roleplay Resistance in GPT: A peculiar behavior was noted with the gpt-4-0125-preview model regarding roleplay prompts, with the AI refusing to role-play when an example format was given, yet complying when the example was omitted. Users shared workarounds and tactics to guide the AI's responses.


Eleuther Discord

New Fine-Tuning Frontiers: LISA, a new fine-tuning technique, has outperformed LoRA and full-parameter training on instruction-following tasks, and can tune 7B-parameter models on a 24GB GPU. LISA's details and applications can be explored through the published paper and its code.

Chip Chat heats up: AI21 Labs revealed Jamba, a model fusing the Mamba architecture and Transformers, with a claim of 12B active parameters from a 52B total. Meanwhile, SambaNova introduced Samba-1, a Composition of Experts (CoE) model alleging reduced compute needs and higher performance, though transparency concerns persist. Details about Jamba can be found on their official release page, and scrutiny over Samba-1's performance is encouraged via SambaNova's blog.

Sensitive Data Safety Solutions Discussed: Techniques for safeguarding sensitive data during training, including SILO and differential privacy methods, were a topic of serious discussion. Interested researchers can examine the SILO paper and the differential privacy papers at https://arxiv.org/abs/1607.00133 and https://arxiv.org/abs/2110.05679 for more insights.
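As a concrete (and heavily simplified) illustration of the DP-SGD approach described in those papers, here is a sketch using the Opacus library on a toy model; all hyperparameters are illustrative only.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset
from opacus import PrivacyEngine  # pip install opacus

model = nn.Linear(10, 2)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
data = DataLoader(TensorDataset(torch.randn(64, 10), torch.randint(0, 2, (64,))), batch_size=8)

# DP-SGD: clip per-sample gradients, then add calibrated Gaussian noise.
model, optimizer, data = PrivacyEngine().make_private(
    module=model,
    optimizer=optimizer,
    data_loader=data,
    noise_multiplier=1.0,  # more noise -> stronger privacy, lower utility
    max_grad_norm=1.0,     # per-sample gradient clipping bound
)

loss_fn = nn.CrossEntropyLoss()
for x, y in data:
    optimizer.zero_grad()
    loss_fn(model(x), y).backward()
    optimizer.step()
```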

Discrepancy Detective Work in Model Weights: Discord members untangled differences in model weights between Transformer Lens (tl) and Hugging Face (hf). The debugging process involved using from_pretrained_no_processing to avoid Transformer Lens's preset weight modifications, as elucidated in this GitHub issue.
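A sketch of that debugging approach, assuming GPT-2 purely for illustration: load the checkpoint both ways and inspect corresponding parameters. Note that Transformer Lens splits the fused QKV matrix, so shapes differ even with processing disabled.

```python
from transformers import AutoModelForCausalLM
from transformer_lens import HookedTransformer

hf_model = AutoModelForCausalLM.from_pretrained("gpt2")
tl_model = HookedTransformer.from_pretrained_no_processing("gpt2")  # skip weight processing

hf_w = hf_model.transformer.h[0].attn.c_attn.weight  # fused QKV in HF's GPT-2
tl_q = tl_model.blocks[0].attn.W_Q                   # per-head Q in Transformer Lens
print(hf_w.shape, tl_q.shape)  # inspect and map between layouts before diffing values
```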

MMLU Optimization Achieved: Efficiency on MMLU tasks has been boosted by extracting the logprobs for multiple answer choices within a single forward call. A user reported memory allocation issues when loading the DBRX base model, resolved once a node misconfiguration was spotted. Further, a pull request aimed at improving context-based task handling in the lm-evaluation-harness awaits review and feedback after the CoLM deadline.
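The MMLU trick amounts to scoring all answer letters from one forward pass instead of one call per choice. A minimal sketch, with GPT-2 as a stand-in model:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2").eval()

prompt = "Question: ...\nChoices:\nA. ...\nB. ...\nC. ...\nD. ...\nAnswer:"
ids = tok(prompt, return_tensors="pt").input_ids
with torch.no_grad():
    logits = model(ids).logits[0, -1]  # next-token logits, one forward pass
logprobs = torch.log_softmax(logits, dim=-1)
choice_ids = [tok.encode(f" {c}")[0] for c in "ABCD"]  # " A".." D" are single tokens
scores = {c: logprobs[i].item() for c, i in zip("ABCD", choice_ids)}
print(max(scores, key=scores.get), scores)
```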


Nous Research AI Discord

A Peek at AI21's Transformer Hybrid: AI21 Labs has launched Jamba, a transformative SSM-Transformer model with a 256K context window and performance that challenges existing models, openly accessible under the Apache 2.0 license.

LLMs Gearing Up with MoE: The engineering community is charged up about microqwen, speculated to be a more compact version of Qwen, and the debut of Qwen1.5-MoE-A2.7B, a transformer-based MoE model that promises high performance with fewer active parameters.

LLM Training Woes and Wins: Engineers are troubleshooting issues with the Deepseek-coder-33B's full-parameter fine-tuning, exploring structured approaches for a large book dataset, and peeking at Hermes 2 Pro's multi-turn agentic loops. Meanwhile, they're diving into the significance of 'hyperstition' in expanding AI capacities and clarifying heuristic versus inference engines in LLMs.

RAG Pipelines and Data Structuring Strategies: To boost performance and efficiency in retrieval tasks, AI engineers are exploring structured XML with metadata and discussing RAG models. A mention of a ragas GitHub repository indicates ongoing enhancements to RAG systems.
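A small sketch of the structured-XML idea: wrap each retrieved chunk in tags with metadata so the model can ground and cite its answers. The tag names here are illustrative, not a fixed standard.

```python
from xml.sax.saxutils import escape, quoteattr

def to_xml_chunk(doc_id: str, title: str, text: str) -> str:
    # quoteattr handles quoting and escaping for attribute values
    return (
        f"<document id={quoteattr(doc_id)}>\n"
        f"  <title>{escape(title)}</title>\n"
        f"  <content>{escape(text)}</content>\n"
        f"</document>"
    )

chunks = [{"id": "42", "title": "Jamba release notes", "text": "..."}]
context = "\n".join(to_xml_chunk(c["id"], c["title"], c["text"]) for c in chunks)
prompt = f"<documents>\n{context}\n</documents>\n\nAnswer using only the documents above."
print(prompt)
```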

Worldsim, LaTeX, and AI's Cognitive Boundaries: Tips and resources, like the gist for LaTeX papers, are being exchanged on the Worldsim project. Engineers are considering the potential of AI to delve into alternate history scenarios, while carefully differentiating between large language model use-cases.

Across these threads, engineers are navigating the challenges of an evolving AI landscape with a focus on efficiency, structure, and the constant sharing of knowledge and resources.


Modular (Mojo 🔥) Discord

Mojo Gets Juiced with Open Source and Performance Tweaks: Modular has cracked open the Mojo standard library to the open-source community under the Apache 2 license, showcasing this in the MAX 24.2 release. Enhancements include implementations for generalized complex types, workshop sessions on NVIDIA GPU support, and a focus on stabilizing support for MLIR with the syntax set to evolve.

Hype Train Gathers Steam for Modular's Upcoming Reveal: Modular is stoking excitement through a series of cryptic tweets, signaling a new announcement with emojis and a ticking clock. Community members are keeping a keen eye on the official Twitter handle for details on the enigmatic event.

MAX Engine's Leaps and Bounds: With the MAX Engine 24.2 update, Modular introduces support for TorchScript models with dynamic input shapes and other upgrades, as detailed in their changelog. A vivid discussion unfolded around performance benchmarks using the BERT model and GLUE dataset, showcasing the advancements over static shapes.

Ecosystem Flourishing with Community Contributions and Learning: Community projects are syncing up with the latest Mojo version 24.2, with an expressed interest in deeper contributions through understanding MLIR dialects. Modular acknowledges this enthusiasm and plans to divulge more on internal dialects over time, adopting a progressive-disclosure approach to the complex MLIR syntax.

Teasers and Livestreams Galore: Modular is shedding light on their recent developments with a livestream on YouTube covering the open sourcing of Mojo's stdlib and MAX Engine support, whereas tantalizing teasers in the form of tweets here sustain high anticipation for impending announcements.


HuggingFace Discord

Quantum Leaps in Hugging Face Contributions: New advancements have been made in AI research and applications: HyperGraph Representation Learning provides novel insights into data structures, Perturbed-Attention Guidance (PAG) boosts diffusion model performance, and the Vision Transformer model is adapted for medical imaging applications. The HyperGraph paper is discussed on Hugging Face, while PAG's project details are on its project page and the Vision Transformer details on Hugging Face space.

Colab and Coding Mettle: Engineers have been sharing tools and tips ranging from the use of Colab Pro to run large language models to the HF professional coder assistant for improving coding. Another shared their experience with AutoTrain, posting a link to their model.

Model Generation Woes and Image Classifier Queries: Some are facing challenges with models generating infinite text, prompting suggestions to use a repetition penalty and stopping criteria (see the sketch below). Others are seeking advice on fine-tuning a zero-shot image classifier, sharing issues and soliciting expertise in channels like #NLP and #computer-vision.
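A hedged sketch of both fixes in Hugging Face transformers: a repetition_penalty passed to generate(), plus a custom StoppingCriteria that halts on a stop string (GPT-2 used as a stand-in model).

```python
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          StoppingCriteria, StoppingCriteriaList)

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

class StopOnString(StoppingCriteria):
    def __init__(self, stop: str):
        self.stop = stop
    def __call__(self, input_ids, scores, **kwargs) -> bool:
        return self.stop in tok.decode(input_ids[0])  # halt once the stop string appears

ids = tok("The quick brown fox", return_tensors="pt").input_ids
out = model.generate(
    ids,
    max_new_tokens=100,
    repetition_penalty=1.2,  # discourage degenerate loops
    stopping_criteria=StoppingCriteriaList([StopOnString("\n\n")]),
)
print(tok.decode(out[0]))
```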

Community Learning Announcements: The reading-group channel's next meeting has a confirmed date, strengthening community collaboration. Interested parties can find the Discord invite link to participate in the group discussion.

Real-Time Diffusion Innovations: Marigold's depth estimation pipeline for diffusion models now includes an LCM function, and an improvement allows real-time image transitions at 30fps at 800x800 resolution. Questions on the labmlai diffusion repository indicate ongoing interest in optimizing these models.


LAION Discord


LlamaIndex Discord


OpenInterpreter Discord

International Shipping Hacks for O1 Light: Engineers explored workarounds for international delivery of the O1 Light, including buying through US contacts. It was noted that user-built O1 devices are functional globally.

Local LLMs Cut API Expenses: There's active engagement around using Open Interpreter in offline mode to eliminate API costs. Contributions for running it with local models such as LM Studio were detailed, including commands like `interpreter --model local --api_base http://localhost:1234/v1 --api_key dummykey`, and can be referenced in the official documentation.

Calls for Collaboration on Semantic Search: A call to action was issued for improving local semantic search within the OpenInterpreter/aifs GitHub repository. This highlights a community-driven approach to enhancing the project.

Integrating O1 Light with Arduino's Extended Family: Technical discussions looked at merging O1 Light with Arduino hardware for greater utility. While ESP32 is standard, there's eagerness to experiment with alternatives like Elegoo boards.

O1 Dev Environment Installation Windows Woes: Members reported and discussed issues with installing the 01 OS on Windows systems. A GitHub pull request aims to provide solutions and streamline the setup process for Windows-based developers.


CUDA MODE Discord

Compilers Confront CUDA: While the debate rages on the merits of using compiler technology like PyTorch/Triton versus manual CUDA code creation, members also sought guidance on CUDA courses, including recommendations for the CUDA mode on GitHub and Udacity's Intro to Parallel Programming available on YouTube. A community-led CUDA course by Cohere titled Beginners in Research-Driven Studies (BIRDS) was announced, starting April 5th, advertised on Twitter.

Windows Walks with WSL: Several members provided ease-of-use solutions for running CUDA on Windows, emphasizing Windows Subsystem for Linux (WSL), particularly WSL2, supported by a helpful Microsoft guide.

Circling the Ring-Attention Revolution: In the #[ring-attention] channel, a misalignment of fine-tuning experiments with ring-attention goals halted progress, but insights on resolving modeling_llama.py loss issues spearheaded advancements. The successful training of tinyllama models with extended context lengths up to 100k on substantial A40 VRAM was a hot topic, alongside a Reddit discussion on the hefty VRAM needs for Llama 7B models with QLoRA and LoRA.

Triton Tangle Untangled: The #[triton-puzzles] channel was abuzz with a sync issue in triton-viz linked to a specific pull request, and an official fix was provided, though some still faced installation woes. The use of Triton on Windows was also clarified, pointing to alternative environments like Google Colab for running Triton-based computations.

Zhihu Zeal Over Triton: A member successfully pierced the language barrier on the Chinese platform Zhihu to unearth a trove of Triton materials, stimulating a wish for a glossary of technical terms to aid in navigating non-English content.


OpenRouter (Alex Atallah) Discord

Model Mania: OpenRouter Introduces App Rankings: OpenRouter launched an App Rankings feature for models, exemplified by Claude 3 Opus, to showcase utilization and popularity based on public app usage and tokens processed.

Databricks and Gemini Pro Stir Excitement, But Bugs Buzz: Engineers shared enthusiasm for Databricks' DBRX and Gemini Pro 1.5, although 429 errors suggested rate-limit challenges, and downtime accompanied by 502 and 524 errors signaled room for reliability improvements in model availability.

Claude's Capabilities and API Discussed: The community clarified that Claude in OpenRouter doesn't support prefill features and explored error fixing for Claude 2.1. A side conversation praised ClaudeAI via OpenRouter for better handling roleplay and sensitive content with fewer false positives, noting standardized access and cost parity with official ClaudeAI API.

APIs and Clients Get a Tune-Up: OpenRouter has simplified its API to /api/v1/completions and dropped Groq for Nitro models due to rate limitations, alongside improved support for the OpenAI API client (see the sketch below).
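A sketch of calling OpenRouter through its OpenAI-compatible endpoint with the OpenAI Python client; the model slug follows OpenRouter's public naming conventions, but verify against current docs before relying on it.

```python
from openai import OpenAI

client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key="sk-or-...",  # your OpenRouter key
)
resp = client.chat.completions.create(
    model="anthropic/claude-3-opus",  # slug as listed on the rankings page
    messages=[{"role": "user", "content": "Hello from OpenRouter"}],
)
print(resp.choices[0].message.content)
```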

Easing Crypto Payments for OpenRouter Users: OpenRouter is cutting the gas costs of cryptocurrency payments by using Base, an Ethereum L2 chain, aiming for a more economical user experience.


AI21 Labs (Jamba) Discord


tinygrad (George Hotz) Discord


Latent Space Discord


OpenAccess AI Collective (axolotl) Discord

Jamba Sets the Bar High: AI21 Labs has introduced a new model, Jamba, featuring a 256k token context window, optimized for performance with 12 billion parameters active. Their accelerated progress is underlined by the quick training time, with knowledge cutoff on March 5, 2024, and details can be found in their blog post.

Pushing the Boundaries of Optimization: Discussions have highlighted the effectiveness of bf16 precision in torchTune leading to substantial memory savings over fp32, with these optimizations being applied to SGD and soon to the Adam optimizer. Skepticism remains over whether Axolotl provides the same level of training control as torchTune, particularly in the context of memory optimization.
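The bf16 saving is easy to verify on a toy module: bfloat16 stores 2 bytes per parameter versus 4 for fp32, so parameter memory roughly halves. This sketch measures a plain nn.Linear, not torchTune itself.

```python
import torch
import torch.nn as nn

def param_bytes(m: nn.Module) -> int:
    return sum(p.numel() * p.element_size() for p in m.parameters())

model = nn.Linear(4096, 4096)
print(f"fp32: {param_bytes(model) / 1e6:.1f} MB")   # ~67.1 MB
model = model.to(torch.bfloat16)
print(f"bf16: {param_bytes(model) / 1e6:.1f} MB")   # ~33.6 MB, half of fp32
```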

The Cost of Cutting-Edge: Conversations around GB200-based server prices revealed a steep cost of US$2-3 million each, prompting the community to consider alternative hardware.

Size Matters for Datasets: The hunt for long-context datasets prompted sharing of resources including one from Hugging Face's collections and the MLDR dataset on Hugging Face, which cater to models requiring extensive sequence training.
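For reference, a hedged sketch of pulling MLDR with the datasets library; the repo id and config name follow the public dataset card, so check the card if loading fails.

```python
from datasets import load_dataset

# English config of the multilingual long-document retrieval dataset
mldr = load_dataset("Shitao/MLDR", "en", split="train", trust_remote_code=True)
print(len(mldr), mldr[0].keys())
```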

Fine-Tuning Finesse and Repetition Debate: The community has been engaging in detailed discussions about model training, from sharing prompt-formatting strategies to debating the utility of dataset repetition, referencing a paper on data ordering that supports repetition. New fine-tuning approaches like GaLore for larger models are also being experimented with, despite some memory challenges.


LangChain AI Discord

OpenGPTs: DIY Food Ordering System: A resourceful engineer integrated a custom food ordering API with OpenGPTs, demonstrating the adaptability and potential of LangChain's open-source platform in a demonstration video. They encouraged peer reviews to refine the innovation.

A Smarter SQL AI Chatbot: Members explored methods to enable an SQL AI Chatbot to remember previous interactions, enhancing the bot’s context-retaining abilities for more effective and coherent dialogues.

Gearing Up for Product Recommendations: Engineers discussed the development of a bot that would suggest products using natural language queries, considering the use of vector databases for semantic search or employing an SQL agent to parse user intents like "planning to own a pet."
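A minimal sketch of the vector-search option using sentence-transformers with an in-memory index (any vector database could replace the matrix); the product names and model choice are illustrative.

```python
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")
products = ["stainless steel dog bowl", "ergonomic office chair", "cat scratching post"]
product_vecs = model.encode(products, normalize_embeddings=True)

query_vec = model.encode("planning to own a pet", normalize_embeddings=True)
scores = product_vecs @ query_vec  # cosine similarity, since vectors are normalized
for i in np.argsort(-scores):
    print(f"{scores[i]:.3f}  {products[i]}")
```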

Upgrade Your Code Reviews With AI: A new AI pipeline builder designed to automate code review tasks including validation and security checks was introduced, coupled with a demo and a product link, poised to streamline the code review process.

GalaxyAI Throws Down the Gauntlet: GalaxyAI is providing free access to elite AI models such as GPT-4 and Gemini-PRO, presented as an easy-to-adopt option for projects via their OpenAI-compatible API service.

Nurturing Engineer Dialogues: The creation of the <#1222928565117517985> channel fosters concentrated discussion on OpenGPTs and its growth, as evidenced by its GitHub repository.


Interconnects (Nathan Lambert) Discord


DiscoResearch Discord

AI21 Labs Cooks Up Jamba: AI21 Labs has launched Jamba, a model blending Structured State Space models with Transformer architecture, promising high performance. Check out Jamba through its Hugging Face deployment, and read about its groundbreaking approach on the AI21 website.

Translation Titans Tussle: Members are gearing up for a translation battle among DiscoLM, Occiglot, Mixtral, GPT-4, DeepL, and Azure Translate, using the first 100 lines from a dataset like Capybara to compare performance.

Course to Conquer LLMs: A GitHub repository offering a course on Large Language Models, complete with roadmaps and Colab notebooks, was shared.

Token Insertion Tangle Untangled: A debugging success was shared regarding unexpected token insertions believed to be caused by either quantization or the inference engine; providing an added_tokens.json resolved the anomaly.

Training Data Transparency Tremors: The community has asked for more information on the training data used for a certain model, with specific interest in the definition and range of "English data" as stated in the model card or affiliated blog post.


Skunkworks AI Discord


PART 2: Detailed by-Channel summaries and links

Stability.ai (Stable Diffusion) ▷ #general-chat (936 messages🔥🔥🔥):

Links mentioned:


Perplexity AI ▷ #announcements (2 messages):

Link mentioned: Copy.ai + Perplexity: Purpose-Built Partners for GTM Teams | Copy.ai: Learn more about how Perplexity and Copy.ai's recent partnership will fuel your GTM efforts!


Perplexity AI ▷ #general (728 messages🔥🔥🔥):

Links mentioned:


Perplexity AI ▷ #sharing (22 messages🔥):


Perplexity AI ▷ #pplx-api (8 messages🔥):

Link mentioned: pplx-api form: Turn data collection into an experience with Typeform. Create beautiful online forms, surveys, quizzes, and so much more. Try it for FREE.


Unsloth AI (Daniel Han) ▷ #general (351 messages🔥🔥):

Links mentioned:


Unsloth AI (Daniel Han) ▷ #random (34 messages🔥):

Links mentioned:


Unsloth AI (Daniel Han) ▷ #help (201 messages🔥🔥):

Links mentioned:


Unsloth AI (Daniel Han) ▷ #showcase (6 messages):

Links mentioned:


Unsloth AI (Daniel Han) ▷ #suggestions (1 messages):

starsupernova: ooo very cool!


LM Studio ▷ #💬-general (237 messages🔥🔥):

Links mentioned:


LM Studio ▷ #🤖-models-discussion-chat (39 messages🔥):

Links mentioned:


LM Studio ▷ #announcements (2 messages):

Links mentioned:


LM Studio ▷ #🧠-feedback (13 messages🔥):


LM Studio ▷ #🎛-hardware-discussion (111 messages🔥🔥):

Links mentioned:


LM Studio ▷ #🧪-beta-releases-chat (80 messages🔥🔥):

Links mentioned:


LM Studio ▷ #langchain (2 messages):

Link mentioned: OllamaPaperBot/simplechat.py at main · eltechno/OllamaPaperBot: chatbot designed to interact with PDF documents based on OpenSource LLM Models - eltechno/OllamaPaperBot


LM Studio ▷ #amd-rocm-tech-preview (54 messages🔥):

Links mentioned:


LM Studio ▷ #crew-ai (2 messages):


OpenAI ▷ #annnouncements (1 messages):

Link mentioned: Navigating the Challenges and Opportunities of Synthetic Voices: We’re sharing lessons from a small scale preview of Voice Engine, a model for creating custom voices.


OpenAI ▷ #ai-discussions (75 messages🔥🔥):

Links mentioned:


OpenAI ▷ #gpt-4-discussions (14 messages🔥):


OpenAI ▷ #prompt-engineering (153 messages🔥🔥):


OpenAI ▷ #api-discussions (153 messages🔥🔥):


Eleuther ▷ #general (272 messages🔥🔥):

Links mentioned:


Eleuther ▷ #research (46 messages🔥):

Links mentioned:


Eleuther ▷ #interpretability-general (40 messages🔥):

Links mentioned:


Eleuther ▷ #lm-thunderdome (8 messages🔥):

Link mentioned: Build software better, together: GitHub is where people build software. More than 100 million people use GitHub to discover, fork, and contribute to over 420 million projects.


Nous Research AI ▷ #off-topic (6 messages):

Link mentioned: Cohere int8 & binary Embeddings: Cohere int8 & binary Embeddings - Scale Your Vector Database to Large Datasets#ai #llm #ml #deeplearning #neuralnetworks #largelanguagemodels #artificialinte...


Nous Research AI ▷ #interesting-links (6 messages):

Links mentioned:


Nous Research AI ▷ #general (145 messages🔥🔥):

Links mentioned:


Nous Research AI ▷ #ask-about-llms (26 messages🔥):

Links mentioned:


Nous Research AI ▷ #project-obsidian (1 messages):

night_w0lf: Did it work?


Nous Research AI ▷ #rag-dataset (57 messages🔥🔥):

Links mentioned:


Nous Research AI ▷ #world-sim (88 messages🔥🔥):

Links mentioned:


Modular (Mojo 🔥) ▷ #general (127 messages🔥🔥):

Links mentioned:


Modular (Mojo 🔥) ▷ #💬︱twitter (6 messages):


Modular (Mojo 🔥) ▷ #📺︱youtube (1 messages):

Link mentioned: Modular Community Livestream - New in MAX 24.2: MAX 24.2 is now available! Join us on our upcoming livestream as we discuss everything new in MAX - open sourcing Mojo standard library, MAX Engine support f...


Modular (Mojo 🔥) ▷ #✍︱blog (3 messages):

Links mentioned:


Modular (Mojo 🔥) ▷ #announcements (1 messages):

Links mentioned:


Modular (Mojo 🔥) ▷ #🔥mojo (91 messages🔥🔥):

Links mentioned:


Modular (Mojo 🔥) ▷ #community-projects (13 messages🔥):

Links mentioned:


Modular (Mojo 🔥) ▷ #community-blogs-vids (1 messages):

Link mentioned: Use locally built standard library in Mojo: Mojo standard library (stdlib) was open-sourced yesterday. It is exciting that the community can now contribute directly to the codebase. After spending some time with the stdlib repository, I want to...


Modular (Mojo 🔥) ▷ #🏎engine (4 messages):


HuggingFace ▷ #announcements (1 messages):

Links mentioned:

"tl;dr: do not depend on benchmark leaderboards…"
@xiaotianhan on Hugging Face: "🎉 🎉 🎉 Happy to share our recent work. We noticed that image resolution…"
@banghua on Hugging Face: "Have we really squeezed out the capacity of a compact chat model? Thrilled to…"


HuggingFace ▷ #general (73 messages🔥🔥):

Links mentioned:


HuggingFace ▷ #cool-finds (4 messages):


HuggingFace ▷ #i-made-this (26 messages🔥):

Links mentioned:


HuggingFace ▷ #reading-group (3 messages):

Link mentioned: Discord - A New Way to Chat with Friends & Communities: Discord is the easiest way to communicate over voice, video, and text. Chat, hang out, and stay close with your friends and communities.


HuggingFace ▷ #computer-vision (16 messages🔥):

Links mentioned:


HuggingFace ▷ #NLP (21 messages🔥):

Link mentioned: Evaluate Retrieval Augmented Generation (RAG) Systems: Retrieval Augmented Generation is a powerful framework which improves the quality of responses that you get from LLMs. But if you want to create RAG systems ...


HuggingFace ▷ #diffusion-discussions (4 messages):


LAION ▷ #general (108 messages🔥🔥):

Links mentioned:


LAION ▷ #research (31 messages🔥):

Links mentioned:


LlamaIndex ▷ #blog (5 messages):

Link mentioned: RSVP to LLM x Law Hackathon @Stanford #3 | Partiful: As artificial intelligence (AI) continues to revolutionize industries across the globe, the legal sector is no exception. LLMs, a foundation model capable of understanding and generating natural langu...


LlamaIndex ▷ #general (107 messages🔥🔥):

Links mentioned:


LlamaIndex ▷ #ai-discussion (2 messages):

Links mentioned:


OpenInterpreter ▷ #general (59 messages🔥🔥):

Links mentioned:


OpenInterpreter ▷ #O1 (54 messages🔥):

Links mentioned:


CUDA MODE ▷ #general (8 messages🔥):

Link mentioned: Tweet from Cohere For AI (@CohereForAI): Our community-led Beginners in Research-Driven Studies (BIRDS) group is kicking off it’s first mini-cohort learning group focused on CUDA Programming for Beginners, beginning on Friday, April 5th 🎉


CUDA MODE ▷ #cuda (9 messages🔥):

Links mentioned:


CUDA MODE ▷ #beginner (8 messages🔥):

Link mentioned: Install WSL: Install Windows Subsystem for Linux with the command, wsl --install. Use a Bash terminal on your Windows machine run by your preferred Linux distribution - Ubuntu, Debian, SUSE, Kali, Fedora, Pengwin,...


CUDA MODE ▷ #ring-attention (57 messages🔥🔥):

Links mentioned:


CUDA MODE ▷ #off-topic (5 messages):



CUDA MODE ▷ #triton-puzzles (24 messages🔥):

Link mentioned: [TRITON] Sync with triton upstream by Jokeren · Pull Request #19 · Deep-Learning-Profiling-Tools/triton-viz


OpenRouter (Alex Atallah) ▷ #announcements (1 messages):

Links mentioned:


OpenRouter (Alex Atallah) ▷ #general (107 messages🔥🔥):

Links mentioned:


AI21 Labs (Jamba) ▷ #announcements (1 messages):

Links mentioned:


AI21 Labs (Jamba) ▷ #jamba (40 messages🔥):

Links mentioned:


AI21 Labs (Jamba) ▷ #general-chat (56 messages🔥🔥):



tinygrad (George Hotz) ▷ #general (56 messages🔥🔥):


tinygrad (George Hotz) ▷ #learn-tinygrad (39 messages🔥):

Links mentioned:


Latent Space ▷ #ai-general-chat (93 messages🔥🔥):

Links mentioned:


OpenAccess AI Collective (axolotl) ▷ #general (45 messages🔥):

Links mentioned:


OpenAccess AI Collective (axolotl) ▷ #axolotl-dev (16 messages🔥):

Link mentioned: torchtune/recipes/configs/llama2/7B_full_single_device_low_memory.yaml at main · pytorch/torchtune: A Native-PyTorch Library for LLM Fine-tuning. Contribute to pytorch/torchtune development by creating an account on GitHub.


OpenAccess AI Collective (axolotl) ▷ #general-help (5 messages):


OpenAccess AI Collective (axolotl) ▷ #community-showcase (25 messages🔥):

Link mentioned: In-Context Pretraining: Language Modeling Beyond Document Boundaries: Large language models (LMs) are currently trained to predict tokens given document prefixes, enabling them to directly perform long-form generation and prompting-style tasks which can be reduced to do...


LangChain AI ▷ #announcements (1 messages):

Link mentioned: GitHub - langchain-ai/opengpts: Contribute to langchain-ai/opengpts development by creating an account on GitHub.


LangChain AI ▷ #general (57 messages🔥🔥):

Links mentioned:


LangChain AI ▷ #share-your-work (7 messages):

Links mentioned:


LangChain AI ▷ #tutorials (2 messages):

Link mentioned: Hack OpenGPT to Automate Anything: Welcome to the future of custom AI applications! This demo showcases the incredible flexibility and power of OpenGPTs, an open source project by LangChain. W...


Interconnects (Nathan Lambert) ▷ #news (28 messages🔥):

Links mentioned:


Interconnects (Nathan Lambert) ▷ #random (25 messages🔥):

Links mentioned:


DiscoResearch ▷ #general (5 messages):

Links mentioned:


DiscoResearch ▷ #discolm_german (3 messages):