Frozen AI News archive

Dia de las Secuelas (StarCoder, The Stack, Dune, SemiAnalysis)

**HuggingFace/BigCode** has released **StarCoder v2**, including the **StarCoder2-15B** model trained on over **600 programming languages** using **The Stack v2** dataset. The release is state of the art for models of this size, and opt-out requests were excluded from the training data. A detailed technical report covers the model's capabilities and training methodology. Additionally, a live event featuring **Dylan Patel** discussing GPU economics is announced for San Francisco.



One-time IRL callout: If you're in SF, join Dylan Patel (aka "that semianalysis guy" who wrote the GPU Rich/Poor essay) for a special live Latent Space event tomorrow. Our first convo was one of last year's most-referenced eps.


As hinted last year, HuggingFace/BigCode has finally released StarCoder v2 and The Stack v2. Full technical report here.

StarCoder 2: SOTA for size (3B and 15B)

Since it was only just released, the best source on evals for now is BigCode.

The Stack v2: 10x bigger raw, and 4.5x bigger deduped (900B Tokens)



We are experimenting with removing Table of Contents as many people reported it wasn't as helpful as hoped. Let us know if you miss the TOCs, or they'll be gone permanently.


AI Twitter Summary

AI and Machine Learning Discussions

Executive Shifts and Leadership

Technology Industry Updates

Innovation and Technical Insights

Memes/Humor

Miscellaneous Observations

AI Development and Infrastructure

AI Twitter Narrative

The technical and engineer-oriented Twitter ecosystem has been buzzing with significant discussions spanning AI, blockchain, leadership transitions in tech, and some light-hearted humor.

Regarding AI and Machine Learning, François Chollet’s reflection on LLMs as mirrors of our inputs and Daniele Grattarola's deep dive into diffusion distillation prompted critical thinking about the nature and future of AI technologies. Stas Bekman’s proposal for a secondary hub for model weights also caught the community's attention, underscoring the value of redundant hosting for machine learning models.

In the leadership and innovation arena, the leadership transition at $SNOW garnered significant engagement, reflecting continued interest in how leadership evolves within tech organizations.

Humor and memes remain a vital part of the discourse, with tweets like Cristóbal Valenzuela’s observation about the non-competition between airplanes and bicycles bringing a light-hearted perspective to innovation and disruption.

On various miscellaneous observations, Margaret Mitchell’s call for more diverse perspectives in tech reporting highlights the importance of inclusivity and varied viewpoints in shaping our understanding of tech events.

Lastly, discussions around AI development and infrastructure saw practical considerations taking the forefront, as noted by abacaj’s preparation for possible future outages by backing up model weights. This operational resilience mirrors the broader strategic resilience seen across the technical and engineering community.


PART 0: Summary of Summaries of Summaries


PART 1: High level Discord summaries

TheBloke Discord Summary

Links to consider:


Mistral Discord Summary

The conversations reveal technical discernment among the users, highlighting both enthusiasm for AI's advancements and practical discussions on AI model limitations and ideal deployment scenarios.


OpenAI Discord Summary


LM Studio Discord Summary

Model Compatibility Queries Spark GPU Discussions: Engineers explored LLMs such as Deepseek Coder 6.7B and StarCoder2-15B and their compatibility with Nvidia RTX 40 series GPUs, and discussed optimization strategies such as disabling certain features on Windows 11. Much of the discussion focused on finding the best-fitting models for given hardware specifications, spurred by the launch of StarCoder2 and The Stack v2. Members also mentioned LM Studio's compatibility issues, especially on legacy hardware like the GTX 650.

Hugging Face Outage Disrupts Model Access: An outage at Hugging Face caused network errors for members trying to download models, affecting their ability to search for models within LM Studio.

Qualcomm Unveils 80 Open Source Models: Qualcomm released 80 open source AI models on Hugging Face, targeting vision, audio, and speech applications, potentially enriching the landscape for AI modeling and development.

LLM Functionality Expansions: Users exchanged insights on enhancing functionalities within LM Studio, such as implementing an accurate PDF chatbot with Llama2 70B Q4 LLM, seeking guidance on adding image recognition features with models like PsiPi/liuhaotian_llava-v1.5-13b-GGUF/, and expressing desires for simplified processes in downloading vision adapter models.

Hardware Hubris and Hopes: Discussions thrived around user experiences with hardware, from reminiscing about older GPUs to sharing frustrations over misrepresented specs in an e-commerce setting. One user advised optimizations for Windows 11, while TinyCorp announced a new hardware offering, TinyBox, found here. There was also speculation about the potential for Nvidia Nvlink / SLI in model training compared to inference tasks.


HuggingFace Discord Summary


LAION Discord Summary


Nous Research AI Discord Summary


Latent Space Discord Summary


Perplexity AI Discord Summary

Links shared:


Eleuther Discord Summary


LangChain AI Discord Summary

(Note: The above summary integrates topics and resources from various channels within the Discord guild, focusing on points of interest most relevant to an engineer audience looking for technical documentation, coding integration, and advancement in AI hardware and applications.)


OpenAccess AI Collective (axolotl) Discord Summary


LlamaIndex Discord Summary


OpenRouter (Alex Atallah) Discord Summary


CUDA MODE Discord Summary


Interconnects (Nathan Lambert) Discord Summary


LLM Perf Enthusiasts AI Discord Summary


DiscoResearch Discord Summary


Datasette - LLM (@SimonW) Discord Summary


Skunkworks AI Discord Summary

An Unexpected Recruitment Approach: User .papahh directly messaged @1117586410774470818, indicating a job opportunity and showing enthusiasm for their potential involvement.


Alignment Lab AI Discord Summary


PART 2: Detailed by-Channel summaries and links

TheBloke ▷ #general (1070 messages🔥🔥🔥):

Links mentioned:


TheBloke ▷ #characters-roleplay-stories (511 messages🔥🔥🔥):

Links mentioned:


TheBloke ▷ #training-and-fine-tuning (86 messages🔥🔥):

Links mentioned:

cogbuji/Mr-Grammatology-clinical-problems-Mistral-7B-0.5 · Hugging Face: no description found


TheBloke ▷ #model-merging (6 messages):


TheBloke ▷ #coding (8 messages🔥):

Links mentioned:

Modular: Announcing MAX Developer Edition Preview: We are building a next-generation AI developer platform for the world. Check out our latest post: Announcing MAX Developer Edition Preview


Mistral ▷ #general (992 messages🔥🔥🔥):

Links mentioned:


Mistral ▷ #models (12 messages🔥):


Mistral ▷ #deployment (174 messages🔥🔥):

Links mentioned:


Mistral ▷ #ref-implem (76 messages🔥🔥):


Mistral ▷ #finetuning (15 messages🔥):


Mistral ▷ #showcase (7 messages):

Links mentioned:


Mistral ▷ #random (2 messages):


Mistral ▷ #la-plateforme (41 messages🔥):

Links mentioned:


Mistral ▷ #office-hour (1 messages):


Mistral ▷ #le-chat (423 messages🔥🔥🔥):

Links mentioned:


Mistral ▷ #failed-prompts (6 messages):


Mistral ▷ #prompts-gallery (5 messages):


OpenAI ▷ #ai-discussions (58 messages🔥🔥):

Links mentioned:


OpenAI ▷ #gpt-4-discussions (21 messages🔥):


OpenAI ▷ #prompt-engineering (391 messages🔥🔥):

Links mentioned:

Disrupting malicious uses of AI by state-affiliated threat actors: We terminated accounts associated with state-affiliated threat actors. Our findings show our models offer only limited, incremental capabilities for malicious cybersecurity tasks.


OpenAI ▷ #api-discussions (391 messages🔥🔥):

Links mentioned:

Disrupting malicious uses of AI by state-affiliated threat actors: We terminated accounts associated with state-affiliated threat actors. Our findings show our models offer only limited, incremental capabilities for malicious cybersecurity tasks.


LM Studio ▷ #💬-general (484 messages🔥🔥🔥):

Links mentioned:


LM Studio ▷ #🤖-models-discussion-chat (61 messages🔥🔥):

Links mentioned:


LM Studio ▷ #🎛-hardware-discussion (42 messages🔥):

<ul>
<li><strong>Optimization Tips for Windows 11</strong>: `.bambalejo` advised users to disable certain features on Windows 11, such as Microsoft's core isolation and VM platform, for better performance, and to ensure <em>VirtualizationBasedSecurityStatus</em> is set to 0.</li>
<li><strong>TinyBox Announcement</strong>: `senecalouck` shared a link with details on the TinyBox from TinyCorp, a new hardware offering found <a href="https://tinygrad.org">here</a>.</li>
<li><strong>E-commerce GPU Frustrations and Specs</strong>: `goldensun3ds` recounted a negative experience purchasing a falsely advertised GPU on eBay and said they would opt for Amazon for their next purchase. They also listed their robust PC specs, including dual RTX 4060 Ti 16GB.</li>
<li><strong>Old Hardware Nostalgia</strong>: Users including `jans_85817`, `nullt3r`, `heyitsyorkie`, and `666siegfried666` reminisced about older GPUs. The conversation included insights such as the GTX 650 being unfit for modern models, along with personal stories of past rigs and upgrades.</li>
<li><strong>Discussion on Nvidia Nvlink / SLI</strong>: Users `dub_ex` and `nullt3r` discussed the effectiveness of Nvidia Nvlink / SLI, concluding it is beneficial for model training but not necessarily for inference.</li>
</ul>
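
The `VirtualizationBasedSecurityStatus` check from the Windows 11 tip above can be sketched in Python as follows. This is a minimal sketch, not the procedure `.bambalejo` described: it assumes the value is read through PowerShell's `Get-CimInstance` from the `Win32_DeviceGuard` class, and the helper names `describe_vbs_status` and `query_vbs_status` are hypothetical, introduced only for illustration.

```python
import platform
import subprocess

# Meaning of Win32_DeviceGuard.VirtualizationBasedSecurityStatus values.
VBS_STATUS = {
    0: "not enabled",
    1: "enabled but not running",
    2: "enabled and running",
}


def describe_vbs_status(raw: str) -> str:
    """Turn the raw status text printed by PowerShell into a readable label."""
    code = int(raw.strip())
    return VBS_STATUS.get(code, f"unknown status {code}")


def query_vbs_status() -> str:
    """Ask PowerShell for the current VBS status (Windows only)."""
    # Assumption: PowerShell is on PATH and the DeviceGuard CIM namespace exists.
    command = (
        "(Get-CimInstance -ClassName Win32_DeviceGuard "
        r"-Namespace root\Microsoft\Windows\DeviceGuard)"
        ".VirtualizationBasedSecurityStatus"
    )
    result = subprocess.run(
        ["powershell", "-NoProfile", "-Command", command],
        capture_output=True,
        text=True,
        check=True,
    )
    return describe_vbs_status(result.stdout)


if __name__ == "__main__":
    if platform.system() == "Windows":
        print("VBS is", query_vbs_status())
    else:
        print("This check only applies to Windows.")
```

A result of "not enabled" corresponds to the value 0 that the tip recommends. Disabling core isolation trades a security feature for performance, so whether that is appropriate depends on the machine.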

LM Studio ▷ #🧪-beta-releases-chat (7 messages):

Links mentioned:


LM Studio ▷ #autogen (7 messages):


LM Studio ▷ #langchain (3 messages):


LM Studio ▷ #memgpt (1 messages):

jans_85817: i am are waiting that lm studio version for linux


HuggingFace ▷ #announcements (1 messages):

Links mentioned:


HuggingFace ▷ #general (491 messages🔥🔥🔥):

Links mentioned:


HuggingFace ▷ #today-im-learning (8 messages🔥):

Links mentioned:


HuggingFace ▷ #cool-finds (9 messages🔥):

Links mentioned:


HuggingFace ▷ #i-made-this (14 messages🔥):

Links mentioned:


HuggingFace ▷ #diffusion-discussions (5 messages):


HuggingFace ▷ #computer-vision (4 messages):


HuggingFace ▷ #NLP (14 messages🔥):

Links mentioned:


HuggingFace ▷ #diffusion-discussions (5 messages):


LAION ▷ #general (314 messages🔥🔥):

Links mentioned:


LAION ▷ #research (48 messages🔥):

Links mentioned:


Nous Research AI ▷ #off-topic (21 messages🔥):

Links mentioned:


Nous Research AI ▷ #interesting-links (6 messages):

Links mentioned:


Nous Research AI ▷ #general (205 messages🔥🔥):

Links mentioned:


Nous Research AI ▷ #ask-about-llms (45 messages🔥):

Links mentioned:


Nous Research AI ▷ #project-obsidian (3 messages):



Latent Space ▷ #ai-general-chat (57 messages🔥🔥):

Links mentioned:


Latent Space ▷ #ai-announcements (3 messages):

Links mentioned:

LLM Paper Club (West Edition!) · Luma: This week we'll be covering the paper - Matryoshka Representation Learning ( https://arxiv.org/abs/2205.13147 ) with two of the co-authors Gantavya Bhatt and Aniket Rege. We have moved...


Latent Space ▷ #llm-paper-club-west (165 messages🔥🔥):

Links mentioned:


Perplexity AI ▷ #general (157 messages🔥🔥):

Links mentioned:


Perplexity AI ▷ #sharing (13 messages🔥):


Perplexity AI ▷ #pplx-api (28 messages🔥):

Links mentioned:

Getting Started with pplx-api: You can access pplx-api using HTTPS requests. Authenticating involves the following steps:Start by visiting the Perplexity API Settings page. Register your credit card to get started. This step will n...


Eleuther ▷ #announcements (1 messages):


Eleuther ▷ #general (34 messages🔥):

Links mentioned:


Eleuther ▷ #research (63 messages🔥🔥):

Links mentioned:


Eleuther ▷ #scaling-laws (3 messages):


Eleuther ▷ #interpretability-general (15 messages🔥):


Eleuther ▷ #lm-thunderdome (19 messages🔥):

Links mentioned:


Eleuther ▷ #multimodal-general (2 messages):


Eleuther ▷ #gpt-neox-dev (2 messages):


LangChain AI ▷ #general (89 messages🔥🔥):

Links mentioned:


LangChain AI ▷ #langserve (3 messages):


LangChain AI ▷ #langchain-templates (2 messages):

Links mentioned:


LangChain AI ▷ #share-your-work (4 messages):

Links mentioned:


LangChain AI ▷ #tutorials (4 messages):

Links mentioned:


OpenAccess AI Collective (axolotl) ▷ #general (44 messages🔥):

Links mentioned:


OpenAccess AI Collective (axolotl) ▷ #axolotl-dev (9 messages🔥):

Links mentioned:


OpenAccess AI Collective (axolotl) ▷ #general-help (22 messages🔥):

Links mentioned:

axolotl/src/axolotl/core/trainer_builder.py at 6b3b271925b2b0f0c98a33cebdc90788e31ffc29 · OpenAccess-AI-Collective/axolotl: Go ahead and axolotl questions. Contribute to OpenAccess-AI-Collective/axolotl development by creating an account on GitHub.


OpenAccess AI Collective (axolotl) ▷ #community-showcase (11 messages🔥):


LlamaIndex ▷ #blog (4 messages):


LlamaIndex ▷ #general (75 messages🔥🔥):

Links mentioned:


LlamaIndex ▷ #ai-discussion (5 messages):


OpenRouter (Alex Atallah) ▷ #general (49 messages🔥):

Links mentioned:


CUDA MODE ▷ #triton (10 messages🔥):

Links mentioned:


CUDA MODE ▷ #cuda (6 messages):

Links mentioned:

CUDA Math API :: CUDA Toolkit Documentation: no description found


CUDA MODE ▷ #torch (13 messages🔥):


CUDA MODE ▷ #ring-attention (15 messages🔥):

Links mentioned:


Interconnects (Nathan Lambert) ▷ #news (10 messages🔥):

Links mentioned:


Interconnects (Nathan Lambert) ▷ #random (30 messages🔥):

Links mentioned:

Demis Hassabis - Scaling, Superhuman AIs, AlphaZero atop LLMs, Rogue Nations Threat: "scaling is an artform"


LLM Perf Enthusiasts AI ▷ #gpt4 (2 messages):


LLM Perf Enthusiasts AI ▷ #opensource (4 messages):


LLM Perf Enthusiasts AI ▷ #offtopic (1 messages):


LLM Perf Enthusiasts AI ▷ #openai (3 messages):


DiscoResearch ▷ #general (6 messages):

Links mentioned:


DiscoResearch ▷ #discolm_german (1 messages):

Links mentioned:

Build software better, together:): GitHub is where people build software. More than 100 million people use GitHub to discover, fork, and contribute to over 420 million projects.


Datasette - LLM (@SimonW) ▷ #ai (4 messages):

Links mentioned:

Ask Claude for rewrites: If Claude gives a response that is close to, but not quite what you're looking for, you can ask Claude to rewrite it. In Slack this can be as simple as telling Claude to "Try again" aft...


Skunkworks AI ▷ #off-topic (1 messages):

pradeep1148: https://www.youtube.com/watch?v=ikIgy0qlif8&feature=youtu.be


Skunkworks AI ▷ #general (1 messages):


Alignment Lab AI ▷ #looking-for-collabs (1 messages):

Links mentioned:

Uncovering the Origins of Values: A Biology and Cognition-Based Approach for AI Alignment: no description found


AI Engineer Foundation ▷ #general (1 messages):