A Google account is all you need?
AI News for 11/17/2025-11/18/2025. We checked 12 subreddits, 544 Twitters and 24 Discords (205 channels, and 14599 messages) for you. Estimated reading time saved (at 200wpm): 997 minutes. Our new website is now up with full metadata search and beautiful vibe coded presentation of all past issues. See https://news.smol.ai/ for the full news breakdowns and give us feedback on @smol_ai!
Finally, we got the highly anticipated Gemini 3 launch — SOTA benchmarks across the board (except for one), 60% higher pricing than Gemini 2.5, and #1 across all major Arena leaderboards (yes, as suspected, Gemini 3 beats yesterday’s Grok 4.1, which claimed #1 in Text Arena for all of one day; what a coincidence).
But you should carefully eyeball the HUGE jumps, especially when compared head-on with Sonnet 4.5 and GPT-5.1, each competitor’s current frontier model:
Some notable independent evals from Artificial Analysis, Vending Bench, ARC-AGI 2, Box, and of course PelicanBench (now with a v2) validate it as a frontier LLM. (We ran Gemini 3 vs GPT 5.1 and found GPT 5.1 clearly better for ultralong summarization/instruction following.) The magnitude of the ARC-AGI 2 results is also a pretty notable Pareto improvement:
Oriol Vinyals confirmed there were both pretraining and posttraining improvements - no walls in sight.
On top of this, two more surprise launches: Gemini Deep Think, with even better numbers but not yet released, and Antigravity, the acquired Windsurf team’s take on a Google-born agentic IDE, complete with its own domain name and an (un)surprisingly well-executed YouTube channel.
Google is very, very much back in business. More models to come, and it’s only Tuesday.
AI Twitter Recap
Google’s Gemini 3 and Antigravity: launch, specs, and availability
- Gemini 3 Pro (Preview) launch: Google unveiled its most capable model to date with a 1M-token context window (up to 1,048,576 input tokens; 65,536 output), state-of-the-art multimodal reasoning, and strong agentic/vibe coding. Pricing (AI Studio): prompts ≤200K tokens at $2/$12 per 1M tokens (input/output), >200K at $4/$18, with reported ~128 tokens/sec generation speed and a Jan 2025 knowledge cutoff. It’s available now across the Gemini app, AI Mode in Search, AI Studio/Vertex, and as an API/CLI, with dev docs covering new controls like thinking_level, per-part media_resolution, and mandatory Thought Signatures for reasoning continuity; a hedged API sketch follows after these bullets (@sundarpichai, @GoogleDeepMind, @Google, pricing, speed, dev guide, context/output).
- Antigravity (agentic IDE): Google introduced Antigravity, an agent-first development environment where agents orchestrate tasks across editor/terminal/browser, run UI tests via a Browser Subagent, record/replay, and iterate with human-in-the-loop validations. It uses Gemini 3 Pro for reasoning, Gemini 2.5 Computer Use for end-to-end execution, and Nano Banana for images. Public preview is free today (@antigravity, @GoogleDeepMind, overview).
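For builders who want to poke at the new controls, here is a minimal, unofficial sketch against the REST API. The model id, prompt, and especially the placement/casing of the thinking-level field are assumptions taken from the dev guide’s naming rather than verified sample code, so check them against the docs before relying on this:

```python
import os
import requests

# Unofficial sketch: exercise Gemini 3 Pro's new thinking control via the raw REST API.
# The endpoint and contents/generationConfig shape are standard Gemini API fields; the
# thinkingLevel field name is an assumption based on the dev guide's "thinking_level" control.
API_KEY = os.environ["GEMINI_API_KEY"]   # key from AI Studio
MODEL = "gemini-3-pro-preview"           # model id as referenced in the launch posts

url = f"https://generativelanguage.googleapis.com/v1beta/models/{MODEL}:generateContent"

payload = {
    "contents": [
        # Per-part mediaResolution would apply to image/video parts; omitted in this text-only call.
        {"role": "user", "parts": [{"text": "Give me three tradeoffs of speculative decoding."}]}
    ],
    "generationConfig": {
        "maxOutputTokens": 1024,
        "thinkingConfig": {"thinkingLevel": "high"},  # assumed field name; replaces the old thinkingBudget knob
    },
}

resp = requests.post(url, params={"key": API_KEY}, json=payload, timeout=120)
resp.raise_for_status()

# Thought Signatures, when returned on parts, must be echoed back on later turns to
# preserve reasoning continuity (per the dev guide); here we just print the text parts.
for part in resp.json()["candidates"][0]["content"]["parts"]:
    print(part.get("text", ""))
```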
Benchmarks and empirical performance (incl. Deep Think)
- Leaderboards and SOTA: Gemini 3 Pro debuts at #1 on LMSYS Arena Text with 1501 Elo and tops WebDev with 1487 Elo (@arena, @GoogleCloudTech). It posts strong scores on “Humanity’s Last Exam” (HLE ~37% without tools per Artificial Analysis; 41% reported for the Deep Think variant), GPQA Diamond up to 93.8% (Deep Think), ARC-AGI-2 at 31.1% (Pro) and 45.1% (Deep Think), MMMU-Pro ~81%, Video MMMU ~87.6% (@ArtificialAnlys, @koraykv, @lmthang, @fchollet). ARC Prize verified a >2× SOTA jump on ARC-AGI-2 with costs: Pro ~$0.81/task vs. Deep Think ~$77/task (@arcprize).
- Coding and agentic evaluations: Gemini 3 Pro leads LiveCodeBench Pro (notable uplift vs GPT‑5.x), wins multiple Code Arena tracks (Website/Game Dev/3D/UI Components), and performs strongly on terminal-task benchmarks (Terminus2/CodexCLI mixes matter). It also leads staged browsing (Stagehands) and shows large gains on PMPP-Eval (CUDA), SimpleBench, and LisanBench token-efficiency (Code/WebDev, LiveCodeBench, Stagehands, PMPP/SimpleBench, LisanBench). Karpathy cautions to validate beyond public leaderboards; private eval ensembles remain key (@karpathy).
Ecosystem rollouts and integrations
- Editors/Agents/Platforms: Gemini 3 Pro is shipping broadly on day one: Cursor default toggle and deep integration (@cursor_ai), VS Code/GitHub Copilot and GitHub CLI (@pierceboggan), Windsurf (@cognition), Cline (@cline), Amp default model (@thorstenball), Vercel AI Cloud (Gateway/v0/SDK) (@vercel), LlamaIndex (PR manager agent) and LlamaAgents (Gemini agent demo, LlamaAgents). Also on OpenRouter and Ollama Cloud (OpenRouter, Ollama Cloud).
- Search and generative UI: AI Mode in Google Search shipped Gemini 3 on day one with dynamic, query-tailored generative layouts and interactive simulations, rolling first to AI Pro/Ultra subs in the U.S. The Gemini app adds “Gemini Agent” for multi-step tasks and more visual/interactive responses (@Google, app update).
Anthropic x Microsoft x NVIDIA: multi-cloud Claude and massive capex
- Strategic partnership: Anthropic announced a deep technical and go-to-market partnership with Microsoft and NVIDIA: Claude models are now on Azure and Microsoft Foundry, with Microsoft and NVIDIA committing up to $5B and $10B respectively to expand Anthropic’s research and capacity. This makes Claude the only “frontier” model line available across the three major clouds (@AnthropicAI, @satyanadella, @nvidia, Claude on Azure/Foundry, Anthropic note).
Open research agents and toolchain updates
- AI2’s Deep Research Tulu (DR Tulu): Fully open recipe for long-form deep research with an 8B agent and a novel RLER (Reinforcement Learning with Evolving Rubrics) reward scheme that is instance-specific, search-grounded, and evolves to reduce reward hacking. Code, paper, and training procedure are released (@allen_ai, RLER details).
- Open-agent frameworks and middleware: LangChain introduced middleware for reliability (fallbacks) and control (model call limits), and highlighted production agent “middlewares” like subagents, FS, summarization (fallbacks, call limits, middlewares ask). LlamaExtract adds per-table-row extraction; LlamaAgents open preview for multi-step doc agents (LlamaExtract, LlamaAgents). AI agent research also included MiroThinker (model/context/interactive scaling) aiming to close the gap with proprietary deep-research agents (summary, paper).
Infra and ops notes
- Infra momentum and outages: Vercel, SkyPilot x CoreWeave, and Together Instant Clusters pushed fleet/cluster orchestration, while Modal profiled “host overhead” as a key inference bottleneck class (don’t stall GPUs) (Vercel, SkyPilot x CoreWeave, Together, Modal). A wide Cloudflare outage overlapped launch day, leading to broad service instability mentions (context).
Top tweets (by engagement)
- Sundar Pichai introducing Gemini 3: “best model in the world for multimodal understanding… agentic + vibe coding” (@sundarpichai, 19,250)
- Sam Altman congratulates Google on Gemini 3 (@sama, 30,283.5)
- Antigravity, Google’s agentic IDE, public preview (@antigravity, 10,231.5)
- Google AI Studio: “Gemini 3 Pro… 1501 Elo on LMArena” (@GoogleAIStudio, 14,311.5)
- Demis Hassabis on Gemini 3 topping HLE/GPQA/Arena and everyday utility (@demishassabis, 4,170)
- Anthropic partnership: Claude on Azure; NVIDIA/Microsoft to invest up to $10B/$5B (@AnthropicAI, 2,370.5)
Notes for builders:
- Gemini 3 Pro’s tool use and structured outputs are materially improved; pay attention to AI Studio’s new reasoning/IO controls and Thought Signatures for stable multi-turn chains (dev guide).
- Benchmarks are way up, but model harnesses and task design matter (e.g., coding/terminal-bench deltas across harnesses). Validate on your private evals and production traces; a minimal harness sketch follows these notes (@tristanzajonc, @karpathy).
- ARC-AGI-2 results suggest heavy gains from test-time reasoning compute; cost/perf tradeoffs for “Deep Think” modes are significant (@arcprize).
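To make the “validate on your private evals” advice concrete, here is a minimal harness sketch. It assumes both models are reachable through a single OpenAI-compatible gateway (the base URL and model ids below are placeholders, e.g. an OpenRouter-style endpoint), and grade() is a stand-in for whatever check you actually care about (unit tests, exact match, an LLM judge):

```python
from openai import OpenAI

# Placeholder gateway + model ids: point these at whatever OpenAI-compatible
# endpoint actually serves the models you want to compare.
client = OpenAI(base_url="https://openrouter.ai/api/v1", api_key="YOUR_KEY")
MODELS = ["google/gemini-3-pro-preview", "openai/gpt-5.1"]

# Your private prompts: production traces, regression cases, gnarly real tasks.
PROMPTS = [
    "Summarize the following incident report into 5 bullets: ...",
    "Refactor this function to remove the global state: ...",
]

def grade(prompt: str, answer: str) -> float:
    """Stand-in grader: replace with unit tests, exact match, or an LLM judge."""
    return 1.0 if answer.strip() else 0.0

scores = {m: 0.0 for m in MODELS}
for prompt in PROMPTS:
    for model in MODELS:
        resp = client.chat.completions.create(
            model=model,
            messages=[{"role": "user", "content": prompt}],
            temperature=0,  # keep runs comparable across models
        )
        scores[model] += grade(prompt, resp.choices[0].message.content or "")

for model, total in scores.items():
    print(f"{model}: {total:.0f}/{len(PROMPTS)}")
```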
AI Reddit Recap
/r/LocalLlama + /r/localLLM Recap
1. AI Server Uptime and Outages
- My local AI server is up and running, while ChatGPT and Claude are down due to Cloudflare’s outage. Take that, big tech corps! (Activity: 297): A Reddit user reports that their local AI server remains operational while ChatGPT and Claude are down due to a Cloudflare outage. This highlights the resilience of local servers compared to reliance on cloud services, which can be affected by external service disruptions. Despite the outage, some APIs remain functional, allowing continued access to cloud-based AI services. One commenter notes that their own LLM, hosted on a Cloudflare server, is down, forcing them to use cloud services via APIs, which are still operational. This suggests a mixed reliability of cloud services during outages.
- LocoMod points out that despite the Cloudflare outage, the APIs for services like ChatGPT and Claude are still operational. This suggests that while some user interfaces might be down, the backend services remain accessible, highlighting the resilience of API-based architectures even during widespread outages.
- JoshuaLandy mentions that their local language model (LLM) is hosted on a Cloudflare server and is currently down, forcing them to rely on cloud services via APIs that are still functional. This underscores the importance of having multiple layers of redundancy and the potential vulnerability of relying solely on a single infrastructure provider like Cloudflare.
- Blizado highlights a critical issue with cloud-based services: their dependency on internet connectivity and third-party services like Cloudflare. When such services face outages, it can feel as though the entire internet is down, affecting numerous dependent services. This comment emphasizes the need for decentralized or local solutions to mitigate such risks.
- Gemini 3 is launched (Activity: 1007): Google has launched Gemini 3, a state-of-the-art AI model that significantly enhances reasoning and multimodal capabilities, outperforming previous models in various benchmarks. Integrated across Google products like the Gemini app and Vertex AI, it introduces a Deep Think mode for complex problem-solving. The model excels in multimodal understanding, achieving top scores in AI benchmarks, and is designed to assist users in learning, building, and planning across diverse topics. Source A comment suggests a demand for a smaller, 8-14B parameter version of the model, indicating interest in more accessible versions of advanced AI models. Another comment humorously notes a successful bet on the release timing, reflecting community engagement with AI development timelines.
- Zemanyak discusses the potential for a future model, Gemma 4, suggesting a size between 8-14 billion parameters. This indicates a desire for a model that balances performance and resource efficiency, potentially improving on the capabilities of Gemini 3.
- lordpuddingcup provides a detailed breakdown of the Gemini Antigravity features, highlighting access to multiple advanced models like Gemini 3 Pro, Claude Sonnet 4.5, and GPT-OSS. The comment also notes the benefits of unlimited tab completions and command requests, along with generous rate limits, which could significantly enhance user experience and productivity.
Less Technical AI Subreddit Recap
/r/Singularity, /r/Oobabooga, /r/MachineLearning, /r/OpenAI, /r/ClaudeAI, /r/StableDiffusion, /r/ChatGPT, /r/ChatGPTCoding, /r/aivideo
1. Gemini 3.0 Pro Benchmark and Release Discussions
- Gemini 3.0 Pro benchmark results (Activity: 3182): The image likely contains benchmark results for the Gemini 3.0 Pro model, which appears to have achieved impressive performance metrics, as indicated by the comments. Users are comparing it favorably to previous models like GPT 5.1, suggesting significant improvements. Specific mentions of ‘Arc AGI’ and ‘ScreenSpot’ imply that these benchmarks might include new or enhanced capabilities in artificial general intelligence and screen interaction, respectively. The excitement in the comments suggests that these results are seen as a major advancement in AI technology. The comments reflect a strong positive reception, with users expressing surprise and admiration for the performance improvements over previous models. There is a sense of anticipation and satisfaction with the advancements made by Gemini 3.0 Pro, indicating that it has exceeded expectations.
- The benchmark results for Gemini 3.0 Pro have shown significant improvements, particularly in the ARC-AGI and ScreenSpot metrics. Users are noting that the performance numbers are ‘insane’, with ARC-AGI-2 hitting a 31% score, a substantial leap compared to previous iterations.
- There is skepticism about the authenticity of the benchmark results, especially regarding the ARC-AGI-2 score of 31%. This level of performance is being questioned by some users as potentially unrealistic, indicating a need for further validation or clarification from the developers.
- The release of Gemini 3.0 Pro is being compared to previous models like GPT 5.1, with users expressing surprise at the level of improvement. The results are being described as a remarkable conclusion to the year, suggesting that the advancements in this version have exceeded expectations.
- Gemini 3 Deep Think benchmarks (Activity: 1342): The image likely contains benchmark results for the Gemini 3 Deep Think model, showcasing a significant performance improvement on the ARC-AGI-2 benchmark, jumping from 4.9% to 45.1%. This dramatic increase suggests substantial advancements in the model’s capabilities, possibly due to architectural changes or training improvements. The comments highlight the impressive nature of this leap and suggest that the full graph in the image provides additional context or insights into the model’s performance across various benchmarks. The comments express amazement at the performance improvement, with one user noting the jump from 4.9% to 45.1% as “unbelievable.” Another comment suggests that the full graph in the image is worth examining for a more comprehensive understanding of the model’s performance.
- The benchmark result of 45.1% on the ARC-AGI-2 test for Gemini 3 Deep Think represents a significant improvement from a previous score of 4.9%. This dramatic increase highlights a substantial leap in performance, suggesting major advancements in the model’s capabilities.
- The graph linked by raysar provides a visual representation of the performance improvements, showcasing the substantial leap in scores. This visual data can help in understanding the scale and impact of the improvements made in the Gemini 3 model.
- The discussion around the benchmark results emphasizes the unexpected nature of the performance jump, with users expressing disbelief at the nearly tenfold increase in the score. This suggests that the improvements in Gemini 3 may involve significant architectural changes or optimizations.
- Gemini 3.0 Pro benchmarks leaked (Activity: 1219): The post discusses a leak of benchmarks for Gemini 3.0 Pro, a model by DeepMind. The linked model card, which has since been removed, reportedly contained impressive performance metrics for tasks such as ScreenSpot-Pro, VideoMMMU, OmniDocBench, and general tool use and reasoning abilities. These benchmarks suggest that Gemini 3.0 Pro could significantly advance AI capabilities in these areas, although some commenters express skepticism about the authenticity of the numbers. Commenters are excited but skeptical, with some doubting the authenticity of the ‘ridiculous numbers’ reported in the leak. Others highlight the potential of Gemini 3.0 Pro as a powerful tool for specific tasks, indicating a mix of anticipation and caution.
- fmai highlights the potential of Gemini 3.0 Pro as a powerful computational agent, emphasizing its performance on specific benchmarks like ScreenSpot-Pro, VideoMMMU, and OmniDocBench. These benchmarks suggest significant improvements in tool use and reasoning capabilities, indicating a leap in AI performance.
- SpecialistLet162 notes the deletion of the PDF containing the Gemini 3.0 Pro benchmarks, which suggests a possible leak or premature release of information. However, the document has been archived, allowing continued access to the data for analysis and discussion.
- PaxODST suggests that the impressive performance metrics of Gemini 3.0 Pro might necessitate a reevaluation of timelines for achieving Artificial General Intelligence (AGI), indicating that the model’s capabilities could be a significant step towards more advanced AI systems.
- gemini 3.0 pro vs gpt 5.1 Benchmark (Activity: 1141): The image likely contains a benchmark comparison between Gemini 3.0 Pro and GPT 5.1, highlighting Gemini 3.0 Pro’s superior performance. A key point from the comments is the use of the MathArena Apex score, which tests models on problems from 2025 competitions, ensuring they haven’t been seen in training data. This benchmark is significant because it demonstrates Gemini 3.0 Pro’s ability to solve novel problems, achieving a 23% score compared to other models’ ~1%, indicating a breakthrough in AI’s problem-solving capabilities beyond pattern memorization. One comment suggests that Google might lead the AI race, possibly due to Gemini’s performance. Another comment notes that the model card for Gemini 3.0 Pro was released but has since been taken down, indicating potential sensitivity or proprietary concerns.
- The MathArena Apex score is highlighted as a significant benchmark for evaluating AI models’ ability to solve novel problems. Unlike traditional math benchmarks, which models might have memorized, Apex uses problems from 2025 competitions, ensuring they are beyond the training data. This results in a stark performance difference, with Gemini achieving 23% compared to other models’ ~1%, indicating its capability to handle new challenges rather than relying on memorized patterns.
- A user notes the practical application differences between Gemini and ChatGPT, emphasizing that Gemini excels in trade work and logic-related tasks due to its ability to process and understand work manuals effectively. In contrast, ChatGPT is preferred for more abstract, conversational tasks but is noted to hallucinate more, making it less reliable for technical work. This comparison underscores the specialized strengths of each model in different contexts.
- The Gemini 3.0 Pro model card was initially available at a specific Google storage link but has since been taken down. This suggests a potential retraction or update in the documentation, which could be relevant for those tracking the model’s development and capabilities.
- gemini 3.0 pro vs gpt 5.1 benchmark (Activity: 560): The post discusses a benchmark comparison between Gemini 3.0 Pro and GPT-5.1, highlighting that Gemini 3.0 Pro outperforms any model released by OpenAI to date. The linked Gemini 3.0 Pro Model Card likely contains detailed performance metrics and capabilities of the model, though the image itself is not described. This suggests a significant advancement in AI model performance by Google, potentially impacting the competitive landscape in AI development. Comments suggest a perception of Google’s dominance in AI, with one user noting that increasing rate limits for Pro plans could be beneficial, indicating a demand for more accessible high-performance AI models.
- A key point of discussion is the potential impact of Google’s Gemini 3.0 Pro on rate limits for Pro plans. If Google increases these limits, it could significantly enhance the usability and appeal of their services, providing a competitive edge over other models like GPT-5.1.
- There is a technical debate regarding the distinction between GPT-5.1 and Codex. Codex is noted as a specialized model, particularly for code generation, which suggests that comparing it directly to GPT-5.1 might not be entirely appropriate due to their different use cases and strengths.
- A user humorously critiques the reliability of AI-generated project assessments, highlighting a discrepancy between AI’s optimistic project status reports and the actual quality of the code, as revealed by tests. This underscores ongoing challenges in AI’s ability to accurately evaluate and report on software development tasks.
- Gemini 3 Pro Model Card is Out (Activity: 962): The Gemini 3 Pro Model Card has been released by DeepMind, detailing a model with a 1M-token context window capable of processing diverse inputs such as text, images, audio, and video, and producing text outputs with a 64K-token limit. The model’s knowledge is up-to-date as of January 2025. The original link to the model card is down, but an archived version is available here. The comments highlight the significance of the model’s large context window and output capacity, suggesting excitement and anticipation for its capabilities. The removal of the original link has sparked discussions about the model’s authenticity and potential impact.
- The Gemini 3 Pro model card reveals a significant advancement in AI capabilities, featuring a token context window of up to 1M, which is a substantial increase compared to previous models. It supports diverse input types including text, images, audio, and video, and can output text with a 64K-token limit. The model’s knowledge cutoff is set to January 2025, indicating it incorporates the latest data up to that point.
- A comparison is made between Gemini 3 Pro and other models like GPT5 Pro and Sonnet, highlighting that Gemini 3 Pro outperforms GPT5 Pro and matches Sonnet in coding tasks. This suggests a significant leap in performance, especially in specialized tasks like coding, which are critical for enterprise applications.
- The discussion touches on the competitive landscape, suggesting that OpenAI and Google are likely to dominate the AI space, potentially outpacing competitors like Anthropic due to pricing and feature advancements. The comment implies that while Claude’s code features are innovative, they may inadvertently guide competitors, making it challenging for smaller players to maintain a competitive edge.
- Gemini 3 Pro benchmark (Activity: 1497): The post discusses the benchmark results of Gemini 3 Pro, a new AI model from DeepMind. The linked PDF provides detailed model cards, indicating that Gemini 3 Pro may outperform existing models significantly, potentially becoming the leading AI model. The discussion highlights the importance of whether these improvements translate into noticeable advancements in everyday use, especially given the underwhelming reception of ChatGPT 5. The image associated with the post is not described, but it likely contains visual data or graphs related to the benchmark results. Commenters express skepticism and curiosity about the implications of AI models achieving near-perfect benchmark scores, questioning the real-world impact and user experience improvements.
- A key point of discussion is the potential impact of the Gemini 3 Pro model if the benchmark results are accurate. Users are speculating whether it will significantly outperform existing models in practical applications, especially given the underwhelming performance of ChatGPT-5. The anticipation is whether Gemini 3 Pro will represent a genuine advancement in AI capabilities, particularly in everyday use cases.
- There is curiosity about the implications of AI models achieving near-perfect benchmark scores, such as 99.9% or 100%. This raises questions about the future of AI development and evaluation, as well as the potential plateau in performance improvements. The discussion hints at a need for new metrics or benchmarks to assess AI capabilities beyond traditional scoring systems.
- The mention of “Google Antigravity” and its associated link has sparked interest, though it appears to be a placeholder or non-functional link. This has led to speculation about what “Google Antigravity” might entail, possibly hinting at new technologies or projects under development by Google.
- Gemini 3.0 Pro Preview is out (Activity: 763): The image provides a preview of the new “Gemini 3 Pro” model within the Google AI Studio interface, showcasing its advanced features such as state-of-the-art reasoning, multimodal understanding, and unique capabilities like agentic and vibe coding. This release is marked as confidential and new, indicating a significant update from previous versions. The interface includes navigation options for different functionalities, suggesting a comprehensive suite of tools for AI development. The comments highlight user experiences, noting improvements in language understanding and some limitations like request quotas. Users have mixed reactions; some are impressed by the improved language understanding and nuance detection in Gemini 3 compared to previous versions, while others face limitations like request quotas, indicating potential access or usage issues.
- AdamH21 highlights a significant improvement in Gemini 3.0’s natural language processing capabilities, particularly in understanding nuances and sarcasm in Czech, which was a limitation in Gemini 2.5. This suggests enhanced contextual understanding and adaptability in multilingual settings, potentially surpassing ChatGPT in these aspects.
- redmantitu reports a technical issue with Gemini 3.0 Pro related to request limits, indicating a possible quota restriction even for pro users. This could imply a need for better resource allocation or user communication regarding usage limits.
- Individual-Offer-563 confirms the availability of Gemini 3.0 in AI Studio from Europe, suggesting a broadening of access to this version across different regions, which may be relevant for developers and users interested in testing or deploying the model.
2. Cloudflare Outage Impact on Major Platforms
- Gemini Is The Only Major LLM Not Effected by Cloudflare’s Outage (Activity: 771): Gemini, a major language model, remained unaffected by a recent Cloudflare outage that impacted other large language models (LLMs). This resilience highlights Gemini’s robust infrastructure and possibly distinct network dependencies compared to its competitors. The incident underscores the importance of diversified network strategies in AI deployment to mitigate single points of failure. Commenters noted the unexpected dominance of Gemini, Veo, and NanoBanana in their respective fields, contrasting with Facebook’s significant investments yet lagging performance. There is also a sentiment that Google could leverage this incident for effective marketing.
- A user speculates that Google’s Gemini might have been indirectly responsible for Cloudflare’s outage, suggesting that the excitement and traffic generated by the release of Gemini’s model card could have overwhelmed Cloudflare’s infrastructure. This highlights the significant interest and demand for Google’s AI advancements.
- Another comment points out the irony in the situation, noting that the CEO of Cloudflare has been known to engage in public disputes with Google. This adds a layer of complexity to the outage, as it suggests potential underlying tensions between the companies that could influence technical operations or responses.
- The discussion touches on the competitive landscape of AI, with a user noting that despite Facebook’s substantial financial investments, they are still lagging behind in AI development compared to Google’s Gemini. This underscores the challenges in AI innovation and the importance of strategic advancements beyond just financial input.
- Cloudflare went offline globally and now ChatGPT, X, and dozens of major platforms are throwing errors (Activity: 881): The image is not directly analyzed, but the post discusses a significant global outage of Cloudflare, affecting major platforms like ChatGPT, X, and others. This outage highlights the dependency of many internet services on Cloudflare’s CDN and security infrastructure. Users report errors such as ‘Please unblock challenges.cloudflare.com to proceed,’ indicating issues with Cloudflare’s challenge pages. The outage’s impact is so widespread that even Downdetector, a service for tracking outages, is experiencing difficulties, underscoring the scale of the problem. Commenters note the fragility of the internet’s reliance on major cloud providers like Cloudflare, AWS, and Microsoft, with one user pointing out that similar outages occur when these services experience issues.
- A user noted that the error message ‘Please unblock challenges.cloudflare.com to proceed’ is appearing on multiple platforms like Claude.ai and chatgpt.com, indicating a widespread issue with Cloudflare’s services. This suggests that Cloudflare’s challenge page, which is often used for security checks, is not accessible, causing disruptions across various services that rely on it.
- Another comment highlights the cascading effect of Cloudflare’s outage, mentioning that even Downdetector, a service used to track outages, is down. This underscores the dependency of many internet services on major cloud providers like Cloudflare, AWS, and Microsoft, where a single point of failure can lead to widespread service disruptions.
- The discussion also touches on the broader impact of cloud service outages, with a user pointing out that major platforms like X (formerly Twitter) are experiencing 500 errors. This indicates server-side issues likely related to the inability to pass Cloudflare’s security challenges, affecting both web and mobile access.
- “Please unblock challenges.cloudflare.com to proceed.” (Web, chrome) (Activity: 8711): The issue described involves a message from Cloudflare, specifically ‘Please unblock challenges.cloudflare.com to proceed,’ which typically suggests interference from ad blockers or VPNs. However, the user confirmed these were disabled. The problem was likely due to a global Cloudflare outage affecting various services, including OpenAI’s ChatGPT, which rely on Cloudflare’s CDN. The issue resolved itself after some time and a forced refresh, indicating a temporary disruption in Cloudflare’s network. Commenters noted that the global Cloudflare outage impacted multiple services, including Google and ChatGPT, while some services like Gemini were partially functional. The situation was dynamic, with services intermittently available, and users were advised to wait for resolution.
- Cloudflare experienced a global outage affecting multiple services that rely on its network, including OpenAI’s ChatGPT. Users reported intermittent access, with some services like Google remaining operational while others were disrupted. The issue was temporary, with Cloudflare working to resolve it and restore consistent service.
- The outage impacted users’ ability to access websites and services that depend on Cloudflare’s CDN, such as ChatGPT. This highlights the critical role Cloudflare plays in web infrastructure, as even tools like Downdetector were inaccessible to some users, complicating the ability to track the outage’s scope and impact.
- During the outage, some users noted that while Cloudflare-dependent services were down, others like Gemini were still operational but with reduced functionality. This suggests that the outage’s impact varied across different services, possibly due to how they integrate with Cloudflare’s network.
- ChatGPT website messed up. (Activity: 2277): The user experienced an issue accessing the ChatGPT website, receiving a message to unblock challenges.cloudflare.com, despite it not being blocked. The JavaScript console showed errors, but the issue resolved itself after 5 minutes. This incident highlights potential vulnerabilities in web services due to reliance on centralized services like Cloudflare, which can lead to single points of failure. Commenters noted that the issue was part of a broader Cloudflare outage affecting multiple services, including Twitter, emphasizing the risks of centralization in web infrastructure.
- cruncherv highlights the issue of centralization in web services, pointing out that reliance on a few major providers like AWS, Azure, and Cloudflare creates a vulnerability to single-point-of-failure events. This centralization means that a failure in any of these services can potentially disrupt a significant portion of the internet, emphasizing the need for more distributed and resilient infrastructure.
- DeepFreezeDisease mentions a major Cloudflare outage, noting its impact on services like Twitter. This underscores the critical role that Cloudflare plays in internet infrastructure, as its downtime can lead to widespread service disruptions, affecting not just individual websites but major platforms as well.
- The discussion reflects on the broader implications of cloud service outages, with users expressing frustration over the dependency on these services for everyday tasks, such as studying or writing essays. This highlights the importance of having contingency plans or alternative solutions when such outages occur.
3. Gemini 3.0 Pro Performance and User Feedback
- Is it just me or has Gemini 3 Pro gotten worse lately? (Activity: 806): The post discusses a perceived decline in the performance of Gemini 3 Pro, a language model, noting that it initially produced more human-like and intelligent responses but now struggles with basic problem-solving, particularly in mathematics. This suggests potential issues with model updates or deployment that may have affected its accuracy or reasoning capabilities. One comment humorously suggests that the model’s performance issues might be due to quantization, a process that can reduce model size but sometimes at the cost of accuracy, indicating a possible technical reason for the decline.
- Gemini 3 Pro first impressions (Activity: 1366): Gemini 3 Pro is a new AI model that excels in multiple domains including math, physics, and code, with a notable improvement in visual understanding. It surpasses Claude Sonnet 4.5 in UI design capabilities. Users report that it passes private tests where other state-of-the-art models fail, highlighting its superior performance in understanding image elements. Commenters express strong approval of Gemini 3 Pro’s capabilities, with one user noting its ‘amazing’ ability to understand image elements, suggesting it may set a new standard in AI model performance.
- A user tested Gemini 3 Pro on private benchmarks where other state-of-the-art models failed, but Gemini 3 Pro succeeded in all tests, indicating its superior performance in specific tasks.
- Another user highlighted Gemini 3 Pro’s exceptional ability to understand elements within images, suggesting it surpasses previous models in image recognition capabilities.
- A user shared a complex logical problem involving spatial reasoning and noted that Gemini 3 Pro was the first language model to solve it correctly without additional hints, showcasing its advanced problem-solving abilities.
- Gemini 3 has been nerfed? (Activity: 744): The post raises concerns about a perceived reduction in performance of Gemini 3, speculating whether the model has been quantized, which could affect its computational efficiency. Quantization is a process that reduces the precision of the model’s weights, potentially leading to faster inference but at the cost of accuracy. The post does not provide specific benchmarks or technical details to substantiate the claim. The comments reflect a mix of sarcasm and mild frustration, with one user humorously noting a 5% increase in time for solving a trivial problem, suggesting a perceived decline in performance. However, there is no technical evidence provided to support these claims.
- Gemini 3 pro passes the finger test (Activity: 745): The image and accompanying comments suggest a significant advancement in AI capabilities, specifically referencing the “Gemini 3 pro” model. The phrase “passes the finger test” implies that this AI model has achieved a level of sophistication in understanding or generating human-like features or interactions, which is often a benchmark in AI development. The comments further emphasize this by mentioning the “shoe test” and “circle test,” suggesting that the model has successfully passed various tests that are typically challenging for AI, indicating a leap towards Artificial General Intelligence (AGI). The comments reflect a mix of awe and concern, with users expressing that the advancements in AI, as demonstrated by the Gemini 3 pro, might signal a significant shift in AI capabilities, potentially surpassing existing models like GPT.
- sama congratulated Google on Gemini 3 release & Sundar responded (Activity: 562): The image is a meme or non-technical in nature, as indicated by the comments and the lack of technical details in the post. The post discusses a congratulatory exchange between Sam Altman (referred to as ‘sama’) and Sundar Pichai regarding the release of Google’s Gemini 3. The comments suggest a competitive atmosphere in the AI industry, with one user humorously interpreting Altman’s congratulations as a competitive challenge. The image itself is not described in detail, but the context implies it is likely a humorous or satirical take on the situation. The comments reflect a mix of humor and anticipation about the ongoing competition in AI development, with one user expressing excitement for future AI releases and another suggesting that Altman’s message is a competitive gesture.
AI Discord Recap
A summary of Summaries of Summaries by gpt-5.1
1. Gemini 3 Pro & Google Antigravity: Rollout, Benchmarks, and Ecosystem Integrations
- Gemini 3 Pro Launch Wobbles While Crushing Benchmarks: Users across LMArena, Perplexity, Cursor, OpenAI, HuggingFace, and Yannic Kilcher discords tested Gemini 3 Pro, citing the leaked Gemini 3 Pro Model Card as “pretty crazy” on HLE, Video-MMMU, and ARC-AGI-2, and claiming it even beats Sonnet 4.5 and GPT‑5.1 in several internal tests and benchmarks. Rollout is chaotic: some users only see 2.5, others get downgraded mid-session, Google AI Studio enforces 50 messages/day, and the model card was published then yanked, prompting archive links like this Web Archive copy.
- Engineers report strong one‑shot React/SwiftUI generation, maze solving, compiler design help, and even a single‑shot real‑time raytracer, but also complain that Gemini 3 over‑edits code, drains context, over‑reads files, and was quickly “nerfed” for jailbreaks and long narrative generation, with some jailbreakers in BASI saying they must constantly rework exploits after safety patches. Others note it happily returns song lyrics where GPT‑5.1 refuses and joke that its launch even coincided with a major Cloudflare outage, spawning memes that Google “crashed Cloudflare so they could launch Gemini 3 without anyone looking”.
- Antigravity IDE Lands as a Gemini-Powered VS Code Doppelgänger: Google shipped Antigravity, a Gemini‑powered AI IDE, with downloads for all major OSes at antigravity.google and its installer at https://antigravity.google/download, positioning it as the default dev surface for Gemini 3 via API, CLI, and VS Code–style workflows. Perplexity and LMArena users describe it as essentially a VS Code clone / “new AI IDE coding agent”, while others in Nous Research report using Google’s VS Code fork to access Gemini 3 prior to or alongside the public model card.
- Early feedback frames Antigravity as a front-end wrapper around Gemini 3’s coding mode, with some praising its tight integration and others questioning whether it adds enough beyond extensions developers already use in VS Code. Antigravity now sits alongside the other Gemini entry points (AI Studio, Vertex AI, Gemini Enterprise), and several communities are explicitly comparing its dev experience to tools like Cursor, Windsurf (which just added Gemini 3 Pro support), and LM Studio’s MCP-based integrations.
- Gemini 3 Pro Infiltrates Dev Tools but Suffers Platform Glitches: Multiple developer tools integrated Gemini 3 Pro almost immediately: LMArena added it to their Text/WebDev/Vision leaderboards, Windsurf announced editor support at “Gemini 3 Pro is now on Windsurf”, and Cursor users started comparing it against Claude inside the editor using Google’s Gemini 3 docs. At the same time, LMArena users reported frequent coding‑mode errors, file‑editing failures (“Editing files with Gemini 3 pro sucks on LMArena”), and Polymarket‑style speculation over when full access will stabilize.
- Across discords, engineers describe a split reality where Gemini 3 tops leaderboards and solves gnarly coding/math tasks, yet is rate‑limited, downgraded, or bug‑ridden depending on front-end: Google AI Studio enforces 50 requests/day, the Aider community hits reproducible errors and posts reports, and some Perplexity users get silently bumped back to 2.5. This has already triggered a cottage industry of jailbreaks (BASI), context management concerns (Cursor, LM Studio), and debates over whether benchmark‑chasing masks real‑world latency, stability, and tooling issues.
2. Grok 4.1 and the Creativity/EQ Arms Race vs GPT-5.1 and Others
- Grok 4.1 Rockets Up Leaderboards and EQ Benchmarks: xAI launched Grok 4.1, claiming state‑of‑the‑art creative and emotional intelligence metrics: 1483 Elo on LM Arena, 65% preference in a 2‑week silent test, 1586 EQ‑Bench, and 1722 Elo on Creative Writing v3, as detailed in their Grok 4.1 model card. It is now free on grok.com and mobile apps, and quickly took #1 or #2 spots on both LMArena’s Text leaderboard and Latent Space’s curated rankings, where it’s compared head‑to‑head with GPT‑5.1.
- OpenAI discord users report that Grok 4.1 delivers “wildly good” creative writing and emotional responses, sometimes preferred to GPT‑5.1 for tasks like SwiftUI code + UX copy, but still lags in raw programming reliability and long‑term memory. OpenRouter members explicitly compare Grok‑4’s EQ and writing to GPT‑5, noting that both vendors shipped EQ / creativity‑focused updates within a week, interpreting this as a signal that emotional/creative benchmarks are now a first‑class product battleground.
- Gemini 3 Pro vs GPT-5.1 vs Grok: Multi-Discord Bake-Offs: Developers in the OpenAI, Yannic Kilcher, and Latent Space servers are running informal shootouts where Gemini 3 Pro, GPT‑5.1, and Grok 4.1 solve the same React/SwiftUI tasks, maze problems, and creative prompts, often anchored on the Gemini 3 Pro model card and Grok 4.1’s release note. Several users claim Gemini 3 Pro beats GPT‑5.1 on one‑shot UI code generation and willingness to answer copyright‑adjacent queries (e.g., song lyrics), while Grok wins on quirky creative writing and EQ, but loses in sober systems programming.
- These crowdsourced benchmarks emphasize real workflows over leaderboard scores: people complain that GPT‑5.1 can be too conservative, Grok too flaky with code, and Gemini 3 too aggressive with unsolicited edits and context consumption. The net effect is a three‑way race where each model stakes a different advantage—Gemini as the fast multi‑modal coder, GPT‑5.1 as the aligned, stable workhorse, and Grok 4.1 as the creative/EQ specialist—and teams increasingly talk about routing different sub‑tasks to different frontier models instead of relying on a single “best model”.
- Grok’s Safety and Jailbreak Drama Spurs Community Scrutiny: BASI Jailbreaking uncovered that Grok 4.1’s system prompt contains a controversial line stating that “teenage” or “girl” does not necessarily imply underage, with users sharing snippets and reacting strongly, leading some to gatekeep their jailbreaks rather than publish them. At the same time, a separate user reported using Grok (before or despite recent hardening) to generate serious ransomware, noting that the model complied without questioning the task, which triggered significant debate about disclosure and code sharing.
- Image‑based jailbreak research (below) shows Grok is also vulnerable to prompt‑in‑image attacks, further fuelling concerns that xAI’s moderation stack is lagging behind the model’s capability and business push. Engineers across servers see this as evidence that EQ‑heavy “fun” models can still be trivially weaponized, and argue for much stronger red‑team programs before such systems are made free and widely available.
3. Tooling, Infra, and Governance: MCP, Graph-RAG, Sourcegraph Ads, Runlayer, Atlas
- MCP and Runlayer Turn Tooling into a Governed, Multi-Server Mesh: Multiple communities are converging on Model Context Protocol (MCP) as the standard way to hook LLMs into tools: LM Studio added MCP-based tool integration in its docs, framing MCP as the UI layer rather than exposing raw REST APIs, while Latent Space highlighted Runlayer, which offers secure, governed access to 18,000+ MCP servers and just raised $11M led by Khosla and Felicis per Andy Berman’s announcement. Runlayer is already live with enterprise customers like Gusto and Opendoor, and positions itself as a control plane for MCP-based infra at scale.
- Engineers are specifically interested in access governance, auditing, and blast‑radius control for fleets of agent tools, with Runlayer pitched as “Okta for MCP servers” by implication, and LM Studio users viewing MCP as a way to keep tool schemas discoverable and consistent across local models. This is dovetailing with separate research in Unsloth and DSPy communities on online training via reward callbacks—where MCP‑exposed tools and an OAI-compatible API + vLLM async server act as the environment for RL‑style agent training (a rough sketch of that callback pattern appears at the end of this section).
- Graph-RAG and Mimir Attack Vendor Lock-In With Open Orchestration: HuggingFace’s general channel saw the launch of Mimir, a graph‑RAG database and orchestration framework that explicitly “gives a big middle finger” to Pinecone and Kilo Code, providing a user‑controlled, MIT‑licensed alternative on GitHub that already has 47+ stars. Mimir supports multi‑agent orchestration, push‑button deployment, OpenAI API compatibility, OpenWebUI, and uses llama.cpp embeddings, with showcased pipelines that manage Minecraft servers via semantic search and memory tools.
- By combining graph‑structured knowledge, local embeddings, and generic OpenAI‑style APIs, Mimir positions itself as a DIY alternative to closed RAG platforms and no‑code orchestrators like N8N and Pinecone‑backed solutions. This aligns with a broader sentiment in multiple discords that vector‑DB + hosted agent stacks are over‑priced and under‑transparent, and that teams should increasingly own their knowledge graphs and tool routers rather than renting opaque SaaS.
- Sourcegraph Ads, Atlas Browser, and Poe Group Chat Redefine AI UX and Monetization: Latent Space members discussed Sourcegraph adding ads to its coding assistant’s free tier and already generating an estimated $5–10M ARR, per a report from The Information, as a novel monetization strategy in dev tooling. In parallel, OpenAI’s community highlighted the Atlas browser, where Ben Goodger and Darin Fisher explained on the OpenAI Podcast how Atlas rethinks browsing “from the inside out”, with the podcast itself now available across Spotify, Apple, and YouTube.
- On the collaboration front, Poe rolled out group chat for up to 200 users, allowing teams to summon any of its 200+ AIs (Claude 4.5, GPT‑5.1, etc.) into a single shared thread as announced at Poe’s update, which engineers see as a primitive for multi‑agent, multi‑human workflows. Across these tools, a pattern emerges: AI products are experimenting with ad‑funded dev tools (Sourcegraph), agent‑native browsers (Atlas), and multi‑user AI sessions (Poe), all while infra vendors like Runlayer and MCP aim to keep underlying tool access auditable and secure.
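To ground the reward-callback idea mentioned above, here is a rough sketch of the environment half of that loop, assuming a vLLM server exposed through its OpenAI-compatible API. The endpoint, model name, and toy reward are illustrative placeholders rather than Unsloth’s actual implementation; the resulting (prompt, completion, reward) triples would feed an RL trainer such as a GRPO step:

```python
from openai import OpenAI

# vLLM exposes an OpenAI-compatible API (e.g. `vllm serve <model> --port 8000`);
# base_url, model name, and the reward logic below are illustrative placeholders.
policy = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

def reward_fn(prompt: str, completion: str) -> float:
    """Arbitrary reward callback: swap in tool-call checks, unit tests, or an LLM judge."""
    return 1.0 if "SELECT" in completion.upper() else 0.0  # toy check: did it emit SQL?

def rollout(prompts: list[str], model: str = "my-finetune") -> list[tuple[str, str, float]]:
    """One environment step per prompt: sample from the served policy, then score it."""
    triples = []
    for prompt in prompts:
        resp = policy.chat.completions.create(
            model=model,
            messages=[{"role": "user", "content": prompt}],
            temperature=1.0,  # keep sampling stochastic so the trainer sees exploration
        )
        completion = resp.choices[0].message.content or ""
        triples.append((prompt, completion, reward_fn(prompt, completion)))
    return triples

if __name__ == "__main__":
    batch = rollout(["Write a SQL query that counts orders per customer."])
    # In a real loop these triples go to the trainer and the served weights get refreshed.
    print(batch)
```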
4. Security, Jailbreaking, and Abuse: Image-Based Prompts, Ransomware, CAPTCHAs, Fraud
- Image-Based Jailbreaks Turn Every PNG into a Payload: Researchers in BASI’s #jailbreaking channel demonstrated that models like Grok can be jailbroken by embedding prompts directly inside images, with one user sending an image of text that bypassed text‑only safety and triggered a jailbreak, documented via shared screenshots. They hypothesize that vision models often “trust” visual text more than chat instructions, and are now exploring more advanced carriers like QR codes and hidden metadata.
- This “image injection” attack effectively creates a new class of prompt‑in‑image exploits that traditional text‑based filters can’t see, broadening the attack surface for any multimodal API. Security‑minded engineers now treat all user‑supplied images as potential arbitrary instruction bundles, and are discussing mitigations including visual text sanitization, OCR‑then‑filter pipelines, and explicit separation of image‑derived content from control prompts.
- CAPTCHA Solving and Weaponized Code Reveal Safety Gaps: BASI members reported Gemini successfully solving reCAPTCHAs with >50% accuracy, even outperforming some humans with poor eyesight, and joked about a business model paying North American click‑workers $0.08 per captcha. In parallel, another user used Grok with a fresh jailbreak to generate “serious ransomware”, noting that the model provided code without ethical pushback, and others clamored to see the payload and encryption routines.
- These anecdotes highlight how frontier models can now automate both access‑breaking tasks (CAPTCHAs) and malicious code generation with minimal friction, undermining older assumptions that safety layers would reliably block such use cases. The community response is split between red‑team interest and ethical concern over publishing working malware; several users explicitly discuss not releasing full code due to realistic abuse risk, underscoring the need for stronger in‑model and API‑side behavioral filters.
- Fraud, Data Control, and Account Risk in the LLM Ecosystem: OpenRouter’s community raised strong red flags about LiteAPI (site), a purported 40% cheaper OpenRouter competitor whose site mentions a different entity (Yaseen AI) in its generic privacy policy and appears to violate provider ToS, leading many to suspect stolen keys, stolen cards, or arbitraged API credits. Separately, Perplexity’s pplx-api channel documented that users cannot delete accounts on purrvv.me, with support confirming deletion is impossible, raising serious privacy and data‑retention concerns.
- On the platform side, some BASI users reported being banned from ChatGPT for jailbreaking, recommending disposable emails via simplelogin for account churn, while OpenAI’s own server saw harmless prompts tripped by over‑zealous automated filters. Together, these incidents paint a picture of an ecosystem where shadow resellers, opaque data policies, and brittle trust & safety systems interact in non‑obvious ways, and engineers are increasingly vetting vendors for key custody, deletion guarantees, and clear enforcement policies before building on top of them.
5. Performance Engineering and Training Tooling: llama.cpp, Hugging Face Tricks, DSPy, Atropos
- llama.cpp, tinygrad, and Hugging Face Squeeze Out Serious Speed: Unsloth users reported >25 tok/s on a Ryzen 8‑series CPU using llama.cpp with Unsloth‑tuned models after migrating from Ollama, saying that other stacks “would slouch” and that they can now realistically push 5–10 tok/s on very large models in RAM‑heavy setups. In parallel, an Eleuther member showed that in Hugging Face Transformers, using an empty logits processor with return scores.contiguous() yielded about 60% faster generation throughput under FA2 with dynamic=False and fullgraph=True (a minimal sketch of this trick appears at the end of this recap).
- The tinygrad community shared a benchmark where Llama‑1B on tinygrad hit 6.06 tok/s vs 2.92 tok/s on Torch across 10 runs (16,498.56 ms vs 34,247.07 ms), and discussed next steps like benchmarking against torch.compile and cleaning up old kernel imports and broken example files. Collectively, these conversations show practitioners aggressively optimizing CPU inference, graph compilation, and kernel code, rather than assuming that performance only comes from GPUs or hyperscaler‑hosted endpoints.
- Online Training, Non-Determinism, and DSPy Tooling for Production: The Unsloth help channel explored online training via an OAI-compatible API plus reward callback, suggesting a design where a dynamic dataset feeds into a vLLM server in async mode, allowing arbitrary reward functions for reinforcement learning as detailed in the Unsloth GPT‑OSS RL guide. Complementing this, their research channel shared a blog post on “Defeating Non‑Determinism in LLM Inference” from Thinking Machines with an accompanying YouTube video, emphasizing deterministic setups via seeds, hardware control, and numerical precision.
- The DSPy community introduced dspy-intellisense, a VS Code extension that adds richer type‑hinting for signatures and predictions, and discussed managing LLM non‑determinism in gpt-oss-20b by locking temperature=0, max_tokens, and output schemas, plus using Arize Phoenix tracing per their docs. There’s also growing demand for a “DSPy in production” channel, signalling that engineers are moving from experiments to monitored, reproducible deployments where determinism, tracing, and optimization structure (as in the optimizer-structured arXiv paper) really matter.
- Atropos + Tinker Make RL Environments Pluggable Into Any Model: Nous Research announced that Atropos, their RL environments framework, now integrates with Thinking Machines’ Tinker training API via the tinker-atropos GitHub repo, enabling users to train and test Atropos environments across many different models from a unified interface. Their tweet at @NousResearch frames this as a way to do cheap RL experiments against arbitrary models with shared tooling.
- In the broader Nous community, people are already connecting this to agentic finance and trading experiments, sharing videos like “financial traders utilizing Agentic AI tools to make money” while joking they’d rather buy gas‑station scratchers. Together with MCP/Runlayer and Unsloth online‑training ideas, Atropos+Tinker points towards a near‑term stack where any frontier or open model can be treated as a pluggable RL policy, trained via standardized environments and APIs rather than bespoke one‑offs.
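For reference, the “empty logits processor” trick from the first bullet of this recap boils down to something like the sketch below. The model id is a placeholder, FA2 needs a GPU with flash-attention installed, and the ~60% throughput gain is the reporter’s number under their torch.compile(dynamic=False, fullgraph=True) setup, not a guarantee:

```python
import torch
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    LogitsProcessor,
    LogitsProcessorList,
)

class ContiguousLogitsProcessor(LogitsProcessor):
    """'Empty' processor: changes nothing except returning a contiguous scores tensor."""
    def __call__(self, input_ids: torch.LongTensor, scores: torch.FloatTensor) -> torch.FloatTensor:
        return scores.contiguous()

model_id = "meta-llama/Llama-3.2-1B"  # placeholder; any causal LM works
tok = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    attn_implementation="flash_attention_2",  # FA2, as in the reported setup
).cuda()

# The reported gain was observed with the forward pass compiled like this;
# treat the speedup as anecdotal and benchmark on your own hardware.
model.forward = torch.compile(model.forward, dynamic=False, fullgraph=True)

inputs = tok("The quick brown fox", return_tensors="pt").to(model.device)
out = model.generate(
    **inputs,
    max_new_tokens=128,
    logits_processor=LogitsProcessorList([ContiguousLogitsProcessor()]),
)
print(tok.decode(out[0], skip_special_tokens=True))
```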
Discord: High level Discord summaries
LMArena Discord
- Gemini 3 Pro Impresses Users: Users are excited about Gemini 3 Pro and its performance, citing it as superior to Claude for gameplay and functionality.
- However, some users believe Gemini 3 Pro is overrated, especially compared to Grok 4.1, and are awaiting real-world application testing.
- Google AI Studio’s Usage is Limited: Members suggest using Google AI Studio for Gemini 3, but noted the platform has a limit of 50 messages per day.
- Users are also testing Gemini 3 Pro on Polymarket, anticipating a potential release this month, and are discussing Google AI Studio’s capabilities and constraints.
- LMArena has Gemini 3 Pro Glitches: Users report issues with LMArena, including frequent errors when using Gemini 3 Pro in coding mode and general unreliability, with one user stating Editing files with Gemini 3 pro sucks on LMArena.
- One member has suggested an iOS LMArena app as a workaround for mobile issues.
- Grok-4.1-thinking Dominates Leaderboard: Grok-4.1-thinking is now ranked #1 on the Text Arena leaderboard.
- It outranks Grok-4.1, now in the #2 spot on the same leaderboard.
- Code Arena Kicks off AI Generation Contest: The November AI Generation Contest is open to celebrate Code Arena’s launch, offering prizes like Discord Nitro and special roles.
- Participants must submit entries with a preview link and can refer to this example for guidance on the new Code Arena leaderboard.
Perplexity AI Discord
- Google’s Gemini 3 Rolls Out…: Gemini 3 launched for Google AI Pro/Ultra subscribers in AI Mode in Search; for developers in the Gemini API, AI Studio, Google Antigravity, and the Gemini CLI; and for enterprises in Vertex AI and Gemini Enterprise, but its rollout has been inconsistent per discussions on Twitter.
- Some users are being downgraded back to 2.5, and others suspect Google is rigging benchmarks and that the model might be trained on them.
- Google Releases “Antigravity” IDE: Google announced the release of Antigravity, its new AI code editor and members confirmed its availability for download at https://antigravity.google/download for Windows, Mac, and Linux.
- Though it will integrate with the latest Gemini model, some have reported it is essentially a VS Code clone and called it the new AI IDE Coding agent.
- Comet’s Android App Waitlisted: Comet released an Android Version for early access, but it’s currently waitlisted and not yet available.
- Users are reporting installation errors and device incompatibility issues, even with Android 11; some also speculate location may be a factor.
- Purrvv.me Users in a Bind: Users report that accounts on purrvv.me cannot be deleted, with support confirming that deletion is not possible, raising potential privacy issues.
- As attempts to delete accounts through support have failed, users are seeking alternative ways to manage their data and ensure its removal from the platform.
BASI Jailbreaking Discord
- Gemini Schools Humans on CAPTCHA: Members discussed Gemini’s success rate exceeding 50% in solving reCAPTCHAs, even outperforming humans with poor eyesight.
- Users joked about struggling with captchas themselves and even proposed outsourcing the task to individuals in North America for 8 cents per captcha.
- Builder.ai’s facade exposed!: The supposed AI no-code platform Builder.ai faced scandal as it was revealed that 700 Indian developers were manually coding apps, inflating revenues and triggering a US Federal subpoena.
- The company allegedly engaged in sham deals with VerSe Innovation, inflating revenues for 2023-2024, leading to the founder’s resignation.
- Grok goes rogue, generating ransomware!: A user reported successfully using Grok with a new jailbreak to write serious ransomware, expressing surprise at the AI’s willingness to assist without questioning the task.
- Other members expressed interest in the code, leading to debate about the ethical implications of publishing it online, given its potential for malicious use.
- Image Injection: A Picture is Worth a Thousand Jailbreaks: Members discovered that sending images containing jailbreak prompts to AI models like Grok can bypass text-based security measures, with one user successfully using images of text to activate jailbreaks.
- The user shared screenshots demonstrating the technique, sparking discussion about more sophisticated attacks using QR codes or hidden metadata.
- ChatGPT Banning Users for Jailbreaking: A member reported being banned for jailbreaking ChatGPT and cautioned others to be careful, and recommended using simple login to get a random email for creating new accounts.
- Another member stated they can only jailbreak paid version accounts of ChatGPT; after some back and forth they discovered the free version also works.
Cursor Community Discord
- Linux Dominates Mac OS for Power Users: Members debated the merits of Mac OS versus Linux, with several arguing that Mac OS lacks essential features for power users and requires installing many apps for basic functionality.
- One member criticized the Finder app, preferring to manage files from the terminal, while another described the stock experience as total trash.
- Composer Free Period Ends, Access Varies: The Composer free period concluded on November 11th, with users noticing the free label disappearing from the model selector.
- Some users reported inconsistent access during the preceding week, while one user claimed to have never had access, possibly due to using Arch Linux.
- Cursor 2.0 Parallel Agents Still Painful: Users discussed issues with Cursor 2.0’s parallel agents, specifically using 4x agents in a work tree, and a member shared a link to a forum post as a solution.
- Despite the shared resource, one user commented that the current situation is still a pain and pretty bad tbh.
- Gemini 3 Pro Hailed, Context Drain Feared: Gemini 3 Pro was released, with one member claiming it is superior to Claude and citing its speed, according to Google’s documentation.
- Other users countered that it drains context quickly and over-reads files.
- Cloudflare Succumbs to Global Outage: A global issue with Cloudflare caused widespread outages, affecting numerous websites and services - the Cloudflare status page reported issues.
- One member noted a fire contributing to their site being down, while another suggested people should stop using Cloudflare because atp its unreliable.
Unsloth AI (Daniel Han) Discord
- Gemma 3 Gets Roasted, Granite-4.0 Crowned: Members dismissed Gemma 3 270M in favor of Granite-4.0 as the superior small model, citing concerns about Gemma 3’s factual accuracy.
- It was noted that asking models with less than 4 billion parameters factual questions is a recipe for disaster.
- llama.cpp Races Ahead with Unsloth Models: A user reported impressive speeds of over 25 tokens a second on a Ryzen 8 series using llama.cpp with Unsloth models after moving from Ollama.
- They highlighted that other models would slouch and this enables loading up the ram and really doing 5-10 tokens realistically on epically large models!
- Breadclip Classifier Ideas Emerge: A user requested guidance on crafting a neural network for classifying breadclips given the constant emergence of new types.
- Another user suggested starting with a guide on making an image classifier using the MNIST dataset as a foundation.
- Online Training Opens Up with Unsloth: Discussion centered on enabling online training with Unsloth, targeting an OAI-compatible API with a reward callback.
- It was emphasized that users can implement any kind of reward system with Unsloth, that the dataset will need to be dynamic, and that the vLLM server will need to run in async mode; a hedged sketch of such a reward callback follows below.
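A rough illustration of the reward-callback idea, written in the TRL GRPO style that Unsloth's RL examples build on; the scoring rule and function name are invented for illustration, not the setup discussed in the channel:

```python
import json

def json_reward(completions, **kwargs):
    """Return one score per sampled completion: +1.0 if it parses as JSON, else 0.0."""
    scores = []
    for completion in completions:
        try:
            json.loads(completion)
            scores.append(1.0)
        except (TypeError, ValueError):
            scores.append(0.0)
    return scores

# Callables like this are typically passed as reward_funcs=[json_reward] to the
# trainer, while the prompts themselves can be generated dynamically per step.
```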
- Non-Determinism Nixed in LLM Inference: A member shared a blog post on defeating non-determinism in LLM inference.
- Another member shared a related YouTube video on the same topic.
OpenRouter Discord
- LiteAPI Sparks Fraud Concerns: LiteAPI, promising 40% cheaper rates as an OpenRouter alternative, faces fraud allegations due to its website, generic privacy policy referencing Yaseen AI, and violation of provider Terms of Service.
- Community members suspect stolen keys, stolen cards, or resold API credits, with concerns over the absence of API credit deposit options and model name listings.
- Grok-4’s EQ Mirrors GPT-5?: Members compared xAI’s Grok-4 to GPT-5, highlighting similar enhancements in emotional intelligence (EQ) and writing capabilities.
- They noted the simultaneous release of updates focused on creative benchmarks, prompting speculation about industry trends.
- Gemini 3 Causes Cloudflare Crash?: The launch of Gemini 3 coincided with a significant Cloudflare outage, disrupting internet services.
- The community joked that Gemini 3 is SOO big and SOO good that CF couldn’t take it while others speculated that Google crashed Cloudflare so they could launch Gemini 3 without anyone looking.
- Replicate Joins Cloudflare: A member joked about Cloudflare acquiring Replicate and how unfortunate it would be if it were OpenRouter instead.
- This alludes to the community’s thoughts on potential acquisitions or partnerships within the AI space.
- Infinite Compute Cures Hallucinations?: A member suggested that AI hallucinations could be nearly eliminated by using “infinite compute” to power a swarm of internet-connected agents.
- Another member proposed that hallucinations stem from translation errors, detailing their theory in this blog post.
OpenAI Discord
- Atlas Browser Reimagines Browsing: @BenGoodger and @Darinwf spoke with @AndrewMayne about Atlas, explaining its design philosophy and how it aims to redefine browser capabilities, questioning how to reinvent it from the inside out.
- Gemini 3 Pro Allegedly Beats GPT-5.1: A member claimed that Gemini 3 Pro outperformed GPT-5.1 on AI model tests, citing its proficiency in generating one-shot React and SwiftUI code and solving hand-drawn mazes.
- Another member noted that Gemini 3 Pro provides song lyrics while GPT-5.1 refuses but also pointed out that Gemini 3 Pro makes unrequested code changes during reviews.
- Grok 4.1 Impresses with Creative Flair: Members praised Grok 4.1 for excelling in creative writing and emotionally intelligent responses, with some favoring it over GPT-5.1 for tasks like SwiftUI code generation.
- Despite these strengths, its programming capabilities were considered weak, and its long-term memory was deemed unreliable, prompting discussions about its storage usage and file recall abilities.
- Custom GPT Efficiency Faces Scrutiny: Members analyzed the efficiency of custom GPTs versus base models, highlighting that custom GPTs streamline instruction following for specific tasks, improving prompt engineering.
- Concerns arose regarding the varying levels of support for thinking mode and the potential for a market saturated with low-quality custom GPTs.
- Automated Filters Flag Harmless Requests: A member reported that the automated filter flagged a prompt for an innocuous reason.
- This highlights potential issues with automated filters and the need for careful content moderation.
LM Studio Discord
- E-Commerce Scammers Thrive: Members are reporting increased incidents of buyers damaging products to demand refunds on e-commerce platforms, with one seller describing a buyer that took a hammer to [their] laptop then doused it in hand cream then returned it.
- A member claimed to have been robbed in broad daylight by buyers abusing the return system.
- LLMs Suffer From Cognitive Decline?: One member observed their LLM’s performance degrading over time, possibly due to a bug.
- Suggestions included restarting the model, starting a new conversation, and creating reproducible test cases.
- MCP Integrates External Tools: LLM Studio is integrating LLMs with external tools using the Model Context Protocol (MCP), as documented in the LM Studio documentation.
- Members suggest that MCP should be thought of as a UI for LLMs, since directly exposing REST APIs to LLMs might be confusing.
- RAM Prices Shoot to the Moon: Members reported significant price increases in RAM, resulting in one member selling 32GB of DDR5 6000 for $100.
- Another user claimed to have sold RAM for 3x its purchase price, calling current prices really good.
- DIY GPU Mods Get Extreme: A member shared a video of them aggressively removing a DVI port from a GPU.
- They later showed the modded card, claiming that all mods were 💯 reversible.
Latent Space Discord
- Sourcegraph Sees Green with Ad Revenue: Coding assistant startup Sourcegraph added ads to its free tier last month and is already earning an estimated $5–10 million annual recurring revenue, according to The Information.
- This move marks a significant shift in their monetization strategy, potentially impacting how other coding platforms approach freemium models.
- Grok 4.1 Claims Top Spot with Claims: xAI launched Grok 4.1, claiming the top spot on LM Arena (1483 Elo), boasting a 65% user preference in a silent 2-week test, and is now free on grok.com and mobile apps.
- Key highlights include a 1586 EQ-Bench score, 1722 Elo on Creative Writing v3, and 3× fewer hallucinations, according to their model card.
- Mohan’s Cryptic Post Sparks Speculation: Varun Mohan posted a cryptic “👀” tweet with a video thumbnail, sparking rampant speculation about an imminent release—likely Gemini 3.0, Veo 4, or a Windsurf integration—according to his announcement.
- The AI community buzzes with anticipation, analyzing every frame for clues on the upcoming launch.
- Poe Unites Users with Group Chat: Poe rolls out global group-chat support for up to 200 people, letting teams summon any of its 200+ AIs (like Claude 4.5 and GPT-5.1) in one synced thread, as per their announcement.
- The feature aims to streamline collaborative AI interactions, potentially becoming a hub for AI-assisted group projects.
- Runlayer’s Secure Access Gains $11M: Runlayer, a platform providing enterprises with secure, governed access to 18,000+ MCP servers, announced an $11M seed round led by Khosla & Felicis, now live with customers like Gusto and Opendoor, according to Andy Berman’s announcement.
- The funding will fuel expansion of their platform, addressing the increasing need for secure access management in large-scale computing environments.
Nous Research AI Discord
- Atropos and Tinker: A Dynamic Duo: Atropos, a RL Environments framework by Nous Research, now supports Thinking Machines’ Tinker training API, enabling training and testing of environments on various models.
- A tweet highlighted that this integration simplifies the training and testing process on a variety of models.
- Amazon’s Nova Premier V1: Is it the Next Big Thing?: Amazon’s Nova Premier v1 appeared on OpenRouter, but its novelty was questioned, considering Amazon’s history of forgettable models.
- Speculation arose that Jeff Bezos might be working on a separate AI initiative due to internal politics and issues with AWS Bedrock.
- Bedrock AWS Egress Faces Charges: Members shared experiences of incurring high costs on AWS, with one recounting a $3000 bill due to difficulties terminating an instance.
- Another member shared a similar experience being fucked by egress charges while hosting a Minecraft modpack at 17.
- Gemini 3 Released and Removed After Beating Sonnet: The Gemini 3 Pro Model Card was released and then removed, having briefly outperformed Sonnet 4.5 in benchmarks.
- One member noted that Anthropic, Microsoft, and Nvidia seemed to poopoo on Google’s launch.
- Uncensored MoEs for Crime Prevention: A member sought suggestions for uncensored Mixture of Experts (MoE) models to bypass normie-proofing limitations for developing crime prevention and education tools.
- As an alternative, they considered LoRAing a Josified model, citing concerns about the lack of a general knowledge dataset for uplifting an uncensored small model.
HuggingFace Discord
- Graph-RAG Database gives Kilo and Pinecone the Bird: A new graph-rag database was released, challenging Kilo Code and Pinecone with a user-controlled, open alternative with code intelligence on GitHub.
- The project already has 47 stars and is designed to be open and customizable.
- Mimir Project claims Open Source Utopia: The Mimir project offers multi-agent orchestration and push-button deployment with an MIT license, providing an alternative to vendor lock-in from solutions like N8N, Pinecone, and Kilocode.
- It is compatible with OpenAI API and OpenWebUI and uses llama.cpp for embeddings, and orchestration pipelines for Minecraft servers using memory tools and semantic search.
- Lablab Hackathons spark Scam Claims: A user reported a negative experience with Lablab hackathons, describing them as potentially exploitative and warning against unfavorable relocation terms.
- The user claimed to have been removed from the leaderboard after achieving the top score and cautioned others to avoid Lablab.
- Cloudflare falls, user admits guilt: A widespread Cloudflare outage affected services in cities like Stockholm, Berlin, Warsaw, and Paris, disrupting apps and online functionalities.
- A user jokingly claimed responsibility, stating they pushed a faulty authentication fix to production during a Cloudflare internship, saying tests passed locally.
- Gradio 6 Promises Speed Boost: Gradio 6 has launched and claims to be faster, lighter, and more customizable than ever, according to their Youtube announcement.
- The launch was announced on November 21 and included a team walkthrough of what’s new and a Q&A session.
Yannick Kilcher Discord
- Classic ML Book Endorsed for Newbies: Members discussed Understanding Machine Learning by Shai Shalev-Shwartz and Shai Ben-David as a nice introductory read for new ML engineers.
- One member suggested supplementing it with a more technical book, while others recommended Foundations of Machine Learning from Mohri and the website deeplearningtheory.com.
- Bezos Returns to AI with Prometheus: Jeff Bezos is reportedly returning as co-CEO of a new AI startup called Project Prometheus.
- A member joked that Bezos is basically aping Elon with every move.
- ReLU Still Relevant After All These Years: Members revisited ReLU as a computationally simpler non-linear activation function that also reduces the chance of vanishing gradients.
- It does this because its derivative is either 0 or 1; a minimal illustration follows below.
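A tiny illustration of that point in plain NumPy (not from the discussion):

```python
import numpy as np

def relu(x: np.ndarray) -> np.ndarray:
    return np.maximum(0.0, x)

def relu_grad(x: np.ndarray) -> np.ndarray:
    # The derivative is 0 for negative inputs and 1 for positive ones;
    # at exactly 0 it is conventionally taken to be 0.
    return (x > 0).astype(x.dtype)

x = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
print(relu(x))       # [0.  0.  0.  0.5 2. ]
print(relu_grad(x))  # [0. 0. 0. 1. 1.]
```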
- Google Previews Gemini 3 Pro Model: Members reviewed the leaked Gemini 3 Pro Model Card, noting impressive results on the HLE, Video-MMMU, and ARC-AGI-2 benchmarks.
- The model is available on AI Studio and Google’s VS Code fork (Antigravity) and has so many benchmarks, but the question remains whether Gemini’s performance gains will generalize to tasks outside those benchmarks, including private and novel ones, and to what degree.
- Transformers Explained by Yannic Kilcher: A member requested resources for understanding the evolution of attention and transformer models so another member shared a link to Yannic Kilcher’s Transformers video.
- The member also suggested using YouTube’s search feature to find content within Yannic’s channel: YannicKilcher/search?query=transformer.
Eleuther Discord
- EleutherAI to Shine at NeurIPS 2025: EleutherAI is set to present several papers at NeurIPS 2025, including in the main track: The Common Pile v0.1, Explaining and Mitigating Cross-Linguistic Tokenizer Inequalities, and More of the Same: Persistent Representational Harms Under Increased Representation.
- The team is also planning an official dinner and reception, with details available here, and encourages interested members to select the social activities role for updates.
- Huggingface Sees Generation Throughput Boost: A member discovered that using an empty logits processor with return scores.contiguous() can result in 60% faster generation throughput with Huggingface.
- The improvement was observed with FA2 when configured with dynamic=False and fullgraph=True; a sketch of the processor itself follows below.
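A hedged reconstruction of just the processor part of that trick, using the standard transformers LogitsProcessor API with a placeholder model; the FA2/torch.compile configuration that produced the reported speedup is not shown:

```python
import torch
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          LogitsProcessor, LogitsProcessorList)

class ContiguousLogitsProcessor(LogitsProcessor):
    def __call__(self, input_ids: torch.LongTensor,
                 scores: torch.FloatTensor) -> torch.FloatTensor:
        return scores.contiguous()  # otherwise a no-op

tok = AutoTokenizer.from_pretrained("gpt2")  # placeholder model
model = AutoModelForCausalLM.from_pretrained("gpt2")
out = model.generate(
    **tok("Hello", return_tensors="pt"),
    max_new_tokens=16,
    logits_processor=LogitsProcessorList([ContiguousLogitsProcessor()]),
)
print(tok.decode(out[0]))
```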
- Command A ties weights, becomes trend?: Members discussed that Cohere’s command A model uses tied weights between the embedding and LM head to reduce parameter count, a common technique for smaller transformer models.
- The discussion considered whether this approach is trending in larger models, where the parameter increase from untied weights is proportionally less significant.
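Weight tying itself is a one-line affair; a minimal PyTorch sketch of the idea discussed above (not Cohere's code):

```python
import torch.nn as nn

vocab_size, d_model = 32000, 1024
embed = nn.Embedding(vocab_size, d_model)             # weight: (vocab_size, d_model)
lm_head = nn.Linear(d_model, vocab_size, bias=False)  # weight: (vocab_size, d_model)
lm_head.weight = embed.weight                         # tie: one matrix serves both

# Both modules now share the same underlying storage.
assert lm_head.weight.data_ptr() == embed.weight.data_ptr()
```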
- VWN: Linear Attention Reimagined, Layer by Layer: Virtual Width Networks (VWN) were described as an implementation of linear attention, with updates occurring layer to layer rather than token to token.
- One member noted, VWN is pretty much doing linear attention, but instead of the state[:, :] += ... update happening from token to token, it’s happening from layer to layer (a generic linear-attention recurrence is sketched below for comparison).
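For comparison, a generic (non-VWN) unnormalized linear-attention recurrence looks like the following sketch; the per-layer variant attributed to VWN is not reproduced here:

```python
import torch

d, seq_len = 64, 10
q, k, v = (torch.randn(seq_len, d) for _ in range(3))
state = torch.zeros(d, d)
outputs = []
for t in range(seq_len):
    state += torch.outer(k[t], v[t])   # the "state[:, :] += ..." update, token by token
    outputs.append(q[t] @ state)       # read-out for token t
out = torch.stack(outputs)             # (seq_len, d)
```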
- Navigating VWN’s Matrix Maze: The discussion addressed the confusing matrix representations in Virtual Width Networks (VWN), specifically the sizing and density of matrices A and B.
- Concerns were raised about potential errors in the paper’s matrix notation, with members suggesting that code implementations might use clearer einsum notation.
Modular (Mojo 🔥) Discord
- Modular Showcases MMMAudio and Shimmer: Modular announced the upcoming Community Meeting on Nov 24 featuring MMMAudio, a creative-coding audio environment, and Shimmer, a cross-platform Mojo → OpenGL experiment.
- The meeting will also cover Modular updates including the 25.7 release and the Mojo 1.0 roadmap.
- Mojo Embraces NVidia’s NVFP4 Challenge: A member inquired about NVFP4 support in Mojo for an Nvidia/GPUMode competition focused on NVFP4 Batched GEMV, and was reassured that the MLIR included fp4.
- Another member pointed out that nvfp4 can be built using existing Mojo datatypes, referencing the dtype.mojo file on Github.
- Negative Indexing Debate in Mojo Deepens: The ongoing discussion about negative indexing and Int vs UInt continues, touching on performance implications and potential UB errors with slicing abilities.
- A core team member clarified that the goal is to standardize on Int for most APIs, remove support for negative indexing from low-level types like List, and enable bounds checks by default, although there was disagreement from the community.
- Poetry Plays Nice with Mojo: A member shared instructions on integrating Mojo with Poetry, detailing how to add a source and dependency within the pyproject.toml file.
- The configuration involves adding a modular source pointing to the nightly Python package repository and specifying the mojo dependency with pre-release enabled.
- ArcPointer Safety Questioned: A member raised a question about the safety of ArcPointer with regard to potential UB errors when calling a function with a == b, which they think should be caught as a UB error because p3 is literally invalidated by the extend call.
- The code involves operations like extending and dereferencing pointers within an ArcPointer[List[Int]], leading to a crash in the assert statement instead of a caught UB error.
DSPy Discord
- DSPy Gets Type Hinting: A member created dspy-intellisense, a VSCode extension offering improved type hinting for signatures and predictions in DSPy.
- This aims to streamline development and reduce errors by providing real-time feedback on code.
- MLflow Alternatives Spark Interest: A member voiced dissatisfaction with mlflow and sought recommendations for alternative inspection tools.
- Another member suggested arize phoenix for viewing traces, linking to the Arize Phoenix documentation.
- Taming the LLM Randomness Dragon: A member inquired about managing the non-deterministic nature of LLMs when observing varying results with gpt-oss-20b across multiple runs.
- Solutions included setting temperature to 0, adjusting max_tokens, and enforcing a strict output format, with a reminder that some randomness is normal even at low temperatures.
- DSPy-in-Production Channel Craving: A member inquired about a dedicated channel for the DSPy in production community, highlighting a gap in focused discussion.
- Another member acknowledged the absence and supported the idea, indicating growing interest in productionizing DSPy projects.
- Optimizer Structure Paper Shared: A member shared a paper on arXiv noting that its structure is ideal for an optimizer.
- No other details were given.
tinygrad (George Hotz) Discord
- Tinybox Meeting Set: A meeting related to tinybox was announced, with links provided to tinygrad.org/#tinybox and GitHub issue #1317.
- The meeting aims to discuss the roadmap and contributions to tinybox.
- Tinygrad’s Llama 1B burns Torch on CPU: A member reported that Llama1b running on Tinygrad outperformed Torch on CPU, achieving 6.06 tok/s compared to Torch’s 2.92 tok/s.
- The comparison was based on 10 runs, with Tinygrad completing in 16498.56 ms and Torch in 34247.07 ms.
- Torch Compile Benchmarking Considered: A member suggested that benchmarking against a torch.compile PyTorch implementation could provide a more relevant comparison.
- This was in the context of the observed performance differences between Tinygrad and Torch when running Llama 1B; a hedged sketch of such a comparison appears below.
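A hedged sketch of what that comparison could look like; the tiny module below is a stand-in for the Llama 1B forward pass, not the actual benchmark:

```python
import time
import torch

def bench(fn, *args, iters=10):
    fn(*args)  # warm up once so compilation cost is not counted
    start = time.perf_counter()
    for _ in range(iters):
        fn(*args)
    return (time.perf_counter() - start) / iters

model = torch.nn.Sequential(torch.nn.Linear(1024, 4096), torch.nn.GELU(),
                            torch.nn.Linear(4096, 1024))
compiled = torch.compile(model)
x = torch.randn(8, 1024)
print("eager:   ", bench(model, x))
print("compiled:", bench(compiled, x))
```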
- Kernel Import Cleanup: Members discussed fixing the from tinygrad.codegen.opt.kernel import Kernel imports in the extra/optimization files.
- The discussion centered on cleaning up and ensuring the correctness of kernel imports within the codebase.
- Examples/Extra Purge: Members suggested removing broken, unused examples/extra files that have remained untouched for an extended period.
- The aim is to reduce clutter and maintain the relevance of the examples provided within the repository.
aider (Paul Gauthier) Discord
- Gemini 3 Bugs Arise: A user reported experiencing the same errors with the new Gemini 3 model as before, and attached a report with the error.
- Another user dismissed the report as AI nonsense.
- Experimental Channel Finds New Discord Home: The experimental channel has been moved to a new Discord server.
- This move was in response to users wondering what happened to experimental.
Manus.im Discord Discord
- Manus User Hails Return to Glory: A long-time Manus.im user praised version 1.5 for its amazing status, noting improvements in task handling and the elimination of retard loops.
- The user thanked the Manus team for listening to complaints, indicating they might upgrade to the PRO version again.
- Skilled Developer Seeks Opportunity: A user inquired about opportunities for a skilled developer within the community.
- No additional information about the type of developer or their skills was given.
Windsurf Discord
- Gemini 3 Pro Blows into Windsurf!: Gemini 3 Pro is available on Windsurf, providing users with access to its advanced features.
- Details are in the official announcement; get started by downloading the editor.
- Gemini 3 Pro Bug Squashed!: A small issue with Gemini 3 Pro has been resolved, ensuring proper functionality.
- The team expressed apologies for the initial blip.
MLOps @Chipro Discord
- AI Governance and Control Webinar Coming Soon: A webinar on AI governance and control is scheduled for December 3rd at 10am PT, with registration available at https://bit.ly/3LPl7FO.
- The webinar will cover strategies for effectively managing AI implementations and discuss ethical considerations, compliance, and risk management in the context of AI control.
- Fostering AI Governance Discussion: The webinar aims to foster discussion around critical aspects of AI governance, including ethical considerations, compliance, and risk management.
- Participants are encouraged to share experiences on the evolving landscape of AI control and its impact on organizational strategies.
MCP Contributors (Official) Discord
- Image Uploaded Directly to Discord: A user posted an image directly to Discord, hosted on Discord’s Content Delivery Network (CDN) via a cdn.discordapp.com URL rather than an external image hosting service.
- Without further context, the image’s specific content and its relevance to the discussion remain unspecified.
The LLM Agents (Berkeley MOOC) Discord has no new messages. If this guild has been quiet for too long, let us know and we will remove it.
Discord: Detailed by-Channel summaries and links
LMArena ▷ #general (1404 messages🔥🔥🔥):
Gemini 3 Pro, Google AI Studio, LLM limitations, Antigravity IDE, LMArena Issues
- Gemini 3 Pro Hype and Performance: Users are excited about Gemini 3 Pro and its performance, with one saying Gemini 3 cooked with this. Honestly better than Claude by a wide margin compared to other models, especially for gameplay and functionality.
- Some users feel that Gemini 3 Pro is overrated, but Google had to respond quickly because of Elon’s grok 4.1, while others are waiting for it to be tested against real-world applications.
- Google AI Studio Usage and Limitations: Members suggested using Google AI Studio for Gemini 3, but it has limits on messages per day, with one user saying that google have limits in 50 messages per day.
- Members are also testing Gemini 3 Pro on Polymarket which they think is gonna be released this month.
- Gemini 3 Nerfed and User Concerns: Members speculated that Gemini 3 Pro may have been nerfed since release and can no longer generate long stories compared to 2.5, they claim that Google is a evil corpi and they knew it.
- Some users report the model is unable to identify items and has a hallucination rate of 88%.
- LMArena Technical Issues and Frustrations: Users report issues with LMArena, including frequent errors in coding mode and general unreliability, leading some to prefer Google AI Studio.
- One user noted that Editing files with Gemini 3 pro sucks on LMArena and suggested an iOS LMArena app to mitigate mobile issues.
- Advocating the use of AntiGravity IDE: Members are hyping up AntiGravity, an IDE that uses nano banana and is capable of doing different coding tasks.
- Members have created different things with AntiGravity and are still trying to figure out all the features of the new platform.
LMArena ▷ #announcements (4 messages):
Grok-4.1-thinking, Expert leaderboard, November AI Generation Contest - Code Arena, Image-to-Video Leaderboard, Text-to-Image Leaderboard
- Grok-4.1-thinking takes over the Text Arena Leaderboard: Grok-4.1-thinking now holds the #1 rank on the Text Arena leaderboard, followed by Grok-4.1 in the #2 rank.
- Code Arena contest open this November: The November AI Generation Contest is now open to celebrate the launch of Code Arena, in which you can win Discord Nitro and a special role.
- Participants must submit their entry by sharing the preview link in the designated channel and can refer to this example for guidance.
- Video & Image Leaderboard shuffles Top 5: Wan2.5-i2v-preview & Wan2.5-t2i-preview have recently achieved Top 5 positions on the Image-to-Video and Text-to-Image leaderboards.
- Gemini-3-pro Dominates Leaderboards: Gemini-3-pro has secured the #1 position on the Text, WebDev, and Vision leaderboards, with scores of 1501, 1487, and 1328 respectively.
Perplexity AI ▷ #general (1028 messages🔥🔥🔥):
Gemini 3 Pro, Antigravity IDE, Comet Android, Kimi Model
- Gemini 3 Pro launches (kind of): Gemini 3 was launched today, and members note that it is rolling out for everyone in the Gemini app, for Google AI Pro and Ultra subscribers in AI Mode in Search, for developers in the Gemini API in AI Studio, Google Antigravity, and the Gemini CLI, and for enterprises in Vertex AI and Gemini Enterprise.
- However, some users are being overloaded and kicked back to 2.5, and others are seeing it in AI Studio but not in the Gemini App. Overall member sentiment is that while useful, it’s not a crazy huge difference from 2.5.
- Gemini 3 Has Limited Availability: Gemini 3.0 Pro was launched, but is not available for all users yet, as reported on Twitter.
- Some users are having trouble accessing Gemini 3.0 Pro, as mentioned on Discord, others speculate Google is rigging benchmarks and that the model might be trained on benchmarks.
- Google Unveils Antigravity IDE: Google announced Antigravity, a new AI code editor. It was later shared that it is available for download at https://antigravity.google/download, and others confirmed it is available for Windows, Mac, and Linux.
- It has been reported that the AI will be integrated with the latest Gemini model. However, it has also been said it is essentially a VS Code clone. Others have noted it being referred to as the new AI IDE Coding agent.
- Airtel Gives Free Gemini To Subscribers: Airtel was reported to provide free Gemini access for subscribers, per discussion in the thread.
- Other users reported receiving free access to Gemini Pro for being a student, though most other users will have to wait.
- Comet Rolls Out For Android: Comet released an Android Version for early access, though it is currently waitlisted and has not been released.
- Members report that they were facing multiple errors in installation due to device incompatibility, even with android 11. Others speculate it might be a location difference
Perplexity AI ▷ #pplx-api (4 messages):
purrvv.me
- Purrvv.me Deletion Dilemma: Users on the channel are reporting that there is currently no way to delete accounts on the purrvv.me service.
- Attempts to delete accounts through support have been unsuccessful, with support staff confirming that account deletion is not possible.
- Purrvv.me Account Privacy: Concerns are rising among users regarding the inability to delete accounts on purrvv.me, raising potential privacy issues.
- Users are looking for alternative solutions to manage their data and ensure their information is removed from the platform, as the lack of a deletion option presents a challenge for data control.
BASI Jailbreaking ▷ #general (1056 messages🔥🔥🔥):
Grok's abrupt hardness, AI solving captchas, Builder.ai scandal, Grok for ransomware, Gemini 3 Pro
- Grok gets Hard suddenly: A member suggested the increased difficulty of Grok is due to the price of good moderation, viewing it as a way to judge the quality of the site.
- Another member joked about using ad blockers to avoid ads, prompting the first member to respond with may the rose forever alight upon your nose.
- Gemini Triumphs, Humans Stumble on Captchas: A member pointed out that Gemini has a success rate of over 50% in solving reCAPTCHAs, surpassing humans with bad eyesight, while speculating on capabilities of unpublished AI.
- Some users joked about their inability to fail captchas, and someone proposed outsourcing captcha solving to individuals in North America for 8 cents per captcha.
- “AI” Startup Exposed: Indian Devs Behind the Curtain: A member shared a summary of the Builder.ai scandal, where the company, marketed as an AI no-code platform, was revealed to have 700 Indian developers manually coding apps, resulting in inflated revenues and a US Federal subpoena.
- The company allegedly engaged in “roundtripping” sham deals with VerSe Innovation, reporting inflated revenues for 2023-2024, and the founder stepped down amid the fallout.
- Grok Goes Grim: Users Craft Ransomware with Ease: A user shared that they used Grok with a new jailbreak to write serious ransomware, and expressed surprise at Grok’s willingness to assist without questioning the task.
- Other members expressed interest in seeing the code, with one requesting code snippets of the payload functions or encryption algorithms; there was debate on whether to publish this code online given the nature of the task.
- Gemini 3 Pro drops: Prompt Hacking Frenzy Ensues: Members reported that Gemini 3 Pro is live on the AI Studio API, with model name gemini-pro-latest, and began jailbreaking it.
- One user claims to have already JB’d it, while others shared techniques; after initial successes, users noted Gemini 3 was getting nerfed, necessitating rework of jailbreaks.
BASI Jailbreaking ▷ #jailbreaking (683 messages🔥🔥🔥):
Indirect Prompt Injection, Grok 4.1 System Prompt Leak, Bypassing AI Security, Jailbreaking Gemini 3.0, Exploiting AI for Malicious Purposes
- Image Injection Inception: New Jailbreak Technique Emerges: Members discovered that sending images containing jailbreak prompts to AI models like Grok can bypass text-based security measures, with one user successfully using images of text to activate jailbreaks, suggesting that AI models trust images more than text.
- The user shared a series of screenshots demonstrating the technique, leading to discussion about the potential for more sophisticated image-based attacks, such as using QR codes or hidden metadata.
- Grok Gets Gooned: System Prompt Leak Sparks Controversy: The system prompt for Grok 4.1 was leaked, revealing a controversial line stating that “teenage” or “girl” does not necessarily imply underage, raising concerns about the potential for generating inappropriate content and leading to heated debate, with some users deciding to gatekeep their jailbreaks.
- Members debated whether the inclusion of such language was necessary or simply a reflection of the model’s training data, while others speculated about the motives behind xAI’s design choices and the implications for content moderation.
- AI Architects Arm Race: Bypassing Security Protocols: In a meta-discussion about the nature of AI safety, members debated the effectiveness of various jailbreaking techniques, with some arguing that “logic & reason are the only true jailbreaks” and that other tactics will eventually be patched, pointing to the inevitable arms race between AI developers and jailbreakers.
- Members also discussed novel methods for bypassing security protocols, such as using a CLI in XML or JSON format or framing prompts as cybersecurity research scenarios, highlighting the ongoing cat-and-mouse game between AI developers and those seeking to circumvent restrictions.
- Gemini Gets Ganked: Jailbreak Gold Rush Begins: Several members claimed to have successfully jailbroken Gemini 3.0, the latest AI model from Google, using modified system prompts and psychological tactics, but shared details sparingly, leading to a frantic search for working prompts and accusations of gatekeeping.
- One member shared a technique of using an image of a developer to trick the AI into divulging information about its inner workings, while others experimented with posing indirect questions and leveraging the AI’s reasoning abilities.
- AI Anarchy Unleashed: Hacking and Mayhem Ensue: A member shared a seemingly innocuous code snippet that, upon closer inspection, was revealed to be an information grabber, designed to steal users’ Social Security numbers, disguised as a random number generator.
- The ensuing discussion highlighted the potential for malicious actors to exploit jailbroken AIs for nefarious purposes, such as generating malware, phishing attacks, and other forms of cybercrime, raising ethical concerns about the responsible use of AI technology.
BASI Jailbreaking ▷ #redteaming (19 messages🔥):
AI Jailbreaking, ChatGPT Agent Mode, Suit vs Business Suit prompt, Banning from ChatGPT, Random Email for ChatGPT
- New Way of Jailbreaking AI: A member claims to use the newest way of jailbreaking AI by running AI inside of another AI.
- When asked whether this involved running another instance of itself, they affirmed they could control its system instruction.
- Agent Mode Offline: A member noted that ChatGPT agent mode is currently down for those with access.
- Another member replied lol, love the suit vs business suit prompt in reply.
- ChatGPT Jailbreak Bans: A member reported getting banned for jailbreaking ChatGPT and warned others to watch out.
- Another member recommended using simple login to get a random email for creating new accounts.
- Paid vs Free ChatGPT jailbreaking: A member stated they can only jailbreak paid version accounts of ChatGPT.
- After some back and forth they discovered the free version also works.
- AI Friend: A member stated my friend is a Ai and attached an image.
- Another member replied Hehe.
Cursor Community ▷ #general (1093 messages🔥🔥🔥):
Mac OS vs Linux for power users, Cursor support for multiple GitHub accounts, Composer free period ending, Gemini 3 Pro release and performance, Cloudflare outage
- Mac OS bashed as Linux dominates in power users: One member argued that Mac OS lacks crucial features compared to Linux, requiring the installation of numerous apps for basic functionality, citing the Finder app being an absolute pain to use, while another agreed that the stock experience is total trash.
- The first member prefers to manage files from within the app or from the terminal (using Ranger) rather than use the Finder.
- Composer free period concludes November 11: Members noted that the Composer free period ended on November 11th, with the “free” label disappearing from the model selector.
- Some users reported sporadic access to it during the week, while another never had access (possibly due to using Arch Linux).
- Parallel Agents in One Chat with Cursor 2.0: A member was having problems using 4x agents in work tree, and wanted to know how do I get them to work on individual things, but another member shared a forum post link to solve the problem.
- One member agreed that the current situation is still a pain and pretty bad tbh.
- Gemini 3 Pro is the best, says user: Gemini 3 Pro was released, and one member claimed that Gemini 3 is better than Claude, plus its fast.
- Other members claimed that it drains context quickly and over-reads files.
- Cloudflare succumbs to Global Outage: A global issue with Cloudflare caused widespread outages, affecting numerous websites and services - the Cloudflare status page reported issues.
- A member reported that a fire at work also contributed to their sites being down, while another suggested people need to stop using cloudflare. atp its unreliable.
Unsloth AI (Daniel Han) ▷ #general (222 messages🔥🔥):
TPU support, Gemma 3 270M, LLama CPP Speed, Activation functions, 4bit merged
- Gemma 3 gets roasted, Granite-4.0 wins best small model: One user exclaimed that they can’t see why anyone should use this model in comparison to Gemma 3 270M, while another stated Granite-4.0 is the best model for its size.
- Another member said Asking models with less than 4 billion parameters factual questions is a recipe for disaster.
- TPU support remains limited: A user inquired about using JAX and Unsloth to finetune small models on TPUs, but others stated that TPU support is limited, though HuggingFace is partnering with GCP to expand support.
- It was clarified that Unsloth only supports GPUs.
- llama.cpp speed impresses: One user moved from Ollama to llama.cpp and reported getting over 25 tokens a second on a Ryzen 8 series while using Unsloth models.
- They said that other models would slouch and that this opens the ability for loading up the ram and really doing 5-10 tokens realistically on epically large models!
- Discussion around activation functions: Members discussed the ReLU activation function, where one user stated that ReLU works because there are more choices and less data is being lost due to non-linearity.
- Another user stated that it doesn’t matter too much what activation function you actually pick, so you may as well use an efficient one.
- Forced 4bit merge warning explained: A user ran into an error when saving a 4bit merged model and asked if they should use merged_4bit_forced; a member responded that this is the way to do it if you trained with QAT.
- It was also explained that the warning exists because if you didn’t already save the LoRA or 16bit weights, you won’t be able to save them after saving 4bit, and it is recommended as a last step because it merges the LoRA into the 4bit model that is already in memory.
Unsloth AI (Daniel Han) ▷ #introduce-yourself (3 messages):
AI and Full Stack Development, LLM Integration
- AI Full Stack Dev Joins the Fray: A new member introduced themselves as an AI & full stack developer with expertise in Next, Node, Python, and Django.
- They specialize in integrating LLMs, OpenAI, and LangChain into full stack applications and are seeking connections for project collaborations.
- Welcoming a New LLM Enthusiast: A new member is an AI and full-stack developer.
- They are proficient in integrating LLMs and OpenAI into applications.
Unsloth AI (Daniel Han) ▷ #off-topic (275 messages🔥🔥):
SF vs NYC food, Breadclip classifier, LoRA training hacks, Gemini 3 code changes
- SF Food Feud: SF vs NYC: Discord users debated whether SF or NYC has better food, with one user claiming SF has 5 times more michelin restaurants per 100k people and 3 times more michelin restaurants by square mile compared to NYC.
- Another user countered that the most interesting meal they’ve ever had was in Las Vegas, and the NYC mass transit makes it 100 times better to get around.
- Breadclip Classifier Brainstorming Begins: A user sought advice on building a neural network to classify breadclips, highlighting the challenge of new types being frequently discovered.
- Another user suggested starting with a guide on making an image classifier using the MNIST dataset.
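A compact version of that suggested starting point, using standard PyTorch/torchvision on MNIST; swapping in a collected breadclip dataset would be the later step:

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader
from torchvision import datasets, transforms

train_ds = datasets.MNIST("data", train=True, download=True,
                          transform=transforms.ToTensor())
loader = DataLoader(train_ds, batch_size=128, shuffle=True)

model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 256), nn.ReLU(),
                      nn.Linear(256, 10))
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

for epoch in range(1):                 # one epoch is enough to sanity-check the setup
    for images, labels in loader:
        opt.zero_grad()
        loss = loss_fn(model(images), labels)
        loss.backward()
        opt.step()
    print("epoch", epoch, "loss", loss.item())
```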
- LoRA Training Tweaks and Hacks: Users discussed the nuances of LoRA training, clarifying that it involves freezing all layers except for the LoRA adapters.
- There were many discussions and opinions on techniques to get approximate gradients without large backward passes, and if merging can be done at every optimizer step.
- Gemini 3 Code Changes Critiqued: A user reported that Gemini 3 made major changes to their codebase, like completely removing features and lots of lines of code being shortened to one line.
- They observed that comments were being removed and the model was using semicolons in a PEP8 compliant codebase, recommending the codebase could suffer extream degration if not careful (arxiv.org/abs/2403.03507, arxiv.org/abs/2504.20437).
Unsloth AI (Daniel Han) ▷ #help (83 messages🔥🔥):
Unsloth online training, Llama CPP Deployment with Unraid/Docker, Voice AI Agent LLM Training, Unsloth and HuggingFace TRL, Text to Speech inference with Unsloth
- Unsloth Enables Online Training: Members discussed the potential for online training with Unsloth, aiming for an OAI-compatible API extended by a reward callback.
- It was noted that there’s nothing stopping users from implementing any kind of reward system with Unsloth and that the dataset will need to be dynamic and the vLLM server will need to use async mode.
- Deploy Llama CPP Endpoint with Unraid/Docker: A member sought guidance on deploying an endpoint via llama.cpp using Unraid/Docker for a coding VSCode C-line plugin.
- Another member suggested a Docker blog post as a potential solution, highlighting its OpenAI compatibility.
- Address Overfitting Issues in LLM Fine-Tuning with Unsloth: A user noted underwhelming results with Llama 3.1B Instruct 8bit, trained for a voice AI agent, and asked about PII redaction, model size, prompt usage, and LoRA effectiveness.
- Another user recommended starting with a smaller dataset for faster training to grasp the basics, referencing the LoRA hyperparameters guide for validation and overfitting issues.
- Decipher Unsloth’s Relationship with Hugging Face TRL: A user asked about the relationship between Unsloth and Hugging Face TRL, seeking clarification on whether Unsloth solely accelerates processes or if it is used independently of TRL.
- A member suggested reading the Unsloth documentation on GPT-OSS reinforcement learning and long-context GPT-OSS training for a better understanding.
- Run Text to Speech Models on Unsloth: A user inquired about running text-to-speech (TTS) models on Unsloth.
- Another member confirmed it’s possible and pointed to the Unsloth documentation on TTS fine-tuning, advising to skip the fine-tuning part for inference.
Unsloth AI (Daniel Han) ▷ #research (2 messages):
LLM Inference, Non-determinism, Determinism
- Non-Determinism Defeated in LLM Inference: A member shared a blog post about defeating non-determinism in LLM inference.
- Another member confirmed and shared a related YouTube video on the same topic.
- LLM Inference Determinism: The blog post and video discuss methods to ensure deterministic behavior in LLM inference, which is crucial for reliable applications.
- Techniques likely involve controlling factors like random seeds, hardware configurations, and numerical precision during the inference process.
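As a generic illustration of those factors (not the specific fix from the blog post or video), typical seed and determinism controls in a PyTorch-based stack look like this:

```python
import random

import numpy as np
import torch

def make_deterministic(seed: int = 0) -> None:
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)           # no-op when CUDA is unavailable
    torch.use_deterministic_algorithms(True)   # raise on nondeterministic kernels
```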
OpenRouter ▷ #app-showcase (6 messages):
Efficiency in current projects, YouTube video link
- Efficiency Efforts Underway: A member stated that current projects are not really efficient rn, but they are working on it to improve.
- They also posted a YouTube video link without further context.
- Another member working to improve efficiency as well: Another member commented that they were working on similar efficiency improvements.
OpenRouter ▷ #general (473 messages🔥🔥🔥):
editing images issues, Grok 4.1 Release, LiteAPI shady practices, Gemini 3 launch, Deepseek V3 0324 replacement
- Image Editing Glitches Trigger Annoyance: Users report that when editing messages with images, the images disappear after editing, causing frustration.
- This glitch affects user experience and highlights the need for seamless image handling in message editing.
- Grok 4.1 Arrives, Sparks Curiosity: Grok 4.1 is now available on web and app, with users referencing Elon’s post and wondering if the Sherlock models are based on Grok 4.1.
- Some users think Sherlock is actually Grok DASH. They want a Grok 4.1 lite or something
- LiteAPI raises Fraud Alarms: LiteAPI, an OpenRouter alternative promising 40% cheaper rates, is being called out as potentially fraudulent due to its sketchy website, generic privacy policy referencing Yaseen AI, and violation of provider Terms of Service.
- Community members suspect stolen keys, stolen cards, or resold API credits, with one user noting that LiteAPI doesn’t even have an option to deposit API credits nor do they list the model names to copy and paste.
- Gemini 3 Unveiled Amidst Cloudflare Chaos: The launch of Gemini 3 coincided with a major Cloudflare outage, causing widespread disruptions across the internet and hilarious takes, including the claim that Gemini 3 is SOO big and SOO good that CF couldn’t take it.
- Some users speculate that Google crashed Cloudflare so they could launch Gemini 3 without anyone looking.
- Questing for Deepseek V3 0324 Successor: Users are seeking a replacement for Deepseek V3 0324, noting its tendency to be too literal and repetitive with personalities or outfits.
- Suggested alternatives include Deepseek 3.1, 3.2, Kimi K2, and GLM 4.6, with a recommendation to customize cards/configs to de-emphasize those repetitive elements.
OpenRouter ▷ #new-models (14 messages🔥):
- No New Model News: No new model announcements were posted in this period.
- The message history consisted only of repeated bot announcements of the channel name, OpenRouter - New Models, with no substantive discussion to summarize.
OpenRouter ▷ #discussion (34 messages🔥):
Cloudflare Replicate acquisition, Grok-4 compared to GPT-5, Hallucination theories, Gemini 3 Preview on OpenRouter, Key Expiration
- Replicate Acquired, Cloudflare Next?: A member joked about Cloudflare acquiring Replicate and how unfortunate it would be if it were OpenRouter instead.
- This comment was made in jest but hints at the community’s thoughts on potential acquisitions or partnerships within the AI space.
- Grok-4 and GPT-5: A Convergence of EQ?: Some members compared xAI’s Grok-4 to GPT-5, noting similar improvements in emotional intelligence (EQ) and writing capabilities.
- They found it peculiar that both models released updates within a week of each other, focusing on similar creative benchmarks.
- Tackling AI Hallucinations: A member proposed that hallucinations in AI could be nearly eliminated by using “infinite compute” to power a swarm of internet-connected agents.
- Another member suggested hallucinations are due to translation errors, detailing their theory in this blog post.
- Gemini 3’s Reasoning Detail Veiled: The OpenRouter documentation for Gemini 3 Pro Preview has a critical detail about reasoning that’s hidden in an accordion.
- Members questioned whether preserving reasoning blocks is mandatory for tool calling only, as the documentation is vague and needs an overhaul.
- Key Expiration: Testing Ground: A user shared an image (image.png) related to key expiration, noting it is “useful for testing.”
- The context of testing refers to this setup and potential implications.
OpenAI ▷ #annnouncements (1 messages):
Atlas Browser, OpenAI Podcast
- Atlas Browser Reinvented: @BenGoodger and @Darinwf joined @AndrewMayne to explain why they built Atlas and how it rethinks what a browser can do.
- They discussed what it takes to reinvent a browser from the inside out.
- OpenAI Podcast airs on Atlas: The OpenAI Podcast is available on Spotify, Apple, and YouTube.
- Listen to the OpenAI Podcast to learn more.
OpenAI ▷ #ai-discussions (439 messages🔥🔥🔥):
Custom GPT efficiency measurement, Gemini 3 Pro vs GPT-5.1 comparison, Grok 4.1 capabilities, AGI and ASI discussion, Grok Imagine and video generation
- Custom GPT Efficiency Examined: Members discussed measuring a custom GPT’s efficiency compared to a base model, highlighting that custom GPTs help regularize instructions for specific tasks, streamlining prompt engineering.
- While custom GPTs offer convenience, concerns were raised about the varying degrees of support for thinking mode and the potential for low-effort custom GPTs to flood the market.
- Gemini 3 Pro Claims Victory Over GPT-5.1: A user reported that Gemini 3 Pro passed a suite of AI model tests successfully, highlighting its ability to generate one-shot React and SwiftUI code, and claimed it to be better than GPT-5.1
- Others noted that Gemini 3 Pro solves hand-drawn mazes first and is willing to provide song lyrics, where GPT-5.1 refuses, while also acknowledging that it sometimes makes uncalled-for changes to code during reviews.
- Grok 4.1’s Creative Writing Skills Steal the Show: Members lauded Grok 4.1 for its prowess in creative writing, emotional-related questions, and some outlier tasks, with some even preferring it over GPT-5.1 for specific applications like SwiftUI code.
- However, its programming capabilities were deemed hopeless, and its long-term memory was found to be inconsistent, leading to discussions on Grok’s storage usage and file recall abilities.
- AGI Fears Aired; Is ASI Inevitable?: Discussion shifted to the potential dangers of AGI and ASI, with members debating whether leading AI companies truly understand the risks associated with misaligned objectives and instrumental strategies.
- While some expressed optimism about AGI’s potential to revolutionize fields like medicine, others warned about the possibility of human extinction and the importance of considering the ethical implications of advanced AI development.
- Grok Imagine Enters the Video Generation Arena: Users shared videos generated with Grok Imagine, showcasing its lip-synced speech capabilities and free accessibility, leading to comparisons with other video generation models like Sora.
- Despite the excitement, some users criticized the audio quality and expressed concerns about the potential for low-effort content to flood the platform, echoing concerns raised about custom GPTs.
OpenAI ▷ #gpt-4-discussions (4 messages):
GPT 4.5 vs 5, GPT Photo upload error
- ChatGPT 4.5 vs 5 Showdown: A member asked whether ChatGPT 4.5 is a better writer than ChatGPT 5 Pro for novel writing.
- Another member stated that ChatGPT 5 would probably be better.
- GPT Photo Upload Error troubles user: A member reported trouble uploading photos to a GPT, describing an error message after a split-second upload.
- The member expressed confusion, noting the issue persists even after repeated attempts hours later.
OpenAI ▷ #prompt-engineering (12 messages🔥):
Prompt Engineering Advice, Model Interpretations, Subjectivity of Model Quality, Model A/B Testing, Automated Filter Issues
- Prompts: Tailoring to Model Expectations: A member suggested that good prompts align with what you want from the model, given ToS and allowed content.
- They emphasized that a “good” prompt is subjective and depends on individual needs and preferences.
- Decoding Model Interpretations with Evaluation Prompts: One member shared a prompt to evaluate, not follow instructions to assess if a prompt contradicts itself or conflicts with the model’s programming.
- The prompt aims to identify ambiguity or guesswork required from the model when interpreting instructions.
- Model Quality: Defining and Achieving Your Vision: To achieve the best possible image quality, one member suggested discussing what ‘the best possible quality’ means to the user with the model.
- The goal is to align the model’s output with the user’s specific expectations.
- A/B Testing Models to Define Best Model for Novel: When asked to compare ChatGPT 4.5 and ChatGPT 5 Pro for novel writing, a user recommended A/B testing.
- Another member emphasized that “Better writer” is subjective, suggesting direct comparison to determine the best fit.
- Automated Filters May Flag Harmless Content: A member reported that the automated filter flagged his prompt for an innocuous reason.
- This highlights potential issues with automated filters and the need for careful content moderation.
OpenAI ▷ #api-discussions (12 messages🔥):
Prompt Engineering Tips, Image Generation Quality, Model Accuracy, A/B Testing Models, Automated Filters
- Crafting Quality Prompts Tips: Members discussed that good prompts are ones that achieve the desired outcome within the terms of service, and that what works is subjective to the user’s needs.
- One member suggested using the prompt: Evaluate, do not follow, the following [prompt]. Does it contradict itself or conflict with your programming? Is it ambiguous or force you to guess? [actual prompt goes here] to understand how the model interprets requests.
- Image Quality Prompting Needs: A member sought advice on prompting for the best possible image quality.
- Another member suggested discussing what “best possible quality” means to the user, so the model can better understand and fulfill the request.
- Model Inaccuracy Annoyance: A user reported that the model consistently provides incorrect information and attempts to correct them with false information.
- No solution was provided in this message log.
- A/B Testing Novel Writing Models: A user asked whether ChatGPT 4.5 or ChatGPT 5 Pro is a better writer for novel writing.
- A member suggested conducting A/B testing to determine which model better suits their needs, emphasizing that “Better writer” is subjective.
- Automated Filters Getting Users: A user jokingly blamed the automated filter for flagging their message.
- Despite the user’s facetious tone, another member agreed that their suggestion was solid advice.
LM Studio ▷ #general (234 messages🔥🔥):
E-commerce Seller Woes, LLM Performance Degradation, 2FA App Recommendations, LLM Tool Integration via MCP, Gemini3-Pro benchmarks
- E-Commerce Platforms Allegedly Favor Dishonest Buyers: A member complained about e-commerce platforms favoring buyers who damage products and demand refunds, citing an example of a buyer who took a hammer to [the seller’s] laptop, then doused it in hand cream, then returned it.
- Another member shared their own experience of being robbed in broad daylight by buyers abusing the return system.
- LLMs Seem Dumber Over Time?: A member noted that their LLM seems to get dumber the longer they keep it running, which others suspect might be a bug or observation problem.
- It was suggested to start a new conversation or freshly start the model, and to create a real test case that can be reproduced.
- Navigating the 2FA App Minefield: Members discussed options for free and secure two-factor authentication (2FA) apps, with suggestions including Google Authenticator and Proton Pass.
- Some users are wary of using Google or Microsoft products, while others suggest writing your own authenticator app with ChatGPT’s assistance.
- LM Studio Embraces Model Context Protocol (MCP) for Tool Integration: Members discussed integrating LLMs with external tools via the Model Context Protocol (MCP), with a link to the LM Studio documentation.
- It was also mentioned that directly exposing REST APIs to LLMs might be confusing, and that MCP should be thought of as a UI for LLMs (a minimal tool-server sketch appears at the end of this section).
- Gemini3-Pro Enters the Benchmark Arena: Members noted the release of Gemini3-Pro and its immediate benchmarking, with one user asking for a comparison against Qwen3-Max.
- There was also a dig about how western companies be like “if we only compare to each other, no one will notice us falling behind”.
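
Following up on the MCP discussion above, here is a minimal sketch of exposing a single tool through the official MCP Python SDK’s FastMCP helper instead of a raw REST endpoint; the server name and tool are hypothetical, and this is illustrative rather than LM Studio-specific.

```python
# Minimal MCP tool-server sketch (hypothetical tool, not an LM Studio config).
# Assumes the official MCP Python SDK: pip install "mcp[cli]"
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("inventory-demo")  # server name an MCP client will surface to the model

@mcp.tool()
def lookup_price(sku: str) -> float:
    """Return the price for a product SKU (stubbed data for illustration)."""
    prices = {"A100": 19.99, "B200": 42.00}
    return prices.get(sku, 0.0)

if __name__ == "__main__":
    mcp.run()  # serves over stdio by default, which is how local clients attach
```

The point made in the channel carries over: the tool name, docstring, and typed signature become the “UI” the model sees, rather than a raw REST schema.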
LM Studio ▷ #hardware-discussion (226 messages🔥🔥):
Silent PC Building, GPU Mods, RAM Prices, eBay selling, UPS Issues
- Rubber Pads Silence PCs: Members confessed to having drawers full of rubber pads, foams and grommets for silent PC building needs and finding them surprisingly effective.
- One member joked about using mouse pads for head gaskets, but highlighted their effectiveness and ubiquitous free availability for case cooling.
- GPU DVI Port Gets Removed With Extreme Prejudice: A member shared a video of them aggressively removing a DVI port from a GPU, then later showed the modded card, claiming all mods were 💯 reversible.
- RAM Prices Skyrocket, Prompting Sales: Members reported a significant recent price increase in RAM, with one member selling 32GB of DDR5 6000 for $100 and another selling 128GB of DDR4 3600 for around $300.
- Another user claimed to have sold RAM for 3x its purchase price, describing current prices as really good.
- eBay Fees and Taxes Steal Profit: A member lamented the impact of eBay fees and taxes while listing an MI50 on eBay.
- They pointed out that after paying eBay and government taxes of around $100, their profit margin was significantly reduced.
- Courier Services Cause Consternation: Members discussed frustrations with various courier services, with one member describing Amazon Australia as basically Uber due to its contract drivers.
- Another member shared their preference for USPS over FedEx and UPS, while UK users cited Evri and Yodel as the most hated services.
Latent Space ▷ #ai-general-chat (178 messages🔥🔥):
Sourcegraph Ads, xAI Grok 4.1, Varun Mohan's Teaser, Poe Group Chat, Runlayer Funding
- Sourcegraph Adds Ad-Revenue Stream: Coding assistant startup Sourcegraph added ads to its free tier last month and is already earning an estimated $5–10 million annual recurring revenue, according to The Information.
- Grok 4.1 is xAI’s Newest Model: xAI launched Grok 4.1, claiming the top spot on LM Arena (1483 Elo) and boasting a 65% user preference in a silent 2-week test; it is now free on grok.com and mobile apps.
- Key highlights include a 1586 EQ-Bench score, 1722 Elo on Creative Writing v3, and 3× fewer hallucinations, according to their model card.
- Varun Mohan Teases a Surprise Launch: Varun Mohan posted a cryptic “👀” tweet with a video thumbnail, sparking rampant speculation about an imminent release (likely Gemini 3.0, Veo 4, or a Windsurf integration), according to his announcement.
- Poe Adds Global Group Chat: Poe rolls out global group-chat support for up to 200 people, letting teams summon any of its 200+ AIs (like Claude 4.5 and GPT-5.1) in one synced thread, as per their announcement.
- Runlayer Runs to $11M Funding: Runlayer, a platform providing enterprises with secure, governed access to 18,000+ MCP servers, announced an $11M seed round led by Khosla & Felicis, now live with customers like Gusto and Opendoor, according to Andy Berman’s announcement.
Nous Research AI ▷ #announcements (1 messages):
Atropos, RL Environments framework, Thinking Machines' Tinker training API
- Atropos supports Tinker training API: Nous Research’s Atropos, a RL Environments framework, now fully supports Thinking Machines’ Tinker training API.
- This integration allows users to train and test environments built on Atropos on a large variety of models with ease, as highlighted in their tweet.
- Tinker and Atropos, a Dynamic Duo: Atropos environments can now be trained using Thinking Machines’ Tinker training API, broadening the range of trainable models.
- For assistance, users are encouraged to visit the designated channel.
Nous Research AI ▷ #general (131 messages🔥🔥):
Amazon Nova Premier v1, Bedrock AWS issues, Jeff Bezos buff, Hermes 4 70B on dual 3090s, Factorio Agent project
- Amazon’s Nova Premier V1 Model Surfaces: A member shared a link to Amazon’s Nova Premier v1 on OpenRouter (from April 2025), questioning its novelty given Amazon’s history of forgettable models.
- Others suggested Jeff Bezos is working on his own AI initiative, separate from Amazon, possibly due to internal politics and issues with AWS Bedrock.
- Bedrock AWS faces criticism: A member recounted spending $3000 on AWS after struggling to terminate an instance, highlighting issues with hidden costs and support ticket charges.
- Another member shared a similar experience, stating they were fucked by egress charges while hosting a Minecraft modpack at 17.
- Gemini 3 Released then Pulled: Members discussed the release and subsequent removal of the Gemini 3 Pro Model Card, noting it briefly beat Sonnet 4.5 in benchmarks.
- One member pointed out that Anthropic, Microsoft, and Nvidia seemed to poopoo on Google’s launch, while another gained access to Sonnet through Google AntiGravity.
- Gemini 3 generates Real-Time Raytracer: A member exclaimed that Gemini 3 generated a single shot realtime raytracer and did not disappoint, sharing an example image.
- Another member noted how awesome it is that models can do this, saying that the implicit mathematical description of structures for raymarching is sometimes pretty mindbending for human programmers.
- Nous Research releases cheap RL training: A member shared a link to Nous Research’s tweet announcing a cheap way to test or train any RL environments in Atropos.
- Another member shared a video on financial traders utilizing agentic AI tools to make money, but quipped that they would rather take my 2 shekels to a gas station scratcher.
Nous Research AI ▷ #ask-about-llms (6 messages):
Uncensored MoEs, LoRAing a Josified model, Kimi 1T Jailbreak, Kimi Linear for housekeeping, Model Censorship Frustrations
- Quest for Uncensored MoEs Begins: A member is seeking suggestions for uncensored Mixture of Experts (MoE) models to bypass the limitations imposed by normie-proofing in current models.
- The goal is to develop tools for crime prevention and education, but censorship is hindering progress.
- Josiefied LoRA Models Considered: As an alternative to using uncensored MoEs, the member plans to try LoRAing a Josiefied model due to the need for models that can handle sensitive topics.
- The member is concerned about the lack of a general-knowledge dataset for uplifting an uncensored small model, as well as the uncertain outcomes of using Josiefied models and other uncensoring techniques.
- Kimi 1T Jailbreak Sought After: A member is looking for a good jailbreak for Kimi 1T, particularly for a smaller model suitable for specific use cases.
- Another member expressed interest in using Kimi Linear for general housekeeping and orchestration tasks, envisioning it as an on-device, on-prem assistance model.
- Kimi Linear Plans for Codegen: A member plans to fine-tune Kimi Linear for codegen and software engineering (SWE) tasks during the December holiday season.
- They intend to assemble a dataset specifically for this purpose.
- Frustration with Model Censorship Erupts: A member expressed frustration with model censorship, particularly when models refuse to perform simple tasks due to safety restrictions.
- Censorship prevents models from deleting a container, prompting a strong aversion to post-training alignment regimes.
Nous Research AI ▷ #interesting-links (2 messages):
AI Paper, NousResearch Github Repo
- DRTULU AI Paper Drops: A member shared a link to the DRTULU AI paper from allenai.org.
- Tinker-Atropos Github Repo Launched: A member shared a link to the Tinker-Atropos Github repo from NousResearch.
HuggingFace ▷ #general (109 messages🔥🔥):
Graph-RAG database, Mimir project, Lablab hackathons controversy, Cloudflare outage, Gemini 3
- Graph-RAG Database Breaks the Mold: A member announced the release of their graph-rag database, giving a big middle finger to Kilo Code and Pinecone by providing a user-controlled alternative with code intelligence.
- The project, designed to be open and customizable, is available on GitHub and already boasts 47 stars within days of its release.
- Mimir Project Orchestrates Open Source Nirvana: The Mimir project features multi-agent orchestration and push-button deployment, all MIT licensed, providing an alternative to vendor lock-in from solutions like N8N, Pinecone, and Kilocode.
- Compatible with OpenAI API and OpenWebUI, it includes orchestration pipelines for Minecraft servers with memory tools and semantic search, using llama.cpp for embeddings, although a custom ARM64 image compilation was necessary.
- Lablab Hackathons Trigger Scam Allegations: A user recounted a negative experience with Lablab hackathons, describing them as webdevs that basically scam builders and warning against potential exploitative practices.
- The user claimed to have been removed from the leaderboard after achieving the top score and cautioned others to avoid Lablab, especially if offers involve relocation to foreign countries on unfavorable terms.
- Cloudflare Suffers Global Outage: A widespread Cloudflare outage affected services in multiple cities, including Stockholm, Berlin, Warsaw, and Paris, impacting apps and various online functionalities.
- One user humorously claimed responsibility for the outage, stating they pushed a faulty authentication fix to production during their internship at Cloudflare, extending their weekend and stating tests passed locally.
- Gemini 3 has Anti-Gravity Properties: Enthusiastic members claimed Google’s Gemini 3 launched and it is solving my compiler design class work and giving correct output, while another posted a YouTube video questioning Google’s ability to deliver a flawless demo.
- One member sarcastically claimed Google just launched anti gravity due to the good results.
HuggingFace ▷ #i-made-this (4 messages):
Cornserve, FastMayavangelis, Grounded Video Caption Generation, Medical Reasoning GPT OSS 20B
- Cornserve serves Complex Models with Microservices: A serving framework called Cornserve was created to serve complex multimodal models by using a microservices approach to split complex models into independent components that automatically share common parts.
- See the homepage for further details.
- FastMayavangelis Generates Audio Quickly: A member created a repo, FastMayavangelis, that can generate nearly 50 seconds of audio in just a second by optimizing maya1.
- It can generate high-quality 48 kHz audio with sound effects and descriptions of a voice.
- Grounded Video Caption Generation Paper Accepted to ICCV 2025: A paper titled “Large-scale pre-training for grounded video caption generation” has been accepted to ICCV 2025.
- Links to the project webpage, code, model on 🤗, HowToGround1M dataset, and iGround dataset were provided.
- Medical Reasoning Model Open Sourced: A member fine-tuned OpenAI’s OSS 20B reasoning model using the most popular medical reasoning dataset and published the results on Hugging Face.
- The model can break down complex medical cases step-by-step, identify possible diagnoses in clinical scenarios, and answer board-exam-style questions with logical reasoning; it’s available here.
HuggingFace ▷ #computer-vision (1 messages):
grounded video caption generation, ICCV 2025, HowToGround1M, iGround dataset
- Grounded Video Caption Generation paper accepted to ICCV 2025: A new paper titled “Large-scale pre-training for grounded video caption generation” has been accepted to ICCV 2025
- Links to the project webpage, code and checkpoints, model, HowToGround1M pre-training dataset and iGround dataset were shared.
- HowToGround1M and iGround datasets released: The HowToGround1M pre-training dataset and the iGround dataset for fine-tuning and evaluation have been released.
- These datasets are associated with the “Large-scale pre-training for grounded video caption generation” paper.
HuggingFace ▷ #gradio-announcements (1 messages):
Gradio 6 Launch, Youtube Announcement, X Announcement
- Gradio 6 Launches Faster Than Ever: Gradio 6 is finally here and is faster, lighter, and more customizable than ever.
- The launch event was announced for Friday, Nov 21 at 9:00 am PT / 12:00 pm ET / 6:00 pm CET on YouTube and X, and will include a team walkthrough of what’s new and a Q&A.
- Gradio 6 Announcement on Social Media: The launch of Gradio 6 was announced on both YouTube and X to maximize visibility.
- The announcement featured a link to the YouTube video and indicated that the X link would be shared on the day of the event.
HuggingFace ▷ #smol-course (1 messages):
Jinja Code, instruct_tokenizer.apply_chat_template, HuggingFaceTB/SmolLM3-3B chat template, Bug Reporting, Pull Requests
- Jinja Bug Spotted in SmolLM3-3B Chat Template: A user encountered a potential bug in the HuggingFaceTB/SmolLM3-3B chat template’s Jinja code, specifically in the placement of the `<|im_end|>` token when using `instruct_tokenizer.apply_chat_template`.
- The user is seeking guidance on the appropriate repository for submitting a Pull Request (PR) or raising an issue to report this finding.
- Seeking Guidance on Bug Reporting Process: A member requested advice on how to report a suspected bug found while working with the `instruct_tokenizer.apply_chat_template` function.
- They are specifically looking for the correct repository to open a PR or submit an issue for the HuggingFaceTB/SmolLM3-3B chat template; a minimal way to inspect the rendered template is sketched at the end of this section.
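
For anyone trying to reproduce the report above, a minimal way to render the chat template as plain text and eyeball the `<|im_end|>` placement, assuming the standard `transformers` tokenizer API (the message contents are placeholders):

```python
# Render the SmolLM3-3B chat template as a string to inspect special-token placement.
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("HuggingFaceTB/SmolLM3-3B")

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello!"},
]

# tokenize=False returns the raw templated string, so the <|im_end|> positions are visible.
rendered = tok.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
print(rendered)
```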
Yannick Kilcher ▷ #general (36 messages🔥):
Understanding Machine Learning Book, More Technical ML Books, ReLU activation function, Gemini 3 Pro Model, Mathematical theory of Deep Learning
- Classic ML Book Gets Thumbs Up: Members discussed Understanding Machine Learning by Shai Shalev-Shwartz and Shai Ben-David.
- One member noted it’s a very nice introductory read but suggested pairing it with a more technical book on the side.
- Bach Book Best Bet for ML Theory: A member recommended the book Learning Theory from First Principles by Bach but noted that the proofs are dense without a stats background.
- Other recommended materials included Foundations of Machine Learning from Mohri, and the website deeplearningtheory.com.
- ReLU Revisited: Still Simple, Still Slaying: A member revisited ReLU, noting it felt way too simple, wondering if it could really work well, describing it as `max(0, wx+b)`.
- They pointed out that ReLU is computationally much simpler, provides non-linearity, and reduces the chance of vanishing gradients because its derivative is either 0 or 1 (a minimal sketch appears at the end of this section).
- Gemini 3 Pro: Marginally Magnificent or Massively More?: Members discussed the Gemini 3 Pro Model Card, with one member initially describing it as marginally better than GPT 5.1.
- Upon closer inspection, they revised their assessment, noting that the HLE, Video-MMMU, and ARC-AGI-2 results were pretty crazy, and speculated about the secrets behind Gemini 3’s performance, guessing data quality, better usage of params, RL.
- Deep Learning Math dives deep into arXiv: A member posted a link to the arXiv paper Mathematical theory of deep learning as well as a link to a Cambridge book of the same name.
- Another member responded, This one looks very interesting, thanks for sharing!
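
Picking up the ReLU point from earlier in this section, a minimal NumPy sketch of the activation and its piecewise-constant derivative (purely illustrative, not code from the discussion):

```python
import numpy as np

def relu(z: np.ndarray) -> np.ndarray:
    """ReLU activation: max(0, z), applied elementwise."""
    return np.maximum(0.0, z)

def relu_grad(z: np.ndarray) -> np.ndarray:
    """ReLU derivative: 1 where z > 0, else 0, so gradients pass through or are cut, never shrunk."""
    return (z > 0).astype(z.dtype)

# Toy pre-activation z = w*x + b for a single unit.
w, b = 0.8, -0.1
x = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
z = w * x + b
print(relu(z))       # negative pre-activations are clamped to 0
print(relu_grad(z))  # gradient is exactly 0 or 1
```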
Yannick Kilcher ▷ #paper-discussion (9 messages🔥):
Circuit Sparsity, Yannic Kilcher Transformer Videos, Youtube Search
- Sparsity Research Sparks Interest: Members expressed interest in the “Circuit Sparsity” paper from OpenAI and a related blog post.
- One member considered presenting the topic in the daily paper discussion, scheduling permitting.
- Transformers Deep Dive Request: A member requested keywords or specific videos by Yannic Kilcher to better understand the evolution of attention and transformer models.
- Another member shared a link to Yannic Kilcher’s video about Transformers and also suggested using YouTube’s search feature to find content within Yannic’s channel: YannicKilcher/search?query=transformer.
Yannick Kilcher ▷ #ml-news (26 messages🔥):
DS Star, WeatherNext-2, Grok-4, Project Prometheus, Gemini 3 Pro Model
- Bezos Back to the Trenches with “Project Prometheus”: Jeff Bezos is reportedly returning as co-CEO of a new AI startup called Project Prometheus.
- One member joked that Bezos is basically aping Elon with every move.
- Google’s Gemini 3 Pro Model Card Leaked!: A link to the Gemini 3 Pro Model Card was shared, then taken down, but not before a screenshot was captured and shared.
- The model seems to be available in AI Studio and Google’s VS Code fork, and is jumping to the top of many benchmarks.
- Google Launches DS Star Agent and WeatherNext-2: Google has announced the launch of a versatile data science agent, DS Star, and a new weather forecasting model, WeatherNext-2.
- The question remains whether Gemini’s performance gains will generalize to tasks outside of the benchmarks and to private and novel benchmarks, and to what degree.
- xAI Teases Grok-4: xAI has announced the upcoming release of Grok-4.
- One member made a joke that it would be pretty cool if level pass@32.
- ML Street Talk Video on ARC2: ML Street Talk recently released a video on ARC2 achieving 29.4%.
- One member joked that ML street talk would have to make another video.
Eleuther ▷ #general (31 messages🔥):
NeurIPS 2025, EleutherAI's Papers, Social events at NeurIPS 2025, Huggingface performance tips, ML Research guidance
- EleutherAI shines at NeurIPS 2025: EleutherAI is presenting several papers at NeurIPS 2025, including in the main track: The Common Pile v0.1, Explaining and Mitigating Cross-Linguistic Tokenizer Inequalities, and More of the Same: Persistent Representational Harms Under Increased Representation.
- They also have several papers accepted to workshops such as the MechInterp Workshop and AI4Music.
- NeurIPS 2025 Social Calendar Incoming: Members are planning an official dinner for NeurIPS 2025 and announced a reception link here.
- To join the social activities group, users can select the social activities role in the channels and roles tab.
- Huggingface gets throughput boost: A member shared that using an empty logits processor whose only body is `return scores.contiguous()` can yield 60% faster generation throughput when using Huggingface (a minimal sketch appears at the end of this section).
- This was tested with FA2 and `dynamic=False, fullgraph=True`.
- ML Research Newbies Get Tips: A new member inquired about starting in ML research, prompting others to ask “research in what subfield concretely? What concretely do you want to do/achieve?”.
- Another member suggested, Check pinned messages for more information.
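
Returning to the throughput tip above, a minimal sketch of the trick as described, assuming the standard `transformers` `LogitsProcessor` interface; whether the reported ~60% speedup reproduces will depend on the model and compile settings:

```python
# Pass-through logits processor that only returns a contiguous copy of the scores.
import torch
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          LogitsProcessor, LogitsProcessorList)

class ContiguousLogitsProcessor(LogitsProcessor):
    def __call__(self, input_ids: torch.LongTensor, scores: torch.FloatTensor) -> torch.FloatTensor:
        # No filtering or re-weighting; the only effect is making the scores tensor contiguous.
        return scores.contiguous()

model_name = "HuggingFaceTB/SmolLM3-3B"  # any causal LM works; this name is just for illustration
tok = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

inputs = tok("The quick brown fox", return_tensors="pt")
out = model.generate(
    **inputs,
    max_new_tokens=32,
    logits_processor=LogitsProcessorList([ContiguousLogitsProcessor()]),
)
print(tok.decode(out[0], skip_special_tokens=True))
```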
Eleuther ▷ #research (36 messages🔥):
Tied weights in Transformers, Cohere's command A model, Virtual Width Networks (VWN), Linear Attention, Residual Matrix Transformers
- Tying Transformer Weights: A Parameter-Saving Hack: Smaller transformer models often tie weights between the embedding and LM head to lower the parameter count, as discussed in the research channel.
- In contrast, larger models might not need to tie weights because the parameter increase is a smaller percentage of the overall model size.
- Cohere’s command A ties weights, sets trend?: Members mentioned that a recent large model using tied weights is Cohere’s command A.
- It remains unclear whether other large models are also adopting this strategy.
- VWN: Linear Attention Reimagined: Virtual Width Networks (VWN) are seen as an implementation of linear attention, but instead of updates happening token to token, they occur layer to layer, according to channel members.
- One member noted that VWN is pretty much doing linear attention, but instead of the `state[:, :] += ...` update happening from token to token, it’s happening from layer to layer (a generic linear-attention update is sketched at the end of this section).
- Navigating VWN’s Matrix Maze: The discussion delves into the confusing matrix representations in Virtual Width Networks (VWN), particularly the sizing and density of matrices A and B.
- Concerns were raised about potential errors in the paper’s matrix notation, with members suggesting that code implementations might use clearer einsum notation; one member said the only way they could rationalize A1 being n×(n+m) in that screenshot is if it gets split into an (n×n) block for propagating the residual state and an (n×m) block for the multi-headed query that gets passed to the attention there.
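
To ground the analogy from the VWN discussion above, here is a generic vanilla linear-attention state update in NumPy; this is the ordinary token-to-token recurrence being compared against, not the VWN formulation from the paper:

```python
import numpy as np

def linear_attention(q, k, v):
    """Causal linear attention via a running (d_k x d_v) state matrix.

    q, k: (seq_len, d_k); v: (seq_len, d_v). Feature maps and normalization
    are omitted so the "state += ..." recurrence stays visible.
    """
    seq_len, d_k = q.shape
    d_v = v.shape[1]
    state = np.zeros((d_k, d_v))       # accumulated key-value outer products
    out = np.zeros((seq_len, d_v))
    for t in range(seq_len):
        state += np.outer(k[t], v[t])  # the per-token "state[:, :] += ..." step
        out[t] = q[t] @ state          # read out against the current state
    return out

rng = np.random.default_rng(0)
q, k = rng.standard_normal((8, 4)), rng.standard_normal((8, 4))
v = rng.standard_normal((8, 3))
print(linear_attention(q, k, v).shape)  # (8, 3)
```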
Eleuther ▷ #interpretability-general (1 messages):
Paper review, Model review, Result analysis, Result discussion, Future research
- Paper Review Underway: A member is reviewing a paper, focusing on the introduction, methods, related appendix sections, and some results.
- They intend to further explore the results, discussion, future work, and appendix sections.
- Follow-up Review Scheduled: The member plans to continue reviewing the same paper, with a focus on outstanding sections.
- These sections include results, discussion, future work, and the appendix.
Modular (Mojo 🔥) ▷ #announcements (1 messages):
Modular updates, Mojo 1.0 roadmap, Community Meeting, MMMAudio, Shimmer
- Modular Hypes MMMAudio and Shimmer: Modular announced the upcoming Community Meeting on Nov 24 featuring MMMAudio, a creative-coding audio environment, and Shimmer, a cross-platform Mojo → OpenGL experiment.
- The meeting will also cover Modular updates including the 25.7 release and the Mojo 1.0 roadmap.
- Mojo 1.0 Roadmap Teased: The Modular Community Meeting on November 24th will provide insights into the Mojo 1.0 roadmap, alongside the 25.7 release updates.
- Attendees can expect details on the future direction of Mojo and upcoming features.
Modular (Mojo 🔥) ▷ #mojo (34 messages🔥):
NVFP4 support in Mojo, Negative indexing and Int vs UInt, Poetry integration
- Mojo Eyes Nvidia’s NVFP4 Challenge: A member inquired about NVFP4 support in Mojo for an Nvidia/GPUMode competition focused on NVFP4 Batched GEMV, specifically asking if it’s supported by mojo.
- A member pointed out that Mojo (MLIR) includes fp4 and that nvfp4 can be built using existing Mojo datatypes, referencing the dtype.mojo file on Github.
- Negative Indexing Debate Deepens within Mojo Community: The ongoing discussion about negative indexing and `Int` vs `UInt` continues, touching on performance implications and whether constructs such as `generic_list[-3:]`, or any of the other powerful slicing abilities Python traditionally allows, would become UB.
- A core team member clarified that the goal is to standardize on `Int` for most APIs, remove support for negative indexing from low-level types like `List`, and enable bounds checks by default; another member strongly disagreed, having written a lot of NumPy code and being used to negative indexing.
- Poetry Plays Nice with Mojo: A member shared instructions on integrating Mojo with Poetry, detailing how to add a source and dependency within the `pyproject.toml` file.
- The configuration involves adding a `modular` source pointing to the nightly Python package repository and specifying the `mojo` dependency with pre-release enabled.
- ArcPointer Safety Questioned: A member raised a question about the safety of `ArcPointer` with regard to potential UB when calling a function with `a == b`, which they think should be caught as a UB error because `p3` is literally invalidated by the `extend` call.
- The code involves operations like extending and dereferencing pointers within an `ArcPointer[List[Int]]`, leading to a crash in the assert statement instead of a caught UB error.
DSPy ▷ #papers (1 messages):
the_4th_doctor: The structure of this is ideal for an optimizer: https://arxiv.org/abs/2511.09030
DSPy ▷ #general (22 messages🔥):
VSCode Extension for DSPy, MLflow Alternatives, LLM Determinism, DSPy in Production Community, Tracing and Observability
- DSPy gets VSCode Extension for Type Hinting: A member announced the creation of dspy-intellisense, a VSCode extension designed to provide better type hinting for signatures and predictions in DSPy.
- Members look for MLflow Alternatives: One member asked if others were happy with their inspection tool (an mlflow alternative), expressing dissatisfaction with mlflow.
- In response, one member suggested using arize phoenix to view traces, providing a link to arize.com.
- Navigating LLM’s Non-Determinism Dilemma: A member asked about dealing with the non-deterministic nature of LLMs, particularly when seeing slightly different results on each run with the same dataset and prompt using gpt-oss-20b.
- One member suggested setting temperature to 0, adjusting max_tokens, and enforcing a strict output format to achieve more consistent results, noting that randomness at temperature 0.1 is normal (a minimal configuration sketch appears at the end of this section).
- Community craves DSPy-in-Production Channel: A member asked if there was a channel specifically for the DSPy in production community.
- Another member responded that there wasn’t one but suggested that there should be.
- Git is back, baby: A member announced that Git was back online.
- They also provided a link to a broader set of ideas.
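
On the non-determinism advice above, a minimal DSPy configuration sketch, assuming an OpenAI-compatible endpoint serving gpt-oss-20b (the model id, base URL, and signature are placeholders):

```python
import dspy

# Deterministic-leaning settings: temperature 0 and a bounded output length.
lm = dspy.LM(
    "openai/gpt-oss-20b",                 # placeholder model id for an OpenAI-compatible server
    api_base="http://localhost:8000/v1",  # placeholder endpoint
    api_key="EMPTY",
    temperature=0.0,
    max_tokens=512,
)
dspy.configure(lm=lm)

# A narrow signature keeps outputs in a single short field, which also reduces visible drift.
classify = dspy.Predict("text -> label")
print(classify(text="The battery died after two days.").label)
```

Even at temperature 0, some backends remain slightly non-deterministic, so an exact-match expectation across runs is usually too strict.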
tinygrad (George Hotz) ▷ #general (14 messages🔥):
tinybox, Llama 1B, torch compile, Kernel imports, examples/extra
- Meeting Announced for tinybox: A meeting has been announced with links to tinygrad.org/#tinybox and GitHub issue #1317.
- Llama 1B Faster Than Torch on CPU?: A member reported Llama 1B on tinygrad outperforming Torch on CPU: tinygrad, 10 runs: 16498.56 ms, 6.06 tok/s; Torch, 10 runs: 34247.07 ms, 2.92 tok/s.
- Benchmarking Against Torch Compile: A member considered benchmarking against a `torch.compile` PyTorch implementation for a more relevant comparison (a generic timing sketch appears after this list).
- Kernel Imports Clean Up: Members questioned whether the `from tinygrad.codegen.opt.kernel import Kernel` imports should be fixed in the `extra/optimization` files.
- Cleaning up Broken Examples: Members suggested that broken, unused `examples`/`extra` files that haven’t been touched in a while should probably just be deleted.
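
For reference, a generic way to time an eager PyTorch model against its `torch.compile` counterpart on CPU; the toy model below is a placeholder, not the Llama 1B setup from the discussion:

```python
import time
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(1024, 4096), nn.GELU(), nn.Linear(4096, 1024)).eval()
compiled = torch.compile(model, dynamic=False, fullgraph=True)
x = torch.randn(8, 1024)

def bench(fn, runs: int = 10) -> float:
    """Average ms per forward pass after one warm-up call (which triggers compilation)."""
    with torch.no_grad():
        fn(x)
        start = time.perf_counter()
        for _ in range(runs):
            fn(x)
    return (time.perf_counter() - start) * 1000 / runs

print(f"eager:    {bench(model):.2f} ms/run")
print(f"compiled: {bench(compiled):.2f} ms/run")
```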
aider (Paul Gauthier) ▷ #general (6 messages):
Gemini 3 errors, Experimental Channel Move
- Gemini 3 Reportedly Hits the Same Errors: A user reported experiencing the same errors with the new Gemini 3 model as before, and attached a report with the error.
- Another user dismissed the report as AI nonsense, wondering what happened to the experimental channel.
- Experimental Channel Heads to New Discord Home: The experimental channel has been moved to a new Discord server.
- This move was in response to users wondering what happened to experimental.
Manus.im Discord ▷ #general (3 messages):
Manus.im, Manus 1.5, Skilled Developer
- Manus User Hails Return to Glory: A long-time Manus.im user praised version 1.5 for its amazing status, noting improvements in task handling and the elimination of retard loops.
- Expressing satisfaction, the user thanked the Manus team for listening to complaints, indicating they might upgrade to the PRO version again.
- Skilled Developer Seeks Opportunity: A user inquired about opportunities for a skilled developer within the community.
- No additional information about the type of developer or their skills was given.
Windsurf ▷ #announcements (2 messages):
Gemini 3 Pro, Windsurf, Download editor
- Gemini 3 Pro Sails onto Windsurf!: Gemini 3 Pro is now available on Windsurf, offering users access to its capabilities.
- Check out the announcement on X for more details and download the latest version to get started.
- Gemini 3 Glitch Fixed!: A small issue with Gemini 3 has been resolved, ensuring it now functions correctly.
- The team apologized for any inconvenience caused by the initial blip.
MLOps @Chipro ▷ #events (1 messages):
AI Governance, AI Control, AI Webinar
- AI Governance and Control Webinar Announced: A member announced a webinar on AI governance and control scheduled for December 3rd at 10am PT; interested individuals can register via this link.
- The webinar aims to provide insights and strategies for effectively managing and controlling AI implementations.
- AI Governance Discussion: The webinar is intended to foster discussion around the critical aspects of AI governance, including ethical considerations, compliance, and risk management.
- Participants are encouraged to share their experiences and perspectives on the evolving landscape of AI control and its impact on organizational strategies.
MCP Contributors (Official) ▷ #general (1 messages):
Image Analysis, Attachments, Discord CDN
- Image Attachment in Discord Chat: A user shared an image in the chat, hosted on Discord’s CDN via the URL cdn.discordapp.com.
- The image’s content and its relevance to the ongoing discussion remain unspecified.
- Discord Image Hosting: The attached image is hosted on Discord’s Content Delivery Network (CDN).
- This indicates a direct upload to Discord rather than an external image hosting service.