Person: "alexalbert__"

claude claude-code anthropic slack workflow-integration asynchronous-collaboration software-development team-collaboration productivity-tools beta-release _catwu alexalbert__

Anthropic launched Claude Tag, a Slack-native integration enabling asynchronous, teamwide delegation to Claude, positioning it as a "multiplayer, async, and proactive" workflow layer distinct from the solo, synchronous Claude Code. Internally, Claude Tag has been used to write and merge 65% of the product team's code and PRs. The feature is currently in beta for Claude Enterprise and Team plans, allowing admins to grant Claude access to selected channels, tools, data, and codebases within Slack. Product lead Cat Wu highlighted its flexibility with "100s of ways" to customize workflows, framing it as a team management tool rather than a simple AI assistant.

May 06

Anthropic-SpaceXai's 300MW/$5B/yr deal for Colossus I, ARR growth is 8000% annualized

claude claude-code opus colossus-1 anthropic spacex x-ai compute rate-limiting agent-platforms inference api managed-agents safety governance event nottombrown _aidan_clark_ kipperrii theamolavasare alexalbert__

Anthropic announced a new SpaceX compute partnership to significantly increase capacity for Claude products, doubling Claude Code's 5-hour rate limits for Pro, Max, Team, and Enterprise users, removing peak-hour limit reductions, and substantially increasing API rate limits for Opus models. The deal grants Anthropic access to Colossus 1 via SpaceXAI, with Claude inference expected to ramp up on Colossus soon. Anthropic also hosted a "Code with Claude" event featuring updates on Claude Code, GitHub-scale usage, and managed agents. Discussions highlighted compute bottlenecks, user reactions to limit changes, debates on managed-agent features, and ongoing safety/governance discourse around AGI trustworthiness.

Apr 17

not much happened today

claude-opus-4.7 gemini-3.1-pro gpt-5.4 claude-code codex anthropic openai agentic-ai model-benchmarking adaptive-reasoning cost-efficiency computer-use prototyping-tools code-generation model-performance software-integration claudeai yuchenj_uw kimmonismus skirano therundownai arena artificialanlys victortaelin emollick alexalbert__ theo scaling01 reach_vb kr0der hamelhusain mattrickard matvelloso gdb

Anthropic launched Claude Design, a prototyping tool powered by Claude Opus 4.7, targeting design workflows and competing with Figma and others. Benchmarks show Opus 4.7 leading in coding and text tasks, with improved efficiency and adaptive reasoning, though early user feedback noted some regressions and stability issues. Discussions highlighted its cost-efficiency and agentic capabilities compared to Gemini 3.1 Pro and GPT-5.4. Meanwhile, OpenAI's Codex updates introduced advanced computer-use features enabling fast, agentic control of desktop apps and enterprise software, signaling progress toward practical AGI-like agents.

Mar 13

not much happened today

opus-4.6 glm-5 anthropic ibm perplexity-ai llamaindex deepseek google-chrome persistent-memory agent-infrastructure cross-device-synchronization long-context sparse-attention inference-optimization computer-architecture task-completion systems-performance pamelafox tadasayy llama_index bromann dair_ai omarsar0 abxxai teknuim bcherny kimmonismus _catwu alexalbert__ realyushibai

MCP tools remain relevant for deterministic APIs despite ergonomic criticisms, with new web MCP support in Chrome v146 enabling continuous browsing agents. Persistent memory is emerging as a key differentiator for agents, with IBM improving task completion rates and multi-agent memory framed as a computer architecture challenge. Agent UX is evolving towards always-on, cross-device operation, exemplified by Perplexity Computer on iOS and Claude Code session management. Anthropic released Opus 4.6 1M context as default with no extra long-context API charges, achieving 78.3% on MRCR v2 at 1M tokens. Sparse attention optimizations like IndexCache in DeepSeek Sparse Attention yield significant speedups on large models with minimal code changes.

Feb 17

Claude Sonnet 4.6: clean upgrade of 4.5, mostly better with some caveats

claude-3-sonnet-4.6 claude-3-sonnet-4.5 claude-3-opus-4.5 claude-3-opus-4.6 anthropic cursor microsoft perplexity-ai cognition long-context agent-planning knowledge-work benchmarking tokenization model-integration code-execution model-updates aesthetic-quality alexalbert__ scaling01 rishdotblog claudeai kimmonismus artificialanlys

Anthropic launched Claude Sonnet 4.6, an upgrade over Sonnet 4.5, featuring broad improvements in coding, long-context reasoning, agent planning, knowledge work, and design, plus a 1M-token context window (beta). Benchmarks show Sonnet 4.6 leading GDPval-AA with an Elo of 1633, alongside significantly higher token usage and improved output aesthetics. Integrations include Cursor, Windsurf, Microsoft Foundry, and Perplexity Pro/Max. Early user feedback noted some regressions that were later fixed. Pricing remains the same as Sonnet 4.5. Tooling enhancements include code execution for filtering results, improving accuracy and efficiency.

Dec 02, 2025

Mistral 3: Mistral Large 3 + Ministral 3B/8B/14B open weights models

mistral-large-3 ministral-3 clara-7b-instruct gen-4.5 claude-code mistral-ai anthropic apple runway moondream sparse-moe multimodality benchmarking open-source model-licensing model-performance long-context inference-optimization instruction-following local-inference code-generation model-integration anjney_midha _akhaliq alexalbert__ _catwu mikeyk

Mistral has launched the Mistral 3 family including Ministral 3 models (3B/8B/14B) and Mistral Large 3, a sparse MoE model with 675B total parameters and 256k context window, all under an Apache 2.0 open license. Early benchmarks rank Mistral Large 3 at #6 among open models with strong coding performance. The launch includes broad ecosystem support such as vLLM, llama.cpp, Ollama, and LM Studio integrations. Meanwhile, Anthropic acquired the open-source Bun runtime to accelerate Claude Code, which reportedly reached a $1B run-rate in ~6 months. Anthropic also announced discounted Claude plans for nonprofits and shared insights on AI's impact on work internally.

Nov 26, 2025

not much happened today

claude-opus-4.5 qwen-3-4b qwen-3-8b qwen-3-14b deepseek-r1 anthropic booking.com perplexity-ai langchain claude scaling01 deepseek qwen prefect agent-systems multi-agent-systems reasoning benchmarking cost-efficiency model-optimization long-context memory-management reinforcement-learning model-performance multi-agent-communication latent-representation inference-cost software-integration jeremyphoward alexalbert__ omarsar0 lingyang_pu dair_ai

Anthropic introduces durable agents and MCP tasks for long-running workflows, with practical engineering patterns and integrations like Prefect. Booking.com deploys a large-scale agent system improving customer satisfaction using LangGraph, Kubernetes, GPT-4 Mini, and Weaviate. Perplexity rolls out user-level memory and virtual try-on features. Claude Opus 4.5 leads on LisanBench and Code Arena WebDev benchmarks with mixed community feedback on its "thinking" and "non-thinking" modes, while improving cost-efficiency and UX with batch APIs and context compaction. Research on multi-agent systems shows LatentMAS reduces communication tokens by 70-84% and improves accuracy using Qwen3 models, and reasoning trace distillation achieves significant token reduction with maintained accuracy, highlighting the importance of reasoning trace style.

Nov 24, 2025

Claude Opus 4.5: 3rd new SOTA coding model in past week, 1/3 the price of Opus

claude-opus-4.5 gemini-3-pro gpt-5.1-codex-max opus-4.1 sonnet-4.5 anthropic amazon google anthropic coding agents tool-use token-efficiency benchmarking api model-pricing model-performance effort-control context-compaction programmatic-tool-calling alexalbert__ btibor91 scaling01 klieret

Anthropic launched Claude Opus 4.5, a new flagship model excelling in coding, agents, and tooling, with a 3x price cut compared to Opus 4.1 and improved token efficiency, using 76% fewer output tokens. Opus 4.5 set a new SOTA on SWE-bench Verified with 80.9% accuracy, breaking the 80% barrier and surpassing previous models like Gemini 3 Pro and GPT-5.1-Codex-Max. The update includes advanced API features such as effort control, context compaction, and programmatic tool calling, improving tool accuracy and reducing token usage. Claude Code is now bundled with Claude Desktop, and new integrations like Claude for Chrome and Excel are rolling out. Benchmarks also show strong performance on ARC-AGI-2 and BrowseComp-Plus.

Nov 14, 2025

not much happened today

gpt-5.1 sonnet-4.5 opus-4.1 gemini-3 openai anthropic langchain-ai google-deepmind adaptive-reasoning developer-tools prompt-optimization json-schema agent-workflows context-engineering structured-outputs model-release benchmarking swyx allisontam_ gdb sama alexalbert__ simonw omarsar0 abacaj scaling01 amandaaskell

OpenAI launched GPT-5.1 featuring "adaptive reasoning" and developer-focused API improvements, including prompt caching and a reasoning_effort toggle for latency/cost tradeoffs. Independent analysis shows a minor intelligence bump with significant gains in agentic coding benchmarks. Anthropic introduced structured outputs with JSON schema compliance in public beta for Claude Sonnet 4.5 and Opus 4.1, enhancing tooling and code execution workflows. Rumors of an Opus 4.5 release were debunked. LangChain released a "Deep Agents" package and a context-engineering playbook to optimize agent workflows. The community is eagerly anticipating Google DeepMind's Gemini 3 model, hinted at on social media and at the upcoming AIE CODE events, for which "tickets are sold out, but side events and volunteering opportunities are available."
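As a rough illustration of what "JSON schema compliance" saves a caller from doing, the sketch below hand-checks a model's raw text output against a tiny schema. It covers only required keys and primitive types, uses the Python standard library alone, and calls no Anthropic or OpenAI API. `SCHEMA` and `conforms` are hypothetical names invented for this example.

```python
import json

# Hypothetical schema for illustration; not taken from any vendor documentation.
SCHEMA = {
    "type": "object",
    "required": ["title", "priority"],
    "properties": {
        "title": {"type": "string"},
        "priority": {"type": "integer"},
    },
}

# Maps the JSON Schema type names used above to Python types.
_TYPES = {"string": str, "integer": int, "number": (int, float), "boolean": bool}


def conforms(raw: str, schema: dict) -> bool:
    """Return True if raw parses as a JSON object matching the minimal schema."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return False
    if not isinstance(data, dict):
        return False
    # Every required key must be present.
    if any(key not in data for key in schema.get("required", [])):
        return False
    for key, rule in schema.get("properties", {}).items():
        if key in data:
            value = data[key]
            # bool is a subclass of int in Python, so reject it for non-boolean fields.
            if isinstance(value, bool) and rule["type"] != "boolean":
                return False
            if not isinstance(value, _TYPES[rule["type"]]):
                return False
    return True
```

Without guaranteed schema compliance, a check like this (plus a retry loop) typically sits between the model call and any downstream code that consumes the output.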

Oct 23, 2025

not much happened today

gemini-1.5-pro claude-3 chatgpt langchain meta-ai-fair hugging-face openrouter google-ai microsoft openai anthropic agent-ops observability multi-turn-evaluation reinforcement-learning distributed-training api model-stability user-intent-clustering software-development project-management code-generation hwchase17 ankush_gola11 whinthorn koylanai _lewtun bhutanisanyam1 thom_wolf danielhanchen cline canvrno pashmerepat mustafasuleyman yusuf_i_mehdi jordirib1 fidjissimo bradlightcap mikeyk alexalbert__

LangSmith launched the Insights Agent with multi-turn evaluation for agent ops and observability, improving failure detection and user intent clustering. Meta PyTorch and Hugging Face introduced OpenEnv, a Gymnasium-style API and hub for reproducible agentic environments supporting distributed training. Discussions highlighted the importance of provider fidelity in agent coding, with OpenRouter's exacto filter improving stability. Builder UX updates include Google AI Studio's Annotation mode for Gemini code changes, Microsoft's Copilot Mode enhancements in Edge, and OpenAI's Shared Projects and Company Knowledge features for ChatGPT Business. Claude added project-scoped Memory. In reinforcement learning, Meta's ScaleRL proposes a methodology to predict RL scaling outcomes for LLMs with improved efficiency and stability.

Oct 17, 2025

The Karpathy-Dwarkesh Interview delays AGI timelines

claude-haiku-4.5 gpt-5 arch-router-1.5b anthropic openai huggingface langchain llamaindex google epoch-ai reasoning long-context sampling benchmarking data-quality agent-frameworks modular-workflows ide-extensions model-routing graph-first-agents real-world-grounding karpathy aakaran31 du_yilun giffmana omarsar0 jeremyphoward claude_code mikeyk alexalbert__ clementdelangue jerryjliu0

The recent AI news highlights the Karpathy interview as a major event, alongside significant discussions on reasoning improvements without reinforcement learning, with test-time sampling achieving GRPO-level performance. Critiques on context window marketing reveal effective limits near 64K tokens, with Claude Haiku 4.5 showing competitive reasoning speed. GPT-5 struggles with advanced math benchmarks, and data quality issues termed "Brain Rot" affect model reasoning and safety. In agent frameworks, Anthropic Skills enable modular coding workflows, OpenAI Codex IDE extensions enhance developer productivity, and HuggingChat Omni introduces meta-routing across 100+ open models using Arch-Router-1.5B. LangChain and LlamaIndex advance graph-first agent infrastructure, while Google Gemini integrates with Google Maps for real-world grounding.

Oct 16, 2025

Claude Agent Skills - glorified AGENTS.md? or MCP killer?

claude-4.5-haiku claude chatgpt huggingchat-omni anthropic openai microsoft perplexity-ai huggingface groq cerebras togethercompute agent-skills document-processing long-context reasoning multi-model-routing memory-management voice vision simonwillison alexalbert__ mustafasuleyman yusuf_i_mehdi aravsrinivas

Anthropic achieves a rare feat with back-to-back AI news headlines featuring Claude's new Skills—a novel way to build specialized agents using Markdown files, scripts, and metadata to handle tasks like creating and reading PDFs, Docs, and PPTs. Simon Willison calls this a "bigger deal than MCP," predicting a "Cambrian explosion in Skills." Meanwhile, Anthropic launches Claude 4.5 Haiku with strong reasoning and long-context capabilities, priced competitively. Other updates include OpenAI's ChatGPT memory management improvements, Windows 11 Copilot voice and vision features, and HuggingChat Omni routing across 115 open-source models from 15 providers. These developments highlight advances in agent skills, document processing, long-context reasoning, and multi-model routing.
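To make the "Markdown files, scripts, and metadata" description concrete, here is a minimal sketch of how a loader might separate a skill's metadata from its instructions. It assumes the Markdown file opens with a block of simple `key: value` lines delimited by `---`; that layout, the `parse_skill` helper, and the `EXAMPLE` file are assumptions made for illustration, not a description of Anthropic's implementation.

```python
def parse_skill(text: str) -> tuple[dict[str, str], str]:
    """Split a skill file into (metadata, instructions).

    Assumes the file opens with a block delimited by '---' lines that
    holds simple 'key: value' pairs; anything else is treated as body.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, text
    meta: dict[str, str] = {}
    for i, line in enumerate(lines[1:], start=1):
        if line.strip() == "---":
            # Everything after the closing delimiter is the instruction body.
            body = "\n".join(lines[i + 1:]).strip()
            return meta, body
        key, sep, value = line.partition(":")
        if sep:
            meta[key.strip()] = value.strip()
    # No closing delimiter found: treat the whole file as instructions.
    return {}, text


# Hypothetical skill file used only to exercise the parser.
EXAMPLE = """---
name: pdf-reader
description: Extract text from PDF files
---
Run the bundled script on the file the user provides.
"""
```

Keeping the metadata short and separate from the body is what would let an agent scan many skills cheaply and load the full instructions only for the one it needs.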

May 28, 2025

not much happened today

deepseek-r1-0528 pali-gemma-2 gemma-3 shieldgemma-2 txgemma gemma-3-qat gemma-3n-preview medgemma dolphingemma signgemma claude-4 opus-4 claude-sonnet-4 codestral-embed bagel qwen nemotron-cortexa gemini-2.5-pro deepseek-ai huggingface gemma claude bytedance qwen nemotron sakana-ai-labs benchmarking model-releases multimodality code-generation model-performance long-context reinforcement-learning model-optimization open-source yuchenj_uw _akhaliq clementdelangue osanseviero alexalbert__ guillaumelample theturingpost lmarena_ai epochairesearch scaling01 nrehiew_ ctnzr

The DeepSeek R1 v2 model was released, with availability on Hugging Face and through inference partners. The Gemma model family continues its prolific development, including PaliGemma 2, Gemma 3, and others. Claude 4 and its variants, Opus 4 and Claude Sonnet 4, show top benchmark performance, including a new SOTA on ARC-AGI-2 and WebDev Arena. Codestral Embed introduces a 3072-dimensional code embedder. BAGEL, an open-source multimodal model by ByteDance, supports reading, reasoning, drawing, and editing with long mixed contexts. Benchmarking highlights include Nemotron-CORTEXA topping SWEBench and Gemini 2.5 Pro performing on VideoGameBench. Discussions on the effectiveness of random rewards focus on Qwen models. Community excitement is reflected in quotes such as "Opus 4 NEW SOTA ON ARC-AGI-2. It's happening - I was right" and "Claude 4 launch has dev moving at a different pace".

May 23, 2025

not much happened today

claude-4 claude-4-opus claude-4-sonnet gemini-2.5-pro gemma-3n imagen-4-ultra anthropic google-deepmind openai codebase-understanding coding agentic-performance multimodality text-to-speech video-generation model-integration benchmarking memory-optimization cline amanrsanger ryanpgreenblatt johnschulman2 alexalbert__ nearcyan mickeyxfriedman jeremyphoward gneubig teortaxesTex scaling01 artificialanlys philschmid

Anthropic's Claude 4 models (Opus 4, Sonnet 4) demonstrate strong coding abilities, with Sonnet 4 achieving 72.7% on SWE-bench and Opus 4 at 72.5%. Claude Sonnet 4 excels in codebase understanding and is considered SOTA on large codebases. Criticism arose over Anthropic's handling of ASL-3 security requirements. Demand for Claude 4 is high, with integration into IDEs and support from Cherry Studio and FastHTML. Google DeepMind introduced Gemini 2.5 Pro Deep Think and Gemma 3n, a mobile multimodal model reducing RAM usage by nearly 3x. Google's Imagen 4 Ultra ranks third in the Artificial Analysis Image Arena, available on Vertex AI Studio. Google also promoted Google Beam, an AI video model for immersive 3D experiences, and new text-to-speech models with multi-speaker support. The GAIA benchmark shows Claude 4 Opus and Sonnet leading in agentic performance.

May 02, 2025

not much happened today

qwen3-14b qwen3-32b qwen3-235b phi-4-reasoning o3-mini command-a gemini-2.5-pro o4-mini olm-o2-1b o3 alibaba together-ai scaling01 microsoft deepseek cohere google epoch-ai-research inception-labs openai allenai quantization fine-tuning reinforcement-learning benchmarking video-generation diffusion-models model-performance model-evaluation model-release text-generation cline _philschmid iscienceluvr alexalbert__ _lewtun teortaxestex sarahookr reach_vb

The Qwen team released quantized versions of the Qwen3 models, including the 14B, 32B, and 235B sizes, with promising coding capabilities in Qwen3-235B. Microsoft launched Phi-4-reasoning, a 14B parameter model distilled from OpenAI's o3-mini that emphasizes supervised fine-tuning and reinforcement learning and outperforms larger models on some benchmarks. Cohere's Command A leads SQL performance on Bird Bench. Google introduced the TRAJAN eval for temporal consistency in video generation and updated the Gemini OpenAI compatibility layer. Inception Labs launched a diffusion LLM API claiming 5x speed improvements over autoregressive models. Community rankings show OpenAI's o3 model debuting strongly in web app-building tasks. Other releases include AllenAI's OLMo2 1B and additional Phi 4 variants. "Qwen3-235B shows promise for coding" and "Phi-4-reasoning tech report emphasizes SFT gains" highlight key advancements.

Feb 27, 2025

lots of small launches

gpt-4o claude-3.7-sonnet claude-3.7 claude-3.5-sonnet deepseek-r1 deepseek-v3 grok-3 openai anthropic amazon cloudflare perplexity-ai deepseek-ai togethercompute elevenlabs elicitorg inceptionailabs mistral-ai voice model-releases cuda gpu-optimization inference open-source api model-performance token-efficiency context-windows cuda jit-compilation lmarena_ai alexalbert__ aravsrinivas reach_vb

GPT-4o Advanced Voice Preview is now available for free ChatGPT users with enhanced daily limits for Plus and Pro users. Claude 3.7 Sonnet has achieved the top rank in WebDev Arena with improved token efficiency. DeepSeek-R1 with 671B parameters benefits from the Together Inference platform optimizing NVIDIA Blackwell GPU usage, alongside the open-source DeepGEMM CUDA library delivering up to 2.7x speedups on Hopper GPUs. Perplexity launched a new Voice Mode and a Deep Research API. The upcoming Grok 3 API will support a 1M token context window. Several companies including Elicit, Amazon, Anthropic, Cloudflare, FLORA, Elevenlabs, and Inception Labs announced new funding rounds, product launches, and model releases.

Dec 31, 2024

not much happened to end the year

deepseek-v3 code-llm o1 sonnet-3.5 deepseek smol-ai reinforcement-learning reasoning training-data mixed-precision-training open-source multimodality software-development natural-language-processing interpretability developer-tools real-time-applications search sdk-generation corbtt tom_doerr cognitivecompai alexalbert__ theturingpost svpino bindureddy

Reinforcement Fine-Tuning (RFT) is introduced as a data-efficient method to improve reasoning in LLMs using minimal training data with strategies like First-Correct Solutions (FCS) and Greedily Diverse Solutions (GDS). DeepSeek-V3, a 671B parameter MoE language model trained on 14.8 trillion tokens with FP8 mixed precision training, highlights advances in large-scale models and open-source LLMs. Predictions for AI in 2025 include growth in smaller models, multimodality, and challenges in open-source AI. The impact of AI on software development jobs suggests a need for higher intelligence and specialization as AI automates low-skilled tasks. Enhancements to CodeLLM improve coding assistance with features like in-place editing and streaming responses. Natural Language Reinforcement Learning (NLRL) offers better interpretability and richer feedback for AI planning and critique. AI hiring is growing rapidly with startups seeking strong engineers in ML and systems. New AI-powered tools such as Rivet, Buzee, and Konfig improve real-time applications, search, and SDK generation using technologies like Rust and V8 isolates.

Nov 01, 2024

The AI Search Wars Have Begun — SearchGPT, Gemini Grounding, and more

gpt-4o o1-preview claude-3.5-sonnet universal-2 openai google gemini nyt perplexity-ai glean nvidia langchain langgraph weights-biases cohere weaviate fine-tuning synthetic-data distillation hallucinations benchmarking speech-to-text robotics neural-networks ai-agents sam-altman alexalbert__ _jasonwei svpino drjimfan virattt

ChatGPT launched its search functionality across all platforms using a fine-tuned version of GPT-4o with synthetic data generation and distillation from o1-preview. This feature includes a Chrome extension promoted by Sam Altman but has issues with hallucinations. The launch coincides with Gemini introducing Search Grounding after delays. Notably, The New York Times is not a partner due to a lawsuit against OpenAI. The AI search competition intensifies with consumer and B2B players like Perplexity and Glean. Additionally, Claude 3.5 Sonnet achieved a new benchmark record on SWE-bench Verified, and a new hallucination evaluation benchmark, SimpleQA, was introduced. Other highlights include the Universal-2 speech-to-text model with 660M parameters and HOVER, a neural whole-body controller for humanoid robots trained in NVIDIA Isaac simulation. AI hedge fund teams using LangChain and LangGraph were also showcased. The news is sponsored by the RAG++ course featuring experts from Weights & Biases, Cohere, and Weaviate.

Jul 17, 2024

SciCode: HumanEval gets a STEM PhD upgrade

gpt-4 claude-3.5-sonnet llama-3-7b llama-3 dolphin-2.9.3-yi-1.5-34b-32k-gguf anthropic hugging-face nvidia benchmarks coding model-training gpu-optimization model-performance synthetic-data compiler-optimization zero-shot-learning yi-tay rohanpaul_ai alexalbert__ tri_dao abacaj

PhD-level benchmarks highlight the difficulty of coding scientific problems for LLMs, with GPT-4 and Claude 3.5 Sonnet scoring under 5% on the new SciCode benchmark. Anthropic doubled the max output token limit for Claude 3.5 Sonnet to 8192 tokens. The Q-GaLore method enables training LLaMA-7B on a single 16GB GPU. The Mosaic compiler now generates efficient code for NVIDIA H100 GPUs. The Dolphin 2.9.3-Yi-1.5-34B-32k-GGUF model on Hugging Face has over 111k downloads. Llama 3 shows strong performance, achieving 90% zero-shot accuracy on the MATH dataset. Discussions continue on the limitations and forms of synthetic data for model training.

May 24, 2024

Ten Commandments for Deploying Fine-Tuned Models

claude-3-opus claude-3 gpt-4o anthropic google openai fine-tuning prompt-engineering model-evaluation feature-alteration benchmarking model-performance open-source-models kyle-corbitt bindureddy alexalbert__

Gemini-in-Google-Slides is highlighted as a useful tool for summarizing presentations. Kyle Corbitt's talk on deploying fine-tuned models in production emphasizes avoiding fine-tuning unless necessary, focusing on prompting, data quality, appropriate model choice, and thorough evaluation. Anthropic showcased feature alteration in Claude AI, demonstrating control over model behavior and increased understanding of large language models. Open-source models are approaching the performance of closed-source models like GPT-4o on benchmarks like MMLU for simple tasks, though advanced models remain necessary for complex automation.

May 17, 2024

Chameleon: Meta's (unreleased) GPT4o-like Omnimodal Model

chameleon gpt-4o gemini-1.5-flash claude-3 meta-ai-fair openai google-deepmind anthropic reddit multimodality early-fusion benchmarking model-training tokenization streaming tool-use vision coding hallucination-detection model-performance armen-aghajanyan sama alexandr-wang abacaj alexalbert__

Meta AI FAIR introduced Chameleon, a new multimodal model family with 7B and 34B parameter versions trained on 10T tokens of interleaved text and image data, enabling "early fusion" multimodality that can natively output any modality. While its reasoning benchmarks are modest, its "omnimodality" approach competes well with pre-GPT4o multimodal models. OpenAI launched GPT-4o, a model excelling in benchmarks like MMLU and coding tasks, with strong multimodal capabilities but some regression in ELO scores and hallucination issues. Google DeepMind announced Gemini 1.5 Flash, a small, fast model with a 1M context window, highlighting convergence trends between OpenAI and Google models. Anthropic updated Claude 3 with streaming support, forced tool use, and vision tool integration for multimodal knowledge extraction. OpenAI also partnered with Reddit, drawing industry attention.

May 11, 2024

Quis promptum ipso promptiet?

llama-3-70b llama-3-120b llama-3 llama-cpp anthropic openai zoominfo neuralink prompt-engineering chain-of-thought rag quantization cuda-graphs gpu-optimization thought-controlled-devices modeling-consciousness conference sama gdb bindureddy svpino rohanpaul_ai alexalbert__ abacaj

Anthropic released upgrades to their Workbench Console, introducing new prompt engineering features like chain-of-thought reasoning and prompt generators that significantly reduce development time, exemplified by their customer Zoominfo. OpenAI teased a "magic" new development coming soon, speculated to be a new LLM replacing GPT-3.5 in the free tier or a search competitor. The open-source community highlighted Llama 3 70B as "game changing" with new quantized weights for Llama 3 120B and CUDA graph support for llama.cpp improving GPU performance. Neuralink demonstrated a thought-controlled mouse, sparking interest in modeling consciousness from brain signals. The ICLR 2024 conference is being held in Asia for the first time, generating excitement.