Model: "opus"
not much happened this weekend
o3 o1 opus sonnet octave openai langchain hume x-ai amd nvidia meta-ai-fair hugging-face inference-time-scaling model-ensembles small-models voice-cloning fine-math-dataset llm-agent-framework benchmarking software-stack large-concept-models latent-space-reasoning mechanistic-interpretability planning speech-language-models lisa-su clementdelangue philschmid neelnanda5
The o3 model drew significant attention, with discussion of its capabilities and implications, including an OpenAI board member referencing "AGI." LangChain released its State of AI 2024 survey. Hume announced OCTAVE, a 3B-parameter, API-only speech-language model with voice cloning. x.ai secured a $6B Series C funding round. Discussions highlighted inference-time scaling, model ensembles, and the surprising generalization ability of small models. New tools and datasets include FineMath, billed as the best open math dataset on Hugging Face, and frameworks for LLM agents. Industry updates cover a 5-month benchmark of the AMD MI300X against the Nvidia H100 and H200, insights from a meeting with Lisa Su on AMD's software stack, and open AI engineering roles. Research innovations include Large Concept Models (LCM) from Meta AI, Chain of Continuous Thought (Coconut) for latent space reasoning, and mechanistic interpretability initiatives.
Google wakes up: Gemini 2.0 et al
gemini-2.0-flash gemini-1.5-pro gemini-exp-1206 claude-3.5-sonnet opus google-deepmind openai apple multimodality agent-development multilinguality benchmarking model-releases demis-hassabis sundar-pichai paige-bailey bindureddy
Google DeepMind launched Gemini 2.0 Flash, a new multimodal model outperforming Gemini 1.5 Pro and o1-preview, featuring vision and voice APIs, multilingual capabilities, and native tool use. It powers new AI agents like Project Astra and Project Mariner, with Project Mariner achieving state-of-the-art 83.5% on the WebVoyager benchmark. OpenAI announced ChatGPT integration with Apple devices, enabling Siri access and visual intelligence features. Claude 3.5 Sonnet is noted as a distilled version of Opus. The AI community's response at NeurIPS 2024 has been overwhelmingly positive, signaling a strong comeback for Google in AI innovation. Key topics include multimodality, agent development, multilinguality, benchmarking, and model releases.
Ways to use Anthropic's Tool Use GA
claude-3-opus haiku opus convnext anthropic amazon google tool-use function-calling agentic-ai streaming vision parallelization delegation debate specialization open-science superintelligence convolutional-networks self-attention ai-research yann-lecun alex-albert sainingxie
Anthropic launched general availability of tool use (function calling) with support for streaming, forced use, and vision, available on its own API as well as through Amazon and Google. Alex Albert shared five architectures for agentic tool use: delegation, parallelization, debate, specialization, and tool suite experts. Anthropic also introduced a self-guided course on tool use. Yann LeCun emphasized ethical open science funding, the gradual emergence of superintelligence with safety guardrails, and convolutional networks for image and video processing as competitive with vision transformers. He also noted growth in the number of AI researchers across industry, academia, and government.
Lilian Weng on Video Diffusion
wizardlm-2 llama-3 reka-core devin opus sora openai adobe reka-ai diffusion-models video-generation training-free-adaptation multimodality intuition creativity analogy-recognition self-improving-ai model-recognition agi-timelines model-performance startup-competition lilian-weng sam-altman geoffrey-hinton yann-lecun
OpenAI expands with a launch in Japan, introduces a Batch API, and partners with Adobe to bring the Sora video model to Premiere Pro. Reka AI releases the Reka Core multimodal language model. WizardLM-2 is released with impressive performance, and Llama 3 news is anticipated soon. Geoffrey Hinton highlights AI models exhibiting intuition, creativity, and analogy recognition beyond human ability. The Devin AI model notably contributes to its own codebase. Opus demonstrates the ability to recognize its own generated outputs. Sam Altman warns startups that they risk being steamrolled by OpenAI if they don't adapt quickly. Yann LeCun discusses AGI timelines, emphasizing that AGI is inevitable but not imminent and will not come solely from LLMs. Lilian Weng's blog post on diffusion models for video generation highlights training-free adaptation as a breakthrough technique.