Topic: "multimodality"

    not much happened today
    not much happened today
    not much happened today
    not much happened today
    OpenAI buys Jony Ive's io for $6.5b, LMArena lands $100m seed from a16z
    not much happened today
    ChatGPT Codex, OpenAI's first cloud SWE agent
    not much happened today
    Prime Intellect's INTELLECT-2 and PRIME-RL advance distributed reinforcement learning
    Gemini 2.5 Pro Preview 05-06 (I/O edition) - the SOTA vision+coding model
    gpt-image-1 - ChatGPT's imagegen model, confusingly NOT 4o, now available in API
    not much happened today
    not much happened today; New email provider for AINews
    Gemini 2.5 Flash completes the total domination of the Pareto Frontier
    OpenAI o3, o4-mini, and Codex CLI
    not much happened today
    Google's Agent2Agent Protocol (A2A)
    DeepCoder: A Fully Open-Source 14B Coder at O3-mini Level
    Llama 4's Controversial Weekend Release
    not much happened today
    >$41B raised today (OpenAI @ 300b, Cursor @ 9.5b, Etched @ 1.5b)
    OpenAI adopts MCP
    Gemini 2.5 Pro + 4o Native Image Gen
    Every 7 Months: The Moore's Law for Agent Autonomy
    not much happened today
    Cohere's Command A claims #3 open model spot (after DeepSeek and Gemma)
    Gemma 3 beats DeepSeek V3 in Elo, 2.0 Flash beats GPT4o with Native Image Gen
    The new OpenAI Agents Platform
    DeepSeek's Open Source Stack
    not much happened today
    not much happened today
    GPT 4.5 — Chonky Orion ships!
    not much happened today
    The Ultra-Scale Playbook: Training LLMs on GPU Clusters
    X.ai Grok 3 and Mira Murati's Thinking Machines
    LLaDA: Large Language Diffusion Models
    Gemini 2.0 Flash GA, with new Flash Lite, 2.0 Pro, and Flash Thinking
    not much happened today
    DeepSeek #1 on US App Store, Nvidia stock tanks -17%
    OpenAI launches Operator, its first Agent
    Bespoke-Stratos + Sky-T1: The Vicuna+Alpaca moment for reasoning
    small little news items
    Moondream 2025.1.9: Structured Text, Enhanced OCR, Gaze Detection in a 2B Model
    not much happened to end the year
    not much happened today
    OpenAI Voice Mode Can See Now - After Gemini Does
    Meta BLT: Tokenizer-free, Byte-level LLM
    Google wakes up: Gemini 2.0 et al
    $200 ChatGPT Pro and o1-full/pro, with vision, without API, and mixed reviews
    not much happened today
    Olympus has dropped (aka, Amazon Nova Micro|Lite|Pro|Premier|Canvas|Reel)
    not much happened to end the week
    Vision Everywhere: Apple AIMv2 and Jina CLIP v2
    Pixtral Large (124B) beats Llama 3.2 90B with updated Mistral Large 24.11
    Common Corpus: 2T Open Tokens with Provenance
    Tencent's Hunyuan-Large claims to beat DeepSeek-V2 and Llama3-405B with LESS Data
    OpenAI beats Anthropic to releasing Speculative Decoding
    not much happened today
    not much happened today
    not much happened today
    DeepSeek Janus and Meta SpiRit-LM: Decoupled Image and Expressive Voice Omnimodality
    not much happened today
    State of AI 2024
    The AI Nobel Prize
    not much happened this weekend
    Liquid Foundation Models: A New Transformers alternative + AINews Pod 2
    not much happened today
    not much happened today
    Llama 3.2: On-device 1B/3B, and Multimodal 11B/90B (with AI2 Molmo kicker)
    not much happened today
    o1 destroys Lmsys Arena, Qwen 2.5, Kyutai Moshi release
    a quiet weekend
    Pixtral 12B: Mistral beats Llama to Multimodality
    not much happened today
    Ideogram 2 + Berkeley Function Calling Leaderboard V2
    not much happened today
    Gemini Live
    not much happened today
    How Carlini Uses AI
    Mini, Nemo, Turbo, Lite - Smol models go brrr (GPT4o version)
    Nothing much happened today
    Problems with MMLU-Pro
    Gemma 2: The Open Model for Everyone
    Is this... OpenQ*?
    Hybrid SSM/Transformers > Pure SSMs/Pure Transformers
    The Last Hurrah of Stable Diffusion?
    Francois Chollet launches $1m ARC Prize
    Mamba-2: State Space Duality
    1 TRILLION token context, real time, on device?
    ALL of AI Engineering in One Place
    Skyfall
    Chameleon: Meta's (unreleased) GPT4o-like Omnimodal Model
    Cursor reaches >1000 tok/s finetuning Llama3-70b for fast file editing
    Not much happened today
    Google I/O in 60 seconds
    GPT-4o: the new SOTA-EVERYTHING Frontier model (GPT4T version)
    GPT-4o: the new SOTA-EVERYTHING Frontier model (GPT4O version)
    OpenAI's PR Campaign?
    DeepSeek-V2 beats Mixtral 8x22B with >160 experts at HALF the cost
    Evals: The Next Generation
    Perplexity, the newest AI unicorn
    Lilian Weng on Video Diffusion
    Multi-modal, Multi-Aspect, Multi-Form-Factor AI
    Music's Dall-E moment
    not much happened today
    World_sim.exe
    MM1: Apple's first Large Multimodal Model
    DeepMind SIMA: one AI, 9 games, 600 tasks, vision+language ONLY
    Not much happened today
    Stable Diffusion 3 — Rombach & Esser did it again!
    Claude 3 just destroyed GPT 4 (see for yourself)
    The Era of 1-bit LLMs
    Sora pushes SOTA
    CodeLLama 70B beats GPT4 on HumanEval
    Adept Fuyu-Heavy: Multimodal model for Agents
    Google Solves Text to Video
    1/16/2024: ArtificialAnalysis - a new model/host benchmark site
    12/28/2023: Smol Talk updates
    12/26/2023: not much happened today
    12/25/2023: Nous Hermes 2 Yi 34B for Christmas
    12/20/2023: Project Obsidian - Multimodal Mistral 7B from Nous
    Is Google's Gemini... legit?