Topic: "model-efficiency"

    DeepSeek-OCR finds vision models can decode 10x more efficiently with ~97% accuracy of text-only, 33/200k pages/day/A100
    Claude Haiku 4.5
    not much happened today
    Softbank, NVIDIA and US Govt take 2%, 5% and 10% of Intel, will develop Intel x86 RTX SOCs for consumer & datacenters
    not much happened today
    OpenAI updates Codex, VSCode Extension that can sync tasks with Codex Cloud
    not much happened today
    Qwen-Image: SOTA text rendering + 4o-imagegen-level Editing Open Weights MMDiT
    not much happened today
    not much happened today
    DeepSeek #1 on US App Store, Nvidia stock tanks -17%
    Moondream 2025.1.9: Structured Text, Enhanced OCR, Gaze Detection in a 2B Model
    Genesis: Generative Physics Engine for Robotics (o1-mini version)
    Meta BLT: Tokenizer-free, Byte-level LLM
    Perplexity starts Shopping for you
    BitNet was a lie?
    Tencent's Hunyuan-Large claims to beat DeepSeek-V2 and Llama3-405B with LESS Data
    Contextual Document Embeddings: `cde-small-v1`
    Liquid Foundation Models: A New Transformers alternative + AINews Pod 2
    Summer of Code AI: $1.6b raised, 1 usable product
    not much happened today
    Gemma 2 tops /r/LocalLlama vibe check
    Claude Crushes Code - 92% HumanEval and Claude.ai Artifacts
    HippoRAG: First, do know(ledge) Graph
    5 small news items
    Skyfall
    DBRX: Best open model (just not most efficient)
    RIP Latent Diffusion, Hello Hourglass Diffusion