Company: "llama"

    not much happened today
    DeepSeek R1: o1-level open weights model and a simple recipe for upgrading 1.5B models to Sonnet/4o level
    ModernBert: small new Retriever/Classifier workhorse, 8k context, 2T tokens
    not much happened today
    o1: OpenAI's new general reasoning models
    Problems with MMLU-Pro
    Cursor reaches >1000 tok/s finetuning Llama3-70b for fast file editing