Person: "maximelabonne"

    The Ultra-Scale Playbook: Training LLMs on GPU Clusters
    LLaDA: Large Language Diffusion Models
    not much happened today
    Apple Intelligence Beta + Segment Anything Model 2
    Cursor reaches >1000 tok/s finetuning Llama3-70b for fast file editing
    DeepSeek-V2 beats Mixtral 8x22B with >160 experts at HALF the cost