Topic: "datasets"

    There's Ilya!
    $100k to predict LMSYS human preferences in a Kaggle contest
    FineWeb: 15T Tokens, 12 years of CommonCrawl (deduped and filtered, you're welcome)