Topic: "knowledge-distillation"

    Reasoning Models are Near-Superhuman Coders (OpenAI IOI, Nvidia Kernels)
    Nvidia Minitron: LLM Pruning and Distillation updated for Llama 3.1
    Gemma 2 2B + Scope + Shield
    Gemma 2: The Open Model for Everyone
    GPT-4o: the new SOTA-EVERYTHING Frontier model (GPT4T version)