Topic: "visual-autoregressive-modeling"

Mixture of Depths: Dynamically allocating compute in transformer-based language models