Topic: "autoregressive-objectives"
Vision Everywhere: Apple AIMv2 and Jina CLIP v2
aimv2-3b jina-clip-v2 tulu-3 llama-3-1 claude-3-5 llama-3-1-70b apple jina allen_ai autoregressive-objectives vision multilinguality multimodality image-generation model-training model-optimization reinforcement-learning fine-tuning model-benchmarking
Apple released AIMv2, a vision encoder pre-trained with autoregressive objectives that combine visual and textual targets; it reaches 89.5% accuracy on ImageNet. Jina launched Jina CLIP v2, a multimodal embedding model that supports 89 languages and high-resolution images, with Matryoshka embeddings that can be truncated to cut dimensionality by 94% with minimal accuracy loss. Allen AI introduced the Tülu 3 models, built on Llama 3.1 at 8B and 70B parameters, which offer 2.5x faster inference, are aligned with SFT, DPO, and RLVR, and compete with Claude 3.5 and Llama 3.1 70B. Together, these releases highlight advances in autoregressive training, vision encoders, and multilingual multimodal embeddings.
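To make the Matryoshka claim concrete, here is a minimal sketch of the truncation step: keep the leading components of an embedding and rescale the result to unit length. The sizes 1024 and 64 are assumptions chosen because they give a reduction of roughly 94%; the helper name `truncate_embedding` and the toy vector are hypothetical and are not Jina's API or real model output.

```python
import math

# Assumed sizes for illustration only: 1024 -> 64 is a 93.75% reduction,
# which rounds to the 94% figure quoted in the summary.
FULL_DIM = 1024
SMALL_DIM = 64


def truncate_embedding(vec, dim):
    """Keep the first `dim` components and rescale to unit length.

    Matryoshka-style training is meant to pack the most useful information
    into the leading dimensions, so a prefix can stand in for the full vector.
    """
    if not 0 < dim <= len(vec):
        raise ValueError("dim must be between 1 and len(vec)")
    head = vec[:dim]
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        return list(head)  # nothing to rescale
    return [x / norm for x in head]


# Toy stand-in for a model embedding (hypothetical values, not real output).
full = [1.0 / (i + 1) for i in range(FULL_DIM)]
small = truncate_embedding(full, SMALL_DIM)

reduction = 1 - SMALL_DIM / FULL_DIM
print(f"{len(full)} -> {len(small)} dims, {reduction:.2%} smaller")
```

Rescaling after truncation keeps dot products equal to cosine similarity on the shortened vectors. How much accuracy survives depends on how the model was trained, which this sketch does not measure.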