All tags
Company: "arcee-ai"
Not much happened today
mistral-small-3.2 magenta-realtime afm-4.5b llama-3 openthinker3-7b deepseek-r1-distill-qwen-7b storm qwen2-vl gpt-4o dino-v2 sakana-ai mistral-ai google arcee-ai deepseek-ai openai amazon gdm reinforcement-learning chain-of-thought fine-tuning function-calling quantization music-generation foundation-models reasoning text-video model-compression image-classification evaluation-metrics sama
Sakana AI released Reinforcement-Learned Teachers (RLTs), a novel technique using smaller 7B parameter models trained via reinforcement learning to teach reasoning through step-by-step explanations, accelerating Chain-of-Thought learning. Mistral AI updated Mistral Small 3.2 improving instruction following and function calling with experimental FP8 quantization. Google Magenta RealTime, an 800M parameter open-weights model for real-time music generation, was released. Arcee AI launched AFM-4.5B, a sub-10B parameter foundation model extended from Llama 3. OpenThinker3-7B was introduced as a new state-of-the-art 7B reasoning model with a 33% improvement over DeepSeek-R1-Distill-Qwen-7B. The STORM text-video model compresses video input by 8x using Mamba layers and outperforms GPT-4o on MVBench with 70.6%. Discussions on reinforcement learning algorithms PPO vs. GRPO and insights on DINOv2's performance on ImageNet-1k were also highlighted. "A very quiet day" in AI news with valuable workshops from OpenAI, Amazon, and GDM.
minor ai followups: MultiAgents, Meta-SSI-Scale, Karpathy, AI Engineer
gpt-4o afm-4.5b gemma qwen stt-1b-en_fr stt-2.6b-en hunyuan-3d-2.1 openai meta-ai-fair scale-ai huggingface tencent arcee-ai ai-safety alignment ai-regulation memory-optimization scalable-oversight speech-recognition 3d-generation foundation-models sama polynoamial neelnanda5 teortaxestex yoshua_bengio zachtratar ryanpgreenblatt reach_vb arankomatsuzaki code_star
OpenAI released a paper revealing how training models like GPT-4o on insecure code can cause broad misalignment, drawing reactions from experts like @sama and @polynoamial. California's AI regulation efforts were highlighted by @Yoshua_Bengio emphasizing transparency and whistleblower protections. The term "context rot" was coined to describe LLM conversation degradation, with systems like Embra using CRM-like memory for robustness. Scalable oversight research aiming to improve human control over smarter AIs was discussed by @RyanPGreenblatt. New model releases include Kyutai's speech-to-text models capable of 400 real-time streams on a single H100 GPU, Tencent's Hunyuan 3D 2.1 as the first open-source production-ready PBR 3D generative model, and Arcee's AFM-4.5B foundation model family targeting enterprise use, competitive with Gemma and Qwen.
Pixtral 12B: Mistral beats Llama to Multimodality
pixtral-12b mistral-nemo-12b llama-3-1-70b llama-3-1-8b deeps-eek-v2-5 gpt-4-turbo llama-3-1 strawberry claude mistral-ai meta-ai-fair hugging-face arcee-ai deepseek-ai openai anthropic vision multimodality ocr benchmarking model-release model-architecture model-performance fine-tuning model-deployment reasoning code-generation api access-control reach_vb devendra_chapilot _philschmid rohanpaul_ai
Mistral AI released Pixtral 12B, an open-weights vision-language model with a Mistral Nemo 12B text backbone and a 400M vision adapter, featuring a large vocabulary of 131,072 tokens and support for 1024x1024 pixel images. This release notably beat Meta AI in launching an open multimodal model. At the Mistral AI Summit, architecture details and benchmark performances were shared, showing strong OCR and screen understanding capabilities. Additionally, Arcee AI announced SuperNova, a distilled Llama 3.1 70B & 8B model outperforming Meta's Llama 3.1 70B instruct on benchmarks. DeepSeek released DeepSeek-V2.5, scoring 89 on HumanEval, surpassing GPT-4-Turbo, Opus, and Llama 3.1 in coding tasks. OpenAI plans to release Strawberry as part of ChatGPT soon, though its capabilities are debated. Anthropic introduced Workspaces for managing multiple Claude deployments with enhanced access controls.