Topic: "recurrent-neural-networks"
not much happened today
grok-4 jamba ernie-4.5 claude-4-sonnet claude-4 kontext-dev ai21-labs hugging-face baidu perplexity-ai deepmind anthropic reinforcement-learning fine-tuning energy-based-transformers ssm-transformer context-windows length-generalization recurrent-neural-networks attention-mechanisms 2-simplicial-attention biomedical-ai instruction-following open-weight-models python-package-management _philschmid corbtt jxmnop sedielem _akhaliq slashml alexiglad clementdelangue _albertgu tri_dao theaitimeline deep-learning-ai
Over the holiday weekend, key AI developments include the upcoming release of Grok 4, Perplexity teasing new projects, and community reactions to Cursor and Dia. Research highlights feature a paper on Reinforcement Learning (RL) improving generalization and reasoning across domains, contrasting with Supervised Fine-Tuning's forgetting issues. Energy-Based Transformers (EBTs) are proposed as a promising alternative to traditional transformers. AI21 Labs updated its Jamba model family with enhanced grounding and instruction following, maintaining a 256K context window. Baidu open-sourced its massive 424-billion-parameter Ernie 4.5 model, while Kontext-dev became the top trending model on Hugging Face. Advances in length generalization for recurrent models and the introduction of 2-simplicial attention were noted. In biomedical AI, Biomni, powered by Claude 4 Sonnet, demonstrated superior accuracy and rare disease diagnosis capabilities. Additionally, the Python package manager uv received praise for improving Python installation workflows.
not much happened today
grok-3 deepseek-r1 siglip-2 o3-mini-high r1-1776 llamba-1b llamba-3b llamba-8b llama-3 alphamaze audiobox-aesthetics xai nvidia google-deepmind anthropic openai bytedance ollama meta-ai-fair benchmarking model-releases performance reasoning multimodality semantic-understanding ocr multilinguality model-distillation recurrent-neural-networks visual-reasoning audio-processing scaling01 iscienceluvr philschmid arankomatsuzaki reach_vb mervenoyann wightmanr lmarena_ai ollama akhaliq
Grok-3, a new family of LLMs from xAI trained using 200,000 Nvidia H100 GPUs for advanced reasoning, outperforms models from Google, Anthropic, and OpenAI on math, science, and coding benchmarks. DeepSeek-R1 achieves top accuracy on SuperGPQA, a challenging dataset from ByteDance Research. SigLIP 2 from Google DeepMind improves semantic understanding and OCR with flexible resolutions and multilingual capabilities, and is available on Hugging Face. OpenAI's o3-mini-high ranks #1 in coding and math prompts. Perplexity's R1 1776, a post-trained version of DeepSeek R1, is available on Ollama. The Llamba family distills Llama-3.x into efficient recurrent models with higher throughput. AlphaMaze combines DeepSeek R1 with GRPO for visual reasoning on ARC-AGI puzzles. Audiobox Aesthetics from Meta AI offers unified quality assessment for audio. The community notes that Grok-3's compute increase yields only modest performance gains.