Model: "ernie-4.5"
not much happened today
grok-4 jamba ernie-4.5 claude-4-sonnet claude-4 kontext-dev ai21-labs hugging-face baidu perplexity-ai deepmind anthropic reinforcement-learning fine-tuning energy-based-transformers ssm-transformer context-windows length-generalization recurrent-neural-networks attention-mechanisms 2-simplicial-attention biomedical-ai instruction-following open-weight-models python-package-management _philschmid corbtt jxmnop sedielem _akhaliq slashml alexiglad clementdelangue _albertgu tri_dao theaitimeline deep-learning-ai
Over the holiday weekend, key AI developments include the upcoming release of Grok 4, Perplexity teasing new projects, and community reactions to Cursor and Dia. Research highlights feature a paper on Reinforcement Learning (RL) improving generalization and reasoning across domains, contrasting with Supervised Fine-Tuning's forgetting issues. Energy-Based Transformers (EBTs) are proposed as a promising alternative to traditional transformers. AI21 Labs updated its Jamba model family with enhanced grounding and instruction following, maintaining a 256K context window. Baidu open-sourced its massive 424 billion parameter Ernie 4.5 model, while Kontext-dev became the top trending model on Hugging Face. Advances in length generalization for recurrent models and the introduction of 2-simplicial attention were noted. In biomedical AI, Biomni, powered by Claude 4 Sonnet, demonstrated superior accuracy and rare disease diagnosis capabilities. Additionally, the Python package manager uv received praise for improving Python installation workflows.
not much happened today
o3-mini o1-mini llama hunyuan-a13b ernie-4.5 ernie-4.5-21b-a3b qwen3-30b-a3b gemini-2.5-pro meta-ai-fair openai tencent microsoft baidu gemini superintelligence ai-talent job-market open-source-models multimodality mixture-of-experts quantization fp8-training model-benchmarking model-performance model-releases api model-optimization alexandr_wang shengjia_zhao jhyuxm ren_hongyu shuchaobi saranormous teortaxesTex mckbrando yuchenj_uw francoisfleuret quanquangu reach_vb philschmid
Meta has poached top AI talent from OpenAI and brought in Alexandr Wang as Chief AI Officer to work towards superintelligence, signaling a strong push for the next Llama model. The AI job market is polarized: demand and compensation are high for top-tier talent, while credentials like strong GitHub projects gain importance. The WizardLM team moved from Microsoft to Tencent to develop open-source models like Hunyuan-A13B, highlighting shifts in China's AI industry. Rumors suggest OpenAI will release a new open-source model in July, potentially surpassing existing ChatGPT models. Baidu open-sourced multiple variants of its ERNIE 4.5 model series, featuring advanced techniques like 2-bit quantization, MoE router orthogonalization loss, and FP8 training, with models ranging from 0.3B to 424B parameters. Gemini 2.5 Pro returned to the free tier of the Gemini API, enabling developers to explore its features.