Person: "francoisfleuret"
not much happened today
gpt-5 qwen2.5-7b ernie-4.5-vl-28b-a3b-thinking gemini-2.5-pro llamacloud claude-code openai baidu databricks llamaindex togethercompute sakanaailabs reasoning-benchmarks reinforcement-learning fine-tuning multimodality document-intelligence retrieval-augmented-generation agentic-systems persona-simulation code-agents guardrails sakanaailabs micahgoldblum francoisfleuret matei_zaharia jerryjliu0 omarsar0 togethercompute imjaredz theo
GPT-5 leads Sudoku-Bench but solves only 33% of puzzles, leaving 67% unsolved and highlighting persistent challenges in meta-reasoning and spatial logic. New training methods such as GRPO fine-tuning and "Thought Cloning" show limited success. Research on "looped LLMs" suggests that pretrained models benefit from repeated computation. Baidu's ERNIE-4.5-VL-28B-A3B-Thinking offers lightweight multimodal reasoning under an Apache 2.0 license and outperforms Gemini-2.5-Pro and GPT-5-High on document tasks. The Databricks ai_parse_document preview delivers cost-efficient document intelligence, outperforming GPT-5 and Claude. Pathwork AI uses LlamaCloud for underwriting automation. The Gemini File Search API enables agentic retrieval-augmented generation (RAG) with MCP server integration. Together AI and Collinear launched TraitMix for persona-driven agent simulations, integrated with Together Evals. Reports highlight risks in long-running code agents, such as Claude Code reverting changes, and emphasize the need for guardrails. Community consensus favors using multiple code copilots, including Claude Code, Codex, and others.
not much happened today
o3-mini o1-mini llama hunyuan-a13b ernie-4.5 ernie-4.5-21b-a3b qwen3-30b-a3b gemini-2.5-pro meta-ai-fair openai tencent microsoft baidu gemini superintelligence ai-talent job-market open-source-models multimodality mixture-of-experts quantization fp8-training model-benchmarking model-performance model-releases api model-optimization alexandr_wang shengjia_zhao jhyuxm ren_hongyu shuchaobi saranormous teortaxesTex mckbrando yuchenj_uw francoisfleuret quanquangu reach_vb philschmid
Meta has poached top AI talent from OpenAI, and Alexandr Wang has joined as Chief AI Officer to work towards superintelligence, signaling a strong push for the next Llama model. The AI job market is polarized: demand and compensation are high for top-tier talent, while credentials such as strong GitHub projects are gaining importance. The WizardLM team moved from Microsoft to Tencent to develop open-source models like Hunyuan-A13B, highlighting shifts in China's AI industry. Rumors suggest OpenAI will release a new open-source model in July that could surpass existing ChatGPT models. Baidu open-sourced multiple variants of its ERNIE 4.5 model series, ranging from 0.3B to 424B parameters and featuring advanced techniques such as 2-bit quantization, MoE router orthogonalization loss, and FP8 training. Gemini 2.5 Pro returned to the free tier of the Gemini API, letting developers explore its features.
not much happened today
deepseek-r1 qwen-2.5 qwen-2.5-max deepseek-v3 deepseek-janus-pro gpt-4 nvidia anthropic openai deepseek huawei vercel bespoke-labs model-merging multimodality reinforcement-learning chain-of-thought gpu-optimization compute-infrastructure compression crypto-api image-generation saranormous zizhpan victormustar omarsar0 markchen90 sakanaailabs reach_vb madiator dain_mclau francoisfleuret garygodchaux arankomatsuzaki id_aa_carmack lavanyasant virattt
Huawei chips lead a diverse AI news roundup that also covers NVIDIA's stock rebound, new open music foundation models like Local Suno, and competitive AI models such as Qwen 2.5 Max and DeepSeek V3. Also noted are the release of DeepSeek Janus Pro, a multimodal LLM with image generation capabilities, and advancements in reinforcement learning and chain-of-thought reasoning. Discussions include GPU rebranding with NVIDIA's H6400 GPUs, data center innovations, and enterprise AI applications like crypto APIs in hedge funds. "DeepSeek R1's capabilities" and "Qwen 2.5 models added to applications" are key highlights.