Person: "sainingxie"
not much happened today
glm-4.7 claude-code z.ai meta-ai-fair manus replit agentic-architecture context-engineering application-layer code-generation agent-habitats ai-native-llm ipo inference-infrastructure programming-paradigms zixuanli_ jietang yuchenj_uw sainingxie amasad hidecloud imjaredz random_walker
Z.ai (GLM family) is listing in Hong Kong on Jan 8, 2026, aiming to raise $560M at HK$4.35B, in what is billed as the public listing of the "first AI-native LLM company." The IPO highlights GLM-4.7 as a starting point. Meta AI acquired Manus for approximately $4–5B; Manus reached $100M ARR in 8–9 months, illustrating the value of application-layer differentiation over proprietary models. Manus focuses on agentic architecture, context engineering, and general primitives like code execution and browser control, emphasizing "agent habitats" as a competitive moat. Discussions around Claude Code show skepticism about "vibe coding" and advocate disciplined, framework-like AI-assisted programming practices.
Gemini 2.5 Pro/Flash GA, 2.5 Flash-Lite in Preview
gemini-2.5 gemini-2.5-flash-lite gemini-2.5-flash gemini-2.5-pro gemini-2.5-ultra kimi-dev-72b nanonets-ocr-s ii-medical-8b-1706 jan-nano deepseek-r1 minimax-m1 google moonshot-ai deepseek cognitivecompai kling-ai mixture-of-experts multimodality long-horizon-planning benchmarking coding-performance long-context ocr video-generation model-releases tulsee_doshi oriolvinyalsml demishassabis officiallogank _philschmid swyx sainingxie scaling01 gneubig clementdelangue mervenoyann
Gemini 2.5 Pro and Flash are now generally available, and the new Gemini 2.5 Flash-Lite is in preview; the models are sparse Mixture-of-Experts (MoE) transformers with native multimodal support. A detailed 30-page tech report highlights the long-horizon planning demonstrated by Gemini Plays Pokemon. The LiveCodeBench-Pro benchmark shows that frontier LLMs struggle with hard coding problems, while Moonshot AI open-sourced Kimi-Dev-72B, which achieves state-of-the-art results on SWE-bench Verified. Smaller specialized models like Nanonets-OCR-s, II-Medical-8B-1706, and Jan-nano show competitive performance, underscoring that bigger models are not always better. DeepSeek-r1 ties for #1 in WebDev Arena, and MiniMax-M1 sets new standards in long-context reasoning. Kling AI demonstrated video generation capabilities.
Ways to use Anthropic's Tool Use GA
claude-3-opus haiku opus convnext anthropic amazon google tool-use function-calling agentic-ai streaming vision parallelization delegation debate specialization open-science superintelligence convolutional-networks self-attention ai-research yann-lecun alex-albert sainingxie
Anthropic launched general availability of tool use/function calling, with support for streaming, forced use, and vision, on its own API as well as on Amazon and Google platforms. Alex Albert shared five architectures for agentic tool use: delegation, parallelization, debate, specialization, and tool suite experts. Anthropic also introduced a self-guided course on tool use. Yann LeCun emphasized ethical open science funding, a gradual emergence of superintelligence with safety guardrails, and convolutional networks as competitive with vision transformers for image/video processing. He also noted growth in the number of AI researchers across industry, academia, and government.