not much happened today
glm-5.1 gemini-3.1 gpt-5.4 claude-3-sonnet haiku opus sonnet qwen-3.6-plus qwen3-coder-next-80b z-ai anthropic berkeley langchain alibaba openai model-performance agent-frameworks orchestration model-routing fine-tuning agent-harness model-selection workflow-automation zixuan_li akshay_pachaar harrison_chase walden_yan yuchen_jin sentdex
GLM-5.1 has reached #3 on Code Arena, surpassing Gemini 3.1 and GPT-5.4 and matching Claude Sonnet 4.6 in coding performance; Z.ai now holds the #1 open-model rank, close to the overall leaders. The advisor pattern, which pairs a cheap executor model with an expensive advisor model, is gaining traction, with pairings like Haiku + Opus and Sonnet + Opus improving both performance and cost efficiency. Alibaba's Qwen Code v0.14.x introduces orchestration features, including remote control channels, cron tasks, and per-sub-agent model selection. Model routing is becoming a product-level concern because top models such as Opus and GPT-5.4 are specialized and spiky in performance. The Hermes Agent ecosystem shows strong momentum with a new workspace mobile app, a FAST mode for OpenAI/GPT-5.4, and over 50k GitHub stars. Practitioners report Hermes as a reliable agent framework, with a local Qwen3-Coder-Next 80B 4-bit model replacing parts of workflows that previously relied on Claude Code. The harness layer is emerging as a key abstraction in agent frameworks.
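The advisor pattern above can be sketched as a simple escalation loop: a cheap executor handles the task and an expensive advisor is consulted only when the executor's self-reported confidence is low. This is a minimal illustrative sketch; `stub_executor`, `stub_advisor`, and the confidence threshold are hypothetical stand-ins, not any real model API.

```python
from dataclasses import dataclass
from typing import Callable, Tuple

@dataclass
class Result:
    text: str
    confidence: float  # executor's self-reported confidence in [0, 1]

def run_with_advisor(task: str,
                     executor: Callable[[str], Result],
                     advisor: Callable[[str, str], str],
                     threshold: float = 0.7) -> Tuple[str, bool]:
    """Run the cheap executor; escalate to the expensive advisor
    only when confidence falls below the threshold."""
    draft = executor(task)
    if draft.confidence >= threshold:
        return draft.text, False          # cheap path, no advisor call
    revised = advisor(task, draft.text)   # expensive review/correction
    return revised, True

# Hypothetical stubs standing in for, say, a Haiku-class executor
# and an Opus-class advisor (purely illustrative).
def stub_executor(task: str) -> Result:
    confident = "easy" in task
    return Result(f"draft for {task!r}", 0.9 if confident else 0.4)

def stub_advisor(task: str, draft: str) -> str:
    return f"advisor-revised: {draft}"

print(run_with_advisor("easy refactor", stub_executor, stub_advisor))
print(run_with_advisor("hard proof", stub_executor, stub_advisor))
```

The cost win comes from the advisor being invoked only on the hard cases; in a real harness the confidence signal might instead be a failing test, a verifier, or a critique step.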
not much happened today
glm-4.6v glm-4.6v-flash jina-vlm-2b hugging-face zhipu-ai jina-ai google-deepmind axiomprover fine-tuning multimodality model-optimization long-context mechanistic-interpretability formal-methods sequence-architectures reinforcement-learning lioronai akshay_pachaar _akhaliq ben_burtenshaw vllm_project prince_canuma zenmuxai eliebakouch theturingpost axiommathai neelnanda5 sarahookr
Claude Code Skills gains attention with a published talk, and Hugging Face releases a new "skill" enabling one-line fine-tuning pipelines for models from roughly 0.5B to 70B parameters; it supports SFT, DPO, and GRPO, with small runs costing as little as ~$0.30. Zhipu AI launches the multimodal models GLM-4.6V (a 106B-parameter MoE) and GLM-4.6V-Flash (9B dense), featuring 128k context and native multimodal function calling; the Flash variant is free, and API pricing has been published. Jina AI releases Jina-VLM (2B), a compact multilingual VLM that excels at diagrams and documents with top benchmark scores. NeurIPS 2025 research highlights include Google's post-Transformer sequence architectures (Moneta, Yaad, Memora), which show up to 20% gains in long-context retrieval; AxiomProver's autonomous Lean system, which rapidly solved 9 of 12 Putnam 2025 problems; and mechanistic-interpretability advances discussed by Chris Olah, emphasizing scalable tooling.
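A "one-line fine-tuning pipeline" supporting several methods typically reduces to a dispatcher that picks a trainer by method name. The sketch below illustrates that shape only; the trainer functions are hypothetical placeholders, not the real Hugging Face skill or TRL API.

```python
from typing import Callable, Dict

# Hypothetical trainer stubs, one per supported method. In a real
# pipeline these would configure and launch actual training runs.
def train_sft(model: str, dataset: str) -> str:
    return f"SFT({model}, {dataset})"

def train_dpo(model: str, dataset: str) -> str:
    return f"DPO({model}, {dataset})"

def train_grpo(model: str, dataset: str) -> str:
    return f"GRPO({model}, {dataset})"

TRAINERS: Dict[str, Callable[[str, str], str]] = {
    "sft": train_sft,
    "dpo": train_dpo,
    "grpo": train_grpo,
}

def finetune(model: str, dataset: str, method: str = "sft") -> str:
    """One-line entry point: route to the trainer named by `method`."""
    key = method.lower()
    if key not in TRAINERS:
        raise ValueError(f"unknown method {method!r}; expected one of {sorted(TRAINERS)}")
    return TRAINERS[key](model, dataset)

print(finetune("qwen-0.5b", "my-data", method="dpo"))
```

Keeping the method table as data rather than branching logic is what makes adding a new method (or a new model-size tier) a one-entry change.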