  Topic: "model-calibration"
 not much happened today 
   mobilellm-r1  qwen3-next-80b-a3b  gpt-5   meta-ai-fair  huggingface  alibaba  openai   reasoning  model-efficiency  hybrid-attention  long-context  benchmarking  agent-evaluation  hallucination-detection  model-calibration  inference-complexity  model-pricing   _akhaliq  tacocohen  pkirgis  sayashk  
 Meta released MobileLLM-R1, a family of sub-1B-parameter reasoning models on Hugging Face, trained on 4.2T tokens and showing strong math accuracy for its size. Alibaba introduced Qwen3-Next-80B-A3B, which features hybrid attention, a 256k context window, and improved long-horizon memory, and is priced competitively on Alibaba Cloud. Meta AI FAIR fixed a bug in SWE-Bench that affected agent evaluation. The LiveMCP-101 benchmark shows that frontier models such as GPT-5 underperform on complex tasks, and it catalogs their common failure modes. OpenAI argues that benchmark incentives contribute to hallucination and proposes calibration improvements. Community demos and tooling updates continue to appear.