Model: "qwen3-next-80b-a3b"
not much happened today
mobilellm-r1 qwen3-next-80b-a3b gpt-5 meta-ai-fair huggingface alibaba openai reasoning model-efficiency hybrid-attention long-context benchmarking agent-evaluation hallucination-detection model-calibration inference-complexity model-pricing _akhaliq tacocohen pkirgis sayashk
Meta released MobileLLM-R1 on Hugging Face, a family of sub-1B-parameter reasoning models trained on 4.2T tokens that shows strong math accuracy for its size. Alibaba introduced Qwen3-Next-80B-A3B, which features hybrid attention, a 256k context window, and improved long-horizon memory, and is priced competitively on Alibaba Cloud. Meta AI FAIR fixed a bug in SWE-Bench that affected agent evaluation. The LiveMCP-101 benchmark shows that frontier models such as GPT-5 underperform on complex tasks, and it catalogs their common failure modes. OpenAI highlights hallucination issues driven by benchmark incentives and proposes calibration improvements. Community demos and tooling continue to evolve.