All tags
Model: "minimax-m1"
Zuck goes Superintelligence Founder Mode: $100M bonuses + $100M+ salaries + NFDG Buyout?
llama-4 maverick scout minimax-m1 afm-4.5b chatgpt midjourney-v1 meta-ai-fair openai deeplearning-ai essential-ai minimax arcee midjourney long-context multimodality model-release foundation-models dataset-release model-training video-generation enterprise-ai model-architecture moe prompt-optimization sama nat dan ashvaswani clementdelangue amit_sangani andrewyng _akhaliq
Meta AI is reportedly offering 8-9 figure signing bonuses and salaries to top AI talent, confirmed by Sam Altman. They are also targeting key figures like Nat and Dan from the AI Grant fund for strategic hires. Essential AI released the massive 24-trillion-token Essential-Web v1.0 dataset with rich metadata and a 12-category taxonomy. DeepLearning.AI and Meta AI launched a course on Llama 4, featuring new MoE models Maverick (400B) and Scout (109B) with context windows up to 10M tokens. MiniMax open-sourced MiniMax-M1, a long-context LLM with a 1M-token window, and introduced the Hailuo 02 video model. OpenAI rolled out "Record mode" for ChatGPT Pro, Enterprise, and Edu on macOS. Arcee launched the AFM-4.5B foundation model for enterprise. Midjourney released its V1 video model enabling image animation. These developments highlight major advances in model scale, long-context reasoning, multimodality, and enterprise AI applications.
Gemini 2.5 Pro/Flash GA, 2.5 Flash-Lite in Preview
gemini-2.5 gemini-2.5-flash-lite gemini-2.5-flash gemini-2.5-pro gemini-2.5-ultra kimi-dev-72b nanonets-ocr-s ii-medical-8b-1706 jan-nano deepseek-r1 minimax-m1 google moonshot-ai deepseek cognitivecompai kling-ai mixture-of-experts multimodality long-horizon-planning benchmarking coding-performance long-context ocr video-generation model-releases tulsee_doshi oriolvinyalsml demishassabis officiallogank _philschmid swyx sainingxie scaling01 gneubig clementdelangue mervenoyann
Gemini 2.5 models are now generally available, including the new Gemini 2.5 Flash-Lite, Flash, Pro, and Ultra variants, featuring sparse Mixture-of-Experts (MoE) transformers with native multimodal support. A detailed 30-page tech report highlights impressive long-horizon planning demonstrated by Gemini Plays Pokemon. The LiveCodeBench-Pro benchmark reveals frontier LLMs struggle with hard coding problems, while Moonshot AI open-sourced Kimi-Dev-72B, achieving state-of-the-art results on SWE-bench Verified. Smaller specialized models like Nanonets-OCR-s, II-Medical-8B-1706, and Jan-nano show competitive performance, emphasizing that bigger models are not always better. DeepSeek-r1 ties for #1 in WebDev Arena, and MiniMax-M1 sets new standards in long-context reasoning. Kling AI demonstrated video generation capabilities.
Chinese Models Launch - MiniMax-M1, Hailuo 2 "Kangaroo", Moonshot Kimi-Dev-72B
minimax-m1 hailuo-02 kimi-dev-72b deepseek-r1 ale-agent minimax-ai moonshot-ai deepseek bytedance anthropic langchain columbia-university sakana-ai openai microsoft multi-agent-systems attention-mechanisms coding optimization prompt-injection model-performance video-generation model-training task-automation jerryjliu0 hwchase17 omarsar0 gallabytes lateinteraction karpathy
MiniMax AI launched MiniMax-M1, a 456 billion parameter open weights LLM with a 1 million token input and 80k token output using efficient "lightning attention" and a GRPO variant called CISPO. MiniMax AI also announced Hailuo 02 (0616), a video model similar to ByteDance's Seedance. Moonshot AI released Kimi-Dev-72B, a coding model outperforming DeepSeek R1 on SWEBench Verified. Discussions on multi-agent system design from Anthropic and LangChain highlighted improvements in task completion and challenges like prompt injection attacks, as demonstrated by Karpathy and Columbia University research. Sakana AI introduced ALE-Agent, a coding agent that ranked 21st in the AtCoder Heuristic Competition solving NP-hard optimization problems. There is unverified news about an acquisition involving OpenAI, Microsoft, and Windsurf.