Model: "jamba-1.5"
Air Street's State of AI 2025 Report
glm-4.6 jamba-1.5 rnd1 claude-code reflection mastra datacurve spellbook kernel figure softbank abb radicalnumerics zhipu-ai ai21-labs anthropic humanoid-robots mixture-of-experts diffusion-models open-weight-models reinforcement-learning benchmarking small-language-models plugin-systems developer-tools agent-stacks adcock_brett achowdhery clementdelangue
Reflection raised $2B to build frontier open-weight models with a focus on safety and evaluation, led by a team with backgrounds from AlphaGo, PaLM, and Gemini. Figure launched its next-gen humanoid robot, Figure 03, emphasizing non-teleoperated capabilities for home and large-scale use. Radical Numerics released RND1, a 30B-parameter sparse MoE diffusion language model with open weights and code to advance diffusion LM research. Zhipu posted strong results with GLM-4.6 on the Design Arena benchmark, while AI21 Labs' Jamba Reasoning 3B leads tiny reasoning models. Anthropic introduced a plugin system for Claude Code to enhance developer tools and agent stacks. The report also highlights SoftBank's acquisition of ABB's robotics unit for $5.4B and the growing ecosystem around open frontier modeling and small-model reasoning.
Everybody shipped small things this holiday weekend
gpt-4o-voice gemini claude jamba-1.5 mistral-nemo-minitron-8b xai google anthropic openai cognition ai21-labs nvidia langchain fine-tuning long-context parameter-efficient-fine-tuning latex-rendering real-time-audio virtual-try-on resource-tags low-code ai-agents workspace-organization model-benchmarking dario-amodei scott-wu fchollet svpino
xAI announced the Colossus 100k H100 cluster capable of training an FP8 GPT-4 class model in 4 days. Google introduced Structured Output for Gemini. Anthropic discussed Claude's performance issues possibly due to API prompt modifications. OpenAI enhanced controls for File Search in their Assistants API. Cognition and Anthropic leaders appeared on podcasts. The viral Kwai-Kolors virtual try-on model and the open-source real-time audio conversational model Mini-Omni (similar to gpt-4o-voice) were released. Tutorials on parameter-efficient fine-tuning with LoRA and QLoRA, long-context embedding challenges, and Claude's LaTeX rendering feature were highlighted. AI21 Labs released Jamba 1.5 models with a 256K context window and faster long-context performance. NVIDIA debuted Mistral-Nemo-Minitron-8B on the Open LLM Leaderboard. LangChain introduced resource tags for workspace organization, and a low-code AI app toolkit was shared by svpino. Legal AI agents and financial agent evaluations using LangSmith were also featured.
not much happened this weekend
jamba-1.5 dream-machine-1.5 ideogram-v2 mistral-nemo-minitron-8b mistral-7b llama-3-8b nous-research cursor-ai gdm george-hotz agibot unitree eth-zurich disney uc-san-diego ai21-labs luma-labs ideogram nvidia mistral-ai meta-ai-fair distributed-ai optimizer inter-gpu-communication low-latency-training open-source humanoid-robots robotics physics-based-motion teleoperation multilingual-models long-context text-to-video text-to-image model-performance george-hotz adcock_brett aman
Nous Research announced DisTrO, a new optimizer that reduces inter-GPU communication by 1,000x to 10,000x, enabling efficient training on slow networks and offering an alternative to GDM's DiLoCo. Cursor AI gained viral attention from an 8-year-old user and announced a new fundraise, with co-host Aman returning to their podcast. George Hotz launched tinybox for sale. In robotics, AGIBOT revealed 5 new humanoid robots with open-source plans, and Unitree showcased its G1 humanoid robot nearing mass production at $16,000. ETH Zurich and Disney developed an AI system for physics-based robot motion generation from text or images. UC San Diego released ACE, an open-source teleoperation system for controlling multiple robots. AI21 Labs unveiled Jamba 1.5, a multilingual model with 256K context length and permissive licensing. Luma Labs released Dream Machine 1.5 for improved text-to-video generation. Ideogram launched v2 of its text-to-image model with near-perfect text generation. Nvidia and Mistral released Mistral-NeMo-Minitron 8B, a small model outperforming Mistral 7B and Llama 3 8B on the Open LLM Leaderboard.
Nvidia Minitron: LLM Pruning and Distillation updated for Llama 3.1
llama-3-1-8b llama-3-1 jamba-1.5 claude-3 dracarys-70b dracarys-72b mistral-nemo-minitron-8b mistral-7b nvidia meta-ai-fair ai21-labs anthropic hugging-face pruning knowledge-distillation weight-pruning activation-based-pruning width-pruning kl-divergence teacher-correction prompt-optimization multilinguality long-context mixture-of-experts model-fine-tuning
Nvidia and Meta researchers updated their Llama 3.1 results with a paper showing that combining weight pruning and knowledge distillation reduces training costs: only the largest model is trained from scratch, and smaller models are derived from it via pruning and distillation. The process involves teacher correction, activation-based pruning (favoring width pruning), and retraining with distillation using a KL divergence loss, resulting in better-performing models at comparable sizes. However, distillation incurs some accuracy tradeoffs. Additionally, AI21 Labs launched Jamba 1.5, a hybrid SSM-Transformer MoE model with large context windows and multilingual support. Anthropic updated Claude 3 with LaTeX rendering and prompt caching. An open-source coding-focused LLM, Dracarys, was released in 70B and 72B sizes, showing improved coding performance. The Mistral-NeMo-Minitron 8B model outperforms Llama 3.1 8B and Mistral 7B on the Hugging Face leaderboard, highlighting the benefits of pruning and distillation. Research on prompt optimization reveals the complexity of prompt search spaces and the surprising effectiveness of simple algorithms like AutoPrompt/GCG.
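As a rough illustration of two steps named above, activation-based width pruning and distillation with a KL divergence loss, here is a minimal pure-Python sketch. It is not the paper's implementation: the function names, the mean-absolute-activation importance score, and the temperature parameter are all assumptions made for the example.

```python
import math


def select_channels_by_activation(activations, keep):
    """Pick the `keep` channels with the largest mean absolute activation.

    `activations` is a list of samples, each a list of per-channel values.
    The importance score used here is an assumption for illustration.
    """
    num_samples = len(activations)
    num_channels = len(activations[0])
    importance = [
        sum(abs(sample[c]) for sample in activations) / num_samples
        for c in range(num_channels)
    ]
    ranked = sorted(range(num_channels), key=lambda c: importance[c], reverse=True)
    # Return the surviving channel indices in their original order.
    return sorted(ranked[:keep])


def log_softmax(logits, temperature=1.0):
    """Numerically stable log-softmax over one vocabulary distribution."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    log_total = peak + math.log(sum(math.exp(x - peak) for x in scaled))
    return [x - log_total for x in scaled]


def kl_distillation_loss(teacher_logits, student_logits, temperature=1.0):
    """KL(teacher || student) for a single token position."""
    log_p = log_softmax(teacher_logits, temperature)
    log_q = log_softmax(student_logits, temperature)
    return sum(math.exp(lp) * (lp - lq) for lp, lq in zip(log_p, log_q))


if __name__ == "__main__":
    # Toy calibration batch: 2 samples, 3 channels (hypothetical values).
    batch = [[0.1, -2.0, 0.5], [0.2, 1.0, -0.4]]
    print("kept channels:", select_channels_by_activation(batch, keep=2))

    # The loss is zero when the student matches the teacher, positive otherwise.
    print("matched:", kl_distillation_loss([1.0, 2.0, 3.0], [1.0, 2.0, 3.0]))
    print("mismatched:", kl_distillation_loss([1.0, 2.0, 3.0], [3.0, 2.0, 1.0]))
```

In a real pipeline the pruned student would be retrained to minimize this loss averaged over many token positions, with the teacher's logits computed on the same inputs.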
super quiet day
jamba-1.5 phi-3.5 dracarys llama-3-1-70b llama-3-1 ai21-labs anthropic stanford hugging-face langchain qdrant aws elastic state-space-models long-context benchmarking ai-safety virtual-environments multi-agent-systems resource-management community-engagement model-performance bindu-reddy rohanpaul_ai jackclarksf danhendrycks reach_vb iqdotgraph
AI21 Labs released Jamba 1.5, a scaled-up State Space Model with 94B parameters that is optimized for long context windows, offers up to 2.5X faster inference, and outperforms models like Llama 3.1 70B on benchmarks. The Phi-3.5 model was praised for its safety and performance, while Dracarys, a new 70B open-source coding model announced by Bindu Reddy, claims superior benchmark results over Llama 3.1 70B. Discussions of California's SB 1047 AI safety legislation involve Stanford and Anthropic and highlight the balance between precaution and industry growth. Innovations include uv virtual environments for rapid setup, LangChain's LangSmith resource tags for project management, and multi-agent systems in Qdrant that enhance data workflows. Community events like the RAG workshop by AWS, LangChain, and Elastic continue to support AI learning and collaboration. Memes remain a popular way to engage with AI industry culture.