All tags
Company: "goodfireai"
Cursor 2.0 & Composer-1: Fast Models and New Agents UI
composer-1 gpt-oss-safeguard-20b gpt-oss-safeguard-120b gpt-oss gpt-5-mini cursor_ai openai huggingface ollama cerebras groq goodfireai rakuten agentic-coding reinforcement-learning mixture-of-experts fine-tuning policy-classification open-weight-models inference-stacks cost-efficiency multi-agent-systems ide voice-to-code code-review built-in-browser model-optimization sasha_rush dan_shipper samkottler ellev3n11 swyx
Cursor 2.0 launched with Composer-1, an agentic coding model optimized for speed and precision, featuring multi-agent orchestration, built-in browser for testing, and voice-to-code capabilities. OpenAI released gpt-oss-safeguard models (20B, 120B) for policy-based safety classification, open-weight and fine-tuned from gpt-oss, available on Hugging Face and supported by inference stacks like Ollama and Cerebras. Goodfire and Rakuten demonstrated sparse autoencoders for PII detection matching gpt-5-mini accuracy at significantly lower cost. The Cursor 2.0 update also includes a redesigned interface for managing multiple AI coding agents, marking a major advancement in AI IDEs. "Fast-not-slowest" tradeoff emphasized by early users for Composer-1, enabling rapid iteration with human-in-the-loop.
Grok 3 & 3-mini now API Available
grok-3 grok-3-mini gemini-2.5-flash o3 o4-mini llama-4-maverick gemma-3-27b openai llamaindex google-deepmind epochairesearch goodfireai mechanize agent-development agent-communication cli-tools reinforcement-learning model-evaluation quantization-aware-training model-compression training-compute hybrid-reasoning model-benchmarking
Grok 3 API is now available, including a smaller version called Grok 3 mini, which offers competitive pricing and full reasoning traces. OpenAI released a practical guide for building AI agents, while LlamaIndex supports the Agent2Agent protocol for multi-agent communication. Codex CLI is gaining traction with new features and competition from Aider and Claude Code. GoogleDeepMind launched Gemini 2.5 Flash, a hybrid reasoning model topping the Chatbot Arena leaderboard. OpenAI's o3 and o4-mini models show emergent behaviors from large-scale reinforcement learning. EpochAIResearch updated its methodology, removing Maverick from high FLOP models as Llama 4 Maverick training compute drops. GoodfireAI announced a $50M Series A for its Ember neural programming platform. Mechanize was founded to build virtual work environments and automation benchmarks. GoogleDeepMind's Quantisation Aware Training for Gemma 3 models reduces model size significantly, with open source checkpoints available.