Model: "claude-code"
3x in 3 months: Cursor @ $28b, Cognition + Windsurf @ $10b
qwen3-coder chatgpt-agent claude-code mini cursor cognition windsurf alibaba openai anthropic perplexity agentic-ai fundraising software-engineering ai-coding agentic-economy model-integration community-feedback performance-benchmarking bindureddy xikun_zhang_ aravsrinivas gergelyorosz jeremyphoward
Cursor is reportedly fundraising at a $28 billion valuation with $1 billion ARR, while the combined Cognition+Windsurf entity is fundraising at a $10 billion valuation after acquiring Windsurf remainco for $300 million. Competition between AI coding agents is intensifying as Cursor focuses on Async SWE Agents and Cognition gains an agentic IDE in Windsurf. Alibaba's Qwen3-Coder is seeing widespread adoption for coding tasks and is being integrated into tools like Claude Code and LM Studio. OpenAI is rolling out ChatGPT Agent to all Plus, Pro, and Team users, sparking discussion of an "agentic economy" that emphasizes AI literacy. Anthropic's Claude Code is praised as a premier development tool and draws active community feedback. Perplexity's Comet browser assistant is receiving positive reviews and new feature showcases. The debate over whether AI coding tools will replace developers continues, with critics pointing to the human effort these tools still require. A new minimal software engineering agent, mini, achieves 65% on SWE-bench with just 100 lines of code.
not much happened today
seedance-1.0 codex claude-code kling-2.1 veo-3 bytedance morph-labs huggingface deeplearning.ai figure-ai langchain sakana-ai video-generation autoformalization ai-assisted-coding api-design context-engineering reinforcement-learning ai-evals hypernetworks model-fine-tuning foundation-models andrew_ng hwchase17 adcock_brett clementdelangue akhaliq jxmnop hamelhusain sh_reya
ByteDance showcased Seedance 1.0, a state-of-the-art video generation model, without releasing it, while Morph Labs announced Trinity, an autoformalization system for Lean. Hugging Face Transformers deprecated TensorFlow/JAX support. Andrew Ng of DeepLearning.AI highlighted the rise of the GenAI Application Engineer role, emphasizing skills in AI building blocks and AI-assisted coding tools like Codex and Claude Code. Engineering teams are increasingly testing API designs against LLMs for usability. Figure AI's CEO stressed speed as a key competitive advantage, and LangChain introduced the concept of Context Engineering for AI agents. Reinforcement learning on LLMs shows transformative potential, and the community values AI evals and data work. Sakana AI released Text-to-LoRA, a hypernetwork method that generates task-specific LoRA adapters from natural language descriptions, enabling efficient model customization. The video generation race is heating up, with ByteDance's Seed-based model praised for its quality and challenging American labs alongside models like Kling 2.1 and Veo 3.
AI Engineer World's Fair Talks Day 1
gemini-2.5 gemma claude-code mistral cursor anthropic openai aie google-deepmind meta-ai-fair agent-based-architecture open-source model-memorization scaling-laws quantization mixture-of-experts language-model-memorization model-generalization langgraph model-architecture
Mistral launched a new Code project, and Cursor released version 1.0. Anthropic improved Claude Code plans, while ChatGPT announced expanded connections. The day was dominated by AIE keynotes and tracks including GraphRAG, RecSys, and Tiny Teams. Reddit discussed Google's open-sourcing of the DeepSearch stack for building AI agents with Gemini 2.5 and LangGraph, which enables flexible agent architectures and integration with local LLMs like Gemma. A new Meta paper analyzed language model memorization, showing that GPT-style transformers store about 3.5–4 bits/parameter and exploring the transition from memorization to generalization, with implications for Mixture-of-Experts models and quantization effects.
Claude 3.7 Sonnet
claude-3-7-sonnet claude-3 claude-code anthropic hybrid-reasoning extended-thinking coding-benchmarks agentic-ai prompt-caching streaming token-capacity tool-use
Anthropic launched Claude 3.7 Sonnet, their most intelligent model to date, featuring hybrid reasoning with two thinking modes: near-instant responses and extended step-by-step thinking. The release includes Claude Code, an agentic coding tool in limited preview, and supports up to 128k output tokens in beta. Claude 3.7 Sonnet performs well on coding benchmarks like SWE-Bench Verified and Cognition's junior-dev eval, and introduces advanced features such as streaming thinking, prompt caching, and tool use. The model is also benchmarked on Pokebench, reflecting agentic capabilities similar to those in the Voyager paper. The launch is accompanied by extensive documentation, cookbooks, and prompting guides for extended thinking. Social media announcements highlighted it as "the first generally available hybrid reasoning model" and Claude Code as the "first coding tool from Anthropic".