Company: "waymo"
GPT 5.1 in ChatGPT: No evals, but adaptive thinking and instruction following
gpt-5.1 gpt-5.0 claude isaac-0.1 qwen3vl-235b glm-4.6 gemini openai anthropic waymo perceptron langchain llamaindex nousresearch adaptive-reasoning instruction-following personalization autonomous-driving robotics multimodality agent-evaluation agent-governance middleware structured-extraction benchmarking dmitri_dolgov jeffdean fidji_simo akshats07
OpenAI launched GPT-5.1 with improvements in conversational tone, instruction following, and adaptive reasoning. GPT-5.0 will be sunset in three months. ChatGPT, which serves over 800 million users, introduces new tone toggles for personalization. Waymo rolls out freeway driving for public riders in major California cities, showcasing advances in autonomous driving. Anthropic's Project Fetch explores using Claude as a robotics copilot. Perceptron releases a new API and Python SDK for multimodal perception-action apps, with support for Isaac-0.1 and Qwen3VL-235B. Code Arena offers live coding evaluations that support Claude, GPT-5, GLM-4.6, and Gemini. LangChain introduces middleware for agent governance with human-in-the-loop controls. LlamaIndex releases a structured extraction template for SEC filings built on LlamaAgents. NousResearch promotes ARC Prize benchmarks for evaluating generalized intelligence.
AI Engineer World's Fair: Second Run, Twice The Fun
gemini-2.5-pro google-deepmind waymo tesla anthropic braintrust retrieval-augmentation graph-databases recommendation-systems software-engineering-agents agent-reliability reinforcement-learning voice image-generation video-generation infrastructure security evaluation ai-leadership enterprise-ai mcp tiny-teams product-management design-engineering robotics foundation-models coding web-development demishassabis
The 2025 AI Engineer World's Fair is expanding with 18 tracks covering topics like Retrieval + Search, GraphRAG, RecSys, SWE-Agents, Agent Reliability, Reasoning + RL, Voice AI, Generative Media, Infrastructure, Security, and Evals. New focus areas include MCP, Tiny Teams, Product Management, Design Engineering, and Robotics and Autonomy, the last featuring foundation models from Waymo, Tesla, and Google. The event highlights the growing importance of AI Architects and enterprise AI leadership. Separately, Demis Hassabis announced the Gemini 2.5 Pro Preview 'I/O edition', which leads coding and web development benchmarks on LMArena.