All tags
Person: "jeffdean"
Gemini 3.0 Flash Preview: 1/4 cost of Pro, but ~as smart, retakes Pareto Frontier
gemini-3-flash gemini-3 gpt-5.2 gemini-3-pro google google-deepmind tool-calling multimodality benchmarking reasoning cost-efficiency model-performance context-window agentic-ai model-deployment sundar_pichai jeffdean demishassabis
Google launched Gemini 3 Flash, a pro-grade reasoning model with flash latency, supporting tool calling and multimodal IO, available via multiple platforms including Google AI Studio and Vertex AI. It offers competitive pricing at $0.50 per 1M input tokens and $3.00 per 1M output tokens, with context windows up to 1M tokens. Benchmarks show Gemini 3 Flash rivals or outperforms larger models like GPT-5.2 and Gemini 3 Pro in agentic, coding, and reasoning tasks, validated by ARC-AGI-2, SWE-bench, LMArena, and Arena benchmarks. Despite some tradeoffs like high token use and hallucination rates, it is cost-effective overall. Key figures include Sundar Pichai, Jeff Dean, and Demis Hassabis who publicly celebrated this achievement. The model's tool calling capabilities were demonstrated with 100 tools in a live demo.
Nano Banana Pro (Gemini Image Pro) solves text-in-images, infographic generation, 2-4k resolution, and Google Search grounding
gemini-3-pro gpt-5 google openai hugging-face togethercompute lmsys image-generation text-rendering model-provenance scientific-research proof-assistance multimodal-integration api-access fine-tuning jeffdean kevinweil demishassabis
Google launched Gemini 3 Pro Image (Nano Banana Pro), a next-generation AI image generation and editing model with integrated Google Search grounding, multi-image composition, and fine-grained visual controls, offering pricing at $0.134 per 2K image and $0.24 per 4K image. It features improved text rendering with error rates dropping from 56% to 8% compared to its predecessor, and includes SynthID watermark checks for provenance. The model is available via Gemini App, API, LM Arena, Hugging Face Spaces, Together AI, and Flow. Meanwhile, OpenAI shared early experiments with GPT-5 accelerating scientific research, including proofs of previously unsolved problems in math, physics, biology, and materials science. "GPT-5 accelerated research tasks in math/physics/biology/materials; in 4, it helped find proofs of previously unsolved problems."
GPT 5.1 in ChatGPT: No evals, but adaptive thinking and instruction following
gpt-5.1 gpt-5.0 claude isaac-0.1 qwen3vl-235b glm-4.6 gemini openai anthropic waymo perceptron langchain llamaindex nousresearch adaptive-reasoning instruction-following personalization autonomous-driving robotics multimodality agent-evaluation agent-governance middleware structured-extraction benchmarking dmitri_dolgov jeffdean fidji_simo akshats07
OpenAI launched GPT-5.1 with improvements in conversational tone, instruction following, and adaptive reasoning. GPT-5.0 is being sunset in 3 months. ChatGPT introduces new tone toggles for personalization, serving over 800 million users. Waymo rolls out freeway driving for public riders in major California cities, showcasing advances in autonomous driving. Anthropic's Project Fetch explores LLMs as robotics copilots using Claude. Perceptron releases a new API and Python SDK for multimodal perception-action apps supporting Isaac-0.1 and Qwen3VL-235B. Code Arena offers live coding evaluations supporting Claude, GPT-5, GLM-4.6, and Gemini. LangChain introduces middleware for agent governance with human-in-the-loop controls. LlamaIndex releases a structured extraction template for SEC filings using LlamaAgents. NousResearch promotes ARC Prize benchmarks for generalized intelligence evaluation.