All tags
Topic: "tool-calling"
Claude Skills grows: Open Standard, Directory, Org Admin
claude-skills gpt-5.2-codex gemini-3-flash functiongemma t5gemma-2 anthropic openai google-deepmind hugging-face agentic-ai fine-tuning long-context tool-calling on-device-ai multimodality security workflow-optimization sama gregbrockman philschmid
Claude Skills are gaining significant traction since their launch in October, with a milestone of 100k views in one day for the Claude Skills talk, signaling growing adoption and importance. Announcements include org admin support, a new Skills Directory, and the move to an open standard named Agent Skills. In frontier model launches, OpenAI released GPT-5.2-Codex, touted as the best agentic coding model with improvements in native compaction, long-context reliability, and tool-calling, emphasizing real-world security impacts. Google DeepMind introduced Gemini 3 Flash, focusing on speed as a product feature impacting workflows and user engagement, alongside FunctionGemma and T5Gemma 2, emphasizing on-device deployment, fine-tuning, and multimodality.
Gemini 3.0 Flash Preview: 1/4 cost of Pro, but ~as smart, retakes Pareto Frontier
gemini-3-flash gemini-3 gpt-5.2 gemini-3-pro google google-deepmind tool-calling multimodality benchmarking reasoning cost-efficiency model-performance context-window agentic-ai model-deployment sundar_pichai jeffdean demishassabis
Google launched Gemini 3 Flash, a pro-grade reasoning model with flash latency, supporting tool calling and multimodal IO, available via multiple platforms including Google AI Studio and Vertex AI. It offers competitive pricing at $0.50 per 1M input tokens and $3.00 per 1M output tokens, with context windows up to 1M tokens. Benchmarks show Gemini 3 Flash rivals or outperforms larger models like GPT-5.2 and Gemini 3 Pro in agentic, coding, and reasoning tasks, validated by ARC-AGI-2, SWE-bench, LMArena, and Arena benchmarks. Despite some tradeoffs like high token use and hallucination rates, it is cost-effective overall. Key figures include Sundar Pichai, Jeff Dean, and Demis Hassabis who publicly celebrated this achievement. The model's tool calling capabilities were demonstrated with 100 tools in a live demo.