Person: "johnschulman2"
not much happened today
gpt-5.5 codex thinking-machines openai anthropic multimodality real-time-interaction visual-proactivity deployment cybersecurity threat-modeling automation continuous-audio-video-text-processing security-models field-engineering enterprise-ai johnschulman2 soumithchintala chillee liliyu_lili rown kimmonismus giffmana swyx eliebakouch gdb sama therundownai lukolejnik matvelloso
Thinking Machines previewed its new native interaction models, designed for full-duplex multimodal interaction: the models listen, speak, watch, think, search, and react concurrently in real time, marking a shift beyond turn-based AI. The approach emphasizes continuous audio, video, and text processing, with innovations such as visual proactivity and background tool use, and is implemented using SGLang. Meanwhile, OpenAI announced the OpenAI Deployment Company, a new unit with 150 Forward Deployed Engineers and a $4B initial investment to help enterprises deploy frontier models, signaling a move into the deployment layer of the AI economy. OpenAI also launched Daybreak, a security-focused initiative that integrates GPT-5.5 and Codex for cyber defense, threat modeling, and automated patching, and that offers differentiated access tiers, including GPT-5.5-Cyber. This contrasts with Anthropic's more restrictive cyber approach, highlighting tensions in AI security strategies.
not much happened today
claude-4 claude-4-opus claude-4-sonnet gemini-2.5-pro gemma-3n imagen-4-ultra anthropic google-deepmind openai codebase-understanding coding agentic-performance multimodality text-to-speech video-generation model-integration benchmarking memory-optimization cline amanrsanger ryanpgreenblatt johnschulman2 alexalbert__ nearcyan mickeyxfriedman jeremyphoward gneubig teortaxesTex scaling01 artificialanlys philschmid
Anthropic's Claude 4 models (Opus 4, Sonnet 4) demonstrate strong coding abilities, with Sonnet 4 scoring 72.7% on SWE-bench and Opus 4 scoring 72.5%. Claude Sonnet 4 excels in codebase understanding and is considered SOTA on large codebases. Criticism arose over Anthropic's handling of ASL-3 security requirements. Demand for Claude 4 is high, with integration into IDEs and support from Cherry Studio and FastHTML. Google DeepMind introduced Gemini 2.5 Pro Deep Think and Gemma 3n, a mobile multimodal model that reduces RAM usage by nearly 3x. Google's Imagen 4 Ultra, available on Vertex AI Studio, ranks third in the Artificial Analysis Image Arena. Google also promoted Google Beam, an AI video model for immersive 3D experiences, and new text-to-speech models with multi-speaker support. On the GAIA benchmark, Claude 4 Opus and Sonnet lead in agentic performance.