Topic: "pipeline-optimization"
GPT-5.4: SOTA Knowledge Work, Coding, and CUA Model, OpenAI is so very back
gpt-5.4 gpt-5.4-pro openai cursor_ai perplexity_ai arena native-computer-use long-context efficiency steering benchmarking gpu-kernels attention-mechanisms algorithmic-optimization pipeline-optimization sama reach_vb scaling01 danshipper yuchenj_uw
OpenAI launched GPT-5.4 and GPT-5.4 Pro as unified mainline and Codex models, featuring native computer use, up to ~1M tokens of context, and efficiency improvements including a new Codex /fast mode. Benchmarks showed strong results, such as 75.0% on OSWorld-Verified (surpassing the human baseline) and 83% on GDPval against industry professionals. User feedback highlighted coding utility but raised concerns about pricing and overthinking. Integrations with devtools including Cursor, Perplexity, and Arena were announced. In systems research, FlashAttention-4 (FA4) was introduced, delivering near-matmul-speed attention on Blackwell GPUs through innovations such as polynomial exp emulation and online softmax. "Steering mid-response" and "fewer tokens, faster speed" were emphasized as UX and efficiency improvements.
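The online softmax credited to FA4 refers to computing softmax in a single streaming pass, maintaining a running max and a rescaled running sum so the full vector of exponentials never needs to be materialized. A minimal Python sketch of the idea (this is an illustration of the general technique, not FA4's fused GPU kernel):

```python
import math

def online_softmax(scores):
    """Numerically stable softmax in one streaming pass over the scores."""
    m = float("-inf")  # running maximum seen so far
    s = 0.0            # running sum of exp(x - m), rescaled as m grows
    for x in scores:
        m_new = max(m, x)
        # Rescale the accumulated sum to the new max, then add this term.
        s = s * math.exp(m - m_new) + math.exp(x - m_new)
        m = m_new
    # Final normalization uses the global max and the accumulated sum.
    return [math.exp(x - m) / s for x in scores]
```

In attention kernels this rescaling trick is what lets the softmax be fused with the matmul over tiles of keys, since each tile only needs the running (max, sum) pair rather than the whole score row.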