Topic: "coding-workflows"
not much happened today
glm-5 glm-5.2 kimi nemotron prime-intellect wandb vibrant-labs anthropic executor yc agentic-reinforcement-learning moe-models inference-optimization training-optimization rollout-orchestration persistent-agents asynchronous-agents organizational-agents agent-ux open-models coding-workflows security post-training benchmarking task-specific-rollouts samsja19 eliebakouch mervenoyann wandb claudeai claudedevs _catwu karpathy zhihu-frontier hwchase17 teknuim rhyssullivan joshua_saxe
Prime Intellect's prime-rl v0.6.0 advances agentic reinforcement learning infrastructure, supporting 1-trillion-parameter MoE models with sub-5-minute step times and a 131k-context GLM-5 agentic setup. The release includes optimizations in inference, training, and rollout orchestration, and supports models such as GLM-5, Kimi, and Nemotron. Anthropic's Claude Tag exemplifies the shift to persistent, asynchronous agents embedded in organizations, already writing 65% of the product team's code and operating as background watchers and proactive task executors in workflows. The ecosystem features innovations like StarAgent, Self-Harness, Hermes Agent, and Executor's MCP gateway for operational agent fleets. GLM-5.2 gains momentum as a leading open model, especially for coding and agentic workflows, which raises security concerns about enabling private offensive workflows without API logging. Together, these developments point to a broader trend of agent training becoming an infrastructure challenge, with emphasis on open post-training stacks, verifiable environments, and task-specific rollouts.
Cohere Command A Reasoning beats GPT-OSS-120B and DeepSeek R1 0528
command-a-reasoning deepseek-v3.1 cohere deepseek intel huggingface baseten vllm-project chutes-ai anycoder agentic-ai hybrid-models long-context fp8-training mixture-of-experts benchmarking quantization reasoning coding-workflows model-pricing artificialanlys reach_vb scaling01 cline ben_burtenshaw haihaoshen jon_durbin _akhaliq willccbb teortaxestex
Cohere's Command A Reasoning model outperforms GPT-OSS in open deep research capabilities, emphasizing agentic use cases for 2025. DeepSeek-V3.1 introduces a hybrid reasoning architecture that toggles between reasoning and non-reasoning modes and is optimized for agentic workflows and coding. It features extensive long-context pretraining (~630B tokens for 32k context, ~209B for 128k), FP8 training, and a large MoE architecture (~37B active parameters per token). Benchmarks show competitive performance, with notable improvements on SWE-Bench and other reasoning tasks. The model is priced at $0.56 per million input tokens and $1.68 per million output tokens on the DeepSeek API, and it enjoys rapid ecosystem integration, including HF weights, INT4 quantization by Intel, and vLLM reasoning toggles. Community feedback highlights the hybrid design's pragmatic approach to agent and software engineering workflows, though some note the lack of tool use in reasoning mode.
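To make the quoted DeepSeek API pricing concrete, here is a minimal sketch of the per-request cost arithmetic. It assumes only the two per-million-token rates given above; the function name and the example token counts are hypothetical, and real billing may differ (for example, cache discounts are not modeled).

```python
# Rates quoted in the summary above (USD per one million tokens).
INPUT_USD_PER_M = 0.56
OUTPUT_USD_PER_M = 1.68


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request from its token counts.

    Hypothetical helper for illustration; it ignores any discounts or
    surcharges the provider may apply.
    """
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = input_tokens * INPUT_USD_PER_M + output_tokens * OUTPUT_USD_PER_M
    return total / 1_000_000


if __name__ == "__main__":
    # Example: a coding-agent turn with a long prompt and a short reply.
    cost = estimate_cost_usd(input_tokens=200_000, output_tokens=20_000)
    print(f"Estimated cost: ${cost:.4f}")
```

At these rates output tokens cost three times as much as input tokens, so long reasoning traces tend to dominate the bill for agentic workloads even when prompts are large.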