Model: "glm-5"
Z.ai GLM-5: New SOTA Open Weights LLM
glm-5 glm-4.5 kimi-k2.5 zhipu-ai openrouter modal deepinfra ollama qoder vercel deepseek-sparse-attention long-context model-scaling pretraining benchmarking office-productivity context-window model-deployment cost-efficiency
Zhipu AI launched GLM-5, an Opus-class model scaled up from GLM-4.5's 355B parameters to 744B, with DeepSeek Sparse Attention integrated for cost-efficient long-context serving. GLM-5 achieves SOTA on BrowseComp, leads Vending Bench 2, and surpasses Kimi K2.5 on the office-productivity-focused GDPVal-AA benchmark. Despite broad availability on platforms such as OpenRouter, Modal, DeepInfra, and Ollama Cloud, GLM-5 faces compute constraints that affect both rollout and pricing. The model supports a 200K-token context window and up to 128K output tokens.