Topic: "creative-ai"
Gemini 2.5 Deep Think finally ships
gemini-2.5-deep-think gpt-oss gpt-5 kimi-k2-turbo-preview qwen3-coder-flash glm-4.5 step-3 claude openai anthropic google-deepmind kimi-moonshot alibaba ollama zhipu-ai stepfun parallel-thinking model-releases moe attention-mechanisms multimodal-reasoning model-performance context-windows open-source-models model-leaks creative-ai coding reasoning model-optimization demishassabis philschmid scaling01 teortaxestex teknium1 lmarena_ai andrewyng
OpenAI is rumored to be launching new GPT-OSS and GPT-5 models soon, amid drama over Anthropic revoking access to Claude. Google DeepMind quietly launched Gemini 2.5 Deep Think, a model optimized for parallel thinking that achieved gold-medal level at the IMO and excels in reasoning, coding, and creative tasks. Leaks suggest OpenAI is developing a 120B MoE model and a 20B model with advanced attention mechanisms. Chinese AI companies such as Kimi Moonshot, Alibaba, and Zhipu AI are releasing faster and more capable open models, including kimi-k2-turbo-preview, Qwen3-Coder-Flash, and GLM-4.5, signaling strong momentum and the potential to surpass the U.S. in AI development. One quoted detail highlights rapid development cycles: "The final checkpoint was selected just 5 hours before the IMO problems were released."
X.ai Grok 3 and Mira Murati's Thinking Machines
grok-3 grok-3-mini gemini-2-pro gpt-4o o3-mini-high o1 deepseek-r1 anthropic openai thinking-machines benchmarking reasoning reinforcement-learning coding multimodality safety alignment research-publishing model-performance creative-ai mira-murati lmarena_ai karpathy omarsar0 ibab arankomatsuzaki iscienceluvr scaling01
Grok 3 has launched to mixed opinions but strong benchmark performance, notably outperforming models like Gemini 2 Pro and GPT-4o. The Grok 3 mini variant shows competitive and sometimes superior capabilities, especially in reasoning and coding, with reinforcement learning playing a key role. Mira Murati has publicly shared her post-OpenAI plan: founding the frontier lab Thinking Machines, which focuses on collaborative, personalizable AI, multimodality, and empirical safety and alignment research, an approach reminiscent of Anthropic's.
12/21/2023: The State of AI (according to LangChain)
mixtral gpt-4 chatgpt bard dall-e langchain openai perplexity-ai microsoft poe model-consistency model-behavior response-quality chatgpt-usage-limitations error-handling user-experience model-comparison hallucination-detection prompt-engineering creative-ai
LangChain published its first report, based on LangSmith stats, revealing top charts for mindshare. On OpenAI's Discord, users raised issues about the Mixtral model, noting inconsistencies and comparing it to Poe's Mixtral. There were reports of declining output quality and unpredictable behavior in GPT-4 and ChatGPT, along with discussions of the differences between Playground GPT-4 and ChatGPT GPT-4. Users also reported anomalous behavior in the Bing and Bard AI models, including hallucinations and strange assertions. Other user concerns included message limits on GPT-4, response completion errors, chat lags, inaccessible voice settings, password reset failures, 2FA issues, and subscription restrictions. Techniques for guiding GPT-4 outputs and creative uses of DALL-E were also discussed. Users also mentioned financial constraints affecting their subscriptions and asked about earning money with ChatGPT and about token costs.