Topic: "model-deprecation"
SOTA Video Gen: Veo 2 and Kling 2 are GA for developers
veo-2 gemini gpt-4.1 gpt-4o gpt-4.5-preview gpt-4.1-mini gpt-4.1-nano google openai video-generation api coding instruction-following context-window performance benchmarks model-deprecation kevinweil stevenheidel aidan_clark_
Google's Veo 2 video generation model is now available in the Gemini API at $0.35 per second of generated video, a significant step toward broadly accessible video generation. Meanwhile, China's Kling 2 launched with pricing around $2 for a 10-second clip and a minimum subscription of $700 per month for 3 months, generating excitement despite mixed early results from users. OpenAI announced the GPT-4.1 family: GPT-4.1, GPT-4.1 mini, and GPT-4.1 nano, highlighting improvements in coding, instruction following, and a 1 million token context window. The GPT-4.1 models are 26% cheaper than GPT-4o and will replace the GPT-4.5 Preview API version by July 14. Performance benchmarks show GPT-4.1 achieving 54-55% on SWE-bench Verified and a 60% improvement over GPT-4o in some internal tests, though some critiques note it underperforms models such as DeepSeek V3 in coding comparisons on OpenRouter. The release is API-only, and OpenAI published a prompting guide for developers.
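For developers who want to try Veo 2 through the Gemini API, a minimal sketch using the google-genai Python SDK looks like the following. The model name, polling interval, and response fields reflect the public Gemini API docs at the time of writing but should be treated as assumptions; check the current reference before relying on them.

```python
# pip install google-genai
import time
from google import genai

client = genai.Client(api_key="YOUR_GEMINI_API_KEY")

# Video generation is a long-running operation: submit the job, then poll.
# At $0.35/second, a default 8-second clip works out to about $2.80.
operation = client.models.generate_videos(
    model="veo-2.0-generate-001",
    prompt="A timelapse of storm clouds rolling over a mountain range",
)

# Poll until the job finishes; generation typically takes a few minutes.
while not operation.done:
    time.sleep(20)
    operation = client.operations.get(operation)

# Download and save each generated clip locally.
for n, generated in enumerate(operation.response.generated_videos):
    client.files.download(file=generated.video)
    generated.video.save(f"veo_clip_{n}.mp4")
```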
GPT 4.1: The New OpenAI Workhorse
gpt-4.1 gpt-4.1-mini gpt-4.1-nano gpt-4o gemini-2.5-pro openai llama-index perplexity-ai google-deepmind coding instruction-following long-context benchmarks model-pricing model-integration model-deprecation sama kevinweil omarsar0 aidan_mclau danhendrycks polynoamial scaling01 aravsrinivas lmarena_ai
OpenAI released GPT-4.1, alongside GPT-4.1 mini and GPT-4.1 nano, highlighting improvements in coding, instruction following, and long contexts up to 1 million tokens. The model scores 54% on SWE-bench Verified and shows a 60% improvement over GPT-4o on internal benchmarks. Pricing for GPT-4.1 nano is notably low at $0.10/1M input tokens and $0.40/1M output tokens. GPT-4.5 Preview is being deprecated in favor of GPT-4.1. LlamaIndex shipped day-0 integration support, though some negative feedback was noted for GPT-4.1 nano. Separately, Perplexity's Sonar API tied with Gemini-2.5 Pro for the top spot on the LMArena Search Arena leaderboard. New benchmarks such as MRCR and GraphWalks were introduced alongside updated prompting guides and cookbooks.
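Because the release is API-only, adopting the family is mostly a model-name change in existing code. A minimal sketch with the official openai Python SDK, including a back-of-envelope cost check at the published nano rates (the prompt and output formatting here are illustrative only):

```python
# pip install openai
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# "gpt-4.1", "gpt-4.1-mini", and "gpt-4.1-nano" are drop-in model names.
response = client.chat.completions.create(
    model="gpt-4.1-nano",
    messages=[
        {"role": "system", "content": "You are a concise coding assistant."},
        {"role": "user", "content": "Squares of even numbers in one line of Python?"},
    ],
)
print(response.choices[0].message.content)

# Back-of-envelope cost at nano pricing: $0.10/1M input, $0.40/1M output.
usage = response.usage
cost = usage.prompt_tokens * 0.10 / 1e6 + usage.completion_tokens * 0.40 / 1e6
print(f"request cost ~= ${cost:.8f}")
```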
Mistral Large 2 + RIP Mistral 7B, 8x7B, 8x22B
mistral-large-2 mistral-nemo-12b llama-3.1-8b llama-3.1-70b llama-3.1 llama-3-405b yi-34b-200k gpt-4o mistral-ai meta-ai-fair groq togethercompute code-generation math function-calling reasoning context-windows model-deprecation pretraining posttraining benchmarking
Mistral Large 2 is a 123B-parameter model released with open weights under a research license, focused on code generation, math performance, and a 128k context window, up from Mistral Large 1's 32k. Mistral claims better function calling than GPT-4o and enhanced reasoning. Meanwhile, Meta officially released the Llama-3.1 family, including Llama-3.1-8B, Llama-3.1-70B, and the 405B flagship, with detailed pre-training and post-training insights. The Llama-3.1-8B model's 128k-context performance was found underwhelming compared to Mistral Nemo and Yi 34B 200K. Mistral is deprecating its older Apache-licensed open-source models (Mistral 7B, Mixtral 8x7B, and Mixtral 8x22B) to focus on Large 2 and Mistral Nemo 12B. The news also prompted community discussion and benchmarking comparisons.
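Since function calling is the headline claim for Large 2, here is a minimal tool-use sketch against the Mistral API using the mistralai v1 Python SDK. The tool name, its JSON schema, and the `mistral-large-latest` alias are assumptions chosen for illustration:

```python
# pip install mistralai
import os
from mistralai import Mistral

client = Mistral(api_key=os.environ["MISTRAL_API_KEY"])

# Declare a tool the model may call, using the OpenAI-style JSON schema
# convention that Mistral's function calling follows.
tools = [{
    "type": "function",
    "function": {
        "name": "get_exchange_rate",  # hypothetical tool for this example
        "description": "Look up the exchange rate between two currencies.",
        "parameters": {
            "type": "object",
            "properties": {
                "base": {"type": "string"},
                "quote": {"type": "string"},
            },
            "required": ["base", "quote"],
        },
    },
}]

response = client.chat.complete(
    model="mistral-large-latest",
    messages=[{"role": "user", "content": "What's the EUR/USD rate right now?"}],
    tools=tools,
)

# If the model decided to invoke the tool, the structured call appears here.
print(response.choices[0].message.tool_calls)
```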