Topic: "open-weight-models"
o3-mini launches, OpenAI on "wrong side of history"
o3-mini o1 gpt-4o mistral-small-3-24b deepseek-r1 openai mistral-ai deepseek togethercompute fireworksai_hq ai-gradio replicate reasoning safety cost-efficiency model-performance benchmarking api open-weight-models model-releases sam-altman
OpenAI released o3-mini, a new reasoning model available to both free and paid users. With its "high" reasoning effort option, it outperforms the earlier o1 model on STEM tasks and safety benchmarks while costing 93% less per token. Sam Altman acknowledged that OpenAI has been on the "wrong side of history" on open source, signaled a shift in strategy, and credited DeepSeek R1 with changing the company's assumptions. Mistral AI launched Mistral Small 3 (24B), an open-weight model with competitive performance and low API costs. DeepSeek R1 is supported by Text Generation Inference v3.1.0 and is available via ai-gradio and Replicate. Together, the news highlights advances in reasoning, cost-efficiency, and safety in AI models.
Rombach et al: FLUX.1 [pro|dev|schnell], $31m seed for Black Forest Labs
gemma-2-2b gpt-3.5-turbo-0613 mixtral-8x7b flux-1 stability-ai google-deepmind nvidia text-to-image text-to-video model-benchmarking open-weight-models model-distillation safety-classifiers sparse-autoencoders ai-coding-tools rohanpaul_ai fchollet bindureddy clementdelangue ylecun svpino
Robin Rombach, a Stable Diffusion co-creator formerly at Stability AI, launched FLUX.1 through his new company Black Forest Labs. FLUX.1 is a text-to-image model with three variants: pro (API only), dev (open-weight, non-commercial), and schnell (Apache 2.0). According to Black Forest Labs' own Elo scores, FLUX.1 outperforms Midjourney and Ideogram, and the company plans to expand into text-to-video. Google DeepMind released Gemma-2 2B, a 2-billion-parameter open-weight model that outperforms larger models such as GPT-3.5-Turbo-0613 and Mixtral-8x7b on Chatbot Arena and is optimized with NVIDIA TensorRT-LLM. The release also includes safety classifiers (ShieldGemma) and sparse autoencoder analysis tools (Gemma Scope). Discussions highlighted benchmarking discrepancies and US government support for open-weight AI models, and some critiqued the productivity gains claimed for AI coding tools.