Company: "upstage"
not much happened today
qwen-image-2512 ax-k1 k-exaone sk-telecom lg upstage naver alibaba unsloth replicate mixture-of-experts model-release quantization open-source-models image-generation model-integration model-benchmarking compute-costs dataset-curation eliebakouch clementdelangue dorialexander rising_sayak _akhaliq ostrisai ivanfioravanti yupp_ai
South Korea's Ministry of Science launched a coordinated program with five companies to develop sovereign foundation models from scratch, featuring large-scale MoE architectures such as SK Telecom A.X-K1 (519B total / 33B active) and LG K-EXAONE (236B MoE / 23B active), with a total first-round budget of ~$140M. The initiative contrasts with EU approaches by concentrating funding on fewer stakeholders and explicitly budgeting for data. Meanwhile, Alibaba's Qwen-Image-2512 has emerged as a leading open-source image generation model. It was rapidly integrated into toolchains including AI-Toolkit and local inference paths with quantization support, and it is hosted on platforms such as Replicate. The model has also gone through more than 10,000 rounds of blind testing on AI Arena, underscoring its ecosystem adoption.
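To put the total/active parameter counts above in perspective, here is a minimal sketch that computes the share of parameters used per token for each MoE model. The helper name `active_fraction` is hypothetical, and the only inputs are the figures quoted in the summary.

```python
def active_fraction(total_b: float, active_b: float) -> float:
    """Share of parameters active per token, given counts in billions."""
    return active_b / total_b


# (total, active) parameter counts in billions, as quoted in the summary.
models = {
    "SK Telecom A.X-K1": (519, 33),
    "LG K-EXAONE": (236, 23),
}

for name, (total_b, active_b) in models.items():
    share = active_fraction(total_b, active_b)
    print(f"{name}: {share:.1%} of parameters active per token")
```

Both models route each token through well under a tenth of their weights, which is the usual motivation for MoE: per-token compute tracks the active count rather than the total.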
12/13/2023 SOLAR10.7B upstages Mistral7B?
solar-10.7b llama-2 mistral-7b phi-2 gpt-4 gemini upstage nous-research openai mistral-ai microsoft depth-up-scaling pretraining synthetic-data gpu-training api-usage model-integration agi asi chat-models vision model-performance fine-tuning
Upstage released the SOLAR-10.7B model, which uses a novel Depth Up-Scaling technique built on the llama-2 architecture and integrates mistral-7b weights, followed by continued pre-training. The Nous community finds it promising but not exceptional. Additionally, weights for the phi-2 base model were released; it was trained on 1.4 trillion tokens, including synthetic texts created by GPT-3 and filtered by GPT-4, using 96 A100 GPUs over 14 days. On OpenAI's Discord, users discussed challenges with various GPT models, including incoherent outputs, API usage limitations, and issues with the GPT-4 Vision API. Conversations also covered understanding AGI and ASI, concerns about OpenAI's partnership with Axel Springer, and pricing changes for ChatGPT Plus. Users also discussed the Gemini chat model integrated into Bard and compared its performance with GPT-4.
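As a rough illustration of Depth Up-Scaling, the sketch below duplicates a layer stack, trims the layers at the seam, and concatenates the two copies. The 32-layer base and the 8-layer trim are assumptions taken from the figures reported in the SOLAR 10.7B paper, not from the summary above; the layers are plain integers standing in for transformer blocks, and `depth_up_scale` is a hypothetical helper, not Upstage's code.

```python
def depth_up_scale(layers: list, m: int) -> list:
    """Duplicate a layer stack, trim m layers at the seam, and concatenate."""
    if not 0 <= m < len(layers):
        raise ValueError("m must be between 0 and len(layers) - 1")
    top = layers[: len(layers) - m]  # first copy without its last m layers
    bottom = layers[m:]              # second copy without its first m layers
    return top + bottom


# Assumed sizes: a 32-layer base model with 8 layers trimmed from each copy.
base = list(range(32))
scaled = depth_up_scale(base, 8)
print(len(scaled))  # 48 layers in the up-scaled stack
```

In the real model the duplicated blocks start from the pretrained weights, and the continued pre-training mentioned above is what recovers performance after the splice.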