Topic: "supercomputing"
xAI raises $20B Series E at ~$230B valuation
grok-5 claude-code xai nvidia cisco fidelity valor-equity-partners qatar-investment-authority mgx stepstone-group baron-capital-group hugging-face amd ai-infrastructure supercomputing robotics ai-hardware agentic-ai context-management token-optimization local-ai-assistants aakash_gupta fei-fei_li lisa_su clementdelangue thom_wolf saradu omarsar0 yuchenj_uw _catwu cursor_ai
xAI, Elon Musk's AI company, closed a $20 billion Series E funding round at a valuation of about $230 billion, with investors including Nvidia and Cisco Investments, among others. The funds will support AI infrastructure expansion, including the Colossus I and II supercomputers, and the training of Grok 5, drawing on data from a reported 600 million monthly active users. At CES 2026, the theme was "AI everywhere," with a strong emphasis on AI-first hardware and on the integration between Nvidia and Hugging Face's LeRobot for robotics development. The Reachy Mini robot is gaining traction as a consumer robotics platform. In software, Claude Code is emerging as a popular local/private coding assistant, with new UI features in Claude Desktop, while Cursor's dynamic context cuts token usage by nearly 47% in multi-MCP setups. "The 600 million MAU figure in xAI’s announcement combines X platform users with Grok users. That’s a clever framing choice."
gpt-image-1 - ChatGPT's imagegen model, confusingly NOT 4o, now available in API
gpt-image-1 o3 o4-mini gpt-4.1 eagle-2.5-8b gpt-4o qwen2.5-vl-72b openai nvidia hugging-face x-ai image-generation content-moderation benchmarking long-context multimodality model-performance supercomputing virology video-understanding model-releases kevinweil lmarena_ai _philschmid willdepue arankomatsuzaki epochairesearch danhendrycks reach_vb mervenoyann _akhaliq
OpenAI officially launched the gpt-image-1 API for image generation and editing, supporting features like alpha channel transparency and a "low" content moderation policy. OpenAI's o3 and o4-mini lead benchmarks for style control, math, coding, and hard prompts, with o3 ranking #1 in several categories. A new benchmark called Vending-Bench reveals performance variance in LLMs on extended tasks. GPT-4.1 ranks in the top 5 for hard prompts and math. Nvidia's Eagle 2.5-8B matches GPT-4o and Qwen2.5-VL-72B in long-video understanding. AI supercomputer performance is doubling every 9 months; xAI's Colossus cost an estimated $7 billion, and the US accounts for about 75% of global performance. The Virology Capabilities Test shows OpenAI's o3 outperforms 94% of expert virologists. Nvidia also released the Describe Anything Model (DAM), a multimodal LLM for detailed image and video captioning, now available on Hugging Face.
1/16/2024: TIES-Merging
mixtral-8x7b nous-hermes-2 frankendpo-4x7b-bf16 thebloke hugging-face nous-research togethercompute oak-ridge-national-laboratory vast-ai runpod mixture-of-experts random-gate-routing quantization gptq exl2-quants reinforcement-learning-from-human-feedback supercomputing trillion-parameter-models ghost-attention model-fine-tuning reward-models sanjiwatsuki superking__ mrdragonfox _dampf kaltcit rombodawg technotech
TheBloke's Discord community actively discusses Mixture of Experts (MoE) models, focusing on random gate routing layers for training and the challenges of using such models immediately. There is a robust debate on quantization methods comparing GPTQ and EXL2 quants, with EXL2 noted for faster execution on specialized hardware. A new model, Nous Hermes 2, based on Mixtral 8x7B and trained with RLHF, claims benchmark superiority but shows some inconsistencies. The Frontier supercomputer at Oak Ridge National Laboratory is highlighted for training a trillion-parameter LLM with 14TB RAM, sparking discussions on open-sourcing government-funded AI research. Additionally, the application of ghost attention in the academicat model is explored, with mixed reactions from the community. Key insights shared include "Random gate layer is good for training but not for immediate use" and "EXL2 might offer faster execution on specialized hardware."