All tags
Topic: "cloud-computing"
OpenAI closes $110B raise from Amazon, NVIDIA, SoftBank in largest startup fundraise in history @ $840B post-money
codex chatgpt openai softbank nvidia amazon microsoft model-scaling model-metrics investment cloud-computing infrastructure training-capacity user-growth partnerships sama
OpenAI has closed a funding round totaling $110 billion at a $730 billion pre-money valuation ($840 billion post-money), with investments from SoftBank ($30B), NVIDIA ($30B), and Amazon ($50B). Key user metrics include 1.6 million weekly Codex users, over 9 million paying ChatGPT business users, more than 900 million weekly active ChatGPT users, and 50 million consumer subscribers. The Amazon partnership includes exclusive cloud services and 2 gigawatts of Trainium capacity, while Microsoft maintains a reduced partnership based on stateless APIs. One of the largest funding rounds in history, it underscores OpenAI's dominant position in AI adoption and infrastructure.
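As a quick sanity check on the figures above (a trivial sketch; all numbers are those reported in the summary):

```python
# Valuation arithmetic from the reported figures.
raise_amount_b = 110                  # $110B round
pre_money_b = 730                     # $730B pre-money valuation
post_money_b = pre_money_b + raise_amount_b
print(post_money_b)                   # 840 -> matches the $840B post-money headline

# Reported investor breakdown sums exactly to the round size.
investors_b = {"SoftBank": 30, "NVIDIA": 30, "Amazon": 50}
print(sum(investors_b.values()))      # 110
```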
not much happened today
kimi-k2 qwen3-next nemotron-nano-2 granite-4.0 gpt-4.5 copilot codex vllm perplexity-ai ibm anthropic graphiti claude cursor-ai microsoft mixture-of-experts model-integration cloud-computing hybrid-models benchmarking agent-systems memory-persistence semantic-search code-retrieval context-length-optimization tool-use evaluation-frameworks software-development scaling01 cedric_chee aravsrinivas omarsar0 _avichawla pierceboggan jo_parkhurst jyangballin ofirpress ml_angelopoulos
Kimi-K2 Reasoner has been integrated into vLLM and will soon be supported by SGLang, featuring a massive 1.2 trillion parameter MoE configuration. Perplexity AI released research on cloud-portable trillion-parameter MoE kernels optimized for AWS EFA, with potential integration into vLLM. IBM's vLLM team formalized hybrid dense and sparse expert models, supporting models like Qwen3-Next, Nemotron Nano 2, and Granite 4.0. Kimi-K2 reportedly scores 77% on GPQA Diamond, outperforming GPT-4.5 at 71.4%, though this is unverified.
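A sparse MoE layer of the kind these trillion-parameter models use routes each token to a small top-k subset of experts rather than running all of them. A minimal, illustrative Python sketch (toy scalar experts and a random gate — not any model's actual kernel):

```python
import math
import random

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def moe_forward(x, experts, gate_weights, top_k=2):
    """Route input x to the top_k experts by gate score and combine
    their outputs weighted by the renormalized gate probabilities."""
    scores = [sum(w * xi for w, xi in zip(row, x)) for row in gate_weights]
    probs = softmax(scores)
    top = sorted(range(len(experts)), key=lambda i: probs[i], reverse=True)[:top_k]
    norm = sum(probs[i] for i in top)
    # Only top_k experts execute -- the rest are skipped entirely,
    # which is where the compute savings of sparse MoE come from.
    return sum((probs[i] / norm) * experts[i](x) for i in top)

# Four toy scalar "experts", each a different linear map of the input sum.
experts = [lambda x, k=k: k * sum(x) for k in range(1, 5)]
random.seed(0)
gate = [[random.uniform(-1, 1) for _ in range(3)] for _ in range(4)]
print(moe_forward([0.5, -0.2, 0.1], experts, gate, top_k=2))
```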
Anthropic published a guide on building efficient tool-heavy agent systems with MCP patterns, reducing context token usage by roughly 98.7%. Graphiti MCP demonstrated shared memory across apps such as Claude Desktop and Cursor for persistent agent memory. VS Code introduced an "Agent sessions" feature to unify agent management, covering both Copilot and Codex. Cursor AI improved coding accuracy via semantic search and code-retrieval embeddings. New evaluation frameworks such as CodeClash and LMArena assess agent and coding-model performance through realistic multi-round tasks and occupation-tagged leaderboards.
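The token savings in patterns like this come largely from not loading every tool definition into the model's context upfront: the agent sees an index of tool names and pulls full definitions only for tools it actually uses. A toy sketch of that accounting, with a hypothetical tool catalog and a crude characters-to-tokens estimate (all names and sizes here are invented for illustration):

```python
import json

# Hypothetical catalog: 200 tools with verbose JSON-schema-like definitions.
TOOLS = {f"tool_{i}": {"name": f"tool_{i}",
                       "description": "does something " * 40,
                       "parameters": {"type": "object"}}
         for i in range(200)}

def rough_tokens(obj) -> int:
    # Crude estimate: ~1 token per 4 characters of serialized JSON.
    return len(json.dumps(obj)) // 4

def context_all_upfront():
    # Naive approach: every full definition lands in the prompt.
    return rough_tokens(list(TOOLS.values()))

def context_on_demand(needed):
    # Lazy approach: just the name index, plus definitions actually fetched.
    index = list(TOOLS.keys())
    defs = [TOOLS[n] for n in needed]
    return rough_tokens(index) + rough_tokens(defs)

full = context_all_upfront()
lazy = context_on_demand(["tool_3", "tool_42"])
print(full, lazy, f"{1 - lazy / full:.1%} saved")
```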
OpenAI completes Microsoft + For-profit restructuring + announces 2028 AI Researcher timeline + Platform / AI cloud product direction + next $1T of compute
OpenAI has completed a major recapitalization and restructuring, forming a Public Benefit Corporation with a non-profit Foundation holding special voting rights and equity valued at $130B. Microsoft holds about 27% diluted ownership and has committed to $250B in Azure spend, losing exclusivity on compute but retaining Azure API exclusivity until AGI is declared. The compute infrastructure deals for 2025 total 30GW worth $1.4T, with OpenAI aiming to build 1GW per week at $20B per GW and projecting $3-4 trillion in infrastructure by 2033. The company is shifting focus from first-party apps to a platform approach, emphasizing ecosystem growth and third-party development. Sam Altman (sama) is the key figure in this transition, which carries significant financial and strategic implications for AI industry partnerships, including openness to Anthropic and Google Gemini models on Azure.
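A back-of-envelope check of the buildout arithmetic quoted above (all inputs come from the summary; the comparison is illustrative only):

```python
# Build-rate target: 1GW per week at $20B per GW.
cost_per_gw_b = 20
gw_per_week = 1
annual_spend_t = gw_per_week * 52 * cost_per_gw_b / 1000
print(f"${annual_spend_t:.2f}T per year")   # ~$1.04T/year at the target rate

# Implied cost of the 2025 compute deals: 30GW for $1.4T.
deal_total_t = 1.4
deal_gw = 30
print(f"${deal_total_t * 1000 / deal_gw:.0f}B per GW implied by the 2025 deals")
# ~$47B/GW -- well above the $20B/GW build target, so the two figures
# are evidently measuring different things (deals vs. raw build cost).
```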
OpenAI updates Codex with a VS Code extension that can sync tasks with Codex Cloud
codex stepwiser gemini-2.5-flash nemotron-cc-math jet-nemotron openai facebook-ai-fair google-deepmind nvidia process-reward-modeling reinforcement-learning chain-of-thought spatial-reasoning multi-image-fusion developer-tools code-review ide-extension cli cloud-computing model-efficiency jaseweston tesatory benjamindekr tokumin fabianstelzer officiallogank
OpenAI Codex has launched a new IDE extension integrating with VS Code and Cursor, enabling seamless local-to-cloud task handoff, sign-in via ChatGPT plans, an upgraded CLI, and GitHub code-review automation. Facebook AI researchers introduced StepWiser, a process-level reward model that improves reasoning and training through chunk-by-chunk evaluation, achieving SOTA on ProcessBench. Google DeepMind's Gemini 2.5 Flash Image model showcases advanced spatial reasoning, multi-image fusion, and developer tools, including a browser extension for image remixing. NVIDIA released efficiency data on Nemotron-CC-Math (133B) and Jet-Nemotron models.
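StepWiser's core idea — judging a reasoning trace chunk by chunk rather than only at the final answer — can be sketched as follows. The judge here is a trivial stub that flags one known-bad arithmetic step; the actual StepWiser judge is a learned model that reasons about each chunk in context before emitting a verdict:

```python
def judge_chunk(context: str, chunk: str) -> bool:
    """Stub verdict: a real process reward model would condition on the
    prior context and reason about the chunk before judging it."""
    return "2 + 2 = 5" not in chunk

def score_trace(chunks):
    """Return per-chunk verdicts and the index of the first bad chunk (or -1)."""
    verdicts = []
    context = ""
    for chunk in chunks:
        verdicts.append(judge_chunk(context, chunk))
        context += chunk + "\n"   # later chunks are judged with prior steps visible
    first_bad = next((i for i, ok in enumerate(verdicts) if not ok), -1)
    return verdicts, first_bad

trace = ["Let x = 2.", "Then x + x = 2 + 2 = 5.", "So the answer is 5."]
verdicts, first_bad = score_trace(trace)
print(verdicts, first_bad)   # [True, False, True] 1
```

Chunk-level verdicts like these give RL training a localized signal — which step went wrong — instead of a single pass/fail on the whole trace.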
GPT4o August + 100% Structured Outputs for All (GPT4o mini edition)
gpt-4o-mini gpt-4o-2024-08-06 llama-3 bigllama-3.1-1t-instruct meta-llama-3-120b-instruct gemma-2-2b stability-ai unsloth-ai google hugging-face lora controlnet line-art gpu-performance multi-gpu-support fine-tuning prompt-formatting cloud-computing text-to-image-generation model-integration
Stability.ai users are leveraging LoRA and ControlNet for enhanced line art and artistic style transformations, while facing challenges on AMD GPUs due to the discontinuation of ZLUDA; community tensions persist around r/stablediffusion moderation. Unsloth AI users report fine-tuning difficulties with Llama 3 models, especially around PPO trainer integration and prompt formatting, alongside anticipation of multi-GPU support and cost-effective cloud computing on RunPod. Google released the lightweight Gemma 2 2B model, optimized for on-device use with 2.6B parameters, featuring safety and sparse-autoencoder tools, and announced Diffusers integration for efficient text-to-image generation on limited resources.
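LoRA, which the Stability.ai and Unsloth workflows above rely on, freezes the base weight matrix and trains only a low-rank update: W_eff = W + (alpha/r)·B·A. A minimal pure-Python sketch of that computation (toy dimensions, standard zero-init for B; real runs would use a library such as PEFT):

```python
import random

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def lora_effective_weight(W, A, B, alpha):
    """W is the frozen (d_out x d_in) base weight; B (d_out x r) and
    A (r x d_in) are the small trainable adapters.
    Effective weight = W + (alpha / r) * B @ A."""
    r = len(A)
    scale = alpha / r
    delta = matmul(B, A)
    return [[w + scale * d for w, d in zip(wr, dr)]
            for wr, dr in zip(W, delta)]

random.seed(0)
d_out, d_in, r = 4, 6, 2
W = [[random.gauss(0, 0.02) for _ in range(d_in)] for _ in range(d_out)]
# Standard LoRA init: A random, B zero, so training starts exactly at W.
A = [[random.gauss(0, 0.02) for _ in range(d_in)] for _ in range(r)]
B = [[0.0] * r for _ in range(d_out)]
assert lora_effective_weight(W, A, B, alpha=16) == W  # zero B -> no change yet
```

Because only A and B (2·r·d values instead of d_out·d_in) are trained, this is what makes fine-tuning feasible on the limited VRAM setups discussed above.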