Company: "weaviate"

Dec 29, 2025

Meta Superintelligence Labs acquires Manus AI for over $2B, at $100M ARR, 9 months after launch

glm-4.7 minimax-m2.1 vllm manus benchmark meta-ai-fair vllm amd sglang weaviate teknim baseten alphaxiv minimax performance-optimization inference-frameworks model-benchmarking model-deployment open-source-models multimodality api code-generation community-building alex_wang nat_friedman

Manus achieved a rapid growth trajectory in 2025, raising $500M from Benchmark and reaching $100M ARR before being acquired by Meta for over $2B. The vLLM team launched a dedicated community site with new resources, while performance issues with AMD MI300X FP8 were noted in vLLM and sglang benchmarks. Weaviate released operational features including Object TTL, Java v6 client GA, and multimodal document embeddings. Teknium raised API fragmentation concerns and advocated for unified SDK wrappers. In open-weight models, GLM-4.7 gained recognition as a reliable coding model with faster throughput on Baseten, and MiniMax-M2.1 rose as a leading open agentic coding model, topping WebDev leaderboards.

Nov 13, 2025

minor updates to GPT 5.1 and SIMA 2

gpt-5.1 gpt-5.1-codex gpt-5.1-codex-mini sima-2 gemini openai google-deepmind github microsoft cursor_ai perplexity-ai weaviate llamaindex adaptive-reasoning agentic-coding tool-use context-engineering memory-architecture self-improvement retrieval-augmentation database-query-planning chart-parsing robotics sama allisontam_ cline cognition demishassabis omarsar0 helloiamleonie

OpenAI released the GPT-5.1 model family, including 5.1-Codex and 5.1-Codex-Mini, with improved steerability, faster responses, and new tools such as apply_patch and shell command execution. Pricing remains unchanged from GPT-5. GitHub Copilot, VS Code, Cursor, and Perplexity integrated GPT-5.1 models immediately. Google DeepMind announced SIMA 2, a Gemini-powered agent capable of language instruction following, planning, and self-improvement without human feedback, targeting robotics applications. New research on context engineering and agentic tool use patterns was published, with contributions from Weaviate on database query planning and from LlamaIndex on chart parsing. "Adaptive reasoning" and agentic coding improvements were highlighted in GPT-5.1-Instant.

Nov 04, 2025

not much happened today

trillium gemini-2.5-pro gemini-deepthink google huawei epoch-ai deutsche-telekom nvidia anthropic reka-ai weaviate deepmind energy-efficiency datacenters mcp context-engineering instruction-following embedding-models math-reasoning benchmarking code-execution sundarpichai yuchenj_uw teortaxestex epochairesearch scaling01 _avichawla rekaailabs anthropicai douwekiela omarsar0 nityeshaga goodside iscienceluvr lmthang

Google's Project Suncatcher prototypes scalable ML compute systems in orbit using solar energy with Trillium-generation TPUs surviving radiation, aiming for prototype satellites by 2027. China's 50% electricity subsidies for datacenters may offset chip efficiency gaps, with Huawei planning gigawatt-scale SuperPoDs for DeepSeek by 2027. Epoch launched an open data center tracking hub, and Deutsche Telekom and NVIDIA announced a $1.1B Munich facility with 10k GPUs. In agent stacks, MCP (Model Context Protocol) tools gain traction with implementations like LitServe, Claude Desktop, and Reka's MCP server for VS Code. Anthropic emphasizes efficient code execution with MCP. Context engineering shifts focus from prompt writing to model input prioritization, with reports and tools from Weaviate, Anthropic, and practitioners highlighting instruction-following rerankers and embedding approaches. DeepMind's IMO-Bench math reasoning suite shows Gemini DeepThink achieving high scores, with a ProofAutoGrader correlating strongly with human grading. Benchmarks and governance updates include new tasks and eval sharing in lighteval.

Sep 17, 2025

not much happened today

gpt-5 gemini-2.5-deep-think anthropic openai google-deepmind apollo-evaluations github hugging-face weaviate reasoning reinforcement-learning alignment chain-of-thought model-evaluation agent-frameworks ide-integration natural-language-to-sql real-time-voice sama merettm woj_zaremba markchen90 esyudkowsky

Anthropic published an in-depth postmortem on their August-September reliability issues. OpenAI's GPTeam achieved a perfect 12/12 score at the ICPC 2025 World Finals, showcasing rapid progress in general-purpose reasoning, and OpenAI introduced controllable "thinking time" tiers for gpt-5 in ChatGPT. Google DeepMind's gemini-2.5-deep-think reached gold-medal level at ICPC, solving 10/12 problems with advances in parallel thoughts, multi-step reasoning, and novel reinforcement learning techniques. OpenAI and Apollo Evaluations detected "scheming" behaviors in frontier models, emphasized the need for chain-of-thought transparency, and launched a $500K Kaggle challenge. GitHub launched an MCP server registry integrated with VS Code Insiders, with additional support from JetBrains and Hugging Face for open LLMs in Copilot Chat. Weaviate released a native Query Agent that translates natural language into database operations with citations.

Sep 04, 2025

not much happened today

embeddinggemma qwen-2.5-coder minicpm-v-4.5 gpt-4o gemini-2.0-pro google-deepmind hugging-face jina-ai lighton microsoft stanford openai ollama weaviate langchain llamaindex embeddings retrieval-augmented-generation quantization multilingual-models on-device-ai semantic-search contrastive-learning dataset-release vision multimodality video-generation text-to-speech optimizer-benchmarking training-recipes model-compression video-token-compression fine-tuning osanseviero _philschmid tomaarsen ollama weaviate_io lusxvr andimarafioti thibaudfrere _akhaliq clementdelangue gordonwetzstein konstmish wen_kaiyue percyliang

Google DeepMind released EmbeddingGemma (308M), a small multilingual embedding model optimized for on-device retrieval-augmented generation and semantic search, supporting over 100 languages and running efficiently with quantization and EdgeTPU latency under 15ms. Jina AI introduced new code-focused embedding models (0.5B/1.5B) with GGUF quantization, achieving state-of-the-art retrieval across multiple languages and tasks. LightOn demonstrated large-scale retrieval training without distillation using contrastive training on billions of passages. Hugging Face released the FineVision dataset with 17.3M images and 9.5B answer tokens for vision-language model training, showing significant benchmark improvements. The MiniCPM-V 4.5 (8B) multimodal model reported surpassing GPT-4o and Gemini-2.0 Pro on OpenCompass benchmarks with innovative video token compression. Microsoft’s VibeVoice TTS and Stanford’s Mixture-of-Contexts video generation also featured. Additionally, a Stanford study benchmarked optimizers like Muon, Soap, Mars, and Sophia, finding diminishing speedups over AdamW at larger scales but advantages at smaller scales. The new ChatGPT branching feature was noted for its simplicity and popularity. "Everyone's a decacorn now."

Mar 05, 2025

not much happened today

aya-vision-8b aya-vision-32b llama-3-2-90b-vision molmo-72b phi-4-mini phi-4-multimodal cogview4 wan-2-1 weights-and-biases coreweave cohereforai microsoft alibaba google llamaindex weaviate multilinguality vision multimodality image-generation video-generation model-releases benchmarking funding agentic-ai model-performance mervenoyann reach_vb jayalammar sarahookr aidangomez nickfrosst dair_ai akhaliq bobvanluijt jerryjliu0

Weights and Biases announced a $1.7 billion acquisition by CoreWeave ahead of CoreWeave's IPO. CohereForAI released the Aya Vision models (8B and 32B parameters) supporting 23 languages, outperforming larger models like Llama-3.2 90B Vision and Molmo 72B. Microsoft introduced Phi-4-Mini (3.8B parameters) and Phi-4-Multimodal models, excelling in math, coding, and multimodal benchmarks. CogView4, a 6B parameter text-to-image model with 2048x2048 resolution and Apache 2.0 license, was released. Alibaba launched Wan 2.1, an open-source video generation model with 720p output and 16 fps generation. Google announced new AI features for Pixel devices including Scam Detection and Gemini integrations. LlamaCloud reached General Availability and raised $19M Series A funding, serving over 100 Fortune 500 companies. Weaviate launched the Query Agent, the first of three Weaviate Agents.

Nov 01, 2024

The AI Search Wars Have Begun — SearchGPT, Gemini Grounding, and more

gpt-4o o1-preview claude-3.5-sonnet universal-2 openai google gemini nyt perplexity-ai glean nvidia langchain langgraph weights-biases cohere weaviate fine-tuning synthetic-data distillation hallucinations benchmarking speech-to-text robotics neural-networks ai-agents sam-altman alexalbert__ _jasonwei svpino drjimfan virattt

ChatGPT launched its search functionality across all platforms using a fine-tuned version of GPT-4o with synthetic data generation and distillation from o1-preview. This feature includes a Chrome extension promoted by Sam Altman but has issues with hallucinations. The launch coincides with Gemini introducing Search Grounding after delays. Notably, The New York Times is not a partner due to a lawsuit against OpenAI. The AI search competition intensifies with consumer and B2B players like Perplexity and Glean. Additionally, Claude 3.5 Sonnet achieved a new benchmark record on SWE-bench Verified, and a new hallucination evaluation benchmark, SimpleQA, was introduced. Other highlights include the Universal-2 speech-to-text model with 660M parameters and HOVER, a neural whole-body controller for humanoid robots trained in NVIDIA Isaac simulation. AI hedge fund teams using LangChain and LangGraph were also showcased. The news is sponsored by the RAG++ course featuring experts from Weights & Biases, Cohere, and Weaviate.

Sep 25, 2024

Llama 3.2: On-device 1B/3B, and Multimodal 11B/90B (with AI2 Molmo kicker)

llama-3-2 llama-3-1 claude-3-haiku gpt-4o-mini molmo-72b molmo-7b gemma-2 phi-3-5 llama-3-2-vision llama-3-2-3b llama-3-2-20b meta-ai-fair ai2 qualcomm mediatek arm ollama together-ai fireworks-ai weights-biases cohere weaviate multimodality vision context-windows quantization model-release tokenization model-performance model-optimization rag model-training instruction-following mira-murati daniel-han

Meta released Llama 3.2 with new multimodal versions that add 3B and 20B vision adapters on a frozen Llama 3.1, showing competitive performance against Claude Haiku and GPT-4o-mini. AI2 launched multimodal Molmo 72B and 7B models outperforming Llama 3.2 in vision tasks. Meta also introduced new 128k-context 1B and 3B models competing with Gemma 2 and Phi 3.5, with collaborations hinted with Qualcomm, Mediatek, and Arm for on-device AI. The 1B and 3B models were trained on 9 trillion tokens. Partner launches include Ollama, Together AI (offering free 11B model access), and Fireworks AI. Additionally, a new RAG++ course from Weights & Biases, Cohere, and Weaviate offers systematic evaluation and deployment guidance for retrieval-augmented generation systems based on extensive production experience.

Sep 14, 2024

Learnings from o1 AMA

o1-preview o1-mini claude-3.5-sonnet gpt-4o openai weights-biases cohere weaviate reinforcement-learning chain-of-thought reasoning model-performance prompting code-editing rag hybrid-search sama rohanpaul_ai gdb andrew-mayne

OpenAI released the o1 model series, touted as their "most capable and aligned models yet," trained with reinforcement learning to enhance reasoning. The o1-preview model scored 21% on ARC-AGI, ~80% on aider code editing (surpassing Claude 3.5 Sonnet's 77%), and ~52% on Cognition-Golden, showcasing a shift from memorizing answers to memorizing reasoning. The model employs a chain-of-thought approach enabling "System 2 thinking" for better problem-solving. Experts like Andrew Mayne advise treating o1 as a smart friend who provides thoughtful explanations. Additionally, an advanced RAG course sponsored by Weights & Biases, Cohere, and Weaviate offers strategies for hybrid search and prompting to optimize AI solutions.