Company: "teknim"
Meta Superintelligence Labs acquires Manus AI for over $2B, at $100M ARR, 9 months after launch
glm-4.7 minimax-m2.1 vllm manus benchmark meta-ai-fair amd sglang weaviate teknim baseten alphaxiv minimax performance-optimization inference-frameworks model-benchmarking model-deployment open-source-models multimodality api code-generation community-building alex_wang nat_friedman
Manus achieved a rapid growth trajectory in 2025, raising $500M from Benchmark and reaching $100M ARR before being acquired by Meta for an estimated $4B. The vLLM team launched a dedicated community site with new resources, while FP8 performance issues on AMD MI300X were noted in vLLM and SGLang benchmarks. Weaviate released operational features including Object TTL, general availability of the Java v6 client, and multimodal document embeddings. Teknium raised concerns about API fragmentation and advocated for unified SDK wrappers. Among open-weight models, GLM-4.7 gained recognition as a reliable coding model with faster throughput on Baseten, and MiniMax-M2.1 emerged as a leading open agentic coding model, topping WebDev leaderboards.
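To make the unified-SDK-wrapper idea concrete, here is a minimal sketch assuming each provider exposes an OpenAI-compatible chat-completions endpoint; the `UnifiedClient` class, the provider base URLs, the environment-variable names, and the model strings are hypothetical placeholders rather than any vendor's actual SDK.

```python
# Minimal sketch of a unified SDK wrapper, assuming every provider exposes an
# OpenAI-compatible /chat/completions endpoint. The UnifiedClient class, the
# provider registry, and the model ids below are illustrative placeholders.
import os
from openai import OpenAI

# Hypothetical provider registry: name -> (OpenAI-compatible base URL, API key env var).
PROVIDERS = {
    "openai": ("https://api.openai.com/v1", "OPENAI_API_KEY"),
    "baseten": ("https://inference.baseten.co/v1", "BASETEN_API_KEY"),
    "local-vllm": ("http://localhost:8000/v1", "VLLM_API_KEY"),
}


class UnifiedClient:
    """Route 'provider/model' strings to the matching OpenAI-compatible client."""

    def __init__(self) -> None:
        self._clients = {
            name: OpenAI(base_url=url, api_key=os.environ.get(key_env, "EMPTY"))
            for name, (url, key_env) in PROVIDERS.items()
        }

    def chat(self, model: str, prompt: str) -> str:
        # Split only on the first slash so model ids like "org/model" survive.
        provider, model_name = model.split("/", 1)
        resp = self._clients[provider].chat.completions.create(
            model=model_name,
            messages=[{"role": "user", "content": prompt}],
        )
        return resp.choices[0].message.content


if __name__ == "__main__":
    client = UnifiedClient()
    print(client.chat("local-vllm/zai-org/GLM-4.7", "Write a haiku about GPUs."))
```

The design choice here is simply to lean on the de facto OpenAI-compatible surface that vLLM, SGLang, and most hosted providers already expose, so only base URLs and keys differ per backend.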
12/25/2023: Nous Hermes 2 Yi 34B for Christmas
nous-hermes-2 yi-34b nucleusx yayi-2 ferret teknim nous-research apple mixtral deepseek qwen huggingface wenge-technology quantization model-optimization throughput-metrics batch-processing parallel-decoding tensor-parallelization multimodality language-model-pretraining model-benchmarking teknium carsonpoole casper_ai pradeep1148 osanseviero metaldragon01
Teknium released Nous Hermes 2 on Yi 34B, positioning it as a top open model compared with Mixtral, DeepSeek, and Qwen. Apple introduced Ferret, a new open-source multimodal LLM. Discussions in the Nous Research AI Discord focused on model optimization and quantization techniques such as AWQ, GPTQ, and AutoAWQ, with insights on proprietary optimizations and throughput metrics. Additional highlights include the NucleusX model, a 30B model scoring 80 on MMLU, being added to transformers, and Wenge Technology's YAYI 2 language model, trained on 2.65 trillion tokens. It was noted that "AutoAWQ outperforms vLLM up to batch size 8", and proprietary parallel decoding and tensor parallelization across GPUs were discussed as routes to faster inference.
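The "AutoAWQ outperforms vLLM up to batch size 8" claim concerns generated-token throughput at small batch sizes. A rough sketch of that kind of measurement, assuming an AWQ checkpoint loadable through Hugging Face transformers (the model id below is an illustrative example, not a verified benchmark target), could look like this:

```python
# Rough sketch of a batched-throughput measurement, assuming an AWQ-quantized
# checkpoint that transformers can load. Model id and batch sizes are illustrative.
import time

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "TheBloke/Nous-Hermes-2-Yi-34B-AWQ"  # hypothetical example checkpoint
MAX_NEW_TOKENS = 128

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
tokenizer.pad_token = tokenizer.pad_token or tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID, torch_dtype=torch.float16, device_map="auto"
)


def tokens_per_second(batch_size: int) -> float:
    """Generate MAX_NEW_TOKENS for a batch of identical prompts and report tokens/s."""
    prompts = ["Explain AWQ quantization in one paragraph."] * batch_size
    inputs = tokenizer(prompts, return_tensors="pt", padding=True).to(model.device)
    start = time.perf_counter()
    model.generate(**inputs, max_new_tokens=MAX_NEW_TOKENS, do_sample=False)
    elapsed = time.perf_counter() - start
    return batch_size * MAX_NEW_TOKENS / elapsed


for bs in (1, 2, 4, 8):
    print(f"batch={bs}: ~{tokens_per_second(bs):.1f} generated tokens/s")
```

A serving engine like vLLM amortizes overhead through continuous batching, so comparisons of this sort are only meaningful when the batch size, prompt length, and output length are held fixed across backends.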
12/20/2023: Project Obsidian - Multimodal Mistral 7B from Nous
gpt-4 gpt-3.5 dall-e-3 nous-research teknim openai multimodality image-detection security-api bias facial-recognition healthcare-ai gpu-optimization prompt-engineering vision
Project Obsidian is a multimodal model being trained publicly, tracked by Teknium on the Nous Discord. Other discussions covered 4M (Massively Multimodal Masked Modeling) and Reason.dev, a TypeScript framework for LLM applications. The OpenAI Discord community discussed hardware specs for running TensorFlow.js image detection, security API ideas for filtering inappropriate images, and concerns about racial and cultural bias in AI, especially in facial recognition and healthcare. Challenges with GPT-3.5 and GPT-4 in word puzzle games were noted, along with GPU recommendations prioritizing VRAM for AI inference. Users also debated GPT-4's vision capabilities, limitations of DALL·E 3, platform access issues, and prompting strategies for better outputs.
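One way to prototype the image-filtering "security API" idea is to call a hosted moderation endpoint before accepting an upload. The sketch below assumes access to OpenAI's moderation endpoint (the omni-moderation models accept image inputs); the `is_image_allowed` helper and the pass/block logic are illustrative, not a production moderation pipeline.

```python
# Minimal sketch of an image-filtering check built on OpenAI's moderation
# endpoint. The helper name, placeholder URL, and allow/block policy are
# illustrative assumptions, not a hardened security API.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment


def is_image_allowed(image_url: str) -> bool:
    """Return False if the moderation model flags the image as inappropriate."""
    response = client.moderations.create(
        model="omni-moderation-latest",
        input=[{"type": "image_url", "image_url": {"url": image_url}}],
    )
    return not response.results[0].flagged


if __name__ == "__main__":
    url = "https://example.com/uploaded_image.png"  # placeholder URL
    print("allowed" if is_image_allowed(url) else "blocked")
```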