Topic: "semantic-search"
Moondream 2025.1.9: Structured Text, Enhanced OCR, Gaze Detection in a 2B Model
o1 vdr-2b-multi-v1 llava-mini openai llamaindex langchainai qdrant genmoai vision model-efficiency structured-output gaze-detection reasoning model-distillation multimodality embedding-models gan diffusion-models self-attention training-optimizations development-frameworks api cross-language-deployment semantic-search agentic-document-processing developer-experience philschmid saranormous jxmnop reach_vb iscienceluvr multimodalart arohan adcock_brett awnihannun russelljkaplan ajayj_
Moondream has released version 2025.1.9, a 2B-parameter vision model update that improves VRAM efficiency and adds structured output and gaze detection, a notable step in vision model practicality. Discussions on Twitter highlighted advances in reasoning models such as OpenAI's o1, model distillation techniques, and new multimodal embedding models such as vdr-2b-multi-v1 and LLaVA-Mini that significantly cut computational costs. Research on GANs and decentralized diffusion models showed improved stability and performance. Development tools such as MLX and vLLM received updates for better portability and developer experience, while frameworks like LangChain and Qdrant enable intelligent data workflows. Company updates include new roles and team expansions at GenmoAI. "Efficiency tricks are all you need."
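The embedding models mentioned above power semantic search by mapping text to vectors and ranking documents by similarity to a query vector. A minimal sketch of that retrieval step, using toy 3-dimensional vectors in place of a real model's embeddings (the document names and dimensions here are illustrative, not from any of the cited models):

```python
import math

def cosine(a, b):
    # Cosine similarity: dot product divided by the product of magnitudes.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def semantic_search(query_vec, doc_vecs, top_k=2):
    # Rank documents by cosine similarity to the query embedding.
    scored = [(cosine(query_vec, v), doc_id) for doc_id, v in doc_vecs.items()]
    return [doc_id for _, doc_id in sorted(scored, reverse=True)[:top_k]]

# Toy embeddings standing in for a real embedding model's output.
docs = {
    "gaze-detection": [0.9, 0.1, 0.0],
    "ocr":            [0.1, 0.9, 0.1],
    "pricing":        [0.0, 0.1, 0.9],
}
query = [0.8, 0.2, 0.0]  # semantically closest to "gaze-detection"
print(semantic_search(query, docs, top_k=1))  # → ['gaze-detection']
```

A production system would replace the toy dictionary with a vector database such as Qdrant and the hand-written vectors with model-generated embeddings, but the ranking logic is the same.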
Qdrant's BM42: "Please don't trust us"
claude-3.5-sonnet gemma-2 nano-llava-1.5 qdrant cohere stripe anthropic hugging-face stablequan_ai semantic-search benchmarking dataset-quality model-evaluation model-optimization vision fine-tuning context-windows nils-reimers jeremyphoward hamelhusain rohanpaul_ai
Qdrant attempted to replace BM25 and SPLADE with a new method called "BM42", which combines transformer attention with collection-wide statistics for hybrid semantic and keyword search, but its evaluation on the Quora dataset was flawed. Nils Reimers of Cohere reran BM42 on more suitable datasets and found it underperformed. Qdrant acknowledged the errors, though its comparison still used a suboptimal BM25 implementation. The episode underscores the importance of dataset choice and evaluation sanity checks when making search-model claims. Separately, Stripe faced criticism over AI/ML model failures causing account and payment issues, prompting calls for alternatives. Anthropic revealed that Claude 3.5 Sonnet suppresses parts of its answers with backend tags, sparking debate. Gemma 2 optimizations allow 2x faster fine-tuning with 63% less memory and longer context windows, running models up to 34B parameters on consumer GPUs. nanoLLaVA-1.5 was announced as a compact 1B-parameter vision model with significant improvements.
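As described in Qdrant's announcement, BM42's core idea is to keep BM25's collection-wide IDF component but replace the term-frequency saturation term with per-token attention weights taken from a transformer's [CLS] token. A minimal sketch of that scoring rule, with hand-picked attention weights standing in for real transformer output (the corpus and weights here are illustrative assumptions, not Qdrant's implementation):

```python
import math

def idf(term, corpus):
    # BM25-style inverse document frequency over the whole collection.
    n = sum(1 for doc in corpus if term in doc)
    return math.log(1 + (len(corpus) - n + 0.5) / (n + 0.5))

def bm42_score(query_terms, doc_attention, corpus):
    # BM42 (per Qdrant's description): sum over query terms of
    # IDF(term) * attention weight of that term in the document,
    # where attention would come from a transformer's [CLS] token.
    return sum(idf(t, corpus) * doc_attention.get(t, 0.0) for t in query_terms)

corpus = [
    {"semantic", "search", "vectors"},
    {"keyword", "search", "bm25"},
    {"gaze", "detection", "model"},
]
# Hypothetical attention weights for one document's tokens.
attention = {"semantic": 0.6, "search": 0.3, "vectors": 0.1}
score = bm42_score(["semantic", "search"], attention, corpus)
```

Because IDF is computed over the whole collection, the evaluation dataset directly shapes the scores — which is why the choice of Quora (short, duplicate-heavy questions) mattered so much to the disputed results.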