  Model: "qwen-2.5-coder"
 not much happened today 
   embeddinggemma  qwen-2.5-coder  minicpm-v-4.5  gpt-4o  gemini-2.0-pro   google-deepmind  hugging-face  jina-ai  lighton  microsoft  stanford  openai  ollama  weaviate  langchain  llamaindex   embeddings  retrieval-augmented-generation  quantization  multilingual-models  on-device-ai  semantic-search  contrastive-learning  dataset-release  vision  multimodality  video-generation  text-to-speech  optimizer-benchmarking  training-recipes  model-compression  video-token-compression  fine-tuning   osanseviero  _philschmid  tomaarsen  ollama  weaviate_io  lusxvr  andimarafioti  thibaudfrere  _akhaliq  clementdelangue  gordonwetzstein  konstmish  wen_kaiyue  percyliang  
 Google DeepMind released EmbeddingGemma (308M), a small multilingual embedding model optimized for on-device retrieval-augmented generation and semantic search. It supports over 100 languages, runs efficiently when quantized, and has a reported EdgeTPU latency under 15 ms. Jina AI introduced new code-focused embedding models (0.5B/1.5B) with GGUF quantization, reporting state-of-the-art retrieval across multiple languages and tasks. LightOn demonstrated large-scale retrieval training without distillation, using contrastive training on billions of passages. Hugging Face released the FineVision dataset, with 17.3M images and 9.5B answer tokens for vision-language model training, and showed significant benchmark improvements. The MiniCPM-V 4.5 (8B) multimodal model reportedly surpassed GPT-4o and Gemini-2.0 Pro on OpenCompass benchmarks, using a new video token compression scheme. Microsoft's VibeVoice TTS and Stanford's Mixture-of-Contexts video generation were also featured. Additionally, a Stanford study benchmarked optimizers such as Muon, SOAP, MARS, and Sophia, finding that their speedups over AdamW diminish at larger scales but hold at smaller scales. The new ChatGPT branching feature was noted for its simplicity and popularity. "Everyone's a decacorn now."
  Common Corpus: 2T Open Tokens with Provenance 
   qwen-2.5-coder  claude-3.5-sonnet  janusflow-1.3b  ocronos-vintage   pleais  huggingface  langchainai  deepseek  alibaba  anthropic   provenance  ocr  multilingual-datasets  prompt-engineering  multimodality  image-generation  code-generation  quantization  model-scaling  inference-efficiency   tim-dettmers  tom-doerr  omarsar0  swyx  madiator  reach_vb  
 Pleias, via Hugging Face, released Common Corpus, the largest fully open multilingual dataset, with over 2 trillion tokens and detailed provenance information. They also introduced OCRonos-Vintage, a 124M-parameter OCR correction model that efficiently fixes digitization errors on both CPU and GPU, unlocking knowledge from PDFs. On AI tools, LangChainAI launched Prompt Canvas for collaborative prompt engineering, while DeepSeek released JanusFlow 1.3B, a unified multimodal LLM that integrates autoregressive and rectified flow models for image understanding and generation. Alibaba Cloud announced Qwen2.5-Coder, a code-focused LLM with advanced coding capabilities, and Claude 3.5 Sonnet was highlighted for superior code generation. Tim Dettmers and others discussed quantization challenges and the "Scaling Laws for Precision" paper, emphasizing the impact of low-precision training on model scalability and inference efficiency. Alternative efficiency methods were also noted.