claude-3-7-sonnet gpt-4-1 gemini-3 qwen3-vl-embedding qwen3-vl-reranker glm-4-7 falcon-h1r-7b jamba2 stanford google google-deepmind alibaba z-ai tii ai21-labs huggingface copyright-extraction multimodality multilinguality retrieval-augmented-generation model-architecture mixture-of-experts model-quantization reasoning inference kernel-engineering memory-optimization enterprise-ai sundarpichai justinlin610
A Stanford paper reports that Claude 3.7 Sonnet memorized 95.8% of the first Harry Potter book, highlighting copyright-extraction risks in a comparison with GPT-4.1. Google AI Studio sponsors TailwindCSS amid debates over open-source funding. Google and Sundar Pichai announce Gemini 3 features in Gmail, including AI Overviews and natural-language search, with user controls. Alibaba Qwen releases Qwen3-VL-Embedding and Qwen3-VL-Reranker, a multimodal, multilingual retrieval stack that supports text, images, and video, offers quantization and instruction customization, and posts strong benchmark results. Z.ai goes public on HKEX as GLM-4.7 leads the Artificial Analysis Intelligence Index v4.0, showing gains in reasoning, coding, and agentic use, with a large-scale MoE architecture and an MIT license. Falcon-H1R-7B from TII targets efficient reasoning in smaller models and scores 16 on the Intelligence Index. AI21 Labs introduces Jamba2, a memory-efficient enterprise model with a hybrid SSM-Transformer architecture and an Apache 2.0 license, available via SaaS and Hugging Face. vLLM shows throughput improvements in inference through kernel engineering. "Embeddings should be multimodal by default," notes Justin Lin.