Model: "gemma-2-9b"
Gemma 2 2B + Scope + Shield
gemma-2b gemma-2-9b gemma-2-27b llama-3-1-405b sam-2 gpt-3.5 vicuna alpacaeval g-eval google-deepmind anthropic meta-ai-fair openai perplexity-ai nvidia lmsys knowledge-distillation leaderboards model-interpretability finetuning harm-detection video-segmentation voice publishers-program robotics-data-scaling quantization llm-evaluation prompt-engineering
Gemma 2 2B, a 2-billion-parameter model trained on 2 trillion tokens and distilled from a larger, unnamed LLM, has been released by Google DeepMind and shows strong leaderboard performance despite weaknesses in math. The Gemma 2 series, which also includes 9B and 27B models, has gained popularity since its June release. The team also released 400 sparse autoencoders (SAEs) for interpretability, inspired by Anthropic's research, along with ShieldGemma, a finetuned classifier that outperforms Meta's LlamaGuard in harm detection. Meanwhile, Meta AI announced that Llama-3.1-405B reached #3 on the Overall Arena leaderboard and released SAM 2, a video and image segmentation model with significant speed improvements. OpenAI is rolling out advanced Voice Mode to Plus users. Perplexity AI launched a Publishers Program with major media partners and a status page. NVIDIA introduced Project GR00T for scaling robot data using Apple Vision Pro and generative simulation. Interest in quantization for compressing LLMs is growing, and LLM-as-a-Judge implementations from Vicuna, AlpacaEval, and G-Eval highlight the effectiveness of simple prompts and domain-specific evaluation.
Gemma 2 tops /r/LocalLlama vibe check
gemma-2-9b gemma-2-27b llama-3 mistral-7b phi-3 qwen gemma llamaindex mistral-ai cohere deepseek-ai nous-research eureka-labs model-comparison local-llms multilinguality model-efficiency fine-tuning ai-education ai-teaching-assistants andrej-karpathy
Gemma 2 (9B, 27B) is highlighted as a top-performing local LLM, praised for its speed, multilingual capabilities, and efficiency on consumer GPUs like the 2080 Ti. It outperforms models like Llama 3 and Mistral 7B in various tasks, including non-English text processing and reasoning. The community discussion on /r/LocalLlama reflects a strong preference for Gemma 2, with 18 mentions, compared to 10 for Llama 3 and 9 for Mistral. Other models like Phi 3 and Qwen also received mentions but are considered surpassed by Gemma 2. Additionally, Andrej Karpathy announced the launch of Eureka Labs, an AI+Education startup aiming to create an AI-native school with AI Teaching Assistants, starting with the LLM101n course on the fundamentals of training LLMs. This initiative is seen as a significant development in AI education.
RouteLLM: RIP Martian? (Plus: AINews Structured Summaries update)
gpt-4 gemma-2-27b gemma-2-9b lmsys openai llm-routing cost-efficiency model-performance model-optimization data-augmentation syntax-based-routing mixture-of-experts inference-throughput software-2.0 computer-vision karpathy bindureddy armand-joulin
LMSys introduces RouteLLM, an open-source router framework trained on preference data from Chatbot Arena, achieving cost reductions of over 85% on MT Bench, 45% on MMLU, and 35% on GSM8K while maintaining 95% of GPT-4's performance. This approach surpasses previous task-specific routing by using syntax-based Mixture of Experts (MoE) routing and data augmentation, beating commercial solutions by 40%. The update highlights advances in LLM routing, cost-efficiency, and model performance optimization across multiple models, rather than single-model or MoE-level improvements. Additionally, the AI Twitter recap notes the Gemma 2 model family as a top open model, the Block Transformer architecture for improved inference throughput, and a proposal by Karpathy for a fully Software 2.0 computer vision system.
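The routing idea behind RouteLLM can be sketched as follows: a router scores each incoming query and sends easy ones to a cheap model while reserving the strong (expensive) model for hard ones. The `difficulty_score` heuristic, the threshold, and the model-tier names below are illustrative stand-ins; RouteLLM's actual router is a learned model trained on Chatbot Arena preference data, not a hand-written rule.

```python
def difficulty_score(query: str) -> float:
    """Crude proxy for query difficulty; a real router uses a trained model."""
    hard_markers = ("prove", "derive", "step by step", "optimize")
    score = min(len(query) / 500, 0.5)  # longer queries tend to be harder
    score += 0.5 * any(m in query.lower() for m in hard_markers)
    return score

def route(query: str, threshold: float = 0.5) -> str:
    """Pick a model tier; the threshold trades off cost against quality."""
    return "strong-model" if difficulty_score(query) >= threshold else "cheap-model"

print(route("What is the capital of France?"))                 # → cheap-model
print(route("Prove that sqrt(2) is irrational, step by step."))  # → strong-model
```

The cost savings come entirely from the fraction of traffic diverted to the cheap tier; tuning the threshold against a quality benchmark is how a deployment would target something like "95% of GPT-4 performance".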