All tags
Model: "gpt-4-turbo-0409"
Gemini Nano: 50-90% of Gemini Pro, <100ms inference, on device, in Chrome Canary
gemini-nano gemini-pro claude-3.5-sonnet gpt-4o deepseek-coder-v2 glm-0520 nemotron-4-340b gpt-4-turbo-0409 google gemini huggingface anthropic deepseek zhipu-ai tsinghua nvidia model-quantization prompt-api optimization model-weights benchmarking code-generation math synthetic-data automatic-differentiation retrieval-augmented-generation mitigating-memorization tree-search inference-time-algorithms adcock_brett dair_ai lmsysorg
The latest Chrome Canary now includes a feature flag for Gemini Nano, offering a prompt API and on-device optimization guide, with models Nano 1 and 2 at 1.8B and 3.25B parameters respectively, showing decent performance relative to Gemini Pro. The base and instruct-tuned model weights have been extracted and posted to HuggingFace. In AI model releases, Anthropic launched Claude 3.5 Sonnet, which outperforms GPT-4o on some benchmarks, is twice as fast as Opus, and is free to try. DeepSeek-Coder-V2 achieves 90.2% on HumanEval and 75.7% on MATH, surpassing GPT-4-Turbo-0409, with models up to 236B parameters and 128K context length. GLM-0520 from Zhipu AI/Tsinghua ranks highly in coding and overall benchmarks. NVIDIA announced Nemotron-4 340B, an open model family for synthetic data generation. Research highlights include TextGrad, a framework for automatic differentiation on textual feedback; PlanRAG, an iterative plan-then-RAG decision-making technique; a paper on goldfish loss to mitigate memorization in LLMs; and a tree search algorithm for language model agents.