Model: "gemini-ultra"
Google I/O in 60 seconds
gemini-1.5-pro gemini-flash gemini-ultra gemini-pro gemini-nano gemma-2 llama-3-70b paligemma imagen-3 veo google google-deepmind youtube tokenization model-performance fine-tuning vision multimodality model-release model-training model-optimization ai-integration image-generation watermarking hardware-optimization voice video-understanding
Google announced updates to the Gemini model family, including Gemini 1.5 Pro with a 2-million-token context window and the new Gemini Flash model, optimized for speed with a 1-million-token capacity. The Gemini suite now spans Ultra, Pro, Flash, and Nano, with Gemini Nano integrated into Chrome 126. Additional Gemini features include Gemini Gems (customizable assistants akin to OpenAI's GPTs), Gemini Live for voice conversations, and Project Astra, a live video-understanding assistant. The Gemma model family was updated with Gemma 2 at 27B parameters, offering near-Llama-3-70B performance at well under half the size, plus PaliGemma, an open vision-language model inspired by PaLI-3. Other launches include DeepMind's Veo video model, Imagen 3 for photorealistic image generation, and a Music AI Sandbox collaboration with YouTube. SynthID watermarking now extends to text, images, audio, and video, and Trillium was revealed as the codename for TPU v6. Google also integrated AI across its product suite, including Workspace, Gmail, Docs, Sheets, Photos, Search, and Lens. "The world awaits Apple's answer."
Gemini Ultra is out, to mixed reviews
gemini-ultra gemini-advanced solar-10.7b openhermes-2.5-mistral-7b subformer billm google openai mistral-ai hugging-face multi-gpu-support training-data-contamination model-merging model-alignment listwise-preference-optimization high-performance-computing parameter-sharing post-training-quantization dataset-viewer gpu-scheduling fine-tuning vram-optimization
Google released Gemini Ultra as the paid "Gemini Advanced with Ultra 1.0" tier, following the rebranding of Bard as Gemini. Reviews called it "slightly faster/better than ChatGPT" but noted reasoning gaps. The Steam Deck was highlighted as a surprisingly capable AI workstation, able to run models like Solar 10.7B. Discussions in AI communities covered multi-GPU support for the open-source Unsloth library, training-data contamination from OpenAI outputs, ethical concerns over model merging, and new alignment techniques like Listwise Preference Optimization (LiPO). The Mojo programming language was praised for high-performance computing. In research, the Subformer model uses sandwich-style parameter sharing and SAFE embeddings for efficiency, and BiLLM introduced 1-bit post-training quantization to cut resource use. An OpenHermes dataset viewer tool was launched, and GPU scheduling with Slurm was discussed, along with fine-tuning challenges for models like OpenHermes-2.5-Mistral-7B and their VRAM requirements.
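As a rough illustration of what 1-bit post-training quantization means: the classic scheme replaces each weight row with its signs plus a single per-row scale (the mean absolute value, which minimizes the reconstruction error for sign binarization). This is only the baseline idea that methods like BiLLM build on; BiLLM's actual algorithm adds salient-weight splitting and residual binarization, and the function names here are illustrative.

```python
import numpy as np

def binarize_rows(W):
    """Sign-binarize a weight matrix with one scale per row.

    alpha = mean(|w|) per row is the optimal scale for B = sign(W)
    under squared reconstruction error. Baseline 1-bit scheme only;
    BiLLM layers extra machinery on top of this idea.
    """
    alpha = np.abs(W).mean(axis=1, keepdims=True)  # per-row scale
    B = np.sign(W)
    B[B == 0] = 1.0  # map exact zeros to +1 so B is strictly ±1
    return alpha, B

def dequantize(alpha, B):
    """Reconstruct an approximate weight matrix from scale and signs."""
    return alpha * B
```

Storing `B` as one bit per weight plus one float per row is where the memory savings come from, at the cost of reconstruction error that post-training methods then try to minimize.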
Adept Fuyu-Heavy: Multimodal model for Agents
fuyu-heavy fuyu-8b gemini-pro claude-2 gpt4v gemini-ultra deepseek-coder-33b yi-34b-200k goliath-120b mistral-7b-instruct-v0.2 mamba rwkv adept hugging-face deepseek mistral-ai nous-research multimodality visual-question-answering direct-preference-optimization benchmarking model-size-estimation quantization model-merging fine-tuning instruct-tuning rms-optimization heterogeneous-ai-architectures recurrent-llms contrastive-preference-optimization
Adept launched Fuyu-Heavy, a multimodal model focused on UI understanding and visual QA that outperforms Gemini Pro on the MMMU benchmark. The model was tuned with Direct Preference Optimization (DPO), which is gaining attention as a leading preference-tuning method. Fuyu-Heavy's size is undisclosed but estimated at 20B-170B parameters, smaller than rumored frontier models such as Claude 2, GPT4V, and Gemini Ultra. Meanwhile, the Mamba paper was rejected at ICLR, with reviewers citing evaluation concerns. In Discord discussions, DeepSeek Coder 33B was claimed to outperform GPT-4 on coding tasks, and deployment strategies for large models like Yi-34B-200K and Goliath-120B were explored. Quantization debates surfaced mixed views on Q8 and EXL2 quants. Fine-tuning and instruct-tuning of Mistral 7B Instruct v0.2 were discussed, alongside insights on RMS optimization and heterogeneous AI architectures combining Transformers with Selective SSMs (Mamba). The potential of recurrent LLMs like RWKV and techniques like Contrastive Preference Optimization (CPO) were also noted.
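For context on why DPO is considered such a simple tuning method: it reduces preference learning to a log-sigmoid loss over the policy's log-probability margins relative to a frozen reference model, with no reward model or RL loop. A minimal per-pair sketch (the function name and the beta value are illustrative, not Adept's training code):

```python
import math

def dpo_loss(logp_chosen, logp_rejected,
             ref_logp_chosen, ref_logp_rejected, beta=0.1):
    """Standard DPO loss for one preference pair:
    -log sigmoid(beta * [(logp_w - ref_w) - (logp_l - ref_l)]),
    where logp_* are sequence log-probs under the policy and
    ref_logp_* are the same sequences under the frozen reference.
    """
    margin = beta * ((logp_chosen - ref_logp_chosen)
                     - (logp_rejected - ref_logp_rejected))
    return -math.log(1.0 / (1.0 + math.exp(-margin)))
```

The loss shrinks as the policy raises the chosen completion's likelihood relative to the rejected one, measured against the reference, so ordinary gradient descent on this objective performs the alignment step.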
1/2/2024: Smol tweaks to Smol Talk
claude-2 bard copilot meta-ai gemini-ultra chatgpt openai meta-ai-fair perplexity-ai prompt-engineering api json yaml markdown chatbot image-generation vpn browser-compatibility personality-tuning plugin-issues
OpenAI Discord discussions feature a detailed comparison of AI search engines, including Perplexity, Copilot, Bard, and Claude 2, with Bard and Claude 2 trailing behind. Meta's new Meta AI chatbot is introduced, available on Instagram and WhatsApp and featuring image generation likened to a free version of GPT. Users report browser issues with ChatGPT, including persistent CAPTCHAs when using VPNs and plugin malfunctions. Debates cover prompt engineering, API usage, and data formats such as JSON, YAML, and Markdown, and discussions also touch on ChatGPT's personality tuning and variations in model capability. "Meta AI includes an image generation feature, which he likened to a free version of GPT."
12/7/2023: Anthropic says "skill issue"
claude-2.1 gpt-4 gpt-3.5 gemini-pro gemini-ultra gpt-4.5 chatgpt bingchat dall-e gpt-5 anthropic openai google prompt-engineering model-performance regulation language-model-performance image-generation audio-processing midi-sequence-analysis subscription-issues network-errors
Anthropic addressed Claude 2.1's poor results on the needle-in-a-haystack test by adding a single sentence to the prompt, effectively framing the failure as a prompting "skill issue". Discussions on OpenAI's Discord compared Google's Gemini Pro and Gemini Ultra with OpenAI's GPT-4 and GPT-3.5, with some users finding GPT-4 superior in benchmarks. Rumors of a GPT-4.5 release circulated without official confirmation. Concerns were raised about "selective censorship" degrading language-model performance, and the EU's potential regulation of AI, including ChatGPT, was highlighted. Users reported issues with ChatGPT Plus message limits and subscription upgrades and shared experiences with BingChat and DALL-E. The community discussed prompt-engineering techniques and future applications like image generation and MIDI sequence analysis, expressing hopes for GPT-5.
Is Google's Gemini... legit?
gemini gemini-pro gemini-ultra gpt-4 gpt-3.5 claude-2.1 palm2 google openai chain-of-thought context-windows prompt-engineering model-evaluation multimodality speech-processing chatbot-errors subscription-management swyx
Google's Gemini AI model is drawing significant discussion and skepticism, especially around its chain-of-thought MMLU claim (reported as CoT@32, i.e. 32 sampled reasoning chains) and its 32k context window. The community is comparing Gemini's performance and capabilities with OpenAI's GPT-4 and GPT-3.5, ahead of the Gemini Pro and Gemini Ultra rollouts on the Bard platform. Users report various OpenAI service issues, including chatbot errors and subscription problems. Discussions also cover prompt-engineering techniques, AI model evaluation comparing GPT-4, Claude 2.1, and PaLM 2, and improvements in speech and multimodal capabilities. The bot now supports reading and summarizing links from platforms like arXiv, Twitter, and YouTube, enhancing user interaction.