Model: "command-r"
Music's DALL-E moment
griffin command-r-plus gpt-4-0613 gpt-4-0314 mistral-8x22b codegemma stable-diffusion-1.5 command-r gemini-1.5 google mistral-ai lmsys cohere model-architecture benchmarking open-source model-quantization memory-optimization inference-speed multimodality finetuning performance-optimization audio-processing andrej-karpathy
Google's Griffin architecture outperforms transformers with faster inference and lower memory usage on long contexts. Command R+ climbs to 6th place on the LMSYS Chatbot Arena leaderboard, surpassing GPT-4-0613 and GPT-4-0314. Mistral AI releases an open-source 8x22B model with a 64K context window and around 130B total parameters. Google open-sources CodeGemma models with pre-quantized 4-bit versions for faster downloads. ELLA weights enhance Stable Diffusion 1.5 with an LLM for better semantic alignment. Unsloth enables 4x larger context windows and an 80% memory reduction for finetuning. Andrej Karpathy releases llm.c, LLM training implemented in pure C, for potential performance gains. Command R+ runs in real time on an M2 Max MacBook using iMat q1 quantization. Cohere's Command R model offers low API costs and strong leaderboard performance. Gemini 1.5 impresses with audio capabilities, recognizing speech tone and identifying speakers from audio clips.
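Pre-quantized 4-bit checkpoints like CodeGemma's are typically consumed through the Hugging Face transformers/bitsandbytes integration. A minimal sketch, assuming a CUDA GPU and using the `google/codegemma-7b` checkpoint id as an illustrative example:

```python
# Minimal sketch: loading a model in 4-bit with bitsandbytes via transformers.
# Assumes `pip install transformers accelerate bitsandbytes` and a CUDA GPU;
# the checkpoint id is illustrative (gated models require accepting the license).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

quant_config = BitsAndBytesConfig(
    load_in_4bit=True,                      # store weights in 4-bit
    bnb_4bit_quant_type="nf4",              # NormalFloat4 quantization
    bnb_4bit_compute_dtype=torch.bfloat16,  # dequantize to bf16 for matmuls
)

tokenizer = AutoTokenizer.from_pretrained("google/codegemma-7b")
model = AutoModelForCausalLM.from_pretrained(
    "google/codegemma-7b",
    quantization_config=quant_config,
    device_map="auto",
)

inputs = tokenizer("def fibonacci(n):", return_tensors="pt").to(model.device)
print(tokenizer.decode(model.generate(**inputs, max_new_tokens=64)[0]))
```

The same BitsAndBytesConfig pattern also quantizes full-precision checkpoints on the fly; tools like Unsloth combine this kind of 4-bit (QLoRA-style) loading with custom kernels to cut finetuning memory.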
Not much happened today
jamba-v0.1 command-r gpt-3.5-turbo openchat-3.5-0106 mixtral-8x7b mistral-7b midnight-miqu-70b-v1.0.q5_k_s cohere lightblue openai mistral-ai nvidia amd hugging-face ollama rag mixture-of-experts model-architecture model-analysis debate-persuasion hardware-performance gpu-inference cpu-comparison local-llm stable-diffusion ai-art-bias
RAGFlow, a deep-document-understanding RAG engine with a 16.3k context length and natural-language instruction support, was open-sourced. Jamba v0.1, a 52B-parameter MoE model from Lightblue, was released to mixed user feedback. Cohere's Command-R is now available in the Ollama library. Analysis of the GPT-3.5-Turbo architecture suggests roughly 7 billion parameters and an embedding size of 4096, comparable to OpenChat-3.5-0106 and Mixtral-8x7B. AI chatbots, including GPT-4, outperform humans at persuasion in debates. Mistral-7B made amusing mistakes on a math riddle. Hardware highlights include a discounted HGX H100 640GB machine with 8 H100 GPUs bought for $58k, and CPU comparisons between the Epyc 9374F and Threadripper 1950X for LLM inference. GPU recommendations for local LLMs focus on VRAM and inference speed, with users testing a 4090 GPU and the Midnight-miqu-70b-v1.0.q5_k_s model. Stable Diffusion influences gaming habits, and an AI-art evaluation shows bias favoring art labeled as human-made.
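As a sanity check on the ~7B estimate above: for a vanilla GPT-style decoder, non-embedding parameters scale as roughly 12 · n_layers · d_model². A back-of-envelope sketch, where the layer count and vocabulary size are assumptions rather than reported figures:

```python
# Back-of-envelope parameter count for a GPT-style decoder-only transformer.
# Assumptions (not confirmed by OpenAI): vanilla architecture, d_model = 4096,
# ~100k-token vocabulary, and params-per-layer ~= 12 * d_model^2
# (attention ~4*d^2 for Q/K/V/O plus MLP ~8*d^2 for the two projections).

d_model = 4096
vocab = 100_000
n_layers = 32  # hypothetical depth

per_layer = 12 * d_model**2   # ~201M parameters per transformer block
embeddings = vocab * d_model  # ~0.41B for the token embedding matrix
total = n_layers * per_layer + embeddings

print(f"{total / 1e9:.2f}B parameters")  # ~6.85B, consistent with ~7B
```

With d_model = 4096 each block contributes about 0.2B parameters, so a depth in the low 30s lands near the 7B figure.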
MM1: Apple's first Large Multimodal Model
mm1 gemini-1 command-r claude-3-opus claude-3-sonnet claude-3-haiku claude-3 apple cohere anthropic hugging-face langchain multimodality vqa fine-tuning retrieval-augmented-generation open-source robotics model-training react reranking financial-agents yann-lecun francois-chollet
Apple announced the MM1 multimodal LLM family with up to 30B parameters, claiming performance comparable to Gemini-1 and beating larger older models on VQA benchmarks. The paper targets researchers and hints at applications in embodied agents, business, and education. Yann LeCun emphasized that human-level AI requires understanding the physical world, memory, reasoning, and hierarchical planning, while François Chollet cautioned that NLP is far from solved despite LLM advances. Cohere released Command-R, a model built for Retrieval Augmented Generation, and Anthropic highlighted the Claude 3 family (Opus, Sonnet, Haiku) for different application needs. The open-source DexCap hardware enables affordable data collection for dexterous robot manipulation. Tools like CopilotKit simplify AI integration into React apps, and migrating to Keras 3 with the JAX backend offers faster training. New projects improve reranking for retrieval and add financial agents to LangChain.
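On the RAG angle, Command-R's grounded chat accepts retrieved passages alongside the query and returns citations. A minimal sketch using the Cohere Python SDK's v1-style client; the document snippets are illustrative stand-ins for real retrieval output:

```python
# Minimal RAG sketch with Cohere's Command-R grounded chat.
# Assumes `pip install cohere` and a COHERE_API_KEY in the environment;
# the documents below are illustrative stand-ins for retrieved passages.
import os
import cohere

co = cohere.Client(os.environ["COHERE_API_KEY"])

docs = [
    {"title": "MM1 paper",
     "snippet": "Apple's MM1 is a multimodal LLM family with up to 30B parameters."},
    {"title": "Claude 3 launch",
     "snippet": "Anthropic's Claude 3 family spans Opus, Sonnet, and Haiku."},
]

response = co.chat(
    model="command-r",
    message="Which new multimodal model did Apple announce, and how large is it?",
    documents=docs,  # passages the model grounds its answer (and citations) on
)

print(response.text)
# response.citations maps answer spans back to the supplying documents,
# which is what reranking pipelines feed into after scoring candidates.
```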