All tags
Topic: "linear-attention"
not much happened today
Poolside raised $1B at a $12B valuation, and Eric Zelikman raised $1B after leaving xAI. Weavy joined Figma. New research highlights that FP16 precision reduces the training-inference mismatch in reinforcement-learning fine-tuning compared to BF16. Kimi introduced a hybrid KDA (Kimi Delta Attention) architecture that improves long-context throughput and RL stability, alongside a new Kimi CLI for coding with agent-protocol support. OpenAI previewed Agent Mode in ChatGPT, enabling autonomous research and planning during browsing.
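To make the FP16-vs-BF16 point concrete, here is a minimal, self-contained sketch (not code from the paper; the tensors and scale are illustrative) showing how rounding logits to each format perturbs the token log-probabilities that RL fine-tuning compares between training and inference: BF16's ~8 mantissa bits typically produce a larger drift than FP16's ~11.

```python
# Proxy for training/inference precision mismatch: round FP32 logits to a
# low-precision dtype, then measure how far the resulting log-probs drift
# from the FP32 reference. Illustrative only; not the paper's experiment.
import torch

torch.manual_seed(0)
logits_fp32 = torch.randn(4, 32000) * 4.0            # fake vocabulary logits
ref = torch.log_softmax(logits_fp32, dim=-1)          # FP32 "inference" reference

for dtype in (torch.bfloat16, torch.float16):
    rounded = logits_fp32.to(dtype).float()           # simulate storing logits in dtype
    approx = torch.log_softmax(rounded, dim=-1)
    gap = (approx - ref).abs().max().item()           # mismatch proxy
    print(f"{dtype}: max |delta log-prob| = {gap:.2e}")
```

Running this typically shows the BF16 drift is an order of magnitude larger than FP16's, which is the intuition behind preferring FP16 when the RL objective is sensitive to small log-probability discrepancies.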
Mergestral, Meta MTIAv2, Cohere Rerank 3, Google Infini-Attention
mistral-8x22b command-r-plus rerank-3 infini-attention llama-3 sd-1.5 cosxl meta-ai-fair mistral-ai cohere google stability-ai hugging-face ollama model-merging training-accelerators retrieval-augmented-generation linear-attention long-context foundation-models image-generation rag-pipelines model-benchmarking context-length model-performance aidan_gomez ylecun swyx
Meta announced its new MTIAv2 chips, designed to accelerate training and inference with an improved architecture and integration with PyTorch 2.0. Mistral released the 8x22B Mixtral model, which was then merged back into a dense model to effectively create a 22B Mistral model (the "Mergestral" of the title). Cohere launched Rerank 3, a foundation model for enterprise search and retrieval-augmented generation (RAG) that supports 100+ languages. Google published a paper on Infini-attention, an ultra-scalable linear attention mechanism demonstrated on 1B and 8B models at sequence lengths of up to 1 million tokens. Additionally, Meta's Llama 3 is expected to start rolling out soon. Other notable updates include Command R+, an open-weights model with a 128k context window reported to surpass GPT-4 on chatbot leaderboards, and advancements in Stable Diffusion models and RAG pipelines.
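Since this topic page is tagged linear-attention, a rough sketch of the generic linear-attention recurrence that compressive-memory approaches like Infini-attention build on may help; this is not Google's implementation, and the feature map, shapes, and function name are assumptions for illustration.

```python
# Generic linear attention with an additive memory: the per-step state is a
# d x d matrix (sum of k^T v) plus a d-vector normalizer, so cost does not
# grow with sequence length. Illustrative sketch, not Infini-attention itself.
import torch

def linear_attention_stream(q, k, v, eps=1e-6):
    """q, k, v: (seq_len, d) tensors. Returns (seq_len, d) outputs."""
    d = q.shape[-1]
    phi = lambda x: torch.nn.functional.elu(x) + 1.0   # positive feature map
    mem = torch.zeros(d, d)                            # running sum of outer(k, v)
    norm = torch.zeros(d)                              # running sum of mapped keys
    out = []
    for qt, kt, vt in zip(phi(q), phi(k), v):
        mem += torch.outer(kt, vt)                      # update compressive memory
        norm += kt
        out.append((mem.T @ qt) / (qt @ norm + eps))    # attention readout for this step
    return torch.stack(out)

print(linear_attention_stream(torch.randn(8, 16), torch.randn(8, 16), torch.randn(8, 16)).shape)
```

The key property is that the memory is a fixed-size summary updated additively per token, which is what lets such mechanisms scale to million-token contexts without quadratic attention cost.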