Company: "cognition"
Cognition's DeepWiki, a free encyclopedia of all GitHub repos
o4-mini perception-encoder qwen-2.5-vl dia-1.6b grok-3 gemini-2.5-pro claude-3.7 gpt-4.1 cognition meta-ai-fair alibaba hugging-face openai perplexity-ai vllm vision text-to-speech reinforcement-learning ocr model-releases model-integration open-source frameworks chatbots model-selector silas-alberti mervenoyann reach_vb aravsrinivas vikparuchuri lioronai
Silas Alberti of Cognition announced DeepWiki, a free encyclopedia of all GitHub repos providing Wikipedia-like descriptions and Devin-backed chatbots for public repos. Meta released Perception Encoders (PE) under the Apache 2.0 license, outperforming InternVL3 and Qwen2.5VL on vision tasks. Alibaba launched the Qwen Chat App for iOS and Android. Hugging Face integrated the Dia 1.6B SoTA text-to-speech model via FAL. OpenAI expanded deep research usage with a lightweight version powered by the o4-mini model, now available to free users. Perplexity AI updated its model selector with Grok 3 Beta, o4-mini, and support for models like Gemini 2.5 Pro, Claude 3.7, and GPT-4.1. The vLLM project introduced the OpenRLHF framework for reinforcement learning from human feedback. The Surya OCR alpha model supports 90+ languages and LaTeX. The MegaParse open-source library was introduced for LLM-ready data formats.
not much happened today
ic-light-v2 claude-3-5-sonnet puzzle nvidia amazon anthropic google pydantic supabase browser-company world-labs cognition distillation neural-architecture-search inference-optimization video trajectory-attention timestep-embedding ai-safety-research fellowship-programs api domain-names reverse-thinking reasoning agent-frameworks image-to-3d ai-integration akhaliq adcock_brett omarsar0 iscienceluvr
AI News for 11/29/2024-12/2/2024 highlights several developments: Nvidia introduced Puzzle, a distillation-based neural architecture search for inference-optimized large language models, enhancing efficiency. The IC-Light V2 model was released for varied illumination scenarios, and new video model techniques like Trajectory Attention and Timestep Embedding were presented. Amazon increased its investment in Anthropic to $8 billion, supporting AI safety research through a new fellowship program. Google is expanding AI integration with the Gemini API and open collaboration tools. Discussions on domain name relevance emphasize alternatives to .com, such as .io, .ai, and .co. Advances in reasoning include a 13.53% improvement in LLM performance using "Reverse Thinking". Pydantic launched a new agent framework, and Supabase released version 2 of its assistant. Other notable mentions include Browser Company teasing a second browser and World Labs launching image-to-3D-world technology. The NotebookLM team departed from Google, and Cognition was featured on the cover of Forbes. The news was summarized by Claude 3.5 Sonnet.
nothing much happened today
o1 chatgpt-4o llama-3-1-405b openai lmsys scale-ai cognition langchain qdrant rohanpaul_ai reinforcement-learning model-merging embedding-models toxicity-detection image-editing dependency-management automated-code-review visual-search benchmarking denny_zhou svpino alexandr_wang cwolferesearch rohanpaul_ai _akhaliq kylebrussell
Skeptics doubt that OpenAI's o1 model can be replicated in open source, citing its extreme restrictions and unique training advances such as RL on CoT. ChatGPT-4o shows significant performance improvements across benchmarks. Llama-3.1-405b fp8 and bf16 versions perform similarly, with cost benefits for fp8. A new open-source benchmark, "Humanity's Last Exam", offers $500K in prizes to challenge LLMs. Model merging benefits from neural network sparsity and linear mode connectivity. Embedding-based toxic prompt detection achieves high accuracy with low compute. InstantDrag enables fast, optimization-free drag-based image editing. LangChain v0.3 was released with improved dependency management. The automated code review tool CodeRabbit adapts to team coding styles. Visual search advances integrate multimodal data for better product search. Experts predict AI will be default software by 2030.
Everybody shipped small things this holiday weekend
gpt-4o-voice gemini claude jamba-1.5 mistral-nemo-minitron-8b xai google anthropic openai cognition ai21-labs nvidia langchain fine-tuning long-context parameter-efficient-fine-tuning latex-rendering real-time-audio virtual-try-on resource-tags low-code ai-agents workspace-organization model-benchmarking dario-amodei scott-wu fchollet svpino
xAI announced the Colossus 100k H100 cluster capable of training an FP8 GPT-4 class model in 4 days. Google introduced Structured Output for Gemini. Anthropic discussed Claude's performance issues possibly due to API prompt modifications. OpenAI enhanced controls for File Search in their Assistants API. Cognition and Anthropic leaders appeared on podcasts. The viral Kwai-Kolors virtual try-on model and the open-source real-time audio conversational model Mini-Omni (similar to gpt-4o-voice) were released. Tutorials on parameter-efficient fine-tuning with LoRA and QLoRA, long-context embedding challenges, and Claude's LaTeX rendering feature were highlighted. AI21 Labs released Jamba 1.5 models with a 256K context window and faster long-context performance. NVIDIA debuted Mistral-Nemo-Minitron-8B on the Open LLM Leaderboard. LangChain introduced resource tags for workspace organization, and a low-code AI app toolkit was shared by svpino. Legal AI agents and financial agent evaluations using LangSmith were also featured.
Summer of Code AI: $1.6b raised, 1 usable product
ltm-2 llama-3-1-405b gemini-advanced cognition poolside codeium magic google-deepmind nvidia google-cloud long-context model-efficiency custom-hardware cuda training-stack gpu-scaling neural-world-models diffusion-models quantization nat-friedman ben-chess rohan-paul
Code + AI is emphasized as a key modality in AI engineering, highlighting productivity and verifiability benefits. Recent major funding rounds include Cognition AI raising $175M, Poolside raising $400M, Codeium AI raising $150M, and Magic raising $320M. Magic announced its LTM-2 model with a 100 million token context window, claiming a sequence-dimension algorithm about 1000x cheaper than Llama 3.1 405B and drastically lower memory requirements. Magic's stack is built from scratch with custom CUDA and no open-source foundations; the company has partnered with Google Cloud and runs on NVIDIA H100 and GB200 GPUs, aiming to scale to tens of thousands of GPUs. Google DeepMind revealed updates to Gemini Advanced with customizable expert "Gems." Neural game engines like GameNGen can run DOOM in a diffusion model trained on 0.9B frames. The content also references LLM quantization research by Rohan Paul.
Claude Crushes Code - 92% HumanEval and Claude.ai Artifacts
claude-3.5-sonnet claude-3-opus gpt-4o anthropic openai cognition benchmarking model-performance coding model-optimization fine-tuning instruction-following model-efficiency model-release api performance-optimization alex-albert
Claude 3.5 Sonnet, released by Anthropic, is positioned as a Pareto improvement over Claude 3 Opus, operating at twice the speed and costing one-fifth as much. It achieves state-of-the-art results on benchmarks like GPQA, MMLU, and HumanEval, surpassing even GPT-4o and Claude 3 Opus on vision tasks. The model demonstrates significant advances in coding capabilities, passing 64% of test cases compared to 38% for Claude 3 Opus, and is capable of autonomously fixing pull requests. Anthropic also introduced the Artifacts feature, enabling users to interact with AI-generated content such as code snippets and documents in a dynamic workspace, similar to OpenAI's Code Interpreter. This release highlights improvements in performance, cost-efficiency, and coding proficiency, signaling a growing role for LLMs in software development.
ALL of AI Engineering in One Place
claude-3-sonnet claude-3 openai google-deepmind anthropic mistral-ai cohere hugging-face adept midjourney character-ai microsoft amazon nvidia salesforce mastercard palo-alto-networks axa novartis discord twilio tinder khan-academy sourcegraph mongodb neo4j hasura modular cognition anysphere perplexity-ai groq mozilla nous-research galileo unsloth langchain llamaindex instructor weights-biases lambda-labs neptune datastax crusoe covalent qdrant baseten e2b octo-ai gradient-ai lancedb log10 deepgram outlines crew-ai factory-ai interpretability feature-steering safety multilinguality multimodality rag evals-ops open-models code-generation gpus agents ai-leadership
The upcoming AI Engineer World's Fair in San Francisco from June 25-27 will feature a significantly expanded format with booths, talks, and workshops from top model labs like OpenAI, DeepMind, Anthropic, Mistral, Cohere, Hugging Face, and Character.ai. It includes participation from Microsoft Azure, Amazon AWS, Google Vertex, and major companies such as Nvidia, Salesforce, Mastercard, Palo Alto Networks, and more. The event covers 9 tracks, including RAG, multimodality, evals/ops, open models, code generation, GPUs, agents, AI in Fortune 500, and a new AI leadership track. Additionally, Anthropic shared interpretability research on Claude 3 Sonnet, revealing millions of interpretable features that can be steered to modify model behavior, including safety-relevant features related to bias and unsafe content, though more research is needed for practical applications. The event offers a discount code for AI News readers.