All tags
Company: "cognition-labs"
ChatGPT Canvas GA
llama-3-70b llama-3-1-8b tgi-v3 deepseek-v2.5-1210 coconut openai deepseek-ai meta-ai-fair huggingface cognition-labs hyperbolic google-deepmind code-execution gpt-integration model-finetuning gradient-checkpointing context-length latent-space-reasoning performance-optimization gpu-memory-optimization kubernetes gpu-marketplace ai-capabilities employment-impact neurips-2024 ai-scaling humor arav_srinivas sama jonathan-frankle dylan
OpenAI launched ChatGPT Canvas to all users, featuring code execution and GPT integration, effectively replacing Code Interpreter with a Google Docs-like interface. Deepseek AI announced their V2.5-1210 update improving performance on MATH-500 (82.8%) and LiveCodebench. Meta AI Fair introduced COCONUT, a new continuous latent space reasoning paradigm. Huggingface released TGI v3, processing 3x more tokens and running 13x faster than vLLM on long prompts. Cognition Labs released Devin, an AI developer building Kubernetes operators. Hyperbolic raised $12M Series A to build an open AI platform with an H100 GPU marketplace. Discussions included AI capabilities and employment impact, and NeurIPS 2024 announcements with Google DeepMind demos and a debate on AI scaling. On Reddit, Llama 3.3-70B supports 90K context length finetuning using Unsloth with gradient checkpointing and Apple's Cut Cross Entropy (CCE) algorithm, fitting on 41GB VRAM. Llama 3.1-8B reaches 342K context lengths with Unsloth, surpassing native limits.
DeepMind SIMA: one AI, 9 games, 600 tasks, vision+language ONLY
llama-3 claude-3-opus claude-3 gpt-3.5-turbo deepmind cognition-labs deepgram modal-labs meta-ai-fair anthropic multimodality transformer software-engineering ai-agents ai-infrastructure training text-to-speech speech-to-text real-time-processing model-architecture benchmarking andrej-karpathy arav-srinivas francois-chollet yann-lecun soumith-chintala john-carmack
DeepMind SIMA is a generalist AI agent for 3D virtual environments evaluated on 600 tasks across 9 games using only screengrabs and natural language instructions, achieving 34% success compared to humans' 60%. The model uses a multimodal Transformer architecture. Andrej Karpathy outlines AI autonomy progression in software engineering, while Arav Srinivas praises Cognition Labs' AI agent demo. François Chollet expresses skepticism about automating software engineering fully. Yann LeCun suggests moving away from generative models and reinforcement learning towards human-level AI. Meta's Llama-3 training infrastructure with 24k H100 Cluster Pods is shared by Soumith Chintala and Yann LeCun. Deepgram's Aura offers low-latency speech APIs, and Modal Labs' Devin AI demonstrates document navigation and interaction with ComfyUI. Memes and humor circulate in the AI community.
The world's first fully autonomous AI Engineer
gpt-4 devin cognition-labs openai reinforcement-learning fine-tuning long-term-reasoning planning ai-agents software-engineering model-integration asynchronous-chat ide agentic-ai patrick-collison fred-ehrsam tim-dettmers
Cognition Labs's Devin is highlighted as a potentially groundbreaking AI software engineer agent capable of learning unfamiliar technologies, addressing bugs, deploying frontend apps, and fine-tuning its own AI models. It integrates OpenAI's GPT-4 with reinforcement learning and features tools like asynchronous chat, browser, shell access, and an IDE. The system claims advanced long-term reasoning and planning abilities, attracting praise from investors like Patrick Collison and Fred Ehrsam. The technology is noted for its potential as one of the most advanced AI agents, sparking excitement about agents and AGI.