not much happened today
o1-full sora gpt-4.5 gpt-4 claude-3.5-sonnet llama-3-1-nemotron-51b llama-3-1 llama-3 nemotron-51b openai google-deepmind anthropic nvidia huggingface vision model-performance neural-architecture-search model-optimization multimodality model-release model-training reinforcement-learning image-generation lucas-beyer alexander-kolesnikov xiaohua-zhai aidan_mclau giffmana joannejang sama
OpenAI announced its "12 Days of OpenAI" event, featuring daily livestreams and potential releases including the full o1 model, the Sora video model, and GPT-4.5. Google DeepMind released GenCast, a weather model capable of producing 15-day forecasts in 8 minutes on TPU chips, and launched Genie 2, a model that generates playable 3D worlds from single images. Leading vision researchers Lucas Beyer, Alexander Kolesnikov, and Xiaohua Zhai moved from DeepMind to OpenAI, which is opening a Zürich office. Criticism arose over OpenAI's strategy and model quality compared to Anthropic's Claude 3.5 Sonnet. On Reddit, a modified llama.cpp build supports Nvidia's Llama-3_1-Nemotron-51B, which matches the performance of larger 70B models via neural architecture search (NAS) optimization.
Lilian Weng on Video Diffusion
wizardlm-2 llama-3 reka-core devin opus sora openai adobe reka-ai diffusion-models video-generation training-free-adaptation multimodality intuition creativity analogy-recognition self-improving-ai model-recognition agi-timelines model-performance startup-competition lilian-weng sam-altman geoffrey-hinton yann-lecun
OpenAI expands with a launch in Japan, introduces a Batch API, and partners with Adobe to bring the Sora video model to Premiere Pro. Reka AI releases Reka Core, a multimodal language model. WizardLM-2 is released with impressive performance, and Llama 3 news is expected soon. Geoffrey Hinton highlights AI models exhibiting intuition, creativity, and analogy recognition beyond human levels. The Devin AI model notably contributes to its own codebase, and Opus demonstrates the ability to recognize its own generated outputs. Sam Altman warns startups that they risk being steamrolled by OpenAI if they don't adapt quickly. Yann LeCun discusses AGI timelines, arguing AGI is inevitable but neither imminent nor achievable through LLMs alone. Lilian Weng's blog post on diffusion models for video generation highlights training-free adaptation as a breakthrough technique.
Companies being liable for AI hallucinations is Good Actually for AI Engineers
mistral-next large-world-model sora babilong air-canada huggingface mistral-ai quantization retrieval-augmented-generation fine-tuning cuda-optimization video-generation ai-ethics dataset-management open-source community-driven-development andrej-karpathy
Air Canada faced a legal ruling requiring it to honor refund policies communicated by its AI chatbot, setting a precedent for corporate liability in AI engineering accuracy. The tribunal ordered a refund of CAD $650.88 plus damages after the chatbot misled a customer about bereavement travel refunds. Meanwhile, AI community discussions highlighted innovations in quantization techniques for GPU inference, Retrieval-Augmented Generation (RAG) and fine-tuning of LLMs, and CUDA optimizations for PyTorch models. New prototype models like Mistral-Next and the Large World Model (LWM) were introduced, showcasing advances in handling large text contexts, alongside video generation with models like Sora. Ethical and legal implications of AI autonomy were debated, as were challenges in dataset management. Community-driven projects such as bazed-af, an open-source TypeScript agent framework, emphasize collaborative AI development. Additionally, benchmarks like BABILong for evaluating contexts of up to 10M tokens and tools from Andrej Karpathy were noted.
Sora pushes SOTA
gemini-1.5 sora h20-gpt mistral-7b llama-13b mistralcasualml mixtral-instruct yi-models openai google-deepmind nvidia mistral-ai h2oai multimodality gpu-power-management long-context model-merging fine-tuning retrieval-augmented-generation role-play-model-optimization cross-language-integration training-loss synthetic-data-generation coding-support
An analysis of Discord communities spanning over 20 guilds, 312 channels, and 10,550 messages reveals intense discussion of AI developments. Key highlights include a Dungeon Master AI assistant for Dungeons and Dragons built on models like H20 GPT, GPU power supply debates involving 3090 and 3060 GPUs, and excitement around Google's Gemini 1.5 with its 1 million token context window and OpenAI's Sora model. Challenges with large world model (LWM) multimodality, GPT-assisted coding, and role-play model optimization with Yi models and Mixtral Instruct were discussed. Technical issues were also prominent, including model merging errors with MistralCasualML, fine-tuning scripts like AutoFineTune, and cross-language engineering via JSPyBridge. NVIDIA's Chat with RTX feature, which leverages retrieval-augmented generation (RAG) on 30-series and newer GPUs, was compared to LMStudio's support for Mistral 7B and Llama 13B models. The community is cautiously optimistic about these frontier models' applications in media and coding.