All tags
Topic: "activation-functions"
Evals: The Next Generation
gpt-4 gpt-5 gpt-3.5 phi-3 mistral-7b llama-3 scale-ai mistral-ai reka-ai openai moderna sanctuary-ai microsoft mit meta-ai-fair benchmarking data-contamination multimodality fine-tuning ai-regulation ai-safety ai-weapons neural-networks model-architecture model-training model-performance robotics activation-functions long-context sam-altman jim-fan
Scale AI highlighted issues with data contamination in benchmarks like MMLU and GSM8K, proposing a new benchmark where Mistral overfits and Phi-3 performs well. Reka released the VibeEval benchmark for multimodal models addressing multiple choice benchmark limitations. Sam Altman of OpenAI discussed GPT-4 as "dumb" and hinted at GPT-5 with AI agents as a major breakthrough. Researchers jailbroke GPT-3.5 via fine-tuning. Global calls emerged to ban AI-powered weapons, with US officials urging human control over nuclear arms. Ukraine launched an AI consular avatar, while Moderna partnered with OpenAI for medical AI advancements. Sanctuary AI and Microsoft collaborate on AI for general-purpose robots. MIT introduced Kolmogorov-Arnold networks with improved neural network efficiency. Meta AI is training Llama 3 models with over 400 billion parameters, featuring multimodality and longer context.