All tags
Person: "zachtratar"
Microsoft AgentInstruct + Orca 3
mistral-7b orca-2.5 microsoft-research apple tencent hugging-face synthetic-data fine-tuning instruction-following transformers model-performance hallucination-detection dataset-quality flashattention mixture-of-experts philschmid sama bindureddy rohanpaul_ai zachtratar dair_ai
Microsoft Research released AgentInstruct, the third paper in its Orca series, introducing a generative teaching pipeline that produces 25.8 million synthetic instructions to fine-tune mistral-7b, achieving significant performance gains: +40% AGIEval, +19% MMLU, +54% GSM8K, +38% BBH, +45% AlpacaEval, and a 31.34% reduction in hallucinations. This synthetic data approach follows the success of FineWeb and Apple's Rephrasing research in improving dataset quality. Additionally, Tencent claims to have generated 1 billion diverse personas for synthetic data. On AI Twitter, notable discussions included a shooting incident at a Trump rally and recent ML research highlights such as FlashAttention-3, RankRAG, and Mixture of A Million Experts.