Person: "yi-tay"
SciCode: HumanEval gets a STEM PhD upgrade
gpt-4 claude-3.5-sonnet llama-3-7b llama-3 dolphin-2.9.3-yi-1.5-34b-32k-gguf anthropic hugging-face nvidia benchmarks coding model-training gpu-optimization model-performance synthetic-data compiler-optimization zero-shot-learning yi-tay rohanpaul_ai alexalbert__ tri_dao abacaj
The new SciCode benchmark, built from PhD-level scientific coding problems, shows how hard this task remains for LLMs: GPT-4 and Claude 3.5 Sonnet both score under 5%. Anthropic doubled the maximum output token limit for Claude 3.5 Sonnet to 8192 tokens. The Q-GaLore method enables training LLaMA-7B on a single 16GB GPU. The Mosaic compiler now generates efficient code for NVIDIA H100 GPUs. The Dolphin 2.9.3-Yi-1.5-34B-32k-GGUF model on Hugging Face has over 111k downloads. Llama 3 shows strong performance, achieving 90% zero-shot accuracy on the MATH dataset. Discussions continue on the limitations and forms of synthetic data for model training.