All tags
Company: "glaive"
Reflection 70B, by Matt from IT Department
llama-3.1-70b llama-3 claude-3.5-sonnet hyperwrite glaive fine-tuning chain-of-thought instruction-following synthetic-data quantization model-evaluation prompt-engineering matt-shumer sahil-chaudhary
Reflection Tuning technique has been used by a two-person team from Hyperwrite and Glaive to finetune llama-3.1-70b, showing strong performance improvements with minimal synthetic data. The approach builds on the concept of adding
thinking
and reflection
steps to outputs, related to the Chain of Thought method. Despite some criticisms like contamination concerns, worse coding performance, and reliance on system prompts, the model has received positive reception and comparisons to claude-3.5-sonnet. The work highlights efficient instruction tuning and synthetic data generation for large models.