Model: "llama-3-1-1b"

Jan 08, 2024

1/6-7/2024: LLaMA Pro - an alternative to PEFT/RAG??

llama-3 llama-3-1-1b llama-3-8-3b gpt-4 gpt-3.5 dall-e openai mistral-ai llamaindex langchain fine-tuning model-expansion token-limits privacy multilinguality image-generation security custom-models model-training yannic-kilcher

New research papers introduce promising Llama extensions, including TinyLlama, a compact 1.1B parameter model pretrained on about 1 trillion tokens for 3 epochs, and LLaMA Pro, an 8.3B parameter model that expands LLaMA2-7B with additional training on 80 billion tokens of code and math data. LLaMA Pro adds layers to avoid catastrophic forgetting and balances language and code tasks, but it faces scrutiny for not using newer models like Mistral or Qwen. Meanwhile, OpenAI Discord discussions cover GPT-4 token limits, privacy reassurances, fine-tuning for GPT-3.5, challenges with multi-language image recognition, custom GPT creation requiring ChatGPT Plus, and security concerns in GPT deployment. Users also share tips on dynamic image generation with DALL-E and on logo creation.
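To make the "adds layers to avoid catastrophic forgetting" idea concrete, here is a toy sketch rather than the paper's code. It assumes the added blocks are zero-initialised so they start out as identity mappings, and that only the new blocks are trained while the originals stay frozen; the summary above does not spell out that mechanism. The names `Block`, `run`, and `expand` are hypothetical, and scalars stand in for real transformer blocks.

```python
from dataclasses import dataclass


@dataclass
class Block:
    # Scalar stand-in for a transformer block's output projection (assumption).
    weight: float
    trainable: bool = False

    def forward(self, x: float) -> float:
        # Residual form: the block adds its contribution to the input.
        return x + self.weight * x


def run(blocks, x):
    """Pass x through every block in order."""
    for block in blocks:
        x = block.forward(x)
    return x


def expand(blocks, every):
    """Insert one zero-initialised, trainable block after each group of
    `every` original blocks. Originals are copied and marked frozen."""
    expanded = []
    for i, block in enumerate(blocks, start=1):
        expanded.append(Block(block.weight, trainable=False))
        if i % every == 0:
            # Zero weight means forward(x) == x, so the output is unchanged.
            expanded.append(Block(0.0, trainable=True))
    return expanded


base = [Block(0.1), Block(-0.2), Block(0.05), Block(0.3)]
grown = expand(base, every=2)

print(len(base), len(grown))
print(run(base, 1.0) == run(grown, 1.0))
```

Because each new block starts as an identity mapping, the expanded stack reproduces the original outputs before any further training, which is the intuition behind keeping existing abilities while the new blocks learn code and math.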

