Company: "zeiss"
AI2 releases OLMo - the 4th open-everything LLM
olmo-1b olmo-7b olmo-65b miqu-70b mistral-medium distilbert-base-uncased ai2 allenai mistral-ai tsmc asml zeiss fine-tuning gpu-shortage embedding-chunking json-generation model-optimization reproducible-research self-correction vram-constraints programming-languages nathan-lambert lhc1921 mrdragonfox yashkhare_ gbourdin
AI2 is gaining attention in 2024 with its new OLMo models, released in 1B and 7B sizes with a 65B model forthcoming, which emphasize open and reproducible research in the spirit of Pythia. The Miqu-70B model, especially the Mistral Medium variant, is praised for self-correction and speed optimizations. Discussions in TheBloke Discord covered programming language preferences, VRAM constraints for large models, and fine-tuning experiments with distilbert-base-uncased. The Mistral Discord discussed the GPU shortage and the semiconductor supply chain behind it (TSMC, ASML, and Zeiss), debated open-source versus proprietary models, and explored fine-tuning techniques, including LoRA, for low-resource languages. Community insights also touched on embedding chunking strategies and improvements to JSON output.