Topic: "llm-on-consoles"
GPT4Turbo A/B Test: gpt-4-1106-preview
gpt-4-turbo gpt-4 gpt-3.5 openhermes-2.5-mistral-7b-4.0bpw exllamav2 llama-2-7b-chat mistral-instruct-v0.2 mistrallite llama2 openai huggingface thebloke nous-research mistral-ai langchain microsoft azure model-loading rhel dataset-generation llm-on-consoles fine-tuning speed-optimization api-performance prompt-engineering token-limits memory-constraints text-generation nlp-tools context-window-extension sliding-windows rope-theta non-finetuning-context-extension societal-impact
OpenAI released a new GPT-4 Turbo version, prompting a natural experiment in summarization that compares the November 2023 and January 2024 versions. The TheBloke Discord covered troubleshooting model loading errors with OpenHermes-2.5-Mistral-7B-4.0bpw and exllamav2, the merits of RHEL for ML work, dataset generation for understanding GPT flaws, and running LLMs such as Llama and Mistral on consoles. Challenges with fine-tuning Llama2 through LangChain were also noted. The OpenAI Discord highlighted inconsistent GPT-4 speeds, API versus web performance, prompt engineering with GPT-3.5 and GPT-4 Turbo, and DALL-E misspelling text inside generated images. Discussions also touched on NLP tools like semantic-text-splitter and on collaboration concerns with GPT-4 Vision on Azure. The Nous Research AI Discord focused on extending context windows with Mistral Instruct v0.2, MistralLite, and LLaMA-2-7B-Chat, reaching a 16,384-token context, plus alternatives like SelfExtend that extend context without fine-tuning. The societal impact of AI technology was also considered.