Company: "codium"

Jan 22, 2024

gpt-5 mixtral-7b gpt-3.5 gemini-pro gpt-4 llama-cpp openai codium thebloke amd hugging-face mixture-of-experts fine-tuning model-merging 8-bit-optimization gpu-acceleration performance-comparison command-line-ai vector-stores embeddings coding-capabilities sam-altman ilya-sutskever itamar andrej-karpathy

Sam Altman at Davos highlighted that his top priority is launching the new model, likely called GPT-5, while expressing uncertainty about Ilya Sutskever's employment status. Itamar from Codium introduced the concept of Flow Engineering with AlphaCodium, gaining attention from Andrej Karpathy. On the TheBloke Discord, engineers discussed a multi-specialty mixture-of-experts (MOE) model combining seven distinct 7 billion parameter models specialized in law, finance, and medicine. Debates on 8-bit fine-tuning and the use of bitsandbytes with GPU support were prominent. Discussions also covered model merging using tools like Mergekit and compatibility with Alpaca format. Interest in optimizing AI models on AMD hardware using AOCL blas and lapack libraries with llama.cpp was noted. Users experimented with AI for command line tasks, and the Mixtral MoE model was refined to surpass larger models in coding ability. Comparisons among LLMs such as GPT-3.5, Mixtral, Gemini Pro, and GPT-4 focused on knowledge depth, problem-solving, and speed, especially for coding tasks.

You can also subscribe by rss .

Press Esc or click anywhere to close