Company: "intel"
OLMo 2 - new SOTA Fully Open LLM
llama-3-1-8b olmo-2 qwen2-5-72b-instruct smolvlm tulu-3 ai2 huggingface intel reinforcement-learning quantization learning-rate-annealing ocr fine-tuning model-training vision
AI2 has updated OLMo 2 to be roughly equivalent to Llama 3.1 8B, training on 5T tokens with learning rate annealing and new high-quality data (Dolmino). They credit Tülu 3 and its "Reinforcement Learning with Verifiable Rewards" approach. On Reddit, the Qwen2.5-72B Instruct model is reported to show near-lossless performance with AutoRound 4-bit quantization; 4-bit and 2-bit versions are available on Hugging Face, and the thread discusses the MMLU benchmark and quantization-aware training. Hugging Face released SmolVLM, a 2B-parameter vision-language model that runs efficiently on consumer GPUs, supports fine-tuning on Google Colab, and demonstrates strong OCR capabilities with adjustable resolution and quantization options.
How Carlini Uses AI
gemma-2-2b gpt-3.5-turbo-0613 mixtral-8x7b gen-3-alpha segment-anything-model-2 stable-fast-3d groq intel deepmind box figure-ai openai google meta-ai-fair nvidia stability-ai runway benchmarking adversarial-attacks large-language-models text-generation multimodality robotics emotion-detection structured-data-extraction real-time-processing teleoperation 3d-generation text-to-video nicholas-carlini chris-dixon rasbt
Groq's shareholders' net worth rises while others fall, with Intel's CEO expressing concern. Nicholas Carlini of DeepMind gains recognition and criticism for his extensive AI writings, including an 80,000-word treatise on AI use and a benchmark for large language models. Chris Dixon comments on AI Winter skepticism, emphasizing long-term impact. Box introduces an AI API for extracting structured data from documents, highlighting the potential and risks of LLM-driven solutions. Recent AI developments include Figure AI launching the advanced humanoid robot Figure 02, OpenAI rolling out Advanced Voice Mode for ChatGPT with emotion detection, Google open-sourcing the Gemma 2 2B model, which matches GPT-3.5-Turbo-0613 performance, Meta AI (FAIR) releasing Segment Anything Model 2 (SAM 2) for real-time object tracking, NVIDIA showcasing Project GR00T for humanoid teleoperation with Apple Vision Pro, Stability AI launching Stable Fast 3D for rapid 3D asset generation, and Runway unveiling Gen-3 Alpha for AI text-to-video generation.
Mixtral 8x22B Instruct sparks efficiency memes
mixtral-8x22b llama-2-7b olmo-7b mistral-ai hugging-face google microsoft intel softbank nvidia multilinguality math code-generation context-window model-performance model-release retrieval-augmented-generation deepfake ai-investment ai-chip hybrid-architecture training-data guillaume-lample osanseviero _philschmid svpino
Mistral released an instruct-tuned version of their Mixtral 8x22B model, notable for using only 39B active parameters during inference, outperforming larger models, and supporting 5 languages with a 64k context window and math/code capabilities. The model is available on Hugging Face under an Apache 2.0 license for local use. Google plans to invest over $100 billion in AI, with other giants like Microsoft, Intel, and SoftBank also making large investments. The UK criminalized non-consensual deepfake porn, raising enforcement debates. A former Nvidia employee claims Nvidia's AI chip lead is unmatchable this decade. AI companions could become a $1 billion market. AI has surpassed humans on several basic tasks but lags on complex ones. Zyphra introduced Zamba, a novel 7B-parameter hybrid model that outperforms Llama 2 7B and OLMo-7B with less training data and was trained on 128 H100 GPUs over 30 days. The GroundX API advances retrieval-augmented generation accuracy.
Karpathy emerges from stealth?
mistral-7b mixtral-8x7b zephyr-7b gpt-4 llama-2 intel mistral-ai audiogen thebloke tokenization quantization model-optimization fine-tuning model-merging computational-efficiency memory-optimization retrieval-augmented-generation multi-model-learning meta-reasoning dataset-sharing open-source ethical-ai community-collaboration andrej-karpathy
Andrej Karpathy released a comprehensive 2-hour tutorial on tokenization, detailing techniques up to GPT-4's tokenizer and noting the complexity of Llama 2 tokenization with SentencePiece. Discussions in AI Discord communities covered model optimization and efficiency, focusing on quantization of models like Mistral 7B and Zephyr-7B to reduce memory usage on consumer GPUs, including Intel's new weight-only quantization algorithm. Efforts to improve computational efficiency included selective augmentation, reported to reduce costs by 57.76%, and a discussion of memory tokens versus kNN for Transformers. Members shared hardware compatibility challenges and software issues, alongside fine-tuning techniques such as LoRA and model merging. Innovative applications of LLMs in retrieval-augmented generation (RAG), multi-model learning, and meta-reasoning were explored. The community emphasized dataset sharing, open-source releases like SDXL VAE encoded datasets and Audiogen AI codecs, and ethical AI use, including censorship and guardrails. Collaboration and resource sharing remain strong in these AI communities.
The Dissection of Smaug (72B)
smaug-72b qwen-1.0 qwen-1.5 gpt-4 mistral-7b miqumaid wizardlm_evol_instruct_v2_196k openhermes-2.5 abacus-ai hugging-face nous-research laion thebloke lm-studio intel nvidia elevenlabs fine-tuning model-merging quantization web-ui model-conversion hardware-setup privacy image-generation optical-character-recognition prompt-engineering bindureddy
Abacus AI launched Smaug 72B, a large finetune of Qwen 1.0, which remains unchallenged on the Hugging Face Open LLM Leaderboard despite skepticism from Nous Research. LAION introduced a local voice assistant model named Bud-E with a notable demo. TheBloke Discord community discussed model performance trade-offs between large models like GPT-4 and smaller quantized models, fine-tuning techniques using datasets like WizardLM_evol_instruct_V2_196k and OpenHermes-2.5, and challenges in web UI development and model merging involving Mistral-7B and MiquMaid. The LM Studio Discord highlighted issues with model conversion from PyTorch to GGUF, hardware setups involving Intel Xeon CPUs and Nvidia P40 GPUs, privacy concerns, and limitations in image generation and web UI availability.