Topic: "optical-character-recognition"
not much happened today
llama-3-2-vision gpt-2 meta-ai-fair ollama amd llamaindex gemini gitpod togethercompute langchainai weights-biases stanfordnlp deeplearningai model-scaling neural-networks multi-gpu-support skip-connections transformers healthcare-ai automated-recruitment zero-trust-security small-language-models numerical-processing chain-of-thought optical-character-recognition multi-agent-systems agent-memory interactive-language-learning bindureddy fstichler stasbekman jxmnop omarsar0 giffmana rajammanabrolu
This week in AI news highlights Ollama 0.4 supporting Meta's Llama 3.2 Vision models (11B and 90B), with applications such as handwriting recognition. Self-Consistency Preference Optimization (ScPO) was introduced to improve model consistency without human labels. Discussions covered model scaling, the resurgence of neural networks, and AMD's multi-GPU bandwidth challenges, and the importance of skip connections in Transformers was emphasized. In healthcare, commentators argued that lighter regulation combined with AI could transform the treatment of disease and aging. Tools like LlamaParse and Gemini were shown extracting automated insights from resumes for recruitment. Gitpod Flex demonstrated a zero-trust architecture for secure development environments. Research highlights include surveys on Small Language Models (SLMs), work on number understanding in LLMs, and DTrOCR, which uses a GPT-2 decoder for OCR. TogetherCompute and LangChainAI discussed multi-agent systems in prediction markets. Community events include a NeurIPS Happy Hour, NLP seminars, and a course on Agent Memory that treats LLMs as operating systems.
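To make the handwriting-recognition use case concrete, the sketch below sends a scanned image to a locally running Llama 3.2 Vision model through the Ollama Python client. It assumes Ollama 0.4+ is running, the llama3.2-vision model has been pulled, and the `ollama` package is installed; the prompt and file name are illustrative placeholders, not details from the source.

```python
# Minimal sketch: handwriting OCR with Llama 3.2 Vision via the Ollama Python client.
# Assumes `pip install ollama`, `ollama pull llama3.2-vision`, and a local Ollama 0.4+ server.
import ollama

response = ollama.chat(
    model="llama3.2-vision",  # 11B by default; use "llama3.2-vision:90b" for the larger variant
    messages=[
        {
            "role": "user",
            "content": "Transcribe the handwritten text in this image as plain text.",
            "images": ["handwritten_note.jpg"],  # placeholder path to a scanned page
        }
    ],
)

print(response["message"]["content"])  # the model's transcription
```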
The Dissection of Smaug (72B)
smaug-72b qwen-1.0 qwen-1.5 gpt-4 mistral-7b miqumaid wizardlm_evol_instruct_v2_196k openhermes-2.5 abacus-ai hugging-face nous-research laion thebloke lm-studio intel nvidia elevenlabs fine-tuning model-merging quantization web-ui model-conversion hardware-setup privacy image-generation optical-character-recognition prompt-engineering bindureddy
Abacus AI launched Smaug 72B, a large finetune of Qwen 1.0 that remains unchallenged on the Hugging Face Open LLM Leaderboard despite skepticism from Nous Research. LAION introduced Bud-E, a local voice assistant model, with a notable demo. The TheBloke Discord discussed performance trade-offs between large models like GPT-4 and smaller quantized models, fine-tuning with datasets such as WizardLM_evol_instruct_V2_196k and OpenHermes-2.5, and challenges in web UI development and in model merging involving Mistral-7b and MiquMaid. The LM Studio Discord highlighted issues with converting PyTorch models to GGUF, hardware setups built around Intel Xeon CPUs and Nvidia P40 GPUs, privacy concerns, and limitations in image generation and web UI availability.
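To ground the quantization and PyTorch-to-GGUF discussion, here is a minimal sketch of loading a GGUF-quantized checkpoint with the llama-cpp-python bindings, the kind of file LM Studio and TheBloke's releases consume. The model path, quantization suffix, and runtime parameters are assumptions for illustration, not settings from the source; the trade-off it illustrates is that lower-bit quantizations (e.g. Q4) shrink memory use at some cost in accuracy.

```python
# Minimal sketch: running a GGUF-quantized model with llama-cpp-python.
# Assumes `pip install llama-cpp-python` and a local GGUF file produced by
# llama.cpp's conversion/quantization tooling (path and quant level are placeholders).
from llama_cpp import Llama

llm = Llama(
    model_path="./mistral-7b-instruct.Q4_K_M.gguf",  # hypothetical 4-bit quantized file
    n_ctx=4096,        # context window
    n_gpu_layers=-1,   # offload all layers to GPU if one is available (e.g. a P40)
)

result = llm(
    "Summarize the trade-offs of 4-bit quantization in one sentence.",
    max_tokens=128,
    temperature=0.2,
)

print(result["choices"][0]["text"])
```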