not much happened today
Tags: glm-4.7-flash, grok, deepseek-r1, qwq, x-ai, unsloth-ai, google, deepseek, ollama, transformer-architecture, recommendation-systems, local-inference, kv-cache, quantization, tensor-parallelism, reasoning, model-optimization, fine-tuning, giffmana, david_sholz, yuchenj_uw, nearcyan, sam_paech, teortaxes_tex, danielhanchen, alexocheema, nopmobiel, rohanpaul_ai
X Engineering open-sourced its new transformer-based recommender algorithm, sparking community debate on transparency and fairness. GLM-4.7-Flash (30B-A3B) gains momentum as a strong local inference model with efficient KV-cache management and quantization tuning strategies. Innovations include tensor parallelism on Mac Minis achieving ~100 tok/s throughput. Research highlights "Societies of Thought" as a reasoning mechanism improving model accuracy by 20%+.
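The local-inference story above hinges on KV-cache size, which is why quantizing the cache matters so much. A back-of-envelope estimate makes this concrete; note the layer/head numbers below are illustrative assumptions, not GLM-4.7-Flash's published configuration:

```python
def kv_cache_bytes(n_layers, n_kv_heads, head_dim, seq_len, bytes_per_elt):
    # 2 tensors (K and V) per layer, each of shape [n_kv_heads, seq_len, head_dim]
    return 2 * n_layers * n_kv_heads * head_dim * seq_len * bytes_per_elt

# Illustrative config (assumed, not the model's real numbers)
layers, kv_heads, hdim, ctx = 48, 8, 128, 32768

fp16 = kv_cache_bytes(layers, kv_heads, hdim, ctx, 2)  # 16-bit cache
q8   = kv_cache_bytes(layers, kv_heads, hdim, ctx, 1)  # 8-bit quantized cache

print(f"fp16 KV cache: {fp16 / 2**30:.1f} GiB")  # → 6.0 GiB
print(f"q8   KV cache: {q8 / 2**30:.1f} GiB")    # → 3.0 GiB
```

Halving the cache's bit width halves its memory footprint linearly with context length, which is what makes long contexts feasible on consumer hardware.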
Qwen with Questions: 32B open-weights reasoning model nears o1 on GPQA/AIME/Math500
Tags: deepseek-r1, qwq, gpt-4o, claude-3.5-sonnet, qwen-2.5, llama-cpp, deepseek, sambanova, hugging-face, dair-ai, model-releases, benchmarking, fine-tuning, sequential-search, inference, model-deployment, agentic-rag, external-tools, multi-modal-models, justin-lin, clementdelangue, ggerganov, vikparuchuri
DeepSeek R1 leads the race for "open o1" models but has yet to release weights, while Justin Lin released QwQ, a 32B open-weights model that outperforms GPT-4o and Claude 3.5 Sonnet on several reasoning benchmarks. QwQ appears to be a fine-tune of Qwen 2.5 that emphasizes sequential search and reflection for complex problem-solving. SambaNova promotes its RDUs as superior to GPUs for inference, highlighting the industry's shift from training to inference workloads. On Twitter, Hugging Face announced CPU deployment for llama.cpp instances, Marker v1 shipped as a faster and more accurate PDF-to-markdown conversion tool, and Agentic RAG work focused on integrating external tools and advanced LLM chains to improve response accuracy. Open-source momentum continues with models like Flux, reflecting a broader shift toward multi-modal models spanning image, video, audio, and biology.
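The Agentic RAG pattern mentioned above amounts to an LLM deciding between calling an external retrieval tool and answering directly. A minimal sketch with stubbed components illustrates the control flow; the `route`, `retrieve`, and `agent` functions here are hypothetical stand-ins, not any specific framework's API:

```python
# Minimal agentic-RAG loop. The "LLM" is a stub router that decides whether
# to call a retrieval tool before answering; all components are hypothetical
# stand-ins for a real LLM plus a vector store.

def retrieve(query, corpus):
    # Toy retriever: return documents sharing at least one word with the query.
    words = set(query.lower().split())
    return [doc for doc in corpus if words & set(doc.lower().split())]

def route(query):
    # Stub for the LLM's tool-choice step: factual-looking questions get
    # routed to retrieval; everything else is answered directly.
    return "retrieve" if query.lower().startswith(("what", "who", "when")) else "answer"

def agent(query, corpus):
    if route(query) == "retrieve":
        docs = retrieve(query, corpus)
        context = " | ".join(docs) if docs else "no results"
        return f"[grounded in: {context}]"
    return "[answered from parametric knowledge]"

corpus = ["QwQ is a 32B reasoning model", "Flux is an image model"]
print(agent("What about QwQ", corpus))   # → [grounded in: QwQ is a 32B reasoning model]
print(agent("Summarize today", corpus))  # → [answered from parametric knowledge]
```

Real implementations replace the keyword router with a tool-calling LLM and the word-overlap retriever with embedding search, but the branch structure (decide, optionally fetch context, then answer) is the same.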