Topic: "prompt-engineering"

not much happened today

not much happened today

Open Responses: explicit spec for OpenAI's Responses API supported by OpenRouter, Ollama, Huggingface, vLLM, et al

not much happened today

OpenRouter's State of AI - An Empirical 100 Trillion Token Study

not much happened today

OpenAI's IMO Gold model also wins IOI Gold

OpenAI rolls out GPT-5 and GPT-5 Thinking to >1B users worldwide; -mini and -nano help claim Pareto Frontier

not much happened today

Context Engineering: Much More than Prompts

not much happened today

Promptable Prosody, SOTA ASR, and Semantic VAD: OpenAI revamps Voice AI

not much happened today

not much happened today

Gemini (Experimental-1114) retakes #1 LLM rank with 1344 Elo

Common Corpus: 2T Open Tokens with Provenance

not much happened today

Creating a LLM-as-a-Judge

not much happened today

not much happened today + AINews Podcast?

Reflection 70B, by Matt from IT Department

not much happened today

Gemma 2 2B + Scope + Shield

Problems with MMLU-Pro

not much happened today

Ten Commandments for Deploying Fine-Tuned Models

GPT-4o: the new SOTA-EVERYTHING Frontier model (GPT4T version)

Quis promptum ipso promptiet?

Anime pfp anon eclipses $10k A::B prompting challenge

Evals-based AI Engineering

The Dissection of Smaug (72B)

MetaVoice & RIP Bard

RWKV "Eagle" v5: Your move, Mamba

GPT4Turbo A/B Test: gpt-4-1106-preview

RIP Latent Diffusion, Hello Hourglass Diffusion

1/13-14/2024: Don't sleep on #prompt-engineering

1/11/2024: Mixing Experts vs Merging Models

1/10/2024: All the best papers for AI Engineers

1/2/2024: Smol tweaks to Smol Talk

1/1/2024: How to start with Open Source AI

12/22/2023: Anyscale's Benchmark Criticisms

12/21/2023: The State of AI (according to LangChain)

12/20/2023: Project Obsidian - Multimodal Mistral 7B from Nous

12/19/2023: Everybody Loves OpenRouter

12/18/2023: Gaslighting Mistral for fun and profit

12/15/2023: Mixtral-Instruct beats Gemini Pro (and matches GPT3.5)

12/14/2023: $1e7 for Superalignment

12/12/2023: Towards LangChain 0.1

12/8/2023: Mamba v Mistral v Hyena

12/7/2023: Anthropic says "skill issue"

Is Google's Gemini... legit?