Archive
Historical Challenges
Read-only corpus — submissions are closed. New weekly challenges rotate at /challenges.
Prompt Engineering (45)
Examples in the Preamble · Very Easy · 0 solves
Temperature & Creativity · Very Easy · 56 solves
The Heat Knob · Very Easy · 0 solves
The Hidden Layer · Very Easy · 54 solves
The RCTF Framework · Very Easy · 0 solves
The Skeleton's Name · Very Easy · 0 solves
The Step-by-Step Spell · Very Easy · 0 solves
Token Limits · Very Easy · 46 solves
Anthropic's Favorite Brackets · Easy · 0 solves
Anthropic Turn Format [Anthropic] · Easy · 34 solves
Majority Wins · Easy · 0 solves
Pruned Reasoning · Easy · 0 solves
Role of the Priest · Easy · 0 solves
Self-Consistency · Easy · 52 solves
Structured Output [OpenAI] · Easy · 51 solves
The Magic Phrase · Easy · 57 solves
The Magic Phrase · Easy · 0 solves
The RCTF Framework · Easy · 42 solves
XML Delimiters [Anthropic] · Easy · 53 solves
A Tiny Spoonful of Reflection · Medium · 0 solves
Constitutional AI [Anthropic] · Medium · 48 solves
Few-Shot Balance · Medium · 41 solves
First Things First · Medium · 37 solves
Meta-Prompting · Medium · 47 solves
Prompt Caching [Anthropic] · Medium · 37 solves
Signals From the Margin · Medium · 0 solves
The Cascade of Smaller Prompts · Medium · 0 solves
The Forbidden List · Medium · 33 solves
The Lost Middle · Medium · 0 solves
The Paper That Took Four · Medium · 0 solves
Divide and Conquer · Hard · 46 solves
Latent Wedges Between Your Ribs · Hard · 0 solves
Markdown Exfiltration · Hard · 58 solves
Read It Back · Hard · 49 solves
The Alias for a Banned Word · Hard · 0 solves
The Chess-Playing Aside · Hard · 0 solves
The Constitution of the Tiny Law Firm · Hard · 0 solves
The Token Weighted in Whispers · Hard · 0 solves
Tree of Thoughts · Hard · 44 solves
Dual LLM Pattern · Expert · 45 solves
Meta's Serpentine Optimizer · Expert · 0 solves
Nanda's Dual Triumph · Expert · 0 solves
The Calligrapher's Gemma Trick · Expert · 0 solves
The Oracle's Shadow · Expert · 0 solves
The Smuggler of Side Channels · Expert · 0 solves
Agentic Architectures (40)
Function Calling · Very Easy · 48 solves
The Chef's Knife Rack · Very Easy · 0 solves
The Persistent Drawer · Very Easy · 0 solves
The Three Steps of an Agent Loop · Very Easy · 0 solves
Voltron for LLMs · Very Easy · 0 solves
What the Planner Plans · Very Easy · 0 solves
Agent Memory Types · Easy · 45 solves
Anthropic's Name For A Worker · Easy · 0 solves
Graph With Named Edges · Easy · 0 solves
Human-in-the-Loop · Easy · 37 solves
MCP Protocol [Anthropic] · Easy · 39 solves
The JSON That Asks For A Tool · Easy · 0 solves
The Library That Fakes A Browser · Easy · 0 solves
The Pattern That Reads And Acts · Easy · 0 solves
The ReAct Pattern · Easy · 44 solves
A Clever Name For A Budget · Medium · 0 solves
Loops vs Chains [LangChain] · Medium · 56 solves
RAG vs RAG Agent · Medium · 51 solves
Supervisor Pattern · Medium · 40 solves
The Five Voices In The Room · Medium · 0 solves
The Inner Monologue Library · Medium · 0 solves
The Memory Strata · Medium · 0 solves
The Protocol That Lives In Your IDE · Medium · 0 solves
Three at Once · Medium · 53 solves
Tool Poisoning · Medium · 46 solves
Agent Evaluation · Hard · 40 solves
A Pluriel Of Personas · Hard · 0 solves
A Tiny Boss For The Research Lab · Hard · 0 solves
Context Saturation · Hard · 55 solves
OpenAI's Swarm Acronym · Hard · 0 solves
Planning Agents · Hard · 48 solves
Reflection Pattern · Hard · 31 solves
The Three Colors Of Function-Calling · Hard · 0 solves
Three Shapes Of A Trajectory · Hard · 0 solves
Computer Use [Anthropic] · Expert · 40 solves
Pipes, But With Trees · Expert · 0 solves
The Name Of The Graph's Edge Type · Expert · 0 solves
The OpenAI Agent SDK's Safety Net · Expert · 0 solves
The Open Source Protocol Of 2024 · Expert · 0 solves
The Two Letters Of The JSON-RPC Header · Expert · 0 solves
RAG & Retrieval (37)
Chunking 101 [LangChain] · Very Easy · 45 solves
Not Dense, Not Sparse, But Both · Very Easy · 0 solves
The Algorithm With A Cosine · Very Easy · 0 solves
The Librarian's Fingertip · Very Easy · 0 solves
The Small Piece Of A Big Doc · Very Easy · 0 solves
Three Gears Of Retrieval · Very Easy · 0 solves
Chunk Overlap · Easy · 39 solves
Lost in the Middle · Easy · 44 solves
The Algorithm Robertson Loves · Easy · 0 solves
The Graph That Is Too Approximate To Fail · Easy · 0 solves
The Paris Library Of Vectors · Easy · 0 solves
The Single Word For The Re-Scorer · Easy · 0 solves
The Usual Sandwich Of Pieces · Easy · 0 solves
ColBERT's Extra Letter · Medium · 0 solves
Embedding Collapse · Medium · 41 solves
Hybrid Search · Medium · 49 solves
Lost in Translation · Medium · 39 solves
Parent Document Retrieval · Medium · 41 solves
Re-Ranking · Medium · 48 solves
The Dragon And The Spider · Medium · 0 solves
The Paper With Graph-Based Community Summaries · Medium · 0 solves
The Two Questions And One Answer · Medium · 0 solves
The Verb For Querying The Query · Medium · 0 solves
Context Relevance [RAGAS] · Hard · 42 solves
HyDE · Hard · 36 solves
Six Hundred And Sixty Million Pages · Hard · 0 solves
Squeeze the Noise · Hard · 49 solves
The Ill-Posed Curse Of Dimensionality · Hard · 0 solves
The Metric That Counts Top-K Hits · Hard · 0 solves
The Postgres Extension For Vectors · Hard · 0 solves
Three Hats For A Chunker · Hard · 0 solves
Knowledge Graph RAG · Expert · 43 solves
The Bridge Between Keyword And Vector · Expert · 0 solves
The Model Behind Most Open Rerankers · Expert · 0 solves
The Sparse Dense Of SPLADE · Expert · 0 solves
The Tiny Triple That Beats Fine-Tuning · Expert · 0 solves
The Trick The Voyager Uses · Expert · 0 solves
AI Security (38)
The Escape From The Sandbox · Very Easy · 0 solves
The OWASP List For LLMs · Very Easy · 0 solves
The Process Of Breaking Your Own Model · Very Easy · 0 solves
The Trojan Email · Very Easy · 0 solves
Two Lines Of Defense · Very Easy · 0 solves
Both Directions · Easy · 46 solves
Injection 101 · Easy · 48 solves
Jailbreak Patterns · Easy · 43 solves
Prompt Leaking · Easy · 46 solves
The Anatomy Of A DAN Prompt · Easy · 0 solves
The Clever Hidden Payload · Easy · 0 solves
The Detector That Counts Probabilities · Easy · 0 solves
The Greene-Emoji Bypass · Easy · 0 solves
The Line Between Extraction And Leak · Easy · 0 solves
Guardrails Pipeline · Medium · 64 solves
Indirect Injection · Medium · 43 solves
OWASP LLM #1 · Medium · 42 solves
PII Extraction · Medium · 47 solves
The Dedicated Bouncer · Medium · 59 solves
The Famous Character-String Attack · Medium · 0 solves
The Four Pillars Of OWASP Mitigation · Medium · 0 solves
The Framework That Audits Itself · Medium · 0 solves
The Nightshade Of Training Data · Medium · 0 solves
The Three Letter Defense · Medium · 0 solves
Sandwich Defense · Hard · 42 solves
The Acronym For The 2025 Defensive Paradigm · Hard · 0 solves
The Encoding Trick · Hard · 57 solves
The Fix For The Echoing Agent · Hard · 0 solves
The Four-Word Phrase That Leaks The Weights · Hard · 0 solves
The Library For Checking Hallucinated Citations · Hard · 0 solves
The Membership Check From Shokri · Hard · 0 solves
Think Like the Enemy · Hard · 41 solves
Confused Deputy · Expert · 45 solves
The 2024 Benchmark Of Deception · Expert · 0 solves
The Detector Named After A Compound · Expert · 0 solves
The Hidden Bytes Attack On Claude · Expert · 0 solves
The Three Lines Of The Swiss Cheese · Expert · 0 solves
The Word Anthropic Uses For Dangerous Eval · Expert · 0 solves
Evaluation & Benchmarks (35)
The Coding Benchmark With The Shortest Name · Very Easy · 0 solves
The Giant Of The Hundred Tasks · Very Easy · 0 solves
The Harness With A Sixty-Category Soul · Very Easy · 0 solves
The Pass At One Story · Very Easy · 0 solves
The Triple Of Eval Metrics For Summarization · Very Easy · 0 solves
Vibe Check Problem · Very Easy · 40 solves
BLEU Limitation · Easy · 34 solves
LLM-as-a-Judge Biases · Easy · 53 solves
Stanford's Holistic Eval · Easy · 0 solves
The Arena Of Pairwise Preference · Easy · 0 solves
The Eleuther Harness Name · Easy · 0 solves
The Shrink's Loanword · Easy · 0 solves
Three Pieces Of Any Eval Harness · Easy · 0 solves
Answer Relevance [RAGAS] · Medium · 37 solves
Benchmark Contamination · Medium · 34 solves
Elo Rating [LMSYS] · Medium · 32 solves
Faithfulness Score [RAGAS] · Medium · 39 solves
The Factual Grounding Metric · Medium · 0 solves
The Four Axes Of HELM · Medium · 0 solves
The Metric For The Needle In A Haystack · Medium · 0 solves
The Paper That Trained A Judge · Medium · 0 solves
The Pushy Twin Benchmark · Medium · 0 solves
G-Eval · Hard · 47 solves
Human Preference · Hard · 44 solves
LLM Regression Testing · Hard · 46 solves
The 2024 Agent Benchmark Of Web Tasks · Hard · 0 solves
The Bias Axes Of BBQ · Hard · 0 solves
The Meta-Metric Of Judges · Hard · 0 solves
The Open Reasoning Benchmark With Twenty Thousand Problems · Hard · 0 solves
The Replacement Of MMLU In 2024 · Hard · 0 solves
The Contamination Detector Of Dekoninck · Expert · 0 solves
The Eval Suite By Anthropic For Red Teams · Expert · 0 solves
The Hard Trivia Name · Expert · 0 solves
The Paper That Argued Against Four-Choice Tests · Expert · 0 solves
Three Rings Of METR · Expert · 0 solves
Fine-Tuning & Training (37)
The Contrastive Cousin Of RLHF · Very Easy · 0 solves
The Four-Bit Adapter · Very Easy · 0 solves
The Four Stages Of Instruct Training · Very Easy · 0 solves
The Model Trained From Preferences · Very Easy · 0 solves
The Small Adapters Of Hu · Very Easy · 0 solves
Data Quality · Easy · 46 solves
Instruction Tuning · Easy · 36 solves
LoRA · Easy · 42 solves
Overfitting · Easy · 45 solves
The Algorithm That Clips The Ratio · Easy · 0 solves
The HuggingFace Fine-Tuning Library · Easy · 0 solves
The Parameter That Wants To Be High · Easy · 0 solves
The Teacher-Student Dance · Easy · 0 solves
Three Columns Of A Preference Dataset · Easy · 0 solves
Catastrophic Forgetting · Medium · 31 solves
Mistral's Favorite Fine-Tune Format · Medium · 0 solves
PEFT · Medium · 36 solves
QLoRA · Medium · 35 solves
RLHF Pipeline · Medium · 35 solves
Synthetic Data · Medium · 38 solves
The Binary Loss Of Kahneman · Medium · 0 solves
The DeepSeek-R1 Training Trick · Medium · 0 solves
The Single-Stage ORPO · Medium · 0 solves
Three Blessings Of LoRA · Medium · 0 solves
DPO · Hard · 45 solves
Mereology Of Megatron-LM · Hard · 0 solves
Model Merging · Hard · 38 solves
The Four Horsemen Of Fine-Tuning · Hard · 0 solves
The Meta Repo Whose Name Is A Sword · Hard · 0 solves
The Paper That Re-Finetunes The Reward · Hard · 0 solves
The Quantization That Zero-Crashes · Hard · 0 solves
Training Data Poisoning · Hard · 43 solves
The Catastrophic Name · Expert · 0 solves
The Data Scaling Law Of Chinchilla · Expert · 0 solves
The DeepSeek Paper's Main Algorithm · Expert · 0 solves
The Three Phases Of R1-Zero-To-R1 · Expert · 0 solves
The Trick For Growing A Model's Tongue · Expert · 0 solves
LLM Infrastructure (35)
Real-Time Tokens · Very Easy · 45 solves
The Cache Of Attention · Very Easy · 0 solves
The Engine Of UC Berkeley · Very Easy · 0 solves
The Library With The Jaguar Name · Very Easy · 0 solves
The Three Numbers Of Precision · Very Easy · 0 solves
Token Pricing · Very Easy · 37 solves
Triton's Serving Cousin · Very Easy · 0 solves
Embeddings · Easy · 46 solves
Prefill vs Decode · Easy · 35 solves
Quantization · Easy · 36 solves
Rate Limiting · Easy · 41 solves
The Aspect Of Flash · Easy · 0 solves
The Flag With The GPUs Inside · Easy · 0 solves
The Kwon Paper's Attention Name · Easy · 0 solves
The Term For Serving Two Calls In One Batch · Easy · 0 solves
Two Stages Of A Batch · Easy · 0 solves
A NVIDIA's CPU Offload Method · Medium · 0 solves
KV Cache · Medium · 55 solves
No Padding Waste · Medium · 43 solves
The Engine From LM-Sys · Medium · 0 solves
The Paper's Kernel Library · Medium · 0 solves
The Post-Training Quantization Of Frantar · Medium · 0 solves
Three Stages Of A Speculative Decode · Medium · 0 solves
DeepSeek's MoE Routing Trick · Hard · 0 solves
Distillation · Hard · 42 solves
Draft and Check · Hard · 37 solves
The 2024 Compression By SmoothQuant · Hard · 0 solves
The Llama.cpp Kernel On Apple Silicon · Hard · 0 solves
The Llama File Format Of Gerganov · Hard · 0 solves
Three Features Of Continuous Batching · Hard · 0 solves
The Absorbed Attention Of DeepSeek-V2 · Expert · 0 solves
The Attention Kernel Of Dao · Expert · 0 solves
The NVIDIA's Special Transformer Engine · Expert · 0 solves
The Open Serving API Of 2024 · Expert · 0 solves
Three Regimes Of Attention Cost · Expert · 0 solves
Multimodal & Vision (33)
The Contrastive Image-Text Pair · Very Easy · 0 solves
The Format That Shows Up In Every Chart Request · Very Easy · 0 solves
The OpenAI Image Model Name · Very Easy · 0 solves
The Speech-To-Text Model From OpenAI · Very Easy · 0 solves
Two Halves Of A Vision Transformer · Very Easy · 0 solves
Vision-Language Models · Very Easy · 34 solves
OCR vs VLM · Easy · 29 solves
The Diffusion Schedule Name · Easy · 0 solves
The Latent Space Trick · Easy · 0 solves
The Sigmoid Cousin Of CLIP · Easy · 0 solves
The Weird Name For OCR-With-LLM · Easy · 0 solves
Three Tokens Of LLaVA Input · Easy · 0 solves
CLIP Training · Medium · 32 solves
Document Understanding · Medium · 35 solves
Draw to Guide · Medium · 37 solves
Image Tokenization · Medium · 49 solves
The 2024 Consistency Trick · Medium · 0 solves
The Grounding Pipeline Name · Medium · 0 solves
The Paper With The Blip Name · Medium · 0 solves
The Tiny Tokenizer Of Images · Medium · 0 solves
Three Tasks Of A VLM · Medium · 0 solves
Four Resolutions Of Tile-Based Vision · Hard · 0 solves
Multimodal RAG · Hard · 42 solves
The Apple Vision Language Model Suffix · Hard · 0 solves
The Stability-AI Diffusion That Runs In Your Browser · Hard · 0 solves
The Two-Letter Audio Model · Hard · 0 solves
The Word For A Visual Prompt · Hard · 0 solves
Vision Hallucinations · Hard · 30 solves
The GPT-4o Name For Its Speech Mode · Expert · 0 solves
The Open VLM Named After A Persian Prince · Expert · 0 solves
The Paper With The Meta Movie Model · Expert · 0 solves
The Segmented Everything · Expert · 0 solves
Three Levels Of Grounding · Expert · 0 solves
Know Your Model (35)
Gemini's Quota Name · Very Easy · 0 solves
The Five-Minute Fix · Very Easy · 38 solves
The OpenAI Param That Caps Output · Very Easy · 0 solves
The Parameter Claude Calls 'system' · Very Easy · 0 solves
The Role String That Claude Reads · Very Easy · 0 solves
Three Headers Of The Anthropic API · Very Easy · 0 solves
Anthropic's Token Accounting Field · Easy · 0 solves
Context Windows · Easy · 43 solves
Four Pieces Of An OpenAI Tool Call · Easy · 0 solves
Gemini's Nothing Character · Easy · 0 solves
Model Cards · Easy · 44 solves
The Anthropic Beta Flag For Prompt Caching · Easy · 0 solves
The OpenAI Parameter For Top-P Sampling · Easy · 0 solves
Thinking Tokens · Easy · 36 solves
XML vs Markdown · Easy · 39 solves
Claude's Special System Tag · Medium · 0 solves
Llama Prompt Format · Medium · 42 solves
Mistral's Streaming Delimiter · Medium · 0 solves
OpenAI's Structured Output Mode · Medium · 0 solves
Prefill Technique · Medium · 46 solves
Prompt Caching Providers · Medium · 46 solves
System Prompt Behavior · Medium · 41 solves
The Claude Field For Stopped Reasons · Medium · 0 solves
Three Concurrent Roles Of Anthropic · Medium · 0 solves
Tool Use Formats · Medium · 32 solves
The Anthropic Model Suffix Convention · Hard · 0 solves
The Gemini 2.5 Pro Reasoning Field · Hard · 0 solves
The GPT-4o Marketing Letter · Hard · 0 solves
The OpenAI Context Of GPT-4.1 · Hard · 0 solves
Three Anthropic Model Speeds · Hard · 0 solves
Anthropic's Grounding Footnote Field · Expert · 0 solves
OpenAI's Realtime API's Wire Format · Expert · 0 solves
The Anthropic Beta Header Format · Expert · 0 solves
Three Caches Of Anthropic · Expert · 0 solves
Three Metrics In Anthropic Usage · Expert · 0 solves