Data extraction

Extract key details from a given block of text.

Price-Performance Score Distribution (Top 20)

Model                              Score  Cost     Time
Gemma 3 4B                         92%    $0.0000  303ms
Mistral Small Creative             93%    $0.0000  362ms
Ministral 3B                       73%    $0.0000  308ms
Ministral 8B                       75%    $0.0000  331ms
Gemini 2.5 Flash Lite              92%    $0.0000  357ms
Ministral 3 3B                     78%    $0.0000  416ms
Llama 3.1 8B                       85%    $0.0000  441ms
Inception Mercury                  91%    $0.0000  528ms
Ministral 3 14B                    88%    $0.0000  448ms
Gemma 3 12B                        92%    $0.0000  542ms
Ministral 3 8B                     71%    $0.0000  382ms
Mistral Small 3.2 24B              83%    $0.0000  691ms
Mistral Small 4                    88%    $0.0000  539ms
Gemini 2.5 Flash                   83%    $0.0000  473ms
Gemma 3 27B                        92%    $0.0000  780ms
LFM2 24B                           79%    $0.0000  1.4s
Stealth: Aurora Alpha              92%    n/a      1.6s
GPT-5.4 Nano                       93%    $0.0000  768ms
Mistral Medium 3.1                 88%    $0.0000  655ms
Arcee AI: Trinity Large (Preview)  81%    $0.0000  1.1s

Cost vs Performance

Compares total cost for this test against the test score. Quadrant lines are drawn at the median values. Only models with available cost data are shown.

11 low-scoring outliers hidden: Arcee AI: Trinity Large (Preview) (80.8%), LFM2 24B (79.2%), Ministral 3 3B (78.3%), Rocinante 12B (77.9%), Ministral 8B (75.0%), WizardLM 2 8x22b (73.3%), Ministral 3B (72.9%), Grok 4.20 (Beta, Reasoning) (71.7%), Ministral 3 8B (70.8%), Mistral Large (70.4%), Cohere Command R+ (Aug. 2024) (63.3%).
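
The quadrant split at the median values can be sketched as follows. This is a minimal illustration, assuming a handful of (score, cost) pairs copied from the Top Overall Models table on this page. The function name `quadrant` and the quadrant labels are hypothetical, and the medians here are computed over this small sample rather than over every model on the chart.

```python
from statistics import median

# (score in %, cost in USD) for a few models, as listed in the Top Overall table.
models = {
    "Claude Sonnet 4": (96, 0.0004),
    "Gemini 3 Flash (Preview, Reasoning)": (99, 0.0026),
    "GPT-5.4 Mini": (93, 0.0001),
    "GPT-5.4": (92, 0.0003),
}

# Quadrant lines: one at the median score, one at the median cost.
score_median = median(score for score, _ in models.values())
cost_median = median(cost for _, cost in models.values())


def quadrant(score, cost):
    """Label a point relative to the two median lines (labels are illustrative)."""
    perf = "high score" if score >= score_median else "low score"
    price = "low cost" if cost <= cost_median else "high cost"
    return f"{perf}, {price}"


quadrants = {name: quadrant(score, cost) for name, (score, cost) in models.items()}
for name, label in quadrants.items():
    print(f"{name}: {label}")
```

Because the published costs are rounded to four decimal places, many of the cheapest models share a displayed cost of $0.0000, so a split like this is only as fine-grained as the underlying cost data.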

Most Stable Models (Top 20)

Ranked by stability (median × consistency).

Model                                Score  Consistency  Stability
Gemini 3 Flash (Preview, Reasoning)  99%    82%          82%
Claude Sonnet 4                      96%    72%          72%
GPT-4o Mini (temp=0)                 96%    72%          72%
GPT-4o Mini (temp=1)                 94%    63%          63%
Z.AI GLM 4.6                         96%    60%          60%
Mistral Small Creative               93%    59%          59%
DeepSeek V3 (2025-03-24)             91%    55%          55%
GPT-5.4 Nano                         93%    55%          55%
Gemini 2.5 Pro                       94%    53%          53%
Gemini 2.5 Flash Lite (Reasoning)    94%    53%          53%
ByteDance Seed 2.0 Lite              94%    53%          53%
Claude Opus 4                        92%    53%          53%
Gemini 3.1 Pro (Preview)             93%    50%          50%
Z.AI GLM 5                           93%    50%          50%
MoonshotAI: Kimi K2.5                93%    50%          50%
Gemini 2.5 Flash (Reasoning)         93%    50%          50%
GPT-5.4 Mini                         93%    50%          50%
GPT-5 Mini                           93%    47%          47%
GPT-5.2                              93%    47%          47%
MiniMax M2.5                         93%    47%          47%
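
As a rough illustration of the ranking formula (median × consistency), the sketch below assumes that "median" means the median of a model's per-question scores. That reading would explain why Stability equals Consistency in every row listed here: most questions score 100%, so the median is 1.0. This is an inference from the numbers, not something the page states, and the `stability` helper is hypothetical. Consistency is treated as a given input because its definition is not provided.

```python
from statistics import median


def stability(per_question_scores, consistency):
    """Stability as median x consistency (an assumed reading of the formula)."""
    return median(per_question_scores) * consistency


# Gemini 3 Flash (Preview, Reasoning): per-question scores as fractions,
# together with its listed consistency of 82%.
gemini_scores = [1.0, 1.0, 1.0, 1.0, 0.9, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0]
gemini_stability = stability(gemini_scores, 0.82)
print(f"{gemini_stability:.0%}")
```

Under this assumption a single weak question does not lower the median, which is why the 90% result on one question leaves the stability at the consistency value.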

Top Overall Models (Top 20)

Ranked by composite score (performance, cost, speed & stability).

Model                                Score  Cost     Speed  Stability
Claude Sonnet 4                      96%    $0.0004  1.6s   72%
Gemini 3 Flash (Preview, Reasoning)  99%    $0.0026  7.0s   82%
GPT-4o Mini (temp=0)                 96%    $0.0000  8.1s   72%
Mistral Small Creative               93%    $0.0000  362ms  59%
GPT-5.4 Nano                         93%    $0.0000  768ms  55%
GPT-5.4 Mini                         93%    $0.0001  658ms  50%
Gemini 2.5 Flash Lite (Reasoning)    94%    $0.0004  3.8s   53%
DeepSeek V3 (2025-03-24)             91%    $0.0000  2.4s   55%
Gemma 3 4B                           92%    $0.0000  303ms  45%
Gemini 2.5 Flash Lite                92%    $0.0000  357ms  45%
Gemma 3 12B                          92%    $0.0000  542ms  45%
Inception Mercury 2                  92%    $0.0002  471ms  45%
Gemma 3 27B                          92%    $0.0000  780ms  45%
Gemini 3 Flash (Preview)             92%    $0.0001  835ms  45%
GPT-5.4                              92%    $0.0003  696ms  45%
Gemini 3.1 Flash Lite (Preview)      91%    $0.0000  708ms  44%
GPT-5.4 Nano (Reasoning, Low)        92%    $0.0001  1.9s   45%
DeepSeek V3 (2024-12-26)             90%    $0.0000  1.3s   47%
GPT-5.4 Mini (Reasoning, Low)        92%    $0.0003  2.2s   45%
GPT-4o Mini (temp=1)                 94%    $0.0000  16.4s  63%

Per-question scores, sorted by Total (descending). Question columns:

Q1   Who's the tallest?
Q2   What's the color of the car?
Q3   What instrument does Lucy play?
Q4   Guess the pet
Q5   What's the correct time?
Q6   Who's the sister?
Q7   Contextual pronoun
Q8   Indirect birth year
Q9   Fruits excluding citrus
Q10  Future event time
Q11  Highest-rated movie
Q12  All valid emails

Model                                Total  Q1    Q2    Q3    Q4    Q5    Q6    Q7    Q8    Q9    Q10   Q11   Q12
Gemini 3 Flash (Preview, Reasoning)  99%    100%  100%  100%  100%  90%   100%  100%  100%  100%  100%  100%  100%
Z.AI GLM 4.6                         96%    100%  100%  100%  100%  50%   100%  100%  100%  100%  100%  100%  100%
Claude Sonnet 4                      96%    100%  100%  100%  100%  100%  100%  100%  100%  100%  50%   100%  100%
GPT-4o Mini (temp=0)                 96%    100%  100%  100%  100%  100%  100%  100%  100%  100%  50%   100%  100%
Gemini 2.5 Pro                       94%    100%  100%  100%  100%  40%   100%  100%  100%  90%   100%  100%  100%
Gemini 2.5 Flash Lite (Reasoning)    94%    100%  100%  100%  100%  30%   100%  100%  100%  100%  100%  100%  100%
ByteDance Seed 2.0 Lite              94%    100%  100%  100%  100%  30%   100%  100%  100%  100%  100%  100%  100%
GPT-4o Mini (temp=1)                 94%    100%  100%  100%  100%  80%   100%  100%  100%  100%  50%   100%  100%
Gemini 3.1 Pro (Preview)             93%    100%  100%  100%  100%  20%   100%  100%  100%  100%  100%  100%  100%
Z.AI GLM 5                           93%    100%  100%  100%  100%  20%   100%  100%  100%  100%  100%  100%  100%
MoonshotAI: Kimi K2.5                93%    100%  100%  100%  100%  20%   100%  100%  100%  100%  100%  100%  100%
Gemini 2.5 Flash (Reasoning)         93%    100%  80%   100%  100%  60%   100%  100%  100%  80%   100%  100%  100%
GPT-5.4 Mini                         93%    100%  100%  100%  100%  20%   100%  100%  100%  100%  100%  100%  100%
Mistral Small Creative               93%    100%  100%  100%  100%  70%   100%  100%  100%  100%  50%   100%  100%
GPT-5.4 Nano                         93%    100%  100%  90%   100%  60%   100%  100%  100%  100%  65%   100%  100%
Showing models 1–15 of 118 (page 1 of 8).
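
The Total column in the rows above is consistent with an unweighted mean of the twelve per-question scores, rounded to the nearest percent. This is an inference from the listed values rather than a documented rule, and the helper name `total_score` is hypothetical. A minimal check against two of the rows:

```python
def total_score(per_question_percent):
    """Unweighted mean of per-question scores, rounded to a whole percent."""
    return round(sum(per_question_percent) / len(per_question_percent))


# Per-question scores (in %) copied from two rows of the table.
gemini_3_flash_reasoning = [100, 100, 100, 100, 90, 100, 100, 100, 100, 100, 100, 100]
gpt_5_4_nano = [100, 100, 90, 100, 60, 100, 100, 100, 100, 65, 100, 100]

totals = (total_score(gemini_3_flash_reasoning), total_score(gpt_5_4_nano))
print(totals)
```

The same calculation reproduces the listed Total for the other rows shown, for example 1150 / 12 ≈ 95.8, which rounds to the 96% reported for Claude Sonnet 4.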

Who's the tallest?

What's the color of the car?

What instrument does Lucy play?

Guess the pet

What's the correct time?

Who's the sister?

Contextual pronoun

Indirect birth year

Fruits excluding citrus

Future event time

Highest-rated movie

All valid emails