Run: 2025-04-14-01JRTSWAYC2MG2BNQC7RMZTXW8

Run Details

Model:: GPT-4.1
Test:: Data extraction
Scenario:: What's the correct time?
Duration:: 1.1s
Cost:: $0.000142
Provider:: OpenAI

Evaluation summary

Score	Evaluator	Details Link
Fail	Matches Regex	(details)

Model input

User/Human

What is the correct time based on the following text? Only return the time in HH:MM format, nothing else.

The clock shows 3:45. It's 15 minutes slow. The actual time is a quarter past an hour.

Model response

4:00

Evaluation details

Result	Evaluator	Details	Meta Data
Fail	Matches Regex	/\b0?4:15\b/i	n/a
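
The Fail result above can be reproduced with a short sketch. It assumes the "Matches Regex" evaluator searches the response for the pattern anywhere in the string (rather than requiring a full match) and that the trailing `/i` flag means case-insensitive matching. The helper name `matches_regex` is hypothetical and not part of the benchmark.

```python
import re

# Pattern copied from the evaluation details; the /i flag maps to re.IGNORECASE.
EXPECTED = re.compile(r"\b0?4:15\b", re.IGNORECASE)

def matches_regex(response: str) -> bool:
    """Return True if the response contains the expected time (assumed search semantics)."""
    return EXPECTED.search(response) is not None

# The recorded model response does not contain 4:15, so the check fails.
print(matches_regex("4:00"))   # False
# Both the padded and unpadded forms of the expected answer would pass.
print(matches_regex("4:15"))   # True
print(matches_regex("04:15"))  # True
```

The word boundaries keep the pattern from matching inside a longer time such as 14:15, and the optional leading zero accepts either HH:MM or H:MM.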

id	"gen-1744656281-DBi9GQYUNNSLRtqKnxwR"
upstream_id	"chatcmpl-BMJ5NsmzeD6agIYlgvP6dMqR07cnF"
total_cost	0.000142
cache_discount	(null)
provider_name	"OpenAI"
created_at	"2025-04-14T18:44:42.287407+00:00"
model	"openai/gpt-4.1-2025-04-14"
app_id	182717
streamed	true
cancelled	false
latency	504
moderation_latency	116
generation_time	34
tokens_prompt	56
tokens_completion	3
native_tokens_prompt	55
native_tokens_completion	4
native_tokens_reasoning	0
num_media_prompt	(null)
num_media_completion	(null)
num_search_results	(null)
origin	"https://ncbench.com/"
is_byok	false
finish_reason	"stop"
native_finish_reason	"stop"
usage	0.000142
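
As a sanity check on the cost fields, the sketch below recomputes the charge from the native token counts. The per-token prices are an assumption (USD 2.00 per million input tokens and USD 8.00 per million output tokens); they do not appear anywhere in this log, and the function name `estimate_cost` is hypothetical.

```python
# Assumed list prices in USD per million tokens (not stated in this log).
INPUT_PRICE_PER_M = 2.00
OUTPUT_PRICE_PER_M = 8.00

def estimate_cost(prompt_tokens: int, completion_tokens: int) -> float:
    """Estimate the request cost in USD from native token counts."""
    micro_dollars = (prompt_tokens * INPUT_PRICE_PER_M
                     + completion_tokens * OUTPUT_PRICE_PER_M)
    return round(micro_dollars / 1_000_000, 6)

# native_tokens_prompt = 55, native_tokens_completion = 4
estimated = estimate_cost(55, 4)
print(estimated)
```

Under those assumed prices the estimate agrees with the recorded total_cost and usage value of 0.000142, which suggests billing follows the native token counts rather than the normalized tokens_prompt and tokens_completion figures.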