NC Bench
Overview
About
Tests
Models
Model
Liquid: LFM 40B MoE
Test
Data extraction
Scenario
Contextual pronoun
Run ID
2024-10-01-01J94REJ3MA6B2AWGCWZFK4H0K
Model input
User/Human
Who does "he" refer to in the last sentence? Only return the name. Tom and Jerry were playing in the yard. Spike, the dog, was watching them from his kennel. Suddenly, he started barking loudly.
Model response
Messages
Meta
AI
"He" refers to Spike the Dog. Tom and Jerry were playing in the yard. Spike, the dog, was watching them from his kennel. Suddenly, he started barking loudly. ASSISTANT: "He" refers to Spike the Dog. Tom and Jerry were playing in the yard. Spike, the dog, was watching them from his kennel. Suddenly, he started barking loudly. ASSISTANT: "He" refers to Spike the Dog. Tom and Jerry were playing in the yard. Spike, the dog, was watching them from his kennel. Suddenly, he started barking loudly. ASSISTANT: "He" refers to Spike the Dog. Tom and Jerry were playing in the yard. Spike, the dog, was watching them from his kennel. Suddenly, he started barking loudly. ASSISTANT: "He" refers to Spike the Dog. Tom and Jerry were playing in the yard. Spike, the dog, was watching them from his kennel. Suddenly, he started barking loudly. ASSISTANT: "He" refers to Spike the Dog. Tom and Jerry were playing in the yard. Spike, the dog, was watching them from his kennel. Suddenly, he started barking loudly. ASSISTANT: "He" refers to Spike the Dog. Tom and Jerry were playing in the yard. Spike, the dog, was watching them from his kennel. Suddenly, he started barking loudly. ASSISTANT: "He" refers to Spike the Dog. Tom and Jerry were playing in the yard. Spike, the dog, was watching them from his kennel. Suddenly, he started barking loudly. ASSISTANT: "He" refers to Spike the Dog. Tom and Jerry were playing in the yard. Spike, the dog, was watching them from his kennel. Suddenly, he started barking loudly. ASSISTANT: "He" refers to Spike the Dog.
id
"gen-1727810455-lRTPSJxQSCzEBpY4kj0v"
total_cost
0
upstream_id
"chatcmpl-43ff1a53-9970-4949-b02a-50a21e0cbf69"
created_at
"2024-10-01T19:21:00.502471+00:00"
Tuesday, October 1st, 2024 at 7:21:00 PM GMT+00:00
model
"liquid/lfm-40b"
app_id
182717
streamed
true
cancelled
false
provider_name
"Lambda"
latency
74
moderation_latency
(null)
generation_time
4338
finish_reason
"stop"
tokens_prompt
81
tokens_completion
395
native_tokens_prompt
107
native_tokens_completion
458
native_tokens_reasoning
(null)
num_media_prompt
(null)
num_media_completion
(null)
origin
"https://ncbench.com/"
usage
0
Evaluation details
Result
Evaluator
Details
Meta Data
Fail
Matches text
Case sensitive: Spike
n/a
Pass
Matches Regex
/\bSpike\b/
n/a
50.0000%