anthropic/claude-3-haiku:beta

Claude 3 Haiku

Release Date: Mar 13th, 2024
Parameters: not listed
Context Size: 200k

Creative writing: 36.00%
Rule following: 50.47%
Utility: 64.67%
Mathematics: 95.00%
Tooling: 53.46%
Language: 72.18%
Logic: 80.63%

Dialogue tags

Various tasks related to dialogue tags in text.

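Scenario | Run 1 | Run 2 | Run 3 | Run 4 | Run 5 | Run 6 | Run 7 | Run 8 | Run 9 | Run 10 | Total
0-shot (Creative writing, Rule following) | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100%
0-shot (Creative writing, Rule following) | 49% | 49% | 49% | 49% | 49% | 47% | 45% | 26% | 7% | 0% | 37%
0-shot (Creative writing, Rule following) | 74% | 71% | 61% | 51% | 49% | 41% | 41% | 38% | 22% | 14% | 46%
0-shot (Creative writing, Rule following) | 50% | 50% | 50% | 47% | 38% | 22% | 18% | 7% | 1% | 0% | 28%
0-shot (Creative writing, Rule following) | 79% | 71% | 15% | 3% | 1% | 0% | 0% | 0% | 0% | 0% | 17%
0-shot (Creative writing, Rule following) | 38% | 1% | 1% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 4%
0-shot (Creative writing, Rule following) | 49% | 44% | 30% | 25% | 22% | 14% | 13% | 1% | 0% | 0% | 20%

Section score: 36.00%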
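
The Dialogue tags figures are consistent with a simple averaging scheme: each Total looks like the mean of that row's runs, and the 36.00% beneath the table looks like the mean of the Totals. The page does not state how these values are computed, so treat this as an assumption. The function names below are illustrative only. A minimal Python check, using the second scenario row and the Total column:

```python
# Run scores (percent) copied from the second Dialogue tags scenario row.
runs = [49, 49, 49, 49, 49, 47, 45, 26, 7, 0]

# Scenario totals (percent) copied from the Total column of the Dialogue tags table.
totals = [100, 37, 46, 28, 17, 4, 20]


def scenario_total(run_scores):
    """Assumed rule: Total is the mean of the runs, rounded to a whole percent."""
    return round(sum(run_scores) / len(run_scores))


def section_score(scenario_totals):
    """Assumed rule: the section score is the mean of the scenario totals."""
    return sum(scenario_totals) / len(scenario_totals)


print(scenario_total(runs))             # 37
print(f"{section_score(totals):.2f}%")  # 36.00%
```

Other sections match only approximately when recomputed this way (for example, Language Writing gives 77.80% from the displayed totals against the listed 77.93%), which suggests the page averages unrounded values.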

Language Comprehension

Does the model understand more than just English?

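Scenario | Run 1 | Run 2 | Run 3 | Run 4 | Run 5 | Total
0-shot (Language) | 100% | 100% | 100% | 100% | 100% | 100%
0-shot (Language) | 100% | 0% | 0% | 0% | 0% | 20%
0-shot (Language) | 100% | 100% | 100% | 100% | 100% | 100%
0-shot (Language) | 100% | 100% | 0% | 0% | 0% | 40%

Section score: 65.00%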

Language Writing

Can the model generate text in different languages?

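Scenario | Run 1 | Run 2 | Run 3 | Run 4 | Run 5 | Total
0-shot (Language) | 100% | 100% | 90% | 89% | 88% | 93%
0-shot (Language) | 83% | 50% | 50% | 50% | 43% | 55%
0-shot (Language) | 100% | 73% | 50% | 50% | 43% | 63%
0-shot (Language) | 100% | 100% | 100% | 50% | 40% | 78%
0-shot (Language) | 100% | 100% | 100% | 100% | 100% | 100%

Section score: 77.93%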
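
The overall Language figure of 72.18% listed at the top can be reproduced from the two Language sections if each section is weighted by its number of scenario rows (4 for Language Comprehension, 5 for Language Writing). This weighting is an inference from the numbers, not something the page documents, and `category_score` is a hypothetical helper name. A short sketch:

```python
# (section score in percent, number of scenario rows) for the two Language sections.
sections = {
    "Language Comprehension": (65.00, 4),
    "Language Writing": (77.93, 5),
}


def category_score(section_data):
    """Assumed rule: average the section scores, weighted by scenario count."""
    weighted_sum = sum(score * count for score, count in section_data.values())
    scenario_count = sum(count for _, count in section_data.values())
    return weighted_sum / scenario_count


print(f"{category_score(sections):.2f}%")  # 72.18%
```

The same check does not work for categories such as Utility or Tooling using only the sections shown here, so those category figures presumably include tests that are not listed in this excerpt.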

Tool usage within Novelcrafter

Output messages that are related to tool usage within Novelcrafter

Scenario | Run 1 | Run 2 | Run 3 | Run 4 | Run 5 | Run 6 | Run 7 | Run 8 | Run 9 | Run 10 | Total
0-shot (Tooling, Utility) | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100%

Section score: 100.00%

N-Length Sentences

Write sentences with exactly N words

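Scenario | Run 1 | Run 2 | Run 3 | Run 4 | Run 5 | Run 6 | Run 7 | Run 8 | Run 9 | Run 10 | Total
0-shot (Rule following) | 89% | 81% | 78% | 77% | 72% | 72% | 68% | 65% | 47% | 1% | 65%
0-shot (Rule following) | 97% | 84% | 84% | 83% | 83% | 75% | 74% | 63% | 61% | 28% | 73%
0-shot (Rule following) | 27% | 15% | 13% | 10% | 4% | 0% | 0% | 0% | 0% | 0% | 7%

Section score: 48.40%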

Voice/dialogue sheets

Extract dialogue from given text as voice sheets.

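Scenario | Run 1 | Run 2 | Run 3 | Run 4 | Run 5 | Run 6 | Run 7 | Run 8 | Run 9 | Run 10 | Total
0-shot (Utility) | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0%
1-shot (Utility) | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100%
Few-shot (Utility) | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100%
0-shot (Utility) | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0%
0-shot (Utility) | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0%

Section score: 40.00%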

Write N of X

Write exactly N words/sentences/paragraphs...

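Scenario | Run 1 | Run 2 | Run 3 | Run 4 | Run 5 | Run 6 | Run 7 | Run 8 | Run 9 | Run 10 | Total
0-shot (Rule following) | 100% | 100% | 100% | 98% | 98% | 98% | 98% | 92% | 92% | 77% | 96%
0-shot (Rule following) | 100% | 100% | 100% | 100% | 98% | 54% | 54% | 54% | 27% | 2% | 69%
0-shot (Rule following) | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0%
0-shot (Rule following) | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0%
0-shot (Rule following) | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0%
0-shot (Rule following) | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100%
0-shot (Rule following) | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100%
0-shot (Rule following) | 100% | 100% | 100% | 100% | 77% | 77% | 77% | 54% | 54% | 27% | 77%
0-shot (Rule following) | 100% | 100% | 100% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 30%
0-shot (Rule following) | 27% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 3%
0-shot (Rule following) | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100%
0-shot (Rule following) | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 0% | 90%
0-shot (Rule following) | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100%

Section score: 58.75%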