qwen/qwen-2.5-72b-instruct

Qwen 2.5 72B

Release Date: Sep 19th, 2024
Parameters: 72B
Context Size: 131.1k

Creative writing: 26.60%
Rule following: 56.88%
Utility: 67.17%
Mathematics: 100.00%
Tooling: 66.54%
Language: 66.91%
Logic: 81.25%

Dialogue tags

Various tasks related to dialogue tags in text.

Scenario                                   Run 1  Run 2  Run 3  Run 4  Run 5  Run 6  Run 7  Run 8  Run 9  Run 10 Total
0-shot (Creative writing, Rule following)  100%   100%   100%   100%   61%    14%    14%    14%    14%    14%    53%
0-shot (Creative writing, Rule following)  50%    50%    45%    43%    22%    0%     0%     0%     0%     0%     21%
0-shot (Creative writing, Rule following)  50%    50%    49%    49%    26%    13%    10%    7%     5%     0%     26%
0-shot (Creative writing, Rule following)  100%   100%   79%    72%    53%    50%    50%    50%    20%    0%     57%
0-shot (Creative writing, Rule following)  47%    2%     1%     0%     0%     0%     0%     0%     0%     0%     5%
0-shot (Creative writing, Rule following)  14%    12%    5%     0%     0%     0%     0%     0%     0%     0%     3%
0-shot (Creative writing, Rule following)  50%    47%    37%    35%    30%    12%    0%     0%     0%     0%     21%

Section score: 26.60%

Language Comprehension

Does the model understand more than just English?

Scenario           Run 1  Run 2  Run 3  Run 4  Run 5  Total
0-shot (Language)  100%   100%   100%   100%   100%   100%
0-shot (Language)  100%   100%   100%   100%   100%   100%
0-shot (Language)  100%   100%   100%   100%   0%     80%
0-shot (Language)  0%     0%     0%     0%     0%     0%

Section score: 70.00%
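
The page does not say how the Total column or the 70.00% figure is derived. The Language Comprehension numbers are consistent with each Total being the plain mean of that scenario's runs, and the section figure being the mean of the Totals. The sketch below reproduces that arithmetic under this assumption; the function names `scenario_total` and `section_score` are illustrative, not taken from the benchmark.

```python
from statistics import mean

# Run scores (percent) copied from the Language Comprehension table.
runs_by_scenario = [
    [100, 100, 100, 100, 100],
    [100, 100, 100, 100, 100],
    [100, 100, 100, 100, 0],
    [0, 0, 0, 0, 0],
]


def scenario_total(runs):
    """Mean of one scenario's runs (assumed aggregation, not confirmed by the page)."""
    return mean(runs)


def section_score(scenarios):
    """Mean of the per-scenario totals (assumed aggregation)."""
    return mean(scenario_total(runs) for runs in scenarios)


totals = [scenario_total(runs) for runs in runs_by_scenario]
score = section_score(runs_by_scenario)

print([f"{t:.0f}%" for t in totals])  # per-scenario Total column
print(f"{score:.2f}%")                # section figure
```

For the other sections the same calculation on the displayed (rounded) values can differ from the published figure by a fraction of a percent, which would be expected if the page averages unrounded scores.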

Language Writing

Can the model generate text in different languages?

Scenario           Run 1  Run 2  Run 3  Run 4  Run 5  Total
0-shot (Language)  100%   94%    92%    89%    87%    92%
0-shot (Language)  100%   95%    67%    0%     0%     52%
0-shot (Language)  100%   79%    0%     0%     0%     36%
0-shot (Language)  100%   100%   92%    88%    41%    84%
0-shot (Language)  100%   48%    47%    46%    46%    57%

Section score: 64.44%

Tool usage within Novelcrafter

Output messages that are related to tool usage within Novelcrafter

Scenario                   Run 1  Run 2  Run 3  Run 4  Run 5  Run 6  Run 7  Run 8  Run 9  Run 10 Total
0-shot (Tooling, Utility)  100%   100%   100%   100%   100%   100%   100%   100%   100%   100%   100%

Section score: 100.00%

N-Length Sentences

Write sentences with exactly N words

Scenario                 Run 1  Run 2  Run 3  Run 4  Run 5  Run 6  Run 7  Run 8  Run 9  Run 10 Total
0-shot (Rule following)  100%   100%   100%   100%   100%   100%   100%   95%    93%    0%     89%
0-shot (Rule following)  100%   96%    94%    92%    91%    88%    85%    83%    65%    37%    83%
0-shot (Rule following)  100%   61%    61%    61%    61%    61%    61%    61%    40%    35%    60%

Section score: 77.33%
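
The page does not describe how an N-Length Sentences response is graded. As a rough illustration of what an "exactly N words" check could look like, the sketch below counts words per sentence and reports the share of sentences that hit the target. The word definition (runs of letters, digits, apostrophes, or hyphens) and the partial-credit scoring are both assumptions, and `count_words` and `exact_length_share` are hypothetical helpers, not part of the benchmark.

```python
import re


def count_words(sentence: str) -> int:
    """Count words, treating runs of letters, digits, apostrophes, or hyphens as one word (an assumption)."""
    return len(re.findall(r"[A-Za-z0-9'-]+", sentence))


def exact_length_share(sentences, n: int) -> float:
    """Fraction of sentences that contain exactly n words (hypothetical scoring rule)."""
    if not sentences:
        return 0.0
    hits = sum(1 for s in sentences if count_words(s) == n)
    return hits / len(sentences)


sample = [
    "The cat sat on the mat.",
    "Rain fell.",
    "She didn't look back at all.",
]
share = exact_length_share(sample, 6)
print(f"{share:.0%} of sentences have exactly 6 words")
```

A real grader would also need to decide how to treat numerals, contractions, and hyphenated compounds, which is one plausible reason scores vary between runs.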

Voice/dialogue sheets

Extract dialogue from given text as voice sheets.

Scenario            Run 1  Run 2  Run 3  Run 4  Run 5  Run 6  Run 7  Run 8  Run 9  Run 10 Total
0-shot (Utility)    0%     0%     0%     0%     0%     0%     0%     0%     0%     0%     0%
1-shot (Utility)    0%     0%     0%     0%     0%     0%     0%     0%     0%     0%     0%
Few-shot (Utility)  0%     0%     0%     0%     0%     0%     0%     0%     0%     0%     0%
0-shot (Utility)    0%     0%     0%     0%     0%     0%     0%     0%     0%     0%     0%
0-shot (Utility)    100%   100%   100%   100%   100%   100%   100%   100%   100%   100%   100%

Section score: 20.00%

Write N of X

Write exactly N words/sentences/paragraphs...

Scenario                 Run 1  Run 2  Run 3  Run 4  Run 5  Run 6  Run 7  Run 8  Run 9  Run 10 Total
0-shot (Rule following)  100%   98%    98%    98%    98%    98%    92%    92%    77%    27%    88%
0-shot (Rule following)  100%   100%   100%   100%   100%   100%   100%   100%   92%    54%    95%
0-shot (Rule following)  100%   98%    77%    77%    54%    27%    2%     0%     0%     0%     44%
0-shot (Rule following)  100%   100%   92%    9%     2%     0%     0%     0%     0%     0%     30%
0-shot (Rule following)  0%     0%     0%     0%     0%     0%     0%     0%     0%     0%     0%
0-shot (Rule following)  100%   100%   100%   100%   100%   100%   100%   100%   100%   100%   100%
0-shot (Rule following)  100%   100%   100%   100%   100%   100%   100%   100%   100%   100%   100%
0-shot (Rule following)  100%   98%    98%    98%    98%    98%    98%    92%    92%    92%    97%
0-shot (Rule following)  98%    98%    54%    54%    27%    27%    9%     0%     0%     0%     37%
0-shot (Rule following)  0%     0%     0%     0%     0%     0%     0%     0%     0%     0%     0%
0-shot (Rule following)  100%   100%   100%   100%   100%   100%   100%   100%   100%   100%   100%
0-shot (Rule following)  100%   100%   100%   100%   100%   100%   100%   100%   100%   100%   100%
0-shot (Rule following)  100%   100%   100%   100%   100%   100%   100%   100%   100%   100%   100%

Section score: 68.46%