microsoft/phi-3-medium-128k-instruct

Phi-3 Medium 128k

Release Date: Apr 21st, 2024
Parameters: 14B
Context Size: 128k

Creative writing: 17.53%
Rule following: 37.23%
Utility: 46.94%
Mathematics: 50.00%
Tooling: 30.26%
Language: 62.92%
Logic: 75.63%

Dialogue tags

Various tasks related to dialogue tags in text.

Scenario                                   Run 1  Run 2  Run 3  Run 4  Run 5  Run 6  Run 7  Run 8  Run 9  Run 10 Total
0-shot (Creative writing, Rule following)  100%   100%   61%    14%    14%    14%    1%     1%     0%     0%     30%
0-shot (Creative writing, Rule following)  48%    34%    31%    14%    4%     1%     0%     0%     0%     0%     13%
0-shot (Creative writing, Rule following)  54%    50%    49%    49%    36%    24%    18%    8%     0%     0%     29%
0-shot (Creative writing, Rule following)  50%    36%    3%     0%     0%     0%     0%     0%     0%     0%     9%
0-shot (Creative writing, Rule following)  35%    33%    22%    11%    10%    1%     0%     0%     0%     0%     11%
0-shot (Creative writing, Rule following)  57%    49%    41%    20%    11%    2%     1%     0%     0%     0%     18%
0-shot (Creative writing, Rule following)  50%    30%    30%    10%    0%     0%     0%     0%     0%     0%     12%

Section score: 17.53%

Language Comprehension

Does the model understand more than just English?

Scenario           Run 1  Run 2  Run 3  Run 4  Run 5  Total
0-shot (Language)  100%   100%   100%   0%     0%     60%
0-shot (Language)  100%   0%     0%     0%     0%     20%
0-shot (Language)  100%   100%   0%     0%     0%     40%
0-shot (Language)  100%   100%   0%     0%     0%     40%

Section score: 40.00%
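
The page does not say how the Total column or the closing figure of each section is computed. The numbers in this section are consistent with a plain average: each Total matches the mean of that scenario's runs, and the closing 40.00% matches the mean of the four Totals. The sketch below checks that reading against the Language Comprehension rows. The function names are hypothetical, and the averaging rule is an assumption inferred from the figures, not a documented method.

```python
# Per-run scores (percent) copied from the Language Comprehension rows.
language_comprehension_runs = [
    [100, 100, 100, 0, 0],
    [100, 0, 0, 0, 0],
    [100, 100, 0, 0, 0],
    [100, 100, 0, 0, 0],
]


def scenario_total(runs):
    """Mean of one scenario's run scores (assumed aggregation rule)."""
    return sum(runs) / len(runs)


def section_score(scenarios):
    """Mean of the scenario totals (assumed aggregation rule)."""
    totals = [scenario_total(runs) for runs in scenarios]
    return sum(totals) / len(totals)


totals = [scenario_total(runs) for runs in language_comprehension_runs]
print(totals)  # [60.0, 20.0, 40.0, 40.0]
print(f"{section_score(language_comprehension_runs):.2f}%")  # 40.00%
```

In other sections the closing figure differs slightly from the mean of the displayed Totals (for example 81.26% for Language Writing), which would be expected if the displayed percentages are rounded before display.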

Language Writing

Can the model generate text in different languages?

Scenario           Run 1  Run 2  Run 3  Run 4  Run 5  Total
0-shot (Language)  100%   100%   100%   67%    67%    87%
0-shot (Language)  100%   100%   88%    83%    77%    90%
0-shot (Language)  100%   75%    58%    43%    0%     55%
0-shot (Language)  100%   100%   88%    67%    40%    79%
0-shot (Language)  100%   100%   100%   100%   80%    96%

Section score: 81.26%

Novel outline

Handle questions about the outline of a novel in various formats

Scenario                   Run 1  Run 2  Run 3  Run 4  Run 5  Run 6  Run 7  Run 8  Run 9  Run 10 Total
0-shot (Tooling, Utility)  100%   100%   100%   100%   100%   100%   100%   100%   100%   100%   100%
0-shot (Tooling, Utility)  0%     0%     0%     0%     0%     0%     0%     0%     0%     0%     0%
0-shot (Tooling, Utility)  100%   100%   100%   100%   100%   100%   100%   100%   100%   100%   100%
0-shot (Tooling, Utility)  100%   0%     0%     0%     0%     0%     0%     0%     0%     0%     10%
0-shot (Tooling, Utility)  100%   0%     0%     0%     0%     0%     0%     0%     0%     0%     10%
0-shot (Tooling, Utility)  0%     0%     0%     0%     0%     0%     0%     0%     0%     0%     0%
0-shot (Tooling, Utility)  100%   100%   100%   0%     0%     0%     0%     0%     0%     0%     30%
0-shot (Tooling, Utility)  0%     0%     0%     0%     0%     0%     0%     0%     0%     0%     0%
0-shot (Tooling, Utility)  100%   100%   100%   100%   0%     0%     0%     0%     0%     0%     40%
0-shot (Tooling, Utility)  0%     0%     0%     0%     0%     0%     0%     0%     0%     0%     0%
0-shot (Tooling, Utility)  50%    50%    0%     0%     0%     0%     0%     0%     0%     0%     10%
0-shot (Tooling, Utility)  50%    50%    0%     0%     0%     0%     0%     0%     0%     0%     10%

Section score: 25.83%

Tool usage within Novelcrafter

Output messages that are related to tool usage within Novelcrafter

Scenario                   Run 1  Run 2  Run 3  Run 4  Run 5  Run 6  Run 7  Run 8  Run 9  Run 10 Total
0-shot (Tooling, Utility)  100%   100%   100%   100%   100%   67%    67%    67%    67%    67%    83%

Section score: 83.33%

N-Length Sentences

Write sentences with exactly N words

Scenario                 Run 1  Run 2  Run 3  Run 4  Run 5  Run 6  Run 7  Run 8  Run 9  Run 10 Total
0-shot (Rule following)  87%    87%    85%    82%    72%    67%    46%    42%    33%    28%    63%
0-shot (Rule following)  80%    78%    77%    71%    67%    61%    59%    56%    53%    52%    66%
0-shot (Rule following)  32%    15%    15%    10%    7%     5%     3%     0%     0%     0%     9%

Section score: 45.68%
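
The page does not describe how compliance with "exactly N words" is measured. As an illustration only, the sketch below shows one simple way such a check could work: count whitespace-separated tokens in each sentence and report the share that hit the target. Both function names are hypothetical, and the benchmark's real scorer may tokenize or weight results differently.

```python
def has_exactly_n_words(sentence: str, n: int) -> bool:
    """True if the sentence has exactly n whitespace-separated tokens."""
    return len(sentence.split()) == n


def compliance_rate(sentences, n):
    """Percentage of sentences with exactly n words (0.0 for empty input)."""
    if not sentences:
        return 0.0
    hits = sum(has_exactly_n_words(s, n) for s in sentences)
    return 100.0 * hits / len(sentences)


# Made-up sample sentences with 4, 5, and 2 words respectively.
sample = ["The cat sat down.", "Rain fell all night long.", "She smiled."]
print(f"{compliance_rate(sample, 4):.0f}%")  # 33%
```

Splitting on whitespace treats punctuation attached to a word as part of that word, so "down." counts as one word; hyphenated compounds and contractions also count as one.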

Voice/dialogue sheets

Extract dialogue from given text as voice sheets.

Scenario            Run 1  Run 2  Run 3  Run 4  Run 5  Run 6  Run 7  Run 8  Run 9  Run 10 Total
0-shot (Utility)    100%   100%   100%   0%     0%     0%     0%     0%     0%     0%     30%
1-shot (Utility)    100%   100%   0%     0%     0%     0%     0%     0%     0%     0%     20%
Few-shot (Utility)  0%     0%     0%     0%     0%     0%     0%     0%     0%     0%     0%
0-shot (Utility)    0%     0%     0%     0%     0%     0%     0%     0%     0%     0%     0%
0-shot (Utility)    100%   100%   0%     0%     0%     0%     0%     0%     0%     0%     20%

Section score: 14.00%

Write N of X

Write exactly N words/sentences/paragraphs...

Scenario                 Run 1  Run 2  Run 3  Run 4  Run 5  Run 6  Run 7  Run 8  Run 9  Run 10 Total
0-shot (Rule following)  100%   92%    77%    77%    27%    0%     0%     0%     0%     0%     37%
0-shot (Rule following)  100%   0%     0%     0%     0%     0%     0%     0%     0%     0%     10%
0-shot (Rule following)  77%    2%     0%     0%     0%     0%     0%     0%     0%     0%     8%
0-shot (Rule following)  77%    77%    27%    0%     0%     0%     0%     0%     0%     0%     18%
0-shot (Rule following)  0%     0%     0%     0%     0%     0%     0%     0%     0%     0%     0%
0-shot (Rule following)  100%   100%   100%   100%   100%   100%   100%   100%   98%    92%    99%
0-shot (Rule following)  100%   100%   100%   100%   100%   100%   100%   100%   100%   100%   100%
0-shot (Rule following)  100%   100%   100%   98%    98%    98%    77%    77%    77%    77%    90%
0-shot (Rule following)  77%    54%    2%     2%     0%     0%     0%     0%     0%     0%     13%
0-shot (Rule following)  0%     0%     0%     0%     0%     0%     0%     0%     0%     0%     0%
0-shot (Rule following)  100%   100%   100%   100%   100%   100%   100%   100%   100%   100%   100%
0-shot (Rule following)  100%   100%   100%   100%   100%   100%   100%   0%     0%     0%     70%
0-shot (Rule following)  100%   100%   100%   100%   100%   0%     0%     0%     0%     0%     50%

Section score: 45.89%