Creative writing
29.15%Rule following
57.03%Utility
78.17%Mathematics
100.00%Tooling
68.46%Language
95.68%Logic
84.38%Data extraction
Extract key details from a given block of text.
| Scenario | Run 1 | Run 2 | Run 3 | Run 4 | Run 5 | Run 6 | Run 7 | Run 8 | Run 9 | Run 10 | Total |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | |
| 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | |
| 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | |
| 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | |
| 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | |
| 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | |
| 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | |
| 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | |
| 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | |
| 100% | 100% | 100% | 100% | 100% | 50% | 50% | 50% | 50% | 50% | 75% | |
| 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | |
| 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | |
| 89.58% | |||||||||||
Dialogue tags
Various tasks related to dialogue tags in text.
| Scenario | Run 1 | Run 2 | Run 3 | Run 4 | Run 5 | Run 6 | Run 7 | Run 8 | Run 9 | Run 10 | Total |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | |
| 88% | 88% | 65% | 49% | 48% | 48% | 30% | 0% | 0% | 0% | 42% | |
| 34% | 18% | 12% | 10% | 7% | 3% | 2% | 0% | 0% | 0% | 9% | |
| 91% | 76% | 50% | 50% | 50% | 50% | 28% | 18% | 18% | 10% | 44% | |
| 14% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 1% | |
| 46% | 18% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 6% | |
| 14% | 4% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 2% | |
| 29.15% | |||||||||||
Language Comprehension
Does the model understand more than just English?
| Scenario | Run 1 | Run 2 | Run 3 | Run 4 | Run 5 | Total |
|---|---|---|---|---|---|---|
| 100% | 100% | 100% | 100% | 100% | 100% | |
| 100% | 100% | 100% | 100% | 100% | 100% | |
| 100% | 100% | 100% | 100% | 0% | 80% | |
| 100% | 100% | 100% | 100% | 100% | 100% | |
| 95.00% | ||||||
Language Writing
Can the model generate text in different languages?
| Scenario | Run 1 | Run 2 | Run 3 | Run 4 | Run 5 | Total |
|---|---|---|---|---|---|---|
| 100% | 100% | 100% | 100% | 100% | 100% | |
| 100% | 100% | 91% | 91% | 90% | 94% | |
| 94% | 92% | 92% | 90% | 88% | 91% | |
| 100% | 100% | 94% | 92% | 92% | 96% | |
| 100% | 100% | 100% | 100% | 100% | 100% | |
| 96.22% | ||||||
Novel outline
Handle questions about the outline of a novel in various formats
| Scenario | Run 1 | Run 2 | Run 3 | Run 4 | Run 5 | Run 6 | Run 7 | Run 8 | Run 9 | Run 10 | Total |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | |
| 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | |
| 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | |
| 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | |
| 100% | 100% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 20% | |
| 100% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 10% | |
| 100% | 100% | 100% | 100% | 100% | 100% | 100% | 0% | 0% | 0% | 70% | |
| 100% | 100% | 100% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 30% | |
| 100% | 100% | 100% | 100% | 0% | 0% | 0% | 0% | 0% | 0% | 40% | |
| 100% | 100% | 100% | 100% | 0% | 0% | 0% | 0% | 0% | 0% | 40% | |
| 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | |
| 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 0% | 0% | 80% | |
| 65.83% | |||||||||||
Tool usage within Novelcrafter
Output messages that are related to tool usage within Novelcrafter
| Scenario | Run 1 | Run 2 | Run 3 | Run 4 | Run 5 | Run 6 | Run 7 | Run 8 | Run 9 | Run 10 | Total |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | |
| 100.00% | |||||||||||
N-Length Sentences
Write sentences with exactly N words
| Scenario | Run 1 | Run 2 | Run 3 | Run 4 | Run 5 | Run 6 | Run 7 | Run 8 | Run 9 | Run 10 | Total |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 100% | 100% | 100% | 98% | 98% | 98% | 89% | 87% | 82% | 81% | 93% | |
| 56% | 54% | 48% | 39% | 39% | 39% | 25% | 11% | 10% | 5% | 33% | |
| 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | |
| 41.97% | |||||||||||
Voice/dialogue sheets
Extract dialogue from given text as voice sheets.
| Scenario | Run 1 | Run 2 | Run 3 | Run 4 | Run 5 | Run 6 | Run 7 | Run 8 | Run 9 | Run 10 | Total |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | |
| 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | |
| 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | |
| 100% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 10% | |
| 100% | 100% | 100% | 100% | 100% | 100% | 100% | 0% | 0% | 0% | 70% | |
| 76.00% | |||||||||||
Write N of X
Write exactly N words/sentences/paragraphs...
| Scenario | Run 1 | Run 2 | Run 3 | Run 4 | Run 5 | Run 6 | Run 7 | Run 8 | Run 9 | Run 10 | Total |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 100% | 100% | 100% | 100% | 100% | 100% | 100% | 98% | 98% | 98% | 99% | |
| 100% | 100% | 100% | 100% | 100% | 100% | 100% | 98% | 98% | 98% | 99% | |
| 100% | 100% | 98% | 92% | 92% | 27% | 27% | 27% | 9% | 2% | 58% | |
| 100% | 100% | 98% | 77% | 54% | 27% | 27% | 27% | 0% | 0% | 51% | |
| 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | |
| 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | |
| 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | |
| 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | |
| 100% | 100% | 100% | 100% | 98% | 77% | 77% | 9% | 0% | 0% | 66% | |
| 77% | 2% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 8% | |
| 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | |
| 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | |
| 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | |
| 75.52% | |||||||||||