Training model: gpt-4o-2024-11-20

Inference model: gpt-4o-mini

| Dataset | Validation accuracy | Test accuracy |
|---|---|---|
| espionage | 0.9 | 0.9 |
| potions | 0.9 | 0.6 |
| timetravel_insurance | 0.75 | 0.75 |
| titanic | 0.647 | 0.549 |
| wisconsin | 0.841 | 0.712 |
| wisconsin | 0.825 | 0.818 |