Training model: gpt-3.5-turbo-0125

Inference model: gpt-4o-mini

| Dataset | Validation accuracy | Test accuracy |
|---|---|---|
| espionage | 0.5500 | 0.8000 |
| potions | 0.9000 | 0.7500 |
| southgermancredit | 0.6058 | 0.7788 |
| timetravel_insurance | 0.6500 | 0.5500 |
| titanic | 0.5490 | 0.5098 |
| wisconsin | 0.6508 | 0.5455 |