Training model: gpt-4o-2024-11-20

Inference model: gpt-4o-mini

| Dataset | Validation accuracy | Test accuracy |
|---|---|---|
| espionage | 0.700 | 0.800 |
| potions | 0.800 | 0.600 |
| timetravel_insurance | 0.550 | 0.550 |
| titanic | 0.686 | 0.608 |
| wisconsin | 0.825 | 0.758 |
| wisconsin | 0.873 | 0.864 |
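
For readers unfamiliar with the metric, the accuracies above are presumably the fraction of examples in each split that were classified correctly. The sketch below shows that calculation under that assumption. The function name `accuracy` and the sample labels are hypothetical and are not taken from the evaluation code or data behind this table.

```python
from typing import Hashable, Sequence


def accuracy(predictions: Sequence[Hashable], labels: Sequence[Hashable]) -> float:
    """Return the fraction of predictions that exactly match their labels."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    if len(labels) == 0:
        raise ValueError("cannot compute accuracy on an empty split")
    # Count positions where the predicted class equals the true class.
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)


# Hypothetical example: 3 of 4 predictions match their labels.
example_predictions = ["yes", "no", "yes", "yes"]
example_labels = ["yes", "no", "no", "yes"]
example_accuracy = accuracy(example_predictions, example_labels)
```

Because the result is a ratio over the split size, small splits can only produce coarse values, which is worth keeping in mind when comparing validation and test numbers across datasets.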