Training model: claude-3-5-haiku-20241022

Inference model: gpt-4o-mini

| Dataset | Validation accuracy | Test accuracy |
|---|---:|---:|
| espionage | 0.8500 | 0.8000 |
| potions | 0.8500 | 0.7000 |
| southgermancredit | 0.5192 | 0.5221 |
| timetravel_insurance | 0.9000 | 0.7500 |
| titanic | 0.6078 | 0.6275 |
| wisconsin | 0.7937 | 0.8333 |