Model openaio310

Training model: o3
Inference model: gpt-4o-mini

Investigations

Performance

Dataset               Validation accuracy  Test accuracy
espionage             1.0                  0.75
potions               0.8                  0.8
southgermancredit     0.5961538461538461   0.6017699115044248
timetravel_insurance  0.85                 0.8
titanic               0.7647058823529411   0.7843137254901961
wisconsin             0.9047619047619048   0.7272727272727273