Model anthropic10

Training model: claude-3-5-haiku-20241022
Inference model: gpt-4o-mini

Investigations

Performance

Dataset               Validation accuracy  Test accuracy
espionage             1.0                  0.85
potions               0.9                  0.6
southgermancredit     0.5769230769230769   0.45132743362831856
timetravel_insurance  0.8                  0.85
titanic               0.7450980392156863   0.6274509803921569
wisconsin             0.4444444444444444   0.4696969696969697