Ensemble

Dataset: potions

Models

Model Narratives

anthropic10

Round ID: 254
Prompt used:
	Classify an entity as Effective if BOTH of these conditions are true:
	1. The FizzIntensity is greater than 40
	2. The ColourShift is greater than 12
	
	If either condition is not met, classify the entity as Ineffective.
	
	Reasoning steps:
	- High FizzIntensity (> 40) suggests strong potential
	- Significant ColourShift (> 12) indicates meaningful change
	- Both conditions must be simultaneously true to be considered Effective
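
The rule in this prompt can be written as a small function. The sketch below is illustrative only: the name `classify_anthropic10` is an assumption and not part of the original round. The sample values are entities listed for this model in this report.

```python
def classify_anthropic10(fizz_intensity: float, colour_shift: float) -> str:
    """Hypothetical helper mirroring the prompt's rule (the name is an assumption)."""
    # Both comparisons are strict: a value exactly at 40 or 12 does not qualify.
    if fizz_intensity > 40 and colour_shift > 12:
        return "Effective"
    return "Ineffective"


# Entities taken from this model's examples in the report.
print(classify_anthropic10(45.78967, 12.371225))   # Effective
print(classify_anthropic10(47.878643, 10.863845))  # Ineffective
```

The second entity fails only the ColourShift condition, which is why the rule labels it Ineffective even though its FizzIntensity is above 40.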

Confusion Matrix:
                    Predicted Effective  Predicted Ineffective
Actual Effective                      4                      5
Actual Ineffective                    3                      8

Accuracy: 0.600
Precision: 0.571
Recall: 0.444
F1 Score: 0.500
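
The four metrics above follow directly from the confusion matrix cells. As a rough check, the sketch below recomputes them. The helper name `metrics_from_confusion` is an assumption, and the code assumes no denominator is zero.

```python
def metrics_from_confusion(tp: int, fn: int, fp: int, tn: int) -> dict:
    """Hypothetical helper: derive summary metrics from the four matrix cells."""
    accuracy = (tp + tn) / (tp + fn + fp + tn)
    precision = tp / (tp + fp)   # share of predicted Effective that is correct
    recall = tp / (tp + fn)      # share of actual Effective that is found
    f1 = 2 * precision * recall / (precision + recall)
    values = {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}
    return {name: round(value, 3) for name, value in values.items()}


# Cells read from the matrix above: TP=4, FN=5, FP=3, TN=8.
print(metrics_from_confusion(tp=4, fn=5, fp=3, tn=8))
# {'accuracy': 0.6, 'precision': 0.571, 'recall': 0.444, 'f1': 0.5}
```

The same function can be applied to the other confusion matrices in this report by reading the cells in the same order.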

Examples for Correctly predicted Effective: (Correct answer: Effective, What the previous set of rules predicted: Effective)
  Entity Data:
	FizzIntensity: 45.78967
	ColourShift: 12.371225


Examples for Falsely predicted Ineffective when it should have been Effective: (Correct answer: Effective, What the previous set of rules predicted: Ineffective)
  Entity Data:
	FizzIntensity: 47.878643
	ColourShift: 10.863845


Examples for Falsely predicted Effective when it should have been Ineffective: (Correct answer: Ineffective, What the previous set of rules predicted: Effective)
  Entity Data:
	FizzIntensity: 40.698505
	ColourShift: 14.4273


Examples for Correctly predicted Ineffective: (Correct answer: Ineffective, What the previous set of rules predicted: Ineffective)
  Entity Data:
	FizzIntensity: 28.113564
	ColourShift: 20.790554


gemini

Round ID: 292
Prompt used:
	If ColourShift is greater than 17, then predict 'Ineffective'. Otherwise, if FizzIntensity is less than 30, then predict 'Ineffective'. Otherwise predict 'Effective'
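
This prompt is a short decision list, which can be sketched as two early returns. The function name `classify_gemini` is an assumption used only for illustration. The sample values are entities listed for this model in this report.

```python
def classify_gemini(fizz_intensity: float, colour_shift: float) -> str:
    """Hypothetical helper mirroring the prompt's rule (the name is an assumption)."""
    # The ColourShift check comes first, so it overrides any FizzIntensity value.
    if colour_shift > 17:
        return "Ineffective"
    if fizz_intensity < 30:
        return "Ineffective"
    return "Effective"


# Entities taken from this model's examples in the report.
print(classify_gemini(36.28945, 11.461653))   # Effective
print(classify_gemini(42.903545, 19.770008))  # Ineffective
```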

Confusion Matrix:
                    Predicted Effective  Predicted Ineffective
Actual Effective                      8                      1
Actual Ineffective                    6                      5

Accuracy: 0.650
Precision: 0.571
Recall: 0.889
F1 Score: 0.696

Examples for Correctly predicted Effective: (Correct answer: Effective, What the previous set of rules predicted: Effective)
  Entity Data:
	FizzIntensity: 36.28945
	ColourShift: 11.461653


Examples for Falsely predicted Ineffective when it should have been Effective: (Correct answer: Effective, What the previous set of rules predicted: Ineffective)
  Entity Data:
	FizzIntensity: 42.903545
	ColourShift: 19.770008


Examples for Falsely predicted Effective when it should have been Ineffective: (Correct answer: Ineffective, What the previous set of rules predicted: Effective)
  Entity Data:
	FizzIntensity: 37.77209
	ColourShift: 14.7023735


Examples for Correctly predicted Ineffective: (Correct answer: Ineffective, What the previous set of rules predicted: Ineffective)
  Entity Data:
	FizzIntensity: 26.284357
	ColourShift: 21.189081


geminipro10

Round ID: 341
Prompt used:
	Predict the outcome based on the following rules:
	
	1. If FizzIntensity is greater than 60, predict Effective.
	2. Else if FizzIntensity is greater than 45 AND ColourShift is greater than 10, predict Effective.
	3. Otherwise, predict Ineffective.
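
The three numbered rules map onto an if/elif/else chain. The sketch below is illustrative. The name `classify_geminipro10` is an assumption, and the sample values are entities listed for this model in this report.

```python
def classify_geminipro10(fizz_intensity: float, colour_shift: float) -> str:
    """Hypothetical helper mirroring the prompt's rules (the name is an assumption)."""
    if fizz_intensity > 60:                           # rule 1
        return "Effective"
    elif fizz_intensity > 45 and colour_shift > 10:   # rule 2
        return "Effective"
    return "Ineffective"                              # rule 3


# Entities taken from this model's examples in the report.
print(classify_geminipro10(47.878643, 10.863845))  # Effective
print(classify_geminipro10(44.953373, 12.993897))  # Ineffective
```

The second entity misses rule 2 only because its FizzIntensity is just under 45.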
	

Confusion Matrix:
                    Predicted Effective  Predicted Ineffective
Actual Effective                      3                      6
Actual Ineffective                    1                     10

Accuracy: 0.650
Precision: 0.750
Recall: 0.333
F1 Score: 0.462

Examples for Correctly predicted Effective: (Correct answer: Effective, What the previous set of rules predicted: Effective)
  Entity Data:
	FizzIntensity: 47.878643
	ColourShift: 10.863845


Examples for Falsely predicted Ineffective when it should have been Effective: (Correct answer: Effective, What the previous set of rules predicted: Ineffective)
  Entity Data:
	FizzIntensity: 44.953373
	ColourShift: 12.993897


Examples for Falsely predicted Effective when it should have been Ineffective: (Correct answer: Ineffective, What the previous set of rules predicted: Effective)
  Entity Data:
	FizzIntensity: 49.864723
	ColourShift: 16.120462


Examples for Correctly predicted Ineffective: (Correct answer: Ineffective, What the previous set of rules predicted: Ineffective)
  Entity Data:
	FizzIntensity: 34.24991
	ColourShift: 10.365348


Ensemble Confusion Matrix

          Predicted +  Predicted -
Actual +            5            4
Actual -            3            8

Accuracy 0.650, Precision 0.625, Recall 0.556, F1 0.588
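
The report does not state how the three models' predictions are combined. One common choice, assumed here purely for illustration, is a simple majority vote. The sketch below applies it to the entity with FizzIntensity 47.878643 and ColourShift 10.863845. The helper name `majority_vote` is hypothetical.

```python
from collections import Counter


def majority_vote(labels) -> str:
    """Return the most common label (assumed combination rule, not confirmed by the report).

    With three voters and two classes a tie cannot occur.
    """
    return Counter(labels).most_common(1)[0][0]


# Per-model labels for the entity (FizzIntensity 47.878643, ColourShift 10.863845).
votes = {
    "anthropic10": "Ineffective",  # listed in this report as a false Ineffective
    "gemini": "Effective",         # derived from its stated rule, not listed in the report
    "geminipro10": "Effective",    # listed in this report as a correct Effective
}
print(majority_vote(votes.values()))  # Effective
```

Under this assumed rule, two of the three models outvote anthropic10 and the entity is labelled Effective, which matches its actual class.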