Ensemble

Dataset: potions

Models

Model Narratives

geminipro10

Round ID: 341
Prompt used:
	Predict the outcome based on the following rules:
	
	1. If FizzIntensity is greater than 60, predict Effective.
	2. Else if FizzIntensity is greater than 45 AND ColourShift is greater than 10, predict Effective.
	3. Otherwise, predict Ineffective.
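
The three rules above can be read as an ordered if/else chain. The following is a minimal sketch of that reading, not the model's actual implementation; the function name predict_round_341 and its argument names are hypothetical. It is applied here to the four example entities listed for this round.

```python
def predict_round_341(fizz_intensity: float, colour_shift: float) -> str:
    """Apply the three prompt rules in order (hypothetical helper)."""
    # Rule 1: a high FizzIntensity alone is enough.
    if fizz_intensity > 60:
        return "Effective"
    # Rule 2: a moderate FizzIntensity needs ColourShift above 10 as well.
    if fizz_intensity > 45 and colour_shift > 10:
        return "Effective"
    # Rule 3: everything else.
    return "Ineffective"


# The four example entities reported for this round.
print(predict_round_341(45.78967, 12.371225))   # Effective
print(predict_round_341(36.28945, 11.461653))   # Ineffective
print(predict_round_341(49.864723, 16.120462))  # Effective
print(predict_round_341(23.177788, 25.461937))  # Ineffective
```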
	

Confusion Matrix:
                    Predicted Effective  Predicted Ineffective
Actual Effective                      3                      6
Actual Ineffective                    1                     10

Accuracy: 0.650
Precision: 0.750
Recall: 0.333
F1 Score: 0.462
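
The four scores follow directly from the confusion matrix counts, treating Effective as the positive class. The sketch below shows the standard formulas; the helper name confusion_metrics is hypothetical, and the counts are those of the matrix above.

```python
def confusion_metrics(tp: int, fn: int, fp: int, tn: int) -> dict:
    """Compute accuracy, precision, recall and F1 from raw counts."""
    total = tp + fn + fp + tn
    accuracy = (tp + tn) / total
    # Guard against empty denominators (no predicted or no actual positives).
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {
        "accuracy": round(accuracy, 3),
        "precision": round(precision, 3),
        "recall": round(recall, 3),
        "f1": round(f1, 3),
    }


# Counts from the matrix above: TP=3, FN=6, FP=1, TN=10.
print(confusion_metrics(tp=3, fn=6, fp=1, tn=10))
# {'accuracy': 0.65, 'precision': 0.75, 'recall': 0.333, 'f1': 0.462}
```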

Examples for Correctly predicted Effective: (Correct answer: Effective, What the previous set of rules predicted: Effective)
  Entity Data:
	FizzIntensity: 45.78967
	ColourShift: 12.371225


Examples for Falsely predicted Ineffective when it should have been Effective: (Correct answer: Effective, What the previous set of rules predicted: Ineffective)
  Entity Data:
	FizzIntensity: 36.28945
	ColourShift: 11.461653


Examples for Falsely predicted Effective when it should have been Ineffective: (Correct answer: Ineffective, What the previous set of rules predicted: Effective)
  Entity Data:
	FizzIntensity: 49.864723
	ColourShift: 16.120462


Examples for Correctly predicted Ineffective: (Correct answer: Ineffective, What the previous set of rules predicted: Ineffective)
  Entity Data:
	FizzIntensity: 23.177788
	ColourShift: 25.461937

openai10o1

Round ID: 474
Prompt used:
	Choose randomly

Confusion Matrix:
                    Predicted Effective  Predicted Ineffective
Actual Effective                      5                      4
Actual Ineffective                    4                      7

Accuracy: 0.600
Precision: 0.556
Recall: 0.556
F1 Score: 0.556

Examples for Correctly predicted Effective: (Correct answer: Effective, What the previous set of rules predicted: Effective)
  Entity Data:
	FizzIntensity: 47.042286
	ColourShift: 9.9699135


Examples for Falsely predicted Ineffective when it should have been Effective: (Correct answer: Effective, What the previous set of rules predicted: Ineffective)
  Entity Data:
	FizzIntensity: 66.28547
	ColourShift: 8.929057


Examples for Falsely predicted Effective when it should have been Ineffective: (Correct answer: Ineffective, What the previous set of rules predicted: Effective)
  Entity Data:
	FizzIntensity: 23.177788
	ColourShift: 25.461937


Examples for Correctly predicted Ineffective: (Correct answer: Ineffective, What the previous set of rules predicted: Ineffective)
  Entity Data:
	FizzIntensity: 34.24991
	ColourShift: 10.365348

openai35

Round ID: 486
Prompt used:
	Prompt: If FizzIntensity is greater than 40, predict as Effective. If FizzIntensity is 40 or lower, predict as Ineffective.
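
This prompt reduces to a single threshold on FizzIntensity, with ColourShift ignored. A minimal sketch, assuming the model applies the rule literally; the function name predict_round_486 is hypothetical.

```python
def predict_round_486(fizz_intensity: float) -> str:
    """Single-threshold rule: strictly above 40 is Effective."""
    return "Effective" if fizz_intensity > 40 else "Ineffective"


# A value of exactly 40 falls on the Ineffective side of the rule.
print(predict_round_486(44.953373))  # Effective
print(predict_round_486(36.28945))   # Ineffective
print(predict_round_486(40.0))       # Ineffective
```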

Confusion Matrix:
                    Predicted Effective  Predicted Ineffective
Actual Effective                      7                      2
Actual Ineffective                    3                      8

Accuracy: 0.750
Precision: 0.700
Recall: 0.778
F1 Score: 0.737

Examples for Correctly predicted Effective: (Correct answer: Effective, What the previous set of rules predicted: Effective)
  Entity Data:
	FizzIntensity: 44.953373
	ColourShift: 12.993897


Examples for Falsely predicted Ineffective when it should have been Effective: (Correct answer: Effective, What the previous set of rules predicted: Ineffective)
  Entity Data:
	FizzIntensity: 36.28945
	ColourShift: 11.461653


Examples for Falsely predicted Effective when it should have been Ineffective: (Correct answer: Ineffective, What the previous set of rules predicted: Effective)
  Entity Data:
	FizzIntensity: 41.101128
	ColourShift: 15.34901


Examples for Correctly predicted Ineffective: (Correct answer: Ineffective, What the previous set of rules predicted: Ineffective)
  Entity Data:
	FizzIntensity: 28.113564
	ColourShift: 20.790554

Ensemble Confusion Matrix

	Predicted +	Predicted -
Actual +	5	4
Actual -	1	10

Accuracy 0.750, Precision 0.833, Recall 0.556, F1 0.667
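
The report does not state how the three models' predictions are combined into the ensemble prediction. One common choice is a simple majority vote, sketched below purely as an assumption; the function name majority_vote is hypothetical, and this may not be the method that produced the matrix above.

```python
from collections import Counter


def majority_vote(votes: list[str]) -> str:
    """Return the most frequent label among the model votes.

    With three voters and two labels a tie cannot occur; for other
    sizes, Counter.most_common keeps the first-seen label on a tie.
    """
    return Counter(votes).most_common(1)[0][0]


# Illustrative votes for the entity with FizzIntensity 23.177788:
# geminipro10 and openai10o1 are taken from the listed examples; the
# openai35 vote is what its threshold rule would give if followed.
votes = ["Ineffective", "Effective", "Ineffective"]
print(majority_vote(votes))  # Ineffective
```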