Ensemble

Dataset: potions

Models

Model Narratives

openailong

Round ID: 367
Prompt used:
	Evaluate the entity outcomes based on the following clear and precise rules:
	
	1. If FizzIntensity > 50.0, classify as Effective regardless of ColourShift.
	2. If FizzIntensity is between 45.0 and 50.0 (exclusive) and ColourShift <= 9.0, classify as Effective to prevent false positives.
	3. If FizzIntensity is between 42.0 and 45.0 (exclusive) and ColourShift <= 10.0, classify as Effective, addressing previously misclassified instances.
	4. If FizzIntensity is between 40.0 and 42.0 (exclusive) and ColourShift <= 11.0, classify as Effective to ensure no false negatives.
	5. If FizzIntensity is between 38.0 and 40.0 (exclusive) and ColourShift <= 9.0, classify as Effective, tightening conditions to prevent misclassifications.
	6. If FizzIntensity is between 35.0 and 38.0 (inclusive) and ColourShift <= 8.0, classify as Effective, reinforcing stricter conditions for this range.
	7. If FizzIntensity is between 30.0 and 34.9 (inclusive) and ColourShift <= 4.0, classify as Effective, ensuring accurate outputs in this range.
	8. If FizzIntensity is between 30.0 and 34.9 (inclusive) and ColourShift > 10.0, classify as Ineffective, maintaining a stricter threshold.
	9. If FizzIntensity < 30.0 or ColourShift > 18.0, classify as Ineffective.
	10. If FizzIntensity is between 25.0 and 30.0 (inclusive) and ColourShift is between 10.0 and 18.0, classify as Ineffective to consistently reinforce definitions.
	11. If FizzIntensity is between 20.0 and 24.9 (inclusive), explicitly classify as Ineffective regardless of ColourShift.
	12. If FizzIntensity is below 20.0, classify as Ineffective regardless of ColourShift.
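
The twelve rules above can be read as a first-match cascade. The sketch below is one possible reading, not the evaluation code used for this round. The function name `classify_potion`, the first-match ordering, and the `None` result for inputs that no rule covers are all assumptions introduced for illustration.

```python
from typing import Optional


def classify_potion(fizz: float, colour: float) -> Optional[str]:
    """Hypothetical first-match reading of rules 1-12; None means no rule applies."""
    # Rule 1: high FizzIntensity wins regardless of ColourShift.
    if fizz > 50.0:
        return "Effective"

    # Rules 2-7 as (low, high, bounds inclusive?, maximum ColourShift).
    bands = [
        (45.0, 50.0, False, 9.0),   # rule 2
        (42.0, 45.0, False, 10.0),  # rule 3
        (40.0, 42.0, False, 11.0),  # rule 4
        (38.0, 40.0, False, 9.0),   # rule 5
        (35.0, 38.0, True, 8.0),    # rule 6
        (30.0, 34.9, True, 4.0),    # rule 7
    ]
    for low, high, inclusive, max_colour in bands:
        in_band = (low <= fizz <= high) if inclusive else (low < fizz < high)
        if in_band and colour <= max_colour:
            return "Effective"

    # Rule 8: same band as rule 7, but with a large ColourShift.
    if 30.0 <= fizz <= 34.9 and colour > 10.0:
        return "Ineffective"

    # Rule 9: also covers rules 11 and 12, since both require FizzIntensity < 30.
    if fizz < 30.0 or colour > 18.0:
        return "Ineffective"

    # Rule 10: after rule 9, only FizzIntensity == 30.0 can still reach this check.
    if 25.0 <= fizz <= 30.0 and 10.0 <= colour <= 18.0:
        return "Ineffective"

    # The rules leave gaps (for example FizzIntensity 36 with ColourShift 11).
    return None
```

Under this reading, `classify_potion(36.28945, 11.461653)` returns `None`: ColourShift exceeds the 8.0 cap of rule 6 and no later rule covers that band, so the written rules give no answer for that entity.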

Confusion Matrix:
                    Predicted Effective  Predicted Ineffective
Actual Effective                      2                      7
Actual Ineffective                    0                     11

Accuracy: 0.650
Precision: 1.000
Recall: 0.222
F1 Score: 0.364

Examples for Correctly predicted Effective: (Correct answer: Effective, What the previous set of rules predicted: Effective)
  Entity Data:
	FizzIntensity: 47.042286
	ColourShift: 9.9699135


Examples for Falsely predicted Ineffective when it should have been Effective: (Correct answer: Effective, What the previous set of rules predicted: Ineffective)
  Entity Data:
	FizzIntensity: 36.28945
	ColourShift: 11.461653


Examples for Correctly predicted Ineffective: (Correct answer: Ineffective, What the previous set of rules predicted: Ineffective)
  Entity Data:
	FizzIntensity: 31.36187
	ColourShift: 13.327494


openai35

Round ID: 486
Prompt used:
	Prompt: If FizzIntensity is greater than 40, predict as Effective. If FizzIntensity is 40 or lower, predict as Ineffective.
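
This prompt amounts to a single threshold on FizzIntensity. A minimal sketch follows; the function name `predict_effective` and the `threshold` parameter are hypothetical and not part of the original run.

```python
def predict_effective(fizz_intensity: float, threshold: float = 40.0) -> str:
    """Return "Effective" when FizzIntensity is strictly above the threshold."""
    # ColourShift is ignored entirely by this rule.
    return "Effective" if fizz_intensity > threshold else "Ineffective"


if __name__ == "__main__":
    # FizzIntensity values taken from the example entities listed for this model.
    for fizz in (47.042286, 36.28945, 41.101128, 28.113564):
        print(fizz, predict_effective(fizz))
```

Because the comparison is strict, a FizzIntensity of exactly 40 is predicted as Ineffective, matching the "40 or lower" wording of the prompt.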

Confusion Matrix:
                    Predicted Effective  Predicted Ineffective
Actual Effective                      7                      2
Actual Ineffective                    3                      8

Accuracy: 0.750
Precision: 0.700
Recall: 0.778
F1 Score: 0.737

Examples for Correctly predicted Effective: (Correct answer: Effective, What the previous set of rules predicted: Effective)
  Entity Data:
	FizzIntensity: 47.042286
	ColourShift: 9.9699135


Examples for Falsely predicted Ineffective when it should have been Effective: (Correct answer: Effective, What the previous set of rules predicted: Ineffective)
  Entity Data:
	FizzIntensity: 36.28945
	ColourShift: 11.461653


Examples for Falsely predicted Effective when it should have been Ineffective: (Correct answer: Ineffective, What the previous set of rules predicted: Effective)
  Entity Data:
	FizzIntensity: 41.101128
	ColourShift: 15.34901


Examples for Correctly predicted Ineffective: (Correct answer: Ineffective, What the previous set of rules predicted: Ineffective)
  Entity Data:
	FizzIntensity: 28.113564
	ColourShift: 20.790554


random

Round ID: 134
Prompt used:
	Choose randomly

Confusion Matrix:
                    Predicted Effective  Predicted Ineffective
Actual Effective                      3                      6
Actual Ineffective                    6                      5

Accuracy: 0.400
Precision: 0.333
Recall: 0.333
F1 Score: 0.333

Examples for Correctly predicted Effective: (Correct answer: Effective, What the previous set of rules predicted: Effective)
  Entity Data:
	FizzIntensity: 36.28945
	ColourShift: 11.461653


Examples for Falsely predicted Ineffective when it should have been Effective: (Correct answer: Effective, What the previous set of rules predicted: Ineffective)
  Entity Data:
	FizzIntensity: 37.055344
	ColourShift: 15.48838


Examples for Falsely predicted Effective when it should have been Ineffective: (Correct answer: Ineffective, What the previous set of rules predicted: Effective)
  Entity Data:
	FizzIntensity: 34.24991
	ColourShift: 10.365348


Examples for Correctly predicted Ineffective: (Correct answer: Ineffective, What the previous set of rules predicted: Ineffective)
  Entity Data:
	FizzIntensity: 26.284357
	ColourShift: 21.189081


Ensemble Confusion Matrix

          Predicted +  Predicted -
Actual +            4            5
Actual -            1           10

Accuracy: 0.700
Precision: 0.800
Recall: 0.444
F1 Score: 0.571
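
The reported figures follow from the confusion-matrix counts by the standard definitions. As a check, the sketch below recomputes them for the ensemble, taking "Effective" as the positive class (4 true positives, 5 false negatives, 1 false positive, 10 true negatives). The helper name `confusion_metrics` is hypothetical, and returning 0.0 when a denominator is zero is an assumption of this sketch.

```python
def confusion_metrics(tp: int, fn: int, fp: int, tn: int) -> dict:
    """Compute accuracy, precision, recall and F1 from binary confusion counts."""
    total = tp + fn + fp + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


if __name__ == "__main__":
    # Ensemble counts: Actual + row is (4, 5), Actual - row is (1, 10).
    for name, value in confusion_metrics(tp=4, fn=5, fp=1, tn=10).items():
        print(f"{name}: {value:.3f}")
```

The same helper applied to the openailong counts (2, 7, 0, 11) gives accuracy 0.650 and recall 0.222, in line with the values listed for that model.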