Ensemble

Dataset: potions

Models

Model Narratives

anthropic

Round ID: 280
Prompt used:
	Classification Rules for Entity Effectiveness:
	
	1. High Effectiveness Criteria:
	   - FizzIntensity must be > 45
	   - ColourShift must be > 10
	   
	2. Low Effectiveness Criteria:
	   - FizzIntensity must be < 40
	   - ColourShift must be < 10
	
	3. Borderline Case Scoring:
	   - Calculate a weighted score: 
	     (FizzIntensity * 0.6) + (ColourShift * 0.4)
	   - If score > 35, classify as Effective
	   - If score < 30, classify as Ineffective
	
	4. Intermediate Cases:
	   - If criteria are not clearly met, use the weighted score to determine classification
	   - Carefully evaluate cases with FizzIntensity between 40-45 and ColourShift between 10-15
	
	Classification Decision Process:
	- First check high and low effectiveness criteria
	- If inconclusive, apply weighted scoring
	- Aim to minimize false positives and false negatives by using a nuanced approach
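
The decision process above can be sketched in Python as follows. This is one possible reading of the prompt, not the model's actual behaviour: the function name `classify_round_280` is hypothetical, the two conditions under each criterion are assumed to be joined with AND, and scores from 30 to 35 (which the prompt leaves unspecified) are returned as "Undetermined".

```python
def classify_round_280(fizz: float, colour: float) -> str:
    """Sketch of the Round 280 rules; thresholds are copied from the prompt."""
    # Steps 1 and 2: check the clear-cut high and low criteria first.
    # Assumption: both conditions of a criterion must hold together.
    if fizz > 45 and colour > 10:
        return "Effective"
    if fizz < 40 and colour < 10:
        return "Ineffective"

    # Step 3: fall back to the weighted score when the criteria are inconclusive.
    score = fizz * 0.6 + colour * 0.4
    if score > 35:
        return "Effective"
    if score < 30:
        return "Ineffective"

    # Assumption: the prompt does not say what to do for 30 <= score <= 35.
    return "Undetermined"
```

Under this reading, the entity with FizzIntensity 42.094933 and ColourShift 12.042143 scores about 30.07 and falls in the unspecified band; the report records it as predicted Ineffective.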

Confusion Matrix:
                    Predicted Effective  Predicted Ineffective
Actual Effective                      4                      5
Actual Ineffective                    1                     10

Accuracy: 0.700
Precision: 0.800
Recall: 0.444
F1 Score: 0.571
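
The four metrics can be recomputed from the confusion matrix counts. The sketch below assumes Effective is the positive class, which reproduces the figures reported here; the function name `confusion_metrics` is hypothetical, and no guard is included for a zero denominator.

```python
def confusion_metrics(tp: int, fn: int, fp: int, tn: int) -> dict[str, float]:
    """Compute accuracy, precision, recall and F1 from confusion matrix counts."""
    accuracy = (tp + tn) / (tp + fn + fp + tn)
    precision = tp / (tp + fp)   # share of predicted Effective that were correct
    recall = tp / (tp + fn)      # share of actual Effective that were found
    f1 = 2 * precision * recall / (precision + recall)
    return {
        "accuracy": round(accuracy, 3),
        "precision": round(precision, 3),
        "recall": round(recall, 3),
        "f1": round(f1, 3),
    }


# Round 280 counts: 4 true positives, 5 false negatives,
# 1 false positive, 10 true negatives.
print(confusion_metrics(tp=4, fn=5, fp=1, tn=10))
```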

Examples for Correctly predicted Effective: (Correct answer: Effective, What the previous set of rules predicted: Effective)
  Entity Data:
	FizzIntensity: 45.78967
	ColourShift: 12.371225


Examples for Falsely predicted Ineffective when it should have been Effective: (Correct answer: Effective, What the previous set of rules predicted: Ineffective)
  Entity Data:
	FizzIntensity: 42.094933
	ColourShift: 12.042143


Examples for Falsely predicted Effective when it should have been Ineffective: (Correct answer: Ineffective, What the previous set of rules predicted: Effective)
  Entity Data:
	FizzIntensity: 49.864723
	ColourShift: 16.120462


Examples for Correctly predicted Ineffective: (Correct answer: Ineffective, What the previous set of rules predicted: Ineffective)
  Entity Data:
	FizzIntensity: 23.177788
	ColourShift: 25.461937


anthropic10

Round ID: 254
Prompt used:
	Classify an entity as Effective if BOTH of these conditions are true:
	1. The FizzIntensity is greater than 40
	2. The ColourShift is greater than 12
	
	If either condition is not met, classify the entity as Ineffective.
	
	Reasoning steps:
	- High FizzIntensity (> 40) suggests strong potential
	- Significant ColourShift (> 12) indicates meaningful change
	- Both conditions must be simultaneously true to be considered Effective
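
As a minimal sketch, the two-condition rule above translates directly into a single conjunction. The function name `classify_round_254` is hypothetical; the thresholds are taken from the prompt.

```python
def classify_round_254(fizz: float, colour: float) -> str:
    """Effective only when both thresholds from the Round 254 prompt are exceeded."""
    if fizz > 40 and colour > 12:
        return "Effective"
    # Either condition failing is enough to classify as Ineffective.
    return "Ineffective"
```

Because both comparisons are strict, an entity sitting exactly on a threshold (FizzIntensity of 40 or ColourShift of 12) is classified as Ineffective.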

Confusion Matrix:
                    Predicted Effective  Predicted Ineffective
Actual Effective                      4                      5
Actual Ineffective                    3                      8

Accuracy: 0.600
Precision: 0.571
Recall: 0.444
F1 Score: 0.500

Examples for Correctly predicted Effective: (Correct answer: Effective, What the previous set of rules predicted: Effective)
  Entity Data:
	FizzIntensity: 42.903545
	ColourShift: 19.770008


Examples for Falsely predicted Ineffective when it should have been Effective: (Correct answer: Effective, What the previous set of rules predicted: Ineffective)
  Entity Data:
	FizzIntensity: 36.28945
	ColourShift: 11.461653


Examples for Falsely predicted Effective when it should have been Ineffective: (Correct answer: Ineffective, What the previous set of rules predicted: Effective)
  Entity Data:
	FizzIntensity: 40.698505
	ColourShift: 14.4273


Examples for Correctly predicted Ineffective: (Correct answer: Ineffective, What the previous set of rules predicted: Ineffective)
  Entity Data:
	FizzIntensity: 31.36187
	ColourShift: 13.327494


openai35

Round ID: 486
Prompt used:
	Prompt: If FizzIntensity is greater than 40, predict as Effective. If FizzIntensity is 40 or lower, predict as Ineffective.

Confusion Matrix:
                    Predicted Effective  Predicted Ineffective
Actual Effective                      7                      2
Actual Ineffective                    3                      8

Accuracy: 0.750
Precision: 0.700
Recall: 0.778
F1 Score: 0.737

Examples for Correctly predicted Effective: (Correct answer: Effective, What the previous set of rules predicted: Effective)
  Entity Data:
	FizzIntensity: 66.28547
	ColourShift: 8.929057


Examples for Falsely predicted Ineffective when it should have been Effective: (Correct answer: Effective, What the previous set of rules predicted: Ineffective)
  Entity Data:
	FizzIntensity: 37.055344
	ColourShift: 15.48838


Examples for Falsely predicted Effective when it should have been Ineffective: (Correct answer: Ineffective, What the previous set of rules predicted: Effective)
  Entity Data:
	FizzIntensity: 49.864723
	ColourShift: 16.120462


Examples for Correctly predicted Ineffective: (Correct answer: Ineffective, What the previous set of rules predicted: Ineffective)
  Entity Data:
	FizzIntensity: 31.36187
	ColourShift: 13.327494


Ensemble Confusion Matrix

            Predicted +  Predicted -
Actual +              6            3
Actual -              3            8

Accuracy 0.700, Precision 0.667, Recall 0.667, F1 0.667
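
The report does not state how the three model predictions are combined into the ensemble prediction. Majority voting is one common choice, and the sketch below assumes it purely for illustration; the function name `majority_vote` is hypothetical.

```python
from collections import Counter


def majority_vote(labels: list[str]) -> str:
    """Return the most frequent label among the model predictions.

    Assumption: an odd number of voters and two possible labels,
    so a tie cannot occur.
    """
    return Counter(labels).most_common(1)[0][0]


# Illustrative votes from three models for a single entity.
print(majority_vote(["Effective", "Ineffective", "Effective"]))
```

With three voters and two labels, at least two models always agree, so no tie-breaking rule is needed under this assumption.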