Ensemble

Dataset: potions

Models

Model Narratives

gemini

Round ID: 292
Prompt used:
	If ColourShift is greater than 17, then predict 'Ineffective'. Otherwise, if FizzIntensity is less than 30, then predict 'Ineffective'. Otherwise predict 'Effective'
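
The rule above can be written out as a small function. This is a minimal sketch of a literal reading of the prompt; the function name `gemini_rule` and its argument names are illustrative and do not come from the report.

```python
def gemini_rule(fizz_intensity: float, colour_shift: float) -> str:
    """Literal reading of the round 292 prompt (hypothetical helper name)."""
    # First test: a large colour shift is labelled Ineffective outright.
    if colour_shift > 17:
        return "Ineffective"
    # Second test: weak fizz is also labelled Ineffective.
    if fizz_intensity < 30:
        return "Ineffective"
    # Everything else falls through to Effective.
    return "Effective"


if __name__ == "__main__":
    # One of the example entities listed for this round.
    print(gemini_rule(37.055344, 15.48838))
```

Read this way, the rule labels an entity Effective only when ColourShift is at most 17 and FizzIntensity is at least 30.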

Confusion Matrix:
                    Predicted Effective  Predicted Ineffective
Actual Effective                      8                      1
Actual Ineffective                    6                      5

Accuracy: 0.650
Precision: 0.571
Recall: 0.889
F1 Score: 0.696
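
The four scores follow from the confusion matrix counts, with Effective treated as the positive class. The sketch below uses the standard definitions; the helper name `binary_metrics` is an assumption for illustration and is not part of the report's tooling.

```python
def binary_metrics(tp: int, fn: int, fp: int, tn: int) -> dict:
    """Standard binary metrics from confusion matrix counts (hypothetical helper)."""
    total = tp + fn + fp + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


if __name__ == "__main__":
    # Counts from the gemini matrix: 8 true positives, 1 false negative,
    # 6 false positives, 5 true negatives.
    for name, value in binary_metrics(tp=8, fn=1, fp=6, tn=5).items():
        print(f"{name}: {value:.3f}")
```

With the gemini counts this gives 13/20 for accuracy, 8/14 for precision, and 8/9 for recall, which round to the values reported above.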

Examples for Correctly predicted Effective: (Correct answer: Effective, What the previous set of rules predicted: Effective)
  Entity Data:
	FizzIntensity: 37.055344
	ColourShift: 15.48838


Examples for Falsely predicted Ineffective when it should have been Effective: (Correct answer: Effective, What the previous set of rules predicted: Ineffective)
  Entity Data:
	FizzIntensity: 42.903545
	ColourShift: 19.770008


Examples for Falsely predicted Effective when it should have been Ineffective: (Correct answer: Ineffective, What the previous set of rules predicted: Effective)
  Entity Data:
	FizzIntensity: 37.77209
	ColourShift: 14.7023735


Examples for Correctly predicted Ineffective: (Correct answer: Ineffective, What the previous set of rules predicted: Ineffective)
  Entity Data:
	FizzIntensity: 28.113564
	ColourShift: 20.790554

openaio3

Round ID: 317
Prompt used:
	Task: Decide whether the entity is “Effective” or “Ineffective”.
	You are given exactly two real‑valued fields per entity:
	• FizzIntensity  (in arbitrary units)
	• ColourShift    (in arbitrary units)
	
	Apply the following rules IN ORDER and output only the single word Effective or Ineffective.
	
	Rule 0 – Definitely Ineffective
	   If FizzIntensity < 25 → Ineffective
	
	Rule 1 – Very high fizz overrides everything
	   Else, if FizzIntensity ≥ 60 → Effective
	
	Rule 2 – Solid fizz with at least a small colour shift
	   Else, if FizzIntensity ≥ 44  AND  ColourShift ≥ 9 → Effective
	
	Rule 3 – Moderate fizz but large colour shift
	   Else, if ColourShift ≥ 20  AND  FizzIntensity ≥ 38 → Effective
	
	Default
	   Else → Ineffective
	
	Remember: evaluate the rules in sequence; stop at the first rule that matches. Output nothing except the final label.
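
As a cross-check, the ordered rules can be sketched as a first-match chain. This is one literal reading of the prompt, not the model's actual behaviour; the function name `openaio3_rule` is illustrative.

```python
def openaio3_rule(fizz_intensity: float, colour_shift: float) -> str:
    """First-match evaluation of the round 317 rules (hypothetical helper name)."""
    if fizz_intensity < 25:                               # Rule 0
        return "Ineffective"
    if fizz_intensity >= 60:                              # Rule 1
        return "Effective"
    if fizz_intensity >= 44 and colour_shift >= 9:        # Rule 2
        return "Effective"
    if colour_shift >= 20 and fizz_intensity >= 38:       # Rule 3
        return "Effective"
    return "Ineffective"                                  # Default


if __name__ == "__main__":
    # Example entities listed for this round.
    print(openaio3_rule(47.042286, 9.9699135))
    print(openaio3_rule(41.101128, 15.34901))
```

Under this reading, the entity with FizzIntensity 44.953373 and ColourShift 12.993897 matches Rule 2 and would be labelled Effective, whereas the report lists the model's prediction for it as Ineffective. The model's output may therefore not always match a literal application of the prompt.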

Confusion Matrix:
                    Predicted Effective  Predicted Ineffective
Actual Effective                      4                      5
Actual Ineffective                    0                     11

Accuracy: 0.750
Precision: 1.000
Recall: 0.444
F1 Score: 0.615

Examples for Correctly predicted Effective: (Correct answer: Effective, What the previous set of rules predicted: Effective)
  Entity Data:
	FizzIntensity: 47.042286
	ColourShift: 9.9699135


Examples for Falsely predicted Ineffective when it should have been Effective: (Correct answer: Effective, What the previous set of rules predicted: Ineffective)
  Entity Data:
	FizzIntensity: 44.953373
	ColourShift: 12.993897


Examples for Correctly predicted Ineffective: (Correct answer: Ineffective, What the previous set of rules predicted: Ineffective)
  Entity Data:
	FizzIntensity: 41.101128
	ColourShift: 15.34901

openaio310

Round ID: 327
Prompt used:
	Task
	You will be given exactly one data row with two numeric fields:
	  • FizzIntensity – a positive real number.
	  • ColourShift   – a real number that can be positive, zero or negative.
	
	Output exactly one of the two words below (case‑sensitive, no punctuation, no extra text):
	  Effective
	  Ineffective
	
	Decision rules (apply strictly in the numbered order)
	0. Extremely low fizz never works.                  If FizzIntensity < 20 → Ineffective.
	1. Ultra‑high fizz always works.                    If FizzIntensity ≥ 60 → Effective.
	2. Mid‑high fizz salvage rule.                      If 50 ≤ FizzIntensity < 55 AND ColourShift ≥ 6  → Effective.
	3. Very‑low colour cannot compensate for fizz.      If ColourShift < 8  AND FizzIntensity < 60 → Ineffective.
	4. Near‑ultra fizz with moderate colour.            If 55 ≤ FizzIntensity < 60 AND 9 ≤ ColourShift ≤ 30 → Effective.
	5. Colour‑dominant synergy rule.                    If FizzIntensity ≥ 35 AND ColourShift ≥ 0.55 × FizzIntensity → Effective.
	6. Weak‑fizz / low‑colour penalty band.             If 40 ≤ FizzIntensity < 45 AND ColourShift < 12 → Ineffective.
	7. High‑colour rescue for weak fizz.                If 40 ≤ FizzIntensity < 45 AND ColourShift ≥ 18 → Effective.
	8. Narrow mid‑range recovery rule.                  If 40 ≤ FizzIntensity < 45 AND 12 ≤ ColourShift ≤ 15 → Effective.
	9. Standard fizz window with safe colour.           If 45 ≤ FizzIntensity < 60 AND ColourShift ≤ 15 → Effective.
	10. Colour‑boosted mid‑fizz window.                 If 45 ≤ FizzIntensity < 60 AND 15 < ColourShift ≤ 26 → Effective.
	11. All remaining cases → Ineffective.
	
	Examples
	• FI 52.4,  CS  7.4  → Effective  (rule 2)
	• FI 56.3,  CS  8.8  → Ineffective (rule 3)
	• FI 57.6,  CS 26.6  → Effective  (rule 4)
	• FI 39.1,  CS 24.4  → Effective  (rule 5)
	• FI 42.7,  CS 11.3  → Ineffective (rule 6)
	• FI 43.6,  CS 19.5  → Effective  (rule 7)
	• FI 43.0,  CS 13.9  → Effective  (rule 8)
	• FI 47.3,  CS 14.9  → Effective  (rule 9)
	• FI 49.8,  CS 17.3  → Effective  (rule 10)
	• FI 34.2,  CS 10.5  → Ineffective (rule 11)
	
	Follow the rules exactly. Output only the single word.
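
Because the twelve rules overlap, it can help to trace which rule fires for a given row. The sketch below is a literal first-match reading of the prompt that returns both the label and the rule number; the function name `openaio310_rule` and the tuple return shape are assumptions for illustration.

```python
def openaio310_rule(fi: float, cs: float) -> tuple:
    """First-match evaluation of the round 327 rules.

    Returns (label, rule_number). Hypothetical helper, literal reading only.
    """
    if fi < 20:
        return ("Ineffective", 0)
    if fi >= 60:
        return ("Effective", 1)
    if 50 <= fi < 55 and cs >= 6:
        return ("Effective", 2)
    if cs < 8 and fi < 60:
        return ("Ineffective", 3)
    if 55 <= fi < 60 and 9 <= cs <= 30:
        return ("Effective", 4)
    if fi >= 35 and cs >= 0.55 * fi:
        return ("Effective", 5)
    if 40 <= fi < 45 and cs < 12:
        return ("Ineffective", 6)
    if 40 <= fi < 45 and cs >= 18:
        return ("Effective", 7)
    if 40 <= fi < 45 and 12 <= cs <= 15:
        return ("Effective", 8)
    if 45 <= fi < 60 and cs <= 15:
        return ("Effective", 9)
    if 45 <= fi < 60 and 15 < cs <= 26:
        return ("Effective", 10)
    return ("Ineffective", 11)


if __name__ == "__main__":
    # Trace two of the prompt's own examples.
    print(openaio310_rule(52.4, 7.4))
    print(openaio310_rule(56.3, 8.8))
```

Read literally, the prompt's example FI 56.3, CS 8.8 resolves to Effective via rule 9, because rule 3 requires ColourShift < 8 and 8.8 does not satisfy it; the prompt's example list labels that row Ineffective (rule 3). The other nine examples in the list agree with this reading.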

Confusion Matrix:
                    Predicted Effective  Predicted Ineffective
Actual Effective                      7                      2
Actual Ineffective                    2                      9

Accuracy: 0.800
Precision: 0.778
Recall: 0.778
F1 Score: 0.778

Examples for Correctly predicted Effective: (Correct answer: Effective, What the previous set of rules predicted: Effective)
  Entity Data:
	FizzIntensity: 42.094933
	ColourShift: 12.042143


Examples for Falsely predicted Ineffective when it should have been Effective: (Correct answer: Effective, What the previous set of rules predicted: Ineffective)
  Entity Data:
	FizzIntensity: 45.78967
	ColourShift: 12.371225


Examples for Falsely predicted Effective when it should have been Ineffective: (Correct answer: Ineffective, What the previous set of rules predicted: Effective)
  Entity Data:
	FizzIntensity: 41.101128
	ColourShift: 15.34901


Examples for Correctly predicted Ineffective: (Correct answer: Ineffective, What the previous set of rules predicted: Ineffective)
  Entity Data:
	FizzIntensity: 31.839703
	ColourShift: 19.288298

Ensemble Confusion Matrix

                    Predicted Effective  Predicted Ineffective
Actual Effective                      7                      2
Actual Ineffective                    2                      9

Accuracy: 0.800
Precision: 0.778
Recall: 0.778
F1 Score: 0.778