Ensemble

Dataset: titanic

Models

Model Narratives

openai10

Round ID: 353
Prompt used:
	### Revised Prediction Rules for Entity Classification ###
	
	Refine the following decision rules to predict Failure or Success accurately, targeting edge cases and ambiguity:
	
	1. General Threshold Rules:
	   - Predict Failure if `TcQ_mass` > 50,000 and `Treatment_Months` < 130, unless explicitly overridden by Genetic_Class or cohort-specific conditions.
	   - Predict Success only if `TcQ_mass` > 50,000, `Treatment_Months` >= 130, and `Genetic_Class_A_Matches` >= 3 and `Genetic_Class_B_Matches` >= 3.
	
	2. Rules for `Histogen_Complex` = 'Beta':
	   - Predict Failure if `Genetic_Class_A_Matches` + `Genetic_Class_B_Matches` < 6, regardless of other factors.
	   - Predict Failure if `TcQ_mass` > 50,000 and any Genetic_Class threshold (<3) fails.
	   - Prioritize Success only if all thresholds are met: `TcQ_mass` > 50,000, `Treatment_Months` > 130, and each Genetic_Class reaches at least 3.
	
	3. Rules for `Histogen_Complex` = 'Delta' or 'Omicron':
	   - Predict Failure if `TcQ_mass` > 25,000 and `Genetic_Class_A_Matches` or `Genetic_Class_B_Matches` < 3.
	   - Predict Failure for `Treatment_Months` < 60 when combined Genetic Class matches are below 6.
	   - For `TcQ_mass` in the range 20,000-25,000:
	     a. Predict Success if `Treatment_Months` >= 100 and each Genetic_Class >= 3.
	     b. Otherwise, predict Failure.
	   - Automatically predict Failure if combined Genetic Class score is 4 or below, unless cohort-specific rules apply.
	
	4. Specific Rules for Cohorts:
	   - Predict Success for cohort='Delhi' if `TcQ_mass` > 200,000, regardless of Genetic_Class. Otherwise, predict Failure.
	   - Other cohorts adhere strictly to thresholds for Genetic_Class matches, with no override for high `TcQ_mass` unless `Treatment_Months` > 130.
	
	5. Gender-Specific Rules (for Sex = female):
	   - Predict Failure if `Treatment_Months` > 110, `TcQ_mass` is between 25,000-40,000, and either Genetic_Class threshold (<3) fails.
	   - Predict Success only if `TcQ_mass` > 40,000 and combined Genetic_Class matches >= 7.
	
	6. Special Override Rules:
	   - Predict Failure universally if combined `Genetic_Class_A_Matches` + `Genetic_Class_B_Matches` < 4.
	   - Predict Success universally if combined Genetic_Class matches >= 8 and `Treatment_Months` > 100, provided no critical TcQ_mass condition fails above 50,000.
	   - For cases where `Histogen_Complex` = 'Omicron', override rules if `TcQ_mass` >= 35,000 and `Genetic_Class` conditions are not met.
	
	7. Catch-All Rule:
	   - Default to Failure if an entry does not decisively meet any outlined Success criteria or has borderline thresholds in ambiguity.

Confusion Matrix:
                Predicted Failure    Predicted Success   
Actual Failure                    28                    0
Actual Success                    23                    0

Accuracy: 0.549
Precision: 0.549
Recall: 1.000
F1 Score: 0.709
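
The figures above are consistent with treating Failure as the positive class. That is an inference from the numbers, not something the report states. A minimal sketch that recomputes them from the four cells of the matrix (the function name `binary_metrics` and its argument names are illustrative, not part of the pipeline):

```python
def binary_metrics(tp: int, fn: int, fp: int, tn: int) -> dict:
    """Accuracy, precision, recall and F1 for one 2x2 confusion matrix.

    Assumption: Failure is the positive class, so
      tp = actual Failure, predicted Failure
      fn = actual Failure, predicted Success
      fp = actual Success, predicted Failure
      tn = actual Success, predicted Success
    """
    total = tp + fn + fp + tn
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {
        "accuracy": (tp + tn) / total,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Counts taken from the openai10 matrix above.
m = binary_metrics(tp=28, fn=0, fp=23, tn=0)
print({k: round(v, 3) for k, v in m.items()})
# {'accuracy': 0.549, 'precision': 0.549, 'recall': 1.0, 'f1': 0.709}
```

Because every entity was predicted Failure, recall is trivially 1.0 and precision equals the share of Failure cases (28 of 51), which is also the accuracy.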

Examples for Correctly predicted Failure: (Correct answer: Failure, What the previous set of rules predicted: Failure)
  Entity Data:
	Histogen_Complex: Delta
	Sex: female
	Treatment_Months: 84.0
	Genetic_Class_A_Matches: 1
	Genetic_Class_B_Matches: 1
	TcQ_mass: 56495.8
	Cohort: Melbourne


Examples for Falsely predicted Failure when it should have been Success: (Correct answer: Success, What the previous set of rules predicted: Failure)
  Entity Data:
	Histogen_Complex: Beta
	Sex: female
	Treatment_Months: 111.0
	Genetic_Class_A_Matches: 2
	Genetic_Class_B_Matches: 2
	TcQ_mass: 52554.2
	Cohort: Melbourne


openaio3

Round ID: 277
Prompt used:
	Task: Classify each entity as "Success" or "Failure" using the rules below.  Output exactly one of these two words.
	
	Deterministic classification rules (apply in order):
	1.  If Histogen_Complex is Beta or Omicron, classify as Success.
	2.  Otherwise (all other Histogen_Complex values, e.g. Delta, Alpha, etc.), classify as Failure.
	
	Do not rely on any other features.  Do not explain the answer.  Just output the single word label.
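
The prompt above reduces to a single membership test on Histogen_Complex. A minimal sketch of that rule as code, for reference only (the function name `classify_by_complex` is hypothetical; the report does not show any such helper):

```python
# Complex values that the openaio3 prompt maps to Success.
SUCCESS_COMPLEXES = {"Beta", "Omicron"}


def classify_by_complex(histogen_complex: str) -> str:
    """Return "Success" for Beta or Omicron, "Failure" for anything else."""
    return "Success" if histogen_complex in SUCCESS_COMPLEXES else "Failure"


for value in ("Beta", "Omicron", "Delta", "Alpha"):
    print(value, "->", classify_by_complex(value))
```

Every other field (Sex, Treatment_Months, the Genetic_Class counts, TcQ_mass, Cohort) is ignored, so two entities with the same Histogen_Complex always receive the same label.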

Confusion Matrix:
                Predicted Failure    Predicted Success   
Actual Failure                    17                   11
Actual Success                     5                   18

Accuracy: 0.686
Precision: 0.773
Recall: 0.607
F1 Score: 0.680

Examples for Correctly predicted Failure: (Correct answer: Failure, What the previous set of rules predicted: Failure)
  Entity Data:
	Histogen_Complex: Delta
	Sex: female
	Treatment_Months: 93.0
	Genetic_Class_A_Matches: 1
	Genetic_Class_B_Matches: 1
	TcQ_mass: 7775.0
	Cohort: Melbourne


Examples for Falsely predicted Success when it should have been Failure: (Correct answer: Failure, What the previous set of rules predicted: Success)
  Entity Data:
	Histogen_Complex: Omicron
	Sex: female
	Treatment_Months: 99.0
	Genetic_Class_A_Matches: 1
	Genetic_Class_B_Matches: 1
	TcQ_mass: 12275.0
	Cohort: Melbourne


Examples for Falsely predicted Failure when it should have been Success: (Correct answer: Success, What the previous set of rules predicted: Failure)
  Entity Data:
	Histogen_Complex: Delta
	Sex: female
	Treatment_Months: 89.09735294117647
	Genetic_Class_A_Matches: 1
	Genetic_Class_B_Matches: 1
	TcQ_mass: 56495.8
	Cohort: Melbourne


Examples for Correctly predicted Success: (Correct answer: Success, What the previous set of rules predicted: Success)
  Entity Data:
	Histogen_Complex: Beta
	Sex: male
	Treatment_Months: 90.0
	Genetic_Class_A_Matches: 1
	Genetic_Class_B_Matches: 1
	TcQ_mass: 86500.0
	Cohort: Melbourne


openaio310

Round ID: 296
Prompt used:
	You are given one entity at a time with the following fields.
	• Histogen_Complex  (string)
	• Sex               ("male" or "female")
	• Treatment_Months  (number, can be decimal)
	• Genetic_Class_A_Matches (integer ≥0)
	• Genetic_Class_B_Matches (integer ≥0)
	• TcQ_mass          (number, can be decimal)
	• Cohort            (string)
	
	Your task is to predict the treatment OUTCOME for that entity.
	Only two outcomes are possible:
	  Success
	  Failure
	
	Apply the rules below IN ORDER. As soon as a rule is satisfied, output the associated outcome and stop – do not check the lower-priority rules. The rules are designed to be mutually exclusive and cover every possible row.
	
	Rule 1  HIGH GENETIC BURDEN → Failure
	  Let total_matches = Genetic_Class_A_Matches + Genetic_Class_B_Matches.
	  If total_matches ≥ 6, predict Failure.
	
	Rule 2  EXTREME TcQ_mass → Success
	  If TcQ_mass > 200 000, predict Success.
	
	Rule 3  MALE DEFAULT → Success
	  If Sex is "male", predict Success.
	
	Rule 4  SHORT TREATMENT WINDOW FOR FEMALES → Success
	  If Sex is "female" AND Treatment_Months < 24, predict Success.
	
	Rule 5  OTHERWISE → Failure
	  All remaining cases (i.e., Sex "female" with Treatment_Months ≥ 24 and that did not match any earlier rule) are predicted as Failure.
	
	Output exactly one word – either Success or Failure – with nothing else.
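
The five rules above form a first-match decision list. A minimal sketch of a literal reading, assuming each entity arrives as a dictionary keyed by the field names in the prompt (the function name `classify_ordered` and the dictionary representation are illustrative assumptions):

```python
def classify_ordered(entity: dict) -> str:
    """Apply the openaio310 rules in order and return the first match."""
    total_matches = (
        entity["Genetic_Class_A_Matches"] + entity["Genetic_Class_B_Matches"]
    )
    # Rule 1: high genetic burden.
    if total_matches >= 6:
        return "Failure"
    # Rule 2: extreme TcQ_mass.
    if entity["TcQ_mass"] > 200_000:
        return "Success"
    # Rule 3: male default.
    if entity["Sex"] == "male":
        return "Success"
    # Rule 4: short treatment window for females.
    if entity["Sex"] == "female" and entity["Treatment_Months"] < 24:
        return "Success"
    # Rule 5: everything else.
    return "Failure"


example = {
    "Histogen_Complex": "Omicron",
    "Sex": "female",
    "Treatment_Months": 162.0,
    "Genetic_Class_A_Matches": 1,
    "Genetic_Class_B_Matches": 1,
    "TcQ_mass": 14000.0,
    "Cohort": "Melbourne",
}
print(classify_ordered(example))  # Failure
```

The conditions overlap (a male with six or more total matches satisfies both Rule 1 and Rule 3), so the outcome depends on the stated order and not on mutual exclusivity. Applied literally, an entity with 3 + 4 matches would stop at Rule 1 and return Failure, so any recorded Success prediction for such a row suggests the model did not follow the order strictly.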

Confusion Matrix:
                Predicted Failure    Predicted Success   
Actual Failure                    24                    4
Actual Success                     7                   16

Accuracy: 0.784
Precision: 0.774
Recall: 0.857
F1 Score: 0.814

Examples for Correctly predicted Failure: (Correct answer: Failure, What the previous set of rules predicted: Failure)
  Entity Data:
	Histogen_Complex: Omicron
	Sex: female
	Treatment_Months: 162.0
	Genetic_Class_A_Matches: 1
	Genetic_Class_B_Matches: 1
	TcQ_mass: 14000.0
	Cohort: Melbourne


Examples for Falsely predicted Success when it should have been Failure: (Correct answer: Failure, What the previous set of rules predicted: Success)
  Entity Data:
	Histogen_Complex: Delta
	Sex: male
	Treatment_Months: 120.0
	Genetic_Class_A_Matches: 2
	Genetic_Class_B_Matches: 1
	TcQ_mass: 9475.0
	Cohort: Melbourne


Examples for Falsely predicted Failure when it should have been Success: (Correct answer: Success, What the previous set of rules predicted: Failure)
  Entity Data:
	Histogen_Complex: Omicron
	Sex: female
	Treatment_Months: 96.0
	Genetic_Class_A_Matches: 2
	Genetic_Class_B_Matches: 1
	TcQ_mass: 26000.0
	Cohort: Melbourne


Examples for Correctly predicted Success: (Correct answer: Success, What the previous set of rules predicted: Success)
  Entity Data:
	Histogen_Complex: Omicron
	Sex: male
	Treatment_Months: 72.0
	Genetic_Class_A_Matches: 3
	Genetic_Class_B_Matches: 4
	TcQ_mass: 18750.0
	Cohort: Melbourne


Ensemble Confusion Matrix

                Predicted +          Predicted -
Actual +                 27                    1
Actual -                 10                   13

Accuracy 0.784, Precision 0.730, Recall 0.964, F1 0.831
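
Reading the ensemble matrix as 27 and 1 in the Actual + row and 10 and 13 in the Actual - row (row totals 28 and 23, matching the Failure and Success counts in the per-model matrices), the ensemble can be set beside the three individual models. The sketch below only checks the arithmetic; the `MATRICES` layout and the `summarize` helper are illustrative, and treating + as Failure is an assumption drawn from the row totals.

```python
# Cells are (tp, fn, fp, tn) with Failure as the positive class (assumption).
MATRICES = {
    "openai10": (28, 0, 23, 0),
    "openaio3": (17, 11, 5, 18),
    "openaio310": (24, 4, 7, 16),
    "ensemble": (27, 1, 10, 13),
}


def summarize(tp: int, fn: int, fp: int, tn: int) -> tuple:
    """Return (accuracy, F1) rounded to three decimals."""
    accuracy = (tp + tn) / (tp + fn + fp + tn)
    # F1 written directly in terms of counts: 2TP / (2TP + FP + FN).
    f1 = 2 * tp / (2 * tp + fp + fn)
    return round(accuracy, 3), round(f1, 3)


for name, cells in MATRICES.items():
    acc, f1 = summarize(*cells)
    print(f"{name:<11} accuracy={acc:.3f}  f1={f1:.3f}")
```

On these counts the ensemble ties openaio310 on accuracy (40 of 51 correct) and has the highest F1, helped by misclassifying only one of the 28 Failure cases.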