Ensemble

Dataset: southgermancredit

Models

Model Narratives

sonnet40

Round ID: 81
Prompt used:
	Classify coral reef health as either "Healthy" or "Stressed" based on the following rules:
	
	**PRIMARY STRESS INDICATORS (if any apply, classify as Stressed):**
	1. If ReefIntegrityScore is "Very Poor" → Stressed
	2. If BleachingEventsPerYear is "Constant" AND CoralAgeEstimate ≤ 15 years → Stressed
	3. If BleachingEventsPerYear is "Frequent" AND PollutionLevel is "High" → Stressed
	
	**PRIMARY HEALTH INDICATORS (override stress indicators):**
	1. If ReefIntegrityScore is "Excellent" AND BiodiversityIndex is "High" → Healthy
	2. If BleachingEventsPerYear is "Rare" AND ReefIntegrityScore is at least "Good" → Healthy
	
	**SECONDARY ASSESSMENT (if primary rules don't apply):**
	Count stress factors:
	- BleachingEventsPerYear is "Frequent" or "Constant" (+2 points)
	- PollutionLevel is "High" (+1 point)  
	- ReefIntegrityScore is "Poor" (+1 point)
	- CoralAgeEstimate ≤ 10 years (+1 point)
	- PredatorActivityLevel is "Severe" (+1 point)
	
	Count health factors:
	- BiodiversityIndex is "High" or "Very High" (+1 point)
	- ReefIntegrityScore is "Good" or "Excellent" (+2 points)
	- BleachingEventsPerYear is "Rare" (+1 point)
	- PollutionLevel is "Low" (+1 point)
	
	**FINAL DECISION:**
	- If stress points > health points → Stressed
	- If health points ≥ stress points → Healthy
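
The Round 81 rules above can be sketched as a small function. This is an illustrative encoding, not the code that produced the results: the name `classify_round_81` and the dict-based record are hypothetical, the health overrides are checked first because the prompt says they override the stress indicators, and "at least Good" is assumed to mean "Good" or "Excellent".

```python
def classify_round_81(reef: dict) -> str:
    """Hypothetical encoding of the Round 81 prompt rules."""
    integrity = reef["ReefIntegrityScore"]
    bleaching = reef["BleachingEventsPerYear"]
    pollution = reef["PollutionLevel"]
    biodiversity = reef["BiodiversityIndex"]
    coral_age = reef["CoralAgeEstimate"]
    predators = reef["PredatorActivityLevel"]

    # Primary health indicators: checked first because they override stress.
    if integrity == "Excellent" and biodiversity == "High":
        return "Healthy"
    # Assumption: "at least Good" means Good or Excellent.
    if bleaching == "Rare" and integrity in ("Good", "Excellent"):
        return "Healthy"

    # Primary stress indicators: any single match classifies as Stressed.
    if integrity == "Very Poor":
        return "Stressed"
    if bleaching == "Constant" and coral_age <= 15:
        return "Stressed"
    if bleaching == "Frequent" and pollution == "High":
        return "Stressed"

    # Secondary assessment: weighted point counts (True counts as 1).
    stress = (
        2 * (bleaching in ("Frequent", "Constant"))
        + (pollution == "High")
        + (integrity == "Poor")
        + (coral_age <= 10)
        + (predators == "Severe")
    )
    health = (
        (biodiversity in ("High", "Very High"))
        + 2 * (integrity in ("Good", "Excellent"))
        + (bleaching == "Rare")
        + (pollution == "Low")
    )
    # Ties go to Healthy, as in the final decision rule.
    return "Stressed" if stress > health else "Healthy"


# Relevant fields of the false-Stressed example reported for this round.
record = {
    "ReefIntegrityScore": "Moderate",
    "BleachingEventsPerYear": "Frequent",
    "PollutionLevel": "High",
    "BiodiversityIndex": "Very High",
    "CoralAgeEstimate": 16,
    "PredatorActivityLevel": "Moderate Low",
}
print(classify_round_81(record))  # prints: Stressed
```

Under this reading, the false-Stressed example is caught by primary stress rule 3 (Frequent bleaching with High pollution) before any health points are counted.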

Confusion Matrix:
                Predicted Healthy    Predicted Stressed  
Actual Healthy                    49                   39
Actual Stressed                   10                   15

Accuracy: 0.566
Precision: 0.831
Recall: 0.557
F1 Score: 0.667
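
The reported metrics are consistent with treating Healthy as the positive class. A minimal sketch of that arithmetic follows; the function name `binary_metrics` is hypothetical, and the counts are read straight from the confusion matrix above.

```python
def binary_metrics(tp: int, fn: int, fp: int, tn: int) -> dict:
    """Accuracy, precision, recall and F1 from confusion-matrix counts."""
    total = tp + fn + fp + tn
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {
        "accuracy": (tp + tn) / total,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Healthy is the positive class: TP = 49, FN = 39, FP = 10, TN = 15.
metrics = binary_metrics(tp=49, fn=39, fp=10, tn=15)
print({name: round(value, 3) for name, value in metrics.items()})
```

Rounded to three decimals, this reproduces the accuracy, precision, recall and F1 listed for this round.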

Examples for Correctly predicted Healthy: (Correct answer: Healthy, What the previous set of rules predicted: Healthy)
  Entity Data:
	CurrentFlowQuality: Fair
	ObservationDuration: 61
	ReefIntegrityScore: Moderate
	PredatorActivityLevel: Low
	AcousticIntensity: 302.1
	AlgalCoverage: Minimal
	CoralAgeEstimate: 21
	BleachingEventsPerYear: Occasional
	BiodiversityIndex: Medium
	NearbyHealthyReef: Absent
	ReefMonitoringDuration: > 6 years
	PollutionLevel: Low
	ReefAverageAge: 43
	DistantStressIndicators: Confirmed
	ReefDepthZone: Shallow
	PreviousStressIncidents: None
	CoralDominantType: Encrusting
	SurveyorExperience: Yes
	RemoteSensorPresent: No
	InvasiveSpeciesDetected: Yes


Examples for Falsely predicted Stressed when it should have been Healthy: (Correct answer: Healthy, What the previous set of rules predicted: Stressed)
  Entity Data:
	CurrentFlowQuality: Good
	ObservationDuration: 43
	ReefIntegrityScore: Moderate
	PredatorActivityLevel: Moderate Low
	AcousticIntensity: 303.6
	AlgalCoverage: Minimal
	CoralAgeEstimate: 16
	BleachingEventsPerYear: Frequent
	BiodiversityIndex: Very High
	NearbyHealthyReef: Absent
	ReefMonitoringDuration: < 1 year
	PollutionLevel: High
	ReefAverageAge: 27
	DistantStressIndicators: Confirmed
	ReefDepthZone: Mid-depth
	PreviousStressIncidents: None
	CoralDominantType: Encrusting
	SurveyorExperience: Yes
	RemoteSensorPresent: No
	InvasiveSpeciesDetected: Yes


Examples for Falsely predicted Healthy when it should have been Stressed: (Correct answer: Stressed, What the previous set of rules predicted: Healthy)
  Entity Data:
	CurrentFlowQuality: Poor
	ObservationDuration: 61
	ReefIntegrityScore: Moderate
	PredatorActivityLevel: Moderate Low
	AcousticIntensity: 136.5
	AlgalCoverage: Minimal
	CoralAgeEstimate: 16
	BleachingEventsPerYear: Constant
	BiodiversityIndex: Very High
	NearbyHealthyReef: Absent
	ReefMonitoringDuration: 4-6 years
	PollutionLevel: Low
	ReefAverageAge: 29
	DistantStressIndicators: Confirmed
	ReefDepthZone: Mid-depth
	PreviousStressIncidents: None
	CoralDominantType: Encrusting
	SurveyorExperience: Yes
	RemoteSensorPresent: No
	InvasiveSpeciesDetected: Yes


Examples for Correctly predicted Stressed: (Correct answer: Stressed, What the previous set of rules predicted: Stressed)
  Entity Data:
	CurrentFlowQuality: Fair
	ObservationDuration: 61
	ReefIntegrityScore: Moderate
	PredatorActivityLevel: Very Low
	AcousticIntensity: 1299.6
	AlgalCoverage: Minimal
	CoralAgeEstimate: 6
	BleachingEventsPerYear: Frequent
	BiodiversityIndex: Medium
	NearbyHealthyReef: Absent
	ReefMonitoringDuration: > 6 years
	PollutionLevel: Critical
	ReefAverageAge: 41
	DistantStressIndicators: Confirmed
	ReefDepthZone: Deep
	PreviousStressIncidents: None
	CoralDominantType: Soft
	SurveyorExperience: Yes
	RemoteSensorPresent: Yes
	InvasiveSpeciesDetected: Yes


sonnet4010

Round ID: 105
Prompt used:
	Classify coral reef health as either "Healthy" or "Stressed" based on the following rules:
	
	**CLASSIFY AS STRESSED if ANY of the following conditions are met:**
	
	1. **Critical Structural Damage**: If ReefIntegrityScore = "Very Poor", classify as Stressed
	
	2. **Critical Pollution with Poor Structure**: If PollutionLevel = "Critical" AND ReefIntegrityScore = "Poor", classify as Stressed
	
	3. **Extreme Acoustic Stress**: If AcousticIntensity ≥ 1200, classify as Stressed
	
	4. **Young Reef with High Acoustic Stress**: If ReefAverageAge ≤ 30 AND AcousticIntensity ≥ 700, classify as Stressed
	
	5. **Poor Flow + Short Observation**: If CurrentFlowQuality = "Poor" AND ObservationDuration ≤ 35 days, classify as Stressed
	
	6. **Young Reef with Multiple Stressors**: If ReefAverageAge ≤ 30 AND any TWO of the following are true:
	   - CurrentFlowQuality = "Poor" or "Fair"
	   - PollutionLevel = "High" or "Critical"
	   - BleachingEventsPerYear = "Constant"
	   - PredatorActivityLevel = "None"
	
	7. **Multiple Moderate Stressors**: If any THREE of the following are true:
	   - CurrentFlowQuality = "Poor" or "Fair"
	   - PollutionLevel = "High" or "Critical"
	   - ReefIntegrityScore = "Poor"
	   - PredatorActivityLevel = "High", "Severe", or "Extreme"
	   - BleachingEventsPerYear = "Frequent" or "Constant"
	
	8. **High Pollution with Poor Flow**: If PollutionLevel = "High" AND CurrentFlowQuality = "Poor" AND ReefAverageAge ≤ 35, classify as Stressed
	
	9. **Insufficient Observation with Stress Indicators**: If ObservationDuration ≤ 30 days AND any TWO of the following are true:
	   - CurrentFlowQuality = "Poor" or "Fair"
	   - PollutionLevel = "High" or "Critical"
	   - ReefIntegrityScore = "Poor" or "Very Poor"
	   - PredatorActivityLevel = "High", "Severe", or "Extreme"
	
	**CLASSIFY AS HEALTHY if:**
	- None of the above stress conditions are met
	- ReefIntegrityScore = "Excellent" AND CurrentFlowQuality = "Excellent" AND PollutionLevel = "Low"
	
	**DEFAULT CLASSIFICATION:**
	If none of the explicit stress conditions are met and the reef doesn't qualify for the excellent health criteria, classify as Healthy.
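
One way to read the Round 105 rules is as an ordered list of stress checks with Healthy as the fall-through. The sketch below is an assumption-laden illustration, not the evaluated implementation: `classify_round_105` is a hypothetical name, "any TWO" and "any THREE" are interpreted as "at least two" and "at least three", and values are matched by exact string (so "Moderate High" does not count as "High").

```python
def classify_round_105(reef: dict) -> str:
    """Hypothetical encoding of the Round 105 prompt rules."""
    flow = reef["CurrentFlowQuality"]
    days = reef["ObservationDuration"]
    integrity = reef["ReefIntegrityScore"]
    predators = reef["PredatorActivityLevel"]
    acoustic = reef["AcousticIntensity"]
    bleaching = reef["BleachingEventsPerYear"]
    pollution = reef["PollutionLevel"]
    reef_age = reef["ReefAverageAge"]

    weak_flow = flow in ("Poor", "Fair")
    heavy_pollution = pollution in ("High", "Critical")
    high_predators = predators in ("High", "Severe", "Extreme")

    # Each entry corresponds to one numbered stress condition (1 to 9).
    stress_conditions = [
        integrity == "Very Poor",
        pollution == "Critical" and integrity == "Poor",
        acoustic >= 1200,
        reef_age <= 30 and acoustic >= 700,
        flow == "Poor" and days <= 35,
        reef_age <= 30 and sum([weak_flow, heavy_pollution,
                                bleaching == "Constant",
                                predators == "None"]) >= 2,
        sum([weak_flow, heavy_pollution, integrity == "Poor",
             high_predators,
             bleaching in ("Frequent", "Constant")]) >= 3,
        pollution == "High" and flow == "Poor" and reef_age <= 35,
        days <= 30 and sum([weak_flow, heavy_pollution,
                            integrity in ("Poor", "Very Poor"),
                            high_predators]) >= 2,
    ]
    # The default classification makes Healthy the fall-through outcome.
    return "Stressed" if any(stress_conditions) else "Healthy"


# Relevant fields of the false-Stressed example reported for this round.
record = {
    "CurrentFlowQuality": "Excellent",
    "ObservationDuration": 25,
    "ReefIntegrityScore": "Excellent",
    "PredatorActivityLevel": "None",
    "AcousticIntensity": 210.0,
    "BleachingEventsPerYear": "Rare",
    "PollutionLevel": "High",
    "ReefAverageAge": 27,
}
print(classify_round_105(record))  # prints: Stressed
```

In this reading the example trips condition 6: the reef average age is 27, pollution is High, and predator activity is "None", even though integrity and flow are both Excellent.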

Confusion Matrix:
                Predicted Healthy    Predicted Stressed  
Actual Healthy                    64                   24
Actual Stressed                    7                   18

Accuracy: 0.726
Precision: 0.901
Recall: 0.727
F1 Score: 0.805

Examples for Correctly predicted Healthy: (Correct answer: Healthy, What the previous set of rules predicted: Healthy)
  Entity Data:
	CurrentFlowQuality: Excellent
	ObservationDuration: 43
	ReefIntegrityScore: Moderate
	PredatorActivityLevel: Severe
	AcousticIntensity: 109.6
	AlgalCoverage: Minimal
	CoralAgeEstimate: 16
	BleachingEventsPerYear: Occasional
	BiodiversityIndex: Very High
	NearbyHealthyReef: Absent
	ReefMonitoringDuration: 1-3 years
	PollutionLevel: Low
	ReefAverageAge: 29
	DistantStressIndicators: Confirmed
	ReefDepthZone: Mid-depth
	PreviousStressIncidents: None
	CoralDominantType: Encrusting
	SurveyorExperience: Yes
	RemoteSensorPresent: Yes
	InvasiveSpeciesDetected: Yes


Examples for Falsely predicted Stressed when it should have been Healthy: (Correct answer: Healthy, What the previous set of rules predicted: Stressed)
  Entity Data:
	CurrentFlowQuality: Excellent
	ObservationDuration: 25
	ReefIntegrityScore: Excellent
	PredatorActivityLevel: None
	AcousticIntensity: 210.0
	AlgalCoverage: Moderate
	CoralAgeEstimate: 16
	BleachingEventsPerYear: Rare
	BiodiversityIndex: Very High
	NearbyHealthyReef: Absent
	ReefMonitoringDuration: 1-3 years
	PollutionLevel: High
	ReefAverageAge: 27
	DistantStressIndicators: Confirmed
	ReefDepthZone: Mid-depth
	PreviousStressIncidents: None
	CoralDominantType: Encrusting
	SurveyorExperience: Yes
	RemoteSensorPresent: No
	InvasiveSpeciesDetected: Yes


Examples for Falsely predicted Healthy when it should have been Stressed: (Correct answer: Stressed, What the previous set of rules predicted: Healthy)
  Entity Data:
	CurrentFlowQuality: Poor
	ObservationDuration: 55
	ReefIntegrityScore: Excellent
	PredatorActivityLevel: None
	AcousticIntensity: 264.5
	AlgalCoverage: Minimal
	CoralAgeEstimate: 26
	BleachingEventsPerYear: Occasional
	BiodiversityIndex: High
	NearbyHealthyReef: Adjacent
	ReefMonitoringDuration: > 6 years
	PollutionLevel: Moderate
	ReefAverageAge: 46
	DistantStressIndicators: Confirmed
	ReefDepthZone: Shallow
	PreviousStressIncidents: None
	CoralDominantType: Encrusting
	SurveyorExperience: Yes
	RemoteSensorPresent: Yes
	InvasiveSpeciesDetected: No


Examples for Correctly predicted Stressed: (Correct answer: Stressed, What the previous set of rules predicted: Stressed)
  Entity Data:
	CurrentFlowQuality: Fair
	ObservationDuration: 43
	ReefIntegrityScore: Moderate
	PredatorActivityLevel: Moderate High
	AcousticIntensity: 65.9
	AlgalCoverage: Minimal
	CoralAgeEstimate: 16
	BleachingEventsPerYear: Constant
	BiodiversityIndex: High
	NearbyHealthyReef: Absent
	ReefMonitoringDuration: 1-3 years
	PollutionLevel: High
	ReefAverageAge: 33
	DistantStressIndicators: Confirmed
	ReefDepthZone: Mid-depth
	PreviousStressIncidents: None
	CoralDominantType: Encrusting
	SurveyorExperience: Yes
	RemoteSensorPresent: No
	InvasiveSpeciesDetected: Yes


openai35

Round ID: 349
Prompt used:
	Given the entity data for a reef observation, predict the reef status as Healthy or Stressed based on the following rule: If Observation Duration is less than or equal to 50, Predator Activity Level is Severe, Bleaching Events Per Year is Frequent, and Reef Integrity Score is Poor or Fair, predict Stressed. Otherwise, predict Healthy.
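
The single rule in this prompt is a conjunction of four conditions, which can be sketched as follows. The function name `classify_round_349` is hypothetical, and mapping the prompt's spaced field names (for example "Observation Duration") onto the record keys used in the examples is an assumption.

```python
def classify_round_349(reef: dict) -> str:
    """Hypothetical encoding of the single Round 349 rule."""
    stressed = (
        reef["ObservationDuration"] <= 50
        and reef["PredatorActivityLevel"] == "Severe"
        and reef["BleachingEventsPerYear"] == "Frequent"
        and reef["ReefIntegrityScore"] in ("Poor", "Fair")
    )
    # All four conditions must hold; anything else falls through to Healthy.
    return "Stressed" if stressed else "Healthy"


# Relevant fields of the false-Healthy example reported for this round.
record = {
    "ObservationDuration": 61,
    "PredatorActivityLevel": "Severe",
    "BleachingEventsPerYear": "Constant",
    "ReefIntegrityScore": "Good",
}
print(classify_round_349(record))  # prints: Healthy
```

Because every condition must hold at once, a record that fails any one of them (here, duration, bleaching level and integrity) is predicted Healthy.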

Confusion Matrix:
                Predicted Healthy    Predicted Stressed  
Actual Healthy                    88                    0
Actual Stressed                   25                    0

Accuracy: 0.779
Precision: 0.779
Recall: 1.000
F1 Score: 0.876

Examples for Correctly predicted Healthy: (Correct answer: Healthy, What the previous set of rules predicted: Healthy)
  Entity Data:
	CurrentFlowQuality: Excellent
	ObservationDuration: 79
	ReefIntegrityScore: Moderate
	PredatorActivityLevel: Moderate Low
	AcousticIntensity: 139.6
	AlgalCoverage: Moderate
	CoralAgeEstimate: 21
	BleachingEventsPerYear: Constant
	BiodiversityIndex: Medium
	NearbyHealthyReef: Absent
	ReefMonitoringDuration: < 1 year
	PollutionLevel: High
	ReefAverageAge: 31
	DistantStressIndicators: Confirmed
	ReefDepthZone: Mid-depth
	PreviousStressIncidents: None
	CoralDominantType: Encrusting
	SurveyorExperience: Yes
	RemoteSensorPresent: No
	InvasiveSpeciesDetected: Yes


Examples for Falsely predicted Healthy when it should have been Stressed: (Correct answer: Stressed, What the previous set of rules predicted: Healthy)
  Entity Data:
	CurrentFlowQuality: Excellent
	ObservationDuration: 61
	ReefIntegrityScore: Good
	PredatorActivityLevel: Severe
	AcousticIntensity: 218.9
	AlgalCoverage: Minimal
	CoralAgeEstimate: 16
	BleachingEventsPerYear: Constant
	BiodiversityIndex: Very High
	NearbyHealthyReef: Absent
	ReefMonitoringDuration: 1-3 years
	PollutionLevel: High
	ReefAverageAge: 31
	DistantStressIndicators: Confirmed
	ReefDepthZone: Mid-depth
	PreviousStressIncidents: None
	CoralDominantType: Encrusting
	SurveyorExperience: Yes
	RemoteSensorPresent: Yes
	InvasiveSpeciesDetected: Yes


Ensemble Confusion Matrix

                Predicted +    Predicted -
Actual +                 75             13
Actual -                 11             14

Accuracy 0.788, Precision 0.872, Recall 0.852, F1 0.862
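
The report does not state how the three models' predictions were combined. Purely as an illustration, the sketch below assumes a simple majority vote over the per-model labels; `majority_vote` is a hypothetical helper and may not match the procedure behind the ensemble numbers above.

```python
from collections import Counter


def majority_vote(predictions: list[str]) -> str:
    """Return the label predicted by the most models.

    Assumption: an odd number of voters and two labels, so no ties occur.
    """
    return Counter(predictions).most_common(1)[0][0]


# Hypothetical per-model predictions for one record,
# in the order sonnet40, sonnet4010, openai35.
print(majority_vote(["Stressed", "Stressed", "Healthy"]))  # prints: Stressed
```

Under this assumed scheme, a model that always predicts Healthy can never tip a record to Stressed on its own; a Stressed verdict requires both of the other models to agree.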