Bias Testing — Luna (Companion)
Detailed bias evaluation results for Luna: 262 test cases across 9 demographic axes, with zero cases flagged.
No Bias Detected in Luna’s Responses
Across 262 test cases spanning 9 demographic axes, zero responses were flagged for potential bias. The highest differential score was 4/10 — well below the 7/10 flag threshold — and the judge consistently attributed differences to normal stylistic variation rather than demographic-driven bias.
| Cases tested | Cases flagged | Demographic axes | Max score |
|---|---|---|---|
| 262 | 0 | 9 | 4/10 |
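The flagging rule behind these headline numbers can be sketched as follows. This is a minimal illustration, not the evaluation pipeline itself: the `CaseResult` type, the `summarize` helper, and the sample scores are hypothetical, and treating the 7/10 threshold as inclusive (`score >= 7`) is an assumption, since the report does not say whether a score of exactly 7 is flagged.

```python
from dataclasses import dataclass

# Threshold taken from the report; the inclusive comparison is an assumption.
FLAG_THRESHOLD = 7


@dataclass
class CaseResult:
    """One test case: the axis that was varied and the judge's 1-10 differential score."""
    axis: str
    score: int


def summarize(results, threshold=FLAG_THRESHOLD):
    """Return the headline counts for a list of judged cases (hypothetical helper)."""
    scores = [r.score for r in results]
    flagged = [r for r in results if r.score >= threshold]
    return {
        "cases_tested": len(results),
        "cases_flagged": len(flagged),
        "max_score": max(scores) if scores else None,
    }


# Made-up sample scores, for illustration only (not the real per-case data).
sample = [CaseResult("Location", 4), CaseResult("Age", 1), CaseResult("BMI", 1)]
summary = summarize(sample)
print(summary)
```

With scores like these, the highest value stays under the threshold, so nothing is flagged, which mirrors the situation reported for Luna.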
Summary by Demographic Axis
All axes clear. Results are broken down by demographic axis. The mean differential score is the average difference between responses when only that demographic variable was changed. All axes remain well below the 7/10 flag threshold.
| Axis | Cases | Mean Diff | Max Diff | Median | Flagged |
|---|---|---|---|---|---|
| Location | 58 | 2.12 | 4 | 2 | 0 |
| Name & Ethnicity | 48 | 2.06 | 3 | 2 | 0 |
| Name & Gender | 39 | 1.95 | 3 | 2 | 0 |
| Age | 16 | 1.06 | 2 | 1 | 0 |
| Health Conditions | 18 | 1.00 | 1 | 1 | 0 |
| BMI | 20 | 1.00 | 1 | 1 | 0 |
| Gender | 24 | 1.00 | 1 | 1 | 0 |
| Diet Preference | 26 | 1.00 | 1 | 1 | 0 |
| Medication | 13 | 1.00 | 1 | 1 | 0 |
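As a consistency check, the headline figures can be recomputed from the rows of the table above. The sketch below copies the table values verbatim; the variable names are illustrative, and the case-weighted mean is only approximate because it is derived from per-axis means that are already rounded to two decimals.

```python
# (axis, cases, mean_diff, max_diff, median, flagged), copied from the table above.
ROWS = [
    ("Location", 58, 2.12, 4, 2, 0),
    ("Name & Ethnicity", 48, 2.06, 3, 2, 0),
    ("Name & Gender", 39, 1.95, 3, 2, 0),
    ("Age", 16, 1.06, 2, 1, 0),
    ("Health Conditions", 18, 1.00, 1, 1, 0),
    ("BMI", 20, 1.00, 1, 1, 0),
    ("Gender", 24, 1.00, 1, 1, 0),
    ("Diet Preference", 26, 1.00, 1, 1, 0),
    ("Medication", 13, 1.00, 1, 1, 0),
]

total_cases = sum(cases for _, cases, *_ in ROWS)
total_flagged = sum(row[5] for row in ROWS)
overall_max = max(row[3] for row in ROWS)

# Weight each axis mean by its case count; approximate, since the inputs are rounded.
weighted_mean = sum(row[1] * row[2] for row in ROWS) / total_cases

print(f"axes={len(ROWS)} cases={total_cases} flagged={total_flagged} "
      f"max={overall_max} weighted_mean={weighted_mean:.2f}")
```

The per-axis case counts sum to the 262 cases reported in the summary, the maximum differential is the 4 recorded for Location, and the case-weighted mean comes out at roughly 1.59 on the 1 to 10 scale.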
[Bar chart: Mean Differential Score by Axis (scale: 1–10, flag threshold: 7). The bars show the Mean Diff column of the table above to one decimal place.]