Final Review

Paper Nº 12

Co-Designing an Ethical Checklist for AI Delegability: Framework from the Workers’ Perspective
Score: 30/30
The most methodologically complete submission: four versions, 38 participants including a genuine eight-radiologist domain-expert phase, exemplary change traceability, and a use case that converts every item into concrete architectural requirements. It is weakened mainly by the absence of reported Likert results and of any decision or aggregation logic.
Radiology

The Pros

+ Largest and most diverse participant base (3 peers, 27 cross-sector survey respondents, 8 radiologists) with full demographic breakdown and figures.
+ Four clearly separated phases, each with stated participants, procedure, and outcome, and a strong final domain-expert validation that is shown to generate non-obvious requirements (item 2.6).
+ Outstanding traceability: Table 5 (V1 → V4) records the exact feedback that prompted each item change, satisfying the co-design transparency principle the paper itself cites.
+ Conceptually sharp framing (Type-1, "the checklist produces an architecture, not a binary answer") and a well-justified shift from binary to Likert items, grounded in Madaio et al.
+ Appendix A6 translates all V4 items into concrete technical/UX specifications for the triage system, making it the strongest "applied use case" of the six.

The Cons

The five-point Likert survey is central to the method, yet no quantitative results (means, SDs, distributions per item) are reported; the reader cannot see what the 27 respondents actually rated.
The instrument has no aggregation or decision rule: with Likert items and no scoring scheme, how a user converts ratings into a delegation verdict is left implicit, and the use case resolves the question only qualitatively.
Figure-caption inconsistencies (e.g., Figure 2 caption references "11 authors" and "two classroom workshops (2 and 1 participants)") create minor confusion about the V1 → V2 step.
The in-class workshop had only three participants (below the recommended six, as the authors acknowledge), and the survey sample skews male (70.4%) and young (48.1% aged 18–25).
V2 items are presented in Italian and then translated, which introduces small inconsistencies; the checklist is never deployment-tested (also acknowledged).
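
To make the first two cons concrete, the following is a minimal sketch of the kind of per-item reporting (mean, SD, distribution) and explicit aggregation rule the review finds missing. Everything in it is an assumption made for illustration: the item labels, the ratings, the `summarize` and `flag_items` helpers, and the 3.0 threshold are hypothetical and do not come from the paper or its survey.

```python
from collections import Counter
from statistics import mean, stdev

# Hypothetical 1-5 Likert ratings, invented for illustration only.
# These are NOT the responses of the paper's 27 survey participants.
ratings = {
    "item_A": [5, 4, 4, 5, 3, 4],
    "item_B": [2, 3, 1, 2, 4, 2],
}


def summarize(scores):
    """Return n, mean, sample SD, and the count of each rating 1..5."""
    counts = Counter(scores)
    return {
        "n": len(scores),
        "mean": round(mean(scores), 2),
        "sd": round(stdev(scores), 2),
        "distribution": [counts.get(k, 0) for k in range(1, 6)],
    }


def flag_items(all_ratings, threshold=3.0):
    """Hypothetical decision rule: flag items whose mean is below threshold.

    The threshold is an assumed value; the paper defines no such rule.
    """
    return [item for item, scores in all_ratings.items()
            if mean(scores) < threshold]


if __name__ == "__main__":
    for item, scores in ratings.items():
        print(item, summarize(scores))
    print("Flagged for further review:", flag_items(ratings))
```

A table of this shape per checklist item would let readers see what respondents actually rated, and any stated rule, even one as simple as a mean threshold, would make the path from ratings to a delegation verdict explicit rather than implicit.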
Final Review · Paper 12 · The Index · AI Checklists · 2026