Evaluating AI Output

AI tools produce confident, fluent output. That confidence is not evidence of accuracy. These resources give you routines that require students to compare before they conclude, making credibility evaluation a visible, taught process rather than a vague expectation.

Comparison before conclusion means students cannot treat the first clean answer as the final answer.

"Show me your comparison before your conclusion." That one instruction shifts the focus from product to process and makes integrity observable without playing detective.

Resources in this theme

[P+S] Primary & Secondary

Comparison Before Conclusion

Frame it consistently with learners: show me your comparison before your conclusion. When students know the marks are in the justification, not the polish, the classroom shifts from finishing fast to thinking longer.

Coming soon

[P+S] Primary & Secondary

Same Prompt, Two Outputs

Students generate two outputs from the same prompt, compare them, and justify their choice. Grade the reasoning, not the product. When this becomes normal, students learn that AI is not a single voice but a set of variable suggestions.

Coming soon

[P+S] Primary & Secondary

When Video Looks Real Enough

The comparison that matters is not whether the video is good. It is whether the storyboard claims are supported by the evidence. Require a storyboard, three written claims, and an evidence map before final production.

Coming soon

[P+S] Primary & Secondary

Automation Is Not Research

The Two Sources, One Claim routine makes credibility evaluation, rather than source-finding, the task. Students compare two sources for a single claim and justify which is stronger for this specific assessment purpose.

Coming soon

[P+S] Primary & Secondary

The Oversimplification Audit

Students label each statement in a list as always true, sometimes true, depends, or questionable. They then rewrite two of the "depends" items, stating the conditions under which each holds. This makes disciplinary thinking a taught and practised skill.

Coming soon

[S] Secondary

NCEA Integrity by Design

Authenticity rests on reasoning you can see and hear, not on tool policing. The Evidence Matchup Paragraph design gives students a debatable statement, two candidate claims, and a class-approved evidence hunt.

Coming soon

More resources coming

New Evaluating AI Output resources are added regularly. Subscribe to aiEDnz below to receive them as they are released.
