Evaluating AI Output

AI tools produce confident, fluent output. That confidence is not evidence of accuracy. These resources give you routines that require students to compare before they conclude, making credibility a visible, taught process rather than a vague expectation.

Comparison before conclusion means students cannot treat the first clean answer as the final answer.

"Show me your comparison before your conclusion." That one instruction shifts the focus from product to process and makes integrity observable without playing detective.

Resources in this theme

[P+S] Primary & Secondary

Comparison Before Conclusion

Frame it consistently with learners: show me your comparison before your conclusion. When students know the marks are in the justification, not the polish, the classroom shifts from finishing fast to thinking longer.

[P+S] Primary & Secondary

Same Prompt, Two Outputs

Students generate two outputs from the same prompt, compare them, and justify their choice. Grade the reasoning, not the product. When this becomes normal, students learn that AI is not a single voice but a set of variable suggestions.

[P+S] Primary & Secondary

When Video Looks Real Enough

The comparison that matters is not whether the video is good. It is whether the storyboard claims are supported by the evidence. Require a storyboard, three written claims, and an evidence map before final production.

[P+S] Primary & Secondary

Automation Is Not Research

The Two Sources, One Claim routine makes credibility evaluation the task, rather than source-finding. Students compare two sources for a single claim and justify which is stronger for this specific assessment purpose.

[P+S] Primary & Secondary

The Oversimplification Audit

Students label each statement in a list: always true, sometimes true, depends, or questionable. Then they rewrite two "depends" items with conditions. This treats disciplinary thinking as a taught and practised skill.

[S] Secondary

NCEA Integrity by Design

Authenticity rests on reasoning you can see and hear, not on tool policing. The Evidence Matchup Paragraph design gives students a debatable statement, two candidate claims, and a class-approved evidence hunt.

[P+S] Primary & Secondary · Social Sciences

Sorting information before it becomes a claim

Students sort a short information set into Keep, Cut, and Question before drafting begins. Makes the step from information to claim visible and evaluable in any social sciences inquiry, at any year level.

[Yrs 5–8] Primary · Mathematics and Statistics

Showing the reasoning, not just the answer

Two worked examples, one comparison card, one justified choice — all completed before the actual task begins. Makes method selection visible and assessable at Years 5–8.

[Yrs 1–3] Primary · English

When agreement gets in the way of learning

A new routine for early writers. Students find one weakness in their first idea and commit to a position before drafting begins. Builds the habit of pushing back on a first thought rather than accepting what arrived without effort.

[Yrs 9–10] Secondary · Technology

Evaluating options before committing to a design direction

Two candidates, three criteria, one justified choice — all collected before development begins. Makes design decision-making visible at the point where it is most likely to be bypassed.

[Yrs 1–3] Primary · Mathematics and Statistics

The Evaluation Gate for Mathematics — Years 1–3

One comparison, one reason, spoken aloud or drawn. Builds the habit of evaluating before committing from the earliest year levels in Maths.

[Yrs 11–13] Secondary · The Arts

Context Triage for The Arts — Years 11–13

Senior Arts students sort a curated resource set into Keep, Cut, and Question before any writing or performance work begins. Makes discipline-specific judgment about what serves the work visible and assessable.

[Yrs 7–8] Intermediate · English

Why choosing comes before writing

Two candidate thesis statements, one annotated comparison, one two-sentence commitment — all before drafting begins. Makes position selection the assessed act, not the finished essay.

[Yrs 7–8] Intermediate · Technology

Context Triage for Technology — Years 7–8

Students sort brief information into Keep, Cut, and Question before any design work begins. Puts the information evaluation step back into Technology design tasks at Years 7–8.

[Yrs 11–13] Secondary · Social Sciences

When the evidence points in different directions

Two candidate framings, three quality criteria, a scored comparison, and a four-sentence reasoning record — all collected before drafting begins. Makes evaluative reasoning visible and assessable at NCEA level when credible sources hold genuinely different positions.
