
How to build an interview scorecard that produces fair, comparable decisions

Without a scorecard, interviews collapse into gut feeling and whoever spoke last. A simple, criteria-based scorecard makes hiring decisions fairer, more defensible, and far easier to compare across candidates.

Finn Glas, Co-Founder + Engineering · March 6, 2026 · 3 min read

Key takeaways

Score against fixed criteria, not an overall vibe; vibe is where bias hides.
Derive the criteria from the intake's must-haves and success picture.
Interviewers score independently first, then compare; avoid groupthink.

Step by step

1. Take criteria from the intake: 4 to 6, mapped to must-haves + success picture.
2. Write a 1 to 4 scale with anchors: describe what each level looks like.
3. Score independently: each interviewer rates before any group talk.
4. Compare + decide on evidence: discuss the disagreements; decide from the scores.

1. Why a scorecard beats a gut verdict

A scorecard forces interviewers to rate a candidate against the specific things the role needs, instead of collapsing the whole interview into a single "yeah, I liked them". That single impression is exactly where bias and recency effects live: the confident talker, the person who reminds you of yourself, the strong finish that erases a weak middle. Scoring each criterion separately makes the decision legible, comparable across candidates, and defensible if it's ever questioned. In Germany it's also a core part of a bias-resistant process that complies with the AGG (Allgemeines Gleichbehandlungsgesetz, the General Equal Treatment Act).

Score evidence, not personality

Anchor every criterion to something the candidate did or demonstrated, not to traits like "culture fit" that quietly become "is like us". Evidence-based scoring is both fairer and more predictive, and it keeps the process defensible under the AGG.

2. Derive the criteria from the intake

Don't invent generic criteria. Take them straight from the must-haves and success picture you agreed on in the intake meeting; that's the whole point of having one. A scorecard with four to six criteria that actually map to the role (a specific skill, a relevant kind of experience, a way of working the team needs) beats a long list of vague traits like "communication" that everyone scores differently. Each criterion should be something you can gather real evidence for in the conversation, not a personality guess.

3. Use a scale with behavioural anchors

A 1-to-4 scale works well (an even number of levels quietly forces a lean rather than a safe middle). But numbers alone drift: your 3 isn't my 3. Add a short behavioural anchor to each level: what does a 2 look like versus a 4 on this criterion? Anchors turn a subjective number into a shared standard and make scores comparable across different interviewers. Capture a sentence of evidence next to each score ("rated 4: walked through a near-identical migration they led"), so the rating is grounded, not just asserted.
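If you keep scorecards in a spreadsheet export or a small internal tool, the anchored scale can be written down as data. The following is only a minimal sketch under assumptions: the class names `Criterion` and `Rating`, the field names, and the example anchor wording are all hypothetical and not taken from KI BMS or any other product. It shows one way to require an anchor for every level and a sentence of evidence for every score.

```python
from dataclasses import dataclass

# Assumed 1-to-4 scale: an even number of levels, so there is no neutral middle.
SCALE = (1, 2, 3, 4)


@dataclass(frozen=True)
class Criterion:
    """One scorecard criterion with a behavioural anchor per level (hypothetical model)."""
    name: str
    anchors: dict[int, str]  # level -> what that level looks like

    def __post_init__(self):
        # Every level needs a non-empty anchor, otherwise scores drift between interviewers.
        missing = [lvl for lvl in SCALE if not self.anchors.get(lvl, "").strip()]
        if missing:
            raise ValueError(f"{self.name}: missing anchors for levels {missing}")


@dataclass(frozen=True)
class Rating:
    """One interviewer's score for one criterion, grounded in a sentence of evidence."""
    criterion: Criterion
    score: int
    evidence: str

    def __post_init__(self):
        if self.score not in SCALE:
            raise ValueError(f"score must be one of {SCALE}, got {self.score}")
        if not self.evidence.strip():
            raise ValueError("a rating needs a sentence of evidence")


# Illustrative criterion; the anchor texts are invented for this example.
migration = Criterion(
    name="Database migration experience",
    anchors={
        1: "No hands-on migration work described",
        2: "Assisted on a migration led by someone else",
        3: "Led a migration of a different kind or scale",
        4: "Led a near-identical migration end to end",
    },
)

rating = Rating(migration, 4, "walked through a near-identical migration they led")
print(rating.score, "-", rating.criterion.anchors[rating.score])
```

Rejecting a score without evidence at creation time is a design choice in this sketch; a softer variant could merely warn, depending on how strict the team wants the process to be.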

4. Score independently, then compare

The order matters. Each interviewer fills in their scorecard BEFORE the group discusses, because the moment a senior voice says "I loved them" out loud, everyone's scores quietly converge on it. That's groupthink erasing the value of multiple perspectives. Scoring independently first and discussing second surfaces real disagreement, which is often the most useful signal you have. In KI BMS the scorecard lives on the application and feeds the candidate's fit picture, so the structured evaluation you designed is what actually drives the decision and stays on the record, rather than a hallway consensus nobody can reconstruct later.
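Once the independent scorecards are in, the comparison step can be sketched as a small script that surfaces where interviewers disagree most. This is an illustration only: the function name `criteria_to_discuss`, the input shape, and the threshold of a 2-point spread are assumptions made for the example, not a description of how KI BMS works.

```python
def criteria_to_discuss(scorecards, min_spread=2):
    """Return {criterion: (lowest, highest)} for criteria where independent
    scores differ by at least `min_spread` points (assumed threshold)."""
    # Collect every criterion that any interviewer scored.
    criteria = set()
    for scores in scorecards.values():
        criteria.update(scores)

    flagged = {}
    for criterion in sorted(criteria):
        values = [s[criterion] for s in scorecards.values() if criterion in s]
        low, high = min(values), max(values)
        if high - low >= min_spread:
            flagged[criterion] = (low, high)
    return flagged


# Scores collected independently, before any group discussion (invented data).
scorecards = {
    "interviewer_a": {"migration experience": 4, "incident handling": 2, "code review habits": 3},
    "interviewer_b": {"migration experience": 3, "incident handling": 4, "code review habits": 3},
    "interviewer_c": {"migration experience": 4, "incident handling": 1, "code review habits": 2},
}

print(criteria_to_discuss(scorecards))
# → {'incident handling': (1, 4)}
```

Here only "incident handling" is flagged, so the debrief can start with the evidence each interviewer recorded for that criterion instead of with an overall impression.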



Written by Finn Glas, Co-Founder + Engineering

Finn is one of the Co-Founders. He owns the engineering side, the infrastructure, and most of the late-night fixes that ship before anyone notices.

finn.glas at aicuflow dot com