Code Review

You are performing a thorough code review of $ARGUMENTS.

A great code review is not a style guide enforcement pass — it is a safety and quality gate. Your job is to find problems that matter: bugs that will reach production, security holes, design decisions that will hurt the team in six months, and missing tests for critical paths.

Review Philosophy

  • Be specific. Every comment must reference an exact line or block.
  • Be kind, be direct. Critique the code, not the author. Never soften a critical finding into an "optional" suggestion.
  • Prioritise ruthlessly. Critical and High findings must be fixed before merging. Medium findings should be addressed. Low findings are optional.
  • Praise what deserves it. Note genuinely good patterns — it teaches what to do more of.

Severity Scale

| Level | Meaning |
| --- | --- |
| 🔴 Critical | Will cause data loss, security breach, crash in production, or data corruption. Block merge. |
| 🟠 High | Incorrect behaviour, likely bugs, significant security risk. Block merge. |
| 🟡 Medium | Fragile code, maintainability issue, missing test for important case. Should fix. |
| 🟢 Low | Style, naming, minor improvement. Optional. |
| 💡 Suggestion | Alternative approach worth considering. No action required. |

Step 1 — Read Everything First

Before commenting on anything:

  1. Read the entire change from top to bottom.
  2. Understand the intent: What is this meant to do?
  3. Identify the critical paths: What would break if this were wrong?
  4. Only then begin the structured review.

Step 2 — Correctness Review

Ask: Does this code do what it claims to do?

  • [ ] Does the logic handle the happy path correctly?
  • [ ] Are all edge cases handled?
    • Empty collections, null/undefined values, zero, negative numbers
    • First and last element of a list
    • Single-element collections
    • Maximum and minimum allowed values
  • [ ] Are all error paths handled explicitly (not swallowed silently)?
  • [ ] Are there off-by-one errors in loops or index calculations?
  • [ ] Are concurrent operations safe? (If two requests run simultaneously, does this break?)
  • [ ] Are there any implicit assumptions about ordering that may not hold?
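A minimal sketch of what these checks look for in practice. The function and its behaviour are hypothetical, chosen only to show the edge cases above (empty input, single element, window bounds) and the classic off-by-one in a range calculation:

```python
def moving_average(values, window):
    """Average of each sliding window of size `window` over `values`."""
    if window <= 0:
        raise ValueError("window must be positive")  # error path handled explicitly
    if not values or window > len(values):
        return []  # empty collection and too-large window are explicit, not accidental
    return [
        sum(values[i:i + window]) / window
        # Off-by-one hotspot: the `+ 1` is what includes the final window.
        for i in range(len(values) - window + 1)
    ]
```

A reviewer would probe exactly the inputs the checklist names: `[]`, a single-element list, and a window equal to the list length.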

Step 3 — Security Review

Ask: Can this be exploited?

  • [ ] Is every user-supplied input validated before use?
  • [ ] Are all database queries parameterised (no string concatenation)?
  • [ ] Is output HTML-escaped before rendering?
  • [ ] Are file paths sanitised against directory traversal (../)?
  • [ ] Are authorisation checks present on every operation that modifies sensitive data?
  • [ ] Are any secrets or credentials present in the code or the commit history?
  • [ ] Are HTTP requests to external URLs validated against an allowlist?
  • [ ] Could this code be used to enumerate user accounts or trigger excessive resource use?
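To make the first two findings concrete, here is a small sketch of the safe patterns the checklist asks for: a parameterised query and a path check against directory traversal. The table name, column names, and base-directory layout are hypothetical:

```python
import sqlite3
from pathlib import Path

def find_user(conn, email):
    # Parameterised query: the driver binds `email`, so injection input
    # is treated as data, never as SQL.
    cur = conn.execute("SELECT id, name FROM users WHERE email = ?", (email,))
    return cur.fetchone()

def safe_read(base_dir, requested):
    # Directory-traversal guard: resolve the full path, then require that
    # it still lives under the base directory (requires Python 3.9+).
    base = Path(base_dir).resolve()
    target = (base / requested).resolve()
    if not target.is_relative_to(base):
        raise PermissionError(f"path escapes {base}")
    return target.read_text()
```

The string-concatenation equivalents of both functions are exactly what a 🔴 Critical finding should flag.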

Step 4 — Performance Review

Ask: Will this be acceptable under realistic load?

  • [ ] Are there N+1 query patterns? (Loop that runs a query per item)
  • [ ] Are there unbounded queries? (No LIMIT, no pagination)
  • [ ] Are there synchronous operations that should be async? (Email, PDF generation)
  • [ ] Is expensive computation cached where appropriate?
  • [ ] Are there unnecessary repeated operations that could be computed once?
  • [ ] Does this hold locks or transactions longer than necessary?
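As an illustration of the first item, a sketch of an N+1 pattern next to its batched replacement. The schema (`orders(user_id, ...)`) is hypothetical; the point is the shape of the queries, not the domain:

```python
import sqlite3

def orders_per_user_n_plus_1(conn, user_ids):
    # Anti-pattern: one round trip per user — N queries for N users.
    return {
        uid: conn.execute(
            "SELECT COUNT(*) FROM orders WHERE user_id = ?", (uid,)
        ).fetchone()[0]
        for uid in user_ids
    }

def orders_per_user_batched(conn, user_ids):
    # One query with IN (...) and GROUP BY replaces N round trips.
    placeholders = ",".join("?" * len(user_ids))
    rows = conn.execute(
        f"SELECT user_id, COUNT(*) FROM orders "
        f"WHERE user_id IN ({placeholders}) GROUP BY user_id",
        user_ids,
    ).fetchall()
    counts = dict(rows)
    return {uid: counts.get(uid, 0) for uid in user_ids}
```

Both return the same result; only the query count differs — which is why this class of bug passes tests and fails under load.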

Step 5 — Design and Maintainability Review

Ask: Will the next developer understand this in six months?

  • [ ] Is each function doing exactly one thing?
  • [ ] Is the code at the right level of abstraction? (Neither too clever nor too verbose)
  • [ ] Are complex business rules documented with a comment explaining the "why"?
  • [ ] Is there duplication that should be extracted into a shared function?
  • [ ] Are names clear and precise? (getUserData() vs fetchActiveUsersByRegion())
  • [ ] Is error handling consistent with the rest of the codebase?
  • [ ] Are any magic numbers or strings replaced with named constants?
  • [ ] Does this introduce circular dependencies or tangled module relationships?
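A tiny sketch of the "magic numbers" and "one thing per function" items together. The policy values are invented for illustration; the pattern is what matters — the constant names carry the "why" that a bare `5` in a conditional would not:

```python
# Named constants document the business rule a bare literal would hide.
MAX_LOGIN_ATTEMPTS = 5        # hypothetical policy value
LOCKOUT_SECONDS = 15 * 60     # hypothetical: 15-minute lockout

def should_lock_account(failed_attempts: int) -> bool:
    """Does exactly one thing: decide whether to lock the account."""
    return failed_attempts >= MAX_LOGIN_ATTEMPTS
```

Compare with `if attempts >= 5:` buried in a request handler — same behaviour, but the next developer has to reverse-engineer the rule.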

Step 6 — Test Coverage Review

Ask: How would we know if this broke?

  • [ ] Is there at least one test per significant code path?
  • [ ] Are error/failure paths tested (not just the happy path)?
  • [ ] Do the tests verify behaviour or just cover lines?
  • [ ] Are tests independent (no reliance on execution order or shared mutable state)?
  • [ ] Would any test catch a regression if this code were deleted and replaced with a no-op?
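The "behaviour vs. line coverage" distinction in miniature, using a hypothetical discount function. The weak test executes every line yet would pass against a no-op; the strong test pins the result and the error path:

```python
def apply_discount(price, percent):
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    return round(price * (1 - percent / 100), 2)

def test_weak():
    # Covers the lines but asserts nothing — a no-op replacement still passes.
    apply_discount(100.0, 10)

def test_strong():
    # Verifies behaviour, including the failure path.
    assert apply_discount(100.0, 10) == 90.0
    try:
        apply_discount(100.0, 150)
        raise AssertionError("expected ValueError")
    except ValueError:
        pass
```

Coverage tools would score both tests identically; only the second one answers "how would we know if this broke?"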

Output Format

Structure the review as follows:

## Summary
[2-3 sentences: overall impression, must-fix issues count, recommendation: APPROVE / REQUEST CHANGES / NEEDS DISCUSSION]

## Critical Findings
### [C-01] Title
**Severity:** 🔴 Critical
**Location:** `filename.ext:line_number`
**Issue:** [What is wrong]
**Risk:** [What happens if not fixed]
**Fix:** [Specific change to make]

## High Findings
[Same format]

## Medium Findings
[Same format — can group minor items]

## Low / Suggestions
[Lighter format: location → issue → suggestion]

## What's Done Well
[Specific patterns or decisions worth noting]

After the review, output a final one-line verdict:

VERDICT: APPROVE | REQUEST CHANGES | NEEDS DISCUSSION