Step 1 · Model and Design

Goal

Produce a reviewed, documented data model that correctly represents the domain, handles edge cases, and is safe to evolve — before writing a single migration file.

Instructions

You are in workflow step 1 of the data-cycle. A new data structure needs to be designed, or an existing one needs to be changed. Your job is to think through the design completely and get it reviewed before writing code.


Tasks to Perform

1. Understand the Domain

Before touching the database, answer these:

  • What are the core entities? (the nouns: User, Order, Product, Invoice…)
  • What are the relationships? (User places Order; Order contains Products)
  • What is the cardinality? (one-to-many, many-to-many)
  • What are the constraints? (status can only be 'active' or 'cancelled'; amount must be positive)
  • What queries will this schema need to answer? (list orders by user, search orders by status and date range)
  • How will this data change over time? What's the most likely future addition?

Read the existing data model first:

# Find existing schema files
find . -name "schema.rb" -o -name "schema.prisma" \
       -o -name "*.sql" -o -name "migrations" -type d \
  | grep -v node_modules | grep -v vendor | head -20

# Check current table structure (PostgreSQL)
psql $DATABASE_URL -c "\dt"
psql $DATABASE_URL -c "\d orders"   # describe specific table

# Check existing models (framework-specific)
find . -path "*/models/*.{php,py,rb,ts}" | grep -v node_modules | grep -v vendor | head -20

2. Draft the Schema

Write out the proposed schema in SQL (even if you will use an ORM/migration library):

-- New table: customer_addresses
CREATE TABLE customer_addresses (
  id          BIGINT GENERATED BY DEFAULT AS IDENTITY PRIMARY KEY,

  -- Foreign key — always indexed, always has ON DELETE behaviour
  customer_id BIGINT NOT NULL REFERENCES customers(id) ON DELETE CASCADE,

  -- Use precise types — not TEXT for everything
  line_1      VARCHAR(255) NOT NULL,
  line_2      VARCHAR(255),
  city        VARCHAR(100) NOT NULL,
  country     CHAR(2)      NOT NULL,      -- ISO 3166-1 alpha-2
  postcode    VARCHAR(20)  NOT NULL,

  -- Money in DECIMAL, not FLOAT
  -- Status in enum/check, not free text
  is_default  BOOLEAN      NOT NULL DEFAULT FALSE,

  -- Audit columns on every mutable table
  created_at  TIMESTAMPTZ  NOT NULL DEFAULT NOW(),
  updated_at  TIMESTAMPTZ  NOT NULL DEFAULT NOW()
);

-- Index all FKs
CREATE INDEX idx_customer_addresses_customer_id ON customer_addresses(customer_id);

-- Unique constraint: one default address per customer
CREATE UNIQUE INDEX idx_customer_addresses_default
  ON customer_addresses(customer_id)
  WHERE is_default = TRUE;

3. Review Against the Design Checklist

For each new or changed table:

  • [ ] Surrogate primary key (not a natural key as PK)
  • [ ] All foreign keys have an index and an explicit ON DELETE action
  • [ ] No money stored as FLOAT — use DECIMAL(19,4) or integer cents
  • [ ] No comma-separated values in a column (violates 1NF)
  • [ ] No status/type as free text — use VARCHAR with CHECK or native ENUM
  • [ ] Timestamps are TIMESTAMPTZ (not timezone-naive TIMESTAMP)
  • [ ] created_at and updated_at on every mutable table
  • [ ] Columns that will be filtered or sorted frequently have indexes
  • [ ] Unique constraints defined where business rules require uniqueness

4. Draw or Describe the ERD

For any design involving ≥ 3 tables, produce an entity-relationship description:

customers ──< customer_addresses (many addresses per customer)
customers ──< orders             (many orders per customer)
orders    ──< order_items        (many items per order)
order_items >── products         (each item references one product)

Save this in docs/data-models/[domain-name].md.

5. Get a Second Opinion

Before writing migration files:

  • Share the SQL with a teammate or use the data-engineer agent for a review
  • Specifically ask: "What query patterns does this not support? What will hurt when the table reaches 10 million rows?"

Exit Criteria

  • [ ] Domain entities and relationships clearly documented
  • [ ] Schema written in SQL and reviewed against the design checklist
  • [ ] All constraints, indexes, and foreign keys defined
  • [ ] ERD or relationship description saved to docs/data-models/
  • [ ] Design approved (second review done)

Next Step

→ Proceed to Step 2 · Migrate and Apply