# Step 3 · Validate and Monitor

## Goal

Confirm the data is correct after migration, establish ongoing data quality monitoring, and close the loop on the data change.

## Instructions

You are in workflow step 3 of the data-cycle. The migration has been applied. Your job is to verify correctness, catch any data quality issues, and set up signals to detect future drift.

## Tasks to Perform

### 1. Post-Migration Validation

Run these checks immediately after applying the migration:
```sql
-- Row counts: do new tables have the expected number of rows?
SELECT COUNT(*) FROM customer_addresses;

-- Did backfills complete? (check for NULL values that should be populated)
SELECT COUNT(*) FROM orders WHERE customer_id IS NULL;
-- Expected: 0

-- Verify foreign key integrity (no orphaned rows)
SELECT COUNT(*) FROM order_items oi
LEFT JOIN orders o ON o.id = oi.order_id
WHERE o.id IS NULL;
-- Expected: 0

-- Verify index creation succeeded
SELECT indexname, indexdef
FROM pg_indexes
WHERE tablename = 'customer_addresses';

-- List the table's constraints (confirm the expected PK/FK/CHECK constraints exist)
SELECT conname, contype
FROM pg_constraint
WHERE conrelid = 'customer_addresses'::regclass;
```
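The count-based checks above can be bundled into a single pass/fail script — a minimal sketch, assuming `psql` is available and `DATABASE_URL` points at the migrated database (the `assert_zero` helper and the script itself are illustrative, not an existing project command):

```bash
#!/usr/bin/env bash
# Illustrative post-migration check wrapper. Table names match the examples
# above; DATABASE_URL and the helper name are assumptions.
set -u

fail_count=0

# assert_zero NAME COUNT — a check passes only when COUNT is exactly 0
assert_zero() {
  local name="$1" count="$2"
  if [ "$count" -ne 0 ]; then
    echo "FAIL: $name ($count violations)"
    fail_count=$((fail_count + 1))
  else
    echo "ok:   $name"
  fi
}

# Only hit the database when one is actually configured
if command -v psql >/dev/null 2>&1 && [ -n "${DATABASE_URL:-}" ]; then
  assert_zero "orders_null_customer" \
    "$(psql "$DATABASE_URL" -tAc 'SELECT COUNT(*) FROM orders WHERE customer_id IS NULL')"
  assert_zero "orphaned_order_items" \
    "$(psql "$DATABASE_URL" -tAc 'SELECT COUNT(*) FROM order_items oi LEFT JOIN orders o ON o.id = oi.order_id WHERE o.id IS NULL')"
  echo "checks failed: $fail_count"
fi
```

Wiring this into CI or a deploy hook makes the validation repeatable instead of a one-off copy-paste session.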
### 2. Application-Level Smoke Tests

Run these after the migration, against the running application (not raw SQL):

```bash
# Health check
curl -sf https://yourapp.com/health | python3 -m json.tool

# Test the specific feature that touches the new data
# (use Postman, HTTPie, or your existing integration test suite)

# For a new table
curl -X POST https://yourapp.com/api/customers/123/addresses \
  -H "Content-Type: application/json" \
  -d '{"line_1": "1 Test St", "city": "London", "country": "GB", "postcode": "EC1A 1BB"}'

curl https://yourapp.com/api/customers/123/addresses
```
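A small wrapper can turn the curl calls above into pass/fail assertions on HTTP status codes — a sketch, where `BASE_URL` and the `expect_status` helper are hypothetical stand-ins for your environment and test harness:

```bash
#!/usr/bin/env bash
# Illustrative smoke-test wrapper. BASE_URL and the /api/customers endpoints
# are assumptions carried over from the curl examples above.

# expect_status EXPECTED URL [extra curl args...] — compare HTTP status codes
expect_status() {
  local expected="$1" url="$2"
  shift 2
  local got
  got="$(curl -s -o /dev/null -w '%{http_code}' "$@" "$url")"
  if [ "$got" = "$expected" ]; then
    echo "ok:   $url -> $got"
  else
    echo "FAIL: $url expected $expected, got $got"
  fi
}

# Run only when a target environment is configured
if [ -n "${BASE_URL:-}" ]; then
  expect_status 200 "$BASE_URL/health"
  expect_status 201 "$BASE_URL/api/customers/123/addresses" \
    -X POST -H "Content-Type: application/json" \
    -d '{"line_1": "1 Test St", "city": "London", "country": "GB", "postcode": "EC1A 1BB"}'
  expect_status 200 "$BASE_URL/api/customers/123/addresses"
fi
```

Whether the POST returns 200 or 201 depends on the API; adjust the expected codes to match yours.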
### 3. Check for Query Performance Regressions

After schema changes, slow queries sometimes appear immediately:

```sql
-- PostgreSQL: check for new slow query patterns
-- (requires the pg_stat_statements extension)
SELECT
  left(query, 100) AS query,
  calls,
  round(mean_exec_time) AS avg_ms,
  round(total_exec_time / 1000) AS total_sec
FROM pg_stat_statements
WHERE mean_exec_time > 100  -- queries averaging > 100 ms
ORDER BY mean_exec_time DESC
LIMIT 20;

-- Check if new indexes are being used
-- (note: pg_stat_user_indexes exposes relname/indexrelname, not tablename/indexname)
SELECT schemaname, relname, indexrelname, idx_scan, idx_tup_read
FROM pg_stat_user_indexes
WHERE relname = 'customer_addresses'
ORDER BY idx_scan DESC;
```

If a query is unexpectedly slow, run `EXPLAIN ANALYZE` on it and check for missing indexes.
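For example, a plan for a lookup against the new table might be inspected like this (a sketch — the `customer_id` column on `customer_addresses` is an assumption from the earlier examples):

```sql
EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM customer_addresses WHERE customer_id = 123;
-- Look for "Index Scan using ..." in the plan output; a "Seq Scan"
-- on a large table suggests a missing or unused index.
```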
### 4. Set Up Data Quality Monitoring

For any critical table, set up ongoing assertion-based monitoring:

```sql
-- Example: create a monitoring view or schedule these as health check queries

-- Check: orders without a customer
SELECT 'orders_without_customer' AS check_name, COUNT(*) AS violations
FROM orders WHERE customer_id IS NULL;

-- Check: negative amounts (should never happen)
SELECT 'negative_order_amounts' AS check_name, COUNT(*) AS violations
FROM orders WHERE amount_cents < 0;

-- Check: records created per day over the last two weeks
-- (anomaly detection: compare today against the same weekday last week)
SELECT
  DATE_TRUNC('day', created_at) AS day,
  COUNT(*) AS record_count
FROM orders
WHERE created_at > NOW() - INTERVAL '14 days'
GROUP BY 1
ORDER BY 1;
```
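The individual assertions above can also be combined into a single query that returns one row per check, which is easier to schedule and alert on — a sketch using the same tables and checks:

```sql
SELECT 'orders_without_customer' AS check_name, COUNT(*) AS violations
FROM orders WHERE customer_id IS NULL
UNION ALL
SELECT 'negative_order_amounts', COUNT(*)
FROM orders WHERE amount_cents < 0;
-- Alert whenever any row has violations > 0.
```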
Add a monitoring task to the project:

```bash
# Data quality checks — run daily via cron or monitoring script
php artisan data:quality-check   # (create this command if it doesn't exist)
# or schedule the SQL assertions via your APM / alerting tool
```
### 5. Update Documentation

```bash
# Regenerate schema snapshot
pg_dump --schema-only "$DATABASE_URL" > docs/data-models/schema.sql

# Update the ERD / relationship description in docs/
# docs/data-models/[domain-name].md
```

Add the new model to the data catalog if the project maintains one.
### 6. Close the Cycle

Update TODO.md to mark all data tasks complete:
- [x] Model customer_addresses schema _(ref: workflows/data-cycle/01-model-and-design.md)_
- [x] Write and apply migration _(ref: workflows/data-cycle/02-migrate-and-apply.md)_
- [x] Validate data and monitor _(ref: workflows/data-cycle/03-validate-and-monitor.md)_
## Exit Criteria
- [ ] Post-migration SQL validations all pass (row counts, nulls, FK integrity)
- [ ] Application smoke tests pass against production
- [ ] No new slow queries introduced (checked via pg_stat_statements or APM)
- [ ] Data quality assertions documented and scheduled
- [ ] Schema documentation / ERD updated
- [ ] All TODO items for this data change marked done
← Return to Step 1 · Model and Design for the next data change.