TB-17 — Archive Document Backfill
Summary
Parse the 12,495 files already linked into /data/projects/ (from TB-08) and backfill project data in Supabase. The work is split into four tiers: Tier 1, permits and plans (438 files, Vision API); Tier 2, certs (918 files); Tier 3, sifting the “Other” bucket (2,852 files); Tier 4, EXIF photo dates (8,264 files, no API cost). Every tier produces a review manifest first, and nothing is written until that manifest has been reviewed.
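The manifest-before-write gate can be sketched roughly as follows. This is a minimal illustration, not the pipeline's actual code: the `ManifestEntry` fields, the JSON Lines layout, the `approved` flag, and the example paths are all assumptions made for the sketch. Only the tier names and file counts come from the summary above.

```python
import json
from dataclasses import dataclass, asdict

# Tier labels and file counts as listed in the summary.
# As listed, the four counts sum to 12,472.
TIERS = {
    1: ("permits+plans", 438),
    2: ("certs", 918),
    3: ("other-bucket sift", 2852),
    4: ("EXIF photo dates", 8264),
}


@dataclass
class ManifestEntry:
    """One proposed change awaiting review (hypothetical shape)."""
    tier: int
    file_path: str
    field: str            # hypothetical target field, e.g. "inspection_date"
    proposed_value: str
    approved: bool = False  # set to True by a reviewer, never by the parser


def render_manifest(entries):
    """Serialize proposed changes as JSON Lines, one entry per line."""
    return "\n".join(json.dumps(asdict(e)) for e in entries)


def approved_rows(manifest_text):
    """Return only the rows a reviewer has approved; the rest are never written."""
    rows = [json.loads(line) for line in manifest_text.splitlines() if line.strip()]
    return [row for row in rows if row["approved"]]


# Example with made-up paths and values.
entries = [
    ManifestEntry(3, "/data/projects/EXAMPLE-001/OTHER/scan_0042.pdf",
                  "category", "PERMITS"),
    ManifestEntry(4, "/data/projects/EXAMPLE-001/PHOTOS/img_0007.jpg",
                  "inspection_date", "2019-06-14", approved=True),
]
manifest = render_manifest(entries)
to_write = approved_rows(manifest)  # only the approved Tier 4 row survives
```

Keeping the approval flag in the manifest itself means the write step only ever sees rows that passed review, whichever tier produced them.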
What it produced
- Backfilled permit numbers, jurisdictions, code cycles, SOW lines, and inspection dates across historical projects
- Reclassified files from OTHER/ into correct category folders
- Extracted EXIF dates from photos to confirm inspection dates
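The EXIF confirmation step can be illustrated with a small sketch. It assumes the raw date string has already been read from the photo (for example the DateTimeOriginal tag, via whatever image library the pipeline uses) and only shows the parsing and comparison. The function names and the `tolerance_days` parameter are hypothetical. The `YYYY:MM:DD HH:MM:SS` layout is the one the EXIF standard uses for date/time tags.

```python
from datetime import date, datetime, timedelta
from typing import Optional

# EXIF date/time tags use colons in the date part: "YYYY:MM:DD HH:MM:SS".
EXIF_DATETIME_FORMAT = "%Y:%m:%d %H:%M:%S"


def parse_exif_datetime(raw: Optional[str]) -> Optional[datetime]:
    """Parse an EXIF date string; return None if it is missing or invalid."""
    if not raw:
        return None
    # Some cameras pad the value with NUL bytes or whitespace.
    cleaned = raw.strip("\x00 \t\r\n")
    try:
        return datetime.strptime(cleaned, EXIF_DATETIME_FORMAT)
    except ValueError:
        # Covers placeholder values such as "0000:00:00 00:00:00".
        return None


def confirms_inspection_date(
    exif_raw: Optional[str],
    inspection_date: date,
    tolerance_days: int = 0,  # hypothetical knob; 0 means same calendar day
) -> bool:
    """True if the photo was taken within tolerance_days of the inspection date."""
    taken = parse_exif_datetime(exif_raw)
    if taken is None:
        return False
    return abs(taken.date() - inspection_date) <= timedelta(days=tolerance_days)
```

A photo without a usable EXIF date returns `False` rather than raising, so it simply fails to confirm the date and can be flagged for review.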
Connections
- depends on: TB-08b-historical-file-linkage — files already linked into /data/projects/
- depends on: Supabase — backfill target for project data enrichment
- depends on: Anthropic — Claude Vision for document parsing
- produced: enriched historical project data in Supabase
- feeds into: TB-18-permit-package-consolidation — parsing results inform merge strategy