Schema reference, field descriptions, and sourcing notes for every dataset in this project.
All data lives in two canonical CSV files: data/programs.csv and data/organizations.csv. A Python pipeline validates and joins them into data.js, the JavaScript module loaded by the app at runtime.
Pipeline order: edit CSVs → python scripts/validate_data.py (check for errors) → python scripts/build_data_js.py (regenerate data.js). Utility scripts like backfill_age_grade.py and fetch_descriptions.py run before validation to fill in missing fields.
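The two pipeline commands can be chained from Python. This is a minimal sketch rather than part of the project: `run_pipeline` and its injectable `runner` parameter are hypothetical names, and only the two script paths come from this document.

```python
import subprocess
import sys

# Script paths as listed in the pipeline order; run from the project root.
PIPELINE = [
    [sys.executable, "scripts/validate_data.py"],
    [sys.executable, "scripts/build_data_js.py"],
]


def run_pipeline(commands=PIPELINE, runner=subprocess.run):
    """Run each step in order and stop at the first non-zero exit code.

    `runner` is injectable (hypothetical design choice) so the ordering
    logic can be exercised without touching real files.
    """
    for cmd in commands:
        result = runner(cmd)
        if result.returncode != 0:
            return result.returncode
    return 0
```

Calling `sys.exit(run_pipeline())` from the project root would then propagate the first failing step's exit code to the shell.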
data/programs.csv is the unified program registry. Each row is one program (one session of one camp, or one afterschool program). The foreign key org_id must match a row in organizations.csv; validate_data.py checks this before the build.
| Field | Type | Description |
|---|---|---|
| program_id | string | Unique program identifier (kebab-case slug, e.g. bgc-burlington-camp-2026-1). Auto-generated; never rename. |
| org_id | string (FK) | Foreign key referencing organizations.csv. Must match an org_id exactly. Use the Admin dropdown; never type free text. |
| program_name | string | Full display name of the program or session. |
| program_type | enum | camp or afterschool. |
| program_year | integer | Program year (e.g. 2026). |
| description | text | Description of the program. May be auto-fetched via scripts/fetch_descriptions.py. |
| session_type | enum | day, residential, hybrid, or drop-in. |
| schedule_type | enum | seasonal, weekly, daily, or year-round. |
| grades_min | string | Minimum grade served (PK, K, 1–12). Backfilled from age_min by backfill_age_grade.py. |
| grades_max | string | Maximum grade served. |
| age_min | integer | Minimum age for eligibility. |
| age_max | integer | Maximum age for eligibility. |
| start_date | date (YYYY-MM-DD) | First day of the session. Used by the Calendar view. |
| end_date | date (YYYY-MM-DD) | Last day of the session. |
| days_of_week | string | Days the program runs (e.g., "Mon–Fri"). |
| start_time | string (HH:MM AM/PM) | Daily start time. Normalized by normalize_times.py. |
| end_time | string (HH:MM AM/PM) | Daily end time. |
| pre_after_care | string | Extended care availability (before/after hours). |
| cost_raw | string | Cost as listed by the provider (raw, unnormalized). |
| cost_per_week | numeric | Normalized weekly cost in dollars. 0 = free or unknown. |
| cost_notes | string | Sliding scale, sibling discounts, or other cost notes. |
| meals_provided | boolean | TRUE if meals are included. |
| transportation_provided | boolean | TRUE if transportation is available. |
| transportation_notes | string | Transportation details. |
| activities | comma-separated | Canonical activity tags from the allowed list (see validate_data.py). |
| site_address | string | Street address of the program site (if different from org address). Enriched by enrich_locations.py. |
| site_city | string | City/town where the program runs. Must match a town name in data/Vermont_Town_GEOID_RPC_County.geojson (use TOWNNAMEMC spelling, e.g. Burlington, Saint Albans City). Used for City filter, map, and auto-county lookup. |
| site_county | string | Vermont county. Filled automatically by infer_counties.py from the GeoJSON town-to-county map. Used for County filter. |
| registration_url | URL | Link to the program's registration or info page. |
| registration_opens | date (YYYY-MM-DD) | When general/public registration opens. Shown on program cards and detail modal. Leave blank for afterschool programs (enrollment is ongoing). |
| registration_opens_early | date (YYYY-MM-DD) | Optional earlier window for a priority group (returning campers, town residents, siblings, etc.). Must be on or before registration_opens. |
| registration_notes | text | Who qualifies for early registration, rolling/waitlist info, or any other registration context. Shown in the detail modal alongside the dates. Examples: "Early reg for returning campers", "Rolling admission", "Waitlist only as of Jan 15". |
| funding_source | string | Funding mechanism (e.g., "21st Century Community Learning Centers" for free afterschool programs). |
| confidence | enum | confirmed, likely, or inactive. Programs marked inactive are excluded from the public site. |
| verified_date | date (YYYY-MM-DD) | When the data was last verified against the source. |
| notes | text | Internal notes; not shown publicly. Use for data quality reminders, follow-up items, or anything that doesn't fit other fields. |
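Several of the constraints in this table (enum values, the YYYY-MM-DD date format, and early registration falling on or before general registration) can be checked row by row. The sketch below is illustrative and is not the actual validate_data.py: `check_program_row` is a hypothetical helper, and treating blank enum fields as acceptable is an assumption.

```python
from datetime import date

# Allowed values copied from the programs.csv field table.
ENUMS = {
    "program_type": {"camp", "afterschool"},
    "session_type": {"day", "residential", "hybrid", "drop-in"},
    "schedule_type": {"seasonal", "weekly", "daily", "year-round"},
    "confidence": {"confirmed", "likely", "inactive"},
}


def check_program_row(row):
    """Return a list of error strings for one programs.csv row (dict of strings)."""
    errors = []

    # Enum fields: blank is tolerated here (assumption), anything else must be listed.
    for field, allowed in ENUMS.items():
        value = row.get(field, "")
        if value and value not in allowed:
            errors.append(f"{field}: unexpected value {value!r}")

    # Registration dates must parse as ISO dates when present.
    parsed = {}
    for field in ("registration_opens_early", "registration_opens"):
        value = row.get(field, "")
        if value:
            try:
                parsed[field] = date.fromisoformat(value)
            except ValueError:
                errors.append(f"{field}: not a YYYY-MM-DD date: {value!r}")

    # The early window must be on or before general registration.
    if len(parsed) == 2 and parsed["registration_opens_early"] > parsed["registration_opens"]:
        errors.append("registration_opens_early is after registration_opens")
    return errors
```

A row that passes returns an empty list, so a caller can collect errors across all rows and exit non-zero if any list is non-empty.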
data/organizations.csv is the organization registry. Every org_id referenced in programs.csv must exist here. Organization contact info (phone, email, website) is merged into program records at build time and shown in the detail modal.
| Field | Type | Description |
|---|---|---|
| org_id | string (PK) | Unique kebab-case slug (e.g., bgc-burlington). Once set, never renamed; programs reference it as a foreign key. |
| org_name | string | Organization's full display name. |
| org_type | enum | nonprofit, municipal, school, private, university, or faith-based. |
| website | URL | Organization's main website. |
| phone | string | Main contact phone number. |
| email | string | Main contact email address. |
| street_address | string | Organization's primary office street address. |
| city | string | City or town. |
| county | string | Vermont county. |
| state | string | State abbreviation (VT). |
| zip | string | ZIP code. |
| financial_aid_available | boolean | TRUE if the organization offers any financial assistance. |
| financial_aid_notes | text | Details on sliding scale, subsidies, or scholarship processes. |
| registration_policy | string | Notes on registration approach (e.g., "First-come first-served"). |
| registration_opens | string | When registration typically opens each year. |
| confidence | enum | confirmed, likely, or inactive. |
| verified_date | date (YYYY-MM-DD) | When org data was last confirmed. |
| notes | text | Internal notes; not shown in the UI. |
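The foreign-key relationship and the build-time merge of contact info can be sketched as a dictionary lookup keyed on org_id. This is an assumption-laden illustration, not the project's build code: `join_programs` and the `org_`-prefixed output keys are hypothetical, and the contact values in the sample CSV text are placeholders.

```python
import csv
import io

CONTACT_FIELDS = ("phone", "email", "website")


def join_programs(programs, organizations):
    """Merge org contact fields into each program; collect unknown org_ids."""
    orgs = {org["org_id"]: org for org in organizations}
    merged, missing = [], []
    for program in programs:
        org = orgs.get(program["org_id"])
        if org is None:
            # Foreign key violation: report the program instead of joining it.
            missing.append(program["program_id"])
            continue
        record = dict(program)
        for field in CONTACT_FIELDS:
            record[f"org_{field}"] = org.get(field, "")  # hypothetical key names
        merged.append(record)
    return merged, missing


# Placeholder data; real rows come from data/organizations.csv and data/programs.csv.
orgs_csv = (
    "org_id,phone,email,website\n"
    "bgc-burlington,802-555-0100,info@example.org,https://example.org\n"
)
programs_csv = (
    "program_id,org_id\n"
    "bgc-burlington-camp-2026-1,bgc-burlington\n"
    "orphan-camp-2026-1,no-such-org\n"
)
merged, missing = join_programs(
    csv.DictReader(io.StringIO(programs_csv)),
    csv.DictReader(io.StringIO(orgs_csv)),
)
```

Here `missing` lists the program whose org_id has no match, which is the condition the validation step is described as blocking on.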
data/age_to_grade.csv is the lookup table used by scripts/backfill_age_grade.py to fill in missing age_min/age_max from grades_min/grades_max in programs.csv, and vice versa.
| Field | Type | Description |
|---|---|---|
| Start Age | integer | Lower bound of the age range for this grade. |
| End Age | integer | Upper bound of the age range for this grade. |
| Grade | string | Corresponding grade level (K, 1–12). |
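A bidirectional backfill over this lookup table might look like the following. It is a sketch under stated assumptions, not the real backfill_age_grade.py: the sample rows are invented for illustration, `load_rows`, `grades_for_age`, and `backfill` are hypothetical names, and choosing the lowest matching grade for a minimum age and the highest for a maximum age is an assumed tie-breaking rule.

```python
import csv
import io

# Illustrative rows only; the real values live in data/age_to_grade.csv.
SAMPLE = """Start Age,End Age,Grade
5,6,K
6,7,1
7,8,2
"""


def load_rows(text):
    """Parse lookup rows using the column names from the field table."""
    return [
        {"start": int(r["Start Age"]), "end": int(r["End Age"]), "grade": r["Grade"]}
        for r in csv.DictReader(io.StringIO(text))
    ]


def grades_for_age(age, rows):
    """All grades whose age range includes `age`, in table order."""
    return [r["grade"] for r in rows if r["start"] <= age <= r["end"]]


def backfill(program, rows):
    """Fill blank grade fields from ages, and blank age fields from grades."""
    out = dict(program)
    by_grade = {r["grade"]: r for r in rows}
    if not out.get("grades_min") and out.get("age_min"):
        matches = grades_for_age(int(out["age_min"]), rows)
        if matches:
            out["grades_min"] = matches[0]   # lowest matching grade (assumption)
    if not out.get("grades_max") and out.get("age_max"):
        matches = grades_for_age(int(out["age_max"]), rows)
        if matches:
            out["grades_max"] = matches[-1]  # highest matching grade (assumption)
    if not out.get("age_min") and out.get("grades_min") in by_grade:
        out["age_min"] = str(by_grade[out["grades_min"]]["start"])
    if not out.get("age_max") and out.get("grades_max") in by_grade:
        out["age_max"] = str(by_grade[out["grades_max"]]["end"])
    return out
```

Existing values are never overwritten; only blank fields are filled, which keeps hand-entered data authoritative.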
All scripts live in scripts/ and operate on CSVs in data/. Run from the project root.
validate_data.py validates programs.csv and organizations.csv for foreign key integrity, required fields, and enum values. Exit code 0 = pass; exit code 1 = errors that block the build. Run this before build_data_js.py.
build_data_js.py runs validate_data.py first (aborting on errors), then joins programs.csv and organizations.csv and writes data.js, the JavaScript module loaded by the app. Run this after editing any CSV.
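The final write step amounts to serializing the joined records as a JavaScript value. The exact shape of data.js is not specified here, so the sketch below assumes a single top-level `const` declaration; `render_data_js` and the `PROGRAMS` variable name are hypothetical.

```python
import json


def render_data_js(records, var_name="PROGRAMS"):
    """Serialize joined records as a JavaScript constant declaration.

    Assumes data.js exposes one top-level array; the real module layout
    may differ.
    """
    payload = json.dumps(records, indent=2, ensure_ascii=False)
    return f"const {var_name} = {payload};\n"
```

Writing the result with `pathlib.Path("data.js").write_text(text, encoding="utf-8")` would regenerate the file in one step.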
backfill_age_grade.py fills missing grades_min/grades_max from age_min/age_max (and vice versa) in programs.csv using data/age_to_grade.csv. Run it before build_data_js.py when adding programs with only age or only grade data.
enrich_locations.py geocodes and standardizes site_address in programs.csv by querying the OpenStreetMap Nominatim API. It fills in missing addresses and normalizes the format to "Street, City, VT ZIP".
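A Nominatim lookup has two parts: building the search request and reshaping the returned address. The sketch below covers both without sending any network traffic. It is not the actual enrich_locations.py; `build_search_url` and `format_address` are hypothetical helpers, the query parameters are those of Nominatim's public search endpoint, and the fallback order of city, town, then village is an assumption.

```python
from urllib.parse import urlencode

NOMINATIM_SEARCH = "https://nominatim.openstreetmap.org/search"


def build_search_url(site_address, site_city):
    """Build a search URL that asks for one structured address result."""
    params = {
        "q": f"{site_address}, {site_city}, VT",
        "format": "jsonv2",
        "addressdetails": 1,   # include the per-component `address` object
        "countrycodes": "us",
        "limit": 1,
    }
    return f"{NOMINATIM_SEARCH}?{urlencode(params)}"


def format_address(address):
    """Turn a Nominatim `address` object into "Street, City, VT ZIP"."""
    street = " ".join(
        part for part in (address.get("house_number"), address.get("road")) if part
    )
    # Smaller places are reported as town or village rather than city.
    city = address.get("city") or address.get("town") or address.get("village") or ""
    return f"{street}, {city}, VT {address.get('postcode', '')}".strip()
```

Any real client should also send an identifying User-Agent header and throttle its requests, as the public Nominatim service's usage policy asks.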
fetch_descriptions.py fetches program descriptions from registration_url for rows with a blank description in programs.csv. It extracts the <meta name="description"> tag or, failing that, the first paragraph from each URL, and saves progress every 25 rows.
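The extraction rule (meta description first, otherwise the first paragraph) can be illustrated with the standard library's html.parser. This is a sketch of the technique, not the script's actual implementation; `DescriptionParser` and `extract_description` are hypothetical names, and skipping empty paragraphs is an assumption.

```python
from html.parser import HTMLParser


class DescriptionParser(HTMLParser):
    """Collect the meta description and the first non-empty <p> text."""

    def __init__(self):
        super().__init__()
        self.meta_description = None
        self.first_paragraph = None
        self._in_p = False
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta" and (attrs.get("name") or "").lower() == "description":
            if self.meta_description is None:
                self.meta_description = (attrs.get("content") or "").strip()
        elif tag == "p" and self.first_paragraph is None:
            self._in_p = True
            self._buffer = []

    def handle_endtag(self, tag):
        if tag == "p" and self._in_p:
            self._in_p = False
            # Collapse runs of whitespace left behind by inline markup.
            text = " ".join("".join(self._buffer).split())
            if text:
                self.first_paragraph = text

    def handle_data(self, data):
        if self._in_p:
            self._buffer.append(data)


def extract_description(html):
    """Prefer the meta description; fall back to the first paragraph."""
    parser = DescriptionParser()
    parser.feed(html)
    return parser.meta_description or parser.first_paragraph or ""
```

The function returns an empty string when neither source is present, so a caller can leave the description field blank for manual follow-up.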
normalize_times.py standardizes start_time and end_time strings in programs.csv to 12-hour format (HH:MM AM/PM), handling inconsistent inputs like "9am", "09:00", and "9:00 AM".
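One way to normalize these inputs is a single regular expression plus a 24-hour fallback. The sketch below is not the actual normalize_times.py: `normalize_time` is a hypothetical helper, zero-padding the hour is an assumption about the HH:MM target, and reading suffix-less values such as "09:00" as 24-hour times is also an assumption.

```python
import re

# Hour, optional minutes, optional am/pm marker (with or without dots).
TIME_RE = re.compile(
    r"^\s*(\d{1,2})(?::(\d{2}))?\s*(am|pm|a\.m\.|p\.m\.)?\s*$",
    re.IGNORECASE,
)


def normalize_time(raw):
    """Return `raw` as "HH:MM AM/PM", or unchanged if it is not recognized."""
    match = TIME_RE.match(raw)
    if not match:
        return raw
    hour = int(match.group(1))
    minute = int(match.group(2) or 0)
    suffix = (match.group(3) or "").replace(".", "").upper()
    # Reject impossible values rather than guessing.
    if hour > 23 or minute > 59 or (suffix and not 1 <= hour <= 12):
        return raw
    if not suffix:
        # No AM/PM marker: interpret the value as a 24-hour clock time.
        suffix = "AM" if hour < 12 else "PM"
        hour = hour % 12 or 12
    return f"{hour:02d}:{minute:02d} {suffix}"
```

Unrecognized strings pass through untouched, which leaves them visible for manual cleanup instead of silently rewriting them.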
dev_server.py is the local development server. It serves static files and supports saving CSV edits from the Admin page via a POST /__save_csv endpoint. Run it with python scripts/dev_server.py, then open http://127.0.0.1:8000.