Data Documentation

Schema reference, field descriptions, and sourcing notes for every dataset in this project.

Overview

All data lives in two canonical CSV files: data/programs.csv and data/organizations.csv. A Python pipeline validates and joins them into data.js, the JavaScript module loaded by the app at runtime.

Pipeline order: edit CSVs → python scripts/validate_data.py (check for errors) → python scripts/build_data_js.py (regenerate data.js). Utility scripts such as backfill_age_grade.py and fetch_descriptions.py run before validation to fill in missing fields.
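
The pipeline order above can be sketched as a small runner that stops at the first failing step. This is an illustration only: run_pipeline and the PIPELINE list are hypothetical names, not part of the project, and the sketch assumes each script signals failure with a non-zero exit code.

```python
import subprocess
import sys

# Step order taken from the pipeline description; the list itself is illustrative.
PIPELINE = [
    [sys.executable, "scripts/validate_data.py"],
    [sys.executable, "scripts/build_data_js.py"],
]


def run_pipeline(steps=PIPELINE, runner=subprocess.run):
    """Run each step in order and return the first non-zero exit code, else 0."""
    for cmd in steps:
        result = runner(cmd)
        if result.returncode != 0:
            return result.returncode  # stop here so a failed check blocks the build
    return 0
```

Calling run_pipeline() from the project root would run both steps. The runner parameter exists only so the function can be exercised without the real scripts.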

Browse data/ on GitHub

Programs

data/programs.csv

Rows: 401 programs (391 camps, 10 afterschool)
Updated: Annually (spring for camps; fall for afterschool)
Source: Manual research (program websites, direct outreach, Vermont AOE grant data)

The unified program registry. Each row is one program (one session of one camp, or one afterschool program). The foreign key org_id must match a row in organizations.csv; validate_data.py checks this before building.
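
A minimal sketch of the foreign key check described above, assuming only the program_id and org_id columns. The function name find_orphan_programs is hypothetical; the actual logic in validate_data.py may differ.

```python
import csv


def find_orphan_programs(programs_lines, organizations_lines):
    """Return (program_id, org_id) pairs whose org_id has no match in organizations.

    Both arguments are iterables of CSV lines, such as open file objects.
    """
    org_ids = {row["org_id"] for row in csv.DictReader(organizations_lines)}
    return [
        (row["program_id"], row["org_id"])
        for row in csv.DictReader(programs_lines)
        if row["org_id"] not in org_ids  # exact match, as the schema requires
    ]
```

An empty result means every program points at a known organization.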

| Field | Type | Description |
| --- | --- | --- |
| program_id | string | Unique program identifier (kebab-case slug, e.g. bgc-burlington-camp-2026-1). Auto-generated; never rename. |
| org_id | string (FK) | Foreign key referencing organizations.csv. Must match an org_id exactly. Use the Admin dropdown; never type free text. |
| program_name | string | Full display name of the program or session. |
| program_type | enum | camp or afterschool. |
| program_year | integer | Program year (e.g. 2026). |
| description | text | Description of the program. May be auto-fetched via scripts/fetch_descriptions.py. |
| session_type | enum | day, residential, hybrid, or drop-in. |
| schedule_type | enum | seasonal, weekly, daily, or year-round. |
| grades_min | string | Minimum grade served (PK, K, 1–12). Backfilled from age_min by backfill_age_grade.py. |
| grades_max | string | Maximum grade served. |
| age_min | integer | Minimum age for eligibility. |
| age_max | integer | Maximum age for eligibility. |
| start_date | date (YYYY-MM-DD) | First day of the session. Used by the Calendar view. |
| end_date | date (YYYY-MM-DD) | Last day of the session. |
| days_of_week | string | Days the program runs (e.g., "Mon–Fri"). |
| start_time | string (HH:MM AM/PM) | Daily start time. Normalized by normalize_times.py. |
| end_time | string (HH:MM AM/PM) | Daily end time. |
| pre_after_care | string | Extended care availability (before/after hours). |
| cost_raw | string | Cost as listed by the provider (raw, unnormalized). |
| cost_per_week | numeric | Normalized weekly cost in dollars. 0 = free or unknown. |
| cost_notes | string | Sliding scale, sibling discounts, or other cost notes. |
| meals_provided | boolean | TRUE if meals are included. |
| transportation_provided | boolean | TRUE if transportation is available. |
| transportation_notes | string | Transportation details. |
| activities | comma-separated | Canonical activity tags from the allowed list (see validate_data.py). |
| site_address | string | Street address of the program site (if different from org address). Enriched by enrich_locations.py. |
| site_city | string | City/town where the program runs. Must match a town name in data/Vermont_Town_GEOID_RPC_County.geojson (use TOWNNAMEMC spelling, e.g. Burlington, Saint Albans City). Used for City filter, map, and auto-county lookup. |
| site_county | string | Vermont county. Filled automatically by infer_counties.py from the GeoJSON town→county map. Used for County filter. |
| registration_url | URL | Link to the program's registration or info page. |
| registration_opens | date (YYYY-MM-DD) | When general/public registration opens. Shown on program cards and detail modal. Leave blank for afterschool programs (enrollment is ongoing). |
| registration_opens_early | date (YYYY-MM-DD) | Optional earlier window for a priority group (returning campers, town residents, siblings, etc.). Must be on or before registration_opens. |
| registration_notes | text | Who qualifies for early registration, rolling/waitlist info, or any other registration context. Shown in the detail modal alongside the dates. Examples: "Early reg for returning campers", "Rolling admission", "Waitlist only as of Jan 15". |
| funding_source | string | Funding mechanism (e.g., "21st Century Community Learning Centers" for free afterschool programs). |
| confidence | enum | confirmed, likely, or inactive. Programs with inactive are excluded from the public site. |
| verified_date | date (YYYY-MM-DD) | When the data was last verified against the source. |
| notes | text | Internal notes; not shown publicly. Use for data quality reminders, follow-up items, or anything that doesn't fit other fields. |
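
Two of the rules in the field list lend themselves to a quick row-level check: enum fields must hold one of their listed values, and registration_opens_early must fall on or before registration_opens. The sketch below is an assumption about how such a check could look, not the code in validate_data.py. The function name check_program_row is hypothetical, and blank values are treated as allowed because the documentation does not say which fields are required.

```python
from datetime import date

# Allowed values copied from the programs.csv field descriptions.
ENUMS = {
    "program_type": {"camp", "afterschool"},
    "session_type": {"day", "residential", "hybrid", "drop-in"},
    "schedule_type": {"seasonal", "weekly", "daily", "year-round"},
    "confidence": {"confirmed", "likely", "inactive"},
}


def check_program_row(row):
    """Return a list of error strings for one programs.csv row (a dict)."""
    errors = []
    for field, allowed in ENUMS.items():
        value = row.get(field, "")
        if value and value not in allowed:  # assumption: blanks are not errors here
            errors.append(f"{field}: unexpected value {value!r}")

    early = row.get("registration_opens_early", "")
    general = row.get("registration_opens", "")
    if early and general:
        # Dates are documented as YYYY-MM-DD, which date.fromisoformat parses.
        if date.fromisoformat(early) > date.fromisoformat(general):
            errors.append("registration_opens_early is after registration_opens")
    return errors
```

A malformed date raises ValueError in this sketch; a fuller check would probably catch that and report it as an error instead.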

Organizations

data/organizations.csv

Rows: 148 organizations
Updated: As needed
Source: Compiled from program websites and direct outreach

The organization registry. Every org_id referenced in programs.csv must exist here. Organization contact info (phone, email, website) is merged into program records at build time and shown in the detail modal.

| Field | Type | Description |
| --- | --- | --- |
| org_id | string (PK) | Unique kebab-case slug (e.g., bgc-burlington). Once set, never renamed; programs reference it as a foreign key. |
| org_name | string | Organization's full display name. |
| org_type | enum | nonprofit, municipal, school, private, university, or faith-based. |
| website | URL | Organization's main website. |
| phone | string | Main contact phone number. |
| email | string | Main contact email address. |
| street_address | string | Organization's primary office street address. |
| city | string | City or town. |
| county | string | Vermont county. |
| state | string | State abbreviation (VT). |
| zip | string | ZIP code. |
| financial_aid_available | boolean | TRUE if the organization offers any financial assistance. |
| financial_aid_notes | text | Details on sliding scale, subsidies, or scholarship processes. |
| registration_policy | string | Notes on registration approach (e.g., "First-come first-served"). |
| registration_opens | string | When registration typically opens each year. |
| confidence | enum | confirmed, likely, or inactive. |
| verified_date | date (YYYY-MM-DD) | When org data was last confirmed. |
| notes | text | Internal notes; not shown in the UI. |

Age to Grade Mapping

data/age_to_grade.csv

Rows: 13
Updated: Rarely
Source: Vermont standard age-grade equivalencies

Lookup table used by scripts/backfill_age_grade.py to bidirectionally fill in missing age_min/age_max or grades_min/grades_max in programs.csv.

| Field | Type | Description |
| --- | --- | --- |
| Start Age | integer | Lower bound of the age range for this grade. |
| End Age | integer | Upper bound of the age range for this grade. |
| Grade | string | Corresponding grade level (K, 1–12). |

Data Pipeline Scripts

All scripts live in scripts/ and operate on CSVs in data/. Run from the project root.

scripts/validate_data.py

Validates programs.csv and organizations.csv for foreign key integrity, required fields, and enum values. Exit code 0 = pass; exit code 1 = errors that block the build. Run this before build_data_js.py.
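
The exit-code contract can be illustrated with a required-field check. The field list below is a hypothetical subset chosen for the example (the documentation does not say which fields are required), and both function names are illustrative.

```python
# Hypothetical subset; the real required-field list lives in validate_data.py.
REQUIRED_PROGRAM_FIELDS = ("program_id", "org_id", "program_name", "program_type")


def missing_required(rows, required=REQUIRED_PROGRAM_FIELDS):
    """Return one error string per blank required field."""
    errors = []
    # Start at 2 so reported numbers match the CSV file, where line 1 is the header.
    for line_no, row in enumerate(rows, start=2):
        for field in required:
            if not (row.get(field) or "").strip():
                errors.append(f"row {line_no}: missing {field}")
    return errors


def exit_code(errors):
    """Map collected errors to the documented codes: 0 = pass, 1 = build blocked."""
    return 1 if errors else 0
```

A script built this way would end with sys.exit(exit_code(errors)), so a shell or another script can decide whether to continue.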

scripts/build_data_js.py

Runs validate_data.py first (aborts on errors), then joins programs.csv + organizations.csv and writes data.js, the JavaScript module loaded by the app. Run this after editing any CSV.
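
A rough sketch of the join step, under several assumptions that the documentation does not confirm: that inactive programs are dropped at build time, that only phone, email, and website are copied from the organization, and that data.js assigns the records to a single constant. The names join_programs, render_data_js, and PROGRAMS are hypothetical.

```python
import json

# Contact fields the documentation says are merged into program records.
ORG_CONTACT_FIELDS = ("phone", "email", "website")


def join_programs(programs, organizations):
    """Merge organization contact info into each program record."""
    orgs = {org["org_id"]: org for org in organizations}
    joined = []
    for program in programs:
        if program.get("confidence") == "inactive":
            continue  # assumption: inactive programs are filtered out here
        org = orgs[program["org_id"]]  # validation should guarantee this key exists
        record = dict(program)
        for field in ORG_CONTACT_FIELDS:
            record[field] = org.get(field, "")
        joined.append(record)
    return joined


def render_data_js(records, variable="PROGRAMS"):
    """Serialize records as a JavaScript constant (the variable name is hypothetical)."""
    return f"const {variable} = {json.dumps(records, indent=2)};\n"
```

Writing the result is then one call such as Path("data.js").write_text(render_data_js(records), encoding="utf-8").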

scripts/backfill_age_grade.py

Fills missing grades_min/grades_max from age_min/age_max (and vice versa) in programs.csv using data/age_to_grade.csv. Run before build_data_js.py when adding programs with only age or only grade data.
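
The two-way fill can be sketched as follows. The lookup rows are illustrative sample values, not the contents of data/age_to_grade.csv, and the tie-breaking rule (lowest matching grade for the minimum, highest for the maximum) is an assumption made for this example. The function name backfill is hypothetical.

```python
# Illustrative rows only; the real values live in data/age_to_grade.csv.
SAMPLE_LOOKUP = [
    {"Start Age": 5, "End Age": 6, "Grade": "K"},
    {"Start Age": 6, "End Age": 7, "Grade": "1"},
    {"Start Age": 7, "End Age": 8, "Grade": "2"},
]


def backfill(row, lookup):
    """Fill blank grade fields from ages, and blank age fields from grades."""
    filled = dict(row)
    by_grade = {r["Grade"]: r for r in lookup}

    def grades_matching(age):
        return [r["Grade"] for r in lookup if r["Start Age"] <= age <= r["End Age"]]

    # Ages -> grades: lowest match for the minimum, highest for the maximum.
    if not filled.get("grades_min") and filled.get("age_min"):
        matches = grades_matching(int(filled["age_min"]))
        if matches:
            filled["grades_min"] = matches[0]
    if not filled.get("grades_max") and filled.get("age_max"):
        matches = grades_matching(int(filled["age_max"]))
        if matches:
            filled["grades_max"] = matches[-1]

    # Grades -> ages: widest age span covered by the grade range.
    if not filled.get("age_min") and filled.get("grades_min") in by_grade:
        filled["age_min"] = str(by_grade[filled["grades_min"]]["Start Age"])
    if not filled.get("age_max") and filled.get("grades_max") in by_grade:
        filled["age_max"] = str(by_grade[filled["grades_max"]]["End Age"])
    return filled
```

Existing values are never overwritten, so running the function twice gives the same result as running it once.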

scripts/enrich_locations.py

Geocodes and standardizes site_address in programs.csv by querying the OpenStreetMap Nominatim API. Fills in missing addresses and normalizes format to "Street, City, VT ZIP".
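
As a hedged illustration of the lookup and formatting steps, the sketch below builds a Nominatim search URL and formats the address object of a result. It makes no network call. The helper names are hypothetical, and which address keys the real script reads is an assumption; house_number, road, city, town, village, and postcode are keys Nominatim commonly returns when addressdetails=1 is set.

```python
from urllib.parse import urlencode

NOMINATIM_SEARCH = "https://nominatim.openstreetmap.org/search"


def build_search_url(query):
    """Build a search URL that requests one result with address details."""
    params = {"q": query, "format": "jsonv2", "addressdetails": 1, "limit": 1}
    return f"{NOMINATIM_SEARCH}?{urlencode(params)}"


def format_address(address):
    """Format a Nominatim address object as "Street, City, VT ZIP", or return None."""
    parts = (address.get("house_number"), address.get("road"))
    street = " ".join(p for p in parts if p)
    city = address.get("city") or address.get("town") or address.get("village") or ""
    postcode = address.get("postcode", "")
    if not (street and city and postcode):
        return None  # leave the row for manual review rather than guess
    return f"{street}, {city}, VT {postcode}"
```

The public Nominatim service asks clients to send an identifying User-Agent and to stay at or below one request per second, so a real run over a few hundred rows needs a delay between calls.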

scripts/fetch_descriptions.py

Fetches program descriptions from registration_url for rows with a blank description in programs.csv. Extracts the <meta name="description"> tag or first paragraph from each URL. Saves progress every 25 rows.
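
The extraction rule (meta description first, otherwise the first non-empty paragraph) can be sketched with the standard library's HTMLParser. This is one possible approach, not necessarily the parser the script uses; DescriptionParser and extract_description are hypothetical names, and fetching the page is left out.

```python
from html.parser import HTMLParser


class DescriptionParser(HTMLParser):
    """Collect the meta description and the first non-empty <p> of a page."""

    def __init__(self):
        super().__init__()
        self.meta_description = None
        self.first_paragraph = None
        self._in_p = False
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta" and (attrs.get("name") or "").lower() == "description":
            if self.meta_description is None:
                self.meta_description = (attrs.get("content") or "").strip() or None
        elif tag == "p" and self.first_paragraph is None:
            self._in_p = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_p:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "p" and self._in_p:
            self._in_p = False
            text = " ".join("".join(self._buffer).split())  # collapse whitespace
            if text:
                self.first_paragraph = text


def extract_description(html):
    """Return the meta description if present, else the first paragraph, else None."""
    parser = DescriptionParser()
    parser.feed(html)
    return parser.meta_description or parser.first_paragraph
```

Text inside inline tags such as links or bold runs is kept, because the parser keeps collecting data until the closing p tag.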

scripts/normalize_times.py

Standardizes start_time and end_time strings in programs.csv to 12-hour format (HH:MM AM/PM), handling inconsistent inputs such as "9am", "09:00", and "9:00 AM".
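
A minimal sketch of this kind of normalization, covering the three input styles mentioned above. Two choices here are assumptions rather than documented behavior: hours are zero-padded to two digits, and a value with no AM/PM marker is read as a 24-hour clock time. The function name normalize_time is hypothetical.

```python
import re

# Hour, optional :minutes, optional am/pm marker.
TIME_RE = re.compile(r"^\s*(\d{1,2})(?::(\d{2}))?\s*(am|pm)?\s*$", re.IGNORECASE)


def normalize_time(text):
    """Return the time as "HH:MM AM/PM", or None if the input is not recognized."""
    match = TIME_RE.match(text)
    if not match:
        return None
    hour = int(match.group(1))
    minute = int(match.group(2) or 0)
    meridiem = (match.group(3) or "").upper()
    if minute > 59:
        return None
    if meridiem:
        if not 1 <= hour <= 12:
            return None
    else:
        # Assumption: no marker means 24-hour time, and minutes must be present.
        if match.group(2) is None or hour > 23:
            return None
        meridiem = "AM" if hour < 12 else "PM"
        hour = hour % 12 or 12
    return f"{hour:02d}:{minute:02d} {meridiem}"
```

Returning None for unrecognized input lets the caller keep the original value and flag the row instead of writing a wrong time.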

scripts/dev_server.py

Local development server. Serves static files and supports saving CSV edits from the Admin page via a POST /__save_csv endpoint. Run with python scripts/dev_server.py then open http://127.0.0.1:8000.
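
For orientation, a static server with a save endpoint could be structured like the sketch below. The request format is an assumption: it expects a JSON body with filename and content keys, which the documentation does not specify. DevHandler, resolve_target, and main are hypothetical names, and the allow-list of file names is a safety choice made for this example.

```python
import json
from http.server import SimpleHTTPRequestHandler, ThreadingHTTPServer
from pathlib import Path

DATA_DIR = Path("data")
ALLOWED_FILES = {"programs.csv", "organizations.csv"}


def resolve_target(filename, data_dir=DATA_DIR):
    """Return the path to write, or None if the file name is not on the allow-list."""
    if filename not in ALLOWED_FILES:
        return None
    return data_dir / filename


class DevHandler(SimpleHTTPRequestHandler):
    """Serve static files via GET and accept CSV saves via POST /__save_csv."""

    def do_POST(self):
        if self.path != "/__save_csv":
            self.send_error(404)
            return
        length = int(self.headers.get("Content-Length", 0))
        try:
            payload = json.loads(self.rfile.read(length))
            target = resolve_target(payload["filename"])
            content = payload["content"]
        except (ValueError, KeyError, TypeError):
            self.send_error(400)
            return
        if target is None or not isinstance(content, str):
            self.send_error(400)
            return
        target.write_text(content, encoding="utf-8")
        self.send_response(204)  # saved, nothing to return
        self.end_headers()


def main(host="127.0.0.1", port=8000):
    """Start the server; blocks until interrupted."""
    ThreadingHTTPServer((host, port), DevHandler).serve_forever()
```

Binding to 127.0.0.1 keeps the write endpoint off the network, which matters because the handler has no authentication.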