Data Access

There are several ways to download or access OEPS data; choose the one that works best for you. You may also be interested in our code resources page, which has notebooks and tutorials.

Direct Download

Aggregated Data Packages

OEPS contains data for 300+ variables split across 50+ CSV files, at four geography levels (state, county, tract, and ZIP Code Tabulation Area, or ZCTA). The scope of this content can make data access complex, so we have created two consolidated data packages (DSuite2018 and DSuite2023) to make it as easy as possible to get started. Each package contains:

  • Four data CSVs, one for each geography level.
  • A data dictionary (MS Excel format) summarizing which data year is included for each variable.
  • A corresponding geometry file for each CSV (zipped shapefile).
  • Markdown-formatted metadata documents for each variable.
  • JSON schemas for all content, following Frictionless Data Package v1 specifications.
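
As a rough illustration of how the JSON schemas could be used, the sketch below parses a Frictionless-style descriptor and lists the fields declared for each resource. The descriptor is inlined and hypothetical: the package name, resource name, path, and the `ExampleVar` field are placeholders, and the actual layout of the OEPS packages may differ.

```python
import json

# Hypothetical descriptor, inlined so the example is self-contained.
# Only HEROP_ID is a real OEPS field name; everything else is a placeholder.
DESCRIPTOR = """
{
  "name": "example-package",
  "resources": [
    {
      "name": "county",
      "path": "county.csv",
      "schema": {
        "fields": [
          {"name": "HEROP_ID", "type": "string"},
          {"name": "ExampleVar", "type": "number"}
        ]
      }
    }
  ]
}
"""


def list_fields(descriptor_text):
    """Return {resource name: [(field name, field type), ...]}."""
    descriptor = json.loads(descriptor_text)
    fields_by_resource = {}
    for resource in descriptor.get("resources", []):
        fields = resource.get("schema", {}).get("fields", [])
        # Table Schema treats a missing "type" as "string".
        fields_by_resource[resource["name"]] = [
            (field["name"], field.get("type", "string")) for field in fields
        ]
    return fields_by_resource


for resource_name, fields in list_fields(DESCRIPTOR).items():
    print(resource_name, fields)
```

To try this against a real package, replace `DESCRIPTOR` with the text of the descriptor file shipped in the download.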

DSuite2018

The 2018 data package is centered on the 2018 ACS (American Community Survey). Where a variable is not available for that year, we have included values from within a few years of 2018. The year for each data point is listed in the data dictionary.

DSuite2023

The 2023 data package is centered on the 2023 ACS (American Community Survey). Where a variable is not available for that year, we have included values from within a few years of 2023. The year for each data point is listed in the data dictionary.

All data by year

Looking for historical data, or years outside of what is included in the data packages above? Use the individual CSVs listed below to find what you need. CSVs are grouped by geography (state, county, tract, ZCTA) and are generally consolidated into one file per year. However, some years are split across more than one CSV, for example where data from the same year must be joined to different geography files.

We have generated a data dictionary for each geography level that summarizes what variables and which years are available.

State

Data Year  File                      Join to
1980       state-1980.csv            geo-2010-state
1990       state-1990.csv            geo-2010-state
2000       state-2000.csv            geo-2010-state
2010       state-2010.csv            geo-2018-state
2010       state-providers-2010.csv  geo-2018-state
2013       state-2013.csv            geo-2018-state
2014       state-2014.csv            geo-2018-state
2015       state-2015.csv            geo-2018-state
2016       state-2016.csv            geo-2018-state
2017       state-2017.csv            geo-2018-state
2018       state-2018.csv            geo-2018-state
2019       state-2019.csv            geo-2018-state
2020       state-2020.csv            geo-2018-state
2021       state-2021.csv            geo-2018-state
2022       state-2022.csv            geo-2020-state
2023       state-2023.csv            geo-2020-state
2025       state-2025.csv            geo-2020-state

County

Data Year  File                       Join to
1980       county-1980.csv            geo-2010-county
1990       county-1990.csv            geo-2010-county
2000       county-2000.csv            geo-2010-county
2010       county-2010.csv            geo-2010-county
2010       county-providers-2010.csv  geo-2018-county
2014       county-2014.csv            geo-2018-county
2015       county-2015.csv            geo-2018-county
2016       county-2016.csv            geo-2018-county
2017       county-2017.csv            geo-2018-county
2018       county-2018.csv            geo-2018-county
2019       county-2019.csv            geo-2018-county
2020       county-2020.csv            geo-2018-county
2021       county-2021.csv            geo-2018-county
2022       county-2022.csv            geo-2020-county
2023       county-2023.csv            geo-2020-county
2025       county-2025.csv            geo-2020-county

Tract

Data Year  File                      Join to
1980       tract-1980.csv            geo-2010-tract
1990       tract-1990.csv            geo-2010-tract
2000       tract-2000.csv            geo-2010-tract
2010       tract-ruca-2010.csv       geo-2018-tract
2010       tract-2010.csv            geo-2018-tract
2010       tract-providers-2010.csv  geo-2018-tract
2014       tract-sdoh-2014.csv       geo-2018-tract
2014       tract-2014.csv            geo-2018-tract
2018       tract-2018.csv            geo-2018-tract
2019       tract-2019.csv            geo-2018-tract
2020       tract-2020.csv            geo-2018-tract
2021       tract-2021.csv            geo-2018-tract
2022       tract-2022.csv            geo-2020-tract
2023       tract-2023.csv            geo-2020-tract
2025       tract-2025.csv            geo-2020-tract

Zip Code Tabulation Area (ZCTA)

Data Year  File                Join to
2010       zcta-ruca-2010.csv  geo-2018-zcta
2018       zcta-2018.csv       geo-2018-zcta
2019       zcta-2019.csv       geo-2018-zcta
2020       zcta-2020.csv       geo-2018-zcta
2021       zcta-2021.csv       geo-2018-zcta
2022       zcta-2022.csv       geo-2020-zcta
2023       zcta-2023.csv       geo-2020-zcta
2025       zcta-2025.csv       geo-2020-zcta
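
If you script your downloads, the "Join to" column can be encoded as a simple lookup so that each CSV is always paired with the geometry file listed for it. The sketch below transcribes only the ZCTA table; the function name `join_target` is a hypothetical helper, not part of any OEPS tooling.

```python
# Transcribed from the ZCTA table above: CSV file name -> geometry file id.
ZCTA_JOIN_TARGETS = {
    "zcta-ruca-2010.csv": "geo-2018-zcta",
    "zcta-2018.csv": "geo-2018-zcta",
    "zcta-2019.csv": "geo-2018-zcta",
    "zcta-2020.csv": "geo-2018-zcta",
    "zcta-2021.csv": "geo-2018-zcta",
    "zcta-2022.csv": "geo-2020-zcta",
    "zcta-2023.csv": "geo-2020-zcta",
    "zcta-2025.csv": "geo-2020-zcta",
}


def join_target(csv_name):
    """Return the geometry file id listed for a ZCTA CSV (hypothetical helper)."""
    try:
        return ZCTA_JOIN_TARGETS[csv_name]
    except KeyError:
        # Fail loudly rather than guess a geometry vintage.
        raise ValueError(f"No join target listed for {csv_name!r}") from None


print(join_target("zcta-2022.csv"))
```

An explicit table is used instead of a rule based on the year because the tables contain exceptions, such as same-year files that join to different geometry vintages.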

Geography Files

For spatial analysis, OEPS CSVs must be joined to geographic data from the US Census Bureau's Cartographic Boundary files (500k scale). Direct download links are provided below in three formats: Shapefile, GeoJSON, and PMTiles. Make sure to pick the correct join file for your dataset, based on the tables above.

Id               Download links
geo-2010-county  shp, geojson, pmtiles
geo-2010-state   shp, geojson, pmtiles
geo-2010-tract   shp, geojson, pmtiles
geo-2018-county  shp, geojson, pmtiles
geo-2018-state   shp, geojson, pmtiles
geo-2018-tract   shp, geojson, pmtiles
geo-2018-zcta    shp, geojson, pmtiles
geo-2020-county  shp, geojson, pmtiles
geo-2020-state   shp, geojson, pmtiles
geo-2020-tract   shp, geojson, pmtiles
geo-2020-zcta    shp, geojson, pmtiles

You can learn more about how we prepare and generate these geography files at geodata.healthyregions.org.

Tips for joining to geography files:

  • Make sure to link each CSV with the proper geometry file, using the tables above.
  • Use the HEROP_ID field to join; it is present in all CSV and geometry files (other common identifiers like ZIP5 or FIPS may also be available).
  • In Connecticut: For county and tract data from 2022 or later, you should use 2022 geographies (not yet provided here), because county (and therefore tract) FIPS IDs changed between 2021 and 2022. To translate 2022 tracts back to 2020 geometries, you can use these crosswalks from CT Data Collaborative.
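
The join described in these tips can be sketched with the Python standard library alone. This is a minimal illustration, not the project's own workflow: the CSV and GeoJSON content is inlined, and the `HEROP_ID` values, the `ExampleVar` column, and the point geometries are all placeholders. In practice you would read a downloaded CSV and its matching GeoJSON file instead.

```python
import csv
import io
import json

# Tiny in-memory stand-ins for an OEPS CSV and a GeoJSON geometry file.
# The ids, the "ExampleVar" column, and the geometries are placeholders.
CSV_TEXT = """HEROP_ID,ExampleVar
id-001,12.5
id-002,7.0
"""

GEOJSON_TEXT = json.dumps({
    "type": "FeatureCollection",
    "features": [
        {"type": "Feature",
         "properties": {"HEROP_ID": "id-001"},
         "geometry": {"type": "Point", "coordinates": [0, 0]}},
        {"type": "Feature",
         "properties": {"HEROP_ID": "id-003"},
         "geometry": {"type": "Point", "coordinates": [1, 1]}},
    ],
})


def join_on_herop_id(csv_text, geojson_text):
    """Copy CSV columns onto matching features; return (collection, unmatched ids)."""
    # csv.DictReader keeps every value as a string, so ids are compared as text.
    rows = {row["HEROP_ID"]: row for row in csv.DictReader(io.StringIO(csv_text))}
    collection = json.loads(geojson_text)
    matched = set()
    for feature in collection["features"]:
        key = feature["properties"].get("HEROP_ID")
        row = rows.get(key)
        if row is not None:
            feature["properties"].update(row)
            matched.add(key)
    # CSV rows with no geometry often signal a mismatched geometry vintage.
    unmatched = sorted(set(rows) - matched)
    return collection, unmatched


joined, unmatched_ids = join_on_herop_id(CSV_TEXT, GEOJSON_TEXT)
print("CSV rows without a geometry:", unmatched_ids)
```

Checking the unmatched ids after a join is a quick way to notice that a CSV has been paired with the wrong geometry file.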

Programmatic Access

oepsData — R Package

We maintain a small R package called oepsData. This package is the best way for researchers who use R to load and analyze OEPS data directly, without the need to download CSVs or Shapefiles and worry about joins.

  • Documentation: Learn how to install and use the package.
  • Usage examples: Within the package docs we have a few examples of what it looks like to load and use OEPS data.
  • GitHub: Use the GitHub repo to report issues you have with the package, or suggest new features or datasets.

Current release: v0.1

Google BigQuery

We have loaded the OEPS data warehouse into Google BigQuery, a data storage platform that lets researchers run SQL queries (including spatial queries) to retrieve or analyze specific data subsets. Google publishes many different clients through which you can access a BigQuery database, and for R users there is bigrquery. Here's how to get started: