Drinking Water Data for the Nation: Version 1.0 of the National Drinking Water Dataset + Explorer Tool Is Here

the tl;dr — After piloting in Texas, building a national data pipeline covering 821 variables from 30+ datasets, and engaging stakeholders across the drinking water ecosystem, EPIC is releasing Version 1.0 of the National Drinking Water Dataset + Explorer Tool. It's free, open source, and built for anyone who needs to understand drinking water systems across the U.S. Explore the tool here.


All Americans deserve access to safe, reliable, and affordable drinking water. But turning these principles into reality requires untangling a complex web of interdependent factors, and that requires data. Not just any data: connected, usable, trustworthy data that doesn't require a data science degree to put to work.

That's the gap we've been working to close. Today, we're releasing Version 1.0 of the National Drinking Water Dataset + Explorer Tool, a free, web-based platform that brings together 30+ state and national datasets — covering water system compliance, funding history, environmental hazards, census demographics, climate vulnerability, and more — into a single, harmonized, and openly accessible resource. This effort is part of the Digital Service for the Planet (DSP) initiative with our partners, New America.

Access the tool via our website


From Texas to the Nation

This work started in 2024 with a pilot data pipeline and application in Texas — a state with enormous demand for water infrastructure investment. That tool helped utilities, policymakers, and technical assistance providers answer questions that previously required juggling data from a half-dozen sources. The feedback was strong, and the most common comment we heard was: "I don't live in Texas."

So we built it for everyone.

Over the past year, we undertook national discovery — learning from researchers, regulators, utilities, technical assistance providers, community organizations, and funders about the data they use, the questions they need to answer, and what they wished existed. Then we built a transparent and durable data pipeline capable of delivering it, and a tool to put it in people's hands.

The Data: 821 Variables, 30+ Datasets, One Unified Pipeline

The dataset is a product in its own right. Our data pipeline pulls from sources across three interconnected domains:

Water systems — EPA's Safe Drinking Water Information System (SDWIS), disaggregated rule violations over 5 and 10 years, State Revolving Fund award history, EPA Service Area Boundaries, and drinking water advisories collected from 13 states (boil water, do-not-drink, and do-not-use orders — the largest publicly accessible inventory of its kind in the U.S.).

Communities served — Census demographics and 10-year change for 50+ variables, annual median household income, estimated water rates, the CDC Social Vulnerability Index, and the Climate and Economic Justice Screening Tool (CEJST).

Environmental context — Well and intake watersheds, impaired waterways, underground storage tanks, facilities with Risk Management Plans, and NPDES permits — all linked to the water systems drawing from those sources.

We investigated 821 variables and distilled them into approximately 130 well-documented fields available in the tool. Everything is summarized at the water system level using Public Water System ID (PWSID), meaning data from watersheds, census geographies, and tabular sources has all been harmonized to a common unit. We used EPA's crosswalk and a mix of areal, household-weighted, and population-weighted interpolation methods to do this right — and we documented every step.
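To make the weighting idea concrete, here is a minimal sketch in Python of how a census-tract value might be carried over to a water system's service area. It is an illustration only, not the pipeline's actual code: the class, the function, the field names, and the sample numbers are all hypothetical, and it assumes the overlap between each tract and the service area has already been computed.

```python
from dataclasses import dataclass


@dataclass
class Overlap:
    """One census tract's overlap with a single service area (hypothetical fields)."""
    pwsid: str
    tract_value: float         # a rate or average reported for the tract
    overlap_area: float        # area of the tract that falls inside the service area
    overlap_population: float  # estimated tract residents inside the service area


def weighted_mean(overlaps, weight_field):
    """Average tract values for one system, weighting by area or by population."""
    total_weight = sum(getattr(o, weight_field) for o in overlaps)
    if total_weight == 0:
        return None  # no usable overlap, so no estimate
    weighted_sum = sum(o.tract_value * getattr(o, weight_field) for o in overlaps)
    return weighted_sum / total_weight


# Made-up numbers for one made-up system.
overlaps = [
    Overlap("TX0000001", tract_value=40000.0, overlap_area=8.0, overlap_population=500.0),
    Overlap("TX0000001", tract_value=70000.0, overlap_area=2.0, overlap_population=1500.0),
]

areal = weighted_mean(overlaps, "overlap_area")
by_population = weighted_mean(overlaps, "overlap_population")
print(areal, by_population)
```

In this toy case the two weights give different answers (46,000 by area versus 62,500 by population) because most residents live in the smaller tract piece. That gap is why the choice of weight matters, especially where people are spread unevenly across a service area.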

The pipeline runs on AWS, updating key datasets daily, quarterly, or annually. The tool itself updates on a quarterly basis, after datasets pass data quality checks and expert review. All code is open source and available on our public GitHub repository.
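As a rough picture of what a quality gate before a quarterly release could look like, the Python sketch below checks a batch of records and blocks publication if anything fails. The checks, field names, and function are hypothetical and far simpler than a real review; the PWSID pattern (a two-character agency code followed by seven digits) is an assumption about the usual format, not a rule taken from the pipeline.

```python
import re

# Assumed PWSID shape: a two-character primacy agency code, then seven digits.
PWSID_PATTERN = re.compile(r"[A-Z0-9]{2}\d{7}")


def quality_issues(rows):
    """Return a list of human-readable problems found in a batch of records."""
    issues = []
    seen = set()
    for i, row in enumerate(rows):
        pwsid = row.get("pwsid") or ""
        if not PWSID_PATTERN.fullmatch(pwsid):
            issues.append(f"row {i}: malformed PWSID {pwsid!r}")
        elif pwsid in seen:
            issues.append(f"row {i}: duplicate PWSID {pwsid!r}")
        seen.add(pwsid)
        population = row.get("population_served")
        if population is None or population < 0:
            issues.append(f"row {i}: missing or negative population_served")
    return issues


# Made-up batch with one duplicate, one malformed ID, and one negative count.
batch = [
    {"pwsid": "TX0000001", "population_served": 1200},
    {"pwsid": "TX0000001", "population_served": 300},
    {"pwsid": "bad-id", "population_served": -5},
]
issues = quality_issues(batch)
ready_to_publish = not issues  # publish only when the batch is clean
print(issues, ready_to_publish)
```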

How the National Drinking Water Dataset spatially combines different sources

Built for Real Work

Snapshot of EPIC’s Public Comment using the National Drinking Water Dataset - Texas’s HB500 $1B investment in water infrastructure


The tool is designed for the full spectrum of drinking water stakeholders — and different users will use it differently. Think of the dataset and tool as an engine that can power all kinds of vehicles.

For technical assistance providers, it eliminates the "too many tabs open" problem. Before engaging a community, our Funding Navigator team used to piece together Census data, SRF records, compliance history, and more from separate sources. The tool puts all of that in one place, freeing up time to actually support communities.

For policymakers, it enables more rigorous, data-grounded analysis. Using an earlier iteration of our national tool, EPIC submitted public comment on Texas HB500, a $1 billion grant investment in water infrastructure. Our analysis showed that proposed funding allocations disproportionately favored large systems, while small and very small systems serve the greatest share of Texans. Read more about how the Texas Water Development Board acted on our analysis by dedicating $42 million to systems serving under 1,000 people and increasing project proposal caps.

For researchers and academics, the tool lowers the barrier to move from question to analysis. Teams at ASU, Stanford, and UCLA have already been testing methods and generating insights with earlier versions.

For regulators, it supports more equitable deployment of funds by making system performance, community characteristics, and funding history visible in one place.

What You Can Do With It

The tool's main features include an interactive map with filters, a data table, dataset cards with documentation, and data downloads. Some quick use cases:

  • Find all drinking water systems in a given state, filtered by size, ownership type, or compliance status

  • Identify systems with open violations and climate vulnerability

  • Find small systems serving low-income communities that haven't received SRF funding

  • Download filtered datasets for your own analysis

  • Link directly to authoritative EPA compliance data for any system
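For readers who download data and work with it outside the tool, the third use case above might look something like this in Python. This is a sketch under stated assumptions: the column names, the inline sample rows, and the population and income cutoffs are all placeholders, so check the dataset cards for the real field names and pick thresholds that fit your question.

```python
import csv
import io

# Stand-in for a file downloaded from the tool. Column names are hypothetical.
SAMPLE_CSV = """pwsid,population_served,median_household_income,srf_awards_count
TX0000001,850,38000,0
TX0000002,120000,61000,3
TX0000003,2400,35500,0
TX0000004,900,72000,0
TX0000005,600,41000,2
"""


def small_low_income_unfunded(rows, max_population=3300, max_income=45000):
    """Keep small systems in lower-income areas with no recorded SRF awards."""
    matches = []
    for row in rows:
        if (
            int(row["population_served"]) <= max_population
            and float(row["median_household_income"]) <= max_income
            and int(row["srf_awards_count"]) == 0
        ):
            matches.append(row["pwsid"])
    return matches


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
matches = small_low_income_unfunded(rows)
print(matches)
```

With the made-up rows above, the filter keeps TX0000001 and TX0000003. To run it on a real download, replace the inline sample with the downloaded file and adjust the column names.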

The tool was developed in collaboration with the Center for Neighborhood Technology (CNT).

What We've Learned

This work is more ambitious than we initially imagined. Scaling from Texas to all 50 states wasn't 50 times the work — it was to the 50th power. A few things have become clear along the way:

Iteration matters, and it has to happen in the open. We won't get it right the first time. The pipeline is modular and open source precisely to make improvement efficient and community-driven.

Different users need different things. The drinking water space spans utility operators, community advocates, federal policymakers, and academics — all with different questions and different data needs. Walking the line between too much and too little complexity is core to the project's success.

The dataset and tool are both products. We built a data pipeline, not just a one-off dataset. When you turn on the tap here, the data should flow.

What's Next

We're moving into Version 1.1 development this spring and summer, driven by feedback from users like you. We're also gearing up to launch usability sessions and group steering opportunities: dedicated spaces to engage practitioners, researchers, and advocates on what works now and what to build next.

Here's how you can get involved:

💻 Check it out! National Drinking Water Dataset + Explorer Tool

👋 Interested in collaborating? Sign up to get invited to events and sessions!

🎥 Want to learn more? View the release webinar recording and slide deck.

We built this with you, and we're continuing to build it with you. If you've found the work useful, or if you see what else is possible, we want to hear from you.
