AB 1755: Open and Transparent Water Data Platform for California
What is the Open and Transparent Water Data Act?
The Open and Transparent Water Data Act (AB 1755, Dodd) requires the Department of Water Resources, in consultation with the California Water Quality Monitoring Council, the State Water Resources Control Board, and the California Department of Fish and Wildlife, to create, operate, and maintain a statewide integrated water data platform; and to develop protocols for data sharing, documentation, quality control, public access, and promotion of open-source platforms and decision support tools related to water data.
A team of partner agencies is collaborating with and learning from others – including State and federal agencies, data experts, data providers, and data consumers – to chart a successful path forward.
Why is an open data platform important?
- Integration of existing water and ecological/fisheries data will: support analysis across datasets and disciplines; help water managers operate more efficiently; and help water users make informed decisions based on water availability and allocation
- State agencies should promote openness and interoperability of water data
- Increased transparency of public data is good government
- Water data and information technology tools and applications developed and gathered using state funds should be made publicly accessible and open-source, whenever possible
- Increased access to data will support better-informed decisions and cost-effective investments
- AB 1755 will support greater use of data collected and increase awareness of the importance of data in water management
- Making information accessible, discoverable, and usable by the public can foster entrepreneurship, innovation, and scientific discovery
- More comprehensive and interoperable datasets will provide unique opportunities to develop data-search and data-packaging products and services
Useful data for sound, sustainable water resource management.
In support of the vision, four goals have been articulated, as follows:
- Data are sufficient: Data are sufficient to support water resources management and answer water resource-related questions.
- Data are accessible: Data are available for use and discoverable.
- Data are useful: Data are available in a form that facilitates use in various models, visualizations, and reports.
- Data are used: Data are put to work in decision-making and innovation.
The implementing agencies developed a Strategic Plan for Assembly Bill 1755, the Open and Transparent Water Data Act. This Strategic Plan will help guide implementation of the program to achieve the Vision, Goals, Objectives, and Strategic Actions described in the plan.
An interagency implementation plan was developed to identify specific actions required to achieve strategic plan goals. The details of the implementation plan can be found in the April 2018 Progress Report (available upon request).
Preliminary Protocols for AB 1755
To support the initial implementation of AB 1755, the partner agencies have outlined three minimum protocols, consistent with available open data platforms, to guide the early stages of the program. The intent is to develop only what is necessary to facilitate early implementation and to avoid creating barriers to sharing data through an open data portal. These protocols will necessarily adapt over time in response to both changing software capabilities and the needs of open data portal users, to support a more efficient and transparent use of data.
Open Water Information Architecture
Additional protocols and data standards will be developed in the future, informed by the Open Water Information Architecture (OWIA, available upon request) and developed use cases, with a goal of increasing the interoperability of systems and datasets.
The OWIA document addresses the intended outcomes (functional requirements) and system details (technical requirements) to ensure that both executives and engineers remain aligned in common purpose. The OWIA outlines protocols, procedures, resources, governance, and minimum standard of technology required to meet the needs of California's water community, while also promoting greater levels of openness, transparency, and comparability for the information needed to manage water-related resources more effectively.
Business Requirement: All datasets published by Partner Agencies on the open platform have Partner Agency “owners,” who are responsible for maintaining and curating them for users.
To facilitate dissemination of information and avoid orphaned datasets, each dataset on the open data platform must have a data steward assigned to it from the appropriate agency. The data steward is responsible for the data and for meeting any related data requests. This protocol allows for multiple levels of data stewardship, such as a data creator or author (originator of the data), data caretaker (inheritor or external sponsor of the data), data sub-steward (person responsible for a subset of the data), and other roles beyond what is defined here. This protocol does not define specific roles for data stewards; it simply indicates the need to have at least one accessible person identified and prescribes the minimum required information for each data steward:
- Name of steward
- Contact information
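As an illustration only, the stewardship minimum above could be checked with a small Python sketch like the following. The class names, field names, and the `stewardship_problems` helper are hypothetical and are not part of the protocol or of any portal's software; the optional `role` field reflects that the protocol allows, but does not define, steward roles.

```python
from dataclasses import dataclass, field


@dataclass
class DataSteward:
    """Minimum steward information prescribed by the protocol (hypothetical field names)."""
    name: str
    contact: str            # e.g. an email address or phone number
    role: str = "steward"   # optional; the protocol does not define specific roles


@dataclass
class Dataset:
    title: str
    stewards: list[DataSteward] = field(default_factory=list)


def stewardship_problems(dataset: Dataset) -> list[str]:
    """Return a list of problems; an empty list means the minimum is met."""
    if not dataset.stewards:
        return ["no data steward assigned (orphaned dataset)"]
    problems = []
    for i, steward in enumerate(dataset.stewards):
        if not steward.name.strip():
            problems.append(f"steward {i}: missing name")
        if not steward.contact.strip():
            problems.append(f"steward {i}: missing contact information")
    return problems
```

A dataset with at least one steward who has both a name and contact information yields an empty list; a dataset with no steward is flagged as orphaned.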
Business Requirement: All datasets published by Partner Agencies on the open platform have a place where they can be discovered.
For data to be discoverable, they must be published to, or made available to, an open data platform. To be published, all data must meet the minimum documentation standards outlined in this section, including the metadata standard, the data dictionary requirements, and the guidelines for optional descriptive text. Requiring minimum documentation helps ensure that datasets can be found by users of an open data portal and that, once a user has found a dataset, sufficient documentation is available to answer most of the user's questions.
To support initial implementation of AB 1755 on the California Open Data Portal and the California Natural Resources Agency (CNRA) Open Data Platform, the metadata requirements are to comply with the metadata standards identified by those portals. As additional necessary metadata elements are identified, they will be added to the existing metadata requirements using a block structure format with the appropriate block elements, depending on the type of data.
Data Dictionary Requirements
Similar to the metadata requirements, the data dictionary requirements are to follow those required by the respective open data portals, the California Open Data Portal and the CNRA Open Data Platform.
Machine Readable Data Requirement
All tabular datasets published on an open platform must be machine readable. The federal Office of Management and Budget describes machine readable format in Circular A-11 Part 6 as: “a standard computer language (not English text) that can be read automatically by a web browser or computer system. (e.g.; xml). Traditional word processing documents, hypertext markup language (HTML) and portable document format (PDF) files are easily read by humans but typically are difficult for machines to interpret. Other formats such as extensible markup language (XML), (JSON), or spreadsheets with header columns that can be exported as comma separated values (CSV) are machine readable formats. It is possible to make traditional word processing documents and other formats machine readable but the documents must include enhanced structural elements.”
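A rough way to screen a tabular file against this requirement is sketched below in Python. It is a heuristic under stated assumptions, not the OMB definition and not a check used by either portal: it treats a CSV as machine readable when it has a complete, unique header row and every data row has the same number of columns, and it treats JSON as machine readable when it parses. The function names are hypothetical.

```python
import csv
import io
import json


def csv_is_machine_readable(text: str) -> bool:
    """Heuristic: complete, unique header row and consistently sized data rows."""
    # Skip blank lines, which csv.reader returns as empty lists.
    rows = [row for row in csv.reader(io.StringIO(text)) if row]
    if len(rows) < 2:
        return False  # need a header plus at least one data row
    header = [cell.strip() for cell in rows[0]]
    if any(not cell for cell in header) or len(set(header)) != len(header):
        return False  # blank or duplicated column names
    return all(len(row) == len(header) for row in rows[1:])


def json_is_machine_readable(text: str) -> bool:
    """Heuristic: the text parses as JSON."""
    try:
        json.loads(text)
    except json.JSONDecodeError:
        return False
    return True
```

A spreadsheet export that begins with a free-text title line, for example, fails the CSV check because the first row does not match the width of the data rows.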
Business Requirement: All datasets published by Partner Agencies on the open platform are machine readable, well documented, and accessible to users.
Well-documented, published data with an appropriate data steward are not useful to the larger water community unless they are also accessible to the users of those data. To support this need, a sample workflow for user access to a dataset is provided below. While this workflow pertains to the user accessing the dataset, it has significant implications for how state agencies should build platforms and organize data to support accessibility.
A Sample Workflow for Accessing Data on an Open Data Portal
- Access open data portal via internet
- Search using keywords or tags
- Generate results list sorted by relevance to search terms
- Select desired dataset from search results
- Take user directly to data or to data location
- Query and visualize results using basic in-browser tools
- Download full or relevant queried portion of the dataset
- Connect to dataset directly via API
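The search and download steps of this workflow can be sketched in Python as follows. The sketch assumes the portal exposes a CKAN-style Action API with a `package_search` endpoint; the text above does not name the portal software, so treat that as an assumption. The base URL in the usage note is a placeholder, and the helper names are hypothetical. No network request is made here: one function builds the search URL, and the other flattens a response that has already been fetched and decoded.

```python
from urllib.parse import urlencode


def build_search_url(portal_base: str, keywords: str, rows: int = 10) -> str:
    """Build a keyword search URL for a CKAN-style portal (assumed API)."""
    query = urlencode({"q": keywords, "rows": rows})
    return f"{portal_base.rstrip('/')}/api/3/action/package_search?{query}"


def list_downloads(search_response: dict) -> list[tuple[str, str, str]]:
    """Flatten a CKAN-style search response into (dataset title, format, URL) rows."""
    downloads = []
    for dataset in search_response.get("result", {}).get("results", []):
        for resource in dataset.get("resources", []):
            downloads.append(
                (
                    dataset.get("title", ""),
                    resource.get("format", ""),
                    resource.get("url", ""),
                )
            )
    return downloads
```

For example, `build_search_url("https://portal.example.org", "groundwater levels")` produces a URL that a script could fetch; passing the decoded JSON response to `list_downloads` lists each matching dataset's files and their locations.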
Who needs what data in what form to make what decisions?
A use case is a tool for assessing stakeholder data needs in specific decision contexts, and communicating those needs to technical developers. Additionally, the development of use cases helps clarify the questions that water resource decision-makers must answer and identifies the data necessary to answer those questions. Cooperative development of use cases allows water resource managers to examine a decision from multiple perspectives; it yields transparent documentation that water resource managers can use to communicate their decision-making process with stakeholders, to solicit improvements, and to foster trust in decisions; and it can help decision-makers articulate expectations of a data system in such a way that technical experts can analyze and address those expectations.
Data for Water Decision Making
A lack of data and information has limited our ability to understand, let alone better manage, all aspects of our water resources. This report supports California’s efforts to develop modern water data systems. It argues that simply providing more data is not enough, and that generating useful and usable information hinges on the development of data systems based on end users’ needs. The report describes lessons learned from a process of stakeholder engagement focused on defining and clarifying uses of water data, and how knowledge of these uses can inform the development of water data systems.
Initial Draft Use Cases
As an example of the type of content needed to inform a decision-driven water-data system, researchers and water management professionals engaged in the interactive development of 20 draft use cases. The 20 use cases are not intended to span the entire decision space of water and environmental management, nor are they designed to be a comprehensive library of possible water management decisions. Rather, they serve as a proof-of-concept to demonstrate that developing use cases is an effective means of focusing decision-making and prioritizing data collection and publication. The draft use cases are a starting point for the kinds of decisions to which the evolving federated, interoperable open data portals of AB 1755 must respond. Though the initial effort to develop the 20 use cases is complete, the partner agencies invite anyone involved in water management decision-making to contribute additional use cases, share available datasets for publication, and federate additional existing data portals.
Federated Interoperable Open Data Portals
In a federated open data system, participating open data nodes will be discoverable and accessible through a federated data catalog. This is analogous to an inter-library loan system, in which the ability of a user to discover holdings is key. The partner agencies are working to populate and federate two State-hosted portals, specifically the California Open Data Portal and the CNRA Open Data Platform, to allow users improved access to available water and ecological datasets. State agencies have made over 1,000 datasets and numerous data visualizations available on these two portals. Both portals will offer additional functionality, as well as new and updated data on a continuous basis, based on availability, technology developments, and user feedback.
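The inter-library loan analogy can be made concrete with a minimal Python sketch: each node keeps its own data, while a shared catalog holds only enough metadata to discover a dataset and point to where it lives. The `CatalogEntry` fields, the node names, and the `federate` and `search` helpers are hypothetical illustrations, not a description of how the State portals are actually federated.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CatalogEntry:
    node: str               # which portal holds the dataset
    title: str
    tags: tuple[str, ...]
    url: str                # where the data actually live


def federate(*node_catalogs: list[CatalogEntry]) -> list[CatalogEntry]:
    """Combine per-node catalogs into one federated catalog (metadata only)."""
    return [entry for catalog in node_catalogs for entry in catalog]


def search(catalog: list[CatalogEntry], keyword: str) -> list[CatalogEntry]:
    """Find entries whose title contains the keyword or whose tags match it."""
    needle = keyword.lower()
    return [
        entry
        for entry in catalog
        if needle in entry.title.lower()
        or any(needle == tag.lower() for tag in entry.tags)
    ]
```

A single search over the federated catalog returns matches from every participating node, each carrying the node name and URL needed to reach the data at its source.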
California Natural Resources Agency Open Data Platform
The CNRA Open Data Platform has been developed to provide data to State of California citizens, agencies, and interested stakeholders in a transparent and useful manner. The Open Data Platform supports the CNRA organizations’ and programs’ missions by providing an environment to publish and share useful data that can be effectively utilized by all.
California Open Data Portal
The California Open Data Portal, hosted by GovOps and the California Department of Technology, features data from many different State agencies on a wide variety of topics. The California Open Data Portal will bring government closer to citizens and start a new shared conversation for growth and progress in our great state.
Implementation of AB 1755 must include thoughtful choices about long-term governance and funding for the open data platform. In the absence of a resilient governance structure and stable funding model, the open data platform is not sustainable. In creating, operating, and maintaining the platform, numerous actions and decisions will have to be coordinated, from system architecture to protocol adoption to platform development and use. A governance structure is the body of established rules and expectations about how these things will be decided and coordinated.
When it comes to funding the operation and maintenance of the open data system, the Open and Transparent Water Data Act established the Water Data Administration Fund, which can receive voluntary contributions, but the legislation did not include a funding appropriation. This highlights one of the biggest challenges the AB 1755 open data platform will face: the need for dedicated resources to keep it running.
During the first quarter of 2018, an independent exploration was conducted, in consultation with the partner agencies, to consider and evaluate governance and funding options for the platform described in AB 1755. The exploration was initiated by the Water Foundation and conducted by Redstone Strategy Group, with funding from the S.D. Bechtel, Jr. Foundation, and it included the voluntary participation of a range of distinguished professionals throughout California. This group met to discuss three key questions:
- What are the governance needs associated with implementation of AB 1755?
- What organizational structure(s) would best meet these governance needs?
- How can governance promote a sustainable funding model for AB 1755?
Governance and Funding for Open and Transparent Water Data
Together with the Water Foundation, Redstone Strategy Group explored sustainable governance and funding structures for the implementation of AB 1755. The exploration is detailed in this report. California’s efforts build on the blueprint laid out in the Aspen Institute’s Report on Sharing and Integrating Water Data for Sustainability and provide a powerful example for other states looking to leverage the power of data to meet the challenge of managing water resources for growing populations in an increasingly variable climate.
The Governor’s Office of Planning and Research (OPR) led an effort, with active participation of executive management at CNRA, CalEPA, DWR, SWRCB, DFW, and GovOps, to consider and evaluate governance and funding options, informed by the independent report described above. This group of partner agency executives was charged with developing specific, actionable recommendations for long-term platform governance and funding for consideration and decision by the Administration. A factsheet is available that describes State recommendations regarding water data governance.
OPR convened a Water Data Advisory Council to provide guidance and recommendations on water data governance to potential funders and the partner agencies. The Council held three meetings, on November 14, 2018, January 16, 2019, and February 15, 2019.
- AB 1755 Implementation Journal
- Letter from CNRA and CalEPA Secretaries related to the Water Data Consortium
- Water Data Advisory Council Recommendations
- Strategic Plan for AB 1755
- Protocols for AB 1755
- Open Water Information Architecture - System Requirements Document
- Open Water Information Architecture - Standard Operating Procedures
- Exploration of Governance and Funding for Open Water Data in California
- Data for Water Decision Making - Use Cases
- Aspen Dialog Series - Internet of Water