System of Registries

Background

The EPA System of Registries (SoR) and Repositories support EPA's information architecture, data standards program, and data exchange with stakeholders through network nodes. They provide information for objects like data elements, XML tags, data standards, substances (chemicals, biological organisms, and physical properties), terms, facilities, regulations, and data sets.

Purpose

The SoR is undergoing review and revision, and this Web 2.0 wiki is designed to aid that effort by applying enterprise and data architecture and Semantic Web (ontology) concepts and methods. See Ken Orr Special Session. The result will be improved systems engineering that achieves the Google-Wikipedia effect for this valuable content, which is currently hidden in databases that are not accessible to Web search engines.

Structure and Interface 

| Concept (original URL) | Definition | Specific Example | Status |
| --- | --- | --- | --- |
| Environmental Data Registry (EDR) | Data standards, XML, and application metadata | Web 2.0 Wiki, Web 2.0 Wiki | |
| Facility Registry System (FRS) | Facilities, sites, or places subject to environmental regulations or of environmental interest | Web 2.0 Wiki | |
| Registry of EPA Applications and Databases (READ) | Application inventory, organization hierarchy, and other information resources | Web 2.0 Wiki | |
| Substance Registry System (SRS) | Substances regulated by the EPA such as chemicals, organisms, and physical characteristics | Web 2.0 Wiki | |
| XML Registry | XML Schemas, Namespaces, and Supporting Files | Web 2.0 Wiki | |
| Science Inventory [formerly Environmental Information Management System (EIMS)] | Data Sets, Models, and Spatial Data | Web 2.0 Wiki | |
| Environmental Terminology System and Services (ETSS) [also Terminology Reference System (TRS)] | Environmental terms, definitions, and hierarchies | Web 2.0 Wiki, Web 2.0 Wiki | |
| Web Registry (WR) | Web page metadata | Web 2.0 Wiki | |

Fact Sheet

Introduction

The application of sound information management principles and proper use of high quality information assets are necessary to achieve EPA’s mission to protect human health and the environment.  The EPA System of Registries (SoR) is an integral part of the EPA quality system and promotes the delivery of consistent, accurate, and understandable information for EPA, its State and Tribal partners, other Federal agencies, and the public.  It provides tools and automated services to support the discovery, validation, and exchange of EPA information.   Contents include metadata, simply defined as “data about data,” that includes identification and descriptive information for agency systems, datasets, models, service components, and environmental terminology.  The SoR also includes authoritative lists of chemicals, biological organisms, facilities, and various code sets used by systems supporting environmental programs.  SoR management practices include a formal registration process for all contents, stewardship by data owners and communities of interest, and support for quality reviews and assurance.  In summary, the System of Registries enables the development and correct use of environmental information.

Purpose of New Enhancements

The SoR represents EPA’s long-term commitment to manage information resources as business assets to support sound decision making.  EPA’s Office of Information Collection (OIC) manages the SoR and provides support to users, data stewards and stakeholders.  Over the past several years, OIC has initiated a project to reengineer its registries to better support environmental business processes and provide new automated services.  One objective is to enable distributed stewardship of registry content in order to enhance data quality.  A second objective is to restructure the registries to allow efficient use of web services.  This permits ongoing updates of registry data within programmatic systems and validation and transformation of environmental data against authoritative information in the registries.  Web services will also assist with presenting registry contents to a wider community of users in formats that meet specific business needs.  As part of the SoR project, Commercial-Off-The-Shelf (COTS) software tools have been incorporated that provide additional web service functionality for users.  A third objective is to improve the search and retrieval capabilities of the registries.  New interfaces will significantly improve reporting capability by allowing users to access, view, and download information into various formats.  Finally, reusable component services are being considered for system developers and others to find information technology components including web services, data models, data flows, schema, and templates.  This resource will incorporate the eXtensible Markup Language (XML) registry now found within the SoR.  When completed, it will assist EPA in moving toward a Service Oriented Architecture.
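The validation and transformation of program data against authoritative registry information described above can be illustrated with a minimal sketch. It is not the SoR's actual service interface: the in-memory dictionary stands in for a registry web service, and the names `AUTHORITATIVE_SUBSTANCES`, `validate_record`, `transform_record`, and the record fields are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical stand-in for an authoritative registry list. A real system
# would retrieve these entries from a registry web service instead.
AUTHORITATIVE_SUBSTANCES = {
    "71-43-2": "Benzene",
    "108-88-3": "Toluene",
}


@dataclass
class ValidationResult:
    cas_number: str
    is_valid: bool
    registry_name: Optional[str]


def validate_record(record: dict, registry: dict) -> ValidationResult:
    """Check whether the record's CAS number appears in the registry."""
    cas = record.get("cas_number", "").strip()
    name = registry.get(cas)
    return ValidationResult(cas, name is not None, name)


def transform_record(record: dict, registry: dict) -> dict:
    """Return a copy of the record that uses the registry's substance name.

    Records that fail validation are returned as unmodified copies.
    """
    result = validate_record(record, registry)
    updated = dict(record)
    if result.is_valid:
        updated["cas_number"] = result.cas_number
        updated["substance_name"] = result.registry_name
    return updated
```

For example, a program system holding `{"cas_number": " 71-43-2 ", "substance_name": "benzol"}` would, under these assumptions, receive back a record whose substance name is the registry's "Benzene", while a record with an unknown identifier would be flagged as invalid and left unchanged.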

Adding Concepts and Meaning to Registry Data

Critical to the expanded use of the registries is the collection and recording of the meaning of Agency data and terminology.  The context in which data and information is to be used is equally important.  To aid in this, concepts are managed in the registries and linked to data elements and terms as appropriate.  Meanings are defined in plain English and provided in addition to legal and other definitions.  More accurate concepts and meanings help improve searches over both structured (databases) and unstructured (documents) data.  The use of concepts further clarifies the proper usage of information.
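One way to picture concepts linked to data elements and terms is the small sketch below. It is only an illustration under stated assumptions, not the registries' data model: the `Concept` class, the `find_data_elements` function, and the sample concept, terms, and element names are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Concept:
    name: str
    plain_english_meaning: str
    data_elements: set = field(default_factory=set)  # linked data element names
    terms: set = field(default_factory=set)          # linked terms and synonyms


def find_data_elements(concepts, query):
    """Return data elements linked to any concept whose name or terms match the query."""
    q = query.lower()
    matches = set()
    for concept in concepts:
        if q == concept.name.lower() or q in {t.lower() for t in concept.terms}:
            matches |= concept.data_elements
    return sorted(matches)


# Hypothetical sample content, for illustration only.
SAMPLE_CONCEPTS = [
    Concept(
        name="Facility",
        plain_english_meaning="A place subject to environmental regulation.",
        data_elements={"FacilityName", "SITE_NM"},
        terms={"site", "plant"},
    ),
]
```

Under these assumptions, a search for "site" resolves to the "Facility" concept and returns both linked data elements, even though neither element name contains the search word. This is the sense in which concepts can improve searches across differently named fields.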

Additional Uses of the Registries

Information contained within the registries helps EPA and its partners meet important business needs.  The registries are identified within the Agency’s enterprise architecture and provide essential services to the Environmental Information Exchange Network and Agency information integration initiatives. The SoR also provides services that support the goals of the Federal Enterprise Architecture.

Registries

• Data Registry Services – provides a comprehensive, authoritative reference for information about environmental data, including: definitions, sources, formats, values and uses. It also provides standard data elements and values for download and use in application system design.

• Substance Registry Services – provides identification information on substances and their relationships to statutes, regulations, and program office information systems.

• Terminology Services – provides an Agency repository of environmental terms and definitions, taxonomies, and glossaries by compiling collections of terms from EPA and other sources.  It provides tools to develop terminologies and manage taxonomic relationships among terms of interest to EPA business areas.  The Environmental Terminology System and Services (ETSS) will replace the current Terminology Reference System (TRS).

• Registry of EPA Applications, Models, and Datasets (READ) – provides authoritative data for identifying, locating, describing, and accessing Agency information resources.

• Reusable Component Services – will provide access to a multitude of components and services, catalogued in many registries and stored in many repositories.  This service will enable reuse, reduce costs, speed development, and help bring about higher-quality systems and applications. This registry will include an enhanced version of the XML Registry.

• Facility Registry Services – provides identification information for facilities, sites or places subject to environmental regulations.

Future Activities

• Substance Registry Services will have a new user front end to assist in the management and stewardship of lists of chemicals - available June 2008.

• Terminology Services will have a new user front end to improve search and retrieval of terms, glossaries, and taxonomies - available June 2008.

• Data Registration Services will have new functionality to improve data dictionary and code set management and conformance with EPA data standards - available August 2008.

• Reusable Component Services - Workshops and interviews with EPA’s Enterprise Architecture office, members of the Exchange Network, and EPA’s program office stakeholders will be conducted to finalize requirements. In addition, software tools will be evaluated to select those that might be used for this set of services – first release available October 2008.

For More Information, Contact:

Michael Pendleton
Program Analyst, Data Standards Branch
Phone:  (202) 566-1658
E-mail:  pendleton.michael@epa.gov

The SoR is available via the Internet: www.epa.gov/sor

November 2007
