United Nations System-Wide
REPORT OF THE FIRST IEA/GEO CORE DATA WORKING GROUP (DWG) MEETING
Sponsored by the UN Environment
Table of Contents
2 Summary of the Discussions
2.1 Core data
2.2 Data quality
2.3 Data gaps and shortcomings
2.5 Data sources
3 Strategy and Actions
4 Concluding Remarks
I Agenda of the UNEP/UNDPCSD IEA/GEO Core Data Working Group Meeting
II List of Participants
III Background paper for the UNEP/UNDPCSD IEA/GEO Core Data Working Group meeting, as prepared by UNEP and RIVM
IV Revised IEA/GEO Core Data Sets/Variables Matrix
V Brief summary of individual presentations made during the opening session of the meeting
VI Identified Data Gaps and Shortcomings
VII Potential Meta-data Attributes for Core Data Sets
VIII Potential Sources of Data and Background Information
The first meeting of the Core Data Working Group for Integrated Environment Assessment (IEA)/Global Environment Outlook (GEO) studies took place on 22-23 January 1996 at UNDPCSD offices in New York. The meeting was attended by over 20 representatives of UN agencies, inter-governmental organizations and private research institutions active in the field of environmental data, including most major global data reporting agencies (Annexes I and II give the Agenda and list of participants).
The Core Data Working Group was assembled in the context of preparations for UNEP's Global Environment Outlook (GEO) report series, which will replace the more traditional State-of-the-Environment Reports. For the GEO process, four working groups have been set up: Scenarios (led by SEI); Modelling (led by RIVM); Policy (not yet started); and Data, currently led by UNEP, with strong inputs from RIVM. There are, however, many other institutions preparing global data sets and/or reporting on global issues using global or regional data sets. Examples are the World Bank with the World Development Report, UNDP with the Human Development Report, WRI with the WRI/UNDP/UNEP World Resources Report, the Earth Council with the Earth Audit Report, plans of UNDPCSD for a global report on progress following up Agenda 21, and more sectoral global reports from agencies like FAO, WHO and the like. Indicators of sustainable development, which are being developed through broad-based consultation and consensus, and coordinated by UNDPCSD on behalf of the UN Commission on Sustainable Development, will also need to rely heavily on the availability of good core data sets. All of these efforts would profit from better access to existing data sets, more efficiency in improving or developing new data sets, more comparability between them and other advantages. Thus, UNEP felt it would be more effective, efficient, and desirable to join efforts and agree on a process for coping with data-related issues in global reporting which are of common interest to most if not all global data producing and reporting agencies and institutions.
The specific objectives of the meeting were:
- to list a limited number of existing core data sets for Integrated Environment Assessments and Global Environment Outlook studies, and identify major data gaps and shortcomings;
- to devise a realistic strategy and agree on joint actions to make such data more easily accessible, more freely and openly available to major global data producing and reporting agencies and institutions and developing countries in general, and to collaborating scientific centers working with UNEP to prepare the GEO studies in particular.
Brief introductions were given on the objectives of GEO and recent events which led up to this Meeting, such as the symposium sponsored by NASA/UNDP/UNEP on "Core Data Needs for Environmental Assessment and Sustainable Development Strategies" which took place in Bangkok, Thailand in November 1994. Each of the participants then briefed the Meeting on their institution's involvement in core data activities, and voiced their expectations of the meeting. A Background Paper for the Meeting and an initial "Core Data Sets/Variables Matrix" were presented, which set the scene for further discussions. The Background Paper, a revised version of the Core Data Sets Matrix, as well as a brief summary of individual presentations given during the opening session are included as Annexes III, IV and V respectively.
The Meeting spent considerable time discussing such issues as global data set quality, data gaps and shortcomings, meta-data, accessibility/availability, standards, minimum core data sets and the like. Much of the discussion took place as the participants went through the initial version of the Core Data Sets/Variables Matrix. Since comparable points came up repeatedly during different stages of the meeting, the discussions are summarized not sequentially but by topic (see Section 2 of this report).
In summary, there were strongly recognized needs to improve the quality of existing data sets, add meta-data to them, and deal with access, harmonization and cost issues associated with data management. Joint action was identified and agreed regarding core data sets for IEA/GEO studies. In Section 3 an agreed-upon timetable and skeletal workplan for action is presented that could be implemented over the next six months to one year. It was clearly recognized though that this pilot should be seen as a process which will require much follow-up and broader participation in future.
The initial "Core Data Sets Matrix" was discussed extensively and practical suggestions were offered for improvements, most of which have been incorporated in the new version included in Annex IV:
- a column for "Actions Needed" on each data set has been added;
It should be noted that the ranking exercise was a very approximate one, and in many cases the expertise necessary to judge quality and currency may not have been present in the Meeting. Thus, the rankings should at this point be treated as preliminary only.
All "Environmental Information" variables were ranked as A variables. However, a brief evaluation of available data sets showed a severe lack or at least inadequacy of existing data. This theme, perhaps more than any other, needs a vast amount of data improvement and "gap filling" for IEA/GEO studies.
Several additional data areas related to Indicators of Sustainable Development (ISDs) that have emerged through the consultation process coordinated by UNDPCSD for use by decision-makers at the national level are missing in the current list. These include: (i) financial mechanisms and resources; (ii) human settlements (housing etc.); (iii) income distribution; (iv) institutional issues; (v) public awareness; and (vi) radioactivity. These will have to be taken into consideration at some point in future, even though collecting information on these variables will be neither easy nor straightforward.
There were two different attitudes expressed concerning data quality. The predominant school of thought believed that the important thing is to make data accessible, and then see what feedback is prompted, even in the case where data are (much) less than perfect. The other school was more cautious, and felt that only fully validated data should be offered for publication and use. Their arguments: the overall error in many data sets is so high as to render them mostly if not entirely useless and even when data are reasonably up-to-date, they are often needed in much greater detail than can currently be found.
While these positions are somewhat oversimplified here, it was useful that such a discussion took place, since data should profit both from scientific rigor and the exposure of less-than-perfect data to the world community for their relevant feedback and subsequent improvement. The issue of data quality is clearly of critical importance, and remains one of the most difficult to resolve satisfactorily. There is a need to improve the dialogue between experts and non-experts, and between disciplines, in carrying out data verification. It was concluded that data sets should indeed be made available as much and as soon as possible, but accompanied by ample documentation and quality reports, keeping in mind that data have multiple uses and, thus, can have multiple quality requirements.
The question was also raised as to whether a reliable supply mechanism for these core data sets already exists. This issue will need to be explored in the proposed pilot study (see Section 3 below). Many institutions are secondary data collectors (RIVM, UNEP, UNSD, the World Bank, WRI), but even then they do not necessarily have the resources to maintain more than a limited number of core data sets.
One of the objectives of the meeting was to identify data gaps and shortcomings, and solicit possible solutions or work-arounds to fill these gaps. Current data gaps and shortcomings were identified based on two different lists of missing or inadequate data sets: one compiled during the Meeting from core data sets ranked as 'A' but currently missing or less than adequate; and another compiled by RIVM, based on its data support work for IEA/GEO modelling studies (both lists are included as Annex VI).
It was noted that 12 of the data variables listed as gaps or shortcomings are maintained by FAO. Thus, the absence of an FAO representative at the Meeting was very much regretted, and it was agreed that FAO should be kept well-informed of the Meeting discussions, results and follow-up plans so that they will be able to collaborate as much as possible. It is hoped that the conclusions of such Meetings will encourage FAO in their continuous efforts to improve many of their much-needed global data sets and to participate directly in future.
Several suggestions for work-arounds were made on how best to fill data gaps and, though little actual progress was made, some ideas were given of where to find alternatives for missing or inadequate data sets. Suggestions offered were: (i) try to fill data gaps with the best data currently available, though it was realized that little significant progress could be made in the short-term or "on the cheap", because most of the gaps are not easy or inexpensive to complete; (ii) RIVM to prepare a list of potential gap-filling data sets and techniques for the Data Working Group to examine; and (iii) the GEO collaborating scientific centers could be requested to assist in gap-filling.
There was full agreement that without accompanying, explanatory meta-data, core data sets themselves would not be useable. Thus, one of the first steps which the Data Working Group will have to initiate is an evaluation of existing meta-data systems (examples are systems of CIESIN, NASA, RIVM, UNEP, WRI) so that a synthesis and recommendations can be made.
There is already a certain level of compatibility or at least overlap between the contents and functionality of many of the systems referred to above, and most of the same systems do focus on what can be considered "core data sets", either geo-referenced or statistical or both. A more extensive study will have to be carried out specific to the needs of IEA/GEO and the participating agencies and institutions, before more exact requirements of any such system can be precisely determined. As a start the attendee from WRI prepared an initial list of basic meta-data attributes to be included in any such system (see Annex VII). Others present, such as RIVM, UNEP, CIESIN, will contribute to this evaluation exercise based on their own extensive experience in this area.
During the discussions concerning availability/accessibility of core data sets for IEA/GEO studies, there were several prominent sources of data which were repeatedly mentioned. These included DESIPA/UNSD for a wide variety of statistical data sets, mostly socio-economic in nature; FAO for data relating to agriculture, fisheries and forestry; UNEP and RIVM for a wide variety of geo-referenced (typically global and regional) data sets relating to the terrestrial surface, oceans and atmosphere; and other more primary data collectors and providers for specific data sets.
The Group realized that most if not all of the institutions mentioned above would have a role to play in the provision of data for IEA/GEO studies, and more specifically for the pilot phase which the Data Working Group envisions during the coming year (see section 3 below).
The Strategy and Actions discussion focussed on a plan and concrete steps which should be taken to make core data sets (more easily) available to data reporting agencies, GEO collaborating centers and institutions in developing countries. Several reasons were given as to why such action should be taken, ranging from global and regional reporting needs, to a need for greater cooperation within the UN family, to the need for data sets for national level decision-making, to the common need for an agreed process, and finally, to achieve cost-effectiveness.
It was agreed that in a one-year pilot a number of best available core data sets (10-15) should be produced on an appropriate electronic medium in a common format with accompanying meta-data and a database access system. Such a package will be for use by GEO collaborating scientific centers, by major international data-producing and reporting agencies and institutions and by other regional and national groups. Data included should be the best available at the time from DESIPA/UNSD, FAO, RIVM, UNEP, the World Bank, WRI and others. In the pilot a specific product is envisaged, but the process itself is important as well, because if/when the data are supplied to well targeted users, major discrepancies, distortions and errors will be detected and feedback or reactions provided.
It was agreed that the institutional responsibility for such a mechanism needs to be within the UN system (UNEP was mentioned as an option, with possibly a "political window" at UNDPCSD). Large NGOs such as WRI and CIESIN or private companies can be involved in actual implementation.
In order to formally introduce the pilot, it was decided to start with writing a scoping paper in which the effort will be outlined. The paper will serve as a vehicle to search for financial support both within and outside the agencies and institutions involved. The paper will include:
- a discussion of how to set up a distributed data or information system;
- whether any related meta-database should be centralized or distributed/virtual;
- other possible support mechanisms and feasible links for such a system, including those at various institutions who are not electronically linked;
- logistical requirements which were tentatively identified as:
(i) one full-time coordinator;
The table below lists concrete steps to be taken in the next six months and a slightly longer-term time-frame to create such a product. For sake of simplicity activities and outputs have sometimes been mixed.
|a) Core Data Working Group Meeting Report - first draft||UNEP||end Jan 96|
|b) New matrix column on link to CSD Indicators on Sust. Dev.||UNEP||mid Feb 96|
|c) Core Data Working Group Meeting Report - final draft||UNEP||end Feb 96|
|d) Scoping paper - first draft*||World Bank||end Feb 96|
|e) New matrix column on "Action needed on data sets"||WRI + RIVM||mid Mar 96|
|f) Scoping paper - final draft||World Bank||end Mar 96|
|g) Propose 10-15 core data sets for inclusion in the "Product"||UNEP + TexA&M||end Mar 96|
|h) Approach potential data-supplying agencies to join forces||UNEP (Mooneyhan)||Apr/May 96|
|i) Using Scoping Paper recommendations: propose prototype for a unified IEA/GEO information system, covering both meta-data descriptions and core data sets themselves, making use of existing systems as a starting point||CIESIN, WRI, UNEP||Apr/June 96|
|j) Interim coordination||UNEP||Feb/Jun 96|
|k) Hold next Data Working Group Meeting||UNEP||May 96|
|l) Technical work for production of system prototype: cleaning/(re-)formatting data sets (agreed format); writing meta-data for core data sets (agreed format); creation of a query system to access actual data; proposal for data up-dating/improving mechanism; design and incorporation of product user feedback||to be decided||Jul/Dec 96|
|m) Select expert team to advise on publishing/technical aspects of production (convening and/or communicating regularly)||UNEP and/or WRI, CIESIN ?||Jul/Dec 96|
|n) Select expert review team to verify core data set contents (convening and/or communicating regularly)||UNEP=coord. + agency contrib. ?||Jul/Dec 96|
|o) Identify operational group within UN system, to keep the activity and the product going||all||towards end 96|
|p) Information system prototype completed||to be decided||Dec 96/Jan 97|
|q) Further populate the prototype||to be decided||Feb/Jun 97|
|r) CD-ROM or other electronic publication ("the Product") (ready for opening of General Assembly at the latest)||to be decided||Jun or Sep 97|
|* draft will also be circulated to the Earth Council, FAO, IDRC, UNDP/DHRO and UNESCO|
The Core Data Working Group Meeting for IEA/GEO studies recognized that fundamental improvements in basic data sets are essential, and that a number of key data sets are currently unavailable altogether. One of the important dimensions of the exercise is to identify and help set priorities for the improvements that will be required in specific subject areas (agriculture, demographics etc.) in order to promote their use in international, interdisciplinary studies. Despite the improvements that are clearly needed, cooperation on core data sets during the pilot phase will:
(i) make use of existing data and meta-data (and modest improvements that can be made in these); and
(ii) emphasize systems and content review mechanisms that will both promote access to the summary, global-level "core" data sets, and ensure responses from well-defined groups of users (e.g. GEO collaborating centers).
The proposed pilot phase should be seen and understood by all of the agency and institutional participants as a learning process. When "the Product" is delivered next year, this will have to be accompanied by a thorough report on how a broader exercise should be pursued in future. This Data Working Group is most certainly aware of the many ongoing relevant activities that can and should contribute both to the pilot phase and to any follow-up thereafter. While this initial effort will focus mainly on existing procedures and tools, it will also document prospective candidates (data sets, methods, tools, initiatives, institutions) that may not have been included "up front", but certainly warrant inclusion in a broader effort in future.
FOR THE IEA/GEO CORE DATA WORKING GROUP (DWG) MEETING
Day 1 - Monday 22 January 1996
09:00 - Opening of Meeting
11:00 - Presentation of background paper for the meeting, the proposed list of "core data sets" and various critical definitions to avoid confusion (UNEP-Ron Witt/RIVM-Jaap v Woerden)
11:30 - Begin discussion on characteristics and definition of core data sets; their status/availability/ quality/utility for global reports; baseline versus derived data sets; their resolution in space and time, etc.
12:30 - LUNCH
14:00 - Continue discussion on "core data sets" for global reports
15:00 - Begin discussion on a series of actions required to meet the needs of "core data sets" for global reporting, including:
17:00 - Closure of Day One
Day 2 - 23 January 1996
09:00 - Continue discussion on series of actions required to meet the needs associated with "core data sets" for global reporting:
12:30 - LUNCH
14:00 - Discussion concerning how to proceed and resources needed:
17:00 - Closure of First DWG Meeting
CORE DATA WORKING GROUP (DWG) MEETING
LIST OF PARTICIPANTS
(regrets from: CGIAR; Earth Council; FAO; IDRC-Canada; UNESCO)
Mr. Vincent Abreu
Mr. Jan Bakkes
Mr. Giovanni Carissimo
Mr. Arthur Dahl
Mr. Paul T. Dyke
Mr. Peter Gilruth
Mr. Andreas Kahnert
Mr. Gerry Leach
Mr. D. Wayne Mooneyhan
Mr. Lars Mortensen
Mr. John D. Northcut
Mr. John O'Connor
Mr. Philippe Pelt
Mr. Eric Rodenburg
Mr. Jan Rotmans
Ms. Miriam Schomaker
Ms. Reena Shah
Ms. Mary Pat Williams Silveira
Mr. A. Singh
Mr. David Stanners
Ms. Veerle Vandeweerd
Mr. Jaap Van Woerden
Mr. Ronald G. Witt
PAPER FOR THE IEA/GEO CORE DATA WORKING GROUP (DWG) MEETING
I Introduction/Background/Rationale for IEA Data-related Activities
Global environment assessment and reporting activities carried out or coordinated by a global network of data-producing and reporting agencies require that a wide variety of coherent and consistent data be available for integrated environmental assessments (IEAs). As an example, UNEP's recently initiated Global Environment Outlook (GEO) provides a major impetus for data management activities, since along with other IEAs it requires that specific global and regional data layers be made accessible to major scientific collaborators for modelling purposes, scenario development and other relevant research.
While it is understood that IEA is a long-term process, it is necessary to assure the provision of critical data sets and associated information which are required for the production of IEA outputs during the next several years, even if this means working with sub-optimal data sets at the current time. The globally coordinated IEA-related data management activities will help to make available, in a cost-effective manner, the distributed series of key environmental and other data sets which are essential inputs to reports generated by a series of global reporting agencies and institutions (such as DPCSD, UNDP, UNEP, the World Bank, WRI etc.).
It should be noted that such activities will also provide a service in the establishment of "common (and) compatible data systems" mentioned in Chapter 40 of Agenda 21. The DPCSD in particular has referred to the importance of developing among UN agencies a "system of access to their respective databases, in order to share data fully, to streamline the collection and interpretation of data and identify data gaps, for the purpose of providing more comprehensive and integrated data to decision-makers at national, regional and international levels."
Several international agencies have recently embarked on entirely new series of IEAs, with far-reaching implications for their programmes in the realm of data. For example, the GEO series of reports is to provide insights into environment and development interactions, and serve as the basis of the decadal UNEP State of the Environment report for the year 2002. Thus, the GEO report series will require many socio-economic data sets while addressing the topic of sustainable development from a purely environmental perspective.
While the nature of this task is vital, there is great opportunity and risk in moving beyond the traditional types of assessment and state-of-environment reporting on status and trends, to an examination of physical mechanisms and dynamic processes. The major objectives of IEAs in general, and the GEO project in particular, have been elucidated as follows:
- provide insight into the interaction between environment and socio-economic and institutional factors, particularly at global and regional levels, using new methods & tools for the analysis of these interactions;
- assess, through an iterative process, progress made towards sustainable development;
- identify strategic and emerging issues that require international attention, amongst others, through projections into the future;
- support international policy setting and action taking on priority issues; and
- strengthen capacities, particularly in developing countries, for integrated, policy-relevant assessments.
(after V. Vandeweerd, "Proposal for Annotated Outline and Workplan for the first edition of GEO", May 1995.)
Thus, IEA reports analyse issues of international importance, but often with a major emphasis on regional perceptions and priorities, and do so using new or innovative methodologies. The integrative nature and breadth of IEA activities which are undertaken by the international agencies necessitate a collaborative approach, with significant participation from all of the major reporting agencies and institutions. These analyses require vast amounts of data on a wide variety of themes as input to models, for scenario generation and evaluations of indicators, without which proper results will not be forthcoming for IEA reports.
Thus, there is a common need among major data-producing and reporting agencies and institutions for access to timely, accurate and quality-controlled data and information relating to both human and physical environments, as fundamental inputs for their respective IEAs. This includes provision of best-available (to date) global and regional data sets, and coordination of their improvement where needed.
II Summary of Major Global Data Initiatives
Many global organisations having a mandate to report on development and environment-related activities publish major data reports on an annual or less frequent basis. These include many UN agencies and specialised bodies thereof (UNDP, UNEP, UNESCO, WHO, WMO, FAO etc.), intergovernmental organisations (the World Bank), and private/public research institutes (the World Resources Institute, for example). These reports are a vast source of published information on most if not all sectors of the global economy and society, including a variety of topics related to both environment and development, and typically present a fairly up-to-date picture of important global issues and trends such as agricultural and economic production, land degradation, pollution of air, soil and water, human health and welfare, etc.
These reports include a great deal of information in the form of tables, graphs and charts as collected by the agencies themselves or third parties, often supplied by countries at the national level or by other governmental bodies at the regional level. Because of different means of collection or measurement, for example, these basic data often are not comparable or compatible from country to country or region to region, and thus need to be standardised categorically, geographically and/or temporally before they can be aggregated into one more-or-less harmonious and valid presentation. In some cases the data standardisation is done by the original source of the data, but in other cases it may be done by the final data publisher.
More and more, these reports are attempting to present integrated information on development and environment in the form of indicators of sustainable development (ISDs). Such indicators often allow more "basic", aggregated variables to be summarized with a single number or statistic, to simplify more complex phenomena and improve communication.1
However, one form of data which these reports normally lack (though they may contain an occasional reference) is digital, geo-referenced environmental data sets or "computerised maps" for use in geographic information systems (GIS) and related analyses. Such data sets are one of the most basic and necessary inputs for modelling studies which attempt to predict land/oceans/atmosphere interactions, anthropogenic impacts on and changes to these natural systems, and the complicated feedback mechanisms which affect both human and physical systems. They are also a vital component of regional and global-scale integrated environment assessments (IEAs), as well as attempts to predict and provide early warning of future environmental problems and trends.
1 Hammond, A., Adriaanse, A., Rodenburg, E., Bryant, D. and R. Woodward. "Environmental Indicators: A systematic approach to measuring and reporting on environmental policy performance in the context of sustainable development". A report of the World Resources Institute, May 1995, 43 pages plus Appendices.
III Summary of Recent Related Meetings
Over the previous few years, there have been a number of meetings held in an attempt to come to grips with and define the specific data-related needs of IEA. One of the most prominent and recent of these meetings took place in mid-November 1994 in Bangkok, Thailand under the title of "International Symposium on Core Data Needs for Environmental Assessment and Sustainable Development Strategies". The meeting, which was co-sponsored by UNDP, UNEP, U.S. EPA, NASA, USGS and USRA, issued a report including the papers given, other proceedings and results in two volumes. The stated objectives of the Bangkok meeting were as follows:
- Seek consensus on priority environmental assessment and sustainable development (hereafter, EA&SD) issues and the core data sets needed to respond to these issues;
- Define the minimum characteristics of these data in relation to national and trans-national purposes;
- Establish collaborative mechanisms to foster harmonisation of core environmental data; and
- Examine the barriers to their general access and use.2
The Bangkok meeting succeeded in identifying ten high-priority "core data sets" which are central to many types of EA&SD-related studies. In fact, the recommended list is a compilation of ten broad data themes which are common to most IEAs, as follows:
- Land use/land cover
Within these broad data themes, a series of 66 specific data sets were identified by Topical Panels and Regional groups of the Bangkok meeting. These were ranked in terms of importance and grouped under the ten major data themes, after being discussed in an open session. This allowed the participants to see which "core data sets" were deemed to have various priority levels by both the Regional and Topical Panels.
Among the recommendations of the Bangkok meeting were the following:
- That a forum be established to provide follow-up and develop action plans to carry out (these) recommendations, under the sponsorship of UNDP and UNEP, with a standing core membership and links to other, similar fora;
- UN agencies and donor organisations should influence national bodies to help create and maintain core data sets;
- International agencies, donors and national governments should work together to promote an understanding of the need for, status of and general knowledge about core data sets (ibid, pp. 3-4).
2 "Report of the International Symposium on Core Data Needs for Environmental Assessment and Sustainable Development Strategies", Bangkok, Thailand, 15-18 Nov. 1994, Volume I, Executive Summary, page 1.
Thus, the current "Core Data Sets Working Group" or DWG meeting can be seen as a logical follow-up to action either taken at or further recommended by the Bangkok meeting. Indeed, one of the major objectives of the current meeting and work carried out prior to it was to confirm the list of core data sets, and define them in terms of their general characteristics and availability for IEA-related studies. Other explicit objectives of the current DWG Meeting are as follows:
- Identify and agree upon a limited number of core data sets that can be used by all major IEA report producers;
- Agree on potential cooperation for a common, distributed database system, and if considered feasible a meta-database to document the contents of same, in order to service IEA data and information-related needs, as well as a strategy for data set distribution and maintenance;
- Determine the major critical data gaps and shortcomings for those data variables in the current list which lack corresponding known data sets;
- Agree on a realistic strategy among major data-producing agencies and institutions to fill these data gaps, taking into consideration the resources which may be available for such an effort;
Section IV of this paper deals with the list of proposed core data sets/variables themselves, and the matrix of their defining characteristics (see Annex I) which was prepared for discussion during the current DWG meeting.
During the week previous to the current DWG meeting, a series of three other meetings on UNDP/Development Watch, UNEP/Earthwatch and UNEP-DPCSD Common/Compatible Systems of Access to Data and Information are to be held. It is also anticipated that these meetings (particularly the latter) will confirm many of the core data needs, as well as reinforce the concept that the UN and other major global data producing/reporting agencies and institutions must work closely together in the provision or sharing of core data sets, and the relevant information (meta-data) which describes and makes them useable for IEA and other studies.
Indeed, one could go a step further and suggest that the same agencies and organisations could use the opportunity of the current DWG Meeting to begin discussion of a distributed "environment and development" UN database system, which would be commonly run and maintained to the benefit of all, and follow principles of open data access enshrined in various international treaties.
IV The "Core Data Sets" Matrix and related Definitions
The general purpose of the current IEA Core Data Working Group Meeting is to progress beyond the work already completed at the Bangkok and other meetings related to "core data"; that is, not only to confirm the list of general data themes and specific data sets for IEA studies, but to identify their status in terms of an entire series of characteristics. The following descriptive items are all included in the Matrix of Potential/Proposed Core Data Sets for IEA Studies (Annex I):
- Title of major theme and variables; then, for each appropriate/corresponding data set:
- Original source of the data set;
- Current holder/provider of the same (responsible for technical coherence of the data, if not actual data contents and quality);
- Type of data (from geo-referenced to statistical, in descending order of desirability/priority):
-- Digital, geo-referenced or "GIS-ready" data sets
- Data format, whether digital or analog only;
- Spatial coverage, normally, this will be global or regional;
- Resolution or scale, normally, as associated with raster or vector data in geographic format;
- Frequency/Reference year, that is, how often produced or published, most recent or single date if data set/map only collected/created/issued once;
- Quality indicator, this column indicates only if the data set in question is currently adequately documented and accompanied by suitable meta-data including lineage and some quality rating;
- Remarks, name of the data set and other relevant or useful information, e.g. public or proprietary data.
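As an illustration only, one row of the matrix described above could be represented as a simple record. The following is a minimal sketch: the class name, field names and example values are assumptions made for this example and are not taken from Annex I.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CoreDataSetEntry:
    """One matrix row; all field names are illustrative, not from Annex I."""
    theme: str                                   # title of major theme
    variable: str                                # data variable under that theme
    original_source: Optional[str] = None        # who produced the data set
    current_holder: Optional[str] = None         # who provides it today
    data_type: Optional[str] = None              # e.g. "geo-referenced", "statistical"
    data_format: Optional[str] = None            # "digital" or "analog"
    spatial_coverage: Optional[str] = None       # normally "global" or "regional"
    resolution_or_scale: Optional[str] = None
    frequency_or_reference_year: Optional[str] = None
    quality_documented: Optional[bool] = None    # True/False, or None if unknown
    remarks: str = ""                            # data set name, access conditions


# Hypothetical entry; the theme value is a placeholder.
entry = CoreDataSetEntry(
    theme="Example theme",
    variable="Elevation",
    data_type="geo-referenced",
    data_format="digital",
    spatial_coverage="global",
    remarks="ETOPO-5",
)
```

Fields left as None correspond to cells of the matrix that have not yet been filled in.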
In order that the DWG Meeting discussions can proceed, and that future activities can take place in a coherent fashion, a number of definitions must first be accepted or agreed upon by most if not all of the participants. The following definitions are proposed, with examples given for each one:
Data Theme: a general data category or heading which can be used to group many specific data variables and data sets; e.g., Agriculture, Climatology, Demographics/Health, Economy, Infrastructure, Supporting Data.
Data Variable: a specific parameter relating to the human or physical environment, the state or condition of which can be mapped, measured or recorded; data variables can all be classed under one of the major headings; for example, all of the following:
Data Set: a specific manifestation or rendering of a data variable, that is collected, created, produced and/or published at a specific point in time (or over a known period) by an identifiable institution(s) and/or person(s), and including all relevant meta-data necessary for its proper application for a specific purpose or study; e.g., the "Major World Ecosystem Complexes based on Carbon in Live Vegetation" from Olson et al., U.S. DOE/ORNL, circa 1986.
Data file: a single computerised manifestation of the above, normally available in digital format only and without meta-data.
Data Set Collection: a series of data sets linked by theme, geography, origin or other criteria.
Basic Data Set or Variable: equivalent to "baseline, fundamental or raw" measurements from gauges or met. stations, satellites or statistical surveys. These data can be aggregated, quality-checked and refined for further use, but are neither interpreted nor converted to GIS (geographic) format; instead, they are the necessary inputs for the same; some examples are:
- monthly precipitation totals/temperature averages from stations;
Core Data Set or Variable: one on which a consensus has been reached and/or which "prevailing wisdom" dictates is necessary for, as well as common to, multiple IEA and Sustainable Development studies; these are often derived from Basic Data Sets and are needed by many agencies/groups/individuals; e.g., a World Gridded Elevation map such as the 'ETOPO-5' data set; long-term Monthly Average Precipitation and Temperature data sets.
Derived Data Set or Variable: as "Core Data Set" above, except derived from another data set as a "second-generation" product; e.g., a shaded relief map of the world derived from a gridded elevation data set, or climate anomalies based on Monthly Precipitation and Temperature data.
EA & SD Indicator: an aggregated, representative and/or simplified version of one or more of the above core data sets used in IEA studies, which allows for better communication about and greater understanding on a particular EA & SD issue or topic; e.g., greenhouse gas emissions, protected areas as % total land area threatened.
V Proposed Core Variables and Data Sets
Origins and Relevance
The proposed list of core data variables and data sets potentially useful for IEAs, included as Annex I of this Background Paper, was initially prepared by RIVM's Informatics Service Centre (ISC), and draws heavily upon the list of 66 specific data sets compiled by the Bangkok meeting. This matrix and the proposed "core data sets" list are very similar, though not identical, to those derived at the Bangkok meeting, in terms of both major headings (Data Themes) and specific Data Sets. Completion of the Data Set Matrix and suggestions for a few additional data variables were carried out by UNEP/EAD/GRID-Geneva staff, with assistance from GRID-Nairobi for the statistical data reports. Other inputs were received on a more preliminary version of the list from staff of UNEP/EAD-Nairobi, UN/ECE and the World Bank.
The Data Set Matrix was filled in partially by electronic searching for data sets via the World Wide Web (WWW), and partially by going through existing data reports of the major data-producing agencies and institutions, as well as known data archives, CD-ROM products and major data repositories of the international agencies and various national/regional governmental bodies.
At the data theme and data variable levels, the matrix follows closely the list established by the Bangkok meeting. What is entirely new in this version, however, is the naming (identification), location and major characteristics of specific data sets proposed as preliminary "core data sets" for the global data requirements of IEA-related studies. In cases where no data set is available, or only less than adequate data, this is indicated by a blank, question mark and/or a relevant comment under "Data Set Name/Remarks". Thus the matrix can also be used to identify critical data gaps and shortcomings for IEA-related studies. For example, the fact that no coherent data set could be located for the variable "Morbidity" means that it needs either to be found (should it exist), or somehow compiled or created.
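The use of the matrix to flag data gaps can be sketched in a few lines. This is a hedged illustration only: the function name and the small example dictionary are hypothetical, and the check covers only blank and question-mark entries, since free-text comments signalling inadequate data would still need to be read by a person.

```python
def is_data_gap(data_set_remarks):
    """Return True if the 'Data Set Name/Remarks' cell is blank or only '?'."""
    text = (data_set_remarks or "").strip()
    # Removing question marks leaves nothing for entries such as "?" or "??".
    return text.strip("?").strip() == ""


# Hypothetical extract of the matrix: variable -> remarks cell.
matrix = {
    "Morbidity": "",
    "Elevation": "ETOPO-5",
    "Example variable": "?",
}

# Variables for which no data set could be named.
gaps = sorted(name for name, remarks in matrix.items() if is_data_gap(remarks))
```

Here `gaps` lists the variables that would need a data set to be found, compiled or created.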
Question of Data Quality
It was not within the scope of the current activity to determine detailed data quality information for each data set, as this is a major task in and of itself. Instead, a "yes/no" appears under the "Quality Indicator" column to show whether or not a given data set is accompanied by documentation and/or other relevant information ("meta-data") such as on how it was developed; by whom; for what use/application(s)/purpose(s); its technical characteristics; name, date and citation of published paper or other scientific material which are necessary to render the data set useful for IEA studies. In any case, the purpose of including a field for data set "quality" is to draw attention to the critical importance of this item, which ideally should be available in a standardised format, and without which data may not be useable at all.
Question of Access/Availability
While the matrix identifies the original source and current holder of given data sets, it does not specifically comment on their actual availability or conditions of access, unless they are known to be proprietary or restricted for other reasons (such as "security"). Normally, a given data set should be available from the "Current Holder" for no charge or on a cost-recovery basis. One of the critical points the DWG Meeting should address is how to handle exceptional cases of missing and proprietary data sets (see below).
Problems posed by Proprietary Data Sets
Given the increasing trend - even within the UN system - to "privatise" or at least charge for data, it would behoove the DWG Meeting participants as a group to again call for the highest level of international cooperation to make such data more and not less accessible, particularly for IEA studies. Furthermore, any collaboration undertaken between the data-producing agencies and institutions to find or create additional core data sets should proceed on the basis of "free and open access" for all UN family members, governments at all levels, public institutions and the general public.
This is not to minimize the difficulty of opposing or overcoming the trend towards increasingly restrictive national and international data access policies, but the major international data-producing and reporting agencies, inter-governmental organisations and private/public research institutions can at the very least try to set a better example. Given that the "core data sets" are required for so many purposes including IEAs and that they are typically available at global and not national scales (i.e., have limited geographic detail and thus are "non-threatening" in terms of national sovereignty), there should be no reason for any agency to restrict their availability. This is particularly the case when they have been collected or created through international efforts or programmes utilizing public funds (see also the Bangkok meeting report, Volume I, p. 35 and p. 42).
VI Recommendations for Future Activities/Cooperation (among the major data-producing/reporting agencies)
Core Data Sets for IEA
It is anticipated that the DWG Meeting will make progress towards final agreement on the proposed core data sets and their appropriateness, given their specific characteristics, for IEA studies. While many of those listed should be "confirmed" as vital for this purpose, others may be disputed or even eliminated from the list as not necessary. In any case, it is preferable that the core data sets list be seen as dynamic, flexible and open to addition rather than closed, as new data sets may become available and needs change through time. Thus, a second tier of desirable but non-core data sets could also be designated by the DWG Meeting, to be addressed as a second priority and only as time, opportunity and resources allow.
Strategy for filling the major data gaps
It is also anticipated that discussion about the data set matrix and its contents will reveal the existence of further data sets not yet identified, but also the need to determine the existence or collect/create key data sets which are missing, as well as to upgrade current data sets which exist only in analog or non-geographic formats to a GIS-compatible or at least more advanced, cartographic status for use in IEA-related studies.
Ideally, the DWG Meeting will attempt to develop a specific plan and timeframe for filling data set gaps, but this is dependent on the willingness of participants to take on various responsibilities, either working individually or together to improve and expand the current collection of core data sets for IEA. During the DWG Meeting and at the appropriate time on the agenda, the chair would welcome any useful ideas or suggestions that might be forthcoming on this topic.
Strategy for Sharing Existing Data Sets (and meta-data)
Given that the major data-producing and reporting agencies and institutions have a similar need for direct access to core data sets for IEA-related studies, and that many of these are already in their possession, it is proposed that such data sets should be made available to all participants by the current holders. This also implies a responsibility to maintain the data sets in question as up-to-date as possible, and to assure that they are accompanied by coherent and comprehensive meta-data which allows for their proper use. In cases where the current holder of a given data set is unable to perform or provide such services, the DWG Meeting can attempt to identify another candidate agency or institution to do the same tasks.
A similar strategy is proposed, at least as a "default solution", for those data sets which are not currently available in geographic format. The current holders could be asked to devise a plan and timetable for rendering, e.g., tabular data in a GIS-ready format, even if this only means creating a global map of administrative polygons with some statistical attribute. Again, in cases where the current holding agency or institution is unable to take on such responsibilities, the DWG Meeting needs to discuss viable alternatives.
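The "default solution" mentioned above, attaching a statistical attribute to administrative polygons, amounts to a simple join on a shared code. The sketch below is illustrative only: it assumes GeoJSON-style feature dictionaries, and the function name, the `admin_code` key and the sample values are all hypothetical.

```python
def attach_statistic(features, table, key="admin_code", attribute="value"):
    """Copy each feature and add table[code] as a new property.

    Returns the joined features plus the codes that had no table entry.
    """
    joined, unmatched = [], []
    for feature in features:
        props = dict(feature["properties"])      # copy, do not mutate the input
        code = props.get(key)
        if code in table:
            props[attribute] = table[code]
        else:
            props[attribute] = None              # polygon without a statistic
            unmatched.append(code)
        joined.append({**feature, "properties": props})
    return joined, unmatched


# Two made-up administrative polygons and a one-row statistical table.
square = {"type": "Polygon",
          "coordinates": [[[0, 0], [1, 0], [1, 1], [0, 1], [0, 0]]]}
features = [
    {"type": "Feature", "properties": {"admin_code": "AAA"}, "geometry": square},
    {"type": "Feature", "properties": {"admin_code": "BBB"}, "geometry": square},
]
table = {"AAA": 12.5}

joined, unmatched = attach_statistic(features, table)
```

Reporting the unmatched codes separately makes it easier to see which administrative units still lack tabular data.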
Formalization of the above activities
The chair of the DWG Meeting will consider any ideas or suggestions which may be forthcoming on the need for or utility of "formalizing" any or all of the above-mentioned activities and tasks. For instance, the participants may decide, based on impetus developed during this and the prior week's meetings (particularly on "Access to Data and Information"), that a common, distributed database system should be deliberately set up or at least encouraged to evolve among/between the major data-reporting agencies and institutions. In similar fashion, consideration of a common meta-data structure and eventually a shared tool for meta-data management (input/access/update etc.) can be discussed among the participants.
The probability is that in the current situation of scarce human as well as financial resources, such a shared database of core data sets and/or related meta-data system will in any case not develop overnight. By the same token, the very scarcity of resources makes it more likely, if not imperative, that the major data-producing and reporting agencies and institutions work ever more closely together to make core data sets and their related information for IEA studies readily available and accessible, to the great advantage of all groups and individuals who are concerned in working with the same.
The IEA/GEO Core Data Working Group (DWG) Meeting will only be as successful in launching common data-related activities of the major data-producing and reporting agencies and institutions, as the participants are interested in devoting both time and resources to the development and management of "core data sets" and related information. It is certainly possible to stimulate useful discussion on this subject and perhaps agree on how next to proceed, but to achieve concrete results within a few years' time is another matter. Ultimately, how well this and other, similar meetings may serve to catalyse necessary activities among the major data-producing and reporting agencies will be measurable by the number of shared core data sets and the amount of related information transparently available, and the level of common effort devoted to create, improve and manage all of the same in the years to come.
Annex I Core Data Sets/Variables Matrix for IEA
Annex II List of Major Global Data Reports
World Development Report
IEA/GEO CORE DATA SETS/VARIABLES MATRIX; VERSION 3.
[not presently available in a web version]
BRIEF SUMMARY OF INDIVIDUAL AGENCY PRESENTATIONS MADE DURING THE OPENING SESSION OF THE IEA/GEO DWG MEETING
After the opening of the DWG Meeting, the round of introductions and expectations from all participants and review of the agenda, Ms. Veerle Vandeweerd of UNEP/EAD presented the background of and purpose for the Meeting. This can be summarised as the need to identify and agree upon a limited number of core data sets for IEA/GEO studies, as well as how to make these same data sets available to the major global data-producing and -reporting agencies and institutions, developing countries in general, and the 15 collaborating scientific institutes in particular.
She further explained that there are four working groups for GEO: Scenarios (led by SEI); Modelling (led by RIVM); Policy, and this group, the Data Working Group, led by UNEP/EAD. Data also need to be made consistent for such projects as the Global Water Assessment. Within GEO, looking at the root causes of regional environmental problems and issues is very important. GEO will involve use of "business-as-usual" and other scenarios (best- and worst-case) and models, as well as international policy responses, to deal with these environmental problems. In any case, it is considered vital to agree on a process for coping with the data-related issues during the current DWG Meeting.
CIESIN (V. Abreu) is involved in preparing information systems, which encompass both general meta-data and detailed "guides" to data sets. These are increasingly accessible via the World Wide Web (WWW), and e-mail for the developing world. They are also working on tools for data access, extraction and ordering, CD-ROM publication, interactive communications and feedback as well as recently starting work on Land Quality Indicators (LQI). CIESIN has also established a number of nodes in Central/Eastern Europe (especially the Baltics), and Asian countries including China. The purpose of these centres is to help identify and get access to key national data sets for the CIESIN Information Cooperative.
RIVM (J. Van Woerden) through its C. I. M. (CIM) is deeply involved in integrated, quantitative assessment and forecasting studies, particularly as the lead centre for GEO Modelling. Some of the specific data needs for GEO include data sets extending back to the year 1900, with spatial resolutions and scales roughly equivalent to 10 minutes latitude/longitude, and 1:250,000. Significant amounts of data (100s of data sets) and meta-data are available through systems developed at RIVM, and they are developing as well the capacity to do quality checking of environmental data through such tools as QUACO. Data needs were determined through an internal survey in 1994.
UN/ECE (A. Kahnert) observed that "core data sets" are only a small part of the total data needed for IEAs. Surrounding arrangements relating to data access, quality and updating are equally (if not more) critical, and corrections and improvements often need to be done to data sets (this is often complicated). The big issue is data quality, and knowing what the data mean. The current/future efforts of UN/ECE in the area of environmental information can best be described as "working for an integrated package".
Developmental efforts are focussing on the countries-in-transition for which Environmental Performance Reviews (EPRs) will be undertaken, at the rate of three to four per year. At the time of EPRs, (word???) will be associated with the process to help capacity building for environmental information in these countries. Simultaneously, it could be efficient to associate other measures with the EPR process, which could help gather information useful for regional and global reports (such as GEO and others) in the same countries-in-transition.
IBRD (World Bank; J. Bakkes) needs environmental data at the project level for project quality performance. Here the Bank is working closely with UNEP/GEMS data and GRID, GEF and WRI (the latter on biodiversity). For their environmental indicators programme, they are promoting concrete plans to share data with other agencies (as opposed to big coordination meetings or new structures) and the application of indicators for sustainable development (ISDs) at the country level. As far as the Bank is concerned, their co-users of the critical core data sets are such agencies/institutions as UNEP in GEO, UNDP in their Human Development Report, the Earth Council/Earth Report, CSD in their indicators scheme/structure and WRI in the World Resources Report.
The World Bank's work with UNEP and FAO on indicators of land quality was given as an example of their innovation in the realm of environmental indicators. In addition, special attention will be given to policy indicators. This has the objective of closing the loop with the Pressure-State-Impact-Response (PSIR) model. They are also working on deriving costs and financing of environmental protection measures. Furthermore, the Bank is revising the data series it publishes itself, though in general it is not a data supplier but a user or recompiler of others' data.
UNDP/UNSO (P. Gilruth) discussed their support for and input to the UN-DPCSD methodology sheets on (e.g.) capacity building and desertification, and the testing and evaluation of proposed ISD indicators in eight pilot countries. In a quick review of UNDP's participation in Development Watch, it was stated that the focus has to be at the national as opposed to international level, with capacity building as an important component which has to be linked back to Development Watch.
UNEP's System-wide Earthwatch (A. Dahl) commented that indeed Development Watch should help countries to start standardising their data, primarily for their own national use, but so that countries can eventually produce data which are compatible enough for usage in international reporting. The potential importance of linkages with the Global Observing Systems (GxOS; where x = C/limate, O/ceans and T/errestrial) was also mentioned, and that while these remain mostly in the planning phases, in the medium term it should be possible for these Systems to provide data needed for decision-making as well. Existing machinery within the UN system for sharing of data and searching for information related to sustainable development needs further development, but there are currently at least seven related efforts underway (also, cross-linkages between data producers and modellers).
UNSD (G. Carissimo) reported that National Statistical Services, directly and through the Statistical Commission, have repeatedly expressed their concern about the dissemination of unofficial and undocumented statistics, which are often unreliable and inaccurate, and about their use in international work. In response to this concern, UNDP has requested DESIPA to develop methodological standards and standard sources for a common system statistical database. Most of the principal international sources are abstracted in statistical CD-ROM and indexed in the Stat-Base Locator on diskette. A unified standard structure and access system for economic and social information is being developed by DESIPA as the United Nations Economic and Social Information System (UNESIS) with high priority for 1997 operation. UNSD is planning to publish in 1997 a Compendium on Environment Statistics and is at present working on a questionnaire on environmental indicators.
EEA (D. Stanners) stated that we all need to have confidence in core data sets, and that for most environmental data Eurostat is their most direct partner/interlocutor. The attention of the DWG Meeting was drawn to the "why" and "how" in identification of the core data sets, since this depends so much on their ultimate use. The EEA has specific "Topic Centres" (ETCs) for various data themes, which are charged with the responsibility of delivering core data sets, but the current list for the DWG Meeting was not all that recognisable or similar to what they are producing.
Finally, for EEA the issue of regionalisation is particularly important - they are already working closely with e.g. UNEP's Regional Office for Europe (ROE), and do not wish to duplicate activities there. Regional to global data integration is also crucial, but it is not always obvious how this can be done. Environmental indicators are the driving force of EEA activities in order to update their environmental plans. They are deciding at this moment which indicators to look at and therefore what are the data needs, nicely matching the stage this Meeting is at.
UNEP/EAD/GRID (A. Singh) briefly discussed the need to improve data management, collection, updating etc. capabilities within the entire UN system, and mentioned that data sets or reports are often based on old or out-of-date information by the time they are made generally available (the FAO Global Forest Assessment, e.g.), due partially to the fact that adequate data often do not exist within countries. In fact, in most of the developing countries, there is a lack of capacity, mechanisms and technology for collecting data. Also, very few new global data sets are published on an annual basis (some examples are the Global Land Cover database in preparation, the Global Drainage Basins, Global Elevation, Human Population etc.) by the international system (UN agencies etc.). He also showed a pie chart of "GIS Software Systems Installations" by continent.
WRI (E. Rodenburg) explained that they are a policy-oriented institute which publishes the World Resources Report (next issue due in March 1996). They try to give context to descriptive chapters by presenting (core) data tables. For WRI, these core data include economic, financial, human population and health and generally any (reliable) data which are available. WRI tries to list data which managers or decision-makers will need at global, regional and national levels; the problem is at the local level, and there is also a conflict between the "top-down" and "bottom-up" approach in data collection and integration (as well as the fact that different data are needed at the global and national levels). WRI is also involved in development of indicators.
With more and more geo-referenced data available, core data is no longer only statistical or tabular in nature but includes (increasingly) GIS-format data sets. An example of this is the "Africa Data Sampler" (D. Tunstall) at 1:1 million scale for individual African countries being distributed along with the ArcView software; another similar such product is being made for Latin America by CIAT; eventually, such continental/regional data sets put together may yield coherent global coverage(s).
Decision-makers, however, need a simple desk-top reference tool to give "geographic context" and a means of looking at data, as opposed to a sophisticated GIS tool. Finally, data quality is extremely important and always relates back to use, since many data sets are often applied to purposes they were not meant for.
NASA (USRA; W. Mooneyhan) reported on the Bangkok Core Data Needs Meeting of November 1994, which was sponsored by UNEP, UNDP and several other agencies/institutions (see pages 3-4 of the DWG Meeting "Background Paper" in Annex II for a summary). It was explained that not just a lack of data, but lack of accessibility to basic data sets is the real problem. Also, all too often data simply do not exist at the level of detail which is needed to make certain decisions, particularly at the national and local levels, and this lack of data is greatest in the human development realm (demographics and social equity data etc.).
Following all of the above presentations, Ms. Vandeweerd, reflecting on some of the prevailing/recurring comments made, stated that while there is a real problem with data collection from the "bottom up", this DWG Meeting is more concerned with the global or "top-down" approach, and needs to concentrate on the availability of at least several critical data sets, as well as new means of delivering these data to the world community.
SEI (G. Leach) re-emphasized the overwhelming importance of data quality, and explained that the overall error rate in much data is so high as to render them mostly if not entirely useless. Also, data are often needed in much greater detail than is currently available; for example, in land use/land cover modelling, most existing data are completely inadequate.
IDENTIFIED DATA GAPS AND SHORTCOMINGS
Data Variables ranked as 'A', but with no corresponding data sets or only data sets rated as "minus" or "zero" in terms of utility:
Forest characteristics (Forest
Not discussed were the following Data Themes/Variables:
Provided as a Viewgraph by RIVM's Jaap van Woerden
Traffic (vehicle type, per engine/fuel type; JvW to contact AK for references)
META-DATA ATTRIBUTES FOR CORE DATA SETS
I Outline of Metadata Requirements
A Directory Entry
(A Short Description)
II Guidelines should include:
1 Database Name
Items in parentheses are examples, * = most value-added
SOURCES OF DATA AND BACKGROUND INFORMATION
FAO (agric., fisheries, forestry,
U.S. Federal Agencies
NOAA/NGDC and NCDC (climate)
NCGIA (human pop.)