Computation services at DKRZ
The German Climate Computing Centre (Deutsches Klimarechenzentrum, DKRZ) is a central service centre for German climate and earth system research. Its high performance computers, data storage and services form the central research infrastructure for simulation–based climate science in Germany. The DKRZ also hosts a multi–petabyte repository of climate data that can be accessed through the WWW.
A growing number of users call for web–based interfaces for running computations on the data resources hosted by the DKRZ. The DKRZ thus participates in national and international joint projects and cooperations with the aim of providing compute services for climate data. In this context, PyWPS has been used to implement and deploy standardised computation services that can be invoked and orchestrated by users. These activities have led to contributions to, and active support of, the PyWPS and OWSLib projects.
In collaboration with Ouranos (see below), the DKRZ started the Birdhouse community effort [36], with the aim of developing auxiliary components for WPS projects. Birdhouse intends to facilitate the creation and usage of compute services based on PyWPS. The components provided include:
- A Cookiecutter template [37], including Dockerfiles, that allows users to create their own PyWPS compute services.
- An Ansible script [38] to deploy a full–stack PyWPS service.
- A Python library suitable for Jupyter notebooks [39] to interact with WPS compute services.
Canadian climate service platform
Ouranos is a consortium on regional climatology based in Montréal, whose mission is to support adaptation to climate change [40]. This involves working with engineers, biologists, urban planners and professionals from dozens of different disciplines to understand their climate data needs and provide them with clear, understandable and actionable information. To speed up and standardize the delivery of climate services to users and facilitate collaborations with academics, Ouranos has been developing, together with the Centre de Recherche en Informatique de Montréal (CRIM) and the Birdhouse community, a climate service platform based on WPS [40]. The platform includes a handful of thematic servers powered by PyWPS, offering tools to search data catalogues, compute climate indices, subset and aggregate climate data, run hydrological models using climate projections, and apply a number of other specialized algorithms. These servers are spun up as Docker containers and accessed through a proxy that handles load–balancing as well as authentication and authorization for files and services. A high–level WPS client based on OWSLib has been developed to simplify access to those services and to provide users with an interface to remote processes that look and feel like normal Python functions.
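The idea of remote processes that behave like ordinary Python functions can be illustrated with a minimal sketch. It is not the client developed by Ouranos and does not use the OWSLib API: the helper `make_process_function`, the stand-in transport `echo_execute`, and the process name `subset_bbox` with its inputs are all hypothetical, and no server is contacted. A real client would presumably derive the input names from the server's process description and send a WPS Execute request in place of the stand-in.

```python
from typing import Any, Callable, Dict, Sequence

# Signature of the transport that would actually run the remote process.
Executor = Callable[[str, Dict[str, Any]], Any]


def make_process_function(
    identifier: str,
    input_names: Sequence[str],
    abstract: str,
    execute: Executor,
) -> Callable[..., Any]:
    """Return a plain Python function that proxies one remote process."""
    allowed = set(input_names)

    def process(**inputs: Any) -> Any:
        # Reject keyword arguments that the process does not declare.
        unknown = sorted(set(inputs) - allowed)
        if unknown:
            raise TypeError(f"{identifier}() got unexpected inputs: {unknown}")
        return execute(identifier, inputs)

    # Make the proxy introspectable like a hand-written function.
    process.__name__ = identifier
    process.__doc__ = abstract
    return process


def echo_execute(identifier: str, inputs: Dict[str, Any]) -> Dict[str, Any]:
    """Stand-in transport (assumption): echoes the call, contacts no server."""
    return {"process": identifier, "inputs": dict(inputs)}


# Hypothetical process name and inputs, chosen for illustration only.
subset_bbox = make_process_function(
    identifier="subset_bbox",
    input_names=["resource", "lon0", "lon1", "lat0", "lat1"],
    abstract="Subset a dataset to a bounding box (hypothetical process).",
    execute=echo_execute,
)

result = subset_bbox(resource="tas.nc", lon0=-80.0, lon1=-70.0, lat0=44.0, lat1=50.0)
print(result["process"], sorted(result["inputs"]))
```

Keeping the transport injectable separates the function-like interface from the protocol details, so the same wrapper could be exercised offline or pointed at a live service.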
The long–term vision for this platform is to facilitate trans–disciplinary collaborations by packaging state–of–the–art scientific expertise into easily accessible web processes available to non–experts. This includes not only climate algorithms but also impact models driven by weather and climate conditions. Ouranos hopes to partner with similarly minded institutions to build a federation of climate service servers which, through conventions on data and metadata formats, would be interoperable. Over time, this would lead to scientists writing complex workflows that chain operations and data from multiple institutions. Not only could this reduce the time spent on tedious low–level work but, more importantly, it could make experts' research outputs easier for non–specialists to discover and access.
ECOPOTENTIAL
Environment Systems is a data–driven environmental and agricultural consultancy established in 2003 in the UK, specialising in Geo–Informatics and Earth Observation. Within the ECOPOTENTIAL project [41], this consultancy has been researching the potential of wrapping some of its satellite data processing algorithms in WPS processes.
ECOPOTENTIAL is a large European project funded by the Horizon 2020 research programme of the European Commission [42]. It focuses its activities on a targeted set of internationally recognised protected areas, blending earth observations from remote sensing and field measurements, data analysis and modelling of current and future ecosystem conditions and services, ready for operational delivery. ECOPOTENTIAL considers cross–scale geosphere–biosphere interactions at regional to continental scales, addressing long–term and large–scale environmental and ecological challenges.
Environment Systems already provides analysis–ready satellite data through its data services platform; the research has thus been centred on potential WPS servers that could wrap those services. Different implementations have been assessed, considering three essential aspects: (i) ease of use, (ii) interoperability with other services such as WCS or WFS, and (iii) integration with Celery [43] for scaling. The last of these in particular has proved challenging, since it has not yet been possible to run Celery on Python 3.6. At present, a series of synchronous WPS services is being developed and will soon be available at the Environment Systems demonstrator site [44]. Even though a specific WPS server implementation is yet to be definitively selected, Environment Systems has built considerable experience with PyWPS and has contributed back to the project.
Data quality assessment at WOUDC
Since 1962, the WOUDC has been the World Ozone and Ultraviolet Data Centre component of the World Meteorological Organisation (WMO)’s Global Atmosphere Watch. The WOUDC is operated by the Meteorological Service of Canada, a branch of Environment and Climate Change Canada. Its data archive includes total column and vertical profile measurements of ozone obtained through LiDAR, ozone–sonde flights, and the Umkehr technique. The archive further comprises Ultraviolet (UV) radiation measurements, including high resolution spectra. There are over 500 registered stations in the archive, contributed by more than 150 different institutions.
The WOUDC provides an online data archive, together with metadata (e.g. station location) and value–added products such as graphs of total ozone time series and near real–time ozone maps. OGC standards are employed to provide standards–based, on–demand access to the archive.
In 2015 the project was renewed with a focus on data access using standards [45]. This refocus targeted not only data services, but also internal processes. The WOUDC must process, quality assess and ingest into its archive the ozone and UV datasets provided by contributing institutions. Rejected data are reported back to the contributor for subsequent correction and re–submission. WPS was employed to facilitate these procedures.
In 2016 the WOUDC deployed a PyWPS stack to expose its data processing and assessment services to the public via WPS. This allows contributors to quality assess their data before submitting them to the WOUDC. These PyWPS services not only help reduce the internal quality assessment workload, but further provide a means of real–time data validation [46]. The functioning of each of these services is documented in detail to facilitate their use [47].
The earth observation monitor
The Earth Observation Monitor (EOM) aims to ensure easy access to, and analysis of, spatial time–series data for land monitoring on local scales. The concept behind EOM combines the advantages of web service–based geo–processing with easy–to–use interfaces; this provides easy access for users without specific knowledge of data processing techniques. EOM focuses on hiding existing barriers (e.g., manual data download, (pre–)processing, data conversion) through automation. This enables users to focus on the analysis and interpretation of results.
The back–end system processes Earth observation time–series data and executes analysis tools based on users’ inputs. The basis for the automated data processing and analysis is operational and automated data access, which is accomplished by introducing a multi–source data processing middleware [48]. This middleware is connected to external data providers to interoperate with requested data, including the provision of standardised OGC web services. Data processing and analysis have been made available via WPS services, implemented on PyWPS. Ready–to–use Python libraries, such as rpy2 for the R statistical language [49], as well as command line executions (with the Python library subprocess [50]), are used to run external software for data analysis and data processing.
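Running external software from within a process handler through the Python subprocess library can be sketched as follows. This is an illustration under stated assumptions rather than the EOM implementation: the helper `run_external` is hypothetical, and the current Python interpreter stands in for the external analysis tool so that the example runs anywhere.

```python
import subprocess
import sys
from typing import Sequence


def run_external(command: Sequence[str], timeout: float = 60.0) -> str:
    """Run an external program and return its standard output as text.

    Raises subprocess.CalledProcessError on a non-zero exit code and
    subprocess.TimeoutExpired if the program runs longer than `timeout`.
    """
    completed = subprocess.run(
        list(command),
        capture_output=True,  # collect stdout and stderr instead of inheriting them
        text=True,            # decode bytes to str
        timeout=timeout,
        check=True,           # turn a failing exit code into an exception
    )
    return completed.stdout


# Stand-in for an external analysis tool (assumption): a one-line Python script.
output = run_external([sys.executable, "-c", "print(sum([1, 2, 3]))"])
print(output.strip())
```

Passing the command as a list of arguments avoids shell interpretation of user-supplied values, which matters when inputs originate from web requests.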
Since the functions of the EOM are available through web services, client applications can easily interact with it. Two example clients were created to show the possibilities of such a service–based infrastructure: a web portal (webEOM) and a mobile application (mobileEOM). Both of these clients use the OGC WPS–compliant web services developed for data integration and analysis; webEOM additionally uses OGC Web Map Services and Web Feature Services to visualize the outputs of analysed areas.
The focus of the web portal (Fig. 2) is to provide an easy–to–use client, while making it possible to extract time–series data and execute further time–series analysis functions. At least two inputs are necessary to extract datasets for a given geometry: (1) the location of the area of interest, which can be created in the map as a point or a polygon, and (2) the name of the dataset the user is interested in (e.g., vegetation index, land surface temperature, climate station data). When using the data integration process, users can specify different parameters for the selected dataset, such as start and end dates, as well as filtering options. In addition to the extracted dataset, a time–series and a decomposition plot are generated automatically. Further time–series analysis functions (e.g., breakpoint detection, trend calculation) can then be executed. The resulting data can either be visualised directly in the web portal or downloaded for further usage. Spatial outputs can be interactively explored on the map, and CSV files are plotted as an interactive chart.
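A request carrying the two mandatory inputs and the optional date range could be encoded as a WPS 1.0.0 Execute request in key-value form, as in the sketch below. The endpoint URL, the process identifier `extract_timeseries`, and the input names (`geometry`, `dataset`, `start_date`, `end_date`) are assumptions made for illustration; the actual EOM process names and parameters are not given here. The geometry is passed as a WKT string, which is also an assumption.

```python
from datetime import date
from typing import Optional
from urllib.parse import parse_qs, urlencode, urlsplit


def build_execute_url(
    endpoint: str,
    identifier: str,
    geometry_wkt: str,
    dataset: str,
    start: Optional[date] = None,
    end: Optional[date] = None,
) -> str:
    """Build a WPS 1.0.0 Execute URL (KVP encoding) for a data extraction."""
    inputs = {"geometry": geometry_wkt, "dataset": dataset}
    if start is not None:
        inputs["start_date"] = start.isoformat()
    if end is not None:
        inputs["end_date"] = end.isoformat()

    # DataInputs is a semicolon-separated list of name=value pairs;
    # urlencode then percent-encodes the whole value.
    data_inputs = ";".join(f"{name}={value}" for name, value in inputs.items())
    query = urlencode(
        {
            "service": "WPS",
            "version": "1.0.0",
            "request": "Execute",
            "identifier": identifier,
            "DataInputs": data_inputs,
        }
    )
    return f"{endpoint}?{query}"


# Hypothetical endpoint, process and dataset names.
url = build_execute_url(
    "https://example.org/wps",
    "extract_timeseries",
    "POINT(11.58 50.93)",
    "ndvi",
    start=date(2010, 1, 1),
    end=date(2015, 12, 31),
)

# Decode the query again to inspect what the server would receive.
decoded = parse_qs(urlsplit(url).query)
print(decoded["DataInputs"][0])
```

This simple encoding assumes that input values contain no semicolons, equals signs or `@` characters, which act as delimiters within DataInputs.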
During fieldwork, users cannot use web–based systems developed for desktop computers. A mobile application is therefore needed to foster the use of spatial time–series tools on mobile devices, which can be more easily used in the field. The mobile application for EOM was developed to provide access to time–series data and derived analyses on mobile devices. Using their current GPS location or a manually set position, users can extract vegetation time–series data and view data plots, as well as trend and breakpoint analysis plots, directly on their mobile devices. An OGC WPS process was developed for the mobile application, providing all necessary functionalities in a single process, available as a web service. This process extracts data from Google Earth Engine and plots the time–series and decomposition figure. In a second step, time–series analyses for breakpoint detection and trend calculation are executed and plotted in a figure. The resulting output is a GeoJSON file containing the values of the analysis tools, as well as links to the generated figures.
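The shape of such an output can be sketched as a single GeoJSON Feature. The schema of the actual mobileEOM response is not described here, so every property name below (`analysis`, `figures`, `trend_slope`, `breakpoints`), every value, and the figure URLs are hypothetical placeholders.

```python
import json
from typing import Any, Dict


def build_result_feature(
    lon: float,
    lat: float,
    analysis_values: Dict[str, Any],
    figure_links: Dict[str, str],
) -> Dict[str, Any]:
    """Assemble a GeoJSON Feature holding analysis values and figure links."""
    return {
        "type": "Feature",
        # GeoJSON positions are ordered longitude, latitude.
        "geometry": {"type": "Point", "coordinates": [lon, lat]},
        "properties": {
            "analysis": analysis_values,  # hypothetical property name
            "figures": figure_links,      # hypothetical property name
        },
    }


# Placeholder values, not real analysis results.
feature = build_result_feature(
    11.58,
    50.93,
    {"trend_slope": 0.002, "breakpoints": ["2012-06-01"]},
    {
        "timeseries": "https://example.org/figures/timeseries.png",
        "trend": "https://example.org/figures/trend.png",
    },
)

geojson_text = json.dumps(feature, indent=2)
print(geojson_text)
```

Returning links rather than embedded images keeps the response small, which suits clients on mobile connections.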