This section describes the experience in the use of SWE standards gained by the Italian National Oceanographic Data Centre (NODC) at the Istituto Nazionale di Oceanografia e di Geofisica Sperimentale, OGS in Trieste – Italy, to share real-time data acquired by two observatories: MAMBO1 and E2M3A. We describe the data flow for managing real-time data, illustrating in detail the elements composing it.
Lastly, we briefly describe the semantics of the O&M and SensorML profiles that were developed.
The workflow (Fig. 2) developed for this purpose is based on six different elements, which are [7]:
- Observatories (MAMBO1 and E2M3A);
- Real-Time Loader (RTLoader);
- Real-Time Database (RTDB);
- Real-Time Web Service (RTWS);
- Real-Time Sensor Observation Service (RTSOS), and
- Real-Time SOS Web Client (RTWebSOS) using version 4.3 of the 52°North implementation (http://52north.org/).
The data acquired by the two observatories are sent in (near) real time to the OGS land server. Then, the RTLoader converts data coming from different kinds of instruments and with different source formats into a homogeneous format and then stores them in an RTDB. Subsequently, the stored data are evaluated by the application of different standard validation protocols and later read by an application loader that periodically queries new measurements. These data are added to the RTSOS server and visualized using RTWebSOS (REST interface). The stored data are also read by a web service (RTWS) whose goal is to produce NetCDF-CF OceanSites files for data distribution.
Observatories
Two marine observatories acquiring meteo-oceanographic data in (near) real time are currently maintained by OGS: the meteo-marine buoy MAMBO1 (Monitoraggio AMBientale Operativo1), located in the Gulf of Trieste, North Adriatic Sea, and the deep observatory named E2M3A (Eastern Mediterranean Multidisciplinary Moored Array), located in the South Adriatic Sea (Fig. 3). The coastal observatory MAMBO1, located at the outer limit of the Miramare Marine Protected Area at a depth of approximately 18 m, is equipped with a meteorological station and oceanographic sensors. Physical and biogeochemical parameters are continuously monitored at one and/or two levels (1 m and 10 m); additional information can be found at the Web page http://nettuno.ogs.trieste.it/ilter/GoTTs/. Data are acquired twice every hour and transferred via GSM modem to the shore-based receiving station. They are archived at the National Oceanographic Data Centre (NODC-OGS).
The deep-sea observatory system E2M3A (http://nettuno.ogs.trieste.it/e2-m3a) is a two-component array, composed of a surface buoy (principal array) and a subsurface mooring (secondary array). The former allows real-time transmission of meteorological parameters as well as surface marine data. The subsurface mooring is equipped with physical sensors at different nominal depths (2, 15, 120, 350, 550, 750, 900, 1000, and 1200 m) and acoustic current profilers located at 320 m and 1200 m. The observatory is deployed to monitor air–sea interactions and the physical and biochemical properties of the water mass, as well as to investigate convective events and the carbon cycle in the open sea [8]. The data are acquired and transmitted through a satellite system that allows real-time data transfer from the platform to the land station.
For both stations, the original data are acquired in different formats (binary, ASCII), which are set by the manufacturers of the sensors (SeaBird, Sunburst, Young, Pro-Oceanus Systems, etc.), and then stored locally on a dedicated server at OGS. Every 30 min, a procedure checks whether new data have been acquired; data in binary format are then converted using specific procedures driven by RTLoader.
Real-time loader
The process of converting, processing, and loading data from the observatories is typically asynchronous and prone to disruption because of the long transmission and processing chain. To simplify the development of the Real-Time Loader (RTLoader), the Java framework ‘Apache Camel’ was chosen. It offers all the components needed to carry out this kind of task with sufficient resilience, such as transactional routes and persistent queues.
The queues make the process asynchronous and therefore more resilient. The component is tuned to handle a large number of small input files; with large files, memory consumption grows. For this reason, particular attention was paid to the hardware configuration.
The main steps of the process are:
- Continuous check for updates of input data files, sent by observatories, which can also reside on remote systems;
- Parsing of input files in several formats, defined by different sensor manufacturers (SeaBird, Sunburst, Young, Pro-Oceanus Systems, etc.), and their conversion into lists of Java objects, one object for every input row, containing measurements of several parameters (Text2Java);
- Conversion of all these kinds of Java objects into a unique O&M format, regardless of the input formats, containing data and metadata (Java2OM). Metadata of instruments reside in the relational database and are obtained from it. The result of this phase is a list of O&M items;
- Conversion of O&M items into JPA entities (Java Persistence API, used to map Java objects to database tables) and their entry into the relational database using JPA Camel components (OM2Db);
- Use of Camel to orchestrate all these components using several ‘routes’ (the main concept in a routing and mediation engine) and decoupling the different phases with queues. This provides the process asynchronicity needed to obtain a good level of resilience with limited resources (e.g., relational database) and a reduction of the memory footprint, useful for achieving good scalability.
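The staged design listed above can be illustrated with a minimal sketch. It is written in Python rather than Java for brevity and does not use Apache Camel: the standard-library `queue.Queue` stands in for Camel's queues (it is in-memory only, not persistent), and the function names and the comma-separated input layout are hypothetical, chosen only for this example.

```python
import queue
import threading

SENTINEL = None  # marks the end of the stream in each queue


def text_to_record(line):
    """Text2Java analogue: parse one row (hypothetical 'time,param,value' layout)."""
    time, param, value = line.strip().split(",")
    return {"time": time, "param": param, "value": float(value)}


def record_to_om(record, sensor_id):
    """Java2OM analogue: wrap a parsed row in a format-independent observation."""
    return {
        "procedure": sensor_id,
        "observedProperty": record["param"],
        "phenomenonTime": record["time"],
        "result": record["value"],
    }


def stage(func, inbox, outbox):
    """Apply func to every item taken from inbox and forward the result to outbox."""
    while True:
        item = inbox.get()
        if item is SENTINEL:
            outbox.put(SENTINEL)
            break
        outbox.put(func(item))


def run_pipeline(lines, sensor_id):
    """Push input rows through the two conversion stages, decoupled by queues."""
    raw, parsed, observations = queue.Queue(), queue.Queue(), queue.Queue()
    workers = [
        threading.Thread(target=stage, args=(text_to_record, raw, parsed)),
        threading.Thread(
            target=stage,
            args=(lambda r: record_to_om(r, sensor_id), parsed, observations),
        ),
    ]
    for worker in workers:
        worker.start()
    for line in lines:
        raw.put(line)
    raw.put(SENTINEL)

    stored = []  # stands in for the relational database (OM2Db analogue)
    while True:
        obs = observations.get()
        if obs is SENTINEL:
            break
        stored.append(obs)
    for worker in workers:
        worker.join()
    return stored
```

Because each stage only talks to its neighbours through a queue, a slow stage (for instance the database write) delays the others without blocking the parsing of new files, which is the decoupling the text attributes to the Camel routes.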
Real-time database
The Real-Time Database (RTDB) is a PostgreSQL relational database used to store observations [9]. To manage information from the stations, more than 30 tables are used, distributed across three distinct schemata: a ‘public’ schema, used to store real-time data and metadata; an ‘oceansites’ schema, which includes the information needed to produce NetCDF OceanSites files; and an ‘sos’ schema, developed specifically to integrate the sensor information needed to load measurements to the SOS server.
The structure of the database has three branches:
- a section dedicated to describing the instruments, their characteristics, and their position (deployment);
- a section used to store data and related metadata (measurements);
- a section reserved to store the vocabularies used to standardize data and metadata and the data used by quality control algorithms (common vocabularies and quality control).
Once the data are stored in the database, a sequence of validation procedures (data quality control procedures) is applied to qualify the data values [10]. The procedure has been developed following the European protocols [11, 12] and is gradually being tuned to the regional statistics [13].
The quality control procedures include the following series of automatic checks [14]: checks for missing data and data format completeness, check of the date/time and of the measuring position, check for duplicate vertical profiles or measures, check for spikes by testing the data for large differences between adjacent values, and check for invalid values by comparison with minimum and maximum values set for each parameter archived.
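Three of the checks listed above (invalid values, spikes, and duplicates) can be sketched as follows. This is a minimal illustration, not the NODC procedure: the function names, the flag values, and the thresholds are assumptions made for the example, and the spike test shown is a simple comparison with both neighbouring values.

```python
GOOD, BAD = 1, 4  # flag values assumed only for this sketch


def range_check(values, minimum, maximum):
    """Flag values outside the [minimum, maximum] interval set for the parameter."""
    return [GOOD if minimum <= v <= maximum else BAD for v in values]


def spike_check(values, threshold):
    """Flag a value that departs from both neighbours by more than threshold,
    in the same direction (an isolated peak or trough)."""
    flags = [GOOD] * len(values)
    for i in range(1, len(values) - 1):
        d_prev = values[i] - values[i - 1]
        d_next = values[i] - values[i + 1]
        if abs(d_prev) > threshold and abs(d_next) > threshold and d_prev * d_next > 0:
            flags[i] = BAD
    return flags


def duplicate_check(timestamps):
    """Flag every repeated timestamp after its first occurrence."""
    seen, flags = set(), []
    for t in timestamps:
        flags.append(BAD if t in seen else GOOD)
        seen.add(t)
    return flags
```

In this sketch the first and last values of a series are never flagged as spikes, because they have only one neighbour; an operational procedure would need an explicit rule for the series boundaries.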
Real-time web service
The Real-Time Web Service (RTWS) is a RESTful Web Service [9] that accepts simple HTTP requests to extract data from the database. Requests are parameterized by device (site), featuretype (time series, TS, or profile, PR), datatype (mooring, MO, or CTD profile, CT), and period (DAY, MONTH, or YEAR), e.g. {ws application}/search/site/E2M3A/featuretype/TS/datatype/MO/period/DAY.
If no further temporal parameters (start date and end date) are given, the last day (or month or year, depending on the period) before the request is returned. The service is written in Java using open source libraries such as Spring and Jersey.
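A request of this form can be assembled programmatically. The sketch below is only illustrative: the base URL passed in is a placeholder for `{ws application}`, the function name is hypothetical, and the accepted codes are those listed in the text.

```python
FEATURE_TYPES = {"TS", "PR"}        # time series, profile
DATA_TYPES = {"MO", "CT"}           # mooring, CTD profile
PERIODS = {"DAY", "MONTH", "YEAR"}


def build_rtws_url(base, site, featuretype, datatype, period):
    """Return an RTWS search URL following the path pattern quoted in the text."""
    if featuretype not in FEATURE_TYPES:
        raise ValueError(f"unknown featuretype: {featuretype}")
    if datatype not in DATA_TYPES:
        raise ValueError(f"unknown datatype: {datatype}")
    if period not in PERIODS:
        raise ValueError(f"unknown period: {period}")
    return (
        f"{base.rstrip('/')}/search/site/{site}"
        f"/featuretype/{featuretype}/datatype/{datatype}/period/{period}"
    )
```

For example, `build_rtws_url("http://example.org/ws", "E2M3A", "TS", "MO", "DAY")` yields the path shown in the text appended to the placeholder base URL.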
Real-time sensor observation service
The Real-Time Sensor Observation Service (RTSOS) is a standardized Web service implemented using version 4.3 of the 52°North (http://52north.org) Sensor Observation Service. Real-time data are loaded by an application through the standard InsertObservation request (RESTful interface), using O&M version 2.0. Currently, sensor information is loaded manually via the SOS Client (http://nodc.ogs.trieste.it/sos/client) with a standard InsertSensor request, using SensorML version 2.0.
The description of each sensor can be retrieved in OGC SensorML through the standard DescribeSensor request. The related data are stored in a dedicated PostgreSQL/PostGIS database and can be extracted using GetObservation requests.
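The two read operations can be illustrated by building SOS 2.0 key-value-pair (KVP) request URLs. This is a sketch under assumptions: the endpoint, the procedure identifier, and the helper names are placeholders, and whether a given deployment exposes the KVP binding at a particular path depends on its configuration.

```python
from urllib.parse import urlencode

# Format identifier for SensorML 2.0 descriptions
SENSORML_20 = "http://www.opengis.net/sensorml/2.0"


def describe_sensor_url(endpoint, procedure):
    """Build a DescribeSensor request for one procedure (sensor)."""
    params = {
        "service": "SOS",
        "version": "2.0.0",
        "request": "DescribeSensor",
        "procedure": procedure,
        "procedureDescriptionFormat": SENSORML_20,
    }
    return f"{endpoint}?{urlencode(params)}"


def get_observation_url(endpoint, procedure, observed_property, start, end):
    """Build a GetObservation request restricted to a time interval."""
    params = {
        "service": "SOS",
        "version": "2.0.0",
        "request": "GetObservation",
        "procedure": procedure,
        "observedProperty": observed_property,
        # ISO 8601 interval applied to the phenomenon time
        "temporalFilter": f"om:phenomenonTime,{start}/{end}",
    }
    return f"{endpoint}?{urlencode(params)}"
```

The observed property would typically be one of the vocabulary URIs discussed in the section on semantic interoperability, for example a P01 term.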
Real-time SOS web client
In this work, we use the Real Time Sensor Observation Service Web client (RTWebSOS - JavaScript Sensor Web Client, version 1.0.0, https://github.com/52North/js-sensorweb-client) developed by 52°North [15], which is an application with a user-friendly interface (Fig. 4) that allows plotting and downloading data. It hides SWE protocols and gives the opportunity for anyone to interact with SWE technology. Specifically, this Web client is usable with common browsers, and provides a direct link to data, using functions such as searching, plotting, and downloading.
From the main page (http://nodc.ogs.trieste.it/sosWeb/), it is possible to search sensors, identify phenomena, and define time intervals. The user can overlay multiple time series for visual comparison and can download measurements.
Semantic interoperability
Interoperability of sensors is guaranteed by OGC Sensor Web Enablement standards, such as Observations and Measurements (O&M) and Sensor Model Language (SensorML). They provide a standard way to exchange information, whereas semantic interoperability is assured by the adoption of SeaDataNet Common Vocabularies (http://www.seadatanet.org/Standards-Software/Common-Vocabularies), defined as follows: “Common vocabularies consist of lists of standardized terms that cover a broad spectrum of disciplines of relevance to the oceanographic and wider community. Using standardized sets of terms solves the problem of ambiguities associated with data markup and also enables records to be interpreted by computers. This opens up data sets to a whole world of possibilities for computer aided manipulation, distribution and long-term reuse” [16]. These ontologies are used in several European and Italian projects (e.g. SeaDataNet, EmodNet, ODIP and Ritmare).
To adopt these vocabularies, new O&M and SensorML profiles are being developed for describing and encoding sensor observations and characteristics.
Standard definitions are included in the XML files to define sensor categories (L05), sensor devices (L22), observable properties (P01, P02 and P03), and their storage units (P06):
<!-- System Identifiers -->
<sml:IdentifierList>
  <sml:identifier>
    <sml:Term definition="http://vocab.nerc.ac.uk/collection/L22/current/TOOL0018">
      <sml:label>Long_Name</sml:label>
      <sml:value>Sea-Bird SBE 37-SMP MicroCAT C-T Sensor</sml:value>
    </sml:Term>
  </sml:identifier>
</sml:IdentifierList>
<!-- System Classifiers -->
<sml:classifier>
  <sml:Term definition="http://vocab.nerc.ac.uk/collection/P02/current/DOXY">
    <sml:label>Intended Application4</sml:label>
    <sml:value>Dissolved oxygen parameters in the water column</sml:value>
  </sml:Term>
</sml:classifier>
<sml:classifier>
  <sml:Term definition="http://vocab.nerc.ac.uk/collection/L05/current/130">
    <sml:label>Sensor Type</sml:label>
    <sml:value>CTD</sml:value>
  </sml:Term>
</sml:classifier>
Specifically, the following vocabularies are adopted: Parameter Usage Vocabulary (P01), SeaDataNet Parameter Discovery Vocabulary (P02), SeaDataNet Agreed Parameter Groups (P03), British Oceanographic Data Centre (BODC) data storage units (P06), SeaDataNet device categories (L05), SeaDataNet keyword types (L19), SeaVoX Device Catalogue (L22), and SeaDataNet metadata entities (L23).
To precisely identify the information on observations and characteristics of sensors, the URI of the terms is used (e.g. http://vocab.nerc.ac.uk/collection/P01/current/PHXXZZXX):
<!-- System Output -->
<sml:OutputList>
  <sml:output name="pH">
    <swe:Quantity definition="http://vocab.nerc.ac.uk/collection/P01/current/PHXXZZXX">
      <swe:description>pH per unit volume of the water body</swe:description>
      <swe:uom xlink:href="http://vocab.nerc.ac.uk/collection/P06/current/UUPH" code="pH_units"/>
    </swe:Quantity>
  </sml:output>
</sml:OutputList>
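Because the vocabulary terms are carried as URIs, a client can recover the collection and term codes mechanically. The sketch below parses an output fragment like the one above with Python's standard library. The namespace URIs are those of SensorML 2.0, SWE Common 2.0, and XLink; the wrapping `root` element and the function names are assumptions added only so that the fragment parses on its own.

```python
import xml.etree.ElementTree as ET

NS = {
    "sml": "http://www.opengis.net/sensorml/2.0",
    "swe": "http://www.opengis.net/swe/2.0",
    "xlink": "http://www.w3.org/1999/xlink",
}

# The fragment from the text, wrapped in a hypothetical root element
# that declares the namespace prefixes.
FRAGMENT = """
<root xmlns:sml="http://www.opengis.net/sensorml/2.0"
      xmlns:swe="http://www.opengis.net/swe/2.0"
      xmlns:xlink="http://www.w3.org/1999/xlink">
  <sml:OutputList>
    <sml:output name="pH">
      <swe:Quantity definition="http://vocab.nerc.ac.uk/collection/P01/current/PHXXZZXX">
        <swe:description>pH per unit volume of the water body</swe:description>
        <swe:uom xlink:href="http://vocab.nerc.ac.uk/collection/P06/current/UUPH" code="pH_units"/>
      </swe:Quantity>
    </sml:output>
  </sml:OutputList>
</root>
"""


def split_vocab_uri(uri):
    """Split .../collection/<COLLECTION>/current/<TERM> into (collection, term)."""
    parts = uri.rstrip("/").split("/")
    return parts[-3], parts[-1]


def list_outputs(xml_text):
    """Return name, observable property, and unit for each sml:output element."""
    root = ET.fromstring(xml_text)
    results = []
    for output in root.iterfind(".//sml:output", NS):
        quantity = output.find("swe:Quantity", NS)
        uom = quantity.find("swe:uom", NS)
        href = uom.get("{http://www.w3.org/1999/xlink}href")
        results.append({
            "name": output.get("name"),
            "property": split_vocab_uri(quantity.get("definition")),
            "unit": split_vocab_uri(href),
        })
    return results
```

Calling `list_outputs(FRAGMENT)` returns one entry for the pH output, with the property resolved to collection P01 and the unit to collection P06.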