Table 3 Methods offered by the application programming interface of the OSHDB framework

From: OSHDB: a framework for spatio-temporal analysis of OpenStreetMap history data

Method

Description

Data Filters

areaOfInterest

Sets the geographical area of interest of the query, i.e. a bounding box or a bounding (multi)polygon.

timestamps

Sets the temporal limits of the query. Depending on the analysis view, this represents either a list of independent timestamps (snapshots) or a list of time intervals.

osmType

Filters data by their OSM entity type, i.e. a node, way or relation.

osmTag

Filters data by their OSM tags. The supported filter forms are: key=* (presence of a tag with the given key); key=value (presence of a specific key-value combination); key=[values] (presence of a tag whose value is in the given list of values); key regex (presence of a tag whose value matches the given regular expression); and [key=value] (presence of at least one tag from the given collection of key-value pairs).
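
The five filter forms can be read as predicates over an entity's tag map. The following is a minimal plain-Java sketch of that reading, not the OSHDB API: the class `TagFilterSketch` and all of its method names are hypothetical, tags are modelled as a simple `Map<String, String>`, and the regular expression is assumed to have to match the whole value.

```java
import java.util.Collection;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.Predicate;
import java.util.regex.Pattern;

// Hypothetical illustration only; none of these names come from OSHDB.
class TagFilterSketch {
    // key=* : a tag with the given key is present
    static Predicate<Map<String, String>> hasKey(String key) {
        return tags -> tags.containsKey(key);
    }

    // key=value : one specific key-value combination is present
    static Predicate<Map<String, String>> hasTag(String key, String value) {
        return tags -> value.equals(tags.get(key));
    }

    // key=[values] : the tag's value is one of the given values
    static Predicate<Map<String, String>> hasValueIn(String key, Set<String> values) {
        return tags -> tags.containsKey(key) && values.contains(tags.get(key));
    }

    // key regex : the tag's value matches the expression (full match assumed)
    static Predicate<Map<String, String>> valueMatches(String key, Pattern pattern) {
        return tags -> tags.containsKey(key) && pattern.matcher(tags.get(key)).matches();
    }

    // [key=value] : at least one of the given key-value pairs is present
    static Predicate<Map<String, String>> hasAnyTag(Collection<Map.Entry<String, String>> candidates) {
        return tags -> candidates.stream()
                .anyMatch(e -> e.getValue().equals(tags.get(e.getKey())));
    }

    public static void main(String[] args) {
        Map<String, String> tags = Map.of("highway", "residential", "maxspeed", "30");
        System.out.println(hasKey("highway").test(tags));
        System.out.println(hasTag("highway", "primary").test(tags));
        System.out.println(hasValueIn("highway", Set.of("residential", "service")).test(tags));
        System.out.println(valueMatches("maxspeed", Pattern.compile("\\d+")).test(tags));
        System.out.println(hasAnyTag(List.of(Map.entry("building", "yes"),
                                             Map.entry("maxspeed", "30"))).test(tags));
    }
}
```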

Aggregation Methods

aggregateBy

Defines a custom aggregation method. When applied to a MapReducer, it transforms it into a new MapAggregator with the same settings as the original MapReducer, but which processes the input data in chunks defined by the indices returned by the given aggregate-by function. When applied to a MapAggregator, the already existing aggregation indices are refined further by the new aggregate-by function (where the resulting index set is defined as the cross product of the existing and the new set of indices).
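
The idea of grouping by an index function, and of refining an existing index by a second one, can be sketched with plain Java collectors. This is an analogy rather than OSHDB code: `AggregateBySketch`, `countBy` and `countByBoth` are hypothetical names, the input rows are made-up `{entity type, year}` pairs, and unlike a full cross product the sketch only reports index combinations that actually occur in the data.

```java
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

// Hypothetical illustration only; not the OSHDB MapAggregator.
class AggregateBySketch {
    // One aggregation index: count the items per index value.
    static <T, K> Map<K, Long> countBy(List<T> items, Function<T, K> indexer) {
        return items.stream()
                .collect(Collectors.groupingBy(indexer, Collectors.counting()));
    }

    // Refined index: each item is assigned the pair (first index, second index).
    static <T, A, B> Map<List<Object>, Long> countByBoth(
            List<T> items, Function<T, A> first, Function<T, B> second) {
        return items.stream()
                .collect(Collectors.groupingBy(
                        item -> List.<Object>of(first.apply(item), second.apply(item)),
                        Collectors.counting()));
    }

    public static void main(String[] args) {
        // Made-up rows of {entity type, year}.
        List<String[]> rows = List.of(
                new String[] {"way", "2018"},
                new String[] {"way", "2019"},
                new String[] {"node", "2019"},
                new String[] {"way", "2019"});
        System.out.println(countBy(rows, r -> r[0]));
        System.out.println(countByBoth(rows, r -> r[0], r -> r[1]));
    }
}
```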

aggregateByTimestamp

Aggregates results by a temporal index. The method takes the query's overall timestamps parameter into account and, if necessary, assigns timestamps to the corresponding time intervals defined for the whole query.
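
Assigning a timestamp to one of the query's time intervals amounts to a floor lookup in the sorted list of query timestamps. The sketch below shows this in plain Java under stated assumptions: `TimestampIndexSketch` and `intervalStart` are hypothetical names, intervals are taken to be half-open (start inclusive, end exclusive), and each interval is identified by its start date. How OSHDB itself handles interval boundaries is not specified here.

```java
import java.time.LocalDate;
import java.util.List;
import java.util.TreeSet;

// Hypothetical illustration only; boundary handling is an assumption.
class TimestampIndexSketch {
    // Returns the start of the interval [t_i, t_(i+1)) that contains t,
    // or null if t lies outside the range covered by the query timestamps.
    static LocalDate intervalStart(TreeSet<LocalDate> queryTimestamps, LocalDate t) {
        if (queryTimestamps.isEmpty()
                || t.isBefore(queryTimestamps.first())
                || !t.isBefore(queryTimestamps.last())) {
            return null;
        }
        // Greatest query timestamp that is less than or equal to t.
        return queryTimestamps.floor(t);
    }

    public static void main(String[] args) {
        TreeSet<LocalDate> timestamps = new TreeSet<>(List.of(
                LocalDate.parse("2017-01-01"),
                LocalDate.parse("2018-01-01"),
                LocalDate.parse("2019-01-01")));
        System.out.println(intervalStart(timestamps, LocalDate.parse("2018-06-15")));
    }
}
```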

aggregateByGeometry

Aggregates results by their geometries (spatial position and extent). Accepts a set of arbitrary (multi)polygons which define the aggregation index. If the geometry of an OSM entity extends over multiple polygons, this method splits and clips it accordingly.

MapReduce Methods

map

Performs a data transformation step that calculates one output object for each input object of the processing stream.

flatMap

Performs a data transformation step that calculates arbitrarily many output objects for each input object of the processing stream and flattens the resulting output objects into the processing stream.

filter

Filters the processing stream by the given predicate.

reduce

Performs a generic reduce operation on the processing stream, which ultimately generates a single result object for the entire processing stream.
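
The four MapReduce steps behave much like their namesakes in `java.util.stream`. The following sketch chains them on made-up data using only the standard Java stream API, not OSHDB: each input "way" is a hypothetical list of segment lengths in metres, and the class and method names are invented for illustration.

```java
import java.util.List;

// Hypothetical illustration using java.util.stream, not the OSHDB MapReducer.
class MapReduceSketch {
    static int totalCentimetres(List<List<Integer>> ways) {
        return ways.stream()
                .filter(way -> way.size() >= 2)   // filter: keep ways with at least two segments
                .flatMap(List::stream)            // flatMap: many segments per way, flattened
                .map(metres -> metres * 100)      // map: exactly one output per input
                .reduce(0, Integer::sum);         // reduce: a single result for the whole stream
    }

    public static void main(String[] args) {
        List<List<Integer>> ways = List.of(
                List.of(120, 5, 30),
                List.of(8),
                List.of(50, 50));
        System.out.println(totalCentimetres(ways)); // prints 25500
    }
}
```

The single-segment way is dropped by the filter, so the sum covers 120 + 5 + 30 + 50 + 50 metres.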

Specialized Reducers

sum

Calculates the sum of all values.

count

Returns the number of entries in the processing stream.

uniq

Returns the set of unique values in the processing stream.

countUniq

Returns the number of unique entries in the processing stream.

average

Calculates the average of all values.

weightedAverage

Calculates a weighted average over all values.
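
A weighted average divides the sum of weight-times-value products by the sum of the weights. The sketch below spells this out in plain Java; `WeightedAverageSketch` is a hypothetical name, the parallel-array input is an assumption made for brevity, and the way OSHDB receives values and weights is not shown here.

```java
// Hypothetical illustration only; input representation is an assumption.
class WeightedAverageSketch {
    // Returns sum(w_i * v_i) / sum(w_i); the result is NaN if all weights are zero.
    static double weightedAverage(double[] values, double[] weights) {
        if (values.length != weights.length) {
            throw new IllegalArgumentException("values and weights must have the same length");
        }
        double weightedSum = 0.0;
        double weightSum = 0.0;
        for (int i = 0; i < values.length; i++) {
            weightedSum += weights[i] * values[i];
            weightSum += weights[i];
        }
        return weightedSum / weightSum;
    }

    public static void main(String[] args) {
        // (1 * 2 + 3 * 4) / (1 + 3) = 14 / 4
        System.out.println(weightedAverage(new double[] {2, 4}, new double[] {1, 3})); // prints 3.5
    }
}
```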

estimatedMedian

Returns an estimation of the median of all values in the processing stream, using the T-Digest method [50].

estimatedQuantile(s)

Returns an estimation of the quantile(s) of the distribution of all values in the processing stream, using the T-Digest method [50].

collect

Returns a list of all objects in the processing stream. See the stream method below for a less memory-intensive variant.

Other

stream

Returns all values as a (Java) stream. Equivalent to the collect reducer, but does not need to buffer the whole dataset in memory before returning.

groupByEntity

Special map function that groups consecutive entries of the processing stream together which belong to the same original OSM entity.
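
Grouping consecutive entries by their originating entity can be sketched as a single pass that starts a new group whenever the entity identifier changes. This is a plain-Java illustration with hypothetical names (`GroupConsecutiveSketch`, `groupConsecutive`) and made-up string entries of the form `id@year`; it does not reproduce OSHDB's implementation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Hypothetical illustration only; not the OSHDB groupByEntity implementation.
class GroupConsecutiveSketch {
    // Groups neighbouring items that share the same entity id.
    // Items with the same id that are not adjacent end up in separate groups.
    static <T, K> List<List<T>> groupConsecutive(List<T> items, Function<T, K> entityId) {
        List<List<T>> groups = new ArrayList<>();
        K previous = null;
        for (T item : items) {
            K id = entityId.apply(item);
            if (groups.isEmpty() || !id.equals(previous)) {
                groups.add(new ArrayList<>()); // the id changed: start a new group
            }
            groups.get(groups.size() - 1).add(item);
            previous = id;
        }
        return groups;
    }

    public static void main(String[] args) {
        List<String> entries = List.of("1@2018", "1@2019", "2@2018", "1@2020");
        System.out.println(groupConsecutive(entries, s -> s.substring(0, s.indexOf('@'))));
        // prints [[1@2018, 1@2019], [2@2018], [1@2020]]
    }
}
```

Because only consecutive entries are merged, the final `1@2020` entry forms its own group even though entity 1 appeared earlier.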