Geometry and Series Blocks
Geometry-type blocks contain sets of geometries, optionally with 'start' and 'end' fields and other properties. Internally, geometry data is stored in GeoPandas GeoDataFrames.
API Specification
Module containing the base geometry block classes.

class dask_geomodeling.geometry.base.GeometryBlock(*args)
The base block for geometries.
All geometry blocks must be derived from this base class and must implement the following attributes:
 columns: a set of column names to expect in the dataframe
A geometry request contains the following fields:
 mode: one of {"intersects", "centroid", "extent"}
 geometry: limit returned objects to objects that intersect with this shapely geometry object
 projection: projection to return the geometries in, as a WKT string
 limit: the maximum number of geometries
 min_size: geometries with a bbox that is smaller than this on all sides are left out
 start: start date as UTC datetime
 stop: stop date as UTC datetime
 filters: dict of Django ORM-like filters on properties (e.g. id=598)
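The min_size rule above can be illustrated with a small stand-alone sketch. It uses plain (minx, miny, maxx, maxy) tuples rather than library objects, and the helper name passes_min_size is hypothetical; it only mirrors the wording "smaller than this on all sides" and is not taken from the library's implementation.

```python
def passes_min_size(bbox, min_size):
    """Return True unless the bbox is smaller than min_size on all sides."""
    minx, miny, maxx, maxy = bbox
    width = maxx - minx
    height = maxy - miny
    # A geometry is left out only when BOTH sides are below min_size.
    return not (width < min_size and height < min_size)


# Hypothetical bounding boxes: tiny, long-and-thin, and large.
bboxes = [(0, 0, 1, 1), (0, 0, 10, 1), (0, 0, 10, 10)]
kept = [bbox for bbox in bboxes if passes_min_size(bbox, min_size=5)]
print(kept)
```

Under this reading, a long and thin geometry is kept because one of its sides is still at least min_size.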
The data response contains the following:
 if mode was 'intersects': a DataFrame of features with properties
 if mode was 'extent': the bbox that contains all features
To be able to perform operations on properties, there is a helper type called SeriesBlock. This is the block equivalent of a pandas.Series. You can get a SeriesBlock from a GeometryBlock, perform operations on it, and set it back into a GeometryBlock.
to_file(*args, **kwargs)
Utility function to export data from this block to a file on disk.
You need to specify the target file path as well as the extent geometry you want to save.
Parameters:
 url (str) – The target file path. The extension determines the format. For supported formats, consult GeometryFileSink.supported_extensions.
 fields (dict) – a mapping that relates column names to output file field names, {<output file field name>: <column name>, ...}.
 tile_size (int) – Optionally use this for large exports to stay within memory constraints. The export is split in tiles of given size (units are determined by the projection). Finally the tiles are merged.
 geometry (shapely Geometry) – Limit exported objects to objects whose centroid intersects with this geometry.
 projection (str) – The projection as a WKT string or EPSG code. Sets the projection of the geometry argument, the target projection of the data, and the tiling projection.
 start (datetime) – start date as UTC datetime
 stop (datetime) – stop date as UTC datetime
 **request – see GeometryBlock request specification
Relevant settings can be adapted as follows:
>>> from dask import config
>>> config.set({"geomodeling.root": '/my/output/data/path'})
>>> config.set({"temporary_directory": '/my/alternative/tmp/dir'})

class dask_geomodeling.geometry.base.SeriesBlock(*args)
A helper block for GeometryBlocks, representing one single field.

class dask_geomodeling.geometry.base.GetSeriesBlock(source, name)
Get a column from a GeometryBlock.
Parameters:
 source (GeometryBlock) – the GeometryBlock to take the column from
 name (string) – name of the column to get
Returns: SeriesBlock containing the property column

class dask_geomodeling.geometry.base.SetSeriesBlock(source, column, value, *args)
Set one or multiple columns (SeriesBlocks) in a GeometryBlock.
Parameters:
 source (GeometryBlock) – source to add the extra columns to
 column (string) – name of the column to be set
 value (SeriesBlock, scalar) – series or constant value to set
 args – string, SeriesBlock, …, repeated multiple times
Returns: the source GeometryBlock with additional property columns
Example:
>>> SetSeriesBlock(view, 'column_1', series_1, 'column_2', series_2)
dask_geomodeling.geometry.aggregate
Module containing raster blocks that aggregate rasters.

class dask_geomodeling.geometry.aggregate.AggregateRaster(source, raster, statistic='sum', projection=None, pixel_size=None, max_pixels=None, column_name='agg', auto_pixel_size=False, *args)
Compute zonal statistics and add them to the geometry properties.
Parameters:
 source (GeometryBlock) – the source of geometry data
 raster (RasterBlock) – the source of raster data
 statistic (string) – the type of statistic to perform. Can be 'sum', 'count', 'min', 'max', 'mean', 'median', 'p<percentile>'.
 projection (string or None) – the projection to perform the aggregation in
 pixel_size (float or None) – the pixel size to perform aggregation in
 max_pixels (int or None) – the maximum number of pixels to use for aggregation. Defaults to the geomodeling.rasterlimit setting.
 column_name (string) – the name of the column to output the results
 auto_pixel_size (boolean) – determines whether the pixel_size is adjusted when a raster is too large. Default False.
Returns: GeometryBlock with aggregation results in column_name
The currently implemented statistics are sum, count, min, max, mean, median, and percentile. If projection or pixel_size are not given, these are taken from the provided RasterBlock.
The count statistic calculates the number of active cells in the raster. A percentile statistic can be selected using a text value starting with 'p' followed by something that can be parsed as a float value, for example 'p33.3'.
Only geometries that intersect the requested bbox are aggregated. Aggregation is done in a specified projection and with a specified pixel size.
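The percentile naming rule can be sketched as a small parser. This is an illustration of the rule as described here, not the library's own code: the function name parse_statistic is hypothetical, and the restriction to the range 0 to 100 is an assumption that the text does not state.

```python
SIMPLE_STATISTICS = {"sum", "count", "min", "max", "mean", "median"}


def parse_statistic(statistic):
    """Split a statistic name into (kind, percentile or None)."""
    if statistic in SIMPLE_STATISTICS:
        return statistic, None
    if statistic.startswith("p"):
        try:
            # Everything after the leading 'p' must parse as a float.
            percentile = float(statistic[1:])
        except ValueError:
            raise ValueError(f"cannot parse percentile from {statistic!r}") from None
        # Assumption: percentiles outside 0..100 are rejected.
        if not 0 <= percentile <= 100:
            raise ValueError(f"percentile out of range: {percentile}")
        return "percentile", percentile
    raise ValueError(f"unknown statistic {statistic!r}")


print(parse_statistic("p33.3"))
print(parse_statistic("mean"))
```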
Should the combination of the requested pixel_size and the extent of the source geometry cause the requested raster size to exceed max_pixels, the pixel_size is adjusted automatically if auto_pixel_size = True, else a RuntimeError is raised. The global rasterlimit setting can be adapted as follows:
>>> from dask import config
>>> config.set({"geomodeling.rasterlimit": 10 ** 9})

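The interplay between pixel_size, max_pixels and auto_pixel_size in AggregateRaster can be sketched as follows. The text does not say how the pixel size is adjusted, so the doubling strategy below is purely an assumption for illustration, and resolve_pixel_size is a hypothetical helper, not a library function.

```python
import math


def resolve_pixel_size(width, height, pixel_size, max_pixels, auto_pixel_size=False):
    """Return a pixel size whose raster fits within max_pixels."""

    def pixel_count(size):
        return math.ceil(width / size) * math.ceil(height / size)

    if pixel_count(pixel_size) <= max_pixels:
        return pixel_size
    if not auto_pixel_size:
        raise RuntimeError("requested raster size exceeds max_pixels")
    # Assumed strategy: coarsen the pixel size until the raster fits.
    while pixel_count(pixel_size) > max_pixels:
        pixel_size *= 2
    return pixel_size


# A 1000 x 1000 extent at pixel size 1 needs 10**6 pixels.
print(resolve_pixel_size(1000.0, 1000.0, 1.0, max_pixels=250000, auto_pixel_size=True))
```

With auto_pixel_size left at False, the same call raises a RuntimeError, matching the behaviour described above.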
class dask_geomodeling.geometry.aggregate.AggregateRasterAboveThreshold(source, raster, statistic='sum', projection=None, pixel_size=None, max_pixels=None, column_name='agg', auto_pixel_size=False, threshold_name=None)
Aggregate raster values ignoring values below some threshold. The thresholds are supplied per geometry.
Parameters:
 source (GeometryBlock) – the source of geometry data
 raster (RasterBlock) – the source of raster data
 statistic (string) – the type of statistic to perform. Can be 'sum', 'count', 'min', 'max', 'mean', 'median', 'p<percentile>'.
 projection (string) – the projection to perform the aggregation in
 pixel_size (float) – the pixel size to perform aggregation in
 max_pixels (int) – the maximum number of pixels to use for aggregation
 column_name (string) – the name of the column to output the results
 auto_pixel_size (boolean) – determines whether the pixel_size is adjusted when a raster is too large. Default False.
 threshold_name (string) – the name of the column with the thresholds
Returns: GeometryBlock with aggregation results in column_name
dask_geomodeling.geometry.constructive
Module containing geometry block constructive operations

class dask_geomodeling.geometry.constructive.Buffer(source, distance, projection, resolution=16)
Buffer geometries.
Parameters:
 source (GeometryBlock) – the geometry source
 distance (float) – a distance measure in the given projection.
 projection (string) – an EPSG or WKT string, e.g. EPSG:28992.
 resolution (int) – the number of segments per quarter circle. Default is 16.

distance
Buffer distance.
The unit (e.g. m, °) is determined by the projection.

projection
Projection used for buffering.

resolution
Buffer resolution.

class dask_geomodeling.geometry.constructive.Simplify(source, tolerance=None, preserve_topology=True)
Simplify geometries up to given tolerance.
Parameters:
 source (GeometryBlock) – the geometry source
 tolerance (float) – the simplification tolerance. If no tolerance is given, the min_size request param is used.
 preserve_topology (boolean) – whether to preserve topology. Default True.
dask_geomodeling.geometry.field_operations
Module containing geometry block operations that act on non-geometry fields

class dask_geomodeling.geometry.field_operations.Classify(source, bins, labels, right=True)
Classify a continuous-valued property into binned categories.
Parameters:
 source (SeriesBlock) – source data to classify
 bins (list) – a 1-dimensional and monotonic list of bins. How values outside of the bins are classified depends on the length of the labels. If len(labels) = len(bins) - 1, then values outside of the bins are classified to NaN. If len(labels) = len(bins) + 1, then values outside of the bins are classified to the first and last elements of the labels list.
 labels (list) – the labels for the returned bins
 right (boolean) – whether the intervals include the right or the left bin edge
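The relation between bins, labels and right is easier to see with a scalar example. The sketch below is a pure-Python illustration of the rule described above, not the library's implementation: classify is a hypothetical helper, None stands in for NaN, and the treatment of values lying exactly on the outermost bin edges follows the convention of numpy.digitize, which is an assumption.

```python
from bisect import bisect_left, bisect_right


def classify(value, bins, labels, right=True):
    """Return the label of the interval that contains value."""
    # index 0 means "below the first edge", len(bins) means "above the last".
    index = bisect_left(bins, value) if right else bisect_right(bins, value)
    if len(labels) == len(bins) + 1:
        # Outer labels catch values outside of the bins.
        return labels[index]
    if len(labels) == len(bins) - 1:
        if 1 <= index <= len(bins) - 1:
            return labels[index - 1]
        return None  # stands in for NaN
    raise ValueError("labels must have len(bins) - 1 or len(bins) + 1 items")


bins = [0, 10, 20]
inner_labels = ["low", "high"]
outer_labels = ["below", "low", "high", "above"]

print(classify(10, bins, inner_labels))               # right edge included
print(classify(10, bins, inner_labels, right=False))  # left edge included
print(classify(25, bins, inner_labels))               # outside of the bins
print(classify(25, bins, outer_labels))               # caught by the last label
```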

class dask_geomodeling.geometry.field_operations.ClassifyFromColumns(source, value_column, bin_columns, labels, right=True)
Classify a continuous-valued property based on bins located in different columns.
Parameters:
 source (GeometryBlock) – geometry source to classify
 value_column (string) – the column name that contains values to classify
 bin_columns (list) – column names in which the bins are stored. The bin values need to be sorted in increasing order.
 labels (list) – specifies the labels for the returned bins
 right (boolean) – whether the intervals include the right or the left bin edge. Default True.

class dask_geomodeling.geometry.field_operations.Add(source, other)
Addition of series and other, element-wise.

class dask_geomodeling.geometry.field_operations.Subtract(source, other)
Subtraction of series and other, element-wise.

class dask_geomodeling.geometry.field_operations.Multiply(source, other)
Multiplication of series and other, element-wise.

class dask_geomodeling.geometry.field_operations.Divide(source, other)
Floating division of series and other, element-wise.
Putting source in the divisor is not possible: use the Power block for that instead.
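The remark about Power rests on the identity c / x = c * x ** -1. The sketch below checks it on plain Python floats. Combining the blocks as Multiply(Power(series, -1), c) would presumably express the same thing, but that composition is an assumption based on the signatures listed here and is not shown in the source.

```python
# Hypothetical property values and a constant numerator.
values = [1.0, 2.0, 4.0]
numerator = 8.0

# Raising to the power -1 yields the reciprocal of each element ...
reciprocals = [value ** -1 for value in values]

# ... so multiplying by the numerator gives numerator / value.
quotients = [numerator * reciprocal for reciprocal in reciprocals]

print(quotients)
```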

class dask_geomodeling.geometry.field_operations.FloorDivide(source, other)
Integer (floor) division of series and other, element-wise.

class dask_geomodeling.geometry.field_operations.Power(source, other)
Power (exponent) of series and other, element-wise.

class dask_geomodeling.geometry.field_operations.Modulo(source, other)
Modulo of series and other, element-wise.

class dask_geomodeling.geometry.field_operations.Equal(source, other)
Equal to of series and other, element-wise.

class dask_geomodeling.geometry.field_operations.NotEqual(source, other)
Not equal to of series and other, element-wise.

class dask_geomodeling.geometry.field_operations.Greater(source, other)
Greater than of series and other, element-wise.

class dask_geomodeling.geometry.field_operations.GreaterEqual(source, other)
Greater than or equal to of series and other, element-wise.

class dask_geomodeling.geometry.field_operations.Less(source, other)
Less than of series and other, element-wise.

class dask_geomodeling.geometry.field_operations.LessEqual(source, other)
Less than or equal to of series and other, element-wise.

class dask_geomodeling.geometry.field_operations.And(source, other)
Logical AND between series and other.

class dask_geomodeling.geometry.field_operations.Or(source, other)
Logical OR between series and other.

class dask_geomodeling.geometry.field_operations.Xor(source, other)
Logical XOR between series and other.

class dask_geomodeling.geometry.field_operations.Invert(source, *args)
Logical NOT operation on a series.

class dask_geomodeling.geometry.field_operations.Where(source, cond, other)
Replace values where the condition is False.
Parameters:
 source (SeriesBlock) – source data
 cond (SeriesBlock) – condition that determines whether to keep values from source
 other (SeriesBlock, scalar) – entries where cond is False are replaced with the corresponding value from other.

class dask_geomodeling.geometry.field_operations.Mask(source, cond, other)
Replace values where the condition is True.
Parameters:
 source (SeriesBlock) – source data
 cond (SeriesBlock) – condition that determines whether to mask values from source
 other (SeriesBlock, scalar) – entries where cond is True are replaced with the corresponding value from other.
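Where and Mask are mirror images of each other, which a list-based sketch makes explicit. The helpers below are hypothetical stand-ins that only handle a scalar other; they illustrate the described semantics and are not the blocks themselves.

```python
def where(source, cond, other):
    """Keep source values where cond is True, use other elsewhere."""
    return [value if keep else other for value, keep in zip(source, cond)]


def mask(source, cond, other):
    """Use other where cond is True, keep source values elsewhere."""
    return [other if hide else value for value, hide in zip(source, cond)]


source = [5, 15, 25]
cond = [value > 10 for value in source]  # [False, True, True]

print(where(source, cond, 0))
print(mask(source, cond, 0))
```

Applying mask with a condition gives the same result as applying where with the inverted condition.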

class dask_geomodeling.geometry.field_operations.Round(source, decimals=0)
Round each value in a SeriesBlock to the given number of decimals.
Parameters:
 source (SeriesBlock) – source data
 decimals (int) – number of decimal places to round to (default: 0). If decimals is negative, it specifies the number of positions to the left of the decimal point.
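The effect of a negative decimals value can be shown with Python's built-in round, used here only as a stand-in for the block. How the block treats exact ties is not covered by the text above, so the sample values avoid them.

```python
values = [1234.5678, 0.126]

# Positive decimals: positions to the right of the decimal point.
two_decimals = [round(value, 2) for value in values]

# Negative decimals: positions to the left of the decimal point.
hundreds = [round(value, -2) for value in values]

print(two_decimals)
print(hundreds)
```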
dask_geomodeling.geometry.geom_operations
Module containing operations that return series from geometry fields

class dask_geomodeling.geometry.geom_operations.Area(source, projection)
Block that calculates the area of geometries.
Parameters:
 source (GeometryBlock) – geometry data
 projection (string) – projection as EPSG or WKT string to compute area in
Returns: SeriesBlock with only the computed area
dask_geomodeling.geometry.merge
Module containing merge operations that act on geometry blocks

class dask_geomodeling.geometry.merge.MergeGeometryBlocks(left, right, how='inner', suffixes=('', '_right'))
Merge two GeometryBlocks into one by index.
Parameters:
 left (GeometryBlock) – left geometry data to merge
 right (GeometryBlock) – right geometry data to merge
 how (string) – type of merge to be performed. One of 'left', 'right', 'outer', 'inner'. Default 'inner'.
 suffixes (tuple) – suffix to apply to overlapping column names in the left and right side, respectively. Default ('', '_right').
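The how and suffixes parameters can be illustrated with a dictionary-based sketch of an index merge. The helper merge_by_index is hypothetical and works on plain dicts of {index: {column: value}}. It leaves out geometry handling, and columns missing on one side are simply absent instead of being filled with NaN, so it only approximates what a dataframe merge would return.

```python
def merge_by_index(left, right, how="inner", suffixes=("", "_right")):
    """Merge two {index: {column: value}} mappings on their index."""
    if how == "inner":
        keys = [key for key in left if key in right]
    elif how == "left":
        keys = list(left)
    elif how == "right":
        keys = list(right)
    elif how == "outer":
        keys = list(left) + [key for key in right if key not in left]
    else:
        raise ValueError(f"unknown merge type {how!r}")

    # Column names present on both sides receive the suffixes.
    left_columns = {column for row in left.values() for column in row}
    right_columns = {column for row in right.values() for column in row}
    overlap = left_columns & right_columns

    merged = {}
    for key in keys:
        row = {}
        for column, value in left.get(key, {}).items():
            row[column + suffixes[0] if column in overlap else column] = value
        for column, value in right.get(key, {}).items():
            row[column + suffixes[1] if column in overlap else column] = value
        merged[key] = row
    return merged


left = {1: {"name": "a", "area": 10.0}, 2: {"name": "b", "area": 20.0}}
right = {2: {"name": "B", "agg": 5}, 3: {"name": "C", "agg": 7}}

print(merge_by_index(left, right))
print(sorted(merge_by_index(left, right, how="outer")))
```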
dask_geomodeling.geometry.parallelize
Module containing blocks that parallelize non-geometry fields

class dask_geomodeling.geometry.parallelize.GeometryTiler(source, size, projection)
Parallelize operations on a GeometryBlock by tiling the request.
Parameters:
 source (GeometryBlock) – the GeometryBlock to tile requests for
 size (float) – the max size of a tile in units of the projection
 projection (string) – the projection as EPSG or WKT string in which to compute tiles
Only supports 'centroid' and 'extent' request modes.
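Tiling a request can be pictured as cutting the requested bounding box into pieces no larger than size. The sketch below does this for a (minx, miny, maxx, maxy) tuple. The helper tile_bbox is hypothetical, and starting the tiles at the lower-left corner of the request is an assumption; the library may align its tiles differently.

```python
import math


def tile_bbox(bbox, size):
    """Split a bbox into tiles of at most size x size."""
    minx, miny, maxx, maxy = bbox
    columns = max(1, math.ceil((maxx - minx) / size))
    rows = max(1, math.ceil((maxy - miny) / size))
    tiles = []
    for column in range(columns):
        for row in range(rows):
            x0 = minx + column * size
            y0 = miny + row * size
            # Clip the last tile in each direction to the requested bbox.
            tiles.append((x0, y0, min(x0 + size, maxx), min(y0 + size, maxy)))
    return tiles


tiles = tile_bbox((0, 0, 25, 10), size=10)
print(tiles)
```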
dask_geomodeling.geometry.set_operations
Module containing geometry block set operations

class dask_geomodeling.geometry.set_operations.Difference(source, other)
Block that calculates the difference of two GeometryBlocks.
The resulting GeometryBlock will have all geometries in source, and if there are geometries with the same ID in other, the geometries will be adapted using the Difference operation.

class dask_geomodeling.geometry.set_operations.Intersection(source, other=None)
Block that intersects geometries with the requested geometry.
Parameters:
 source (GeometryBlock) – the source of geometry data
dask_geomodeling.geometry.sources
Module containing geometry sources.

class dask_geomodeling.geometry.sources.GeometryFileSource(url, layer=None, id_field='id')
A geometry source that opens a geometry file from disk.
Parameters:
 url – URL to the file. File paths have to be contained inside the current root setting. Relative paths are interpreted relative to this setting (but internally stored as absolute paths).
 layer (string) – the layer_name in the json to use as source. If None, the first layer is used.
 id_field (string) – the field name to use as unique ID. Default 'id'.
The input of these blocks is by default limited to 10000 geometries.
Relevant settings can be adapted as follows:
>>> from dask import config
>>> config.set({"geomodeling.root": '/my/data/path'})
>>> config.set({"geomodeling.geometrylimit": 100000})
dask_geomodeling.geometry.sinks

class dask_geomodeling.geometry.sinks.GeometryFileSink(source, url, extension='shp', fields=None)
Write geometry data to files in a specified directory.
Use GeometryFileSink.merge_files to merge tiles into one large file.
Parameters:
 source – the block the data is coming from
 url – the target directory to put the files in
 extension – the file extension (defines the format). The options depend on the platform. See GeometryFileSink.supported_extensions.
 fields – a mapping that relates column names to output file field names, {<output file field name>: <column name>, ...}.
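The direction of the fields mapping is easy to get backwards: the keys are the names written to the file and the values are the existing column names. The sketch below applies such a mapping to a single record held in a plain dict. The helper apply_fields and the sample names are hypothetical, and dropping unmapped columns is an assumption about the export, not something the text confirms.

```python
def apply_fields(record, fields):
    """Build an output record from {output field name: column name}."""
    return {output_name: record[column] for output_name, column in fields.items()}


# Hypothetical record with three property columns.
record = {"id": 598, "agg": 12.5, "name": "parcel"}

# Write column "id" as "ident" and column "agg" as "total".
fields = {"ident": "id", "total": "agg"}

exported = apply_fields(record, fields)
print(exported)
```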
Relevant settings can be adapted as follows:
>>> from dask import config
>>> config.set({"geomodeling.root": '/my/output/data/path'})

static merge_files(path, target, remove_source=False)
Merge files (the output of this Block) into one single file.
Optionally removes the source files.

dask_geomodeling.geometry.sinks.to_file(source, url, fields=None, tile_size=None, dry_run=False, **request)
Utility function to export data from a GeometryBlock to a file on disk.
You need to specify the target file path as well as the extent geometry you want to save.
Parameters:
 source (GeometryBlock) – the block the data is coming from
 url (str) – The target file path. The extension determines the format. For supported formats, consult GeometryFileSink.supported_extensions.
 fields (dict) – a mapping that relates column names to output file field names, {<output file field name>: <column name>, ...}.
 tile_size (int) – Optionally use this for large exports to stay within memory constraints. The export is split in tiles of given size (units are determined by the projection). Finally the tiles are merged.
 dry_run (bool) – Do nothing, only validate the arguments.
 geometry (shapely Geometry) – Limit exported objects to objects whose centroid intersects with this geometry.
 projection (str) – The projection as a WKT string or EPSG code. Sets the projection of the geometry argument, the target projection of the data, and the tiling projection.
 mode (str) – one of {"intersects", "centroid"}, default "centroid"
 start (datetime) – start date as UTC datetime
 stop (datetime) – stop date as UTC datetime
 **request – see GeometryBlock request specification
Relevant settings can be adapted as follows:
>>> from dask import config
>>> config.set({"geomodeling.root": '/my/output/data/path'})
>>> config.set({"temporary_directory": '/my/alternative/tmp/dir'})
dask_geomodeling.geometry.text
Module containing text column operations that act on geometry blocks

class dask_geomodeling.geometry.text.ParseTextColumn(source, source_column, key_mapping)
Parses a text column into (possibly multiple) value columns.
Each key and its value need to be separated by an equals sign (=).
Parameters:
 source (GeometryBlock) – data source
 source_column (string) – existing column in source.
 key_mapping (dict) – mapping containing pairs {key_name: column_name}. key_name is an existing key in the text to be parsed; column_name is the name of the new column that contains the parsed value.
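The role of key_mapping can be sketched on a single text value. The text above only states that a key and its value are joined by an equals sign; that pairs are separated by whitespace, that values stay strings, and that missing keys yield None are all assumptions of this sketch. The helper parse_text is hypothetical.

```python
def parse_text(text, key_mapping):
    """Extract the mapped keys from a 'key=value key=value' string."""
    pairs = {}
    for token in text.split():  # assumption: pairs are whitespace-separated
        key, separator, value = token.partition("=")
        if separator:
            pairs[key] = value
    # One output column per entry in key_mapping; absent keys give None.
    return {column: pairs.get(key) for key, column in key_mapping.items()}


key_mapping = {"height": "height_m", "material": "mat"}
parsed = parse_text("height=12.5 material=brick", key_mapping)
print(parsed)
```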