Geometry and Series Blocks
Geometry-type blocks contain sets of geometries, optionally with 'start' and 'end' fields and other properties. Internally, geometry data is stored in GeoPandas GeoDataFrames.
API Specification
Module containing the base geometry block classes.
- class dask_geomodeling.geometry.base.GeometryBlock(*args)
The base block for geometries
All geometry blocks must be derived from this base class and must implement the following attribute:
columns: a set of column names to expect in the dataframe
A geometry request contains the following fields:
mode: one of {"intersects", "centroid", "extent"}
geometry: limit returned objects to objects that intersect with this shapely geometry object
projection: projection to return the geometries in as WKT string
limit: the maximum number of geometries
min_size: geometries with a bbox that is smaller than this on all sides are left out
start: start date as UTC datetime
stop: stop date as UTC datetime
filters: dict of Django ORM-like filters on properties (e.g. id=598)
The data response is a dictionary with the following fields:
(if mode was "intersects" or "centroid") "features": a GeoDataFrame of features with properties
(if mode was "extent") "extent": a tuple of 4 numbers (min_x, min_y, max_x, max_y) that represents the extent of the geometries that would be returned by an "intersects" request.
(for all modes) "projection": the EPSG or WKT representation of the projection.
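To make the "extent" response concrete, the sketch below combines per-feature bounding boxes into a single (min_x, min_y, max_x, max_y) tuple. The helper name combined_extent and the sample boxes are hypothetical; this is plain Python written for illustration, not the library's implementation.

```python
def combined_extent(bboxes):
    """Return (min_x, min_y, max_x, max_y) covering all given bounding boxes.

    Each bbox is a (min_x, min_y, max_x, max_y) tuple.
    This hypothetical helper returns None for empty input.
    """
    bboxes = list(bboxes)
    if not bboxes:
        return None
    min_xs, min_ys, max_xs, max_ys = zip(*bboxes)
    return (min(min_xs), min(min_ys), max(max_xs), max(max_ys))


extent = combined_extent([(0, 0, 2, 1), (1, -1, 3, 4)])
print(extent)  # (0, -1, 3, 4)
```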
To be able to perform operations on properties, there is a helper type called SeriesBlock. This is the block equivalent of a pandas.Series. You can get a SeriesBlock from a GeometryBlock, perform operations on it, and set it back into a GeometryBlock.
- to_file(*args, **kwargs)
Utility function to export data from this block to a file on disk.
You need to specify the target file path as well as the extent geometry you want to save. Feature properties can be saved by providing a field mapping to the fields argument.
To stay within memory constraints or to parallelize an operation, the tile_size argument can be provided.
- Parameters:
url (str) – The target file path. The extension determines the format. For supported formats, consult GeometryFileSink.supported_extensions.
fields (dict) – a mapping that relates column names to output file field names, {<output file field name>: <column name>, ...}.
tile_size (int) – Optionally use this for large exports to stay within memory constraints. The export is split in tiles of given size (units are determined by the projection). Finally the tiles are merged.
geometry (shapely Geometry) – Limit exported objects to objects whose centroid intersects with this geometry.
projection (str) – The projection as a WKT string or EPSG code. Sets the projection of the geometry argument, the target projection of the data, and the tiling projection.
start (datetime) – start date as UTC datetime
stop (datetime) – stop date as UTC datetime
**request – see GeometryBlock request specification
- Relevant settings can be adapted as follows:
>>> from dask import config
>>> config.set({"geomodeling.root": '/my/output/data/path'})
>>> config.set({"geomodeling.geometry-limit": 10000})
>>> config.set({"temporary_directory": '/my/alternative/tmp/dir'})
- class dask_geomodeling.geometry.base.GetSeriesBlock(source, name)
Obtain a single feature property column from a GeometryBlock.
Provide a GeometryBlock with one or more columns. One of these columns can be read from this source into a SeriesBlock. This SeriesBlock can then be used, for example, to run classifications.
- Parameters:
source (GeometryBlock) – GeometryBlock with the column you want to load into the SeriesBlock.
name (str) – Name of the column to load into the SeriesBlock.
- Returns:
SeriesBlock containing the single property column
- class dask_geomodeling.geometry.base.SeriesBlock(*args)
A block that represents one column from a GeometryBlock.
Use this helper class to modify (or to use logic on) a specific feature property.
Use dask_geomodeling.geometry.base.GetSeriesBlock to retrieve a SeriesBlock from a GeometryBlock and dask_geomodeling.geometry.base.SetSeriesBlock to add a SeriesBlock to a GeometryBlock.
- class dask_geomodeling.geometry.base.SetSeriesBlock(source, column, value, *args)
Add one or multiple property columns (SeriesBlocks) to a GeometryBlock.
Provide the GeometryBlock that you want to add more properties to. Then provide the SeriesBlock(s) which you want to add to the GeometryBlock. The values of the SeriesBlock will be added to the features in the GeometryBlock automatically (if they are derived from the same geometries in previous operations, the features will have matching indexes so that each property is matched to the correct feature).
The value which is set can also be a single value, in which case each feature will get the same value as a property.
- Parameters:
source (GeometryBlock) – The base GeometryBlock to which the SeriesBlock is added as a new column.
column (str) – The name of the new column (if it exists, it will be overwritten)
value (SeriesBlock, number, str, bool) – The SeriesBlock or constant value that has to be inserted in the destination column.
*args – It is possible to repeat the "column" and "value" arguments multiple times to insert more than one column.
Example
Add two columns to an existing view like this: SetSeriesBlock(view, "column_1", series_1, "column_2", series_2).
- Returns:
The source GeometryBlock with additional property columns
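The repeated "column" and "value" arguments of SetSeriesBlock can be read as alternating pairs. The following plain-Python sketch shows one way such a flat argument list could be grouped; the function name pair_columns is hypothetical and the code does not use or reproduce the library itself.

```python
def pair_columns(column, value, *args):
    """Group alternating (column, value) arguments into a list of pairs."""
    if len(args) % 2 != 0:
        raise ValueError("extra arguments must come in (column, value) pairs")
    flat = (column, value) + args
    # Even positions hold column names, odd positions hold the values.
    return list(zip(flat[0::2], flat[1::2]))


pairs = pair_columns("column_1", [1, 2], "column_2", "constant")
print(pairs)  # [('column_1', [1, 2]), ('column_2', 'constant')]
```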
dask_geomodeling.geometry.aggregate
Module containing geometry blocks that aggregate rasters.
- class dask_geomodeling.geometry.aggregate.AggregateRaster(source, raster, statistic='sum', projection=None, pixel_size=None, max_pixels=None, column_name='agg', auto_pixel_size=False, *args)
Compute statistics of a raster for each geometry in a geometry source.
A statistic is computed in a specific projection and with a specified raster cell size. If projection or pixel_size are not given, these default to the native projection and cell size of the provided raster source. The following cells are selected to perform the statistic (e.g. mean) on:
Polygons: all raster cells whose center is inside the polygon
Points: the raster cell (singular) that contains the point
Linestrings: Bresenham’s line algorithm is used
If this assignment leads to the situation that a geometry covers no raster cells (for instance with a polygon much smaller than the raster cell size), the geometry is reduced to a point by taking its centroid.
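For linestrings, the cell selection relies on Bresenham's line algorithm. As a reminder of what that algorithm does, here is a minimal integer version that lists the grid cells between two cell indices. It is a textbook sketch with a hypothetical function name, and it is not taken from the library's source.

```python
def bresenham(x0, y0, x1, y1):
    """Return the grid cells (x, y) visited by a line from (x0, y0) to (x1, y1)."""
    cells = []
    dx = abs(x1 - x0)
    dy = -abs(y1 - y0)
    sx = 1 if x0 < x1 else -1  # step direction along x
    sy = 1 if y0 < y1 else -1  # step direction along y
    err = dx + dy
    while True:
        cells.append((x0, y0))
        if x0 == x1 and y0 == y1:
            break
        e2 = 2 * err
        if e2 >= dy:
            err += dy
            x0 += sx
        if e2 <= dx:
            err += dx
            y0 += sy
    return cells


print(bresenham(0, 0, 3, 1))  # [(0, 0), (1, 0), (2, 1), (3, 1)]
```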
Should the combination of the requested pixel_size and the extent of the source geometry cause the required raster size to exceed max_pixels, the pixel_size can be adjusted automatically if auto_pixel_size is set to True, else (the default) a RuntimeError is raised.
Please note that for any field operation on the result of this block a GetSeriesBlock should be used to retrieve data from the added column. The name of the added column is determined by the column_name parameter.
- Parameters:
source (GeometryBlock) – The geometry source for which the statistics are determined.
raster (RasterBlock) – The raster source that is sampled.
statistic (str) – The type of statistical analysis that should be performed. The options are: {"sum", "count", "min", "max", "mean", "median", "p<percentile>"}. Percentiles are provided for example as follows: "p50". Default "sum".
projection (str, optional) – Projection to perform the aggregation in, for example "EPSG:28992". Defaults to the native projection of the supplied raster.
pixel_size (float, optional) – The raster cell size used in the aggregation. Defaults to the cell size of the supplied raster.
max_pixels (int, optional) – The maximum number of pixels (cells) in the aggregation. Defaults to the geomodeling.raster-limit setting.
column_name (str, optional) – The name of the column where the result should be placed. Defaults to "agg".
auto_pixel_size (boolean) – Determines whether the pixel size is adjusted automatically when "max_pixels" is exceeded. Default False.
- Returns:
GeometryBlock with aggregation results in an added column
- The global raster-limit setting can be adapted as follows:
>>> from dask import config
>>> config.set({"geomodeling.raster-limit": 10 ** 9})
- class dask_geomodeling.geometry.aggregate.AggregateRasterAboveThreshold(source, raster, statistic='sum', projection=None, pixel_size=None, max_pixels=None, column_name='agg', auto_pixel_size=False, threshold_name=None)
Compute statistics of a per-feature masked raster for each geometry in a geometry source.
Per feature, a threshold can be supplied to mask the raster with. Only values that exceed the threshold of a specific feature are included for the statistical value of that feature.
See dask_geomodeling.geometry.aggregate.AggregateRaster for further information.
- Parameters:
*args – See dask_geomodeling.geometry.aggregate.AggregateRaster
threshold_name (str) – The column that holds the thresholds.
- Returns:
GeometryBlock with aggregation results in an added column
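The per-feature masking described for AggregateRasterAboveThreshold can be pictured with plain Python lists. In the sketch below, the cell values, feature ids and thresholds are made-up sample data and sum_above_threshold is a hypothetical helper; only the "sum" statistic is shown.

```python
def sum_above_threshold(cell_values, threshold):
    """Sum only the raster cell values that exceed the feature's threshold."""
    return sum(v for v in cell_values if v > threshold)


# Hypothetical raster cell values per feature id, and one threshold per feature.
cells = {1: [0.5, 2.0, 3.5], 2: [0.5, 2.0, 3.5]}
thresholds = {1: 1.0, 2: 3.0}

result = {fid: sum_above_threshold(cells[fid], thresholds[fid]) for fid in cells}
print(result)  # {1: 5.5, 2: 3.5}
```

Both features cover the same cells here, yet their results differ because each feature applies its own threshold.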
dask_geomodeling.geometry.constructive
Module containing geometry block constructive operations
- class dask_geomodeling.geometry.constructive.Buffer(source, distance, projection, resolution=16)
Buffer (‘expand’) geometries with a given value.
A GeometryBlock and a buffer distance are provided. Each feature in the GeometryBlock is buffered with the distance provided, resulting in updated geometries.
- Parameters:
source (GeometryBlock) – The source GeometryBlock whose geometry will be updated.
distance (float) – The distance used to buffer all features. The distance is measured in the unit of the given projection (e.g. m, °).
projection (str) – The projection used in the operation provided in the format: "EPSG:28992".
resolution (integer, optional) – The resolution of the buffer provided as the number of points used to represent a quarter of a circle. The default value is 16.
- Returns:
GeometryBlock with buffered geometries.
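To illustrate the resolution parameter, the sketch below approximates the buffer of a single point by placing 4 * resolution vertices on a circle, so that each quarter circle gets resolution points. The function buffered_point is hypothetical, and the vertex order or ring closing of the actual buffer operation may differ.

```python
import math


def buffered_point(x, y, distance, resolution=16):
    """Approximate a circle around (x, y) with `resolution` points per quarter."""
    n = 4 * resolution
    return [
        (
            x + distance * math.cos(2 * math.pi * i / n),
            y + distance * math.sin(2 * math.pi * i / n),
        )
        for i in range(n)
    ]


ring = buffered_point(0.0, 0.0, 10.0, resolution=16)
print(len(ring))  # 64
```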
- class dask_geomodeling.geometry.constructive.Simplify(source, tolerance=None, preserve_topology=True)
Simplify geometries, mainly to make them computationally more efficient.
Provide a GeometryBlock and a tolerance value to simplify the geometries. As a result all features in the GeometryBlock are simplified.
- Parameters:
source (GeometryBlock) – Source of the geometries to be simplified.
tolerance (float) – The tolerance used in the simplification. If no tolerance is given the "min_size" request parameter is used.
preserve_topology (boolean, optional) – Determines whether the topology should be preserved in the operation. Defaults to True.
- Returns:
GeometryBlock which was provided as input with a simplified geometry.
dask_geomodeling.geometry.field_operations
Module containing geometry block operations that act on non-geometry fields
- class dask_geomodeling.geometry.field_operations.Add(source, other)
Element-wise addition of SeriesBlock or number to another SeriesBlock.
- Parameters:
source (SeriesBlock) – First addition term
other (SeriesBlock or number) – Second addition term
- Returns:
SeriesBlock
- class dask_geomodeling.geometry.field_operations.And(source, other)
Perform an elementwise logical AND between two SeriesBlocks.
If a feature has a True value in both SeriesBlocks, True is returned, else False is returned.
- Parameters:
source (SeriesBlock) – First boolean term
other (SeriesBlock) – Second boolean term
- Returns:
SeriesBlock with boolean values
- class dask_geomodeling.geometry.field_operations.Classify(source, bins, labels, right=True)
Classify a value column into different bins
For example: every value below 3 becomes “A”, every value between 3 and 5 becomes “B”, and every value above 5 becomes “C”.
The provided SeriesBlock will be classified according to the given classification parameters. These parameters consist of two lists, one with the edges of the classification bins (e.g. [3, 5]) and one with the desired class output (e.g. ["A", "B", "C"]). The input data is then compared to the classification bins. In this example a value 1 is below 3 so it gets class "A". A value 4 is between 3 and 5 so it gets label "B".
How values outside of the bins are classified depends on the length of the labels list. If the length of the labels equals the length of the bin edges plus 1 (the above example), then values outside of the bins are classified to the first and last elements of the labels list. If the length of the labels equals the length of the bins minus 1, then values outside of the bins are classified to ‘no data’.
- Parameters:
source (SeriesBlock) – The (numeric) data which should be classified.
bins (list) – The edges of the classification intervals (e.g. [3, 5]).
labels (list) – The classification returned if a value falls in a specific bin (e.g. ["A", "B", "C"]). The length of this list is either one larger or one less than the length of the bins argument. Labels should be unique. If labels are numeric, they are always converted to float to be able to deal with NaN values.
right (boolean, optional) – Determines which side of the intervals is closed. Defaults to True (the right side of the bin is closed, so a value is assigned to the bin on the left if it is exactly on a bin edge).
- Returns:
A SeriesBlock with classified values instead of the original numbers.
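The binning rules above can be sketched in plain Python with the standard bisect module. The classify function below is a hypothetical stand-in that uses None for ‘no data’ and ignores missing input values; it mirrors the documented behaviour but is not the library's implementation.

```python
from bisect import bisect_left, bisect_right


def classify(values, bins, labels, right=True):
    """Assign a label to each value based on sorted bin edges."""
    if len(labels) not in (len(bins) + 1, len(bins) - 1):
        raise ValueError("labels must have one more or one fewer item than bins")
    # right=True: a value exactly on an edge belongs to the bin on its left.
    find = bisect_left if right else bisect_right
    open_ended = len(labels) == len(bins) + 1
    result = []
    for value in values:
        index = find(bins, value)
        if not open_ended:
            index -= 1  # the first label starts at the first bin edge
            if not 0 <= index < len(labels):
                result.append(None)  # outside of the bins: 'no data'
                continue
        result.append(labels[index])
    return result


print(classify([1, 4, 7, 3], [3, 5], ["A", "B", "C"]))  # ['A', 'B', 'C', 'A']
print(classify([1, 4, 7], [3, 5], ["B"]))  # [None, 'B', None]
```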
- class dask_geomodeling.geometry.field_operations.ClassifyFromColumns(source, value_column, bin_columns, labels, right=True)
Classify a continuous-valued geometry property based on bins located in other columns.
See dask_geomodeling.geometry.field_operations.Classify for further information.
- Parameters:
source (GeometryBlock) – The GeometryBlock which contains the column which should be classified as well as columns with the bin edges.
value_column (str) – The column with (float) data which should be classified.
bin_columns (list) – A list of columns that contain the bins for the classification. The order of the columns should be from low to high values.
labels (list) – The classification returned if a value falls in a specific bin (e.g. ["A", "B", "C"]). The length of this list is either one larger or one less than the length of the bin_columns argument. Labels should be unique. If labels are numeric, they are always converted to float to be able to deal with NaN values.
right (boolean, optional) – Determines which side of the intervals is closed. Defaults to True (the right side of the bin is closed, so a value is assigned to the bin on the left if it is exactly on a bin edge).
- Returns:
A SeriesBlock with classified values instead of the original floats.
- class dask_geomodeling.geometry.field_operations.Divide(source, other)
Element-wise division of SeriesBlock or number with another SeriesBlock.
Note that if you want to divide a constant value by a SeriesBlock (like 3 / series), you have to do Multiply(3, Power(series, -1)).
- Parameters:
source (SeriesBlock) – Numerator
other (SeriesBlock or number) – Denominator
- Returns:
SeriesBlock
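The workaround for dividing a constant by a series rests on the identity 3 / x = 3 * x ** -1. The short check below verifies that identity on a plain Python list with made-up values; it does not use the block classes themselves.

```python
import math

series = [2.0, 4.0, 8.0]

direct = [3 / x for x in series]
# Same computation, expressed as a multiplication with a negative power.
workaround = [3 * x ** -1 for x in series]

assert all(math.isclose(a, b) for a, b in zip(direct, workaround))
print(workaround)  # [1.5, 0.75, 0.375]
```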
- class dask_geomodeling.geometry.field_operations.Equal(source, other)
Determine whether a SeriesBlock and a second SeriesBlock or a constant value are equal.
Note that ‘no data’ does not equal ‘no data’.
- Parameters:
source (SeriesBlock) – First comparison term
other (SeriesBlock or number) – Second comparison term
- Returns:
SeriesBlock with boolean values
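The remark that ‘no data’ does not equal ‘no data’ matches the behaviour of floating-point NaN. The sketch below assumes that missing values are represented as NaN (as is common in pandas) and uses plain Python lists with sample values.

```python
nan = float("nan")

left = [1.0, nan, 3.0]
right = [1.0, nan, 4.0]

equal = [a == b for a, b in zip(left, right)]
not_equal = [a != b for a, b in zip(left, right)]

print(equal)      # [True, False, False]
print(not_equal)  # [False, True, True]
```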
- class dask_geomodeling.geometry.field_operations.FloorDivide(source, other)
Element-wise integer division of SeriesBlock or number with another SeriesBlock.
The outcome of the division is converted to the closest integer below (i.e. 3.4 becomes 3, 3.9 becomes 3 and -3.4 becomes -4)
- Parameters:
source (SeriesBlock) – Numerator
other (SeriesBlock or number) – Denominator
- Returns:
SeriesBlock
- class dask_geomodeling.geometry.field_operations.Greater(source, other)
Determine for each value in a SeriesBlock whether it is greater than a comparison value from a SeriesBlock or constant.
- Parameters:
source (SeriesBlock) – First comparison term
other (SeriesBlock or number) – Second comparison term
- Returns:
SeriesBlock with boolean values
- class dask_geomodeling.geometry.field_operations.GreaterEqual(source, other)
Determine for each value in a SeriesBlock whether it is greater than or equal to a comparison value from a SeriesBlock or constant.
- Parameters:
source (SeriesBlock) – First comparison term
other (SeriesBlock or number) – Second comparison term
- Returns:
SeriesBlock with boolean values
- class dask_geomodeling.geometry.field_operations.Invert(source, *args)
Invert a boolean SeriesBlock (swap True and False)
- Parameters:
source (SeriesBlock) – SeriesBlock with boolean values.
- Returns:
SeriesBlock with boolean values
- class dask_geomodeling.geometry.field_operations.Less(source, other)
Determine for each value in a SeriesBlock whether it is less than a comparison value from a SeriesBlock or constant.
- Parameters:
source (SeriesBlock) – First comparison term
other (SeriesBlock or number) – Second comparison term
- Returns:
SeriesBlock with boolean values
- class dask_geomodeling.geometry.field_operations.LessEqual(source, other)
Determine for each value in a SeriesBlock whether it is less than or equal to a comparison value from a SeriesBlock or constant.
- Parameters:
source (SeriesBlock) – First comparison term
other (SeriesBlock or number) – Second comparison term
- Returns:
SeriesBlock with boolean values
- class dask_geomodeling.geometry.field_operations.Mask(source, cond, other)
Replace values in a SeriesBlock where values in another SeriesBlock are True.
Provide a source SeriesBlock, a conditional SeriesBlock (True/False) and a replacement value which can either be a SeriesBlock or a constant value. All entries in the source that correspond to a False value in the conditional are left unchanged. The values in the source that correspond to a True value in the conditional are replaced with the value from ‘other’.
- Parameters:
source (SeriesBlock) – Source SeriesBlock that is going to be updated
cond (SeriesBlock) – Conditional SeriesBlock that determines whether features in the source SeriesBlock will be updated. If this is not boolean (True/False), then all data values (including 0) are interpreted as True. Missing values are always interpreted as False.
other (SeriesBlock or constant) – The value that should be used as a replacement for the source SeriesBlock where the conditional SeriesBlock is True.
- Returns:
SeriesBlock with updated values where condition is True.
- class dask_geomodeling.geometry.field_operations.Modulo(source, other)
Element-wise modulo (remainder after division) of SeriesBlock or number with another SeriesBlock.
Example: if the input is [31, 5.3, -4] and the modulus is 3, the outcome would be [1, 2.3, 2]. The outcome is always positive and less than the modulus.
- Parameters:
source (SeriesBlock) – Number
other (SeriesBlock or number) – Modulus
- Returns:
SeriesBlock
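The documented example can be reproduced with Python's own % operator, which for a positive modulus follows the same sign convention. The sketch also shows the equivalent formulation via rounding down, value - modulus * floor(value / modulus). Results are rounded for display because 5.3 is not exactly representable as a float.

```python
import math

values = [31, 5.3, -4]
modulus = 3

remainders = [v % modulus for v in values]
# Equivalent formulation: subtract the modulus times the floored quotient.
floor_based = [v - modulus * math.floor(v / modulus) for v in values]

print([round(r, 6) for r in remainders])  # [1, 2.3, 2]
```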
- class dask_geomodeling.geometry.field_operations.Multiply(source, other)
Element-wise multiplication of SeriesBlock or number with another SeriesBlock.
- Parameters:
source (SeriesBlock) – First multiplication factor
other (SeriesBlock or number) – Second multiplication factor
- Returns:
SeriesBlock
- class dask_geomodeling.geometry.field_operations.NotEqual(source, other)
Determine whether a SeriesBlock and a second SeriesBlock or a constant value are not equal.
Note that ‘no data’ does not equal ‘no data’.
- Parameters:
source (SeriesBlock) – First comparison term
other (SeriesBlock or number) – Second comparison term
- Returns:
SeriesBlock with boolean values
- class dask_geomodeling.geometry.field_operations.Or(source, other)
Perform an elementwise logical OR between two SeriesBlocks.
If a feature has a True value in any of the input SeriesBlocks, True is returned, else False is returned.
- Parameters:
source (SeriesBlock) – First boolean term
other (SeriesBlock) – Second boolean term
- Returns:
SeriesBlock with boolean values
- class dask_geomodeling.geometry.field_operations.Power(source, other)
Element-wise raise a SeriesBlock to the power of a number or another SeriesBlock.
For example, the inputs [2, 4] and 2 will give output [4, 16].
- Parameters:
source (SeriesBlock) – Base
other (SeriesBlock or number) – Exponent
- Returns:
SeriesBlock
- class dask_geomodeling.geometry.field_operations.Round(source, decimals=0)
Round each value in a SeriesBlock to the given number of decimals
- Parameters:
source (SeriesBlock) – SeriesBlock with float data that is rounded to the provided number of decimals.
decimals (int, optional) – number of decimal places to round to (default: 0). If decimals is negative, it specifies the number of positions to the left of the decimal point.
- Returns:
SeriesBlock with rounded values.
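The effect of a negative decimals value can be seen with Python's built-in round, which treats the argument the same way. The sample values are made up, and it is an assumption that the block's handling of exact ties matches the built-in function.

```python
values = [1234.5678, 987.654]

print([round(v, 2) for v in values])   # [1234.57, 987.65]
print([round(v, 0) for v in values])   # [1235.0, 988.0]
# Negative decimals round to positions left of the decimal point.
print([round(v, -2) for v in values])  # [1200.0, 1000.0]
```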
- class dask_geomodeling.geometry.field_operations.Subtract(source, other)
Element-wise subtraction of SeriesBlock or number with another SeriesBlock.
Note that if you want to subtract a SeriesBlock from a constant value (like 4 - series), you have to do Add(Multiply(series, -1), 4).
- Parameters:
source (SeriesBlock) – First subtraction term
other (SeriesBlock or number) – Second subtraction term
- Returns:
SeriesBlock
- class dask_geomodeling.geometry.field_operations.Where(source, cond, other)
Replace values in a SeriesBlock where values in another SeriesBlock are False.
Provide a source SeriesBlock, a conditional SeriesBlock (True/False) and a replacement value which can either be a SeriesBlock or a constant value. All entries in the source that correspond to a True value in the conditional are left unchanged. The values in the source that correspond to a False value in the conditional are replaced with the value from ‘other’.
- Parameters:
source (SeriesBlock) – Source SeriesBlock that is going to be updated
cond (SeriesBlock) – Conditional SeriesBlock that determines whether features in the source SeriesBlock will be updated. If this is not boolean (True/False), then all data values (including 0) are interpreted as True. Missing values are always interpreted as False.
other (SeriesBlock or constant) – The value that should be used as a replacement for the source SeriesBlock where the conditional SeriesBlock is False.
- Returns:
SeriesBlock with updated values where condition is False.
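Where and Mask are complementary: Where replaces entries whose condition is False, and Mask replaces entries whose condition is True. The plain-Python sketch below shows both with a constant replacement value. The helper names are hypothetical, None stands in for a missing value, and the mixed condition list is only there to show the documented interpretation rules side by side.

```python
def _is_true(cond):
    """Interpret a condition value following the documented rules."""
    if cond is None:  # missing values count as False
        return False
    if isinstance(cond, bool):
        return cond
    return True  # any non-boolean data value (including 0) counts as True


def where(source, cond, other):
    """Keep source values where cond is True, use `other` elsewhere."""
    return [s if _is_true(c) else other for s, c in zip(source, cond)]


def mask(source, cond, other):
    """Use `other` where cond is True, keep source values elsewhere."""
    return [other if _is_true(c) else s for s, c in zip(source, cond)]


source = [10, 20, 30, 40]
cond = [True, False, None, 0]
print(where(source, cond, -1))  # [10, -1, -1, 40]
print(mask(source, cond, -1))   # [-1, 20, 30, -1]
```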
- class dask_geomodeling.geometry.field_operations.Xor(source, other)
Perform an elementwise logical exclusive OR between two SeriesBlocks.
If a feature has a True value in precisely one of the input SeriesBlocks, True is returned, else False is returned.
- Parameters:
source (SeriesBlock) – First boolean term
other (SeriesBlock) – Second boolean term
- Returns:
SeriesBlock with boolean values
dask_geomodeling.geometry.geom_operations
Module containing operations that return series from geometry fields
- class dask_geomodeling.geometry.geom_operations.Area(source, projection)
Calculate the area of features in a GeometryBlock.
Provide a GeometryBlock and a projection. Returns the area of each individual geometry in the input block, in that projection.
- Parameters:
source (GeometryBlock) – Source GeometryBlock which contains the features.
projection (str) – Projection in which to compute the area (e.g. "epsg:28992").
- Returns:
SeriesBlock with only the computed area
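Since the area is computed from coordinates in the given projection, its unit is the squared unit of that projection. As a reminder of how a planar polygon area follows from coordinates, here is a minimal shoelace-formula sketch. The function polygon_area and the sample ring are hypothetical, and the library's own computation is not shown here.

```python
def polygon_area(coords):
    """Planar area of a simple polygon given as a list of (x, y) vertices."""
    area = 0.0
    n = len(coords)
    for i in range(n):
        x0, y0 = coords[i]
        x1, y1 = coords[(i + 1) % n]  # wrap around to close the ring
        area += x0 * y1 - x1 * y0
    return abs(area) / 2.0


# A 10 x 20 rectangle in projected coordinates (e.g. metres).
rectangle = [(0, 0), (10, 0), (10, 20), (0, 20)]
print(polygon_area(rectangle))  # 200.0
```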
dask_geomodeling.geometry.merge
Module containing merge operations that act on geometry blocks
- class dask_geomodeling.geometry.merge.MergeGeometryBlocks(left, right, how='inner', suffixes=('', '_right'))
Merge two GeometryBlocks into one by index
Provide two GeometryBlocks with the same original source to make sure they can be matched on index. The additional SeriesBlocks that have been added to the GeometryBlock will be combined to one GeometryBlock that contains all the information.
- Parameters:
left (GeometryBlock) – The left GeometryBlock to be combined.
right (GeometryBlock) – The right GeometryBlock to be combined.
how (str, optional) – The parameter that describes how the merge should be performed. There are four options:
"left": The resulting GeometryBlock will have all the features that are present in the left GeometryBlock, no matter the features in the right GeometryBlock.
"right": The resulting GeometryBlock will have all the features that are present in the right GeometryBlock, no matter the features in the left GeometryBlock.
"inner" (default): The outcome will contain all the features that are present in both input GeometryBlocks. Features that are absent in one of the GeometryBlocks will be absent in the result.
"outer": The result will contain all the features which are present in at least one of the input GeometryBlocks.
suffixes (tuple, optional) – Text to be added to the column names to distinguish whether they originate from the left or right GeometryBlock. Default: ("", "_right").
- Returns:
GeometryBlock that contains a combination of features and columns of the two input GeometryBlocks.
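The four merge types and the suffixes argument can be illustrated with dictionaries keyed by feature index. The function merge_by_index and the sample data below are hypothetical; the sketch only mimics the documented semantics and leaves out geometry handling and the filling of missing columns.

```python
def merge_by_index(left, right, how="inner", suffixes=("", "_right")):
    """Merge two {index: {column: value}} mappings on their index."""
    if how == "left":
        index = list(left)
    elif how == "right":
        index = list(right)
    elif how == "inner":
        index = [i for i in left if i in right]
    elif how == "outer":
        index = list(left) + [i for i in right if i not in left]
    else:
        raise ValueError(f"unknown merge type: {how!r}")

    # Columns that occur on both sides get a suffix to tell them apart.
    left_cols = {c for row in left.values() for c in row}
    right_cols = {c for row in right.values() for c in row}
    overlap = left_cols & right_cols

    merged = {}
    for i in index:
        row = {}
        for col, val in left.get(i, {}).items():
            row[col + suffixes[0] if col in overlap else col] = val
        for col, val in right.get(i, {}).items():
            row[col + suffixes[1] if col in overlap else col] = val
        merged[i] = row
    return merged


left = {1: {"area": 5.0}, 2: {"area": 7.0}}
right = {2: {"area": 7.5, "label": "B"}, 3: {"area": 1.0, "label": "C"}}

print(merge_by_index(left, right))
# {2: {'area': 7.0, 'area_right': 7.5, 'label': 'B'}}
print(sorted(merge_by_index(left, right, how="outer")))  # [1, 2, 3]
```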
dask_geomodeling.geometry.parallelize
Module containing blocks that parallelize non-geometry fields
- class dask_geomodeling.geometry.parallelize.GeometryTiler(source, size, projection)
Parallelize operations on a GeometryBlock by tiling the request.
- Parameters:
source (GeometryBlock) – The source GeometryBlock
size (float) – The maximum size of a tile in units of the projection
projection (str) – The projection as EPSG or WKT string in which to compute tiles (e.g. "EPSG:28992")
- Returns:
GeometryBlock that only supports "centroid" and "extent" request modes.
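Tiling a request amounts to splitting its bounding box into boxes no larger than the given size. The sketch below shows one simple way to do that on a regular grid. The function tile_bounds is hypothetical, and the way the library aligns or orders its tiles is not documented here and may differ.

```python
import math


def tile_bounds(extent, size):
    """Split (min_x, min_y, max_x, max_y) into tiles of at most `size` per side."""
    min_x, min_y, max_x, max_y = extent
    nx = max(1, math.ceil((max_x - min_x) / size))
    ny = max(1, math.ceil((max_y - min_y) / size))
    tiles = []
    for i in range(nx):
        for j in range(ny):
            x0 = min_x + i * size
            y0 = min_y + j * size
            # Clip the last row and column to the requested extent.
            tiles.append((x0, y0, min(x0 + size, max_x), min(y0 + size, max_y)))
    return tiles


tiles = tile_bounds((0, 0, 25, 10), size=10)
print(tiles)  # [(0, 0, 10, 10), (10, 0, 20, 10), (20, 0, 25, 10)]
```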
dask_geomodeling.geometry.set_operations
Module containing geometry block set operations
- class dask_geomodeling.geometry.set_operations.Difference(source, other)
Calculate the geometric difference of two GeometryBlocks.
All geometries in the source GeometryBlock will be adapted by geometries with the same index from the second GeometryBlock. The difference operation removes any overlap between the geometries from the first geometry.
- Parameters:
source (GeometryBlock) – First geometry source.
other (GeometryBlock) – Second geometry source.
- Returns:
A GeometryBlock with altered geometries. Properties are preserved.
- class dask_geomodeling.geometry.set_operations.Intersection(source, other=None)
Calculate the intersection of a GeometryBlock with the request geometry.
Normally, geometries returned by a GeometryBlock may be partially outside of the requested geometry. This block ensures that the geometries are strictly inside the requested geometry by taking the intersection of each geometry with the request geometry.
- Parameters:
source (GeometryBlock) – Input geometry source.
- Returns:
A GeometryBlock with altered geometries. Properties are preserved.
dask_geomodeling.geometry.sources
Module containing geometry sources.
- class dask_geomodeling.geometry.sources.GeometryFileSource(url, layer=None, id_field='id')
A geometry source that opens a geometry file from disk.
The input of this block is by default limited by the global geomodeling.geometry-limit setting.
- Parameters:
url (str) – Path (URL) to the file. If relative, it is taken relative to the geomodeling.root setting.
layer (str, optional) – The layer name in the source to select. If None, (default) the first layer is used.
id_field (str, optional) – The field name to use as feature index. Default "id".
- Relevant settings can be adapted as follows:
>>> from dask import config
>>> config.set({"geomodeling.root": '/my/data/path'})
>>> config.set({"geomodeling.geometry-limit": 100000})
- class dask_geomodeling.geometry.sources.GeometryWKTSource(wkt, projection)
Converts a single geometry to a geometry source
- Parameters:
wkt (string) – the WKT representation of a geometry
projection (string) – the projection of the geometry
- Returns:
GeometryBlock
dask_geomodeling.geometry.sinks
- class dask_geomodeling.geometry.sinks.GeometryFileSink(source, url, extension='shp', fields=None)
Write geometry data to files in a specified directory
Use GeometryFileSink.merge_files to merge tiles into one large file.
- Parameters:
source (GeometryBlock) – The block the data is coming from
url (str) – The target directory to put the files in. If relative, it is taken relative to the geomodeling.root setting.
extension (str) – The file extension (defines the format), one of {"shp", "gpkg", "geojson", "gml"}. On some platforms, these options might be limited. For an accurate list, see GeometryFileSink.supported_extensions.
fields (dict) – A mapping that relates column names to output file field names, like {<output file field name>: <column name>}.
- Relevant settings can be adapted as follows:
>>> from dask import config
>>> config.set({"geomodeling.root": '/my/output/data/path'})
- static merge_files(path, target, remove_source=False)
Merge files (the output of this Block) into one single file.
Optionally removes the source files.
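The direction of the fields mapping of GeometryFileSink is easy to mix up: keys are the field names written to the output file, values are the column names in the block. The sketch below applies such a mapping to a single feature's properties. The helper apply_field_mapping and the sample record are hypothetical, and no file is written.

```python
def apply_field_mapping(record, fields):
    """Select and rename properties using {output field name: column name}."""
    return {out_name: record[column] for out_name, column in fields.items()}


record = {"id": 598, "agg": 12.5, "label": "A"}
fields = {"total": "agg", "class": "label"}

print(apply_field_mapping(record, fields))  # {'total': 12.5, 'class': 'A'}
```

Columns that are not mentioned in the mapping (here "id") do not appear in the mapped result of this sketch.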
- dask_geomodeling.geometry.sinks.to_file(source, url, fields=None, tile_size=None, dry_run=False, **request)
Utility function to export data from a GeometryBlock to a file on disk.
You need to specify the target file path as well as the extent geometry you want to save. Feature properties can be saved by providing a field mapping to the fields argument.
To stay within memory constraints or to parallelize an operation, the tile_size argument can be provided.
- Parameters:
source (GeometryBlock) – the block the data is coming from
url (str) – The target file path. The extension determines the format. For supported formats, consult GeometryFileSink.supported_extensions.
fields (dict) – a mapping that relates column names to output file field names, {<output file field name>: <column name>, ...}.
tile_size (int) – Optionally use this for large exports to stay within memory constraints. The export is split in tiles of given size (units are determined by the projection). Finally the tiles are merged.
dry_run (bool) – Do nothing, only validate the arguments.
geometry (shapely Geometry) – Limit exported objects to objects whose centroid intersects with this geometry.
projection (str) – The projection as a WKT string or EPSG code. Sets the projection of the geometry argument, the target projection of the data, and the tiling projection.
mode (str) – one of {"intersects", "centroid"}, default "centroid"
start (datetime) – start date as UTC datetime
stop (datetime) – stop date as UTC datetime
**request – see GeometryBlock request specification
- Relevant settings can be adapted as follows:
>>> from dask import config
>>> config.set({"geomodeling.root": '/my/output/data/path'})
>>> config.set({"temporary_directory": '/my/alternative/tmp/dir'})
dask_geomodeling.geometry.text
Module containing text column operations that act on geometry blocks
- class dask_geomodeling.geometry.text.ParseTextColumn(source, source_column, key_mapping)
Parses a text column into (possibly multiple) value columns.
Key, value pairs need to be separated by an equal (=) sign.
- Parameters:
source (GeometryBlock) – Data source
source_column (str) – Existing column in source.
key_mapping (dict) – Mapping containing pairs {key_name: column_name}: