convster.parallel
This module contains helper functions that parallelize the application of filters to a tif file.
Functions
apply_filter() – Apply a filter to one or more bands of a raster and export the result.
compute_entropy() – Compute entropy-based heterogeneity from categorical raster bands.
compute_interaction() – Compute interaction strength between categorical raster bands.
extract_categories() – Extract per-category maps from a raster, optionally apply a filter, and export to a file.
Module Contents
- convster.parallel.apply_filter(source, output_file, block_size, bands=None, data_in_range=None, data_as_dtype=np.uint8, data_output_range=None, replace_nan_with=None, img_filter=None, filter_params=None, filter_output_range=(0.0, 1.0), output_dtype=np.uint8, output_range=None, selector_band=None, verbose=False, output_params=None, **params)
Apply a filter to one or more bands of a raster and export the result.
This function processes the raster in blocks to allow for memory-efficient and parallelized computation. Each block is filtered and then combined into a single output raster. Optionally, categorical masking can be used with selector_band.
- Parameters:
source (str or Source) – Path to the input raster file or a Source object.
output_file (str) – Path where the filtered raster will be written.
block_size (tuple of int) – Size of the processing blocks in pixels as (width, height).
bands (list of Band, optional) – Specific bands to process. If None, all bands in the raster are used.
data_in_range (array-like, optional) – Input range used for rescaling loaded data before filtering.
data_as_dtype (type, optional, default=np.uint8) – Data type to which input data is converted before filtering.
data_output_range (array-like, optional) – Output range for data rescaling after conversion to data_as_dtype.
replace_nan_with (int or float, optional) – Value used to replace NaNs in the input before filtering.
img_filter (callable, optional) – Filter function to apply to the data (e.g., skimage.filters.gaussian).
filter_params (dict, optional) – Parameters to pass to img_filter.
filter_output_range (collection, default=(0., 1.)) – Expected output range of the applied filter function.
output_dtype (type, optional, default=np.uint8) – Data type of the filtered output raster.
output_range (tuple, optional) – Value range to rescale the final output raster.
selector_band (Band, optional) – Categorical band used as a mask to apply the filter selectively across categories.
output_params (dict, optional) –
Additional output configuration:
as_dtype : data type for filtered output (overrides output_dtype)
nodata : value for missing data (default: None)
bigtiff : bool, whether to create a BIGTIFF for >4GB files
Note
This may become standard at a later point
verbose (bool, default=False) – Print progress and debug information.
**params –
Additional optional arguments:
nbrcpu : int, number of CPU cores for multiprocessing.
start_method : str, multiprocessing start method.
compress : bool, whether to compress the final output with LZW.
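The interplay of filter_output_range, the output data type, and output_range can be pictured as a linear rescale. The following is a minimal sketch, assuming a plain linear mapping with clipping; `rescale_to_dtype` is a hypothetical helper that is not part of convster, and the library's actual conversion may differ.

```python
import numpy as np


def rescale_to_dtype(data, in_range=(0.0, 1.0), out_range=(0, 255), out_dtype=np.uint8):
    """Linearly map values from in_range to out_range and cast to out_dtype.

    Hypothetical helper for illustration; not convster's implementation.
    """
    out_dtype = np.dtype(out_dtype)
    in_min, in_max = in_range
    out_min, out_max = out_range
    # Clip first so values outside the expected filter range cannot overflow.
    clipped = np.clip(np.asarray(data, dtype=np.float64), in_min, in_max)
    scaled = (clipped - in_min) / (in_max - in_min) * (out_max - out_min) + out_min
    if np.issubdtype(out_dtype, np.integer):
        # Round to the nearest integer instead of truncating on the cast.
        scaled = np.rint(scaled)
    return scaled.astype(out_dtype)


# A filter that is expected to return values in [0, 1], plus one outlier.
filtered = np.array([0.0, 1.0, 1.2])
as_bytes = rescale_to_dtype(filtered)
```

Declaring the expected filter range up front keeps the scaling identical for every block, which a per-block min/max stretch would not.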
- Returns:
output_file – Path to the resulting filtered raster file.
- Return type:
str
Notes
Deprecated parameters (output_dtype vs output_params['as_dtype']) are internally handled with warnings.
Borders for filtering are automatically computed from the filter kernel using compatible_border_size().
Block decomposition uses create_views().
Worker pool is created with get_or_set_context() and sized using get_nbr_workers().
Each block is processed by _view_filtered() via runner_call().
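Why blocks need a border can be shown with a small, self-contained sketch. It assumes a filter with a kernel radius of one pixel; `iter_blocks`, `filter_blockwise`, and `box3` are hypothetical names, and convster's create_views() and compatible_border_size() may work differently.

```python
import numpy as np


def iter_blocks(width, height, block_size, border=0):
    """Yield (inner, outer) windows as (col_off, row_off, w, h) tuples.

    `inner` is the region a block is responsible for; `outer` is the same
    region grown by `border` pixels and clipped to the raster extent.
    """
    block_w, block_h = block_size
    for row in range(0, height, block_h):
        for col in range(0, width, block_w):
            w = min(block_w, width - col)
            h = min(block_h, height - row)
            x0, y0 = max(col - border, 0), max(row - border, 0)
            x1, y1 = min(col + w + border, width), min(row + h + border, height)
            yield (col, row, w, h), (x0, y0, x1 - x0, y1 - y0)


def filter_blockwise(image, img_filter, block_size, border):
    """Apply img_filter block by block and stitch the inner regions."""
    height, width = image.shape
    out = np.empty_like(image, dtype=np.float64)
    for (col, row, w, h), (x0, y0, ow, oh) in iter_blocks(width, height, block_size, border):
        filtered = img_filter(image[y0:y0 + oh, x0:x0 + ow])
        # Cut the border away again so only the inner region is written.
        dy, dx = row - y0, col - x0
        out[row:row + h, col:col + w] = filtered[dy:dy + h, dx:dx + w]
    return out


def box3(a):
    """3x3 mean filter with edge padding (kernel radius 1)."""
    p = np.pad(a.astype(np.float64), 1, mode="edge")
    rows, cols = a.shape
    return sum(p[i:i + rows, j:j + cols] for i in range(3) for j in range(3)) / 9.0


image = np.arange(35, dtype=np.float64).reshape(5, 7)
blockwise = filter_blockwise(image, box3, block_size=(3, 2), border=1)
```

With a border at least as wide as the kernel radius, the stitched result matches filtering the whole image at once; with `border=0`, seams appear along block edges.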
See also
extract_categories() – Extract per-category maps with optional filtering.
compute_entropy() – Compute spatial entropy of categorical bands.
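A call to apply_filter() might be assembled as sketched below. Only the parameter names come from the signature documented above; the file names, block size, sigma, and core count are illustrative assumptions, and it is assumed that filter_params is forwarded to img_filter as keyword arguments.

```python
import numpy as np


def apply_filter_kwargs(img_filter, sigma=2.0):
    """Keyword arguments for apply_filter; all values are illustrative only."""
    return dict(
        block_size=(1024, 1024),         # (width, height) in pixels
        img_filter=img_filter,
        filter_params={"sigma": sigma},  # assumed to be forwarded to img_filter
        filter_output_range=(0.0, 1.0),
        output_params={"as_dtype": np.uint8, "nodata": None, "bigtiff": False},
        nbrcpu=4,                        # collected by **params
        compress=True,                   # LZW-compress the final output
    )


def run_example(source="landcover.tif", output_file="landcover_blurred.tif"):
    """Run the filter; needs convster and scikit-image. Paths are hypothetical."""
    from convster.parallel import apply_filter
    from skimage.filters import gaussian

    return apply_filter(source, output_file, **apply_filter_kwargs(gaussian))
```

Because the function starts worker processes, a script that calls `run_example()` should do so under an `if __name__ == "__main__":` guard, which the `spawn` start method requires.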
- convster.parallel.compute_entropy(source, output_file, block_size, blur_params, categories=None, output_dtype=None, output_range=None, normed=True, max_entropy_categories=None, verbose=False, **params)
Compute entropy-based heterogeneity from categorical raster bands.
This function orchestrates a parallel computation of entropy-based heterogeneity across multiple raster blocks, combining per-block results into a single output GeoTIFF. Each block is processed independently via multiprocessing workers that read subsets of the input raster, compute entropy over selected categorical bands, and write results to a queue for aggregation.
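As a rough illustration of the per-pixel quantity, the sketch below assumes Shannon entropy over category shares, divided by the logarithm of the number of categories when normalized. `normalized_entropy` is a hypothetical helper, not convster's _block_entropy(), and the library's exact definition may differ.

```python
import numpy as np


def normalized_entropy(proportions, max_categories=None):
    """Per-pixel Shannon entropy of category shares, scaled to [0, 1].

    proportions: array of shape (n_categories, rows, cols), non-negative.
    """
    p = np.asarray(proportions, dtype=np.float64)
    total = p.sum(axis=0, keepdims=True)
    # Normalize to shares; pixels without any category keep zero shares.
    shares = np.divide(p, total, out=np.zeros_like(p), where=total > 0)
    # Treat 0 * log(0) as 0 by only evaluating the log where shares > 0.
    logs = np.log(shares, out=np.zeros_like(shares), where=shares > 0)
    entropy = -(shares * logs).sum(axis=0)
    n = max_categories if max_categories is not None else p.shape[0]
    # The maximum entropy for n categories is log(n).
    return entropy / np.log(n) if n > 1 else entropy


# Two categories, one row, two pixels: an even mix and a pure pixel.
example = np.array([[[0.5, 1.0]], [[0.5, 0.0]]])
heterogeneity = normalized_entropy(example)
```

An even mix of all categories yields 1 and a pixel covered by a single category yields 0, which is what makes the normalized value comparable across rasters with different numbers of categories.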
- Parameters:
source (str or Source) – Path to the categorical raster file (e.g., land-cover map) or an initialized Source object providing access to the dataset.
output_file (str) – Path where the final entropy GeoTIFF will be written.
block_size (tuple of int) – Block dimensions (width, height) in pixels. Determines the size of the chunks processed by each worker.
blur_params (dict) – Dictionary of parameters for Gaussian blur or other preprocessing. Typically includes at least one of 'sigma' or 'diameter' in spatial units (e.g., meters). Used primarily for output file naming.
categories (list, optional) – List of category names or identifiers to include in entropy computation. If not provided, all available categories in the source raster are used.
output_dtype (str, type, or None, optional) – Desired data type for the output entropy raster. If not provided, defaults to numpy.uint8.
output_range (tuple of float, optional) – Minimum and maximum output values. If not set, the range is inferred from output_dtype: [0, 1] for floating-point types, or the valid range for integer types (e.g., [0, 255] for uint8).
normed (bool, default=True) – Whether to normalize the entropy values. When True, entropy is scaled relative to the maximum possible entropy determined by the number of categories.
max_entropy_categories (int, optional) – Specifies the number of categories to assume for maximum entropy normalization. Ignored if normed=False.
verbose (bool, default=False) – If True, prints progress updates and diagnostic information.
**params –
Additional optional parameters controlling multiprocessing and output:
nbrcpu : int – Number of CPU cores to use. Defaults to the number of available cores minus one.
start_method : str – Multiprocessing start method (e.g., "spawn" or "fork").
entropy_as_ubyte : bool, optional – Deprecated. Use output_dtype="uint8" instead.
compress : bool, default=False – If True, compresses the final GeoTIFF using LZW compression.
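The inference rule for output_range described above can be written down in a few lines. This is a sketch of the documented behavior, not the library's code; `default_output_range` is a hypothetical name.

```python
import numpy as np


def default_output_range(output_dtype=None):
    """Return (min, max): [0, 1] for floats, the full valid range for integers."""
    # The documented default data type is numpy.uint8.
    dtype = np.dtype(np.uint8 if output_dtype is None else output_dtype)
    if np.issubdtype(dtype, np.floating):
        return (0.0, 1.0)
    info = np.iinfo(dtype)
    return (int(info.min), int(info.max))
```

`np.dtype` accepts both strings such as `"uint8"` and type objects such as `np.float32`, which matches the documented `str, type, or None` argument.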
- Returns:
output_file – Path to the resulting entropy raster file.
- Return type:
str
Notes
Each processing block computes entropy over the given categories using the internal _block_entropy() function and sends results to a queue.
The function _combine_entropy_blocks() merges these intermediate results into a single raster file.
The operation can be parallelized across multiple CPUs to handle large raster datasets efficiently.
Block decomposition uses create_views().
Worker pool is created with get_or_set_context() and sized using get_nbr_workers().
Output filename is constructed with output_filename().
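How nbrcpu and start_method could translate into a worker count and a multiprocessing context is sketched below using only the standard library. `pick_worker_count` and `pick_context` are hypothetical stand-ins, not the get_nbr_workers() and get_or_set_context() helpers named above, and capping the request at the number of available cores is an assumption.

```python
import multiprocessing as mp
import os


def pick_worker_count(nbrcpu=None):
    """Default to available cores minus one, but never fewer than one."""
    available = os.cpu_count() or 1
    if nbrcpu is None:
        return max(available - 1, 1)
    # Assumption: a request above the available core count is capped.
    return max(1, min(int(nbrcpu), available))


def pick_context(start_method=None):
    """Return a multiprocessing context; None selects the platform default."""
    return mp.get_context(start_method)


context = pick_context("spawn")
workers = pick_worker_count()
```

A pool would then be created with `context.Pool(workers)`. The `spawn` method is available on every platform, while `fork` exists only on POSIX systems.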
See also
compute_interaction() – Compute spatial interaction between categorical bands.
extract_categories() – Extract per-category maps with optional filtering.
_block_entropy() – Worker function computing entropy for a single block.
- convster.parallel.compute_interaction(source, output_file, block_size, blur_params, categories=None, output_dtype=None, output_range=None, standardize=False, normed=True, verbose=False, **params)
Compute interaction strength between categorical raster bands.
This function computes pairwise, three-way, or higher-order interaction measures among categorical raster bands (e.g., land-cover classes). The computation is performed block-wise using multiprocessing, where each worker processes a subset of the raster and pushes intermediate results to a queue. The results are then combined into a single output raster representing spatial interaction strength.
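The interaction measure itself is not defined in this reference. Purely as an illustration of what "normalized relative to a theoretical maximum" can mean, the sketch below assumes the measure is the per-pixel product of category shares; this is an assumption, not convster's documented formula, and `product_interaction` is a hypothetical name.

```python
import numpy as np


def product_interaction(shares, normed=True):
    """Per-pixel product of category shares as a simple interaction measure.

    shares: array of shape (k, rows, cols), values in [0, 1], with the k
    shares of each pixel summing to at most 1. Illustrative assumption only.
    """
    s = np.asarray(shares, dtype=np.float64)
    k = s.shape[0]
    interaction = np.prod(s, axis=0)
    if normed:
        # With shares summing to at most 1, the product peaks at (1/k)**k,
        # reached when all k shares are equal.
        interaction = interaction / (1.0 / k) ** k
    return interaction


# Two categories, one row, two pixels: an even mix and a pure pixel.
pair = np.array([[[0.5, 1.0]], [[0.5, 0.0]]])
strength = product_interaction(pair)
```

Under this assumed definition, the same function covers pairwise (k = 2), three-way (k = 3), and higher-order cases by changing the number of bands passed in.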
- Parameters:
source (str or Source) – Path to the categorical raster file (e.g., a land-cover map) or an initialized Source object providing access to the dataset.
output_file (str) – Path where the final interaction raster will be saved.
block_size (tuple of int) – Block dimensions (width, height) in pixels that define the size of the chunks processed by each multiprocessing worker.
blur_params (dict) – Dictionary containing parameters for Gaussian blur or other preprocessing. Used primarily for formatting the output filename. Should include either 'sigma' or 'diameter' in spatial units.
categories (list, optional) – List of categories (e.g., land-cover types) to include in the interaction computation. If not provided, all categories available in the source raster are used.
output_dtype (str, type, or None, optional) – Desired data type for the output raster. If not provided, defaults to numpy.uint8.
output_range (tuple of float, optional) – Minimum and maximum values for scaling the output. If not specified, inferred automatically from output_dtype: [0, 1] for floats, or the valid integer range for integer types (e.g., [0, 255] for uint8).
standardize (bool, default=False) – Whether to standardize the data prior to computing interactions. This can improve comparability across categories with different magnitudes or distributions.
normed (bool, default=True) – If True, normalizes the computed interaction strengths relative to their theoretical maximum values.
verbose (bool, default=False) – If True, prints detailed progress and diagnostic messages during processing.
**params –
Additional optional parameters controlling multiprocessing and output:
nbrcpu : int – Number of CPU cores to use. Defaults to the number of available cores minus one.
start_method : str – Multiprocessing start method (e.g., "spawn" or "fork").
interaction_as_ubyte : bool, optional – Deprecated. Use output_dtype="uint8" instead.
compress : bool, default=False – If True, compresses the final GeoTIFF using LZW compression.
- Returns:
output_file – Path to the resulting interaction raster file.
- Return type:
str
Notes
The computation proceeds in parallel by dividing the raster into independent processing blocks.
Each worker executes an internal _block_interaction() task and writes its results to a multiprocessing queue.
The _combine_interaction_blocks() function merges all block results into a single output file.
Designed for efficient large-scale raster analysis on multicore systems.
Block decomposition uses create_views().
Worker pool is created with get_or_set_context() and sized using get_nbr_workers().
Output filename is constructed with output_filename().
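The worker-and-queue pattern described in these notes can be sketched as follows. To keep the snippet runnable anywhere, it swaps in threads and `queue.Queue` for the worker processes and multiprocessing queue that the notes describe; `worker`, `combine`, and `run_blockwise` are hypothetical names and do not mirror convster's internals.

```python
import queue
import threading

import numpy as np


def worker(blocks, results, func):
    """Process blocks and push (row_off, col_off, data) tuples to the queue."""
    for row, col, data in blocks:
        results.put((row, col, func(data)))
    results.put(None)  # sentinel: this worker is done


def combine(results, shape, n_workers):
    """Drain the queue and paste every block into one output array."""
    out = np.zeros(shape, dtype=np.float64)
    finished = 0
    while finished < n_workers:
        item = results.get()
        if item is None:
            finished += 1
            continue
        row, col, data = item
        out[row:row + data.shape[0], col:col + data.shape[1]] = data
    return out


def run_blockwise(image, func, block=2, n_workers=2):
    """Split image into square blocks, process them in threads, and merge."""
    rows, cols = image.shape
    blocks = [
        (r, c, image[r:r + block, c:c + block])
        for r in range(0, rows, block)
        for c in range(0, cols, block)
    ]
    results = queue.Queue()
    threads = [
        threading.Thread(target=worker, args=(blocks[i::n_workers], results, func))
        for i in range(n_workers)
    ]
    for t in threads:
        t.start()
    merged = combine(results, image.shape, n_workers)
    for t in threads:
        t.join()
    return merged


image = np.arange(20, dtype=np.float64).reshape(4, 5)
doubled = run_blockwise(image, lambda b: b * 2)
```

Because every result carries its own offsets, the combiner does not depend on the order in which blocks arrive.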
See also
compute_entropy() – Compute spatial entropy of categorical bands.
extract_categories() – Extract per-category maps with optional filtering.
_block_interaction() – Worker function computing interaction for a single block.
- convster.parallel.extract_categories(source, categories, output_file, block_size, img_filter=None, filter_params=None, output_dtype=None, output_params=None, filter_output_range=None, verbose=False, **params)
Extract per-category maps from a raster, optionally apply a filter, and export to a file.
This function processes a categorical raster by separating specified categories into individual bands. Optionally, a filter function (e.g., Gaussian smoothing) can be applied to each category band. The processing is performed block-wise with multiprocessing, and the results are combined into a single output raster.
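The separation step can be pictured as building one indicator band per category. The sketch below works on an in-memory array and assumes that filter_params is passed to img_filter as keyword arguments; `split_categories` is a hypothetical helper, not the library's implementation.

```python
import numpy as np


def split_categories(raster, categories, img_filter=None, filter_params=None):
    """Return an array of shape (len(categories), rows, cols).

    Band i is 1.0 where raster == categories[i] and 0.0 elsewhere; if
    img_filter is given, it is applied to each band.
    """
    bands = []
    for value in categories:
        band = (raster == value).astype(np.float64)
        if img_filter is not None:
            band = img_filter(band, **(filter_params or {}))
        bands.append(band)
    return np.stack(bands)


landcover = np.array([[1, 2], [3, 1]])
per_category = split_categories(landcover, [1, 3])
```

Smoothing such an indicator band with a Gaussian turns it into a local share of that category, which is the kind of input a measure such as compute_entropy() operates on.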
- Parameters:
source (str or Source) – Path to the input .tif file or a Source object providing access to it.
categories (list) – List of categorical values to separate into individual bands.
output_file (str) – Path where the resulting raster will be written.
block_size (tuple of int) – Size of processing blocks in pixels as (width, height).
img_filter (callable, optional) – A function to apply to each category band (e.g., skimage.filters.gaussian). If None, no filtering is applied and filter_params is ignored.
filter_params (dict, optional) – Parameters to pass to img_filter. Required if img_filter is provided.
output_dtype (str or type, optional) – Data type of the output bands. Deprecated; use output_params['as_dtype'] instead.
output_params (dict, optional) –
Dictionary of output settings:
as_dtype : data type for the filtered output (overrides output_dtype)
nodata : value used for missing data (default: None)
bigtiff : bool, whether to create a BIGTIFF for >4GB files
output_range : tuple, output value range for data conversion
filter_output_range (tuple, optional) – Expected value range of the filtered data. Recommended when img_filter is used.
verbose (bool, default=False) – Print processing information and progress.
**params –
Additional optional arguments:
nbrcpu : int, number of CPU cores to use (default: available cores minus one)
start_method : str, multiprocessing start method (e.g., 'spawn' or 'fork')
compress : bool, if True, compress the final output with LZW
- Returns:
output_file – Path to the resulting raster file containing extracted and optionally filtered category bands.
- Return type:
str
Notes
Deprecated parameters (blur_as_int, output_dtype) are internally mapped to output_params['as_dtype'] with warnings.
Borders for filtering are automatically computed based on the filter kernel using compatible_border_size().
Block decomposition uses create_views().
Worker pool is created with get_or_set_context() and sized using get_nbr_workers().
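Mapping a deprecated argument onto output_params['as_dtype'] with a warning could look like the sketch below. It assumes that an explicit as_dtype takes precedence, in line with the "overrides output_dtype" remark in the parameter list; `resolve_output_params` is a hypothetical name and the library's own handling may differ.

```python
import warnings


def resolve_output_params(output_dtype=None, output_params=None):
    """Fold a deprecated output_dtype argument into output_params['as_dtype']."""
    resolved = dict(output_params or {})
    if output_dtype is not None:
        warnings.warn(
            "output_dtype is deprecated; use output_params['as_dtype'] instead",
            DeprecationWarning,
            stacklevel=2,
        )
        # An explicit as_dtype wins over the deprecated argument.
        resolved.setdefault("as_dtype", output_dtype)
    return resolved
```

Copying the dictionary before modifying it keeps the caller's output_params unchanged.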
See also
compute_entropy() – Compute spatial entropy of categorical bands.
compute_interaction() – Compute spatial interaction between categorical bands.
apply_filter() – Apply a filter to one or more bands of a raster.