convster.parallel
=================

.. py:module:: convster.parallel

.. autoapi-nested-parse::

   This module contains helper functions that parallelize the application of filters to a TIFF file.


Functions
---------

.. autoapisummary::

   convster.parallel.apply_filter
   convster.parallel.compute_entropy
   convster.parallel.compute_interaction
   convster.parallel.extract_categories


Module Contents
---------------

.. py:function:: apply_filter(source, output_file, block_size, bands = None, data_in_range = None, data_as_dtype = np.uint8, data_output_range = None, replace_nan_with = None, img_filter=None, filter_params = None, filter_output_range = (0.0, 1.0), output_dtype = np.uint8, output_range = None, selector_band = None, verbose = False, output_params = None, **params)

   Apply a filter to one or more bands of a raster and export the result.

   This function processes the raster in blocks to allow for memory-efficient
   and parallelized computation. Each block is filtered, and the filtered
   blocks are then combined into a single output raster. Optionally,
   categorical masking can be used with `selector_band`.

   :param source: Path to the input raster file or a `Source` object.
   :type source: str or Source
   :param output_file: Path where the filtered raster will be written.
   :type output_file: str
   :param block_size: Size of the processing blocks in pixels as ``(width, height)``.
   :type block_size: tuple of int
   :param bands: Specific bands to process. If None, all bands in the raster are used.
   :type bands: list of Band, optional
   :param data_in_range: Input range used for rescaling loaded data before filtering.
   :type data_in_range: array-like, optional
   :param data_as_dtype: Data type to which input data is converted before filtering.
   :type data_as_dtype: type, optional, default=np.uint8
   :param data_output_range: Output range for data rescaling after conversion to `data_as_dtype`.
   :type data_output_range: array-like, optional
   :param replace_nan_with: Value used to replace NaNs in the input before filtering.
   :type replace_nan_with: int or float, optional
   :param img_filter: Filter function to apply to the data (e.g., `skimage.filters.gaussian`).
   :type img_filter: callable, optional
   :param filter_params: Parameters to pass to `img_filter`.
   :type filter_params: dict, optional
   :param filter_output_range: Expected output range of the applied filter function.
   :type filter_output_range: collection, default=(0., 1.)
   :param output_dtype: Data type of the filtered output raster.
   :type output_dtype: type, optional, default=np.uint8
   :param output_range: Value range to which the final output raster is rescaled.
   :type output_range: tuple, optional
   :param selector_band: Categorical band used as a mask to apply the filter selectively across categories.
   :type selector_band: Band, optional
   :param output_params: Additional output configuration:

      - **as_dtype** : data type for filtered output (overrides `output_dtype`)
      - **nodata** : value for missing data (default: None)
      - **bigtiff** : bool, whether to create a BIGTIFF for >4GB files

        .. note::

           This may become standard at a later point

   :type output_params: dict, optional
   :param verbose: Print progress and debug information.
   :type verbose: bool, default=False
   :param \*\*params: Additional optional arguments:

      - **nbrcpu** : int, number of CPU cores for multiprocessing.
      - **start_method** : str, multiprocessing start method.
      - **compress** : bool, whether to compress the final output with LZW.

   :returns: **output_file** -- Path to the resulting filtered raster file.
   :rtype: str

   .. rubric:: Notes

   - Deprecated parameters (`output_dtype`, superseded by `output_params['as_dtype']`) are handled internally and emit warnings.
   - Borders for filtering are automatically computed from the filter kernel using
     :func:`~convster.filters.gaussian.compatible_border_size`.
   - Block decomposition uses :func:`~riogrande.prepare.create_views`.
   - Worker pool is created with :func:`~riogrande.helper.get_or_set_context`
     and sized using :func:`~riogrande.helper.get_nbr_workers`.
   - Each block is processed by :func:`~convster.processing._view_filtered`
     via :func:`~riogrande.parallel.runner_call`.

   .. seealso::

      :func:`extract_categories`
          Extract per-category maps with optional filtering.

      :func:`compute_entropy`
          Compute spatial entropy of categorical bands.


.. py:function:: compute_entropy(source, output_file, block_size, blur_params, categories = None, output_dtype = None, output_range = None, normed = True, max_entropy_categories = None, verbose = False, **params)

   Compute entropy-based heterogeneity from categorical raster bands.

   This function orchestrates a parallel computation of entropy-based
   heterogeneity across multiple raster blocks, combining per-block results
   into a single output GeoTIFF. Each block is processed independently via
   multiprocessing workers that read subsets of the input raster, compute
   entropy over selected categorical bands, and write results to a queue for
   aggregation.

   :param source: Path to the categorical raster file (e.g., land-cover map) or an initialized `Source` object providing access to the dataset.
   :type source: str or Source
   :param output_file: Path where the final entropy GeoTIFF will be written.
   :type output_file: str
   :param block_size: Block dimensions ``(width, height)`` in pixels. Determines the size of the chunks processed by each worker.
   :type block_size: tuple of int
   :param blur_params: Dictionary of parameters for Gaussian blur or other preprocessing. Typically includes at least one of ``'sigma'`` or ``'diameter'`` in spatial units (e.g., meters). Used primarily for output file naming.
   :type blur_params: dict
   :param categories: List of category names or identifiers to include in entropy computation. If not provided, all available categories in the source raster are used.
   :type categories: list, optional
   :param output_dtype: Desired data type for the output entropy raster. If not provided, defaults to ``numpy.uint8``.
   :type output_dtype: str, type, or None, optional
   :param output_range: Minimum and maximum output values. If not set, the range is inferred from ``output_dtype``: [0, 1] for floating-point types, or the valid range for integer types (e.g., [0, 255] for uint8).
   :type output_range: tuple of float, optional
   :param normed: Whether to normalize the entropy values. When True, entropy is scaled relative to the maximum possible entropy determined by the number of categories.
   :type normed: bool, default=True
   :param max_entropy_categories: Specifies the number of categories to assume for maximum entropy normalization. Ignored if ``normed=False``.
   :type max_entropy_categories: int, optional
   :param verbose: If True, prints progress updates and diagnostic information.
   :type verbose: bool, default=False
   :param \*\*params: Additional optional parameters controlling multiprocessing and output:

      - **nbrcpu** : int
          Number of CPU cores to use. Defaults to the number of available cores minus one.
      - **start_method** : str
          Multiprocessing start method (e.g., ``"spawn"`` or ``"fork"``).
      - **entropy_as_ubyte** : bool, optional
          Deprecated. Use ``output_dtype="uint8"`` instead.
      - **compress** : bool, default=False
          If True, compresses the final GeoTIFF using LZW compression.

   :returns: **output_file** -- Path to the resulting entropy raster file.
   :rtype: str

   .. rubric:: Notes

   - Each processing block computes entropy over the given categories using the
     internal :func:`_block_entropy` function and sends results to a queue.
   - The function :func:`_combine_entropy_blocks` merges these intermediate
     results into a single raster file.
   - The operation can be parallelized across multiple CPUs to handle large
     raster datasets efficiently.
   - Block decomposition uses :func:`~riogrande.prepare.create_views`.
   - Worker pool is created with :func:`~riogrande.helper.get_or_set_context`
     and sized using :func:`~riogrande.helper.get_nbr_workers`.
   - Output filename is constructed with :func:`~riogrande.helper.output_filename`.

   .. seealso::

      :func:`compute_interaction`
          Compute spatial interaction between categorical bands.

      :func:`extract_categories`
          Extract per-category maps with optional filtering.

      :func:`_block_entropy`
          Worker function computing entropy for a single block.


.. py:function:: compute_interaction(source, output_file, block_size, blur_params, categories = None, output_dtype = None, output_range = None, standardize = False, normed = True, verbose = False, **params)

   Compute interaction strength between categorical raster bands.

   This function computes pairwise, three-way, or higher-order interaction
   measures among categorical raster bands (e.g., land-cover classes). The
   computation is performed block-wise using multiprocessing, where each
   worker processes a subset of the raster and pushes intermediate results to
   a queue. The results are then combined into a single output raster
   representing spatial interaction strength.

   :param source: Path to the categorical raster file (e.g., a land-cover map) or an initialized `Source` object providing access to the dataset.
   :type source: str or Source
   :param output_file: Path where the final interaction raster will be saved.
   :type output_file: str
   :param block_size: Block dimensions ``(width, height)`` in pixels that define the size of the chunks processed by each multiprocessing worker.
   :type block_size: tuple of int
   :param blur_params: Dictionary containing parameters for Gaussian blur or other preprocessing. Used primarily for formatting the output filename. Should include either ``'sigma'`` or ``'diameter'`` in spatial units.
   :type blur_params: dict
   :param categories: List of categories (e.g., land-cover types) to include in the interaction computation. If not provided, all categories available in the source raster are used.
   :type categories: list, optional
   :param output_dtype: Desired data type for the output raster.
      If not provided, defaults to ``numpy.uint8``.
   :type output_dtype: str, type, or None, optional
   :param output_range: Minimum and maximum values for scaling the output. If not specified, inferred automatically from ``output_dtype``: [0, 1] for floats, or the valid integer range for integer types (e.g., [0, 255] for uint8).
   :type output_range: tuple of float, optional
   :param standardize: Whether to standardize the data prior to computing interactions. This can improve comparability across categories with different magnitudes or distributions.
   :type standardize: bool, default=False
   :param normed: If True, normalizes the computed interaction strengths relative to their theoretical maximum values.
   :type normed: bool, default=True
   :param verbose: If True, prints detailed progress and diagnostic messages during processing.
   :type verbose: bool, default=False
   :param \*\*params: Additional optional parameters controlling multiprocessing and output:

      - **nbrcpu** : int
          Number of CPU cores to use. Defaults to the number of available cores minus one.
      - **start_method** : str
          Multiprocessing start method (e.g., ``"spawn"`` or ``"fork"``).
      - **interaction_as_ubyte** : bool, optional
          Deprecated. Use ``output_dtype="uint8"`` instead.
      - **compress** : bool, default=False
          If True, compresses the final GeoTIFF using LZW compression.

   :returns: **output_file** -- Path to the resulting interaction raster file.
   :rtype: str

   .. rubric:: Notes

   - The computation proceeds in parallel by dividing the raster into
     independent processing blocks.
   - Each worker executes an internal :func:`_block_interaction` task and
     writes its results to a multiprocessing queue.
   - The :func:`_combine_interaction_blocks` function merges all block results
     into a single output file.
   - Designed for efficient large-scale raster analysis on multicore systems.
   - Block decomposition uses :func:`~riogrande.prepare.create_views`.
   - Worker pool is created with :func:`~riogrande.helper.get_or_set_context`
     and sized using :func:`~riogrande.helper.get_nbr_workers`.
   - Output filename is constructed with :func:`~riogrande.helper.output_filename`.

   .. seealso::

      :func:`compute_entropy`
          Compute spatial entropy of categorical bands.

      :func:`extract_categories`
          Extract per-category maps with optional filtering.

      :func:`_block_interaction`
          Worker function computing interaction for a single block.


.. py:function:: extract_categories(source, categories, output_file, block_size, img_filter = None, filter_params = None, output_dtype = None, output_params = None, filter_output_range = None, verbose = False, **params)

   Extract per-category maps from a raster, optionally apply a filter, and export to a file.

   This function processes a categorical raster by separating specified
   categories into individual bands. Optionally, a filter function (e.g.,
   Gaussian smoothing) can be applied to each category band. The processing is
   performed block-wise with multiprocessing, and the results are combined
   into a single output raster.

   :param source: Path to the input `.tif` file or a `Source` object providing access to it.
   :type source: str or Source
   :param categories: List of categorical values to separate into individual bands.
   :type categories: list
   :param output_file: Path where the resulting raster will be written.
   :type output_file: str
   :param block_size: Size of processing blocks in pixels as ``(width, height)``.
   :type block_size: tuple of int
   :param img_filter: A function to apply to each category band (e.g., ``skimage.filters.gaussian``). If None, no filtering is applied and ``filter_params`` is ignored.
   :type img_filter: callable, optional
   :param filter_params: Parameters to pass to `img_filter`. Required if `img_filter` is provided.
   :type filter_params: dict, optional
   :param output_dtype: Data type of the output bands. Deprecated; use ``output_params['as_dtype']`` instead.
   :type output_dtype: str or type, optional
   :param output_params: Dictionary of output settings:

      - **as_dtype** : data type for the filtered output (overrides `output_dtype`)
      - **nodata** : value used for missing data (default: None)
      - **bigtiff** : bool, whether to create a BIGTIFF for >4GB files
      - **output_range** : tuple, output value range for data conversion

   :type output_params: dict, optional
   :param filter_output_range: Expected value range of the filtered data. Recommended when `img_filter` is used.
   :type filter_output_range: tuple, optional
   :param verbose: Print processing information and progress.
   :type verbose: bool, default=False
   :param \*\*params: Additional optional arguments:

      - **nbrcpu** : int, number of CPU cores to use (default: available cores minus one)
      - **start_method** : str, multiprocessing start method (e.g., 'spawn' or 'fork')
      - **compress** : bool, if True, compress the final output with LZW

   :returns: **output_file** -- Path to the resulting raster file containing extracted and optionally filtered category bands.
   :rtype: str

   .. rubric:: Notes

   - Deprecated parameters (`blur_as_int`, `output_dtype`) are internally mapped
     to `output_params['as_dtype']` with warnings.
   - Borders for filtering are automatically computed based on the filter kernel
     using :func:`~convster.filters.gaussian.compatible_border_size`.
   - Block decomposition uses :func:`~riogrande.prepare.create_views`.
   - Worker pool is created with :func:`~riogrande.helper.get_or_set_context`
     and sized using :func:`~riogrande.helper.get_nbr_workers`.

   .. seealso::

      :func:`compute_entropy`
          Compute spatial entropy of categorical bands.

      :func:`compute_interaction`
          Compute spatial interaction between categorical bands.

      :func:`apply_filter`
          Apply a filter to one or more bands of a raster.
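
All four functions in this module split the raster into blocks of ``block_size`` and, when a filter is applied, read a border around each block so that the filter kernel sees valid neighbouring pixels. The following is a minimal, library-independent sketch of that idea. It is not the implementation of ``riogrande.prepare.create_views``; the names ``Window`` and ``iter_blocks`` and the ``border`` argument are hypothetical and chosen only for illustration.

```python
from typing import Iterator, NamedTuple, Tuple


class Window(NamedTuple):
    """Pixel window: column/row offset plus size (hypothetical helper type)."""
    col_off: int
    row_off: int
    width: int
    height: int


def iter_blocks(
    raster_size: Tuple[int, int],
    block_size: Tuple[int, int],
    border: int = 0,
) -> Iterator[Tuple[Window, Window]]:
    """Yield (inner, outer) windows tiling a raster of size (width, height).

    inner: the block whose pixels end up in the output raster.
    outer: the inner block grown by `border` pixels on every side and
           clipped to the raster extent, i.e. the data a filter would read.
    """
    raster_w, raster_h = raster_size
    block_w, block_h = block_size
    for row in range(0, raster_h, block_h):
        for col in range(0, raster_w, block_w):
            # Blocks on the right and bottom edges may be smaller.
            inner = Window(col, row,
                           min(block_w, raster_w - col),
                           min(block_h, raster_h - row))
            # Grow by the border, then clip to the raster extent.
            left = max(col - border, 0)
            top = max(row - border, 0)
            right = min(col + inner.width + border, raster_w)
            bottom = min(row + inner.height + border, raster_h)
            yield inner, Window(left, top, right - left, bottom - top)


# A 10 x 7 pixel raster, 4 x 4 blocks, 2 pixel border.
blocks = list(iter_blocks(raster_size=(10, 7), block_size=(4, 4), border=2))
for inner, outer in blocks:
    print(inner, outer)
```

In this sketch the inner windows tile the raster without overlap, so each output pixel is written exactly once, while the outer windows overlap their neighbours by the border width. A larger ``block_size`` means fewer, larger tasks per worker at the cost of more memory per block.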