pygeoprocessing package
Subpackages
Submodules
Module contents
pygeoprocessing: geoprocessing routines for GIS.
__init__ module imports all the geoprocessing functions into this namespace.
- exception pygeoprocessing.ReclassificationMissingValuesError(missing_values, raster_path, value_map)[source]
Bases:
ExceptionRaised when a raster value is not a valid key to a dictionary.
- msg
- Type:
str
- missing_values
that are not keys in the dictionary
- Type:
list
- pygeoprocessing.align_bbox(geotransform, bbox)[source]
Pad a bounding box so that it aligns with the grid of a given geotransform.
Ignores row and column rotation.
- Parameters:
geotransform (list) – GDAL raster geotransform to align to
bbox (list) – bounding box in the form [minx, miny, maxx, maxy]
- Returns:
padded bounding box in the form [minx, miny, maxx, maxy]
- Raises:
ValueError if invalid geotransform or bounding box are provided –
- pygeoprocessing.array_equals_nodata(array, nodata)[source]
Check for the presence of
nodatavalues inarray.The comparison supports
numpy.nanand unset (None) nodata values.- Parameters:
array (numpy array) – the array to mask for nodata values.
nodata (number) – the nodata value to check for. Supports
numpy.nan.
- Returns:
A boolean numpy array with values of 1 where
arrayis equal tonodataand 0 otherwise.
- pygeoprocessing.calculate_slope(base_elevation_raster_path_band, target_slope_path, raster_driver_creation_tuple=('GTIFF', ('TILED=YES', 'BIGTIFF=YES', 'COMPRESS=LZW', 'BLOCKXSIZE=256', 'BLOCKYSIZE=256')))
Create a percent slope raster from DEM raster.
Base algorithm is from Zevenbergen & Thorne “Quantitative Analysis of Land Surface Topography” 1987 (https://doi.org/10.1002/esp.3290120107) although it has been modified to include the diagonal pixels by classic finite difference analysis.
For the following notation, we define each pixel’s DEM value by a letter with this spatial scheme:
a b c d e f g h i
Then the slope at
eis defined at([dz/dx]^2 + [dz/dy]^2)^0.5Where:
[dz/dx] = ((c+2f+i)-(a+2d+g)/(8*x_cell_size) [dz/dy] = ((g+2h+i)-(a+2b+c))/(8*y_cell_size)
In cases where a cell is nodata, we attempt to use the middle cell inline with the direction of differentiation (either in x or y direction). If no inline pixel is defined, we use
eand multiply the difference by2^0.5to account for the diagonal projection.- Parameters:
base_elevation_raster_path_band (string) – a path/band tuple to a raster of height values. (path_to_raster, band_index)
target_slope_path (string) – path to target slope raster; will be a 32 bit float GeoTIFF of same size/projection as calculate slope with units of percent slope.
raster_driver_creation_tuple (tuple) – a tuple containing a GDAL driver name string as the first element and a GDAL creation options tuple/list as the second. Defaults to geoprocessing.DEFAULT_GTIFF_CREATION_TUPLE_OPTIONS.
- Returns:
None
- pygeoprocessing.choose_dtype(*raster_paths)[source]
Choose an appropriate dtype for an output derived from the given inputs.
Returns the dtype with the greatest size/precision among the inputs, so that information will not be lost.
- Parameters:
*raster_paths – series of raster path strings
- Returns:
numpy dtype
- pygeoprocessing.choose_nodata(dtype)[source]
Choose an appropriate nodata value for data of a given dtype.
- Parameters:
dtype (numpy.dtype) – data type for which to choose nodata
- Returns:
number to use as nodata value
- pygeoprocessing.distance_transform_edt(base_region_raster_path_band, target_distance_raster_path, sampling_distance=(1.0, 1.0), working_dir=None, raster_driver_creation_tuple=('GTIFF', ('TILED=YES', 'BIGTIFF=YES', 'COMPRESS=LZW', 'BLOCKXSIZE=256', 'BLOCKYSIZE=256')))[source]
Calculate the euclidean distance transform on base raster.
Calculates the euclidean distance transform on the base raster in units of pixels multiplied by an optional scalar constant. The implementation is based off the algorithm described in: Meijster, Arnold, Jos BTM Roerdink, and Wim H. Hesselink. “A general algorithm for computing distance transforms in linear time.” Mathematical Morphology and its applications to image and signal processing. Springer, Boston, MA, 2002. 331-340.
The base mask raster represents the area to distance transform from as any pixel that is not 0 or nodata. It is computationally convenient to calculate the distance transform on the entire raster irrespective of nodata placement and thus produces a raster that will have distance transform values even in pixels that are nodata in the base.
- Parameters:
base_region_raster_path_band (tuple) – a tuple including file path to a raster and the band index to define the base region pixels. Any pixel that is not 0 and nodata are considered to be part of the region.
target_distance_raster_path (string) – path to the target raster that is the exact euclidean distance transform from any pixel in the base raster that is not nodata and not 0. The units are in
(pixel distance * sampling_distance).sampling_distance (tuple/list) – an optional parameter used to scale the pixel distances when calculating the distance transform. Defaults to (1.0, 1.0). First element indicates the distance traveled in the x direction when changing a column index, and the second element in y when changing a row index. Both values must be > 0.
working_dir (string) – If not None, indicates where temporary files should be created during this run.
raster_driver_creation_tuple (tuple) – a tuple containing a GDAL driver name string as the first element and a GDAL creation options tuple/list as the second. Defaults to a GTiff driver tuple defined at geoprocessing.DEFAULT_GTIFF_CREATION_TUPLE_OPTIONS.
- Returns:
None
- pygeoprocessing.iterblocks(raster_path_band, largest_block=65536, offset_only=False)[source]
Iterate across all the memory blocks in the input raster.
Result is a generator of block location information and numpy arrays.
This is especially useful when a single value needs to be derived from the pixel values in a raster, such as the sum total of all pixel values, or a sequence of unique raster values. In such cases,
raster_local_opis overkill, since it writes out a raster.As a generator, this can be combined multiple times with itertools.izip() to iterate ‘simultaneously’ over multiple rasters, though the user should be careful to do so only with prealigned rasters.
- Parameters:
raster_path_band (tuple) – a path/band index tuple to indicate which raster band iterblocks should iterate over.
largest_block (int) – Attempts to iterate over raster blocks with this many elements. Useful in cases where the blocksize is relatively small, memory is available, and the function call overhead dominates the iteration. Defaults to 2**20. A value of anything less than the original blocksize of the raster will result in blocksizes equal to the original size.
offset_only (boolean) – defaults to False, if True
iterblocksonly returns offset dictionary and doesn’t read any binary data from the raster. This can be useful when iterating over writing to an output.
- Yields:
If
offset_onlyis false, on each iteration, a tuple containing a dict of block data and a 2-dimensional numpy array are yielded. The dict of block data has these attributes:data['xoff']- The X offset of the upper-left-hand corner of the block.data['yoff']- The Y offset of the upper-left-hand corner of the block.data['win_xsize']- The width of the block.data['win_ysize']- The height of the block.
If
offset_onlyis True, the function returns only the block offset data and does not attempt to read binary data from the raster.
- pygeoprocessing.merge_bounding_box_list(bounding_box_list, bounding_box_mode)[source]
Create a single bounding box by union or intersection of the list.
- Parameters:
bounding_box_list (sequence) – a sequence of bounding box coordinates in the order [minx, miny, maxx, maxy].
mode (string) – either
'union'or'intersection'for the corresponding reduction mode.
- Returns:
A four tuple bounding box that is the union or intersection of the input bounding boxes.
- Raises:
ValueError – if the bounding boxes in
bounding_box_listdo not intersect if thebounding_box_modeis ‘intersection’.
- pygeoprocessing.raster_band_percentile(base_raster_path_band, working_sort_directory, percentile_list, heap_buffer_size=268435456, ffi_buffer_size=1024, geographic_crs_warn=False)
Calculate percentiles of a raster band based on pixel values.
- Parameters:
base_raster_path_band (tuple) – raster path band tuple to a raster that is of any integer or real type.
working_sort_directory (str) – path to a directory that does not exist or is empty. This directory will be used to create heapfiles with sizes no larger than
heap_buffer_sizewhich are written in the of the pattern N.dat where N is in the numbering 0, 1, 2, … up to the number of files necessary to handle the raster.percentile_list (list) – sorted list of percentiles to report must contain values in the range [0, 100].
heap_buffer_size (int) – defines approximately how many elements to hold in a single heap file. This is proportional to the amount of maximum memory to use when storing elements before a sort and write to disk.
ffi_buffer_size (int) – defines how many elements will be stored per heap file buffer for iteration.
geographic_crs_warn (boolean) – defaults to False. If True, a warning will be issued if the base raster has a geographic CRS.
- Returns:
A list of len(percentile_list) elements long containing the percentile values (ranging from [0, 100]) in
base_raster_path_bandwhere the interpolation scheme is “higher” (i.e. any percentile splits will select the next element higher than the percentile cutoff).
- pygeoprocessing.raster_map(op, rasters, target_path, target_nodata=None, target_dtype=None, raster_driver_creation_tuple=('GTIFF', ('TILED=YES', 'BIGTIFF=YES', 'COMPRESS=LZW', 'BLOCKXSIZE=256', 'BLOCKYSIZE=256')))[source]
Apply a pixelwise function to a series of raster inputs.
The output raster will have nodata where any input raster has nodata. Raster inputs are split into aligned blocks, and the function is applied individually to each stack of blocks (as numpy arrays).
- Parameters:
op (function) – Function to apply to the inputs. It should accept a number of arguments equal to the length of
*inputs. It should return a numpy array with the same shape as its array input(s).rasters (list[str]) – Paths to rasters to input to
op, in the order that they will be passed toop. All rasters should be aligned and have the same dimensions.target_path (str) – path to write out the output raster.
target_nodata (number) – Nodata value to use for the output raster. Optional. If not provided, a suitable nodata value will be chosen.
target_dtype (numpy.dtype) – dtype to use for the output. Optional. If not provided, a suitable dtype will be chosen.
raster_driver_creation_tuple (tuple) – a tuple containing a GDAL driver name string as the first element and a GDAL creation options tuple/list as the second. Defaults to geoprocessing.DEFAULT_GTIFF_CREATION_TUPLE_OPTIONS. If the
target_dtypeis int8, thePIXELTYPE=SIGNEDBYTEoption will be added to the creation options tuple if it is not already there.
- Returns:
None
- pygeoprocessing.raster_reduce(function, raster_path_band, initializer, mask_nodata=True, largest_block=65536)[source]
Cumulatively apply a reducing function to each block of a raster.
This effectively reduces the entire raster to a single value, but it works by blocks to be memory-efficient.
The
functionsignature should befunction(aggregator, block), whereaggregatoris the aggregated value so far, andblockis a flattened numpy array containing the data from the block to reduce next.functionis called once on each block. On the firstfunctioncall,aggregatoris initialized withinitializer. The return value from eachfunctioncall is passed in as theaggregatorargument to the subsequentfunctioncall. When all blocks have been reduced, the return value of the finalfunctioncall is returned.Example
Calculate the sum of all values in a raster:
raster_reduce(lambda total, block: total + numpy.sum(block), (raster_path, 1), 0)
Calculate a histogram of all values in a raster:
def add_to_histogram(histogram, block): return histogram + numpy.histogram(block, bins=10)[0] raster_reduce(add_to_histogram, (raster_path, 1), numpy.zeros(10))
Calculate the sum of all values in a raster, excluding nodata:
nodata = pygeoprocessing.get_raster_info(raster_path)['nodata'][0] def sum_excluding_nodata(total, block): return total + numpy.sum(block[block != nodata]) raster_reduce(sum_excluding_nodata, (raster_path, 1), 0)
- Parameters:
function (func) – function to apply to each raster block
raster_path_band (tuple) – (path, band) tuple of the raster to reduce
initializer (obj) – value to initialize the aggregator for the first function call
mask_nodata (bool) – if True, mask out nodata before aggregating. A flattened array of non-nodata pixels from each block is passed to the
function. if False, each block is passed to thefunctionwithout masking.largest_block (int) – largest block parameter to pass to
iterblocks
- Returns:
aggregate value, the final value returned from
function