Basic Usage
pygeoprocessing uses GDAL to read and write all GDAL-supported
raster and
vector file formats.
The utility functions get_raster_info
and get_vector_info read a file’s
metadata into a dictionary. These values can be used as parameters for a
variety of pygeoprocessing functions where properties like pixel sizes,
bounding boxes, and nodata values need to be defined.
>>> import pygeoprocessing
>>> raster_a_path = 'dem_utm.tif'
>>> raster_b_path = 'mswep.tif'
>>> raster_info = pygeoprocessing.get_raster_info(raster_a_path)
>>> raster_info
{'block_size': [256, 256],
'bounding_box': [364636.41313796316,
5229554.621510817,
530566.4131379632,
5362724.621510817],
'datatype': 3,
'file_list': ['dem_utm.tif'],
'geotransform': (364636.41313796316, 30.0, 0.0, 5362724.621510817, 0.0, -30.0),
'n_bands': 1,
'nodata': [-32768.0],
'numpy_type': <class 'numpy.int16'>,
'overviews': [],
'pixel_size': (30.0, -30.0),
'projection_wkt': 'PROJCS["WGS 84 / UTM zone 10N",GEOGCS["WGS '
'84",DATUM["WGS_1984",SPHEROID["WGS '
'84",6378137,298.257223563,AUTHORITY["EPSG","7030"]],AUTHORITY["EPSG","6326"]],PRIMEM["Greenwich",0,AUTHORITY["EPSG","8901"]],UNIT["degree",0.0174532925199433,AUTHORITY["EPSG","9122"]],AUTHORITY["EPSG","4326"]],PROJECTION["Transverse_Mercator"],PARAMETER["latitude_of_origin",0],PARAMETER["central_meridian",-123],PARAMETER["scale_factor",0.9996],PARAMETER["false_easting",500000],PARAMETER["false_northing",0],UNIT["metre",1,AUTHORITY["EPSG","9001"]],AXIS["Easting",EAST],AXIS["Northing",NORTH],AUTHORITY["EPSG","32610"]]',
'raster_size': (5531, 4439)}
Aligning rasters
align_and_resize_raster_stack
function takes a list of overlapping rasters that may all have different
extents, pixel sizes, and projections. The resulting stack of rasters will
all be aligned and ready for use in pixel-stack operations such as
raster_map or
raster_calculator.>>> base_raster_list = [raster_a_path, raster_b_path]
>>> target_raster_list = [x.replace('.tif', '_aligned.tif') for x in base_raster_list]
>>> pygeoprocessing.align_and_resize_raster_stack(
... base_raster_path_list=base_raster_list,
... target_raster_path_list=target_raster_list,
... resample_method_list=['bilinear', 'bilinear'],
... target_pixel_size=raster_info['pixel_size'],
... bounding_box_mode='intersection',
... target_projection_wkt=raster_info['projection_wkt'])
Interfacing with files
Pygeoprocessing functions typically interact with GIS datasets via their
filename. Some functions, such as
align_and_resize_raster_stack,
operate on all bands of a raster. Other times it is necessary to specify
which band of a raster, or which layer of a vector should be used.
For example, the zonal_statistics
function requires the user to specify which band of the raster from which to
calculate statistics. This is done using a tuple, or list, where the
first element is the filepath, and the second is the band index:
>>> path_band_tuple = (raster_b_path, 1) # band indices start at 1 (not 0), by GDAL convention
>>> stats_dict = pygeoprocessing.zonal_statistics(
... base_raster_path_band=path_band_tuple,
... aggregate_vector_path='watersheds.gpkg',
... aggregate_layer_name='watersheds') # if the vector only contains 1 layer, this can be `None`, or ommitted
raster_calculator:>>> raster_path_band_list = [(raster_a_path, 1), (raster_b_path, 1)]
Using GDAL’s virtual file system handlers
Pygeoprocessing reads all input data using GDAL. In addition to “regular” files located on your local file system, GDAL can read other types of files using virtual file system handlers. Using virtual file system handlers, you can directly access files hosted on a remote server over HTTP; files hosted with cloud storage services such as AWS S3 or Google Cloud Storage (including access-controlled files); zip archives; and more. See the GDAL Virtual File Systems documentation for details.
To use a virtual file system handler, you must add the appropriate /vsi/ prefix to the
file path or URL. For example, the /vsicurl/ prefix tells GDAL to read a file from the given
URL over HTTP/FTP. You can pass in paths with VSI prefixes directly to pygeoprocessing.
For example:
pygeoprocessing.get_raster_info(
'/vsicurl/https://storage.googleapis.com/natcap-data-cache/global/nasa-srtm-v3-1s/srtm-v3-1s.tif'
)
Note that support for virtual file systems is specific to each of GDAL’s file format drivers. These will work for most but not all file formats.
GDAL can also write out data to a remote location using virtual file system handlers. pygeoprocessing
does not officially support this, but it may work in some cases. Note that, even if the output is
written to a remote location, many pygeoprocessing functions will write out temporary
intermediate results to the local file system.