Data Format Conversion Tools

The earthstat.data_converter module converts between geospatial data formats. It currently provides NetCDF-to-GeoTIFF conversion; an HDF5-to-GeoTIFF function is declared but not yet implemented.

File Format Conversions

Converting NetCDF to GeoTIFF Format

convertToTIFF(input_dir)

Converts all NetCDF files in a directory to TIFF format and saves them in a subdirectory.

Parameters:

Name | Type | Description | Default
input_dir | str | Directory containing NetCDF files to be converted. | required

Returns:

Type | Description
str | Path to the output directory containing the converted TIFF files.

Files are converted in parallel, one task per file, using a ProcessPoolExecutor. Outputs are written to a 'predictor_tiff' subdirectory of input_dir, which is created if it does not exist.
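
To make the file selection concrete, the following sketch reproduces only the discovery step: a non-recursive match on the '.nc' extension and the 'predictor_tiff' output location. The helper name plan_conversion is hypothetical and not part of earthstat; the file names are made up for the demonstration.

```python
import glob
import os
import tempfile


def plan_conversion(input_dir):
    """Return (NetCDF files that would be converted, output directory)."""
    # Non-recursive match on the '.nc' extension, mirroring convertToTIFF.
    nc_files = sorted(glob.glob(os.path.join(input_dir, "*.nc")))
    output_dir = os.path.join(input_dir, "predictor_tiff")
    return nc_files, output_dir


with tempfile.TemporaryDirectory() as root:
    # Build a small throwaway directory tree to inspect.
    os.makedirs(os.path.join(root, "nested"))
    for rel in ("a.nc", "b.nc", "notes.txt", os.path.join("nested", "c.nc")):
        open(os.path.join(root, rel), "w").close()

    files, out_dir = plan_conversion(root)
    found = [os.path.basename(f) for f in files]
    print(found)  # a.nc and b.nc only; nested/c.nc and notes.txt are skipped
```

Files in subdirectories are not picked up, so a nested data layout would need to be flattened or processed one directory at a time.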

Source code in earthstat/data_converter/netcdf_to_tiff.py
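import glob
import os
from concurrent.futures import ProcessPoolExecutor, as_completed

from tqdm import tqdm


def convertToTIFF(input_dir):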
    """
    Converts all NetCDF files in a directory to TIFF format and saves them in a subdirectory.

    Args:
        input_dir (str): Directory containing NetCDF files to be converted.

    Returns:
        str: Path to the output directory containing the converted TIFF files.

    Utilizes multiprocessing for efficiency. Creates a 'predictor_tiff' subdirectory for outputs.
    """
    nc_files = glob.glob(os.path.join(input_dir, '*.nc'))
    output_dir = os.path.join(input_dir, 'predictor_tiff')
    os.makedirs(output_dir, exist_ok=True)

    with ProcessPoolExecutor() as executor:
        futures = [executor.submit(netCDFToTiff, file, output_dir)
                   for file in nc_files]
        for future in tqdm(as_completed(futures), total=len(futures), desc="Converting Files"):
            # Re-raise any exception from a worker instead of silently dropping it.
            future.result()

    return output_dir
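
Because each file is converted in a separate worker process, an error in one file is only visible if the corresponding future's result is inspected. The sketch below shows one way to collect per-file failures with the same submit/as_completed pattern. The names run_batch and executor_cls are hypothetical and not part of earthstat.

```python
from concurrent.futures import ProcessPoolExecutor, as_completed


def run_batch(worker, paths, output_dir, executor_cls=ProcessPoolExecutor):
    """Run worker(path, output_dir) for every path.

    Returns (succeeded, failed): a sorted list of paths that completed,
    and a dict mapping each failing path to the exception it raised.
    """
    succeeded, failed = [], {}
    with executor_cls() as executor:
        # Map each future back to its input so failures can be attributed.
        future_to_path = {
            executor.submit(worker, path, output_dir): path for path in paths
        }
        for future in as_completed(future_to_path):
            path = future_to_path[future]
            try:
                future.result()  # re-raises any exception from the worker
                succeeded.append(path)
            except Exception as exc:
                failed[path] = exc
    return sorted(succeeded), failed
```

With the default ProcessPoolExecutor, the worker must be a picklable module-level function (netCDFToTiff qualifies), and on platforms that spawn rather than fork processes the call should sit under an `if __name__ == "__main__":` guard.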

netCDFToTiff(netcdf_file, output_dir, default_crs='EPSG:4326')

Converts a single NetCDF file to a GeoTIFF, using the file's own CRS or, if it defines none, default_crs.

Parameters:

Name | Type | Description | Default
netcdf_file | str | Path to the NetCDF file to be converted. | required
output_dir | str | Directory where the converted TIFF file will be saved. | required
default_crs | str | Coordinate Reference System, as an EPSG string, assigned when the source file defines no CRS. | 'EPSG:4326'

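All bands are read and written to a GeoTIFF with LZW compression, named after the input file with a .tif extension. The input is assumed to be a georeferenced raster that rasterio can open. The function returns nothing.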

Source code in earthstat/data_converter/netcdf_to_tiff.py
import os

import rasterio
from rasterio.crs import CRS


def netCDFToTiff(netcdf_file, output_dir, default_crs='EPSG:4326'):
    """
    Converts a NetCDF file to TIFF format using a specified or default CRS.

    Args:
        netcdf_file (str): Path to the NetCDF file to be converted.
        output_dir (str): Directory where the converted TIFF file will be saved.
        default_crs (str): Default Coordinate Reference System in EPSG code. Defaults to 'EPSG:4326'.

    Converts NetCDF to TIFF and applies LZW compression. Assumes NetCDF has geospatial data.
    """
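    # Swap only the final extension; str.replace would also rewrite ".nc" elsewhere in the name.
    stem = os.path.splitext(os.path.basename(netcdf_file))[0]
    output_file = os.path.join(output_dir, stem + ".tif")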

    with rasterio.open(netcdf_file) as src:
        data = src.read()
        transform = src.transform
        crs = src.crs
        if crs is None:
            crs = CRS.from_string(default_crs)
        kwargs = src.profile.copy()
        kwargs.update(
            driver='GTiff',
            height=src.height,
            width=src.width,
            count=data.shape[0],
            dtype=data.dtype,
            crs=crs,
            transform=transform,
            compress="lzw"  # compression
        )

        with rasterio.open(output_file, 'w', **kwargs) as dst:
            dst.write(data)
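
The output filename is derived from the input's base name. As a minimal sketch of that mapping, the helper below changes only the final extension, so a name that contains ".nc" elsewhere is left intact. The name tiff_path_for is hypothetical and not part of earthstat; the example path is made up.

```python
import os


def tiff_path_for(netcdf_file, output_dir):
    """Return the .tif path in output_dir that corresponds to netcdf_file."""
    # splitext splits off only the last extension of the base name.
    stem, _ext = os.path.splitext(os.path.basename(netcdf_file))
    return os.path.join(output_dir, stem + ".tif")


# The ".nc" inside "era5.ncep" is preserved; only the trailing ".nc" changes.
example = tiff_path_for(os.path.join("data", "era5.ncep.nc"), "predictor_tiff")
print(example)
```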

Transforming HDF5 to GeoTIFF Format

hdf5ToGeoTIFF()

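Placeholder for HDF5-to-GeoTIFF conversion. The function takes no arguments and its body is currently empty, so calling it performs no conversion.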

Source code in earthstat/data_converter/hdf5_to_tiff.py
def hdf5ToGeoTIFF():
    """
    Convert HDF5 to GeoTIFF.

    """
    pass