wolfhece.hydrology.climate_data_hourly
Module Contents
- wolfhece.hydrology.climate_data_hourly.transform_latlon_to_lambert72_array(lat_array: numpy.ndarray, lon_array: numpy.ndarray) → tuple[numpy.ndarray, numpy.ndarray]
Transform arrays of EPSG:4258 coordinates to Lambert 72 coordinates.
Coordinates from IRM are in EPSG:4258, and we want to convert them to Lambert 72 (EPSG:31370).
- Parameters:
lat_array – Array of latitudes in EPSG:4258.
lon_array – Array of longitudes in EPSG:4258.
- Returns:
tuple of two arrays of x and y coordinates in Lambert 72, with the same shape as the input arrays
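A minimal sketch of such a conversion, assuming it is done with pyproj (the library the module actually uses is not stated here). The helper name `latlon_to_lambert72` is hypothetical. `always_xy=True` makes pyproj take longitude first and return (x, y).

```python
import numpy as np


def latlon_to_lambert72(lat, lon):
    """Hypothetical helper: EPSG:4258 (lat, lon) -> EPSG:31370 (x, y)."""
    # Assumption: pyproj is available; it is treated as an optional dependency.
    from pyproj import Transformer

    lat = np.asarray(lat, dtype=float)
    lon = np.asarray(lon, dtype=float)
    transformer = Transformer.from_crs("EPSG:4258", "EPSG:31370", always_xy=True)
    # Transform flattened copies, then restore the shape of the inputs.
    x, y = transformer.transform(lon.ravel(), lat.ravel())
    return np.asarray(x).reshape(lat.shape), np.asarray(y).reshape(lat.shape)
```

Reshaping explicitly keeps the "same shape as the input arrays" guarantee for 2D grids of latitudes and longitudes.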
- wolfhece.hydrology.climate_data_hourly.convert_pixels_to_squares(pixels: tuple[numpy.ndarray, numpy.ndarray]) → tuple[numpy.ndarray, scipy.spatial.KDTree]
From pixel coordinates, define a square around each pixel center.
Corners are defined as the average of the pixel center and its neighbors.
Returns a (NB, 4, 2) numpy array of corner coordinates and the KDTree. Each row contains 4 corners: lower-left, lower-right, upper-right, upper-left.
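The corner rule can be illustrated on a regular 2D grid of centers. This is a sketch under stated assumptions, not the function's implementation: it uses the grid structure instead of a KDTree to find neighbors, assumes rows increase with y and columns with x, and extrapolates linearly at the grid edges. The name `squares_from_centers` is hypothetical.

```python
import numpy as np


def squares_from_centers(x, y):
    """x, y: 2D arrays (ny, nx) of pixel-center coordinates, ny and nx >= 2."""

    def corner_grid(c):
        # Pad one cell on every side by linear extrapolation (odd reflection).
        p = np.pad(c, 1, mode="reflect", reflect_type="odd")
        # Each corner is the mean of the four centers that surround it.
        return 0.25 * (p[:-1, :-1] + p[:-1, 1:] + p[1:, :-1] + p[1:, 1:])

    cx, cy = corner_grid(x), corner_grid(y)  # both (ny + 1, nx + 1)
    ll = np.stack([cx[:-1, :-1], cy[:-1, :-1]], axis=-1)  # lower-left
    lr = np.stack([cx[:-1, 1:], cy[:-1, 1:]], axis=-1)    # lower-right
    ur = np.stack([cx[1:, 1:], cy[1:, 1:]], axis=-1)      # upper-right
    ul = np.stack([cx[1:, :-1], cy[1:, :-1]], axis=-1)    # upper-left
    # (ny, nx, 4, 2) flattened to (NB, 4, 2), one row per pixel.
    return np.stack([ll, lr, ur, ul], axis=2).reshape(-1, 4, 2)


x, y = np.meshgrid(np.arange(3) * 1000.0, np.arange(2) * 1000.0)
squares = squares_from_centers(x, y)  # shape (6, 4, 2)
```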
- class wolfhece.hydrology.climate_data_hourly.ClimateDataHourly(datadir=DATADIR, fields_file=FIELDS_FILE, analog_file=ANALOG_FILE)
Based on data from IRM: https://essd.copernicus.org/articles/17/6405/2025/
Title: Hourly precipitation fields at 1 km resolution over Belgium from 1940 to 2016 based on the analog technique
Authors: Elke Debrie, Jonathan Demaeyer, and Stéphane Vannitsem
- property xy
Return the spatial coordinates in Lambert 72 as a tuple of two 2D numpy arrays (x, y) with the same shape as the spatial dimensions of the datasets.
- load_data(fields: bool = True, analogs: bool = True)
Load the datasets from ZARR files and build the spatial geometries.
- Parameters:
fields – Whether to load the fields dataset.
analogs – Whether to load the analog dataset.
- _crop_gdf_to_region(region: shapely.geometry.Polygon)
Crop the GeoDataFrame to the given region by selecting only the pixels whose square intersects the region.
- Parameters:
region – Polygon representing the area of interest in Lambert 72 coordinates
- Returns:
Cropped GeoDataFrame containing only the pixels that intersect the region.
- _find_slices_for_region(region: shapely.geometry.Polygon)
Find the row and column slices that correspond to the bounding box of the given region.
- Parameters:
region – Polygon representing the area of interest in Lambert 72 coordinates
- Returns:
tuple of (row_slice, col_slice) that can be used to index the datasets.
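One way to picture the slicing, as a sketch rather than the method's actual code: mark the pixel centers that fall inside the bounding box, then take the first and last row and column that contain a marked pixel. The helper name `slices_for_bounds` is hypothetical. The `bounds` tuple follows the (minx, miny, maxx, maxy) order returned by a shapely polygon's `bounds` attribute. Whether the real method adds a margin of neighboring pixels is not documented here.

```python
import numpy as np


def slices_for_bounds(x, y, bounds):
    """x, y: 2D coordinate arrays; bounds: (minx, miny, maxx, maxy)."""
    minx, miny, maxx, maxy = bounds
    inside = (x >= minx) & (x <= maxx) & (y >= miny) & (y <= maxy)
    rows = np.flatnonzero(inside.any(axis=1))
    cols = np.flatnonzero(inside.any(axis=0))
    if rows.size == 0:
        raise ValueError("no pixel center lies inside the bounding box")
    row_slice = slice(int(rows[0]), int(rows[-1]) + 1)
    col_slice = slice(int(cols[0]), int(cols[-1]) + 1)
    return row_slice, col_slice


x, y = np.meshgrid(np.arange(5) * 1000.0, np.arange(4) * 1000.0)
row_slice, col_slice = slices_for_bounds(x, y, (900.0, 900.0, 2100.0, 1100.0))
subset = x[row_slice, col_slice]  # the same slices would index a data array
```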
- get_surface_and_fractions_of_pixels_in_region(region: shapely.geometry.Polygon) → tuple[numpy.ndarray, float]
For a given region, compute the fraction of each pixel that is covered by the region, and the total surface of the region.
- Parameters:
region – Polygon representing the area of interest in Lambert 72 coordinates
- Returns:
tuple of (fractions, surface) where fractions is a numpy array of the same length as the number of pixels, containing the fraction of each pixel covered by the region, and surface is the total area of the region in square meters.
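The covered fraction of a pixel is the area of its intersection with the region divided by the pixel's own area. A minimal sketch with shapely, assuming the pixel squares are available as a (NB, 4, 2) corner array like the one returned by convert_pixels_to_squares; the helper name `pixel_fractions` is hypothetical, and the loop favors readability over speed.

```python
import numpy as np


def pixel_fractions(squares, region):
    """squares: (NB, 4, 2) corner array; region: shapely Polygon."""
    # Assumption: shapely is available; it is treated as an optional dependency.
    from shapely.geometry import Polygon

    fractions = np.zeros(len(squares))
    for i, corners in enumerate(squares):
        cell = Polygon(corners)
        # Covered share of this pixel, between 0 and 1.
        fractions[i] = cell.intersection(region).area / cell.area
    return fractions, region.area
```

With coordinates in Lambert 72 (meters), `region.area` is directly in square meters.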
- crop_to_new_zarr(dirout: pathlib.Path, region: shapely.geometry.Polygon, filename: str = FIELDS_FILE, show_progress: bool = False, time_chunk_size: int = 720)
Crop the fields dataset to region and write an optimised zarr.
The output zarr uses a single spatial chunk (the full cropped extent) and large temporal chunks (time_chunk_size, default 720, about one month of hourly data). This layout is ideal for later time-series extraction: one chunk read per month instead of one per 6 time steps.
- Parameters:
dirout – Output directory.
region – Region polygon (Lambert 72).
filename – Output zarr name (without .zarr).
show_progress – Show a dask progress bar.
time_chunk_size – Number of time steps per chunk (720 = ~1 month hourly).
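The read-count claim can be checked with simple arithmetic. The sketch below assumes the extraction window is aligned with chunk boundaries and that the cropped region fits in one spatial chunk; `chunks_touched` is a hypothetical helper, not part of the module.

```python
import math


def chunks_touched(n_steps, time_chunk, n_spatial_chunks=1):
    """Number of chunks read to extract n_steps consecutive time steps."""
    return math.ceil(n_steps / time_chunk) * n_spatial_chunks


hours_in_30_days = 30 * 24                               # 720 hourly steps
reads_source = chunks_touched(hours_in_30_days, 6)       # 120 reads
reads_cropped = chunks_touched(hours_in_30_days, 720)    # 1 read
```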
- rechunk_source_zarr(dirout: pathlib.Path, filename: str = FIELDS_FILE, time_chunk_size: int = 720, lat_chunk_size: int = -1, lon_chunk_size: int = -1, show_progress: bool = False)
Rechunk the full source zarr with an optimised layout.
The original IRM zarr uses chunks of (6, 19, 36), which causes extremely poor read performance for time-series queries. This method rewrites the dataset with much larger chunks (by default, 720 time steps over the full spatial extent).
- Parameters:
dirout – Output directory.
filename – Output zarr name (without .zarr).
time_chunk_size – Time steps per chunk (720 = ~1 month).
lat_chunk_size – Latitude chunk size (-1 = full extent).
lon_chunk_size – Longitude chunk size (-1 = full extent).
show_progress – Show a dask progress bar.
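Larger chunks mean fewer reads but more memory per read, so it can help to estimate the chunk size before rechunking. The sketch below resolves the -1 convention and computes the uncompressed size of one chunk. The grid shape, the float32 item size, and the (time, lat, lon) ordering of (6, 19, 36) are assumptions chosen for illustration; both helpers are hypothetical.

```python
import math


def resolve_chunks(shape, chunks):
    """Replace -1 by the full extent and cap each chunk at the array size."""
    return tuple(n if c == -1 else min(c, n) for n, c in zip(shape, chunks))


def chunk_nbytes(chunks, itemsize=4):
    """Uncompressed bytes in one chunk (itemsize=4 assumes float32)."""
    return math.prod(chunks) * itemsize


shape = (8760, 200, 300)                         # hypothetical (time, lat, lon)
resolved = resolve_chunks(shape, (720, -1, -1))  # (720, 200, 300)
new_size = chunk_nbytes(resolved)                # 172_800_000 bytes
old_size = chunk_nbytes((6, 19, 36))             # 16_416 bytes
```

If a full-extent chunk turns out too large for the available memory, setting lat_chunk_size and lon_chunk_size to finite values trades a few more reads for smaller chunks.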
- get_precipitation(region: shapely.geometry.Polygon, time_interval: tuple[datetime.datetime, datetime.datetime]) → pandas.Series
Get precipitation for a given region and time interval.
- Parameters:
region – Polygon representing the area of interest in Lambert 72 coordinates.
time_interval – (start_time, end_time) as a tuple of datetime objects.
- Returns:
Pandas Series with time as index and precipitation [mm/h] as values
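How the pixel values are combined into a single series is not described here. One plausible reading, shown purely as an assumption, is an area-weighted mean that uses the per-pixel fractions from get_surface_and_fractions_of_pixels_in_region as weights and treats all pixels as having the same area. The helper name `areal_mean` is hypothetical.

```python
import numpy as np


def areal_mean(precip, fractions):
    """precip: (nt, NB) values in mm/h; fractions: (NB,) covered shares."""
    fractions = np.asarray(fractions, dtype=float)
    weights = fractions / fractions.sum()  # normalize so weights sum to 1
    return np.asarray(precip, dtype=float) @ weights  # one value per time step


precip = np.array([[1.0, 3.0], [2.0, 2.0]])  # 2 time steps, 2 pixels
series = areal_mean(precip, [0.5, 0.25])
```

The resulting array could then be wrapped in a pandas Series indexed by the time stamps of the requested interval.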