wolfhece.hydrology.climate_data_hourly
======================================

.. py:module:: wolfhece.hydrology.climate_data_hourly


Module Contents
---------------

.. py:data:: shapely_polygons
   :value: None


.. py:data:: DATADIR

.. py:data:: FIELDS_FILE
   :value: 'radclim-analogs-median.zarr'


.. py:data:: ANALOG_FILE
   :value: 'precipitation-analogs.zarr'


.. py:function:: transform_latlon_to_lambert72_array(lat_array: numpy.ndarray, lon_array: numpy.ndarray) -> tuple[numpy.ndarray, numpy.ndarray]

   Transform arrays of EPSG:4258 coordinates to Lambert 72 coordinates.

   Coordinates from IRM are in EPSG:4258, and we want to convert them to Lambert 72 (EPSG:31370).

   :lat_array: Array of latitudes in EPSG:4258
   :lon_array: Array of longitudes in EPSG:4258
   :return: tuple of two arrays of x and y coordinates in Lambert 72, with the same shape as the input arrays


.. py:function:: convert_pixels_to_squares(pixels: tuple[numpy.ndarray, numpy.ndarray]) -> tuple[numpy.ndarray, scipy.spatial.KDTree]

   From pixels coordinates, define squares around each pixel center.

   Corners are defined as the average of the pixel center and its neighbors.

   Returns a (NB, 4, 2) numpy array of corner coordinates and the KDTree.
   Each row contains 4 corners: lower-left, lower-right, upper-right, upper-left.


.. py:class:: ClimateDataHourly(datadir=DATADIR, fields_file=FIELDS_FILE, analog_file=ANALOG_FILE)

   Based on data from IRM : https://essd.copernicus.org/articles/17/6405/2025/

   Title : Hourly precipitation fields at 1 km resolution over Belgium from 1940 to 2016 based on the analog technique
   Authors : Elke Debrie, Jonathan Demaeyer, and Stéphane Vannitsem


   .. py:attribute:: datadir


   .. py:attribute:: fields_file
      :value: 'radclim-analogs-median.zarr'


   .. py:attribute:: analog_file
      :value: 'precipitation-analogs.zarr'


   .. py:attribute:: fields_ds
      :value: None


   .. py:attribute:: analog_ds
      :value: None


   .. py:attribute:: _xy_lbt72
      :value: None


   .. py:attribute:: _gdf
      :value: None


   .. py:property:: xy

      Return the spatial coordinates in Lambert 72 as a tuple of two 2D numpy arrays (x, y)
      with the same shape as the spatial dimensions of the datasets.


   .. py:method:: load_data(fields: bool = True, analogs: bool = True)

      Load the datasets from ZARR files and build the spatial geometries.

      :param fields: Whether to load the fields dataset.
      :param analogs: Whether to load the analog dataset.


   .. py:method:: _crop_gdf_to_region(region: shapely.geometry.Polygon)

      Crop the GeoDataFrame to the given region by selecting only the pixels whose square intersects the region.

      :param region: Polygon representing the area of interest in Lambert 72 coordinates
      :return: Cropped GeoDataFrame containing only the pixels that intersect the region.


   .. py:method:: _find_slices_for_region(region: shapely.geometry.Polygon)

      Find the row and column slices that correspond to the bounding box of the given region.

      :param region: Polygon representing the area of interest in Lambert 72 coordinates
      :return: tuple of (row_slice, col_slice) that can be used to index the datasets.


   .. py:method:: get_surface_and_fractions_of_pixels_in_region(region: shapely.geometry.Polygon) -> tuple[numpy.ndarray, float]

      For a given region, compute the fraction of each pixel that is covered by the region, and the total surface of the region.

      :param region: Polygon representing the area of interest in Lambert 72 coordinates
      :return: tuple of (fractions, surface) where fractions is a numpy array of the same length as
               the number of pixels, containing the fraction of each pixel covered by the region,
               and surface is the total area of the region in square meters.


   .. py:method:: crop_to_new_zarr(dirout: pathlib.Path, region: shapely.geometry.Polygon, filename: str = FIELDS_FILE, show_progress: bool = False, time_chunk_size: int = 720)

      Crop the fields dataset to *region* and write an optimised zarr.

      The output zarr uses a single spatial chunk (the full cropped
      extent) and large temporal chunks (*time_chunk_size*, default
      720 = 1 month hourly).  This layout is ideal for later time-
      series extraction: one file read per month instead of one per
      6 time-steps.

      :param dirout: Output directory.
      :param region: Region polygon (Lambert 72).
      :param filename: Output zarr name (without .zarr).
      :param show_progress: Show a dask progress bar.
      :param time_chunk_size: Number of time steps per chunk (720 = ~1 month hourly).


   .. py:method:: rechunk_source_zarr(dirout: pathlib.Path, filename: str = FIELDS_FILE, time_chunk_size: int = 720, lat_chunk_size: int = -1, lon_chunk_size: int = -1, show_progress: bool = False)

      Rechunk the full source zarr with an optimised layout.

      The original IRM zarr uses chunks of (6, 19, 36) which causes
      extremely poor read performance for time-series queries.  This
      method rewrites the dataset with much larger chunks.

      :param dirout: Output directory.
      :param filename: Output zarr name (without .zarr).
      :param time_chunk_size: Time steps per chunk (720 = ~1 month).
      :param lat_chunk_size: Latitude chunk size (-1 = full extent).
      :param lon_chunk_size: Longitude chunk size (-1 = full extent).
      :param show_progress: Show a dask progress bar.


   .. py:method:: get_precipitation(region: shapely.geometry.Polygon, time_interval: tuple[datetime.datetime, datetime.datetime]) -> pandas.Series

      Get precipitation for a given bounding box and time interval.

      :region: Polygon representing the area of interest in Lambert 72 coordinates
      :time_interval: (start_time, end_time) as a tuple of datetime objects
      :return: Pandas Series with time as index and precipitation [mm/h] as values