wolfhece.hydrometry.rain_SPW
============================

.. py:module:: wolfhece.hydrometry.rain_SPW

.. autoapi-nested-parse::

   Author: HECE - University of Liege, Pierre Archambeau
   Date: 2024

   Copyright (c) 2024 University of Liege. All rights reserved.

   This script and its content are protected by copyright law. Unauthorized
   copying or distribution of this file, via any medium, is strictly prohibited.


Module Contents
---------------

.. py:class:: GlobalStats

   .. py:attribute:: mean
      :type: float

   .. py:attribute:: median
      :type: float

   .. py:attribute:: std
      :type: float

   .. py:attribute:: min
      :type: float

   .. py:attribute:: max
      :type: float

   .. py:attribute:: q25
      :type: float

   .. py:attribute:: q75
      :type: float

   .. py:attribute:: skewness
      :type: float

   .. py:attribute:: kurtosis
      :type: float

   .. py:attribute:: num_of_values
      :type: int

   .. py:attribute:: total_num_of_values
      :type: int

   .. py:attribute:: coverage
      :type: float


.. py:class:: MissingValues

   .. py:attribute:: number
      :type: int

   .. py:attribute:: percentage
      :type: float


.. py:class:: DateRange

   .. py:attribute:: start
      :type: pandas.Timestamp

   .. py:attribute:: end
      :type: pandas.Timestamp

   .. py:attribute:: all_ranges
      :type: list[tuple[pandas.Timestamp, pandas.Timestamp]]


.. py:class:: YearlyStats

   .. py:attribute:: mean
      :type: float

   .. py:attribute:: median
      :type: float

   .. py:attribute:: std
      :type: float

   .. py:attribute:: min
      :type: float

   .. py:attribute:: max
      :type: float

   .. py:attribute:: q25
      :type: float

   .. py:attribute:: q75
      :type: float

   .. py:attribute:: skewness
      :type: float

   .. py:attribute:: kurtosis
      :type: float


.. py:class:: MonthlyStats

   .. py:attribute:: mean
      :type: float

   .. py:attribute:: median
      :type: float

   .. py:attribute:: std
      :type: float

   .. py:attribute:: min
      :type: float

   .. py:attribute:: max
      :type: float

   .. py:attribute:: q25
      :type: float

   .. py:attribute:: q75
      :type: float

   .. py:attribute:: skewness
      :type: float

   .. py:attribute:: kurtosis
      :type: float


.. py:class:: YearsStats

   .. 
py:attribute:: number_of_years
      :type: int

   .. py:attribute:: number_of_complete_years
      :type: int


.. py:class:: StatisticsSeries

   .. py:attribute:: name
      :type: str

   .. py:attribute:: time_step
      :type: datetime.timedelta
      :value: None

   .. py:attribute:: global_stats
      :type: GlobalStats
      :value: None

   .. py:attribute:: missing_values
      :type: MissingValues
      :value: None

   .. py:attribute:: date_range
      :type: DateRange
      :value: None

   .. py:attribute:: years
      :type: YearsStats
      :value: None

   .. py:attribute:: monthly_statistics
      :type: dict[int, MonthlyStats]
      :value: None

   .. py:attribute:: yearly_statistics
      :type: dict[int, YearlyStats]
      :value: None


.. py:data:: STATS_HOURS_IRM

.. py:data:: STATS_MINUTES_IRM

.. py:class:: SPW_pluviographs(variable: str, timestep: str, credential: str = None, store_dir: str | pathlib.Path = None)

   Management of the rain data from the SPW-MI website through its data download interface.

   The data are stored in a local directory in csv or parquet format, and can be loaded in memory as pandas Series for analysis and plotting.

   .. py:attribute:: _timestep_dl
      :value: None

   .. py:attribute:: _data

   .. py:attribute:: _store_dir

   .. py:attribute:: _format
      :value: None

   .. py:attribute:: _variable

   .. py:attribute:: _timestep

   .. py:attribute:: _stations
      :value: None

   .. py:method:: _get_inactive_stations() -> pandas.DataFrame

      Return all the stations available on the SPW-MI website for the given variable and timestep that are not present in ``self._stations``, which holds the active stations for that variable and timestep.

      This is useful to identify stations that have data available but are not currently active, for example to check whether new stations have been added since the last update of the hydrometry structure, or whether stations have been deactivated for some reason.

   .. 
py:method:: force_update_hydrometry_structure()

      Force the update of the hydrometry structure to get the latest available stations and timeseries from the SPW-MI website.

      This can be useful if new stations or timeseries have been added since the last update, or if there have been changes in the station metadata (e.g., name, code, location, etc.) that are not yet reflected in the local structure.

   .. py:method:: get_all_series(key: Literal['tsid', 'name'] = 'name', to_sort: bool = True) -> dict[str, pandas.Series]

      Get all the series currently loaded in memory, with station names as keys and pandas Series as values.

   .. py:method:: get_station_names_inside_polygon(polygon: wolfhece.PyVertexvectors.vector | shapely.geometry.Polygon) -> list[str]

      Get the names of the stations inside a given polygon.

      :param polygon: a shapely Polygon or a vector of vertices defining the polygon

   .. py:method:: get_nearest_station_name(point: wolfhece.PyVertexvectors.wolfvertex | shapely.geometry.Point | list[float, float]) -> str

      Get the name of the nearest station to a given point.

      :param point: a shapely Point, a vector of coordinates, or a list of coordinates [x, y]

   .. py:method:: resample(rain: pandas.Series, new_timestep: datetime.timedelta, method: str = 'sum')

   .. py:method:: get_names()

   .. py:method:: _get_names_lower()

   .. py:method:: get_codes()

   .. py:method:: get_tsid_from_name(name: str)

   .. py:method:: get_tsid_from_code(code: str)

   .. py:method:: get_tsid(name: str = '', code: str = '')

   .. py:method:: get_name_from_tsid(ts_id: str)

   .. py:method:: get_code_from_tsid(ts_id: str)

   .. py:method:: get_data(fromyear: int, toyear: int, code: str = '', name: str = '', filterna: bool = True, timezone: str = 'GMT+1')

      Get the rain data from the SPW-MI website for a given station code or name, and for a given period of years.

      If filterna is True, the NaN values are replaced by 0. Otherwise, they are kept as NaN.


      :param fromyear: the starting year of the period (inclusive)
      :param toyear: the ending year of the period (inclusive)
      :param code: the station code (if name is not provided)
      :param name: the station name (if code is not provided)
      :param filterna: whether to replace NaN values by 0 (default: True)
      :param timezone: the timezone of the data (default: 'GMT+1')
      :return: a pandas Series containing the rain data for the given station and period, indexed by datetime

   .. py:method:: get_year_data(year: int = 2021, code: str = '', name: str = '', filterna: bool = True, timezone: str = 'GMT+1')

      Get the rain data from the SPW-MI website for a given station code or name, and for a given year.

      :param year: the year of the data to retrieve
      :param code: the station code (if name is not provided)
      :param name: the station name (if code is not provided)
      :param filterna: whether to replace NaN values by 0 (default: True)
      :param timezone: the timezone of the data (default: 'GMT+1')
      :return: a pandas Series containing the rain data for the given station and year, indexed by datetime

   .. py:method:: get_month_data(month: int = 7, year: int = 2021, code: str = '', name: str = '', timezone: str = 'GMT+1')

      Retrieve hourly data from the SPW website (GMT+1).

      :param month: the month of the year for which to retrieve data (1-12)
      :param year: the year for which to retrieve data
      :param code: the station code (if name is not provided)
      :param name: the station name (if code is not provided)
      :param timezone: the timezone of the data, e.g. 'UTC', 'GMT+1', 'GMT+2' (default: 'GMT+1')

   .. 
py:method:: compute_stats_Q(rain: pandas.Series, listhours: list[int]) -> numpy.ndarray
      :classmethod:

      Compute the maximum cumulative rainfall for different durations, based on a convolution with a vector spanning the given number of hours.

      Unit: mm

      :param rain: the time series of rainfall to analyze
      :param listhours: the list of durations in hours for which to calculate the stats (e.g., [1,2,3,6,12,24])
      :return: a numpy array containing the maximum values for each duration

   .. py:method:: compute_stats_i(rain: pandas.Series, listhours: list[int]) -> numpy.ndarray
      :classmethod:

      Compute the maximum average intensities for different durations, based on a convolution with a vector spanning the given number of hours.

      Unit: mm/h

      :param rain: the time series of rainfall to analyze
      :param listhours: the list of durations in hours for which to calculate the stats (e.g., [1,2,3,6,12,24])
      :return: a numpy array containing the maximum values for each duration

   .. py:method:: plot(data: pandas.Series, toshow=False, xbounds=None, ticks='M', label: str = None, figax=None)
      :classmethod:

   .. py:method:: plot_periodic(data: pandas.Series, origin: datetime.datetime, length_in_months: int = 12, toshow: bool = False, figax: tuple[matplotlib.pyplot.Figure, matplotlib.pyplot.Axes] = None) -> tuple[matplotlib.pyplot.Figure, matplotlib.pyplot.Axes]
      :classmethod:

      Compare several years over the same horizon (by default a full year) starting from a given date, for example to compare the summer periods of several years starting from the 1st of July.

      The x-axis is in days from the origin date, and the y-axis is the rainfall in mm/h.


      :param data: the time series of rainfall to plot
      :param origin: the start date of the period to plot (e.g., July 1, 2021)
      :param length_in_months: the number of months to plot from the start date (e.g., 3 to plot from July 1 to September 30)
      :param toshow: if True, displays the plot at the end of the function (default: False)
      :param figax: a tuple (fig, ax) of matplotlib to plot on an existing figure (default: None)
      :return: a tuple (fig, ax) of matplotlib containing the created or modified plot

   .. py:method:: save(data: pandas.Series, filename: str, format: Literal['csv', 'parquet'] = 'csv')

   .. py:method:: load(name: str = '', code: int = 0, filename: str = '', fromdate: datetime.datetime = None, todate: datetime.datetime = None, format: Literal['csv', 'parquet'] = 'parquet')

   .. py:method:: _download_one(dirout: pathlib.Path | str = None, name: str = '', fromyear: int = 2002, toyear: int = dt.datetime.now().year, timezone: str = 'GMT+1', format: Literal['csv', 'parquet'] = 'parquet', force: bool = False)

   .. py:method:: download_all(dirout: pathlib.Path | str = None, fromyear: int = 2002, toyear: int = dt.datetime.now().year, timezone: str = 'GMT+1', format: Literal['csv', 'parquet'] = 'parquet', force: bool = False)

   .. py:method:: update_all(dirin: pathlib.Path | str = None, format: Literal['csv', 'parquet'] = 'parquet')

   .. py:method:: update_one(ts_id: str, dirin: pathlib.Path | str = None, format: Literal['csv', 'parquet'] = 'parquet')

   .. py:method:: load_all(dirin: pathlib.Path | str = None, fromdate: datetime.datetime = None, todate: datetime.datetime = None, timezone: str = 'GMT+1', format: Literal['csv', 'parquet'] = 'parquet')

   .. 
py:method:: analyze(serie: pandas.Series, name: str, volume: bool = True, min_daily_rainfall: float = 0.1, force_unique_timestep: bool = False) -> StatisticsSeries
      :classmethod:

      Analyze the time series and return statistics:

      - Global statistics: mean, median, std, min, max
      - Missing values: number and percentage of missing values
      - Date range: start and end dates
      - Years: number of years, number of complete years
      - Yearly statistics: mean, median, std, min, max per year
      - Monthly statistics: mean, median, std, min, max per month

      :param serie: pd.Series, time series to analyze
      :param name: str, name of the time series
      :param volume: bool, if True, convert intensity to volume per month/year
      :param min_daily_rainfall: float, minimum daily rainfall to consider
      :param force_unique_timestep: bool, if True, enforce a unique time step between non-NA values; otherwise, the time step is set to None if inconsistent time steps are found
      :return: StatisticsSeries, statistics

   .. py:method:: analyze_all(min_value: float = 0.1, volume: bool = True) -> list[StatisticsSeries]

      Analyze all the series currently loaded in memory and return their statistics as a list of StatisticsSeries, one per series.

   .. py:method:: plot_time_ranges(statistics: list[StatisticsSeries], figax=None)
      :classmethod:

      Create a plot showing the time ranges of the given statistics.

      :param statistics: list of StatisticsSeries, statistics to plot
      :param figax: tuple of (fig, ax) or None
      :return: tuple of (fig, ax)

   .. py:method:: plot_yearly_statistics(statistics: StatisticsSeries, title: str = 'Yearly Statistics', figax=None)
      :classmethod:

      Plot yearly statistics with error bars (mean +/- std).

      :param statistics: StatisticsSeries, statistics to plot
      :param title: str, title of the plot
      :param figax: tuple of (fig, ax) or None
      :return: tuple of (fig, ax)

   .. 
py:method:: plot_yearly_boxplots(statistics: StatisticsSeries, title: str = 'Yearly Boxplots', figax=None)
      :classmethod:

      Plot yearly statistics as boxplots.

      :param statistics: StatisticsSeries, statistics to plot
      :param title: str, title of the plot
      :param figax: tuple of (fig, ax) or None
      :return: tuple of (fig, ax)

   .. py:method:: plot_monthly_statistics(statistics: StatisticsSeries, title: str = 'Monthly Statistics', figax=None)
      :classmethod:

      Plot monthly statistics with error bars (mean +/- std).

      :param statistics: StatisticsSeries, statistics to plot
      :param title: str, title of the plot
      :param figax: tuple of (fig, ax) or None
      :return: tuple of (fig, ax)

   .. py:method:: plot_monthly_boxplots(statistics: StatisticsSeries, title: str = 'Monthly Boxplots', figax=None, unit_factor: float = 1.0)
      :classmethod:

      Plot monthly statistics as boxplots.

      :param statistics: StatisticsSeries, statistics to plot
      :param title: str, title of the plot
      :param figax: tuple of (fig, ax) or None
      :param unit_factor: float, factor to convert units (e.g. from m3 to 1000 m3)
      :return: tuple of (fig, ax)


.. py:class:: SPW_pluviographs_1h(store_dir: str | pathlib.Path = None)

   Bases: :py:obj:`SPW_pluviographs`

   .. autoapi-inheritance-diagram:: wolfhece.hydrometry.rain_SPW.SPW_pluviographs_1h
      :parts: 1
      :private-bases:

   Management of the rain data from the SPW-MI website through its data download interface.

   The data are stored in a local directory in csv or parquet format, and can be loaded in memory as pandas Series for analysis and plotting.

   .. py:attribute:: _timestep_dl

   .. py:method:: download_all(fromyear=2002, toyear=dt.datetime.now().year, timezone='GMT+1', format='parquet', force: bool = False)

   .. py:method:: download_one(name='', fromyear=2002, toyear=dt.datetime.now().year, timezone='GMT+1', format='parquet', force: bool = False)

   .. py:method:: load_all(fromdate=None, todate=None, timezone='GMT+1', format='parquet')

   .. py:method:: update_all(format: Literal['csv', 'parquet'] = 'parquet')

   .. 
py:method:: update_one(ts_id: str, dirin: pathlib.Path | str = None, format: Literal['csv', 'parquet'] = 'parquet')


.. py:class:: SPW_pluviographs_5min(store_dir: str | pathlib.Path = None)

   Bases: :py:obj:`SPW_pluviographs`

   .. autoapi-inheritance-diagram:: wolfhece.hydrometry.rain_SPW.SPW_pluviographs_5min
      :parts: 1
      :private-bases:

   Management of the rain data from the SPW-MI website through its data download interface.

   The data are stored in a local directory in csv or parquet format, and can be loaded in memory as pandas Series for analysis and plotting.

   .. py:attribute:: _timestep_dl

   .. py:method:: download_all(fromyear=2002, toyear=dt.datetime.now().year, timezone='GMT+1', format='parquet', force: bool = False)

   .. py:method:: download_one(name='', fromyear=2002, toyear=dt.datetime.now().year, timezone='GMT+1', format='parquet', force: bool = False)

   .. py:method:: load_all(fromdate=None, todate=None, timezone='GMT+1', format='parquet')

   .. py:method:: update_all(format: Literal['csv', 'parquet'] = 'parquet')

   .. py:method:: update_one(ts_id: str, dirin: pathlib.Path | str = None, format: Literal['csv', 'parquet'] = 'parquet')


.. py:data:: my_5min
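
The convolution-based statistics documented for ``compute_stats_Q`` and ``compute_stats_i`` can be illustrated with a minimal sketch. This is not the module's implementation. The helper names ``max_cumulated_rain`` and ``max_mean_intensity`` are hypothetical, the series is assumed to hold hourly depths in mm on a regular time step, and treating NaN as zero is an assumption made only for this example.

```python
import numpy as np
import pandas as pd


def max_cumulated_rain(rain: pd.Series, listhours: list[int]) -> np.ndarray:
    """Maximum rainfall depth [mm] over sliding windows of n hourly steps.

    Hypothetical helper: assumes one value per hour, NaN treated as 0.
    """
    values = rain.fillna(0.0).to_numpy(dtype=float)
    # Convolving with a vector of n ones gives the sum over every
    # window of n consecutive steps; keep the largest of those sums.
    return np.array(
        [np.convolve(values, np.ones(n), mode="valid").max() for n in listhours]
    )


def max_mean_intensity(rain: pd.Series, listhours: list[int]) -> np.ndarray:
    """Maximum mean intensity [mm/h]: maximum depth divided by the duration."""
    return max_cumulated_rain(rain, listhours) / np.asarray(listhours, dtype=float)


# Small synthetic hourly series (values invented for the illustration)
index = pd.date_range("2021-07-14", periods=6, freq=pd.Timedelta(hours=1))
rain = pd.Series([0.0, 2.0, 5.0, 1.0, 0.0, 3.0], index=index)

depths = max_cumulated_rain(rain, [1, 2, 3])
intensities = max_mean_intensity(rain, [1, 2, 3])
print(depths)
print(intensities)
```

With real data, the series returned by ``get_data`` would replace the synthetic one, provided its time step matches the unit used for the durations.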