wolfhece.hydrometry.rain_SPW

Author: HECE - University of Liege, Pierre Archambeau
Date: 2024

Copyright (c) 2024 University of Liege. All rights reserved.

This script and its content are protected by copyright law. Unauthorized copying or distribution of this file, via any medium, is strictly prohibited.

Module Contents

class wolfhece.hydrometry.rain_SPW.GlobalStats[source]
mean: float[source]
median: float[source]
std: float[source]
min: float[source]
max: float[source]
q25: float[source]
q75: float[source]
skewness: float[source]
kurtosis: float[source]
num_of_values: int[source]
total_num_of_values: int[source]
coverage: float[source]
class wolfhece.hydrometry.rain_SPW.MissingValues[source]
number: int[source]
percentage: float[source]
class wolfhece.hydrometry.rain_SPW.DateRange[source]
start: pandas.Timestamp[source]
end: pandas.Timestamp[source]
all_ranges: list[tuple[pandas.Timestamp, pandas.Timestamp]][source]
class wolfhece.hydrometry.rain_SPW.YearlyStats[source]
mean: float[source]
median: float[source]
std: float[source]
min: float[source]
max: float[source]
q25: float[source]
q75: float[source]
skewness: float[source]
kurtosis: float[source]
class wolfhece.hydrometry.rain_SPW.MonthlyStats[source]
mean: float[source]
median: float[source]
std: float[source]
min: float[source]
max: float[source]
q25: float[source]
q75: float[source]
skewness: float[source]
kurtosis: float[source]
class wolfhece.hydrometry.rain_SPW.YearsStats[source]
number_of_years: int[source]
number_of_complete_years: int[source]
class wolfhece.hydrometry.rain_SPW.StatisticsSeries[source]
name: str[source]
time_step: datetime.timedelta = None[source]
global_stats: GlobalStats = None[source]
missing_values: MissingValues = None[source]
date_range: DateRange = None[source]
years: YearsStats = None[source]
monthly_statistics: dict[int, MonthlyStats] = None[source]
yearly_statistics: dict[int, YearlyStats] = None[source]
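
As an illustration of how a container such as MissingValues might be filled, here is a minimal sketch using only the standard library. The class name `MissingValuesSketch`, the helper `count_missing`, and the choice of expressing the percentage on a 0-100 scale are assumptions made for this example; the module's own computation is not shown on this page.

```python
import math
from dataclasses import dataclass


@dataclass
class MissingValuesSketch:
    """Hypothetical stand-in with the same two fields as MissingValues."""
    number: int
    percentage: float


def count_missing(values: list[float]) -> MissingValuesSketch:
    """Count NaN entries and express them as a share of all entries."""
    total = len(values)
    number = sum(1 for v in values if math.isnan(v))
    # Assumption: percentage is on a 0-100 scale
    percentage = 100.0 * number / total if total else 0.0
    return MissingValuesSketch(number=number, percentage=percentage)


sample = [0.0, 1.2, float("nan"), 0.4, float("nan"), 0.0, 0.0, 2.1]
missing = count_missing(sample)
print(missing)
```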
wolfhece.hydrometry.rain_SPW.STATS_HOURS_IRM[source]
wolfhece.hydrometry.rain_SPW.STATS_MINUTES_IRM[source]
class wolfhece.hydrometry.rain_SPW.SPW_pluviographs(variable: str, timestep: str, credential: str = None, store_dir: str | pathlib.Path = None)[source]

Management of the rain data from the SPW-MI website through its data download interface. The data are stored in a local directory in csv or parquet format, and can be loaded in memory as pandas Series for analysis and plotting.

_timestep_dl = None[source]
_data[source]
_store_dir[source]
_format = None[source]
_variable[source]
_timestep[source]
_stations = None[source]
_get_inactive_stations() pandas.DataFrame[source]

Return all the stations available on the SPW-MI website for the given variable and timestep that are not present in self._stations (the active stations for that variable and timestep).

This can be useful to identify stations that have data available but are not currently active for the given variable and timestep, for example to check if there are new stations that have been added since the last update of the hydrometry structure, or if there are stations that have been deactivated for some reason.
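
The idea of "available but not active" can be sketched as a simple difference between two station tables. This sketch uses plain dictionaries and made-up identifiers; the actual method returns a pandas DataFrame, and its internal logic is not documented here.

```python
# Hypothetical station tables: ts_id -> station name
available = {"1001": "Station A", "1002": "Station B", "1003": "Station C"}
active = {"1001": "Station A", "1003": "Station C"}


def inactive_stations(available: dict[str, str], active: dict[str, str]) -> dict[str, str]:
    """Return the entries of `available` whose key is absent from `active`."""
    return {ts_id: name for ts_id, name in available.items() if ts_id not in active}


inactive = inactive_stations(available, active)
print(inactive)
```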

force_update_hydrometry_structure()[source]

Force the update of the hydrometry structure to get the latest available stations and timeseries from the SPW-MI website.

This can be useful if new stations or timeseries have been added since the last update, or if there have been changes in the station metadata (e.g., name, code, location, etc.) that are not yet reflected in the local structure.

get_all_series(key: Literal['tsid', 'name'] = 'name', to_sort: bool = True) dict[str, pandas.Series][source]

Get all the series currently loaded in memory, with station names as keys and pandas Series as values

get_station_names_inside_polygon(polygon: wolfhece.PyVertexvectors.vector | shapely.geometry.Polygon) list[str][source]

Get the names of the stations inside a given polygon

Parameters:

polygon – a shapely Polygon or a vector of vertices defining the polygon
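
For readers unfamiliar with this kind of spatial filter, the sketch below shows one common way to test whether points lie inside a polygon (ray casting), in pure Python. It is not the library's implementation, which may rely on shapely; the station names and coordinates are invented for the example.

```python
def point_in_polygon(x: float, y: float, vertices: list[tuple[float, float]]) -> bool:
    """Ray-casting test: count edge crossings of a horizontal ray going right from (x, y)."""
    inside = False
    n = len(vertices)
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can be crossed
        if (y1 > y) != (y2 > y):
            # x-coordinate where the edge meets that horizontal line
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


# Hypothetical stations and a square polygon
stations = {"Station A": (2.0, 2.0), "Station B": (5.0, 1.0), "Station C": (1.0, 3.5)}
square = [(0.0, 0.0), (4.0, 0.0), (4.0, 4.0), (0.0, 4.0)]

names_inside = [name for name, (x, y) in stations.items() if point_in_polygon(x, y, square)]
print(names_inside)
```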

get_nearest_station_name(point: wolfhece.PyVertexvectors.wolfvertex | shapely.geometry.Point | list[float, float]) str[source]

Get the name of the nearest station to a given point

Parameters:

point – a shapely Point, a wolfvertex, or a list of coordinates [x, y]
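
A nearest-station lookup can be sketched as a minimum over Euclidean distances. The station table and coordinates below are hypothetical, and the sketch assumes planar coordinates in the same reference system as the query point; how the method itself measures distance is not stated on this page.

```python
import math

# Hypothetical station coordinates (x, y) in a planar reference system
stations = {"Station A": (2.0, 2.0), "Station B": (5.0, 1.0), "Station C": (1.0, 3.5)}


def nearest_station(point: list[float], stations: dict[str, tuple[float, float]]) -> str:
    """Return the name of the station with the smallest Euclidean distance to `point`."""
    px, py = point
    return min(
        stations,
        key=lambda name: math.hypot(stations[name][0] - px, stations[name][1] - py),
    )


nearest = nearest_station([4.0, 1.5], stations)
print(nearest)
```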

resample(rain: pandas.Series, new_timestep: datetime.timedelta, method: str = 'sum')[source]
get_names()[source]
_get_names_lower()[source]
get_codes()[source]
get_tsid_from_name(name: str)[source]
get_tsid_from_code(code: str)[source]
get_tsid(name: str = '', code: str = '')[source]
get_name_from_tsid(ts_id: str)[source]
get_code_from_tsid(ts_id: str)[source]
get_data(fromyear: int, toyear: int, code: str = '', name: str = '', filterna: bool = True, timezone: str = 'GMT+1')[source]

Get the rain data from the SPW-MI website for a given station code or name, and for a given period of years. If filterna is True, the NaN values are replaced by 0. Otherwise, they are kept as NaN.

Parameters:
  • fromyear – the starting year of the period (inclusive)

  • toyear – the ending year of the period (inclusive)

  • code – the station code (if name is not provided)

  • name – the station name (if code is not provided)

  • filterna – whether to replace NaN values by 0 (default: True)

  • timezone – the timezone of the data (default: ‘GMT+1’)

Returns:

a pandas Series containing the rain data for the given station and period, indexed by datetime
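
The effect of an inclusive year range and of the filterna flag can be illustrated on a small in-memory pandas Series. The helper `select_years` is hypothetical and does not contact the SPW-MI website; it only mimics the documented behaviour (both bounds inclusive, NaN replaced by 0 when filterna is True).

```python
import numpy as np
import pandas as pd

# Six hourly values straddling the 2020/2021 boundary, with two gaps
index = pd.date_range("2020-12-31 22:00", periods=6, freq=pd.Timedelta(hours=1))
rain = pd.Series([0.2, np.nan, 0.0, 1.4, np.nan, 0.6], index=index)


def select_years(series: pd.Series, fromyear: int, toyear: int, filterna: bool = True) -> pd.Series:
    """Keep values whose year lies in [fromyear, toyear]; optionally replace NaN by 0."""
    mask = (series.index.year >= fromyear) & (series.index.year <= toyear)
    selected = series[mask]
    return selected.fillna(0.0) if filterna else selected


filled = select_years(rain, 2021, 2021, filterna=True)
raw = select_years(rain, 2021, 2021, filterna=False)
print(filled)
print(raw)
```

Note that once NaN values are replaced by 0, gaps in the record can no longer be distinguished from dry periods, so keeping NaN (filterna=False) is the safer choice when data coverage matters.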

get_year_data(year: int = 2021, code: str = '', name: str = '', filterna: bool = True, timezone: str = 'GMT+1')[source]

Get the rain data from the SPW-MI website for a given station code or name, and for a given year

Parameters:
  • year – the year of the data to retrieve

  • code – the station code (if name is not provided)

  • name – the station name (if code is not provided)

  • filterna – whether to replace NaN values by 0 (default: True)

  • timezone – the timezone of the data (default: ‘GMT+1’)

Returns:

a pandas Series containing the rain data for the given station and year, indexed by datetime

get_month_data(month: int = 7, year: int = 2021, code: str = '', name: str = '', timezone: str = 'GMT+1')[source]

Retrieve hourly data from the SPW website for a given month of a given year (default timezone: GMT+1)

Parameters:
  • month – the month of the year for which to retrieve data (1-12)

  • year – the year for which to retrieve data

  • code – the station code (if name is not provided)

  • name – the station name (if code is not provided)

  • timezone – the timezone of the data, e.g. ‘UTC’, ‘GMT+1’, ‘GMT+2’ (default: ‘GMT+1’)

classmethod compute_stats_Q(rain: pandas.Series, listhours: list[int]) numpy.ndarray[source]

Computes the maximum cumulative rainfall for different durations, based on convolution with a vector whose length is the number of hours. Unit: mm

Parameters:
  • rain – the time series of rainfall to analyze

  • listhours – the list of durations in hours for which to calculate the stats (e.g., [1,2,3,6,12,24])

Returns:

a numpy array containing the maximum values for each duration
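
The convolution idea can be sketched with numpy: convolving the series with a vector of ones of length h gives the rainfall total over every window of h consecutive values, and the maximum of those totals is the statistic for that duration. The function name `max_cumulative` is hypothetical, and the sketch assumes a gap-free hourly series (one value per hour, no NaN) that is at least as long as the longest duration.

```python
import numpy as np


def max_cumulative(rain_mm, durations_h):
    """Maximum rainfall total (mm) over any window of `hours` consecutive hourly values."""
    values = np.asarray(rain_mm, dtype=float)
    maxima = []
    for hours in durations_h:
        # Each output element is the sum of `hours` consecutive input values
        window_sums = np.convolve(values, np.ones(hours), mode="valid")
        maxima.append(window_sums.max())
    return np.array(maxima)


# Hypothetical hourly rainfall depths in mm
hourly_rain = [0.0, 1.0, 3.0, 2.0, 0.0, 0.5, 4.0, 0.0]
totals = max_cumulative(hourly_rain, [1, 2, 3])
print(totals)
```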

classmethod compute_stats_i(rain: pandas.Series, listhours: list[int]) numpy.ndarray[source]

Computes the maximum average intensities for different durations, based on convolution with a vector whose length is the number of hours. Unit: mm/h

Parameters:
  • rain – the time series of rainfall to analyze

  • listhours – the list of durations in hours for which to calculate the stats (e.g., [1,2,3,6,12,24])

Returns:

a numpy array containing the maximum values for each duration
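
A maximum average intensity is the maximum window total divided by the window duration. The sketch below obtains the window totals with a running sum instead of a convolution (the two give the same totals); the function name `max_mean_intensity` is hypothetical, and a gap-free hourly series in mm is assumed.

```python
def max_mean_intensity(rain_mm: list[float], durations_h: list[int]) -> list[float]:
    """Maximum mean intensity (mm/h) over any window of `hours` consecutive hourly values."""
    results = []
    for hours in durations_h:
        # Total of the first window, then slide it one step at a time
        window = sum(rain_mm[:hours])
        best = window
        for i in range(hours, len(rain_mm)):
            window += rain_mm[i] - rain_mm[i - hours]
            best = max(best, window)
        # Divide the largest total (mm) by the duration (h) to get mm/h
        results.append(best / hours)
    return results


# Hypothetical hourly rainfall depths in mm
hourly_rain = [0.0, 1.0, 3.0, 2.0, 0.0, 0.5, 4.0, 0.0]
intensities = max_mean_intensity(hourly_rain, [1, 2, 3])
print(intensities)
```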

classmethod plot(data: pandas.Series, toshow=False, xbounds=None, ticks='M', label: str = None, figax=None)[source]
classmethod plot_periodic(data: pandas.Series, origin: datetime.datetime, length_in_months: int = 12, toshow: bool = False, figax: tuple[matplotlib.pyplot.Figure, matplotlib.pyplot.Axes] = None) tuple[matplotlib.pyplot.Figure, matplotlib.pyplot.Axes][source]

Compare several years over the same period, starting from a given origin date; for example, compare the summer periods of several years starting from 1 July. The x-axis is in days from the origin date, and the y-axis is the rainfall in mm/h.

Parameters:
  • data – the time series of rainfall to plot

  • origin – the start date of the period to plot (e.g., July 1, 2021)

  • length_in_months – the number of months to plot from the start date (e.g., 3 to plot from July 1 to September 30)

  • toshow – if True, displays the plot at the end of the function (default: False)

  • figax – a tuple (fig, ax) of matplotlib to plot on an existing figure (default: None)

Returns:

a tuple (fig, ax) of matplotlib containing the created or modified plot
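
Overlaying several years on a common axis requires converting each timestamp into "days since the most recent anniversary of the origin". A minimal standard-library sketch of that folding step follows; the function name `days_from_yearly_origin` is hypothetical, the plotting itself is omitted, and the sketch assumes the origin is not 29 February.

```python
from datetime import datetime, timedelta


def days_from_yearly_origin(timestamp: datetime, origin: datetime) -> tuple[int, float]:
    """Return (year in which the period starts, days elapsed since that period's origin)."""
    # Anniversary of the origin in the timestamp's own year
    start = origin.replace(year=timestamp.year)
    if timestamp < start:
        # The timestamp belongs to the period that started the previous year
        start = origin.replace(year=timestamp.year - 1)
    return start.year, (timestamp - start) / timedelta(days=1)


origin = datetime(2021, 7, 1)
print(days_from_yearly_origin(datetime(2021, 7, 2, 12), origin))
print(days_from_yearly_origin(datetime(2022, 7, 11), origin))
print(days_from_yearly_origin(datetime(2022, 6, 30), origin))
```

Grouping values by the first element of the returned tuple gives one curve per period, and the second element is the shared x-coordinate.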

save(data: pandas.Series, filename: str, format: Literal['csv', 'parquet'] = 'csv')[source]
load(name: str = '', code: int = 0, filename: str = '', fromdate: datetime.datetime = None, todate: datetime.datetime = None, format: Literal['csv', 'parquet'] = 'parquet')[source]
_download_one(dirout: pathlib.Path | str = None, name: str = '', fromyear: int = 2002, toyear: int = dt.datetime.now().year, timezone: str = 'GMT+1', format: Literal['csv', 'parquet'] = 'parquet', force: bool = False)[source]
download_all(dirout: pathlib.Path | str = None, fromyear: int = 2002, toyear: int = dt.datetime.now().year, timezone: str = 'GMT+1', format: Literal['csv', 'parquet'] = 'parquet', force: bool = False)[source]
update_all(dirin: pathlib.Path | str = None, format: Literal['csv', 'parquet'] = 'parquet')[source]
update_one(ts_id: str, dirin: pathlib.Path | str = None, format: Literal['csv', 'parquet'] = 'parquet')[source]
load_all(dirin: pathlib.Path | str = None, fromdate: datetime.datetime = None, todate: datetime.datetime = None, timezone: str = 'GMT+1', format: Literal['csv', 'parquet'] = 'parquet')[source]
classmethod analyze(serie: pandas.Series, name: str, volume: bool = True, min_daily_rainfall: float = 0.1, force_unique_timestep: bool = False) StatisticsSeries[source]

Analyze the time series and return statistics:

  • Global statistics: mean, median, std, min, max

  • Missing values : number and percentage of missing values

  • Date range: start and end dates

  • Years : number of years, number of complete years

  • Yearly statistics : mean, median, std, min, max per year

  • Monthly statistics : mean, median, std, min, max per month

Parameters:
  • serie – pd.Series, time series to analyze

  • name – str, name of the time series

  • volume – bool, if True, convert intensity to volume per month/year

  • min_daily_rainfall – float, minimum daily rainfall to consider

  • force_unique_timestep – bool, if True, enforce unique time step between non-NA values, otherwise, the time step is set to None if inconsistent time steps are found

Returns:

StatisticsSeries, statistics
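
The per-month part of such an analysis can be sketched with the standard library: group the values by calendar month, then compute the summary statistics for each group. The function `monthly_summary` and the sample values are hypothetical, and the use of the sample standard deviation (n - 1 denominator) is an assumption of this sketch rather than a documented property of analyze.

```python
import statistics
from collections import defaultdict
from datetime import date

# Hypothetical daily rainfall depths in mm
daily_rain = {
    date(2021, 1, 1): 2.0,
    date(2021, 1, 2): 4.0,
    date(2021, 1, 3): 6.0,
    date(2021, 7, 14): 10.0,
    date(2021, 7, 15): 30.0,
}


def monthly_summary(series: dict[date, float]) -> dict[int, dict[str, float]]:
    """Group values by calendar month and compute mean, median, std, min and max."""
    by_month = defaultdict(list)
    for day, value in series.items():
        by_month[day.month].append(value)
    return {
        month: {
            "mean": statistics.mean(values),
            "median": statistics.median(values),
            # Sample standard deviation; needs at least two values per month
            "std": statistics.stdev(values),
            "min": min(values),
            "max": max(values),
        }
        for month, values in sorted(by_month.items())
    }


summary = monthly_summary(daily_rain)
print(summary)
```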

analyze_all(min_value: float = 0.1, volume: bool = True) list[StatisticsSeries][source]

Analyze all the series currently loaded in memory and return their statistics, one StatisticsSeries per series (the signature annotates the return value as a list)

classmethod plot_time_ranges(statistics: list[StatisticsSeries], figax=None)[source]

Create a plot showing the time ranges of the given statistics

Parameters:
  • statistics – list of StatisticsSeries, statistics to plot

  • figax – tuple of (fig, ax) or None

Returns:

tuple of (fig, ax)

classmethod plot_yearly_statistics(statistics: StatisticsSeries, title: str = 'Yearly Statistics', figax=None)[source]

Plot yearly statistics with error bars (mean +/- std)

Parameters:
  • statistics – StatisticsSeries, statistics to plot

  • title – str, title of the plot

  • figax – tuple of (fig, ax) or None

Returns:

tuple of (fig, ax)

classmethod plot_yearly_boxplots(statistics: StatisticsSeries, title: str = 'Yearly Boxplots', figax=None)[source]

Plot yearly statistics as boxplots

Parameters:
  • statistics – StatisticsSeries, statistics to plot

  • title – str, title of the plot

  • figax – tuple of (fig, ax) or None

Returns:

tuple of (fig, ax)

classmethod plot_monthly_statistics(statistics: StatisticsSeries, title: str = 'Monthly Statistics', figax=None)[source]

Plot monthly statistics with error bars (mean +/- std)

Parameters:
  • statistics – StatisticsSeries, statistics to plot

  • title – str, title of the plot

  • figax – tuple of (fig, ax) or None

Returns:

tuple of (fig, ax)

classmethod plot_monthly_boxplots(statistics: StatisticsSeries, title: str = 'Monthly Boxplots', figax=None, unit_factor: float = 1.0)[source]

Plot monthly statistics as boxplots

Parameters:
  • statistics – StatisticsSeries, statistics to plot

  • title – str, title of the plot

  • figax – tuple of (fig, ax) or None

  • unit_factor – float, factor to convert units (e.g. from m3 to 1000 m3)

Returns:

tuple of (fig, ax)

class wolfhece.hydrometry.rain_SPW.SPW_pluviographs_1h(store_dir: str | pathlib.Path = None)[source]

Bases: SPW_pluviographs

Management of the rain data from the SPW-MI website through its data download interface. The data are stored in a local directory in csv or parquet format, and can be loaded in memory as pandas Series for analysis and plotting.

_timestep_dl[source]
download_all(fromyear=2002, toyear=dt.datetime.now().year, timezone='GMT+1', format='parquet', force: bool = False)[source]
download_one(name='', fromyear=2002, toyear=dt.datetime.now().year, timezone='GMT+1', format='parquet', force: bool = False)[source]
load_all(fromdate=None, todate=None, timezone='GMT+1', format='parquet')[source]
update_all(format: Literal['csv', 'parquet'] = 'parquet')[source]
update_one(ts_id: str, dirin: pathlib.Path | str = None, format: Literal['csv', 'parquet'] = 'parquet')[source]
class wolfhece.hydrometry.rain_SPW.SPW_pluviographs_5min(store_dir: str | pathlib.Path = None)[source]

Bases: SPW_pluviographs

Management of the rain data from the SPW-MI website through its data download interface. The data are stored in a local directory in csv or parquet format, and can be loaded in memory as pandas Series for analysis and plotting.

_timestep_dl[source]
download_all(fromyear=2002, toyear=dt.datetime.now().year, timezone='GMT+1', format='parquet', force: bool = False)[source]
download_one(name='', fromyear=2002, toyear=dt.datetime.now().year, timezone='GMT+1', format='parquet', force: bool = False)[source]
load_all(fromdate=None, todate=None, timezone='GMT+1', format='parquet')[source]
update_all(format: Literal['csv', 'parquet'] = 'parquet')[source]
update_one(ts_id: str, dirin: pathlib.Path | str = None, format: Literal['csv', 'parquet'] = 'parquet')[source]
wolfhece.hydrometry.rain_SPW.my_5min[source]