Data Generation#

A. Simulated data#

The simulated data were generated using a three-compartment conceptual model that extends the one from Kirchner 2019.

In this notebook, we explain the procedure we followed to generate the three benchmark datasets used in our paper.

Warning: the Pyeto Python package must be downloaded first, and the path where you saved it on your machine must be added to the Python path using the command below:

import sys
sys.path.append('{Path where you saved the Pyeto package}')

1. Loading the precipitation time series and computing potential evapotranspiration from daily temperatures (Tmax, Tmin and Tmean)#

Run the notebook data_preprocessing.ipynb located in experiments/preprocessing_simulated_data/1_saving_input_data_sites#

Note that you will need the (hourly) precipitation time series and the daily maximum, minimum and mean temperatures at the site of interest. These data can be obtained, for example, through the MeteoSwiss API for catchments located in Switzerland.
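To give an idea of how daily Tmax, Tmin and Tmean can be turned into potential evapotranspiration, here is a minimal sketch of the Hargreaves equation as described in FAO-56, written with the standard library only. This is an assumption on our side for illustration: the preprocessing notebook relies on the Pyeto package and may use a different routine or different settings. The function names, the latitude and the temperatures below are hypothetical.

```python
import math

SOLAR_CONSTANT = 0.0820  # MJ m^-2 min^-1, value used in FAO-56


def extraterrestrial_radiation(latitude_deg, day_of_year):
    """Daily extraterrestrial radiation Ra in MJ m^-2 day^-1 (FAO-56).

    Valid outside polar day/night, where the sunset hour angle is defined.
    """
    lat = math.radians(latitude_deg)
    angle = 2.0 * math.pi * day_of_year / 365.0
    inv_rel_dist = 1.0 + 0.033 * math.cos(angle)  # inverse relative Earth-Sun distance
    sol_dec = 0.409 * math.sin(angle - 1.39)      # solar declination (rad)
    sunset_angle = math.acos(-math.tan(lat) * math.tan(sol_dec))
    return (24.0 * 60.0 / math.pi) * SOLAR_CONSTANT * inv_rel_dist * (
        sunset_angle * math.sin(lat) * math.sin(sol_dec)
        + math.cos(lat) * math.cos(sol_dec) * math.sin(sunset_angle)
    )


def hargreaves_pet(tmin, tmax, tmean, latitude_deg, day_of_year):
    """Daily potential evapotranspiration in mm/day (Hargreaves equation)."""
    ra = extraterrestrial_radiation(latitude_deg, day_of_year)
    temp_range = max(tmax - tmin, 0.0)  # guard against inconsistent inputs
    # The factor 0.408 converts MJ m^-2 day^-1 into mm/day of evaporated water.
    return 0.0023 * (tmean + 17.8) * math.sqrt(temp_range) * 0.408 * ra


# Hypothetical summer day at 47 degrees north (temperatures in degrees Celsius)
pet = hargreaves_pet(tmin=12.0, tmax=25.0, tmean=18.5,
                     latitude_deg=47.0, day_of_year=180)
print(f"PET = {pet:.2f} mm/day")
```

The sketch returns a daily value; how this value is combined with the hourly precipitation series is handled in the preprocessing notebook and is not reproduced here.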

(Optional) Calibrating the parameters of the threebox model#

When using precipitation data from a real site, you may want to generate streamflow values that are realistic for this site. To do so, you can calibrate the parameters of the threebox model using both the precipitation and streamflow time series at the given site (or at a similar site for which data is available).
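The idea behind such a calibration can be sketched as follows. This is only a toy illustration under strong assumptions: a single linear reservoir stands in for the threebox model, the Nash-Sutcliffe efficiency is our choice of goodness-of-fit measure, and a plain grid search replaces whatever optimizer you would use in practice. All names and values are hypothetical.

```python
def simulate_linear_reservoir(precip, k):
    """Toy stand-in simulator: one reservoir with outflow q = k * storage."""
    storage, flow = 0.0, []
    for p in precip:
        storage += p
        q = k * storage
        storage -= q
        flow.append(q)
    return flow


def nash_sutcliffe(observed, simulated):
    """Nash-Sutcliffe efficiency; a value of 1 means a perfect fit."""
    mean_obs = sum(observed) / len(observed)
    sse = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    var = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - sse / var


def calibrate(precip, observed, candidates):
    """Return the candidate parameter with the highest efficiency."""
    return max(
        candidates,
        key=lambda k: nash_sutcliffe(observed, simulate_linear_reservoir(precip, k)),
    )


# Synthetic example: the "observations" are produced with k = 0.3
precip = [0.0, 5.0, 2.0, 0.0, 0.0, 8.0, 1.0, 0.0, 0.0, 0.0]
observed = simulate_linear_reservoir(precip, 0.3)
candidates = [i / 100 for i in range(5, 100, 5)]  # 0.05, 0.10, ..., 0.95
best_k = calibrate(precip, observed, candidates)
print(best_k)
```

With real data, the observed streamflow would come from the site of interest and the simulator would be the threebox model with its full parameter set.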

2 - 3. Running the threebox model to get the simulated streamflow time series#

You have two possible options here:

  • either you simply want to get the streamflow time series: In this case you can use the Rscript generate_streamflow.R located in threeboxmodel/Rscripts.

  • or you also want to compute the true transfer functions: In this case you can use the Rscript simulations_computing_tfs_flashy.R located in experiments/preprocessing_simulated_data/2_running_simulator_sites_for_all_precip_events. This script first saves the data from the simulation on the complete precipitation time series in a file named data_{site}.txt. It then runs the simulator again with a single precipitation event removed, and does so for each precipitation event of interest. You can specify the minimum precipitation intensity for which you want to compute the corresponding transfer function; by default, this threshold is set to 1 mm/h. Once the Rscript has finished, you can use the Python script save_tfs.py located in experiments/preprocessing_simulated_data/3_saving_true_transfer_functions_in_a_single_file to aggregate and save the transfer functions of all precipitation events of interest in a single file.

Note that in both cases, you should modify the Rscript to:

  • change the model parameters, if needed

  • change the name of the site considered
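The event-removal procedure used to obtain the true transfer functions can be illustrated with a short Python sketch. It is not the Rscript itself: a single linear reservoir stands in for the threebox model, each time step is treated as one precipitation event, and the response is normalized by the event's precipitation amount. These three choices, as well as all names and values, are assumptions made for illustration.

```python
def simulate_linear_reservoir(precip, k=0.2):
    """Toy stand-in simulator: one reservoir with outflow q = k * storage."""
    storage, flow = 0.0, []
    for p in precip:
        storage += p
        q = k * storage
        storage -= q
        flow.append(q)
    return flow


def transfer_function(precip, event_index, horizon, k=0.2):
    """Streamflow response per unit precipitation of a single event.

    The response is the difference between a run on the complete series
    and a run in which the event at `event_index` has been removed.
    """
    full = simulate_linear_reservoir(precip, k)
    without_event = list(precip)
    without_event[event_index] = 0.0  # remove the single event
    reduced = simulate_linear_reservoir(without_event, k)
    amount = precip[event_index]
    end = min(event_index + horizon, len(precip))
    return [(full[t] - reduced[t]) / amount for t in range(event_index, end)]


# Hypothetical hourly precipitation series (mm/h)
precip = [0.0, 0.4, 3.0, 0.0, 1.5, 0.0, 0.0, 6.0, 0.0, 0.0, 0.0, 0.0]
threshold = 1.0  # minimum intensity (mm/h) for which a transfer function is computed

# One transfer function per event at or above the threshold, keyed by time step
tfs = {
    t: transfer_function(precip, t, horizon=5)
    for t, p in enumerate(precip)
    if p >= threshold
}
for t, tf in tfs.items():
    print(t, [round(v, 3) for v in tf])
```

Because the toy reservoir is linear, every event yields the same response shape; with a nonlinear model such as the threebox model, the transfer function generally differs from one event to the next, which is why the simulator is rerun for each event.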

4. Visualizing the data#

You can visualize the simulated data (and the computed transfer functions) using the notebook visualization_data.ipynb located in the folder experiments/preprocessing_simulated_data/4_visualization_data.

B. Real data#

A complete example of a data preprocessing pipeline for real data is provided in the notebook data_fow_hourly.ipynb located in the folder experiments/preprocessing_real_data.

Screenshot of an example of simulated data.