# How many zeros is too many for reliable streamflow predictions?

**Contributed by Mark Thyer, David McInerney and Dmitri Kavetski, University of Adelaide.**

Ephemeral catchments, where there are days with zero flow, are common in many parts of the world, particularly in areas with a highly variable climate such as Australia (see Figure 1). Recent research has established how the number of days with zero flow affects the reliability of probabilistic streamflow predictions in ephemeral catchments (McInerney et al., 2019). When there are days with zero flow, producing reliable probabilistic predictions is challenging. Previous studies used specialised statistical approaches to explicitly treat zero flows but, as yet, no study has provided practical guidance on the types of catchments where these specialised approaches are warranted.

How frequent should zero flows be in a catchment before it benefits from specialised treatment of zero flows?

Essentially, the hydrological modeller has two options for treating zero flows in calibration:

**Option 1: Pragmatic approach**

The “pragmatic approach” treats days with zero flow in the same way as all other days when the hydrological model is calibrated to a catchment of interest. This option is widely used in hydrology, because it is relatively easy to implement and simplified methods are available to produce probabilistic predictions (see McInerney et al., 2018).

**Option 2: Explicit approach**

The “explicit approach” uses specialised statistical models (e.g. “censored” or “mixed discrete/continuous” distributions) that explicitly recognize that observed flows can only be zero or positive. This option takes more effort to implement, due to the extra complexity of the probability model.
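As a minimal sketch of what a censored likelihood can look like (an illustration under stated assumptions, not the implementation used in McInerney et al., 2019), the Python below assumes Gaussian residuals in Box-Cox space with the transformation parameter fixed at 0.2 and no offset. The function names and the `sigma` parameter are hypothetical. Days with positive observed flow contribute a density term, while days with zero observed flow contribute the probability of the flow being at or below the transformed zero.

```python
import math

def box_cox(q, lam=0.2):
    """Box-Cox transform of a non-negative flow (assumes lam > 0, no offset)."""
    return (q ** lam - 1.0) / lam

def norm_logpdf(x, mu, sigma):
    """Log density of a Gaussian with mean mu and standard deviation sigma."""
    z = (x - mu) / sigma
    return -0.5 * z * z - math.log(sigma) - 0.5 * math.log(2.0 * math.pi)

def norm_cdf(x, mu, sigma):
    """Cumulative probability of a Gaussian, via the error function."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def censored_loglik(obs, sim, sigma, lam=0.2):
    """Log-likelihood with observed zero flows treated as censored values.

    obs, sim: sequences of observed and simulated flows (non-negative).
    sigma: residual standard deviation in transformed space (assumed constant).
    """
    z_zero = box_cox(0.0, lam)  # transformed value of zero flow
    total = 0.0
    for q_obs, q_sim in zip(obs, sim):
        mu = box_cox(q_sim, lam)
        if q_obs <= 0.0:
            # Zero-flow day: probability mass at or below the censoring point.
            # The floor avoids log(0) when that probability underflows.
            total += math.log(max(norm_cdf(z_zero, mu, sigma), 1e-300))
        else:
            # Positive-flow day: Gaussian density in transformed space plus
            # the log-Jacobian of the Box-Cox transform, (lam - 1) * log(q).
            total += norm_logpdf(box_cox(q_obs, lam), mu, sigma)
            total += (lam - 1.0) * math.log(q_obs)
    return total
```

Under the pragmatic approach, by contrast, zero-flow days would be scored with the same residual term as every other day, with no separate branch for zeros.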

**Which approach should a modeller choose?**

Our study (McInerney et al., 2019) compared the pragmatic and explicit approaches for treating zero flows across a range of ephemeral catchments in terms of the quality of the probabilistic streamflow predictions. The study used 74 Australian catchments with diverse climatic and hydrological conditions (see Figure 1) and with varying degrees of ephemerality, characterised by a proportion of zero flows ranging from close to 0% up to 90%.

**Our findings**

The key findings, supported by empirical results and statistical theory, are as follows:

**1.** For low-ephemeral catchments (approx. 0-5% zero flows) the pragmatic approach is recommended.

The pragmatic approach works reasonably well, providing similar predictions to the explicit approach (Figure 2).

**2.** For mid-ephemeral catchments (approx. 5-50% zero flows) the explicit approach is recommended.

The explicit approach produced more reliable predictions than the pragmatic approach. The pragmatic approach is more prone to unrealistically wide probability limits (Figure 3); this occurs more with some error models (e.g. log and log-sinh) than with others (BC0.2 and BC0.5).

**3.** For high-ephemeral catchments (>50% zero flows) neither approach is recommended.

As the proportion of zero flows approaches 50% or more, both the pragmatic and explicit approaches produce unreliable predictions, and even more complex censoring approaches are required. These more complex approaches take into account that both the observed flow and the simulated flow (from the hydrological model) can only be zero or positive.

**4.** The BC0.2 error model is recommended to produce reliable and precise probabilistic predictions for both low- and mid-ephemeral catchments, irrespective of the choice of pragmatic or explicit approach.

To produce reliable and precise (i.e. ‘sharp’) probabilistic predictions, a key choice is the data transformation used to handle common residual error characteristics such as heteroscedasticity and skewness (see McInerney et al., 2017). We compared a range of transformations: ‘log’, ‘log-sinh’, and Box-Cox with the transformation parameter fixed at 0.2 (‘BC0.2’) and at 0.5 (‘BC0.5’). We found that the BC0.2 error model was Pareto optimal over multiple performance metrics (reliability, precision, bias, CRPSS). It was preferred because it gave better reliability than the BC0.5 error model and substantially better precision (sharpness) than the log and log-sinh error models. The BC0.2 error model was also less sensitive to the choice of pragmatic or explicit approach.
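For readers less familiar with these transformations, here is a small sketch of the Box-Cox family, assuming the standard form z = (q^λ - 1)/λ with the log transform as the λ = 0 case. The optional `offset` argument and the function names are illustrative assumptions rather than the settings used in the study, and the log-sinh transformation is not shown.

```python
import math

def box_cox(q, lam, offset=0.0):
    """Standard Box-Cox transform of flow q; lam = 0 gives the log transform."""
    shifted = q + offset
    if lam == 0.0:
        # The log transform is undefined at zero flow unless offset > 0.
        return math.log(shifted)
    return (shifted ** lam - 1.0) / lam

def transformed_residual(q_obs, q_sim, lam, offset=0.0):
    """Residual between observed and simulated flow in transformed space."""
    return box_cox(q_obs, lam, offset) - box_cox(q_sim, lam, offset)

# The same raw error of one flow unit, at low flow and at high flow.
low = transformed_residual(2.0, 1.0, lam=0.2)
high = transformed_residual(101.0, 100.0, lam=0.2)
```

The same one-unit error is much smaller in transformed space at high flow than at low flow, which is how the transformation accounts for heteroscedasticity. Smaller values of λ compress high flows more strongly.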

**Practical Guidance**

Our findings provide hydrological modellers with practical guidance about when and why it is important to explicitly treat zero flows in calibration, and which probability models are best suited for quantifying hydrological uncertainty in ephemeral catchments.
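The guidance can be sketched as a small helper that computes the proportion of zero-flow days and maps it to the recommendation above. This is only an illustration: the function names and the `tol` argument are hypothetical, and the 5% and 50% boundaries are approximate in the study, so catchments near a boundary deserve closer inspection.

```python
def zero_flow_fraction(flows, tol=0.0):
    """Fraction of time steps with flow at or below tol (zero by default)."""
    if len(flows) == 0:
        raise ValueError("flows must contain at least one value")
    return sum(1 for q in flows if q <= tol) / len(flows)

def recommend_zero_flow_treatment(flows, tol=0.0):
    """Map the zero-flow fraction to the approach suggested by the findings.

    The thresholds are the approximate values quoted in the findings,
    not sharp cut-offs.
    """
    frac = zero_flow_fraction(flows, tol)
    if frac <= 0.05:
        return frac, "pragmatic"  # low-ephemeral: approx. 0-5% zero flows
    if frac <= 0.50:
        return frac, "explicit"   # mid-ephemeral: approx. 5-50% zero flows
    return frac, "neither"        # high-ephemeral: more than 50% zero flows

# Example: a made-up series with 20 zero-flow days out of 100.
example_flows = [0.0] * 20 + [1.5] * 80
fraction, approach = recommend_zero_flow_treatment(example_flows)
```

In either of the first two cases, the findings suggest pairing the chosen approach with the BC0.2 error model.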

**Further Information**

McInerney, D., M. Thyer, D. Kavetski, J. Lerat, and G. Kuczera (2017), Improving probabilistic prediction of daily streamflow by identifying Pareto optimal approaches for modeling heteroscedastic residual errors, Water Resources Research, 53(3), 2199-2239, https://doi.org/10.1002/2016WR019168

McInerney, D., M. Thyer, D. Kavetski, B. Bennett, J. Lerat, M. Gibbs, and G. Kuczera (2018), A simplified approach to produce probabilistic hydrological model predictions, Environmental Modelling & Software, 109, 306-314, https://doi.org/10.1016/j.envsoft.2018.07.001

McInerney, D., D. Kavetski, M. Thyer, J. Lerat, and G. Kuczera (2019), Benefits of explicit treatment of zero flows in probabilistic hydrological modeling of ephemeral catchments, Water Resources Research, 55(12), 11035-11060, https://doi.org/10.1029/2018WR024148

## 2 thoughts on “How many zeros is too many for reliable streamflow predictions?”

Thanks Mark, David and Dmitri for such an interesting blog and useful references!

I wish this blog (and references) had already existed in 2015… At that time, I wanted to try and include the Hugh River basin in a project. I’m pretty sure it is the red dot right in the middle of the map in Figure 1. All the data was super easy to obtain, but there were so many zeros! I couldn’t calibrate GR4J properly so I just gave up on this basin. Too little information in the streamflow series.

Next time I want to include an arid catchment in a project, I’ll know where to start my literature review!

Nice work! I just wanted to add that it gets even worse when one tries to estimate extremes in e.g. return-period analyses, as commonly done and used in forecasting. We did a study a few years ago in West Africa on this topic (https://doi.org/10.1016/j.pce.2017.02.010). We did not find a good solution yet, so perhaps this can help a bit.