Add support for the exponential univariate dist.

The exponential distribution is now supported as a `UnivDist` instance. The test suite and the docs are accordingly updated. This commit should resolve Issue #266.
damar-wicaksono · Dec 15, 2023 · b457ca0 · b457ca0
1 parent 4c77438
commit b457ca0
Show file tree

Hide file tree

Showing 10 changed files with 472 additions and 12 deletions.
diff --git a/CHANGELOG.md b/CHANGELOG.md
@@ -12,6 +12,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 - The three-dimensional simple portfolio model from Saltelli et al. (2004).
 - The 20-dimensional polynomial test function from Alemazkoor
   and Meidani (2018).
+- The exponential distribution as a distribution of `UnivDist` instances.
 
 ## [0.4.1] - 2023-10-27
 

diff --git a/docs/_toc.yml b/docs/_toc.yml
@@ -136,6 +136,8 @@ parts:
         sections:
           - file: prob-input/univariate-distributions/beta
             title: Beta
+          - file: prob-input/univariate-distributions/exponential
+            title: Exponential
           - file: prob-input/univariate-distributions/gumbel
             title: Gumbel (max.)
           - file: prob-input/univariate-distributions/logitnormal

diff --git a/docs/prob-input/available.md b/docs/prob-input/available.md
@@ -5,14 +5,15 @@ The table below lists all the available univariate distribution types used
 to construct ``UnivDist`` instances. ``UnivDist`` are used to represent the
 univariate marginals of a (possibly, multivariate) probabilistic input model.
 
-|                                         Name                                          |     Keyword      |                     Notation                      |             Support              | Number of parameters |
-|:-------------------------------------------------------------------------------------:|:----------------:|:-------------------------------------------------:|:--------------------------------:|:--------------------:|
-|                {ref}`Beta <prob-input:univariate-distributions:beta>`                 |     `"beta"`     |       $\mathrm{Beta}(\alpha, \beta, a, b)$        | $[a, b], \; a, b \in \mathbb{R}$ |          4           |
-|           {ref}`Gumbel (max.) <prob-input:univariate-distributions:gumbel>`           |    `"gumbel"`    |           $\mathrm{Gumbel}(\mu, \beta)$           |       $(-\infty, \infty)$        |          2           |
-|         {ref}`Logit-Normal <prob-input:univariate-distributions:logitnormal>`         | `"logitnormal"`  |    $\mathcal{N}_{\mathrm{logit}}(\mu, \sigma)$    |             $(0, 1)$             |          2           |
-|           {ref}`Log-Normal <prob-input:univariate-distributions:lognormal>`           |  `"lognormal"`   |    $\mathcal{N}_{\mathrm{log}} (\mu, \sigma)$     |          $(0, \infty)$           |          2           |
-|         {ref}`Normal (Gaussian) <prob-input:univariate-distributions:normal>`         |    `"normal"`    |            $\mathcal{N}(\mu, \sigma)$             |       $(-\infty, \infty)$        |          2           |
-|          {ref}`Triangular <prob-input:univariate-distributions:triangular>`           |  `"triangular"`  |             $\mathcal{T}_r(a, b, c)$              | $[a, b], \; a, b \in \mathbb{R}$ |          3           |
-|   {ref}`Truncated Gumbel (max.) <prob-input:univariate-distributions:trunc-gumbel>`   | `"trunc-gumbel"` | $\mathrm{Gumbel}_{\mathrm{Tr}}(\mu, \beta, a, b)$ | $[a, b], \; a, b \in \mathbb{R}$ |          4           |
-| {ref}`Truncated Normal (Gaussian) <prob-input:univariate-distributions:trunc-normal>` | `"trunc-normal"` |  $\mathcal{N}_{\mathrm{Tr}}(\mu, \sigma, a, b)$   | $[a, b], \; a, b \in \mathbb{R}$ |          4           |
-|             {ref}`Uniform <prob-input:univariate-distributions:uniform>`              |   `"uniform"`    |                $\mathcal{U}(a, b)$                | $[a, b], \; a, b \in \mathbb{R}$ |          2           |
+|                                         Name                                          | Keyword value for `distribution` |                     Notation                      |             Support              | Number of parameters |
+|:-------------------------------------------------------------------------------------:|:--------------------------------:|:-------------------------------------------------:|:--------------------------------:|:--------------------:|
+|                {ref}`Beta <prob-input:univariate-distributions:beta>`                 |             `"beta"`             |       $\mathrm{Beta}(\alpha, \beta, a, b)$        | $[a, b], \; a, b \in \mathbb{R}$ |          4           |
+|         {ref}`Exponential <prob-input:univariate-distributions:exponential>`          |         `"exponential"`          |              $\mathcal{E}(\lambda)$               |          $[0, \infty)$           |          1           |
+|           {ref}`Gumbel (max.) <prob-input:univariate-distributions:gumbel>`           |            `"gumbel"`            |           $\mathrm{Gumbel}(\mu, \beta)$           |       $(-\infty, \infty)$        |          2           |
+|         {ref}`Logit-Normal <prob-input:univariate-distributions:logitnormal>`         |         `"logitnormal"`          |    $\mathcal{N}_{\mathrm{logit}}(\mu, \sigma)$    |             $(0, 1)$             |          2           |
+|           {ref}`Log-Normal <prob-input:univariate-distributions:lognormal>`           |          `"lognormal"`           |    $\mathcal{N}_{\mathrm{log}} (\mu, \sigma)$     |          $(0, \infty)$           |          2           |
+|         {ref}`Normal (Gaussian) <prob-input:univariate-distributions:normal>`         |            `"normal"`            |            $\mathcal{N}(\mu, \sigma)$             |       $(-\infty, \infty)$        |          2           |
+|          {ref}`Triangular <prob-input:univariate-distributions:triangular>`           |          `"triangular"`          |             $\mathcal{T}_r(a, b, c)$              | $[a, b], \; a, b \in \mathbb{R}$ |          3           |
+|   {ref}`Truncated Gumbel (max.) <prob-input:univariate-distributions:trunc-gumbel>`   |         `"trunc-gumbel"`         | $\mathrm{Gumbel}_{\mathrm{Tr}}(\mu, \beta, a, b)$ | $[a, b], \; a, b \in \mathbb{R}$ |          4           |
+| {ref}`Truncated Normal (Gaussian) <prob-input:univariate-distributions:trunc-normal>` |         `"trunc-normal"`         |  $\mathcal{N}_{\mathrm{Tr}}(\mu, \sigma, a, b)$   | $[a, b], \; a, b \in \mathbb{R}$ |          4           |
+|             {ref}`Uniform <prob-input:univariate-distributions:uniform>`              |           `"uniform"`            |                $\mathcal{U}(a, b)$                | $[a, b], \; a, b \in \mathbb{R}$ |          2           |
diff --git a/docs/prob-input/univariate-distributions/exponential.md b/docs/prob-input/univariate-distributions/exponential.md
@@ -0,0 +1,116 @@
+---
+jupytext:
+  formats: ipynb,md:myst
+  text_representation:
+    extension: .md
+    format_name: myst
+    format_version: 0.13
+    jupytext_version: 1.14.1
+kernelspec:
+  display_name: Python 3 (ipykernel)
+  language: python
+  name: python3
+---
+
+```{code-cell} ipython3
+:tags: [remove-cell]
+
+import uqtestfuns as uqtf
+import matplotlib.pyplot as plt
+import numpy as np
+```
+
+(prob-input:univariate-distributions:exponential)=
+# Exponential Distribution
+
+The exponential distribution is a single-parameter continuous probability
+distribution.
+
+The table below summarizes some important aspects of the distribution.
+
+|                      |                                                             |
+|---------------------:|-------------------------------------------------------------|
+|         **Notation** | $X \sim \mathcal{E}(\lambda)$                               |
+|       **Parameters** | $\lambda \in \mathbb{R}_{>0}$ (rate parameter)              |
+|  **{term}`Support`** | $\mathcal{D}_X = [0.0, \infty)$                             |
+|      **{term}`PDF`** | $f_X (x; \lambda) = \lambda e^{-\lambda x}$                 |
+|      **{term}`CDF`** | $F_X (x; \lambda) = 1 - e^{-\lambda x}$                     |
+|     **{term}`ICDF`** | $F^{-1}_X (x; \lambda) = - \frac{1}{\lambda} \ln{(1 - x)}$  |
+
+The plots of probability density functions (PDFs),
+sample histogram (of $5'000$ points),
+cumulative distribution functions (CDFs),
+and inverse cumulative distribution functions (ICDFs) for different parameter
+values are shown below.
+
+```{code-cell} ipython3
+:tags: [remove-input]
+
+import numpy as np
+import matplotlib.pyplot as plt
+import uqtestfuns as uqtf
+
+parameters = [[2.0], [1.5], [1.0], [0.5]]
+colors = ["#a6cee3", "#1f78b4", "#b2df8a", "#33a02c"]
+
+univ_dists = []
+for parameter in parameters:
+    univ_dists.append(uqtf.UnivDist(distribution="exponential", parameters=parameter))
+
+fig, axs = plt.subplots(2, 2, figsize=(10,10))
+
+# --- PDF
+xx = np.linspace(0, 5, 1000)
+for i, univ_dist in enumerate(univ_dists):
+    axs[0, 0].plot(
+        xx,
+        univ_dist.pdf(xx),
+        color=colors[i],
+        label=f"$\\lambda = {univ_dist.parameters[0]}$",
+        linewidth=2,
+    )
+axs[0, 0].legend();
+axs[0, 0].grid();
+axs[0, 0].set_title("PDF");
+
+# --- Sample histogram
+sample_size = 5000
+np.random.seed(42)
+for col, univ_dist in zip(reversed(colors), reversed(univ_dists)):
+    axs[0, 1].hist(
+        univ_dist.get_sample(sample_size),
+        color=col,
+        bins="auto",
+        alpha=0.75
+    )
+axs[0, 1].grid();
+axs[0, 1].set_xlim([0, 5]);
+axs[0, 1].set_title("Sample histogram");
+
+# --- CDF
+xx = np.linspace(0, 5, 1000)
+for i, univ_dist in enumerate(univ_dists):
+    axs[1, 0].plot(
+        xx,
+        univ_dist.cdf(xx),
+        color=colors[i],
+        linewidth=2,
+    )
+axs[1, 0].grid();
+axs[1, 0].set_title("CDF");
+
+# --- Inverse CDF
+xx = np.linspace(0, 1, 1000)
+for i, univ_dist in enumerate(univ_dists):
+    axs[1, 1].plot(
+        xx,
+        univ_dist.icdf(xx),
+        color=colors[i],
+        linewidth=2
+    )
+axs[1, 1].grid();
+axs[1, 1].set_ylim([0, 6]);
+axs[1, 1].set_title("Inverse CDF");
+
+plt.gcf().set_dpi(150)
+```
diff --git a/src/uqtestfuns/core/prob_input/univariate_distributions/exponential.py b/src/uqtestfuns/core/prob_input/univariate_distributions/exponential.py
@@ -0,0 +1,224 @@
+"""
+Module with routines involving the exponential probability distribution.
+
+The exponential distribution in UQTestFuns is parametrized by a single
+parameter: the rate parametere.
+
+The underlying implementation is based on the implementation from scipy.stats.
+In the SciPy convention, the rate parameter corresponds to the reciprocal
+of the ``scale`` parameter.
+"""
+import numpy as np
+from scipy.stats import expon
+
+from .utils import verify_param_nums, postprocess_icdf
+from ....global_settings import ARRAY_FLOAT
+
+DISTRIBUTION_NAME = "exponential"
+
+NUM_PARAMS = 1
+
+
+def verify_parameters(parameters: ARRAY_FLOAT) -> None:
+    """Verify the parameters of an exponential distribution.
+
+    Parameters
+    ----------
+    parameters : ARRAY_FLOAT
+        The parameter of an exponential distribution
+        (i.e., the rate parameter).
+
+    Returns
+    ------
+    None
+        The function exits without any return value when nothing is wrong.
+
+    Raises
+    ------
+    ValueError
+        If any of the parameter values are invalid
+        or the shapes are inconsistent.
+    """
+    # Verify overall shape
+    verify_param_nums(parameters.size, NUM_PARAMS, DISTRIBUTION_NAME)
+
+    if parameters[0] <= 0.0:
+        raise ValueError(
+            f"The corresponding rate parameter {parameters[0]}"
+            f"must be larger than 0.0!"
+        )
+
+
+def lower(parameters: ARRAY_FLOAT) -> float:
+    """Get the lower bound of an exponential distribution.
+
+    Parameters
+    ----------
+    parameters : ARRAY_FLOAT
+        The parameter of an exponential distribution.
+
+    Returns
+    -------
+    float
+        The lower bound of the exponential distribution.
+
+    Notes
+    -----
+    - The parameters are not used in determining the lower bound of
+      the distribution; it must, however, appear for interface consistency.
+      The lower bound of a lognormal distribution is finite; it is 0.0.
+    """
+    lower_bound = 0.0
+
+    return lower_bound
+
+
+def upper(parameters: ARRAY_FLOAT) -> float:
+    """Get the upper bound of an exponential distribution.
+
+    Parameters
+    ----------
+    parameters : ARRAY_FLOAT
+        The parameters of an exponential distribution
+
+    Returns
+    -------
+    float
+        The upper bound of the exponential distribution.
+
+    Notes
+    -----
+    - Strictly speaking, an exponential distribution is unbounded on the right.
+      However, for numerical reason an upper bound is set.
+    - The upper bound of the exponential distribution is chosen such that
+      tbe probability mass between the lower and upper bound is at least
+      1 - 1e-15.
+    """
+    # 36.7368005696771 is the quantile values with probability of 1-1e-16
+    # for the exponential distribution with rate parameter value 1.0
+    upper_bound = float(36.7368005696771 / parameters[0])
+
+    return upper_bound
+
+
+def pdf(
+    xx: ARRAY_FLOAT,
+    parameters: ARRAY_FLOAT,
+    lower_bound: float,
+    upper_bound: float,
+) -> ARRAY_FLOAT:
+    """Get the PDF values of an exponential distribution.
+
+    Parameters
+    ----------
+    xx : ARRAY_FLOAT
+        Sample values (realizations) of an exponential distribution.
+    parameters : ARRAY_FLOAT
+        Parameters of the exponential distribution.
+    lower_bound : float
+        Lower bound of the exponential distribution.
+    upper_bound : float
+        Upper bound of the exponential distribution.
+
+    Returns
+    -------
+    np.ndarray
+        PDF values of the exponential distribution on the sample values.
+
+    Notes
+    -----
+    - The PDF for sample with values outside the bounds are set to 0.0.
+    """
+    yy = np.zeros(xx.shape)
+    idx = np.logical_and(xx >= lower_bound, xx <= upper_bound)
+    rate = parameters[0]
+    scale = 1 / rate
+    yy[idx] = expon.pdf(xx[idx], scale=scale)
+
+    return yy
+
+
+def cdf(
+    xx: ARRAY_FLOAT,
+    parameters: ARRAY_FLOAT,
+    lower_bound: float,
+    upper_bound: float,
+) -> ARRAY_FLOAT:
+    """Get the CDF values of an exponential distribution.
+
+    Parameters
+    ----------
+    xx : ARRAY_FLOAT
+        Sample values (realizations) of an exponential distribution.
+    parameters : ARRAY_FLOAT
+        Parameters of the exponential distribution.
+    lower_bound : float
+        Lower bound of the exponential distribution.
+    upper_bound : float
+        Upper bound of the exponential distribution.
+
+    Returns
+    -------
+    ARRAY_FLOAT
+        CDF values of the exponential distribution on the sample values.
+
+    Notes
+    -----
+    - The CDF for sample with values smaller (resp. larger)
+      than the lower bound (resp. upper bound) are set to 0.0 (resp. 1.0).
+    """
+    yy = np.empty(xx.shape)
+    idx_lower = xx < lower_bound
+    idx_upper = xx > upper_bound
+    idx_rest = np.logical_and(
+        np.logical_not(idx_lower), np.logical_not(idx_upper)
+    )
+
+    yy[idx_lower] = 0.0
+    yy[idx_upper] = 1.0
+    rate = parameters[0]
+    scale = 1 / rate
+    yy[idx_rest] = expon.cdf(xx[idx_rest], scale=scale)
+
+    return yy
+
+
+def icdf(
+    xx: ARRAY_FLOAT,
+    parameters: ARRAY_FLOAT,
+    lower_bound: float,
+    upper_bound: float,
+) -> ARRAY_FLOAT:
+    """Get the inverse CDF values of an exponential distribution.
+
+    Parameters
+    ----------
+    xx : ARRAY_FLOAT
+        Sample values (realizations) in the [0, 1] domain.
+    parameters : ARRAY_FLOAT
+        Parameters of an exponential distribution.
+    lower_bound : float
+        Lower bound of the exponential distribution.
+    upper_bound : float
+        Upper bound of the exponential distribution.
+
+    Returns
+    -------
+    ARRAY_FLOAT
+        Transformed values in the domain of the normal distribution.
+
+    Notes
+    -----
+    - ICDF for sample values outside [0.0, 1.0] is set to NaN according to
+      the underlying SciPy implementation.
+    """
+
+    # Compute the ICDF
+    rate = parameters[0]
+    scale = 1 / rate
+    yy = expon.ppf(xx, scale=scale)
+
+    # Check if values are within the set bounds
+    yy = postprocess_icdf(yy, lower_bound, upper_bound)
+
+    return yy
diff --git a/src/uqtestfuns/core/prob_input/univariate_distributions/normal.py b/src/uqtestfuns/core/prob_input/univariate_distributions/normal.py
@@ -68,7 +68,7 @@ def lower(parameters: ARRAY_FLOAT) -> float:
       However, for numerical reason a lower bound is set.
     - The lower bound of the normal distribution is chosen such that
       the probability mass between the lower and upper bound is at least
-      1 - 1e.15.
+      1 - 1e-15.
     """
     # -8.222082216130435 is the quantile values with probability of 1e-16
     # for the standard Normal distribution (mu = 0.0, sigma = 1.0)