Skip to content

Commit

Permalink
Add support for the exponential univariate dist.
Browse files Browse the repository at this point in the history
The exponential distribution is now supported as
a `UnivDist` instance. The test suite and the docs
are accordingly updated.

This commit should resolve Issue #266.
  • Loading branch information
damar-wicaksono committed Dec 15, 2023
1 parent 4c77438 commit b457ca0
Show file tree
Hide file tree
Showing 10 changed files with 472 additions and 12 deletions.
1 change: 1 addition & 0 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -12,6 +12,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
- The three-dimensional simple portfolio model from Saltelli et al. (2004).
- The 20-dimensional polynomial test function from Alemazkoor
and Meidani (2018).
- The exponential distribution as a distribution of `UnivDist` instances.

## [0.4.1] - 2023-10-27

Expand Down
2 changes: 2 additions & 0 deletions docs/_toc.yml
Original file line number Diff line number Diff line change
Expand Up @@ -136,6 +136,8 @@ parts:
sections:
- file: prob-input/univariate-distributions/beta
title: Beta
- file: prob-input/univariate-distributions/exponential
title: Exponential
- file: prob-input/univariate-distributions/gumbel
title: Gumbel (max.)
- file: prob-input/univariate-distributions/logitnormal
Expand Down
23 changes: 12 additions & 11 deletions docs/prob-input/available.md
Original file line number Diff line number Diff line change
Expand Up @@ -5,14 +5,15 @@ The table below lists all the available univariate distribution types used
to construct ``UnivDist`` instances. ``UnivDist`` are used to represent the
univariate marginals of a (possibly, multivariate) probabilistic input model.

| Name | Keyword | Notation | Support | Number of parameters |
|:-------------------------------------------------------------------------------------:|:----------------:|:-------------------------------------------------:|:--------------------------------:|:--------------------:|
| {ref}`Beta <prob-input:univariate-distributions:beta>` | `"beta"` | $\mathrm{Beta}(\alpha, \beta, a, b)$ | $[a, b], \; a, b \in \mathbb{R}$ | 4 |
| {ref}`Gumbel (max.) <prob-input:univariate-distributions:gumbel>` | `"gumbel"` | $\mathrm{Gumbel}(\mu, \beta)$ | $(-\infty, \infty)$ | 2 |
| {ref}`Logit-Normal <prob-input:univariate-distributions:logitnormal>` | `"logitnormal"` | $\mathcal{N}_{\mathrm{logit}}(\mu, \sigma)$ | $(0, 1)$ | 2 |
| {ref}`Log-Normal <prob-input:univariate-distributions:lognormal>` | `"lognormal"` | $\mathcal{N}_{\mathrm{log}} (\mu, \sigma)$ | $(0, \infty)$ | 2 |
| {ref}`Normal (Gaussian) <prob-input:univariate-distributions:normal>` | `"normal"` | $\mathcal{N}(\mu, \sigma)$ | $(-\infty, \infty)$ | 2 |
| {ref}`Triangular <prob-input:univariate-distributions:triangular>` | `"triangular"` | $\mathcal{T}_r(a, b, c)$ | $[a, b], \; a, b \in \mathbb{R}$ | 3 |
| {ref}`Truncated Gumbel (max.) <prob-input:univariate-distributions:trunc-gumbel>` | `"trunc-gumbel"` | $\mathrm{Gumbel}_{\mathrm{Tr}}(\mu, \beta, a, b)$ | $[a, b], \; a, b \in \mathbb{R}$ | 4 |
| {ref}`Truncated Normal (Gaussian) <prob-input:univariate-distributions:trunc-normal>` | `"trunc-normal"` | $\mathcal{N}_{\mathrm{Tr}}(\mu, \sigma, a, b)$ | $[a, b], \; a, b \in \mathbb{R}$ | 4 |
| {ref}`Uniform <prob-input:univariate-distributions:uniform>` | `"uniform"` | $\mathcal{U}(a, b)$ | $[a, b], \; a, b \in \mathbb{R}$ | 2 |
| Name | Keyword value for `distribution` | Notation | Support | Number of parameters |
|:-------------------------------------------------------------------------------------:|:--------------------------------:|:-------------------------------------------------:|:--------------------------------:|:--------------------:|
| {ref}`Beta <prob-input:univariate-distributions:beta>` | `"beta"` | $\mathrm{Beta}(\alpha, \beta, a, b)$ | $[a, b], \; a, b \in \mathbb{R}$ | 4 |
| {ref}`Exponential <prob-input:univariate-distributions:exponential>` | `"exponential"` | $\mathcal{E}(\lambda)$ | $[0, \infty)$ | 1 |
| {ref}`Gumbel (max.) <prob-input:univariate-distributions:gumbel>` | `"gumbel"` | $\mathrm{Gumbel}(\mu, \beta)$ | $(-\infty, \infty)$ | 2 |
| {ref}`Logit-Normal <prob-input:univariate-distributions:logitnormal>` | `"logitnormal"` | $\mathcal{N}_{\mathrm{logit}}(\mu, \sigma)$ | $(0, 1)$ | 2 |
| {ref}`Log-Normal <prob-input:univariate-distributions:lognormal>` | `"lognormal"` | $\mathcal{N}_{\mathrm{log}} (\mu, \sigma)$ | $(0, \infty)$ | 2 |
| {ref}`Normal (Gaussian) <prob-input:univariate-distributions:normal>` | `"normal"` | $\mathcal{N}(\mu, \sigma)$ | $(-\infty, \infty)$ | 2 |
| {ref}`Triangular <prob-input:univariate-distributions:triangular>` | `"triangular"` | $\mathcal{T}_r(a, b, c)$ | $[a, b], \; a, b \in \mathbb{R}$ | 3 |
| {ref}`Truncated Gumbel (max.) <prob-input:univariate-distributions:trunc-gumbel>` | `"trunc-gumbel"` | $\mathrm{Gumbel}_{\mathrm{Tr}}(\mu, \beta, a, b)$ | $[a, b], \; a, b \in \mathbb{R}$ | 4 |
| {ref}`Truncated Normal (Gaussian) <prob-input:univariate-distributions:trunc-normal>` | `"trunc-normal"` | $\mathcal{N}_{\mathrm{Tr}}(\mu, \sigma, a, b)$ | $[a, b], \; a, b \in \mathbb{R}$ | 4 |
| {ref}`Uniform <prob-input:univariate-distributions:uniform>` | `"uniform"` | $\mathcal{U}(a, b)$ | $[a, b], \; a, b \in \mathbb{R}$ | 2 |
116 changes: 116 additions & 0 deletions docs/prob-input/univariate-distributions/exponential.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,116 @@
---
jupytext:
formats: ipynb,md:myst
text_representation:
extension: .md
format_name: myst
format_version: 0.13
jupytext_version: 1.14.1
kernelspec:
display_name: Python 3 (ipykernel)
language: python
name: python3
---

```{code-cell} ipython3
:tags: [remove-cell]
import uqtestfuns as uqtf
import matplotlib.pyplot as plt
import numpy as np
```

(prob-input:univariate-distributions:exponential)=
# Exponential Distribution

The exponential distribution is a single-parameter continuous probability
distribution.

The table below summarizes some important aspects of the distribution.

| | |
|---------------------:|-------------------------------------------------------------|
| **Notation** | $X \sim \mathcal{E}(\lambda)$ |
| **Parameters** | $\lambda \in \mathbb{R}_{>0}$ (rate parameter) |
| **{term}`Support`** | $\mathcal{D}_X = [0.0, \infty)$ |
| **{term}`PDF`** | $f_X (x; \lambda) = \lambda e^{-\lambda x}$ |
| **{term}`CDF`** | $F_X (x; \lambda) = 1 - e^{-\lambda x}$ |
| **{term}`ICDF`** | $F^{-1}_X (x; \lambda) = - \frac{1}{\lambda} \ln{(1 - x)}$ |

The plots of probability density functions (PDFs),
sample histogram (of $5'000$ points),
cumulative distribution functions (CDFs),
and inverse cumulative distribution functions (ICDFs) for different parameter
values are shown below.

```{code-cell} ipython3
:tags: [remove-input]
import numpy as np
import matplotlib.pyplot as plt
import uqtestfuns as uqtf
parameters = [[2.0], [1.5], [1.0], [0.5]]
colors = ["#a6cee3", "#1f78b4", "#b2df8a", "#33a02c"]
univ_dists = []
for parameter in parameters:
univ_dists.append(uqtf.UnivDist(distribution="exponential", parameters=parameter))
fig, axs = plt.subplots(2, 2, figsize=(10,10))
# --- PDF
xx = np.linspace(0, 5, 1000)
for i, univ_dist in enumerate(univ_dists):
axs[0, 0].plot(
xx,
univ_dist.pdf(xx),
color=colors[i],
label=f"$\\lambda = {univ_dist.parameters[0]}$",
linewidth=2,
)
axs[0, 0].legend();
axs[0, 0].grid();
axs[0, 0].set_title("PDF");
# --- Sample histogram
sample_size = 5000
np.random.seed(42)
for col, univ_dist in zip(reversed(colors), reversed(univ_dists)):
axs[0, 1].hist(
univ_dist.get_sample(sample_size),
color=col,
bins="auto",
alpha=0.75
)
axs[0, 1].grid();
axs[0, 1].set_xlim([0, 5]);
axs[0, 1].set_title("Sample histogram");
# --- CDF
xx = np.linspace(0, 5, 1000)
for i, univ_dist in enumerate(univ_dists):
axs[1, 0].plot(
xx,
univ_dist.cdf(xx),
color=colors[i],
linewidth=2,
)
axs[1, 0].grid();
axs[1, 0].set_title("CDF");
# --- Inverse CDF
xx = np.linspace(0, 1, 1000)
for i, univ_dist in enumerate(univ_dists):
axs[1, 1].plot(
xx,
univ_dist.icdf(xx),
color=colors[i],
linewidth=2
)
axs[1, 1].grid();
axs[1, 1].set_ylim([0, 6]);
axs[1, 1].set_title("Inverse CDF");
plt.gcf().set_dpi(150)
```
224 changes: 224 additions & 0 deletions src/uqtestfuns/core/prob_input/univariate_distributions/exponential.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,224 @@
"""
Module with routines involving the exponential probability distribution.
The exponential distribution in UQTestFuns is parametrized by a single
parameter: the rate parametere.
The underlying implementation is based on the implementation from scipy.stats.
In the SciPy convention, the rate parameter corresponds to the reciprocal
of the ``scale`` parameter.
"""
import numpy as np
from scipy.stats import expon

from .utils import verify_param_nums, postprocess_icdf
from ....global_settings import ARRAY_FLOAT

DISTRIBUTION_NAME = "exponential"

NUM_PARAMS = 1


def verify_parameters(parameters: ARRAY_FLOAT) -> None:
"""Verify the parameters of an exponential distribution.
Parameters
----------
parameters : ARRAY_FLOAT
The parameter of an exponential distribution
(i.e., the rate parameter).
Returns
------
None
The function exits without any return value when nothing is wrong.
Raises
------
ValueError
If any of the parameter values are invalid
or the shapes are inconsistent.
"""
# Verify overall shape
verify_param_nums(parameters.size, NUM_PARAMS, DISTRIBUTION_NAME)

if parameters[0] <= 0.0:
raise ValueError(
f"The corresponding rate parameter {parameters[0]}"
f"must be larger than 0.0!"
)


def lower(parameters: ARRAY_FLOAT) -> float:
"""Get the lower bound of an exponential distribution.
Parameters
----------
parameters : ARRAY_FLOAT
The parameter of an exponential distribution.
Returns
-------
float
The lower bound of the exponential distribution.
Notes
-----
- The parameters are not used in determining the lower bound of
the distribution; it must, however, appear for interface consistency.
The lower bound of a lognormal distribution is finite; it is 0.0.
"""
lower_bound = 0.0

return lower_bound


def upper(parameters: ARRAY_FLOAT) -> float:
"""Get the upper bound of an exponential distribution.
Parameters
----------
parameters : ARRAY_FLOAT
The parameters of an exponential distribution
Returns
-------
float
The upper bound of the exponential distribution.
Notes
-----
- Strictly speaking, an exponential distribution is unbounded on the right.
However, for numerical reason an upper bound is set.
- The upper bound of the exponential distribution is chosen such that
tbe probability mass between the lower and upper bound is at least
1 - 1e-15.
"""
# 36.7368005696771 is the quantile values with probability of 1-1e-16
# for the exponential distribution with rate parameter value 1.0
upper_bound = float(36.7368005696771 / parameters[0])

return upper_bound


def pdf(
xx: ARRAY_FLOAT,
parameters: ARRAY_FLOAT,
lower_bound: float,
upper_bound: float,
) -> ARRAY_FLOAT:
"""Get the PDF values of an exponential distribution.
Parameters
----------
xx : ARRAY_FLOAT
Sample values (realizations) of an exponential distribution.
parameters : ARRAY_FLOAT
Parameters of the exponential distribution.
lower_bound : float
Lower bound of the exponential distribution.
upper_bound : float
Upper bound of the exponential distribution.
Returns
-------
np.ndarray
PDF values of the exponential distribution on the sample values.
Notes
-----
- The PDF for sample with values outside the bounds are set to 0.0.
"""
yy = np.zeros(xx.shape)
idx = np.logical_and(xx >= lower_bound, xx <= upper_bound)
rate = parameters[0]
scale = 1 / rate
yy[idx] = expon.pdf(xx[idx], scale=scale)

return yy


def cdf(
xx: ARRAY_FLOAT,
parameters: ARRAY_FLOAT,
lower_bound: float,
upper_bound: float,
) -> ARRAY_FLOAT:
"""Get the CDF values of an exponential distribution.
Parameters
----------
xx : ARRAY_FLOAT
Sample values (realizations) of an exponential distribution.
parameters : ARRAY_FLOAT
Parameters of the exponential distribution.
lower_bound : float
Lower bound of the exponential distribution.
upper_bound : float
Upper bound of the exponential distribution.
Returns
-------
ARRAY_FLOAT
CDF values of the exponential distribution on the sample values.
Notes
-----
- The CDF for sample with values smaller (resp. larger)
than the lower bound (resp. upper bound) are set to 0.0 (resp. 1.0).
"""
yy = np.empty(xx.shape)
idx_lower = xx < lower_bound
idx_upper = xx > upper_bound
idx_rest = np.logical_and(
np.logical_not(idx_lower), np.logical_not(idx_upper)
)

yy[idx_lower] = 0.0
yy[idx_upper] = 1.0
rate = parameters[0]
scale = 1 / rate
yy[idx_rest] = expon.cdf(xx[idx_rest], scale=scale)

return yy


def icdf(
xx: ARRAY_FLOAT,
parameters: ARRAY_FLOAT,
lower_bound: float,
upper_bound: float,
) -> ARRAY_FLOAT:
"""Get the inverse CDF values of an exponential distribution.
Parameters
----------
xx : ARRAY_FLOAT
Sample values (realizations) in the [0, 1] domain.
parameters : ARRAY_FLOAT
Parameters of an exponential distribution.
lower_bound : float
Lower bound of the exponential distribution.
upper_bound : float
Upper bound of the exponential distribution.
Returns
-------
ARRAY_FLOAT
Transformed values in the domain of the normal distribution.
Notes
-----
- ICDF for sample values outside [0.0, 1.0] is set to NaN according to
the underlying SciPy implementation.
"""

# Compute the ICDF
rate = parameters[0]
scale = 1 / rate
yy = expon.ppf(xx, scale=scale)

# Check if values are within the set bounds
yy = postprocess_icdf(yy, lower_bound, upper_bound)

return yy
Original file line number Diff line number Diff line change
Expand Up @@ -68,7 +68,7 @@ def lower(parameters: ARRAY_FLOAT) -> float:
However, for numerical reason a lower bound is set.
- The lower bound of the normal distribution is chosen such that
the probability mass between the lower and upper bound is at least
1 - 1e.15.
1 - 1e-15.
"""
# -8.222082216130435 is the quantile values with probability of 1e-16
# for the standard Normal distribution (mu = 0.0, sigma = 1.0)
Expand Down
Loading

0 comments on commit b457ca0

Please sign in to comment.