Distributions¶

The distributions module contains classes to instantiate probability distributions, which describe the likelihood of either a parameter or a datapoint taking any given value. Distribution objects are used to represent both the predicted probability distribution of the data, and also the parameters’ posteriors and priors.

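Continuous Distributions¶

Deterministic
Normal
MultivariateNormal
StudentT
Cauchy
Gamma
InverseGamma
Dirichlet

Discrete Distributions¶

Bernoulli
Categorical
OneHotCategorical
Poisson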

Other¶

Mixture
HiddenMarkovModel

class probflow.distributions.Bernoulli(logits=None, probs=None)[source]¶

Bases: probflow.utils.base.BaseDistribution

The Bernoulli distribution.

The Bernoulli distribution is a discrete distribution defined over only two integers: 0 and 1. It has one parameter:

a probability parameter (\(0 \leq p \leq 1\)).

A random variable \(x\) drawn from a Bernoulli distribution

\[x \sim \text{Bernoulli}(p)\]

takes the value \(1\) with probability \(p\), and takes the value \(0\) with probability \(1-p\).

TODO: example image of the distribution

TODO: specifying either logits or probs

Parameters

logits (int, float, ndarray, or Tensor) – Logit-transformed probability parameter of the Bernoulli distribution (\(\log \frac{p}{1-p}\))
probs (int, float, ndarray, or Tensor) – Raw probability parameter of the Bernoulli distribution (\(p\))
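As a minimal sketch of how the logits and probs parameterizations relate, assuming logits are the log-odds of \(p\) (as the term "logit-transformed" suggests): the helper names below are illustrative only and are not part of probflow.

```python
import math


def logit(p):
    """Map a probability in (0, 1) to its logit (log-odds)."""
    return math.log(p / (1.0 - p))


def sigmoid(z):
    """Map a logit back to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def bernoulli_prob(x, p):
    """Probability mass of x in {0, 1} under Bernoulli(p)."""
    return p if x == 1 else 1.0 - p


p = 0.8
z = logit(p)          # the same distribution, expressed as a logit
p_back = sigmoid(z)   # recovers p up to floating-point error
```

Under this assumption, a logit of 0 corresponds to \(p = 0.5\), positive logits favor the value 1, and negative logits favor the value 0.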

cdf(y)¶: Cumulative probability of some data along this distribution

log_prob(y)¶: Compute the log probability of some data given this distribution

mean()¶

Compute the mean of this distribution

Note that this uses the mode of distributions for which the mean is undefined (for example, a categorical distribution)

mode()¶: Compute the mode of this distribution

prob(y)¶: Compute the probability of some data given this distribution

sample(n=1)¶: Generate a random sample from this distribution

class probflow.distributions.Categorical(logits=None, probs=None)[source]¶

Bases: probflow.utils.base.BaseDistribution

The Categorical distribution.

The Categorical distribution is a discrete distribution defined over \(N\) integers: 0 through \(N-1\). A random variable \(x\) drawn from a Categorical distribution

\[x \sim \text{Categorical}(\mathbf{\theta})\]

has probability

\[p(x=i) = \theta_i\]

where \(\theta_i\) is the \(i\)-th element of \(\mathbf{\theta}\).
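The following sketch illustrates the probability mass function in plain Python: each category's probability is simply its entry in the probability vector. The vector `theta` and the function name are hypothetical choices for illustration and do not come from probflow.

```python
import random

# Category probabilities for N = 3 categories; they must sum to 1.
theta = [0.2, 0.5, 0.3]


def categorical_prob(i, theta):
    """p(x = i) for a category index i in 0..N-1."""
    return theta[i]


# Draw samples and compare empirical frequencies with theta.
rng = random.Random(0)
samples = rng.choices(range(len(theta)), weights=theta, k=10_000)
freqs = [samples.count(i) / len(samples) for i in range(len(theta))]
```

With enough samples, the empirical frequencies in `freqs` should approach the entries of `theta`.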

TODO: example image of the distribution

TODO: logits vs probs

Parameters

logits (int, float, ndarray, or Tensor) – Logit-transformed category probabilities (\(\log \frac{\mathbf{\theta}}{1-\mathbf{\theta}}\))
probs (int, float, ndarray, or Tensor) – Raw category probabilities (\(\mathbf{\theta}\))

prob(y)[source]¶: Compute the probability of some data given this distribution. Note that this doesn’t broadcast correctly when logits/probs and y have the same dimensions.

cdf(y)¶: Cumulative probability of some data along this distribution

log_prob(y)[source]¶: Compute the log probability of some data given this distribution. Note that this doesn’t broadcast correctly when logits/probs and y have the same dimensions.

mean()¶

Compute the mean of this distribution

Note that this uses the mode of distributions for which the mean is undefined (for example, a categorical distribution)

mode()¶: Compute the mode of this distribution

sample(n=1)¶: Generate a random sample from this distribution

class probflow.distributions.Cauchy(loc=0, scale=1)[source]¶

Bases: probflow.utils.base.BaseDistribution

The Cauchy distribution.

The Cauchy distribution is a continuous distribution defined over all real numbers, and has two parameters:

a location parameter (loc or \(\mu\)) which determines the median of the distribution, and
a scale parameter (scale or \(\gamma > 0\)) which determines the spread of the distribution.

A random variable \(x\) drawn from a Cauchy distribution

\[x \sim \text{Cauchy}(\mu, \gamma)\]

has probability

\[p(x) = \frac{1}{\pi \gamma \left[ 1 + \left( \frac{x-\mu}{\gamma} \right)^2 \right]}\]

The Cauchy distribution is equivalent to a Student’s t-distribution with one degree of freedom.
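To make the roles of the two parameters concrete, here is a small sketch of the Cauchy density together with its standard closed-form cumulative distribution function. The function names are illustrative and are not probflow's API.

```python
import math


def cauchy_pdf(x, loc=0.0, scale=1.0):
    """Density of Cauchy(loc, scale) at x."""
    z = (x - loc) / scale
    return 1.0 / (math.pi * scale * (1.0 + z * z))


def cauchy_cdf(x, loc=0.0, scale=1.0):
    """Cumulative probability of Cauchy(loc, scale) up to x."""
    return 0.5 + math.atan((x - loc) / scale) / math.pi


loc, scale = 3.0, 2.0
median_prob = cauchy_cdf(loc, loc, scale)                # half the mass lies below loc
upper_quartile_prob = cauchy_cdf(loc + scale, loc, scale)  # three quarters lies below loc + scale
```

The location is the median, and the quartiles sit exactly one scale unit on either side of it, which is why the scale describes spread even though the variance is undefined.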

TODO: example image of the distribution

Parameters

loc (int, float, ndarray, or Tensor) – Median of the Cauchy distribution (\(\mu\)). Default = 0
scale (int, float, ndarray, or Tensor) – Spread of the Cauchy distribution (\(\gamma\)). Default = 1

mean()[source]¶

Compute the mean of this distribution.

Note that the mean of a Cauchy distribution is technically undefined.

cdf(y)¶: Cumulative probability of some data along this distribution

log_prob(y)¶: Compute the log probability of some data given this distribution

mode()¶: Compute the mode of this distribution

prob(y)¶: Compute the probability of some data given this distribution

sample(n=1)¶: Generate a random sample from this distribution

class probflow.distributions.Deterministic(loc=0)[source]¶

Bases: probflow.utils.base.BaseDistribution

A deterministic distribution.

A deterministic distribution is a continuous distribution defined over all real numbers, and has one parameter:

a location parameter (loc or \(k_0\)) which determines the mean of the distribution.

A random variable \(x\) drawn from a deterministic distribution has probability of 1 at its location parameter value, and zero elsewhere:

\[\begin{split}p(x) = \begin{cases} 1, & \text{if}~x=k_0 \\ 0, & \text{otherwise} \end{cases}\end{split}\]

TODO: example image of the distribution

Parameters: loc (int, float, ndarray, or Tensor) – Mean of the deterministic distribution (\(k_0\)). Default = 0

cdf(y)¶: Cumulative probability of some data along this distribution

log_prob(y)¶: Compute the log probability of some data given this distribution

mean()¶

Compute the mean of this distribution

Note that this uses the mode of distributions for which the mean is undefined (for example, a categorical distribution)

mode()¶: Compute the mode of this distribution

prob(y)¶: Compute the probability of some data given this distribution

sample(n=1)¶: Generate a random sample from this distribution

class probflow.distributions.Dirichlet(concentration)[source]¶

Bases: probflow.utils.base.BaseDistribution

The Dirichlet distribution.

The Dirichlet distribution is a continuous distribution defined over the \(k\)-simplex, and has one vector of parameters:

concentration parameters (concentration or \(\boldsymbol{\alpha} \in \mathbb{R}^{k}_{>0}\)), a vector of positive numbers which determine the relative likelihoods of different categories represented by the distribution.

A random variable (a vector) \(\mathbf{x}\) drawn from a Dirichlet distribution

\[\mathbf{x} \sim \text{Dirichlet}(\boldsymbol{\alpha})\]

has probability

\[p(\mathbf{x}) = \frac{1}{\mathbf{\text{B}}(\boldsymbol{\alpha})} \prod_{i=1}^k x_i^{\alpha_i-1}\]

where \(\mathbf{\text{B}}\) is the multivariate beta function.
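A minimal sketch of this density in plain Python, computed in log space with the multivariate beta function expressed through log-gamma terms. The function name is hypothetical and is not part of probflow.

```python
import math


def dirichlet_pdf(x, concentration):
    """Density of Dirichlet(concentration) at a point x on the simplex.

    x must contain positive entries that sum to 1.
    """
    # log B(alpha) = sum_i log Gamma(alpha_i) - log Gamma(sum_i alpha_i)
    log_beta = sum(math.lgamma(a) for a in concentration) - math.lgamma(
        sum(concentration)
    )
    log_p = sum((a - 1.0) * math.log(xi) for a, xi in zip(concentration, x))
    return math.exp(log_p - log_beta)


# With all concentrations equal to 1, the density is flat over the simplex.
flat = dirichlet_pdf([0.2, 0.3, 0.5], [1.0, 1.0, 1.0])
```

With two categories the expression reduces to the Beta density, which gives a quick sanity check.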

TODO: example image of the distribution

Parameters: concentration (ndarray, or Tensor) – Concentration parameter of the Dirichlet distribution (\(\alpha\)).

cdf(y)¶: Cumulative probability of some data along this distribution

log_prob(y)¶: Compute the log probability of some data given this distribution

mean()¶

Compute the mean of this distribution

Note that this uses the mode of distributions for which the mean is undefined (for example, a categorical distribution)

mode()¶: Compute the mode of this distribution

prob(y)¶: Compute the probability of some data given this distribution

sample(n=1)¶: Generate a random sample from this distribution

class probflow.distributions.Gamma(concentration, rate)[source]¶

Bases: probflow.utils.base.BaseDistribution

The Gamma distribution.

The Gamma distribution is a continuous distribution defined over all positive real numbers, and has two parameters:

a shape parameter (concentration or \(\alpha > 0\)), and
a rate parameter (rate or \(\beta > 0\)).

The ratio of \(\frac{\alpha}{\beta}\) determines the mean of the distribution, and the ratio of \(\frac{\alpha}{\beta^2}\) determines the variance.

A random variable \(x\) drawn from a Gamma distribution

\[x \sim \text{Gamma}(\alpha, \beta)\]

has probability

\[p(x) = \frac{\beta^\alpha}{\Gamma (\alpha)} x^{\alpha-1} \exp (-\beta x)\]

where \(\Gamma\) is the Gamma function.
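The density and the two moment formulas above can be sketched in plain Python as follows. The function names are illustrative and are not probflow's API; the argument names follow the class signature.

```python
import math


def gamma_pdf(x, concentration, rate):
    """Density of Gamma(concentration, rate) at x > 0, computed in log space."""
    a, b = concentration, rate
    log_p = a * math.log(b) - math.lgamma(a) + (a - 1.0) * math.log(x) - b * x
    return math.exp(log_p)


def gamma_mean(concentration, rate):
    """Mean of the distribution: alpha / beta."""
    return concentration / rate


def gamma_variance(concentration, rate):
    """Variance of the distribution: alpha / beta^2."""
    return concentration / rate**2


mean = gamma_mean(3.0, 2.0)
variance = gamma_variance(3.0, 2.0)
```

With a concentration of 1 the density reduces to the exponential distribution with the same rate.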

TODO: example image of the distribution

Parameters

concentration (int, float, ndarray, or Tensor) – Shape parameter of the gamma distribution (\(\alpha\)).
rate (int, float, ndarray, or Tensor) – Rate parameter of the gamma distribution (\(\beta\)).

cdf(y)¶: Cumulative probability of some data along this distribution

log_prob(y)¶: Compute the log probability of some data given this distribution

mean()¶

Compute the mean of this distribution

Note that this uses the mode of distributions for which the mean is undefined (for example, a categorical distribution)

mode()¶: Compute the mode of this distribution

prob(y)¶: Compute the probability of some data given this distribution

sample(n=1)¶: Generate a random sample from this distribution

class probflow.distributions.HiddenMarkovModel(initial, transition, observation, steps)[source]¶

Bases: probflow.utils.base.BaseDistribution

A hidden Markov model distribution

TODO: docs

\[\begin{split}p(X_0) \text{initial probability} \\\end{split}\]

TODO: example image of the distribution

Parameters: initial (ndarray, or Tensor) – Parameters of the initial state distribution (\(p(X_0)\)).

cdf(y)¶: Cumulative probability of some data along this distribution

log_prob(y)¶: Compute the log probability of some data given this distribution

mean()¶

Compute the mean of this distribution

Note that this uses the mode of distributions for which the mean is undefined (for example, a categorical distribution)

mode()¶: Compute the mode of this distribution

prob(y)¶: Compute the probability of some data given this distribution

sample(n=1)¶: Generate a random sample from this distribution

class probflow.distributions.InverseGamma(concentration, scale)[source]¶

Bases: probflow.utils.base.BaseDistribution

The Inverse-gamma distribution.

The Inverse-gamma distribution is a continuous distribution defined over all positive real numbers, and has two parameters:

a shape parameter (concentration or \(\alpha > 0\)), and
a scale parameter (scale or \(\beta > 0\)).

For \(\alpha > 1\), the mean of the distribution is \(\frac{\beta}{\alpha-1}\), and for \(\alpha > 2\), the variance is:

\[\frac{\beta^2}{(\alpha-1)^2(\alpha-2)}\]

A random variable \(x\) drawn from an Inverse-gamma distribution

\[x \sim \text{InvGamma}(\alpha, \beta)\]

has probability

\[p(x) = \frac{\beta^\alpha}{\Gamma (\alpha)} x^{-\alpha-1} \exp (-\frac{\beta}{x})\]

where \(\Gamma\) is the Gamma function.
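As a sketch of the formulas above, the following plain-Python helpers evaluate the density and guard the moments, which only exist for sufficiently large \(\alpha\). The function names are hypothetical and are not part of probflow.

```python
import math


def inverse_gamma_pdf(x, concentration, scale):
    """Density of InvGamma(concentration, scale) at x > 0, in log space."""
    a, b = concentration, scale
    log_p = a * math.log(b) - math.lgamma(a) - (a + 1.0) * math.log(x) - b / x
    return math.exp(log_p)


def inverse_gamma_mean(concentration, scale):
    """Mean beta / (alpha - 1); only defined for alpha > 1."""
    if concentration <= 1:
        raise ValueError("mean is undefined for concentration <= 1")
    return scale / (concentration - 1.0)


def inverse_gamma_variance(concentration, scale):
    """Variance beta^2 / ((alpha - 1)^2 (alpha - 2)); only defined for alpha > 2."""
    if concentration <= 2:
        raise ValueError("variance is undefined for concentration <= 2")
    return scale**2 / ((concentration - 1.0) ** 2 * (concentration - 2.0))
```

Raising an error for small concentrations is a design choice of this sketch; how the library itself reports undefined moments is not specified here.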

TODO: example image of the distribution

Parameters

concentration (int, float, ndarray, or Tensor) – Shape parameter of the inverse gamma distribution (\(\alpha\)).
scale (int, float, ndarray, or Tensor) – Scale parameter of the inverse gamma distribution (\(\beta\)).

cdf(y)¶: Cumulative probability of some data along this distribution

log_prob(y)¶: Compute the log probability of some data given this distribution

mean()¶

Compute the mean of this distribution

Note that this uses the mode of distributions for which the mean is undefined (for example, a categorical distribution)

mode()¶: Compute the mode of this distribution

prob(y)¶: Compute the probability of some data given this distribution

sample(n=1)¶: Generate a random sample from this distribution

class probflow.distributions.Mixture(distributions, logits=None, probs=None)[source]¶

Bases: probflow.utils.base.BaseDistribution

A mixture distribution.

TODO

TODO: example image of the distribution w/ 2 gaussians

Parameters

distributions (Distribution) – Distributions to mix.
logits (Tensor) – Logit probabilities of the mixture weights. Either this or probs must be specified.
probs (Tensor) – Raw probabilities of the mixture weights. Either this or logits must be specified. Must sum to 1 along the last axis.
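Since the class description is still a TODO, the following is only a sketch of the standard definition of a mixture density, \(p(x) = \sum_i w_i \, p_i(x)\), using two normal components. The function names and the `(loc, scale)` tuple representation are assumptions made for illustration and do not reflect probflow's implementation.

```python
import math


def normal_pdf(x, loc, scale):
    """Density of a normal distribution at x."""
    return math.exp(-((x - loc) ** 2) / (2.0 * scale**2)) / math.sqrt(
        2.0 * math.pi * scale**2
    )


def mixture_pdf(x, components, probs):
    """Weighted sum of component densities.

    components: list of (loc, scale) pairs, one per normal component.
    probs: mixture weights, which must sum to 1.
    """
    if abs(sum(probs) - 1.0) > 1e-9:
        raise ValueError("mixture weights must sum to 1")
    return sum(w * normal_pdf(x, loc, scale) for w, (loc, scale) in zip(probs, components))


# Equal-weight mixture of two unit-variance normals centered at -2 and +2.
components = [(-2.0, 1.0), (2.0, 1.0)]
weights = [0.5, 0.5]
density_at_zero = mixture_pdf(0.0, components, weights)
```

Because the weights sum to 1 and each component integrates to 1, the mixture itself is a valid density.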

cdf(y)¶: Cumulative probability of some data along this distribution

log_prob(y)¶: Compute the log probability of some data given this distribution

mean()¶

Compute the mean of this distribution

Note that this uses the mode of distributions for which the mean is undefined (for example, a categorical distribution)

mode()¶: Compute the mode of this distribution

prob(y)¶: Compute the probability of some data given this distribution

sample(n=1)¶: Generate a random sample from this distribution

class probflow.distributions.MultivariateNormal(loc, cov)[source]¶

Bases: probflow.utils.base.BaseDistribution

The multivariate Normal distribution.

The multivariate normal distribution is a continuous distribution in \(d\)-dimensional space, and has two parameters:

a location vector (loc or \(\boldsymbol{\mu} \in \mathbb{R}^d\)) which determines the mean of the distribution, and
a covariance matrix (cov or \(\boldsymbol{\Sigma} \in \mathbb{R}^{d \times d}_{>0}\)) which determines the spread and covariance of the distribution.

A random variable \(\mathbf{x} \in \mathbb{R}^d\) drawn from a multivariate normal distribution

\[\mathbf{x} \sim \mathcal{N}(\boldsymbol{\mu}, \boldsymbol{\Sigma})\]

has probability

\[p(\mathbf{x}) = (2\pi)^{-\frac{d}{2}} \det(\boldsymbol{\Sigma})^{-\frac{1}{2}} \exp \left( -\frac{1}{2} (\mathbf{x}-\boldsymbol{\mu})^\top \boldsymbol{\Sigma}^{-1} (\mathbf{x}-\boldsymbol{\mu}) \right)\]
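To show how the terms of this density fit together, here is a sketch restricted to the two-dimensional case (\(d = 2\)), where the determinant and inverse of the covariance matrix can be written out by hand. The function name is illustrative and is not probflow's API.

```python
import math


def mvn_pdf_2d(x, loc, cov):
    """Density of a 2-D multivariate normal at point x.

    cov is a 2x2 nested list and must be positive definite.
    """
    (a, b), (c, d) = cov
    det = a * d - b * c
    if a <= 0 or det <= 0:
        raise ValueError("cov must be positive definite")
    # Inverse of a 2x2 matrix.
    inv = [[d / det, -b / det], [-c / det, a / det]]
    dx = [x[0] - loc[0], x[1] - loc[1]]
    # Quadratic form (x - mu)^T Sigma^{-1} (x - mu).
    quad = sum(dx[i] * inv[i][j] * dx[j] for i in range(2) for j in range(2))
    # (2 pi)^(-d/2) with d = 2 is 1 / (2 pi).
    return det ** (-0.5) * math.exp(-0.5 * quad) / (2.0 * math.pi)


peak = mvn_pdf_2d([0.0, 0.0], [0.0, 0.0], [[1.0, 0.0], [0.0, 1.0]])
```

With a diagonal covariance matrix the density factorizes into a product of two univariate normal densities.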

TODO: example image of the distribution

Parameters

loc (ndarray, or Tensor) – Mean of the multivariate normal distribution (\(\boldsymbol{\mu}\)).
cov (ndarray, or Tensor) – Covariance matrix of the multivariate normal distribution (\(\boldsymbol{\Sigma}\)).

cdf(y)¶: Cumulative probability of some data along this distribution

log_prob(y)¶: Compute the log probability of some data given this distribution

mean()¶

Compute the mean of this distribution

Note that this uses the mode of distributions for which the mean is undefined (for example, a categorical distribution)

mode()¶: Compute the mode of this distribution

prob(y)¶: Compute the probability of some data given this distribution

sample(n=1)¶: Generate a random sample from this distribution

class probflow.distributions.Normal(loc=0, scale=1)[source]¶

Bases: probflow.utils.base.BaseDistribution

The Normal distribution.

The normal distribution is a continuous distribution defined over all real numbers, and has two parameters:

a location parameter (loc or \(\mu\)) which determines the mean of the distribution, and
a scale parameter (scale or \(\sigma > 0\)) which determines the standard deviation of the distribution.

A random variable \(x\) drawn from a normal distribution

\[x \sim \mathcal{N}(\mu, \sigma)\]

has probability

\[p(x) = \frac{1}{\sqrt{2 \pi \sigma^2}} \exp \left( -\frac{(x-\mu)^2}{2 \sigma^2} \right)\]
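The sketch below evaluates this density and its logarithm in plain Python, which also illustrates why a log probability is useful alongside a probability. The function names are hypothetical and are not part of probflow.

```python
import math


def normal_pdf(x, loc=0.0, scale=1.0):
    """Density of Normal(loc, scale) at x."""
    return math.exp(-((x - loc) ** 2) / (2.0 * scale**2)) / math.sqrt(
        2.0 * math.pi * scale**2
    )


def normal_log_pdf(x, loc=0.0, scale=1.0):
    """Log density of Normal(loc, scale) at x, computed without exponentiating."""
    return -0.5 * math.log(2.0 * math.pi * scale**2) - (x - loc) ** 2 / (2.0 * scale**2)


# Far in the tail the density underflows to zero in floating point,
# while the log density remains finite.
tail_density = normal_pdf(40.0)
tail_log_density = normal_log_pdf(40.0)
```

Working in log space keeps likelihoods of extreme data points numerically usable.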

TODO: example image of the distribution

Parameters

loc (int, float, ndarray, or Tensor) – Mean of the normal distribution (\(\mu\)). Default = 0
scale (int, float, ndarray, or Tensor) – Standard deviation of the normal distribution (\(\sigma\)). Default = 1

cdf(y)¶: Cumulative probability of some data along this distribution

log_prob(y)¶: Compute the log probability of some data given this distribution

mean()¶

Compute the mean of this distribution

Note that this uses the mode of distributions for which the mean is undefined (for example, a categorical distribution)

mode()¶: Compute the mode of this distribution

prob(y)¶: Compute the probability of some data given this distribution

sample(n=1)¶: Generate a random sample from this distribution

class probflow.distributions.OneHotCategorical(logits=None, probs=None)[source]¶

Bases: probflow.utils.base.BaseDistribution

The Categorical distribution, with each value represented as a one-hot vector whose length equals the number of categories.

TODO: explain

TODO: example image of the distribution

TODO: logits vs probs

Parameters

logits (int, float, ndarray, or Tensor) – Logit-transformed category probabilities
probs (int, float, ndarray, or Tensor) – Raw category probabilities

cdf(y)¶: Cumulative probability of some data along this distribution

log_prob(y)¶: Compute the log probability of some data given this distribution

mean()¶

Compute the mean of this distribution

Note that this uses the mode of distributions for which the mean is undefined (for example, a categorical distribution)

mode()¶: Compute the mode of this distribution

prob(y)¶: Compute the probability of some data given this distribution

sample(n=1)¶: Generate a random sample from this distribution

class probflow.distributions.Poisson(rate)[source]¶

Bases: probflow.utils.base.BaseDistribution

The Poisson distribution.

The Poisson distribution is a discrete distribution defined over all non-negative integers, and has one parameter:

a rate parameter (rate or \(\lambda\)) which determines the mean of the distribution.

A random variable \(x\) drawn from a Poisson distribution

\[x \sim \text{Poisson}(\lambda)\]

has probability

\[p(x) = \frac{\lambda^x e^{-\lambda}}{x!}\]
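A short sketch of this probability mass function in plain Python, computed in log space so that the factorial does not overflow for large counts. The function name is illustrative and is not probflow's API.

```python
import math


def poisson_pmf(k, rate):
    """Probability of observing count k under Poisson(rate), with rate > 0."""
    if k < 0:
        return 0.0
    # log(k!) is lgamma(k + 1).
    return math.exp(k * math.log(rate) - rate - math.lgamma(k + 1))


rate = 3.5
# Summing over enough counts recovers total mass 1 and a mean equal to the rate.
total_mass = sum(poisson_pmf(k, rate) for k in range(100))
mean = sum(k * poisson_pmf(k, rate) for k in range(100))
```

The second sum confirms numerically that the rate parameter is the mean of the distribution.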

TODO: example image of the distribution

Parameters: rate (int, float, ndarray, or Tensor) – Rate parameter of the Poisson distribution (\(\lambda\)).

cdf(y)¶: Cumulative probability of some data along this distribution

log_prob(y)¶: Compute the log probability of some data given this distribution

mean()¶

Compute the mean of this distribution

Note that this uses the mode of distributions for which the mean is undefined (for example, a categorical distribution)

mode()¶: Compute the mode of this distribution

prob(y)¶: Compute the probability of some data given this distribution

sample(n=1)¶: Generate a random sample from this distribution

class probflow.distributions.StudentT(df=1, loc=0, scale=1)[source]¶

Bases: probflow.utils.base.BaseDistribution

The Student-t distribution.

The Student’s t-distribution is a continuous distribution defined over all real numbers, and has three parameters:

a degrees of freedom parameter (df or \(\nu > 0\)), which determines how many degrees of freedom the distribution has,
a location parameter (loc or \(\mu\)) which determines the median of the distribution (and its mean when \(\nu > 1\)), and
a scale parameter (scale or \(\sigma > 0\)) which determines the spread of the distribution.

A random variable \(x\) drawn from a Student’s t-distribution

\[x \sim \text{StudentT}(\nu, \mu, \sigma)\]

has probability

\[p(x) = \frac{\Gamma(\frac{\nu+1}{2})}{\sqrt{\nu \pi} \, \sigma \, \Gamma (\frac{\nu}{2})} \left( 1 + \frac{1}{\nu} \left( \frac{x-\mu}{\sigma} \right)^2 \right)^ {-\frac{\nu+1}{2}}\]

where \(\Gamma\) is the Gamma function.
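The following sketch evaluates the location-scale form of the Student's t density, whose normalizing constant is \(\Gamma(\frac{\nu+1}{2}) / (\sqrt{\nu \pi} \, \sigma \, \Gamma(\frac{\nu}{2}))\), in plain Python. The function name is hypothetical and is not part of probflow.

```python
import math


def student_t_pdf(x, df=1.0, loc=0.0, scale=1.0):
    """Density of StudentT(df, loc, scale) at x, computed in log space."""
    z = (x - loc) / scale
    log_norm = (
        math.lgamma((df + 1.0) / 2.0)
        - math.lgamma(df / 2.0)
        - 0.5 * math.log(df * math.pi)
        - math.log(scale)
    )
    return math.exp(log_norm - (df + 1.0) / 2.0 * math.log1p(z * z / df))


# With one degree of freedom this is the Cauchy density.
t1 = student_t_pdf(2.0, df=1.0, loc=1.0, scale=2.0)
# With many degrees of freedom it approaches the normal density.
t_large = student_t_pdf(0.7, df=1e6)
```

The degrees of freedom therefore control tail weight, ranging from the heavy-tailed Cauchy at one degree of freedom toward the normal distribution as the value grows.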

TODO: example image of the distribution

Parameters

df (int, float, ndarray, or Tensor) – Degrees of freedom of the t-distribution (\(\nu\)). Default = 1
loc (int, float, ndarray, or Tensor) – Median of the t-distribution (\(\mu\)). Default = 0
scale (int, float, ndarray, or Tensor) – Spread of the t-distribution (\(\sigma\)). Default = 1

mean()[source]¶

Compute the mean of this distribution.

Note that the mean of a StudentT distribution is technically undefined when df <= 1.

cdf(y)¶: Cumulative probability of some data along this distribution

log_prob(y)¶: Compute the log probability of some data given this distribution

mode()¶: Compute the mode of this distribution

prob(y)¶: Compute the probability of some data given this distribution

sample(n=1)¶: Generate a random sample from this distribution