Mixture

Mixture: Mixture log-likelihood.

NormalMixture: Normal mixture log-likelihood.

MixtureSameFamily: Mixture Same Family log-likelihood. This distribution handles mixtures of multivariate distributions in a vectorized manner.

class pymc3.distributions.mixture.Mixture(name, *args, **kwargs)
Mixture log-likelihood
Often used to model subpopulation heterogeneity.
\[f(x \mid w, \theta) = \sum_{i = 1}^n w_i f_i(x \mid \theta_i)\]
Support
\(\cup_{i = 1}^n \textrm{support}(f_i)\)
Mean
\(\sum_{i = 1}^n w_i \mu_i\)
 Parameters
 w: array of floats
The mixture weights, with w >= 0 and w <= 1.
 comp_dists: multidimensional PyMC3 distribution (e.g. `pm.Poisson.dist(…)`) or iterable of PyMC3 distributions
The component distributions \(f_1, \ldots, f_n\).
Examples
import numpy as np
import pymc3 as pm

# `data` is assumed to be an array of observations defined elsewhere,
# with a shape that matches each example.

# 2-Mixture Poisson distribution
with pm.Model() as model:
    lam = pm.Exponential('lam', lam=1, shape=(2,))  # `shape=(2,)` indicates two mixture components.

    # As we just need the logp, rather than add a RV to the model, we need to call .dist()
    components = pm.Poisson.dist(mu=lam, shape=(2,))

    w = pm.Dirichlet('w', a=np.array([1, 1]))  # two mixture component weights.

    like = pm.Mixture('like', w=w, comp_dists=components, observed=data)

# 2-Mixture Poisson using iterable of distributions.
with pm.Model() as model:
    lam1 = pm.Exponential('lam1', lam=1)
    lam2 = pm.Exponential('lam2', lam=1)

    pois1 = pm.Poisson.dist(mu=lam1)
    pois2 = pm.Poisson.dist(mu=lam2)

    w = pm.Dirichlet('w', a=np.array([1, 1]))

    like = pm.Mixture('like', w=w, comp_dists=[pois1, pois2], observed=data)

# npop-Mixture of multidimensional Gaussian
npop = 5
nd = (3, 4)
with pm.Model() as model:
    mu = pm.Normal('mu', mu=np.arange(npop), sigma=1, shape=npop)  # Each component has an independent mean

    w = pm.Dirichlet('w', a=np.ones(npop))

    components = pm.Normal.dist(mu=mu, sigma=1, shape=nd + (npop,))  # nd + (npop,) shaped multidimensional Normal

    like = pm.Mixture('like', w=w, comp_dists=components, observed=data, shape=nd)  # The resulting mixture is nd-shaped

# Multidimensional Mixture as stacked independent mixtures
with pm.Model() as model:
    mu = pm.Normal('mu', mu=np.arange(5), sigma=1, shape=5)  # Each component has an independent mean

    # np.ones takes the shape as a single tuple argument.
    w = pm.Dirichlet('w', a=np.ones((3, 5)))  # w is a stack of 3 independent 5 component weight arrays

    components = pm.Normal.dist(mu=mu, sigma=1, shape=(3, 5))

    # The mixture is an array of 3 elements.
    # Each can be thought of as an independent scalar mixture of 5 components
    like = pm.Mixture('like', w=w, comp_dists=components, observed=data, shape=3)
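The mixture density \(\sum_i w_i f_i(x \mid \theta_i)\) is usually evaluated in log space for numerical stability. The following is a minimal sketch in plain Python of that calculation for Poisson components. It is not PyMC3's implementation: the helper names `poisson_logpmf`, `logsumexp`, and `mixture_logp` are hypothetical, and the weights and rates are illustrative values.

```python
import math


def poisson_logpmf(k, mu):
    """Log-probability of the count k under a Poisson with rate mu."""
    return k * math.log(mu) - mu - math.lgamma(k + 1)


def logsumexp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def mixture_logp(k, weights, rates):
    """log f(k | w, theta) = log sum_i w_i f_i(k | theta_i).

    Assumes strictly positive weights, since log(0) is undefined.
    """
    terms = [math.log(w) + poisson_logpmf(k, mu) for w, mu in zip(weights, rates)]
    return logsumexp(terms)


weights = [0.3, 0.7]  # illustrative mixture weights, summing to 1
rates = [2.0, 10.0]   # illustrative Poisson rates, one per component

# Sanity check: the mixture pmf sums to (almost exactly) 1 over its support.
total = sum(math.exp(mixture_logp(k, weights, rates)) for k in range(200))
```

Working with `log w_i + log f_i` and reducing with a log-sum-exp avoids underflow when individual component probabilities are very small.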

infer_comp_dist_shapes(point=None)
Try to infer the shapes of the component distributions, comp_dists, and how they should broadcast together. The behavior is slightly different if comp_dists is a `Distribution` as compared to when it is a list of `Distribution` objects. When it is a list, the following procedure is repeated for each element in the list:
1. Look up the `comp_dists.shape`.
2. If it is not empty, use it as comp_dist_shape.
3. If it is an empty tuple, a single random sample is drawn by calling comp_dists.random(point=point, size=None), and the returned test_sample's shape is used as the inferred comp_dists.shape.
 Parameters
 point: None or dict (optional)
Dictionary that maps rv names to values, to supply to self.comp_dists.random
 Returns
 comp_dist_shapes: shape tuple or list of shape tuples.
If comp_dists is a `Distribution`, it is a shape tuple of the inferred distribution shape. If comp_dists is a list of `Distribution` objects, it is a list of shape tuples inferred for each element in `comp_dists`.
 broadcast_shape: shape tuple
The shape that results from broadcasting all components' shapes together.
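To illustrate what "broadcasting all components' shapes together" can mean, here is a small sketch that combines shape tuples with NumPy-style broadcasting rules (align from the right; a dimension of 1 stretches to match). That PyMC3 follows exactly these rules is an assumption here, and the helper `broadcast_shape` is hypothetical rather than the library's own code.

```python
def broadcast_shape(*shapes):
    """Broadcast shape tuples together using NumPy-style rules."""
    ndim = max((len(s) for s in shapes), default=0)
    result = []
    # Walk the axes from the right, where the shapes are aligned.
    for axis in range(1, ndim + 1):
        dims = {s[-axis] for s in shapes if len(s) >= axis}
        dims.discard(1)  # a dimension of 1 stretches to any size
        if len(dims) > 1:
            raise ValueError(f"shapes {shapes} cannot be broadcast together")
        result.append(dims.pop() if dims else 1)
    return tuple(reversed(result))


# A scalar component, a length-4 component, and a (3, 1) component.
comp_dist_shapes = [(), (4,), (3, 1)]
print(broadcast_shape(*comp_dist_shapes))  # (3, 4)
```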

logp(value)
Calculate log-probability of defined Mixture distribution at specified value.
 Parameters
 value: numeric
Value(s) for which log-probability is calculated. If the log probabilities for multiple values are desired, the values must be provided in a numpy array or theano tensor.
 Returns
 TensorVariable

random(point=None, size=None)
Draw random values from defined Mixture distribution.
 Parameters
 point: dict, optional
Dict of variable values on which random values are to be conditioned (uses default point if not specified).
 size: int, optional
Desired size of random sample (returns one sample if not specified).
 Returns
 array
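Conceptually, drawing from a mixture is a two-step process: pick a component according to the weights, then draw from that component. The sketch below shows this standard scheme in plain Python for Normal components. It is not a description of how PyMC3's `random` is implemented, and the function name `sample_normal_mixture` and all parameter values are hypothetical.

```python
import random


def sample_normal_mixture(weights, means, sigmas, size, seed=0):
    """Draw `size` values from a Normal mixture by two-step sampling."""
    rng = random.Random(seed)  # local generator so results are reproducible
    draws = []
    for _ in range(size):
        # Step 1: choose a component index with probability w_i.
        i = rng.choices(range(len(weights)), weights=weights)[0]
        # Step 2: draw from the chosen component.
        draws.append(rng.gauss(means[i], sigmas[i]))
    return draws


weights = [0.3, 0.7]  # illustrative values
means = [-2.0, 3.0]
sigmas = [1.0, 0.5]

draws = sample_normal_mixture(weights, means, sigmas, size=20000)
sample_mean = sum(draws) / len(draws)
# The mixture mean is sum_i w_i * mu_i.
expected_mean = sum(w * m for w, m in zip(weights, means))
```

With enough draws, `sample_mean` should be close to `expected_mean`, which matches the Mean formula given for the Mixture distribution.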

class pymc3.distributions.mixture.MixtureSameFamily(name, *args, **kwargs)
Mixture Same Family log-likelihood
This distribution handles mixtures of multivariate distributions in a vectorized manner. Use it instead of the Mixture distribution when the mixture components are not on the last axis of the component distribution.
Support
\(\textrm{support}(f)\)
Mean
\(w\mu\)
 Parameters
 w: array of floats
The mixture weights, with w >= 0 and w <= 1.
 comp_dists: PyMC3 distribution (e.g. `pm.Multinomial.dist(…)`)
comp_dists can be a scalar or multidimensional distribution. Assuming its shape is (i_0, …, i_n, mixture_axis, i_n+1, …, i_N), the mixture_axis is consumed, so the mixture has shape (i_0, …, i_n, i_n+1, …, i_N).
 mixture_axis: int, default = -1
Axis representing the mixture components to be reduced in the mixture.
Notes
The default behaviour resembles the Mixture distribution, in which the last axis of the component distribution is reduced.
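The shape rule for `mixture_axis` can be sketched as a small helper that drops the mixture axis from the component shape. The function `mixture_shape` is hypothetical and only mirrors the shape bookkeeping described above. Its default reduces the last axis, as stated in the Notes.

```python
def mixture_shape(comp_shape, mixture_axis=-1):
    """Return the mixture shape after the mixture axis is consumed."""
    ndim = len(comp_shape)
    if not -ndim <= mixture_axis < ndim:
        raise ValueError(f"mixture_axis {mixture_axis} is out of range for shape {comp_shape}")
    axis = mixture_axis % ndim  # normalise negative axes
    return comp_shape[:axis] + comp_shape[axis + 1:]


print(mixture_shape((3, 5, 4), mixture_axis=1))  # (3, 4)
print(mixture_shape((3, 5)))                     # (3,)
```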

logp(value)
Calculate log-probability of defined MixtureSameFamily distribution at specified value.
 Parameters
 value: numeric
Value(s) for which log-probability is calculated. If the log probabilities for multiple values are desired, the values must be provided in a numpy array or theano tensor.
 Returns
 TensorVariable

random(point=None, size=None)
Draw random values from defined MixtureSameFamily distribution.
 Parameters
 point: dict, optional
Dict of variable values on which random values are to be conditioned (uses default point if not specified).
 size: int, optional
Desired size of random sample (returns one sample if not specified).
 Returns
 array

class pymc3.distributions.mixture.NormalMixture(name, *args, **kwargs)
Normal mixture log-likelihood
\[f(x \mid w, \mu, \sigma^2) = \sum_{i = 1}^n w_i N(x \mid \mu_i, \sigma^2_i)\]
Support
\(x \in \mathbb{R}\)
Mean
\(\sum_{i = 1}^n w_i \mu_i\)
Variance
\(\sum_{i = 1}^n w_i (\sigma^2_i + \mu_i^2) - \left(\sum_{i = 1}^n w_i \mu_i\right)^2\)
 Parameters
 w: array of floats
The mixture weights, with w >= 0 and w <= 1.
 mu: array of floats
the component means
 sigma: array of floats
the component standard deviations
 tau: array of floats
the component precisions
 comp_shape: shape of the Normal component
Note that it should differ from the shape of the mixture distribution, with one axis being the number of components.
Notes
Pass either sigma or tau, but not both.
Examples
import numpy as np
import pymc3 as pm

# `data` is assumed to be a 1-D array of observations defined elsewhere.
n_components = 3

with pm.Model() as gauss_mix:
    μ = pm.Normal(
        "μ",
        data.mean(),
        10,
        shape=n_components,
        transform=pm.transforms.ordered,
        testval=[1, 2, 3],
    )
    σ = pm.HalfNormal("σ", 10, shape=n_components)
    weights = pm.Dirichlet("w", np.ones(n_components))

    pm.NormalMixture("y", w=weights, mu=μ, sigma=σ, observed=data)
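As a numerical check of the Normal mixture density and its Mean formula, the sketch below evaluates the log-density in plain Python and integrates it on a grid. It is independent of PyMC3: the helper names `normal_logpdf` and `normal_mixture_logp` and all parameter values are hypothetical, chosen only for illustration.

```python
import math


def normal_logpdf(x, mu, sigma):
    """Log-density of N(x | mu, sigma^2)."""
    return -0.5 * ((x - mu) / sigma) ** 2 - math.log(sigma) - 0.5 * math.log(2 * math.pi)


def normal_mixture_logp(x, w, mu, sigma):
    """log sum_i w_i N(x | mu_i, sigma_i^2), computed with a log-sum-exp."""
    terms = [math.log(wi) + normal_logpdf(x, mi, si) for wi, mi, si in zip(w, mu, sigma)]
    m = max(terms)
    return m + math.log(sum(math.exp(t - m) for t in terms))


w = [0.2, 0.5, 0.3]      # illustrative weights, summing to 1
mu = [-3.0, 0.0, 4.0]    # illustrative component means
sigma = [1.0, 0.5, 2.0]  # illustrative component standard deviations

# Midpoint-rule integration over [-30, 30], wide enough to cover all components.
step = 0.01
grid = [-30.0 + step * (i + 0.5) for i in range(6000)]
density = [math.exp(normal_mixture_logp(x, w, mu, sigma)) for x in grid]

total_mass = step * sum(density)
numeric_mean = step * sum(x * d for x, d in zip(grid, density))
analytic_mean = sum(wi * mi for wi, mi in zip(w, mu))  # sum_i w_i * mu_i
```

Here `total_mass` should be very close to 1, and `numeric_mean` should agree with `analytic_mean` up to integration error.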