# Empirical Approximation overview¶

For most models we use sampling MCMC algorithms like Metropolis or NUTS.
In PyMC3 we got used to store traces of MCMC samples and then do
analysis using them. There is a similar concept for the variational
inference submodule in PyMC3: *Empirical*. This type of approximation
stores particles for the SVGD sampler. There is no difference between
independent SVGD particles and MCMC samples. *Empirical* acts as a
bridge between MCMC sampling output and full-fledged VI utils like
`apply_replacements`

or `sample_node`

. For the interface
description, see
variational_api_quickstart. Here
we will just focus on `Emprical`

and give an overview of specific
things for the *Empirical* approximation

```
In [1]:
```

```
%matplotlib inline
import matplotlib.pyplot as plt
import theano
import numpy as np
import pymc3 as pm
np.random.seed(42)
pm.set_tt_rng(42)
```

## Multimodal density¶

Let’s recall the problem from variational_api_quickstart where we first got a NUTS trace

```
In [2]:
```

```
w = pm.floatX([.2, .8])
mu = pm.floatX([-.3, .5])
sd = pm.floatX([.1, .1])
with pm.Model() as model:
x = pm.NormalMixture('x', w=w, mu=mu, sigma=sd, dtype=theano.config.floatX)
trace = pm.sample(50000)
```

```
INFO (theano.gof.compilelock): Waiting for existing lock by process '18726' (I am process '18876')
INFO (theano.gof.compilelock): To manually release the lock, delete /Users/twiecki/.theano/compiledir_Darwin-18.2.0-x86_64-i386-64bit-i386-3.6.7-64/lock_dir
INFO (theano.gof.compilelock): Waiting for existing lock by process '18726' (I am process '18876')
INFO (theano.gof.compilelock): To manually release the lock, delete /Users/twiecki/.theano/compiledir_Darwin-18.2.0-x86_64-i386-64bit-i386-3.6.7-64/lock_dir
INFO (theano.gof.compilelock): Waiting for existing lock by process '18726' (I am process '18876')
INFO (theano.gof.compilelock): To manually release the lock, delete /Users/twiecki/.theano/compiledir_Darwin-18.2.0-x86_64-i386-64bit-i386-3.6.7-64/lock_dir
Auto-assigning NUTS sampler...
Initializing NUTS using jitter+adapt_diag...
Multiprocess sampling (2 chains in 2 jobs)
NUTS: [x]
Sampling 2 chains: 100%|██████████| 101000/101000 [00:51<00:00, 1963.57draws/s]
/Users/twiecki/anaconda3/lib/python3.6/site-packages/mkl_fft/_numpy_fft.py:1044: FutureWarning: Using a non-tuple sequence for multidimensional indexing is deprecated; use `arr[tuple(seq)]` instead of `arr[seq]`. In the future this will be interpreted as an array index, `arr[np.array(seq)]`, which will result either in an error or a different result.
output = mkl_fft.rfftn_numpy(a, s, axes)
The estimated number of effective samples is smaller than 200 for some parameters.
```

```
In [3]:
```

```
pm.traceplot(trace);
```

Great. First having a trace we can create `Empirical`

approx

```
In [4]:
```

```
print(pm.Empirical.__doc__)
```

```
**Single Group Full Rank Approximation**
Builds Approximation instance from a given trace,
it has the same interface as variational approximation
```

```
In [5]:
```

```
with model:
approx = pm.Empirical(trace)
```

```
In [6]:
```

```
approx
```

```
Out[6]:
```

```
<pymc3.variational.approximations.Empirical at 0x1c23a8abe0>
```

This type of approximation has it’s own underlying storage for samples
that is `theano.shared`

itself

```
In [7]:
```

```
approx.histogram
```

```
Out[7]:
```

```
histogram
```

```
In [8]:
```

```
approx.histogram.get_value()[:10]
```

```
Out[8]:
```

```
array([[0.44232642],
[0.43486722],
[0.57271322],
[0.58091718],
[0.38298766],
[0.38298766],
[0.33438251],
[0.7098557 ],
[0.61921361],
[0.54169809]])
```

```
In [9]:
```

```
approx.histogram.get_value().shape
```

```
Out[9]:
```

```
(100000, 1)
```

It has exactly the same number of samples that you had in trace before.
In our particular case it is 50k. Another thing to notice is that if you
have multitrace with **more than one chain** you’ll get much **more
samples** stored at once. We flatten all the trace for creating
`Empirical`

.

This *histogram* is about *how* we store samples. The structure is
pretty simple: `(n_samples, n_dim)`

The order of these variables is
stored internally in the class and in most cases will not be needed for
end user

```
In [10]:
```

```
approx.ordering
```

```
Out[10]:
```

```
<pymc3.blocking.ArrayOrdering at 0x114162c88>
```

Sampling from posterior is done uniformly with replacements. Call
`approx.sample(1000)`

and you’ll get again the trace but the order is
not determined. There is no way now to reconstruct the underlying trace
again with `approx.sample`

.

```
In [11]:
```

```
new_trace = approx.sample(50000)
```

```
INFO (theano.gof.compilelock): Waiting for existing lock by process '18726' (I am process '18876')
INFO (theano.gof.compilelock): To manually release the lock, delete /Users/twiecki/.theano/compiledir_Darwin-18.2.0-x86_64-i386-64bit-i386-3.6.7-64/lock_dir
```

```
In [12]:
```

```
%timeit new_trace = approx.sample(50000)
```

```
1.69 s ± 68.8 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
```

After sampling function is compiled sampling bacomes really fast

```
In [13]:
```

```
pm.traceplot(new_trace);
```

You see there is no order any more but reconstructed density is the same.

## 2d density¶

```
In [14]:
```

```
mu = pm.floatX([0., 0.])
cov = pm.floatX([[1, .5], [.5, 1.]])
with pm.Model() as model:
pm.MvNormal('x', mu=mu, cov=cov, shape=2)
trace = pm.sample(1000)
```

```
Auto-assigning NUTS sampler...
Initializing NUTS using jitter+adapt_diag...
Multiprocess sampling (2 chains in 2 jobs)
NUTS: [x]
Sampling 2 chains: 100%|██████████| 3000/3000 [00:04<00:00, 703.93draws/s]
/Users/twiecki/anaconda3/lib/python3.6/site-packages/mkl_fft/_numpy_fft.py:1044: FutureWarning: Using a non-tuple sequence for multidimensional indexing is deprecated; use `arr[tuple(seq)]` instead of `arr[seq]`. In the future this will be interpreted as an array index, `arr[np.array(seq)]`, which will result either in an error or a different result.
output = mkl_fft.rfftn_numpy(a, s, axes)
```

```
In [15]:
```

```
with model:
approx = pm.Empirical(trace)
```

```
In [23]:
```

```
pm.traceplot(approx.sample(10000));
```

```
In [17]:
```

```
import seaborn as sns
```

```
In [18]:
```

```
sns.kdeplot(approx.sample(1000)['x'])
```

```
/Users/twiecki/anaconda3/lib/python3.6/site-packages/seaborn/distributions.py:679: UserWarning: Passing a 2D dataset for a bivariate plot is deprecated in favor of kdeplot(x, y), and it will cause an error in future versions. Please update your code.
warnings.warn(warn_msg, UserWarning)
/Users/twiecki/anaconda3/lib/python3.6/site-packages/scipy/stats/stats.py:1713: FutureWarning: Using a non-tuple sequence for multidimensional indexing is deprecated; use `arr[tuple(seq)]` instead of `arr[seq]`. In the future this will be interpreted as an array index, `arr[np.array(seq)]`, which will result either in an error or a different result.
return np.add.reduce(sorted[indexer] * weights, axis=axis) / sumval
```

```
Out[18]:
```

```
<matplotlib.axes._subplots.AxesSubplot at 0x1c2174d400>
```

Previously we had a `trace_cov`

function

```
In [19]:
```

```
with model:
print(pm.trace_cov(trace))
```

```
[[1.00494398 0.46960795]
[0.46960795 0.97373728]]
```

Now we can estimate the same covariance using `Empirical`

```
In [20]:
```

```
print(approx.cov)
```

```
Elemwise{true_div,no_inplace}.0
```

That’s a tensor itself

```
In [21]:
```

```
print(approx.cov.eval())
```

```
[[1.0044415 0.46937314]
[0.46937314 0.97325041]]
```

Estimations are very close and differ due to precision error. We can get the mean in the same way

```
In [22]:
```

```
print(approx.mean.eval())
```

```
[-0.00523404 -0.00168637]
```