Getting started with JAGS for Bayesian Modelling
In the past if you wanted to do some Bayesian Modelling using a Gibbs sampler you had to rely on Winbugs but it isn’t open source and it only runs in Windows. The JAGS project started as a full feature alternative for the Linux platform. Here are some instructions for getting started
First install the required dependencies. In my Ubuntu 11.04, is something like:
sudo apt-get install gfortran liblapack-dev libltdl-dev r-base-core
For Ubuntu 11.04 you have to install JAGS from sources, but it seems that this version will be packaged in the next Ubuntu release. Download the software from here and install.
tar xvfz JAGS-3.1.0.tar.gz cd JAGS-3.1.0 ./configure make sudo make install
Now fire R and install the interface package rjags
$ R
...
> install.package("rjags",dependencies=TRUE)
Now let's do something interesting (although pretty simple). Let's assume we have a stream of 1s and 0s with an unknown proportion of each one. From R we can generate such distribution with the command
> points <- floor(runif(1000)+.4)
that generates a distribution with roughly 40% of 1s and 60% of 0s. So, our stream consists of a sequence of 0s and 1s generated using the
If we don't know this parameter and we try to learn it, we can assume that this parameter has prior uniform in the range [0,1] and thus the model that describes this scenario in the Winbugs language is
model
{
phi ~ dunif(0,1);
y ~ dbin(phi,N);
}
In this model N and y are known, so we provide this information in order to estimate our unknown parameter
> result <- list(y=sum(points), N=1000)
> result
$y
[1] 393
$N
[1] 1000
> library(rjags)
Loading required package: coda
Loading required package: lattice
linking to JAGS 3.1.0
module basemod loaded
module bugs loaded
> myModel <- jags.model("model.dat", result)
Compiling model graph
Resolving undeclared variables
Allocating nodes
Graph Size: 5
Initializing model
> update(myModel, 1000)
|**************************************************| 100%
> x <- jags.samples(myModel, c('phi'), 1000)
|**************************************************| 100%
> x
$phi
mcarray:
[1] 0.3932681
Marginalizing over: iteration(1000),chain(1)
>
So the inferred value of
> chain <- as.mcmc.list(x$phi) > plot(chain)

Where we can see that the values for phi in the chain were centered around the 0.4, the true parameter value.
I don’t understand why you go to all that trouble to call JAGS when you’ve already got the estimate for phi after summing all the 1s? It comes out as 393/1000 for your example which is exactly what your parameter is estimated at the end. Can you explain?
We you are doing bayesian statistics your parameters are random variables not point estimations. A single number, 0.393, gives you no information about how the parameter under study is distributed. I need all the other stuff for knowing how sure I am about picking the median (0.3932681) as the true parameter value (0.4)
I’m trying to learn Bayesian analysis so please bear with me
The information you have acquired from the data is merely N=1000 and y=393, nothing more is known from the data (ie points). We aren’t looking at the distribution of the points after plucking the two summaries (N,y). That means in those two values you have all the information available to you at the beginning of the inference.
So my question is how are we getting more information out of the JAGS model when it doesn’t even look at all the data we know?
First, keep in mind that the motivation of the post was to show how to install JAGS, build a model and query it from R, so here we are dealing with the simplest model it came to my mind. And, given the model and the data, what JAGS does is to run lots of random samples conditioned to the evidence (y=393) and from those points I can infer different moments of the random variable for instance.
Understood. I guess I need to do more reading on Bayesian Analysis.
Btw yours was the best introduction i’ve come across for getting started with Bayesian software. I’ve read quite a few and this was the most accessible. The first one that I actually got to work. The fact that i’ve used it as a complete beginner to try out a whole analysis is amazing. Thanks for writing it up.
Thanks a lot!! it helps to know that your rants are useful to someone