{ "metadata": { "name": "ps09" }, "nbformat": 3, "nbformat_minor": 0, "worksheets": [ { "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# MCMC for the Bradley-Terry Model\n", "\n", "Consider a model with continuous parameters $\\{\\theta_i\\}$ and integer data $\\{V_{ij}\\}$, with a likelihood function\n", "$$\n", "f(\\{V_{ij}\\}|\\{\\theta_i\\}) \\propto \\prod_{i=1}^n \\prod_{j=1}^n P_{ij}^{V_{ij}}\n", "$$\n", "where\n", "$$\n", "P_{ij} = (1+e^{\\theta_j-\\theta_i})^{-1}\n", "$$\n", "If the prior is uniform in the $\\{\\theta_i\\}$, the posterior will be\n", "$$\n", "f(\\{\\theta_i\\}|\\{V_{ij}\\}) \\propto \\prod_{i=1}^n \\prod_{j=1}^n P_{ij}^{V_{ij}}\n", "$$\n", "as well. We explore this model with a Markov chain Monte Carlo (MCMC) simulation, which we'll use to produce posterior pdfs. We'll use numpy array broadcasting to make the computations easier.\n", "\n", "We'll also assume a Gaussian proposal distribution, with width $\\sigma$ in each dimension." ] }, { "cell_type": "code", "collapsed": false, "input": [ "%pylab inline\n", "from __future__ import division\n", "rcParams['figure.figsize'] = (10.0, 8.0)\n", "rcParams['font.size'] = 14" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This is the function that does the jumping around. Note that if `theta` is an $n$-element array, `theta[None,:]` is $1\\times n$ and `theta[:,None]` is $n\\times 1$. Ordinarily numpy does subtraction element-by-element, but if you try to subtract an $n\\times 1$ array $B$ from a $1\\times n$ array $A$, broadcasting gives an $n\\times n$ array with elements $A_{0j}-B_{i0}$, which is just what we want in this case."
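] }, { "cell_type": "markdown", "metadata": {}, "source": [ "As a quick sanity check of the broadcasting trick (a minimal sketch using a made-up 3-element array, not part of the model), we can build the matrix of differences directly:" ] }, { "cell_type": "code", "collapsed": false, "input": [ "theta_demo = array([0.,1.,3.])   # hypothetical values, for illustration only\n", "D = theta_demo[None,:] - theta_demo[:,None]   # D[i,j] = theta_demo[j] - theta_demo[i]\n", "print D" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Row $i$ of `D` holds $\\theta_j-\\theta_i$ for every $j$, which is exactly the exponent matrix needed to form the $P_{ij}$."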
] }, { "cell_type": "code", "collapsed": false, "input": [ "def nextpoint(theta,pdf):\n", "    # Take one Metropolis step.  Relies on globals V (matrix of data), nparams (number of\n", "    # parameters) and sigma (width of the Gaussian proposal distribution).\n", "    thetanew = theta + sigma * randn(nparams)   # propose a Gaussian jump\n", "    # broadcasting gives the n-by-n matrix of differences theta_j - theta_i\n", "    P = 1./(1.+exp(thetanew[None,:]-thetanew[:,None]))\n", "    pdfnew = prod(P**V)   # unnormalized posterior at the proposed point\n", "    ratio = pdfnew/pdf\n", "    # Metropolis rule: accept with probability min(ratio, 1)\n", "    if ratio >= rand():\n", "        return (thetanew,pdfnew)\n", "    else:\n", "        return (theta,pdf)" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "For our simple model, we take\n", "$\n", "\\{V_{ij}\\} =\n", "\\begin{pmatrix}\n", "0 & 2 & 1 \\\\\\\\\n", "0 & 0 & 2 \\\\\\\\\n", "1 & 1 & 0\n", "\\end{pmatrix}\n", "$\n", "and assume $\\sigma=1$." ] }, { "cell_type": "code", "collapsed": false, "input": [ "V = array([[0,2,1],[0,0,2],[1,1,0]])\n", "nparams = V.shape[0]\n", "\n", "theta = zeros(nparams)\n", "P = 1./(1.+exp(theta[None,:]-theta[:,None]))\n", "pdf = prod(P**V)\n", "sigma = 1\n", "Nsteps = 10**5\n", "thetatrace = zeros((nparams,Nsteps))\n" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now we actually do the MCMC. For convenience we save the chain in the `nparams`$\\times$`Nsteps` array `thetatrace`." ] }, { "cell_type": "code", "collapsed": false, "input": [ "for step in xrange(Nsteps):\n", "    thetatrace[:,step] = theta\n", "    (theta,pdf) = nextpoint(theta,pdf)" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We want to have a look at the trajectory, or at least the points visited. 
The parameter space is three-dimensional, but we can project down onto the $\\theta_1$,$\\theta_2$ plane:" ] }, { "cell_type": "code", "collapsed": false, "input": [ "plot(thetatrace[0,:],thetatrace[1,:],'k,')\n", "xlabel(r'$\\theta_1$'); ylabel(r'$\\theta_2$');" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We see a problem here: the likelihood (and thus the posterior) depends only on the differences between parameters, so there's nothing to make the chain prefer one value of $\\sum_{i=1}^n \\theta_i$ over another. We can use this to our advantage, though, since it means we can just look at the two-dimensional space spanned by $\\theta_1-\\theta_2$ and $\\theta_2-\\theta_3$ (since any pairwise difference can be constructed from these two)." ] }, { "cell_type": "code", "collapsed": false, "input": [ "plot(thetatrace[0,:]-thetatrace[1,:],thetatrace[1,:]-thetatrace[2,:],'k,');\n", "xlabel(r'$\\theta_1-\\theta_2$'); ylabel(r'$\\theta_2-\\theta_3$');\n", "xlim([-15,15]); ylim([-15,15]); grid(True)" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Mean and standard deviation\n", "\n", "We can estimate expectation values like $\\langle\\theta_1-\\theta_2\\rangle$ by taking averages over the chain, and use the `cumsum` (cumulative sum) function to construct the running average as we go along the chain." ] }, { "cell_type": "code", "collapsed": false, "input": [ "t12bartrace = cumsum(thetatrace[0,:]-thetatrace[1,:])/(1.+arange(Nsteps))\n", "semilogx(t12bartrace,label=r'$\\theta_1-\\theta_2$'); xlabel('MCMC step'); ylabel('Cumulative mean'); legend(loc='upper left');" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**EXERCISE** Produce a plot with the means $\\langle\\theta_1-\\theta_2\\rangle$ and $\\langle\\theta_2-\\theta_3\\rangle$ on the same set of axes. 
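" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "(Here is the cumulative-mean idiom in isolation, with made-up numbers; this is illustration only, unrelated to the chain.)" ] }, { "cell_type": "code", "collapsed": false, "input": [ "x = array([2.,4.,6.,8.])   # made-up data, for illustration only\n", "print cumsum(x)/(1.+arange(len(x)))   # running means: 2., 3., 4., 5." ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Entry $k$ is the mean of the first $k+1$ values, which is why the cumulative-mean curve settles down as the chain runs. 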
" ] }, { "cell_type": "code", "collapsed": false, "input": [], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**EXERCISE** Produce a plot with the standard deviations of $\\theta_1-\\theta_2$ and $\\theta_2-\\theta_3$ on the same set of axes. " ] }, { "cell_type": "code", "collapsed": false, "input": [], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Marginal pdfs\n", "\n", "We can estimate the marginal pdf for any quantity by simply constructing a histogram of that quantity over the chain:" ] }, { "cell_type": "code", "collapsed": false, "input": [ "hist(thetatrace[0,:]-thetatrace[1,:],bins=int(sqrt(Nsteps)),normed=True);\n", "xlabel(r'$\\theta_1-\\theta_2$'); ylabel('pdf');" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**EXERCISE** Do the same for $\\theta_2-\\theta_3$ and $\\theta_3-\\theta_1$." ] }, { "cell_type": "code", "collapsed": false, "input": [], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Dependence on proposal distribution\n", "\n", "**EXERCISE** Redo the MCMC simulation with $\\sigma=10$, and again with $\\sigma=0.01$. (Save the traces into new arrays, because we'll want to compare them.) Produce a scatter plot of $\\theta_2-\\theta_3$ versus $\\theta_1-\\theta_2$ in each case, making sure to use the same axis limits as before. Comment on the results in each case."
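] }, { "cell_type": "markdown", "metadata": {}, "source": [ "One diagnostic that can help with the comparison (a suggested helper, not part of the original assignment) is the acceptance fraction: how often the chain actually moved. Since `nextpoint` returns the old point whenever a proposal is rejected, we can estimate it from a saved trace by counting the steps on which the position changed:" ] }, { "cell_type": "code", "collapsed": false, "input": [ "def acceptance_fraction(trace):\n", "    # trace has shape (nparams, Nsteps); a proposal was accepted whenever\n", "    # the position changed between consecutive steps\n", "    moved = any(trace[:,1:] != trace[:,:-1], axis=0)\n", "    return mean(moved)\n", "\n", "print acceptance_fraction(thetatrace)" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Very small $\\sigma$ accepts nearly every proposal but barely moves; very large $\\sigma$ rejects almost everything; the chain typically explores fastest somewhere in between."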
] }, { "cell_type": "code", "collapsed": false, "input": [], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "code", "collapsed": false, "input": [], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**EXERCISE** Plot the evolution of the cumulative mean $\\theta_1-\\theta_2$ for all three chains on the same set of axes; compare (qualitatively) the rate at which each converges." ] }, { "cell_type": "code", "collapsed": false, "input": [], "language": "python", "metadata": {}, "outputs": [] } ], "metadata": {} } ] }