{ "metadata": { "name": "ps09" }, "nbformat": 3, "nbformat_minor": 0, "worksheets": [ { "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# MCMC for the Bradley-Terry Model\n", "\n", "Consider a model with continuous parameters $\\{\\theta_i\\}$ and integer data $\\{V_{ij}\\}$, with a likelihood function\n", "$$\n", "f(\\{V_{ij}\\}|\\{\\theta_i\\}) \\propto \\prod_{i=1}^n \\prod_{j=1}^n P_{ij}^{V_{ij}}\n", "$$\n", "where\n", "$$\n", "P_{ij} = (1+e^{\\theta_j-\\theta_i})^{-1}\n", "$$\n", "If the prior is uniform in the $\\{\\theta_i\\}$, the posterior will be\n", "$$\n", "f(\\{\\theta_i\\}|\\{V_{ij}\\}) \\propto \\prod_{i=1}^n \\prod_{j=1}^n P_{ij}^{V_{ij}}\n", "$$\n", "as well. We explore this model with a Markov chain Monte Carlo (MCMC) simulation, which we'll use to produce posterior pdfs. We'll use numpy array broadcasting to make the computations easier.\n", "\n", "We'll also assume a Gaussian proposal distribution, with width $\\sigma$ in each dimension." ] }, { "cell_type": "code", "collapsed": false, "input": [ "%pylab inline\n", "from __future__ import division\n", "rcParams['figure.figsize'] = (10.0, 8.0)\n", "rcParams['font.size'] = 14" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This is the function that does the jumping around. Note that if `theta` is an $n$-element array, `theta[None,:]` is $1\\times n$ and `theta[:,None]` is $n\\times 1$. Ordinarily numpy does subtraction element-by-element, but if you try to subtract an $n\\times 1$ array $B$ from a $1\\times n$ array $A$, broadcasting gives an $n\\times n$ array with elements $A_{0j}-B_{i0}$, which is just what we want in this case."
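] }, { "cell_type": "markdown", "metadata": {}, "source": [ "As a quick sanity check of the broadcasting trick (a minimal sketch using a made-up 3-element array, not part of the model), we can build the matrix of differences directly:" ] }, { "cell_type": "code", "collapsed": false, "input": [ "theta_demo = array([0.,1.,3.])   # hypothetical values, for illustration only\n", "D = theta_demo[None,:] - theta_demo[:,None]   # D[i,j] = theta_demo[j] - theta_demo[i]\n", "print D" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Row $i$ of `D` holds $\\theta_j-\\theta_i$ for every $j$, which is exactly the exponent matrix needed to form the $P_{ij}$."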
] }, { "cell_type": "code", "collapsed": false, "input": [ "def nextpoint(theta,pdf):\n", "    # Take one Metropolis step.  Relies on globals V (matrix of data), nparams (number of\n", "    # parameters) and sigma (width of the Gaussian proposal distribution).\n", "    thetanew = theta + sigma * randn(nparams)   # propose a Gaussian jump\n", "    # broadcasting gives the n-by-n matrix of differences theta_j - theta_i\n", "    P = 1./(1.+exp(thetanew[None,:]-thetanew[:,None]))\n", "    pdfnew = prod(P**V)   # unnormalized posterior at the proposed point\n", "    ratio = pdfnew/pdf\n", "    # Metropolis rule: accept with probability min(ratio, 1)\n", "    if ratio >= rand():\n", "        return (thetanew,pdfnew)\n", "    else:\n", "        return (theta,pdf)" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "For our simple model, we take\n", "$\n", "\\{V_{ij}\\} =\n", "\\begin{pmatrix}\n", "0 & 2 & 1 \\\\\\\\\n", "0 & 0 & 2 \\\\\\\\\n", "1 & 1 & 0\n", "\\end{pmatrix}\n", "$\n", "and assume $\\sigma=1$." ] }, { "cell_type": "code", "collapsed": false, "input": [ "V = array([[0,2,1],[0,0,2],[1,1,0]])\n", "nparams = V.shape[0]\n", "\n", "theta = zeros(nparams)\n", "P = 1./(1.+exp(theta[None,:]-theta[:,None]))\n", "pdf = prod(P**V)\n", "sigma = 1\n", "Nsteps = 10**5\n", "thetatrace = zeros((nparams,Nsteps))\n" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now we actually do the MCMC. For convenience we save the chain in the `nparams`$\\times$`Nsteps` array `thetatrace`." ] }, { "cell_type": "code", "collapsed": false, "input": [ "for step in xrange(Nsteps):\n", "    thetatrace[:,step] = theta\n", "    (theta,pdf) = nextpoint(theta,pdf)" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We want to have a look at the trajectory, or at least the points visited. 
The parameter space is three-dimensional, but we can project down onto the $\\theta_1$,$\\theta_2$ plane:" ] }, { "cell_type": "code", "collapsed": false, "input": [ "plot(thetatrace[0,:],thetatrace[1,:],'k,')\n", "xlabel(r'$\\theta_1$'); ylabel(r'$\\theta_2$');" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We see a problem here: the likelihood (and thus the posterior) depends only on the differences between parameters, so there's nothing to make the chain prefer one value of $\\sum_{i=1}^n \\theta_i$ over another. We can use this to our advantage, though, since it means we can just look at the two-dimensional space spanned by $\\theta_1-\\theta_2$ and $\\theta_2-\\theta_3$ (since any pairwise difference can be constructed from these two)." ] }, { "cell_type": "code", "collapsed": false, "input": [ "plot(thetatrace[0,:]-thetatrace[1,:],thetatrace[1,:]-thetatrace[2,:],'k,');\n", "xlabel(r'$\\theta_1-\\theta_2$'); ylabel(r'$\\theta_2-\\theta_3$');\n", "xlim([-15,15]); ylim([-15,15]); grid(True)" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Mean and standard deviation\n", "\n", "We can estimate expectation values like $\\langle\\theta_1-\\theta_2\\rangle$ by taking averages over the chain, and use the `cumsum` (cumulative sum) function to construct the running average as we go along the chain." ] }, { "cell_type": "code", "collapsed": false, "input": [ "t12bartrace = cumsum(thetatrace[0,:]-thetatrace[1,:])/(1.+arange(Nsteps))\n", "semilogx(t12bartrace,label=r'$\\theta_1-\\theta_2$'); xlabel('MCMC step'); ylabel('Cumulative mean'); legend(loc='upper left');" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**EXERCISE** Produce a plot with the means $\\langle\\theta_1-\\theta_2\\rangle$ and $\\langle\\theta_2-\\theta_3\\rangle$ on the same set of axes. 
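" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "(Here is the cumulative-mean idiom in isolation, with made-up numbers; this is illustration only, unrelated to the chain.)" ] }, { "cell_type": "code", "collapsed": false, "input": [ "x = array([2.,4.,6.,8.])   # made-up data, for illustration only\n", "print cumsum(x)/(1.+arange(len(x)))   # running means: 2., 3., 4., 5." ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Entry $k$ is the mean of the first $k+1$ values, which is why the cumulative-mean curve settles down as the chain runs. 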
" ] }, { "cell_type": "code", "collapsed": false, "input": [], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**EXERCISE** Produce a plot with the standard deviations of $\\theta_1-\\theta_2$ and $\\theta_2-\\theta_3$ on the same set of axes. " ] }, { "cell_type": "code", "collapsed": false, "input": [], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Marginal pdfs\n", "\n", "We can estimate the marginal pdf for any quantity by simply constructing a histogram of that quantity over the chain:" ] }, { "cell_type": "code", "collapsed": false, "input": [ "hist(thetatrace[0,:]-thetatrace[1,:],bins=int(sqrt(Nsteps)),normed=True);\n", "xlabel(r'$\\theta_1-\\theta_2$'); ylabel('pdf');" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**EXERCISE** Do the same for $\\theta_2-\\theta_3$ and $\\theta_3-\\theta_1$." ] }, { "cell_type": "code", "collapsed": false, "input": [], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Dependence on proposal distribution\n", "\n", "**EXERCISE** Redo the MCMC simulation with $\\sigma=10$, and again with $\\sigma=0.01$. (Save the traces into new arrays, because we'll want to compare them.) Produce a scatter plot of $\\theta_2-\\theta_3$ versus $\\theta_1-\\theta_2$ in each case, making sure to use the same axis limits as before. Comment on the results in each case."
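] }, { "cell_type": "markdown", "metadata": {}, "source": [ "One diagnostic that can help with the comparison (a suggested helper, not part of the original assignment) is the acceptance fraction: how often the chain actually moved. Since `nextpoint` returns the old point whenever a proposal is rejected, we can estimate it from a saved trace by counting the steps on which the position changed:" ] }, { "cell_type": "code", "collapsed": false, "input": [ "def acceptance_fraction(trace):\n", "    # trace has shape (nparams, Nsteps); a proposal was accepted whenever\n", "    # the position changed between consecutive steps\n", "    moved = any(trace[:,1:] != trace[:,:-1], axis=0)\n", "    return mean(moved)\n", "\n", "print acceptance_fraction(thetatrace)" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Very small $\\sigma$ accepts nearly every proposal but barely moves; very large $\\sigma$ rejects almost everything; the chain typically explores fastest somewhere in between."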
] }, { "cell_type": "code", "collapsed": false, "input": [], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "code", "collapsed": false, "input": [], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**EXERCISE** Plot the evolution of the cumulative mean $\\theta_1-\\theta_2$ for all three chains on the same set of axes; compare (qualitatively) the rate at which each converges." ] }, { "cell_type": "code", "collapsed": false, "input": [], "language": "python", "metadata": {}, "outputs": [] } ], "metadata": {} } ] }