{ "metadata": { "name": "ps08" }, "nbformat": 3, "nbformat_minor": 0, "worksheets": [ { "cells": [ { "cell_type": "code", "collapsed": false, "input": [ "%pylab inline\n", "from __future__ import division\n", "rcParams['figure.figsize'] = (8.0, 8.0)\n", "rcParams['font.size'] = 14\n", "from scipy import stats" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# a) Standard Normal Hypothesis\n", "\n", "We wish to make contour plots of the chi-squared statistic\n", "$$\n", "\\chi^2 = \\Bigl(\\frac{z_1-0}{1}\\Bigr)^2 + \\Bigl(\\frac{z_2-0}{1}\\Bigr)^2\n", "$$\n", "for the null hypothesis $\\mathcal{H}_0$ that $\\{z_1,z_2\\}$ are a sample drawn from a standard normal distribution. To do this, we need to make 2d arrays of $z_1$ and $z_2$ values in a grid, which we do with the `meshgrid` function." ] }, { "cell_type": "code", "collapsed": false, "input": [ "?meshgrid" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Given arrays $\\{x_j\\}$ and $\\{y_i\\}$, it creates arrays $\\{X_{ij}\\}$ and $\\{Y_{ij}\\}$ such that $X_{ij}=x_j$ and $Y_{ij}=y_i$." 
] }, { "cell_type": "code", "collapsed": false, "input": [ "a=arange(0,8,2); print 'a=',a;\n", "b=arange(1,4); print 'b=',b;\n", "A,B = meshgrid(a,b)\n", "print 'A=',A\n", "print 'B=',B" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This is just what we need to make a contour plot of $F_{ij}=f(x_j,y_i)$:" ] }, { "cell_type": "code", "collapsed": false, "input": [ "?contour" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "code", "collapsed": false, "input": [ "z1,z2=meshgrid(linspace(-4,4,100),linspace(-4,4,100))" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "code", "collapsed": false, "input": [ "chisq = z1**2 + z2**2\n", "chisqlevels = arange(2,20,2)\n", "clabel(contour(z1,z2,chisq,chisqlevels,colors='k'))\n", "xlabel(r'$z_1$'); ylabel(r'$z_2$')\n", "grid(True)\n", "title(r'Contours of $\\chi^2$ for $\\mu=0$');" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "If we make a test which rejects $\\mathcal{H}_0$ when $\\chi^2>6$, for example, the contour labelled \"6.000\" above divides the space of possible data values into regions where the hypothesis is rejected, or not:" ] }, { "cell_type": "code", "collapsed": false, "input": [ "clabel(contour(z1,z2,chisq,[6],colors='k'),fmt=r'$\\chi^2=%g$')\n", "xlabel(r'$z_1$'); ylabel(r'$z_2$')\n", "text(0,0,\"Don't reject $\\mathcal{H}_0$\",ha='center')\n", "text(0,-3,\"Reject $\\mathcal{H}_0$\",ha='center')\n", "grid(True)\n", "title(r'Critical region for $\\chi^2>6$ test');" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "(You could make it even sexier by color-coding the regions using the `contourf` command, but the syntax, which requires defining a custom colormap, is somewhat clunky.)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now, use the survival 
function `stats.chi2.sf` to create a grid of $p$-values corresponding to the $\\chi^2$ values in `chisq`, with the appropriate number of degrees of freedom, and plot it with contours at the following $p$-values:" ] }, { "cell_type": "code", "collapsed": false, "input": [ "pvallevels=[0.1,0.05,0.02,0.01,0.005,0.002,0.001]" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "code", "collapsed": false, "input": [], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# b) $N(\\theta,1)$ Hypothesis\n", "\n", "Now we allow the mean to be a tunable parameter, which makes the chi-squared statistic\n", "$$\n", "\\chi^2(\\theta) = \\Bigl(\\frac{z_1-\\theta}{1}\\Bigr)^2 + \\Bigl(\\frac{z_2-\\theta}{1}\\Bigr)^2\n", "$$\n", "Find the $\\widehat{\\theta}$ which minimizes this ($\\widehat{\\theta}$ will be a function of $z_1$ and $z_2$), and substitute this into $\\chi^2(\\theta)$ to get $\\chi^2(\\widehat{\\theta})$." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Put your calculation and answer in the following box:**\n", "(You can use basic LaTeX syntax)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Make a contour plot of this reduced chi-squared on the $(z_1,z_2)$ plane. Make the contours blue by using `colors='b'` instead of `colors='k'`." ] }, { "cell_type": "code", "collapsed": false, "input": [], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now, use the survival function `stats.chi2.sf` to create a grid of $p$-values corresponding to these reduced $\\chi^2$ values, with the appropriate number of degrees of freedom (remembering that we've used the data to tune one parameter and minimized the $\\chi^2$ with respect to that parameter). 
Make a contour plot with the original (standard normal) $p$-values in black, and these new $p$-values for the best fit model in blue, using the same set of $p$-value levels for the contours as before." ] }, { "cell_type": "code", "collapsed": false, "input": [], "language": "python", "metadata": {}, "outputs": [] } ], "metadata": {} } ] }