Statistics Explained (by Brad Williamson and Kevin Karplus, AP Biology Experts)

Standard Deviation: a measure of variability that is most meaningful IF you know that your data are approximately normally distributed (that is, your data, when plotted in a histogram, approach the bell-shaped normal curve). There are mathematical tests for normality and methods for "normalizing" data as well. For many of the questions that students will ask in the AP lab program this (normal curve) may be the case, especially in experiments that involve some sort of measurement. If you have data that are not normally distributed, then you can use boxplots and quartiles to indicate the variability in the data. The standard deviation should be reported along with sample size and mean. However, don't do what so many people do and use the standard deviation for an error bar. There are very few instances where this practice is appropriate. Instead, use the standard deviation to calculate the more powerful and meaningful parameter, the standard error.
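As a rough illustration of the reporting advice above, here is a minimal Python sketch using only the standard library. The leaf-length values and the variable names are made up for the example; they do not come from any AP lab.

```python
import statistics

# Hypothetical leaf lengths in millimetres (made-up values for illustration).
leaf_lengths_mm = [42.0, 45.5, 39.8, 44.1, 47.3, 41.2, 43.9, 40.6, 46.0, 44.4]

# Report these three together: sample size, mean, standard deviation.
n = len(leaf_lengths_mm)
mean = statistics.mean(leaf_lengths_mm)
sd = statistics.stdev(leaf_lengths_mm)  # sample SD (divides by n - 1)

# For data that do not look normal, quartiles describe the spread instead.
q1, median, q3 = statistics.quantiles(leaf_lengths_mm, n=4)

print(f"n = {n}, mean = {mean:.2f} mm, SD = {sd:.2f} mm")
print(f"Q1 = {q1:.2f} mm, median = {median:.2f} mm, Q3 = {q3:.2f} mm")
```

The three quartile values are the numbers a boxplot draws as the box edges and the middle line.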

Mean, Median, Mode

Standard Error: The committee that is in charge of determining such things has recommended that this statistic be called Standard Uncertainty, which I agree is a much better label. Another name for this particular parameter suggested in the AP redesign is the standard deviation of the sample means. Again, this statistic is for normal curve data. The formula is the standard deviation divided by the square root of the sample size. There's a really cool theorem called the central limit theorem. In a nutshell, what this theorem proves is that for an unknown population with almost any kind of distribution, if you repeatedly draw out samples and take the mean each time, eventually all the various means (each mean is an estimate of the unknown population mean) will create an approximately normal curve around the unknown population mean. The width or spread of that curve can be estimated from two factors: the standard deviation of your sample and the size of your sample. The larger your sample, the narrower your curve. The smaller your standard deviation, the narrower your curve. Stop just a minute and revisit what I just said and think about the consequences.....the standard error is the standard deviation of the sample means. Plus or minus one standard error from your sample mean provides you with an estimate of how well your sample estimates the mean of the unknown population. In fact, with some level of precision you could state that you are about 68% confident that the true population mean lies between plus one and minus one standard error of your sample mean. Or even better, if you plotted error bars with 2 standard errors (plus and minus), that would bound an area where you would be about 95% confident that the true population mean was bounded by the error bars. When appropriate (normal data), plot error bars with standard error...er I mean standard uncertainty.
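To make the central limit theorem idea concrete, here is a small simulation sketch in Python. Everything in it is an assumption chosen for illustration: the skewed (exponential) made-up population, the sample size of 25, and the 2,000 repeated samples. It computes the standard error from one sample as the standard deviation divided by the square root of the sample size, then compares that with the actual spread of many sample means.

```python
import math
import random
import statistics

random.seed(1)  # fixed seed so the sketch is repeatable

# A deliberately non-normal (skewed) made-up population with mean near 10.
population = [random.expovariate(1 / 10) for _ in range(100_000)]
population_mean = statistics.fmean(population)

sample_size = 25

# One sample, as you would have in a real experiment.
sample = random.sample(population, sample_size)
sample_mean = statistics.fmean(sample)
sample_sd = statistics.stdev(sample)
standard_error = sample_sd / math.sqrt(sample_size)

# Error-bar limits at plus and minus 2 standard errors (roughly 95%).
lower = sample_mean - 2 * standard_error
upper = sample_mean + 2 * standard_error

# Many samples: take the mean of each one and look at how the means spread.
sample_means = [
    statistics.fmean(random.sample(population, sample_size))
    for _ in range(2_000)
]
sd_of_sample_means = statistics.stdev(sample_means)

print(f"population mean            = {population_mean:.2f}")
print(f"one sample: mean = {sample_mean:.2f}, SE = {standard_error:.2f}")
print(f"2-SE error bar             = {lower:.2f} to {upper:.2f}")
print(f"SD of the 2,000 sample means = {sd_of_sample_means:.2f}")
```

The last number printed is the "standard deviation of the sample means" measured directly; the standard error from a single sample is an estimate of it, so the two should be in the same ballpark without matching exactly.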

MORE on Standard Error and its uses: The standard error is the standard deviation of an estimate of a parameter. For example, the standard error for the estimate of the mean value from a sample is the sample standard deviation divided by the square root of the sample size. If you have a large sample, you can get a good estimate of the mean of the population even if the standard deviation of the population is large. A common use of standard error in biology is for computing the Student t-test to determine if the mean of a sample differs significantly from a fixed value or (more commonly) whether the means of two samples differ significantly from each other. Note that the Student t-test (and variants of it) relies very strongly on the populations being normally distributed.
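Here is a sketch of how the standard errors of two samples combine into a t statistic. It uses Welch's variant of the t-test (which does not assume equal variances) rather than the classic pooled Student version, and it stops at the t value and degrees of freedom; turning those into a p-value needs a t-table or a statistics package. The function name welch_t and the plant-height numbers are hypothetical.

```python
import math
import statistics


def welch_t(sample_a, sample_b):
    """Return (t, approximate degrees of freedom) for two independent samples."""
    mean_a = statistics.mean(sample_a)
    mean_b = statistics.mean(sample_b)
    # Squared standard error of each mean: variance / sample size.
    se2_a = statistics.variance(sample_a) / len(sample_a)
    se2_b = statistics.variance(sample_b) / len(sample_b)
    # t = difference in means / standard error of that difference.
    t = (mean_a - mean_b) / math.sqrt(se2_a + se2_b)
    # Welch-Satterthwaite approximation for the degrees of freedom.
    df = (se2_a + se2_b) ** 2 / (
        se2_a ** 2 / (len(sample_a) - 1) + se2_b ** 2 / (len(sample_b) - 1)
    )
    return t, df


# Made-up plant heights in centimetres for a control and a treated group.
control_cm = [12.1, 11.8, 13.0, 12.5, 12.9, 11.6, 12.2, 12.7]
treated_cm = [13.4, 14.1, 12.9, 13.8, 14.4, 13.1, 13.6, 14.0]

t_value, dof = welch_t(control_cm, treated_cm)
print(f"t = {t_value:.2f} with about {dof:.1f} degrees of freedom")
```

A t value far from zero means the difference between the two means is large compared with the uncertainty (standard error) in that difference.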

P-value (p-Value) (http://en.wikipedia.org/wiki/P-value)

[Figure: example of a p-value for sample size 1 (the test statistic is the single observed value).]

Data yielding a p-value of .05 means there is only a 5% chance of obtaining the observed (or more extreme) result if no real effect exists. This definition is rather counter-intuitive, leading to many misunderstandings and misinterpretations. "Common sense" tells us to judge our hypotheses based on how well they fit observed evidence. This is not what a p-value describes. Instead, it describes the probability of observing data at least as extreme as yours given that the null hypothesis is true. (http://en.wikipedia.org/wiki/P-value)
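One way to see what "the observed (or more extreme) result if no real effect exists" means is to simulate it. The sketch below uses a permutation (shuffling) test, which is a different technique from the t-test: if the group labels make no difference, shuffling them should produce a difference in means as large as the observed one fairly often. The two groups of measurements and the choice of 10,000 shuffles are made up for illustration.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is repeatable

# Made-up measurements for two groups.
group_a = [5.1, 4.8, 6.0, 5.5, 5.9, 6.2, 5.4, 5.8]
group_b = [4.6, 4.9, 5.0, 4.4, 5.2, 4.7, 5.1, 4.5]

observed_diff = abs(statistics.fmean(group_a) - statistics.fmean(group_b))

# Null hypothesis: the labels do not matter, so shuffle them.
pooled = group_a + group_b
n_a = len(group_a)
n_shuffles = 10_000
at_least_as_extreme = 0

for _ in range(n_shuffles):
    random.shuffle(pooled)
    diff = abs(statistics.fmean(pooled[:n_a]) - statistics.fmean(pooled[n_a:]))
    # Small tolerance guards against floating-point round-off in ties.
    if diff >= observed_diff - 1e-12:
        at_least_as_extreme += 1

# Fraction of shuffles that look at least as extreme as the real data.
p_value = at_least_as_extreme / n_shuffles
print(f"observed difference in means = {observed_diff:.3f}")
print(f"estimated p-value = {p_value:.4f}")
```

Note what the number is not: it is not the probability that the null hypothesis is true. It is only how often chance alone would produce a difference this large.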



t-Test: The t-test (of which there are several versions) is calculated from the standard errors. The t-test is a way of testing a hypothesis. It produces a test statistic that is compared against a probability distribution called a t-distribution, whose shape varies depending on how many degrees of freedom there are (you didn't ask about degrees of freedom, and I don't really want to go into that here). With this distribution you can precisely define a p-value: the probability of getting your results, or more extreme results, by chance alone if the null hypothesis is true. Again, the t-test is for normal data (i.e., continuous measurement data that are approximately normally distributed).

Chi-square test: In the AP lab manual there are labs for which a chi-square test would be more appropriate, say a behavioral choice lab. This test requires “discrete, categorical or parametric” data, which is very different from the data in a lab that would require something like a t-test.

And more on Chi-square: Pearson's chi-squared test is used for determining the goodness of fit between a set of observations and a parametric distribution. Most often it is used for "categorical" distributions, where there are a finite number of different values that the random variable can have (like the cells of a 2x2 table). It is generally only useful when you have an expected count of at least 5 in each cell; for smaller samples you should probably use a Fisher exact test instead, which requires more computation. The chi-squared test is a bit old-fashioned these days, since the Fisher exact test is easily done by a computer and does not require the asymptotic approximation of the chi-squared test, but the chi-squared test is still widely used, and is good enough when samples are large.
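For a sense of the arithmetic, here is a minimal goodness-of-fit sketch for a two-choice behavior lab. The counts (50 animals, 31 choosing one side) are made up, and the function name chi_square_statistic is hypothetical. The critical value 3.841 is the standard table value for 1 degree of freedom at the 0.05 level.

```python
def chi_square_statistic(observed, expected):
    """Sum of (observed - expected)^2 / expected over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Made-up choice-chamber counts: 50 animals, side A versus side B.
observed = [31, 19]
# Null hypothesis: no preference, so half the animals on each side.
expected = [25, 25]

# The chi-squared approximation wants an expected count of at least 5 per cell.
counts_large_enough = all(e >= 5 for e in expected)

chi2 = chi_square_statistic(observed, expected)
degrees_of_freedom = len(observed) - 1
critical_value_05 = 3.841  # table value for df = 1, alpha = 0.05

print(f"chi-square = {chi2:.2f}, df = {degrees_of_freedom}")
if chi2 > critical_value_05:
    print("Reject the null hypothesis of no preference at the 0.05 level.")
else:
    print("Cannot reject the null hypothesis of no preference at the 0.05 level.")
```

With these particular made-up counts the statistic falls below the critical value, so a 31 to 19 split is not strong enough evidence of a preference.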

Finally, some have wondered about the t-test. Will it be on the test? For now I doubt it, but it is part of what the curriculum framework calls for when it calls for appropriate data analysis of laboratories. Your students should be using t-tests or other such tests where appropriate. No--you don't have to teach statistics, but many university lab programs include introductory, intuitive stats for specific labs--this includes the t-test, ANOVA, Mann-Whitney U, chi-square, correlation, Spearman rank correlation and other tests. Students need to know what a p-value is and how to interpret it when they read scientific work.



Can you explain this cartoon?

http://imgs.xkcd.com/comics/significant.png