TRC #305: Null Hypothesis Significance Testing + Racist Dogs + UFO Sighting Distribution

Posted on 13 July, 2014 by Pat

Pat’s away but the guys still pull off one of the greatest episodes of all time with #305. Elan starts off the show with an overview of Null Hypothesis Significance Testing (NHST). P-values galore! Next, Adam looks into whether some dogs are racist and why that may be. Darren closes out the show by discussing a recent article that measured the most common times and places that supposed UFO sightings occur.

Download direct: mp3 file

If you like the show, please leave us a review on iTunes.

SHOW NOTES

Null Hypothesis Significance Testing

Null Hypothesis Significance Testing (NHST)

Wikipedia – P-value

Wikipedia – Null Hypothesis

Yale Stats – Confidence Intervals

Racist Dogs

Are some dogs racist? – The Straight Dope

Probing Question: Can animals really smell fear?

Can wild animals really “sense” the fear in other animals?

America’s Racist Japanese People Hunting Dogs Of Cat Island – Tofugu

UFO Sightings

The Economist

This entry was posted in The Reality Check Episodes and tagged dogs.

3 Responses to TRC #305: Null Hypothesis Significance Testing + Racist Dogs + UFO Sighting Distribution

Justin says:

14 July, 2014 at 9:29 am

My dog, when it was a puppy, was racist toward blue balloons. She had no problem with the red, yellow, purple, and green balloons, but that blue balloon got her barking and biting. She destroyed it. What we can conclude from this is that my dog is a racist alien. No problem with the green. Am I right?

Dallas says:

5 August, 2014 at 12:12 am

Hey Guys,

Nice coverage of null hypothesis significance testing and p-values. Elan, I think you did a great job of covering this tricky topic. There is obviously a lot more that could be said, but I just thought I’d add a few comments:

1. I think you just misspoke here, but you said that “once you have your p-value, it’s up to you to decide what you consider to be statistically significant”. Obviously it’s important to decide that BEFORE you get your p-value, or else you could adjust your threshold so as to consider any result significant. That is why a conventional threshold, such as the commonly used 0.05, is fixed in advance.

2. You pointed out that if you get a p-value of 0.05, then under the null hypothesis (i.e. assuming it’s true), there’s only a 5% chance that you’d have ended up with a test statistic at least as extreme as the one you observed. This is basically correct. Importantly, however, the truth of that claim also depends strongly on additional assumptions. P-values are computed from data, but depend on your choice of statistical model, hypothesis, and significance test. In particular, the most commonly used tests rely on parametric models, where your hypothesis concerns the unknown true values of certain parameters that explain the distribution of the data. For instance, in your example of comparing the IQ of men and women, you might assume that IQ is normally distributed in the population. The normal distribution has two parameters, the mean and the variance, and these two values completely characterize the distribution. In your example you would be comparing the mean IQ of men and women. In this situation, you would also need to make assumptions about the variance (perhaps that it is the same in both men and women). Any claim about the interpretation of a p-value also depends on these assumptions, i.e. in this case that the quantity of interest is in fact normally distributed and the variances are equal. Violations of these assumptions should be considered as an alternative explanation for a small p-value.

3. You guys touch on this frequently, but I think it’s also important to emphasize that a p-value says nothing about the size of an effect. To take an absurd example, if you were comparing a quantity of interest in two populations, A and B, and all members within each population were identical, but the quantity of interest was 1% higher in members of B than in members of A, any reasonable sample would give you overwhelming evidence that the null hypothesis was false (i.e. mean(A) does not equal mean(B)), but this would hold for an arbitrarily small difference, and it’s dubious whether such a difference is meaningful. (Also note that the normality assumption is effectively violated in this example). The same logic extends to less contrived examples.

4. Finally, I think it’s useful to make explicit a common misconception. Despite what many people think, a p-value is NOT the probability of the null hypothesis being true, given the data. As you correctly pointed out, it can be thought of as the probability of obtaining a test statistic at least as extreme as the one observed, assuming the null hypothesis is true (and the other assumptions mentioned above hold).
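
Points 2 and 3 can be illustrated with a small simulation. What follows is only a rough sketch under stated assumptions, not anything taken from the episode: the function names `two_sample_p_value` and `simulate` are made up for illustration, the IQ scores are assumed to be normally distributed with mean 100 and standard deviation 15, the true difference between the groups is set to a hypothetical 0.5 points, and the p-value uses a large-sample normal approximation rather than an exact t-test.

```python
import math
import random
import statistics


def two_sample_p_value(a, b):
    """Two-sided p-value for the null hypothesis mean(a) == mean(b).

    Uses a normal (z) approximation with separately estimated variances,
    which is an assumption that is only reasonable for large samples.
    """
    se = math.sqrt(statistics.variance(a) / len(a) + statistics.variance(b) / len(b))
    z = (statistics.mean(b) - statistics.mean(a)) / se
    # Probability of a statistic at least as extreme as |z| under the null.
    return math.erfc(abs(z) / math.sqrt(2))


def simulate(n, true_diff, sd=15.0, seed=0):
    """Draw two hypothetical groups of size n and return (observed difference, p-value)."""
    rng = random.Random(seed)
    a = [rng.gauss(100.0, sd) for _ in range(n)]
    b = [rng.gauss(100.0 + true_diff, sd) for _ in range(n)]
    return statistics.mean(b) - statistics.mean(a), two_sample_p_value(a, b)


if __name__ == "__main__":
    # Same tiny true effect (0.5 IQ points), very different sample sizes.
    for n in (50, 100_000):
        diff, p = simulate(n, true_diff=0.5)
        print(f"n = {n:>7}: observed difference = {diff:+.2f}, p-value = {p:.3g}")
```

With the larger sample the p-value becomes extremely small even though the difference stays around half an IQ point, which is the sense in which a p-value says nothing about the size of an effect. The number it prints is also only meaningful if the modelling assumptions (here, normality and a large sample) actually hold.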

Looking forward to the follow up, and keep up the good work!

Dallas
