Measures of Effect Size
So far we have been talking a lot about a model that uses Sex as the explanatory variable. This kind of model is also called a two-group model because there are two values of Sex in this data set and thus two groups of data. We’ve been trying to measure “how good (or bad) is our model?” but there is another way to think about this.
Given that an explanatory variable (e.g., sex) has an effect on an outcome variable (e.g., thumb length), how big is the effect? We call the answer to this question effect size. We haven’t used the term effect size up to now, but we have, in fact, presented two measures of effect size.
Mean Difference
The most straightforward measure of effect size in the context of the two-group model is simply the actual difference in means on the outcome variable between the two groups.
In our data set Fingers, we can see that the size of the sex effect is 6.447 mm: males, on average, have thumbs that are 6.447 mm longer than females.
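To make the idea concrete, here is a minimal sketch of computing a mean difference in base R. The data frame `toy` is made up for illustration (it is not the Fingers data), so the number it produces is not the 6.447 mm reported above.

```r
# Hypothetical toy data (NOT the real Fingers data): thumb lengths in mm
toy <- data.frame(
  Sex   = c("female", "female", "female", "male", "male", "male"),
  Thumb = c(56, 60, 61, 63, 64, 68)
)

# Mean thumb length within each group
group_means <- tapply(toy$Thumb, toy$Sex, mean)

# Effect size as a raw mean difference (male minus female), in mm
mean_diff <- group_means[["male"]] - group_means[["female"]]
mean_diff
```

Because the result is in the original units (mm), it is easy to interpret whenever the units themselves are meaningful.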
PRE
PRE is a second measure of effect size. As just discussed, it tells us the proportional reduction in error of the two-group model over the empty model. PRE is a nice measure of effect size because it is relative: it is a measure of improvement (reduction in error) that results from adding in the explanatory variable. But what counts as a good PRE?
Recall that TinySex.model had a PRE of .66 while Sex.model had a PRE of .11. Are PREs in general going to be as large as the one for TinySex.model? Probably not. In TinyFingers we stacked the deck for purposes of teaching, creating a data set in which all the females had smaller thumbs than all the males. This resulted in a large PRE.
As with every other statistic, PRE will vary from model to model and situation to situation. Having more experience with making models will give you a sense of what counts as an impressive PRE in your research area.
In the social sciences, at least, there are some generally agreed-on ideas about what is considered a strong effect. A PRE of .25 is considered a pretty large effect, .09 is considered medium, and .01 is considered small. So according to these conventions, there is a medium effect of sex on thumb length in the Fingers data set.
Take these conventions with a grain of salt, though, because what counts as a meaningful effect size ultimately depends on your purpose. For example, if an online retailer found a small effect of changing the color of their “buy” button (e.g., PRE = .01), they might want to do it even though the effect is small. The change is free and easy to make and it might result in a tiny increase in sales.
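As a rough sketch of where a PRE comes from, the code below computes it by hand in base R: the sum of squared errors from the empty model, the sum of squared errors from the two-group model, and the proportion by which the second is smaller. The `toy` data frame is invented for illustration, and like TinyFingers it is deliberately lopsided, so its PRE is much larger than what you would usually see in real data.

```r
# Hypothetical toy data (NOT the real Fingers data): thumb lengths in mm
toy <- data.frame(
  Sex   = c("female", "female", "female", "male", "male", "male"),
  Thumb = c(56, 60, 61, 63, 64, 68)
)

# Empty model: every prediction is the grand mean
ss_empty <- sum((toy$Thumb - mean(toy$Thumb))^2)

# Two-group model: each prediction is that person's own group mean
group_pred <- ave(toy$Thumb, toy$Sex)  # ave() uses the mean by default
ss_sex <- sum((toy$Thumb - group_pred)^2)

# PRE: proportion of the empty model's error removed by adding Sex
pre <- (ss_empty - ss_sex) / ss_empty
pre
```

The resulting value can then be compared against the .01, .09, and .25 conventions described above.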
Cohen’s d
A third measure of effect size that applies especially to the two-group model (such as the Sex model) is Cohen’s d. Cohen’s d is related to the z score. Recall that z scores tell us how far an individual score is from the mean of a distribution in standard deviation units. Cohen’s d, similarly, indicates the size of a group difference in standard deviation units.
\[d=\frac{\bar{Y}_{1}-\bar{Y}_{2}}{s}\]
As with everything else in this class, there is an R function for calculating Cohen’s d.
cohensD(Thumb ~ Sex, data = Fingers)
Try running this code in the DataCamp window.
require(mosaic)
require(ggformula)
require(supernova)
require(lsr)
Fingers <- read.csv(file="https://raw.githubusercontent.com/UCLATALL/introstatsmodeling/master/datasets/fingers.csv", header=TRUE, sep=",")
# run this code
cohensD(Thumb ~ Sex, data = Fingers)
We know that there is a 6.447 mm difference between male and female thumb lengths on average. If you think about a standard deviation as a little ruler, that 6.447 mm difference is a little less than one of those rulers (.78 to be exact!).
With something like thumb length, knowing there is about a 6 mm difference is actually pretty meaningful. But for other variables, such as Kargle and Spargle scores, people may not be as clear about what a raw point difference implies.
Nevertheless, in both cases, it is somewhat illuminating to add the information from Cohen’s d to the mix. Male thumbs are .78 standard deviations longer than female thumbs.
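To see what is going on inside a Cohen’s d calculation, here is a minimal by-hand sketch in base R. It assumes that the s in the formula is the pooled standard deviation of the two groups, which, to our understanding, is what cohensD() in the lsr package uses by default. The `toy` data frame is made up for illustration, so the result will not match the .78 from the Fingers data.

```r
# Hypothetical toy data (NOT the real Fingers data): thumb lengths in mm
toy <- data.frame(
  Sex   = c("female", "female", "female", "male", "male", "male"),
  Thumb = c(56, 60, 61, 63, 64, 68)
)

male   <- toy$Thumb[toy$Sex == "male"]
female <- toy$Thumb[toy$Sex == "female"]
n1 <- length(male)
n2 <- length(female)

# Assumption: s is the pooled SD, i.e. the two group variances
# averaged with weights (n - 1) before taking the square root
s_pooled <- sqrt(((n1 - 1) * var(male) + (n2 - 1) * var(female)) /
                 (n1 + n2 - 2))

# Cohen's d: the mean difference expressed in standard deviation units
d <- (mean(male) - mean(female)) / s_pooled
d
```

Dividing by the standard deviation is what turns a difference in millimeters into a count of "little rulers," which makes effects comparable across variables measured in different units.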